RFE: locale-specific RBBI rules for French

Description

BreakIterator.getWordInstance() does not split French contractions.
For example, "l'homme" is treated as a single word, whereas it should be
tokenized as "le" + "homme" or as "l" + "homme". For the complete set of
rules, see
http://french.about.com/library/pronunciation/bl-contractions.htm
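
Until locale-specific rules exist, the requested behaviour can be approximated by post-processing the word iterator's output. The following is only a sketch under stated assumptions: it uses the JDK's java.text.BreakIterator rather than ICU4J's com.ibm.icu.text.BreakIterator so that it stays self-contained, the class name FrenchElisionSplitter is hypothetical, and the prefix set is a small illustrative sample, not the complete list from the page linked above. It is a workaround on top of the iterator, not the RBBI rule change this ticket asks for.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of ICU or the JDK.
class FrenchElisionSplitter {
    // Illustrative and deliberately incomplete set of elided prefixes (assumption).
    private static final Set<String> ELIDED_PREFIXES =
            Set.of("l", "d", "j", "m", "t", "s", "n", "c", "qu");

    // Returns word tokens, splitting "prefix'word" when the prefix is a known elision.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        BreakIterator words = BreakIterator.getWordInstance(Locale.FRENCH);
        words.setText(text);
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String segment = text.substring(start, end);
            // Skip whitespace and punctuation-only segments.
            if (segment.codePoints().noneMatch(Character::isLetterOrDigit)) {
                continue;
            }
            int apos = indexOfApostrophe(segment);
            boolean interior = apos > 0 && apos < segment.length() - 1;
            if (interior && ELIDED_PREFIXES.contains(
                    segment.substring(0, apos).toLowerCase(Locale.FRENCH))) {
                tokens.add(segment.substring(0, apos));   // e.g. "l"
                tokens.add(segment.substring(apos + 1));  // e.g. "homme"
            } else {
                tokens.add(segment);
            }
        }
        return tokens;
    }

    // Finds the first ASCII (') or typographic (U+2019) apostrophe, or -1 if none.
    private static int indexOfApostrophe(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\'' || c == '\u2019') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("l'homme est ici"));
    }
}
```

With this sketch, tokenize("l'homme") yields the tokens "l" and "homme". Tokens whose prefix is not in the set are left unsplit, which is why a proper fix belongs in the locale's break rules rather than in a hard-coded list.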

Assignee: TracBot
Reporter: TracBot
Time Needed: Days
Priority: major