Language-sensitive titlecasing

Description

Different languages have different conventions for titlecasing, such as
Dutch "IJzer" (where both "I" and "J" are capitalized), or even English cases
such as "Buffy the Vampire Slayer" (where "the" is not capitalized). Consider
adding structure that allows implementations to do better titlecasing. (In
general, there will still be cases that can't be handled algorithmically, but
this would allow much better treatment than is currently available.)

One possibility: use the same structure as segmentation, whereby the text is
segmented immediately before each character that should get an initial
capital. Thus, for example, it would return "|I|Jzer" or
"|Buffy the |Vampire |Slayer". That would let us reuse the segmentation
mechanisms that already exist.
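
A minimal sketch of how an implementation might consume such break positions, assuming they are already available as character offsets. The function names `parse_marked` and `titlecase_at_breaks` are hypothetical and are not part of any existing CLDR or ICU API. The break positions here are supplied by hand rather than produced by real segmentation rules.

```python
from __future__ import annotations


def parse_marked(marked: str, marker: str = "|") -> tuple[str, list[int]]:
    """Split a marked string such as "|i|jzer" into plain text and break offsets.

    Hypothetical helper: the "|" notation follows the examples in this ticket.
    """
    chars = []
    breaks = []
    offset = 0  # offset into the text with markers removed
    for ch in marked:
        if ch == marker:
            breaks.append(offset)
        else:
            chars.append(ch)
            offset += 1
    return "".join(chars), breaks


def titlecase_at_breaks(text: str, breaks: list[int]) -> str:
    """Capitalize the character at each break offset and lowercase the rest.

    Simplifications (assumptions of this sketch): offsets are code point
    indices, not grapheme clusters, and str.upper() stands in for a true
    titlecase mapping.
    """
    starts = set(breaks)
    return "".join(
        ch.upper() if i in starts else ch.lower()
        for i, ch in enumerate(text)
    )


if __name__ == "__main__":
    for marked in ("|i|jzer", "|buffy the |vampire |slayer"):
        text, breaks = parse_marked(marked)
        print(titlecase_at_breaks(text, breaks))
```

With breaks supplied this way, the titlecasing step itself needs no language-specific logic; all of the language sensitivity would live in whatever rules generate the break positions.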

xpath

None

locale

None

Status

Priority

major

Assignee

Mark Davis

Reporter

TracBot

tracReporter

Reviewer

None

Labels

Components

Fix versions

None

phase

None