
Language-sensitive titlecasing

Description

Different languages have different conventions for titlecasing. Dutch
capitalizes both letters of the "IJ" digraph ("IJzer"), and even English has
cases such as "Buffy the Vampire Slayer" (where the "t" of "the" is not
capitalized). Consider adding structure that allows implementations to do
better titlecasing. (In general, there will still be cases that can't be
handled algorithmically, but this would allow much better treatment than is
currently available.)

One possibility: use the same structure as segmentation, whereby the text is
segmented immediately before each character that should be capitalized. Thus,
for example, the segmenter would return "|I|Jzer" or
"|Buffy the |Vampire |Slayer". That would allow us to use the same mechanisms
as already exist.
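
To make the proposal concrete, here is a minimal Python sketch of how an
implementation might consume such boundaries. The function names
(parse_marked, titlecase_at_boundaries) are hypothetical and not part of any
existing API, and the boundaries are hard-coded here; in a real implementation
they would presumably come from locale-specific segmentation rules.

```python
def parse_marked(marked, sep="|"):
    """Split a marked string such as "|I|Jzer" into (plain text, offsets).

    Each offset is the index, in the plain text, of a character that
    follows a separator, i.e. a character that should be capitalized.
    """
    chars, boundaries = [], []
    for ch in marked:
        if ch == sep:
            boundaries.append(len(chars))
        else:
            chars.append(ch)
    return "".join(chars), boundaries


def titlecase_at_boundaries(text, boundaries):
    """Uppercase the character at each boundary offset, lowercase the rest."""
    marks = set(boundaries)
    return "".join(
        ch.upper() if i in marks else ch.lower()
        for i, ch in enumerate(text)
    )


# Hypothetical usage: boundaries as a segmenter might report them.
_, dutch = parse_marked("|I|Jzer")
print(titlecase_at_boundaries("ijzer", dutch))

_, english = parse_marked("|Buffy the |Vampire |Slayer")
print(titlecase_at_boundaries("BUFFY THE VAMPIRE SLAYER", english))
```

The titlecasing step itself needs no language knowledge in this sketch; all
language-specific behavior lives in where the boundaries are placed, which is
the point of reusing the segmentation mechanism.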

Environment

None

xpath

None

locale

None

Status

Assignee

Mark Davis

Reporter

TracBot

Labels

tracReporter

tracOwner

mark

tracResolution

None

tracStatus

accepted

Reviewer

None

phase

None

tracCc

deborah,markus

tracCreated

Jun 14, 2006, 7:00 PM

Components

Priority

major