Word/Line break: provide reasonable fallback behavior when dictionary data is absent

Description

For word and line breaking in languages or scripts that use dictionary-based breaking, it should be possible to strip out the dictionary data and have the break iterator fall back to some reasonable default rule-based breaking.

As things currently stand, removing dictionary data (to save space) may also require altering the rules to avoid long runs with no breaks at all.

Exactly which rules should be used needs some thought, research, and design.

A possible implementation might be to have the rules always return the default (no dictionary) boundaries, with a special status value that indicates which boundaries should be ignored when a dictionary is available. Text regions containing only these ignorable boundaries would be handed off to the dictionary to find the better boundaries.
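
The idea above can be illustrated with a small, self-contained sketch. This is a toy model only, not ICU code and not a proposal for the actual rules: every name in it (Status, rule_boundaries, dictionary_boundaries, boundaries) is hypothetical, lowercase letters stand in for a dictionary script, the fallback rule (a break allowed between any two such letters) is an arbitrary placeholder for the rules that still need to be designed, and greedy longest match stands in for the real dictionary lookup.

```python
from enum import Enum


class Status(Enum):
    HARD = 0       # kept whether or not a dictionary is present
    IGNORABLE = 1  # fallback boundary, superseded when a dictionary is available


def rule_boundaries(text):
    """Toy rule pass: always returns the default (no-dictionary) boundaries.

    Assumption: lowercase letters stand in for a dictionary script, and the
    placeholder fallback rule allows a break between any two of them.
    """
    if not text:
        return [(0, Status.HARD)]
    marked = [(0, Status.HARD)]
    for i in range(1, len(text)):
        both_letters = text[i - 1].islower() and text[i].islower()
        marked.append((i, Status.IGNORABLE if both_letters else Status.HARD))
    marked.append((len(text), Status.HARD))
    return marked


def dictionary_boundaries(text, start, end, words):
    """Greedy longest-match split of text[start:end]; returns interior boundaries."""
    found = []
    pos = start
    max_len = max(map(len, words), default=1)
    while pos < end:
        step = 1  # no word matches here: advance by one character
        for length in range(min(max_len, end - pos), 0, -1):
            if text[pos:pos + length] in words:
                step = length
                break
        pos += step
        if pos < end:
            found.append(pos)
    return found


def boundaries(text, words=None):
    """Combine the rule pass with an optional dictionary pass."""
    marked = rule_boundaries(text)
    if not words:
        # Dictionary data stripped: use every rule boundary as-is.
        return [pos for pos, _ in marked]
    result = []
    region_start = 0
    for pos, status in marked:
        if status is Status.IGNORABLE:
            continue
        # text[region_start:pos] holds only ignorable boundaries, so the
        # dictionary decides where the breaks inside that region go.
        if pos - region_start > 1:
            result.extend(dictionary_boundaries(text, region_start, pos, words))
        result.append(pos)
        region_start = pos
    return result


print(boundaries("thecat sat"))                         # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(boundaries("thecat sat", {"the", "cat", "sat"}))  # [0, 3, 6, 7, 10]
```

In this sketch the rule pass is identical with and without the dictionary; only the interpretation of the IGNORABLE status changes, which is the property the description asks for.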

Status

Assignee

Craig Cornelius

Reporter

Andy Heninger

Labels

None

Reviewer

None

Time Needed

Weeks

Start date

None

Components

Fix versions

Priority

major