DBBI fails with UTF-8 UText, temp fix for crash

Description

The word and sentence BreakIterators fail for Thai. The DBBI code is not prepared to handle a UText whose native indexes do not map 1:1 to UTF-16 indexes.

Activity

UnicodeBot June 30, 2018 at 11:53 PM

Trac Comment 14 by —2016-10-05T23:13:19.237Z

Milestone 4.8RC deleted

UnicodeBot June 30, 2018 at 11:53 PM

Trac Comment 13 by —2011-05-10T07:21:18.651Z

#8550 is for the real fix

UnicodeBot June 30, 2018 at 11:53 PM

Trac Comment 12 by —2011-05-10T07:17:53.510Z

The dictionary break code will crash if the input text is UTF-8 because native indexes are different from UTF-16 indexes.
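
To illustrate the index mismatch described above, here is a minimal sketch in plain Python (it does not use ICU). It lists, for each code point boundary, the UTF-16 index next to the UTF-8 byte offset. The helper name `index_pairs` is hypothetical and chosen only for this illustration.

```python
def index_pairs(text):
    """Return (utf16_index, utf8_byte_offset) at every code point boundary.

    Hypothetical helper for illustration; not part of ICU.
    """
    pairs = []
    utf16_index = 0
    utf8_offset = 0
    for ch in text:
        pairs.append((utf16_index, utf8_offset))
        # One UTF-16 code unit is 2 bytes; supplementary characters take two units.
        utf16_index += len(ch.encode("utf-16-le")) // 2
        # Thai characters occupy 3 bytes each in UTF-8.
        utf8_offset += len(ch.encode("utf-8"))
    pairs.append((utf16_index, utf8_offset))
    return pairs


if __name__ == "__main__":
    for pair in index_pairs("สวัสดี"):
        print(pair)
```

For the Thai word in the example, the second boundary is at UTF-16 index 1 but UTF-8 byte offset 3, so code that treats a native (byte) index as a UTF-16 index ends up at the wrong position or past the end of its buffer.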

This ticket covers the temp fix to skip dictionary lookup for UTF-8 encoded text. It won't give the right breaks, but it's better than a crash.
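
The shape of such a guard can be sketched with a toy model in Python. This is not ICU's code: the function names, the `native_is_utf16` flag, and the two-word dictionary are all assumptions made for illustration. Python string indexes are code point indexes, which coincide with UTF-16 indexes for the BMP characters used here.

```python
# Toy dictionary; real dictionary data is far larger (assumption for illustration).
TOY_DICTIONARY = {"สวัสดี", "ครับ"}


def rule_boundaries(s):
    """Toy rule pass: boundaries at the start, the end, and around spaces."""
    bounds = {0, len(s)}
    for i, ch in enumerate(s):
        if ch == " ":
            bounds.update((i, i + 1))
    return sorted(bounds)


def dictionary_boundaries(s, start, end):
    """Greedy longest-match split of s[start:end] using TOY_DICTIONARY."""
    out = []
    pos = start
    while pos < end:
        candidates = [w for w in TOY_DICTIONARY
                      if s.startswith(w, pos) and pos + len(w) <= end]
        if not candidates:
            break  # no dictionary word here; leave the rest of the run unsplit
        pos += len(max(candidates, key=len))
        if pos < end:
            out.append(pos)
    return out


def boundaries(s, native_is_utf16):
    """Rule-based boundaries, refined by the dictionary only when indexing is 1:1."""
    bounds = rule_boundaries(s)
    if not native_is_utf16:
        # Temporary-fix behaviour: skip the dictionary pass entirely.
        return bounds
    extra = []
    for a, b in zip(bounds, bounds[1:]):
        extra.extend(dictionary_boundaries(s, a, b))
    return sorted(set(bounds) | set(extra))
```

In this model, `boundaries("สวัสดีครับ", True)` finds the word boundary between the two Thai words, while `boundaries("สวัสดีครับ", False)` returns only the start and end of the run. That mirrors the trade-off in the comment above: incomplete breaks instead of a crash.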

Fixed

Details

Time Needed

Days

Created June 28, 2018 at 5:14 PM
Updated October 3, 2018 at 11:00 PM
Resolved July 1, 2018 at 8:55 PM