DBBI fails with UTF-8 UText, temp fix for crash

Description

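The word and sentence BreakIterators fail for Thai. The DBBI (dictionary-based break iterator) code is not prepared to handle a UText with non-1:1 indexing, that is, one whose native indexes differ from UTF-16 indexes.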
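To illustrate the index mismatch, here is a minimal Python sketch rather than ICU code. It compares UTF-16 code-unit offsets with UTF-8 byte offsets for a short Thai string. A UTF-8 UText uses byte offsets as its native indexes. The sample word and the helper names `utf16_index` and `utf8_index` are assumptions chosen for illustration.

```python
# Thai sample word (6 code points, all in the Basic Multilingual Plane),
# written with escapes so the source file stays ASCII-only.
text = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"


def utf16_index(s: str, cp_index: int) -> int:
    """UTF-16 code-unit offset of the code point at cp_index."""
    return len(s[:cp_index].encode("utf-16-le")) // 2


def utf8_index(s: str, cp_index: int) -> int:
    """UTF-8 byte offset of the code point at cp_index."""
    return len(s[:cp_index].encode("utf-8"))


# One (UTF-16 offset, UTF-8 offset) pair per code point boundary.
offsets = [(utf16_index(text, i), utf8_index(text, i))
           for i in range(len(text) + 1)]

for u16, u8 in offsets:
    print(f"UTF-16 index {u16:2d} -> UTF-8 native index {u8:2d}")
```

Each Thai character here takes one UTF-16 code unit but three UTF-8 bytes, so the two index spaces diverge after the first character. Code that treats a native index as a UTF-16 index will read the wrong positions.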

Linked work items

relates to

ICU-8550

DBBI fails with UTF-8 UText or any UText with non-1:1 indexing

Activity

UnicodeBot
June 30, 2018 at 11:53 PM

Trac Comment 14 by @Markus Scherer—2016-10-05T23:13:19.237Z

Milestone 4.8RC deleted

UnicodeBot
June 30, 2018 at 11:53 PM

Trac Comment 13 by @Peter Edberg—2011-05-10T07:21:18.651Z

#8550 is for the real fix

UnicodeBot
June 30, 2018 at 11:53 PM

Trac Comment 12 by @Peter Edberg—2011-05-10T07:17:53.510Z

The dictionary break code will crash if the input text is UTF-8 because native indexes are different from UTF-16 indexes.

This ticket covers the temp fix to skip dictionary lookup for UTF-8 encoded text. It won't give the right breaks, but it's better than a crash.
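The idea of the temp fix can be sketched as a guard in front of the dictionary step. This is a hypothetical Python model, not the actual ICU change. The names `native_length` and `refine_breaks`, the length-comparison check, and the `dictionary_lookup` callback are all assumptions made for illustration.

```python
def native_length(text: str, encoding: str) -> int:
    """Length in native index units: bytes for UTF-8, code units for UTF-16."""
    if encoding == "utf-8":
        return len(text.encode("utf-8"))
    if encoding == "utf-16":
        return len(text.encode("utf-16-le")) // 2
    raise ValueError(f"unsupported encoding: {encoding}")


def refine_breaks(text, encoding, rule_breaks, dictionary_lookup):
    """Refine rule-based breaks with a dictionary, unless indexing is not 1:1.

    Hypothetical guard: if the native length differs from the UTF-16 length,
    native indexes cannot be used as UTF-16 indexes, so the dictionary step
    is skipped and the rule-based breaks are returned unchanged.
    """
    utf16_length = len(text.encode("utf-16-le")) // 2
    if native_length(text, encoding) != utf16_length:
        # Non-1:1 indexing: breaks may be wrong for Thai, but nothing crashes.
        return list(rule_breaks)
    return dictionary_lookup(text, rule_breaks)


# Usage with a stand-in dictionary lookup that records whether it was called.
thai = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"
calls = []


def fake_lookup(text, rule_breaks):
    calls.append(text)
    return [0, 3, len(text)]


skipped = refine_breaks(thai, "utf-8", [0, 18], fake_lookup)
refined = refine_breaks(thai, "utf-16", [0, 6], fake_lookup)
```

In this model, the UTF-8 call returns the rule-based breaks untouched and never reaches the dictionary, while the UTF-16 call is refined as usual. That matches the trade-off described above: wrong breaks for UTF-8 input in exchange for no crash.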

Fixed

Details

Assignee

Andy Heninger

Reporter

Markus Scherer

Priority

major

Time Needed

Days

Fix versions

4.8

Created June 28, 2018 at 5:14 PM

Updated October 3, 2018 at 11:00 PM

Resolved July 1, 2018 at 8:55 PM
