Fixed
Details
Assignee: Markus Scherer
Reporter: Markus Scherer
Created June 28, 2018 at 5:20 PM
Updated October 3, 2018 at 10:53 PM
Resolved July 1, 2018 at 8:43 PM
Java 6 RuleBasedCollator documents that the collation rule syntax characters are `[\u003A-\u0040 \u005B-\u0060 \u007B-\u007E|\u0021-\u002F]`, but ICU's isSpecialChar() (C ucol_tok_isSpecialChar()) incorrectly:
- includes U+0020 space
- excludes U+0040 at sign @
- excludes U+007C vertical line |
The space should not be both whitespace and reserved for syntax.
We do use | in the LDML/ICU collation syntax, and the parser also recognizes @ (see https://unicode-org.atlassian.net/browse/ICU-9956#icft=ICU-9956), so both characters really should be "special".
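
For illustration, a minimal sketch of a predicate that matches the documented ranges quoted above. The class name `CollationSyntaxChars` and the method `isSyntaxChar` are hypothetical; this is not the actual ICU code or the committed fix, only one way the intended behavior could look.

```java
// Hypothetical helper for illustration only; not ICU's isSpecialChar().
final class CollationSyntaxChars {
    private CollationSyntaxChars() {}

    /**
     * Returns true if c falls in one of the documented syntax ranges:
     * U+0021..U+002F, U+003A..U+0040, U+005B..U+0060, U+007B..U+007E.
     * U+0020 (space) is deliberately outside all four ranges.
     */
    static boolean isSyntaxChar(int c) {
        return (0x21 <= c && c <= 0x2F)
            || (0x3A <= c && c <= 0x40)   // includes U+0040 '@'
            || (0x5B <= c && c <= 0x60)
            || (0x7B <= c && c <= 0x7E);  // includes U+007C '|'
    }

    public static void main(String[] args) {
        // The three characters named in this ticket, plus a letter and a digit.
        int[] samples = {0x20, 0x40, 0x7C, 'a', '0'};
        for (int c : samples) {
            System.out.printf("U+%04X syntax=%b%n", c, isSyntaxChar(c));
        }
    }
}
```

With these ranges, space is not a syntax character, while @ and | are, which matches the three points listed above. Letters and digits stay outside the set.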