collation starred relations should only contain NFD-inert characters

Description

When we introduced collation starred relations (compact syntax), I think we said we wanted to forbid characters that are not NFD-inert, so that there is no ambiguity when rule strings are decomposed or otherwise normalized. If we still have consensus on this, then we should document and enforce it.

For example, fa.txt contains a starred rule with decomposable characters: `<<أٲإٳؤ`. In escaped form, it is `<<\u0623\u0672\u0625\u0673\u0624`. U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE and U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW are composites.
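A rough way to check a starred-rule string for offending characters is sketched below. It is not the CLDR or ICU enforcement code. It assumes that "NFD-inert" can be approximated as "the character is unchanged by NFD and has canonical combining class 0", and the helper names `is_nfd_inert` and `non_inert_chars` are hypothetical. Results depend on the Unicode version bundled with the Python runtime.

```python
import unicodedata


def is_nfd_inert(ch: str) -> bool:
    """Approximate NFD-inertness for a single code point.

    Assumption: a character is treated as NFD-inert when it has no
    canonical decomposition (NFD leaves it unchanged) and its canonical
    combining class is 0, so it cannot be reordered during normalization.
    """
    unchanged = unicodedata.normalize("NFD", ch) == ch
    return unchanged and unicodedata.combining(ch) == 0


def non_inert_chars(s: str) -> list:
    """Return the characters of s that are not NFD-inert, in order."""
    return [ch for ch in s if not is_nfd_inert(ch)]


# Characters from the fa.txt starred rule quoted above.
starred = "\u0623\u0672\u0625\u0673\u0624"
for ch in non_inert_chars(starred):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Run on the fa.txt string, this reports U+0623 and U+0625, and also U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE, which has a canonical decomposition as well. U+0672 and U+0673 pass the check.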

xpath

None

locale

None

Activity

TracBot
May 10, 2019, 2:19 AM
Trac Comment 1 by —2013-10-16T15:19:00.486Z

TC agrees with Markus's assessment.

TracBot
May 10, 2019, 2:19 AM
Trac Comment 3 by —2014-03-06T18:31:21.472Z

Filed : to add a test for this.

TracBot
May 10, 2019, 2:19 AM
Trac Comment 4 by —2014-04-22T20:38:05.505Z

Milestone 25rc deleted

Priority

medium

Assignee

Markus Scherer

Reporter

Markus Scherer

Reviewer

Peter Edberg

Labels

None

Components

Fix versions

None

phase

None