Generic "search" style for collator

Description

Deleted Component: other

It looks like we may need a generic "search" variant for collation. I have
already noted in cldrbug #2160 that the name "search" is probably best for the
collation requested there. It now looks like we may also need a "search"
variant for Thai (separate bug to follow), and we may need them for other
locales as well. Typically these have more equivalences at primary level than
would be used for normal collation.

xpath

None

locale

None

Activity

TracBot
May 10, 2019, 7:18 AM
Trac Comment 20 by —2014-04-22T20:37:42.506Z

Milestone 1.9m2 deleted

TracBot
May 10, 2019, 7:18 AM
Trac Comment 17 by mfadl@37549ba819c10513—2010-09-07T14:04:28.000Z

Commenting on the Arabic related items in comment #15:
2- Accepted
3- Rejected, Waw with Hamza should be treated as equivalent to Alef, at primary level
4- Rejected, Yeh with Hamza should be treated as equivalent to Alef, at primary level
5- Rejected, Teh Marboota should be treated as equivalent to Teh, at primary level
6- Accepted

TracBot
May 10, 2019, 7:18 AM
Trac Comment 16 by —2010-07-21T05:18:23.000Z

Added the root search collator with the rules above. It will be expanded under : to add the Korean rules in that ticket.

TracBot
May 10, 2019, 7:18 AM
Trac Comment 15 by —2010-06-09T04:34:04.000Z

Basic requirements:

1. Treat the Thai and Lao prefix vowels 0E40-0E44 and 0EC0-0EC4 as standalone units, separate from the associated consonant.

2. Treat ALEF WITH MADDA ABOVE / HAMZA ABOVE / HAMZA BELOW as equivalent to ALEF at primary level.

3. Treat WAW WITH HAMZA ABOVE as equivalent to WAW at primary level.

4. Treat YEH WITH HAMZA ABOVE and ALEF MAKSURA as equivalent to YEH at primary level.

5. Treat TEH MARBUTA as equivalent to HEH at primary level.

6. Treat the following as ignorable at primary level: HEBREW PUNCTUATION GERESH & GERSHAYIM; ARABIC TATWEEL, and ARABIC marks 064C-0652 (DAMMATAN etc.); THAI CHARACTER PHINTHU.
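As a rough illustration of items 2 to 6, the sketch below folds the listed characters before comparing strings. It is only an approximation under stated assumptions: real collation compares weighted collation elements rather than rewritten strings, and the names `FOLD`, `IGNORABLE`, `search_key` and `matches` are hypothetical. The mappings follow the list as written in this comment. Item 1 is not modeled, since a per-code-point key already treats the prefix vowels as separate units.

```python
import unicodedata

# Items 2-5: letters folded to a base letter (stand-in for primary-level equivalence).
FOLD = {
    "\u0622": "\u0627",  # ALEF WITH MADDA ABOVE -> ALEF
    "\u0623": "\u0627",  # ALEF WITH HAMZA ABOVE -> ALEF
    "\u0625": "\u0627",  # ALEF WITH HAMZA BELOW -> ALEF
    "\u0624": "\u0648",  # WAW WITH HAMZA ABOVE -> WAW
    "\u0626": "\u064A",  # YEH WITH HAMZA ABOVE -> YEH
    "\u0649": "\u064A",  # ALEF MAKSURA -> YEH
    "\u0629": "\u0647",  # TEH MARBUTA -> HEH
}

# Item 6: characters dropped entirely (stand-in for "ignorable at primary level").
IGNORABLE = {
    "\u05F3",  # HEBREW PUNCTUATION GERESH
    "\u05F4",  # HEBREW PUNCTUATION GERSHAYIM
    "\u0640",  # ARABIC TATWEEL
    "\u0E3A",  # THAI CHARACTER PHINTHU
} | {chr(cp) for cp in range(0x064C, 0x0653)}  # ARABIC marks 064C-0652


def search_key(text):
    """Return a folded string usable for loose equality matching."""
    # Compose first so decomposed input can reach the precomposed keys in FOLD.
    composed = unicodedata.normalize("NFC", text)
    return "".join(FOLD.get(ch, ch) for ch in composed if ch not in IGNORABLE)


def matches(a, b):
    """True when the two strings are equal after folding."""
    return search_key(a) == search_key(b)
```

For example, `matches("\u0623\u062D\u0645\u062F", "\u0627\u062D\u0645\u062F")` compares a word spelled with ALEF WITH HAMZA ABOVE against the same word spelled with plain ALEF, and the two fold to the same key.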

Requesting the "search" collator for any locale will search all the way up to root for a "search" collator, rather than trying any other collators for the locale. So to get this behavior in any locale, say "en", you can just request a collator for the locale "en@collation=search". A given locale can override the behavior by providing its own search collator.
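The lookup described above can be sketched as a walk up the locale parent chain that only ever looks for the "search" type. This is a minimal sketch, not the actual CLDR or ICU implementation: the `COLLATIONS` table and the function names are hypothetical, and the parent chain is simplified to plain truncation at the last underscore.

```python
# Hypothetical table of the collation types each locale defines.
COLLATIONS = {
    "root": {"standard", "search"},
    "de": {"standard", "phonebook"},
    "th": {"standard", "search"},
}


def parent(locale):
    """Return the parent locale by truncation, or None above root."""
    if locale == "root":
        return None
    head, sep, _ = locale.rpartition("_")
    return head if sep else "root"


def resolve_search(locale):
    """Return the nearest locale in the parent chain that defines "search".

    Other collation types defined along the way (such as "phonebook"
    for "de" in the table above) are deliberately not considered.
    """
    current = locale
    while current is not None:
        if "search" in COLLATIONS.get(current, set()):
            return current
        current = parent(current)
    return None
```

With this table, `resolve_search("de_AT")` skips the "phonebook" and "standard" collators defined for "de" and lands on root, while `resolve_search("th_TH")` stops at "th" because that locale provides its own search collator.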

English name is "General-Purpose Search".

Rules for the above:

Also see additional Korean rules for search collator in :

TracBot
May 10, 2019, 7:18 AM
Trac Comment 13 by —2010-04-07T15:51:59.000Z

Approved update proposal at CLDR mtg 2010-04-07

Priority

medium

Assignee

Peter Edberg

Reporter

TracBot

Reviewer

Mark Davis

Labels