root collation: remove Cyrillic contractions

Description

We suppress most of the Cyrillic contractions in most of the Cyrillic-locale collation tailorings. The contractions make the sorting of Cyrillic base letters slower.
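To illustrate the cost, here is a toy sketch of a primary-weight lookup. The weights, the `PRIMARY` and `CONTRACTIONS` tables, and the `primaries` helper are all invented for illustration; this is not real DUCET/CLDR data or ICU code. It only shows that a letter that starts a contraction forces a lookahead at the next character, while other letters take the direct path.

```python
# Toy model only: these weights are invented, not real DUCET/CLDR data.
PRIMARY = {"и": 10, "й": 11, "к": 12}

# A contraction maps a starter plus a following combining mark to one weight.
# Here: и + U+0306 COMBINING BREVE gets the same primary weight as й.
CONTRACTIONS = {"и": {"\u0306": 11}}


def primaries(text, contractions):
    """Return (primary weights, number of lookahead checks performed)."""
    weights = []
    lookaheads = 0
    i = 0
    while i < len(text):
        ch = text[i]
        suffixes = contractions.get(ch)
        if suffixes is not None:
            # Slow path: ch starts a contraction, so inspect the next character.
            lookaheads += 1
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in suffixes:
                weights.append(suffixes[nxt])
                i += 2
                continue
        # Fast path: direct lookup; characters without a primary weight
        # (such as the combining mark) are skipped in this toy model.
        weight = PRIMARY.get(ch)
        if weight is not None:
            weights.append(weight)
        i += 1
    return weights, lookaheads


print(primaries("кик", CONTRACTIONS))  # one lookahead, for the и
print(primaries("кик", {}))            # contractions suppressed: no lookahead
```

In this toy model, suppressing the contractions (passing an empty table) leaves the weights for plain text unchanged and removes the lookahead for every `и`.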

I propose that we remove them from the root collation and add them to the tailorings for the locales that need them. If the CLDR team agrees, I can also propose this for the DUCET. It would be much easier if the CLDR root collation did not have to diverge from the DUCET for this.

The following table lists all of the Cyrillic-script CLDR locales.

||= main locale =||= collation tailoring =||
|| az_Cyrl || missing (only Latn) ||
|| be || `[АаӘәГгЕеЖжЗзІіОоӨөКкЧчЫыЭэѴѵ]` ||
|| bg || `[АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ]` ||
|| bs_Cyrl || imports sr ||
|| kk || `[АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ]` ||
|| ky || empty/same as root ||
|| mk || `[АаӘәЕеЖжЗзИиІіОоӨөУуЧчЫыЭэѴѵ]` ||
|| mn || missing ||
|| os || missing ||
|| ru || `[АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ]` ||
|| sah || missing ||
|| sr || `[АаӘәГгЕеЖжЗзИиІіОоӨөКкУуЧчЫыЭэѴѵ]` ||
|| tg || missing ||
|| uk || `[АаӘәГгЕеЖжЗзОоӨөКкУуЧчЫыЭэѴѵ]` ||
|| uz_Cyrl || missing ||
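The sets in the table differ only slightly between locales. The sketch below compares them, assuming each set lists the letters whose contractions that tailoring suppresses; under that assumption, a letter missing from a locale's set is one whose contractions that locale keeps. The `SUPPRESSED` and `kept` names are made up for this example, and the strings are copied from the table.

```python
# Sets copied from the table; locales without a set of their own are omitted.
# Assumption: each set lists letters whose contractions the tailoring suppresses.
SUPPRESSED = {
    "be": "АаӘәГгЕеЖжЗзІіОоӨөКкЧчЫыЭэѴѵ",
    "bg": "АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ",
    "kk": "АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ",
    "mk": "АаӘәЕеЖжЗзИиІіОоӨөУуЧчЫыЭэѴѵ",
    "ru": "АаӘәГгЕеЖжЗзІіОоӨөКкУуЧчЫыЭэѴѵ",
    "sr": "АаӘәГгЕеЖжЗзИиІіОоӨөКкУуЧчЫыЭэѴѵ",
    "uk": "АаӘәГгЕеЖжЗзОоӨөКкУуЧчЫыЭэѴѵ",
}

# Every letter that appears in at least one locale's set.
all_letters = set().union(*(set(s) for s in SUPPRESSED.values()))

# Letters absent from a locale's set, sorted by code point.
kept = {
    locale: "".join(sorted(all_letters - set(letters)))
    for locale, letters in SUPPRESSED.items()
}

for locale, letters in sorted(kept.items()):
    print(locale, letters or "(none)")
```

The output lists, per locale, the starters that are left out of the set, which makes it easy to spot where two tailorings are identical and where a locale deviates.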

xpath

None

locale

None

Activity

TracBot
May 10, 2019, 1:41 AM
Trac Comment 12 by —2014-11-17T00:58:02.096Z

Integrated into trunk, and corresponding changes are in the ICU trunks as well:

TracBot
May 10, 2019, 1:41 AM
Trac Comment 10 by —2014-10-13T22:51:22.760Z

UTC & ISO proposals for the DUCET & CTT:

TracBot
May 10, 2019, 1:41 AM
Trac Comment 8 by —2014-10-10T19:25:03.546Z

New root collation data, based on an initial UCA 8 DUCET whose only change is the removal of most of the Cyrillic contractions.

Tailorings adjusted.

Kyrgyz, which had an empty file, now has a tailoring, based on Wikipedia and on a discussion with a native speaker, Tilek Mamutov (Google). Sample Kyrgyz list of strings showing ё primary-after е:

TracBot
May 10, 2019, 1:41 AM
Trac Comment 3 by —2014-04-23T16:32:00.487Z

Should there be a followup ticket to evaluate the missing/empty collations?

TracBot
May 10, 2019, 1:41 AM
Trac Comment 2 by —2014-04-23T16:26:32.168Z

The CLDR Committee has approved the concept and is in favor of this. Markus is requested to make a proposal to the UTC.

Priority

major

Assignee

Markus Scherer

Reporter

Markus Scherer

Reviewer

Mark Davis

Labels

None

Components

Fix versions

Phase

None