change Armenian (hy) uppercase of և


Background: CLDR-6736

  • Unicode SpecialCasing.txt specifies և → ԵՒ

  • This is also used in the Armenian diaspora (hy-AREVMDA).

  • In Armenia (hy), և → ԵՎ is used.

I suggest that we change the low-level uppercasing (and titlecasing?) for language hy to use the form that is used in Armenia, but probably (at least initially) without code to check for the Western variant; the root behavior should continue to conform to the Unicode Standard.
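A minimal sketch of the proposed split, written in Python for illustration only; it is not ICU code. The helper name `to_upper_for_language` and its `lang` parameter are hypothetical. The sketch assumes Python's built-in `str.upper()`, which applies the unconditional SpecialCasing.txt mappings, as a stand-in for the root behavior.

```python
ECH_YIWN = "\u0587"             # և ARMENIAN SMALL LIGATURE ECH YIWN
UPPER_ECH_VEW = "\u0535\u054E"  # ԵՎ, the form used in Armenia


def to_upper_for_language(text: str, lang: str = "") -> str:
    """Hypothetical helper: uppercase with an Armenian (hy) tailoring."""
    if lang == "hy":
        # Tailoring: map և to capital ech + capital vew before the
        # default mapping runs, so the default never sees the ligature.
        text = text.replace(ECH_YIWN, UPPER_ECH_VEW)
    # Root behavior: Unicode default full case mapping (և → ԵՒ).
    return text.upper()
```

With this sketch, `to_upper_for_language("և")` keeps the Unicode default ԵՒ, while `to_upper_for_language("և", "hy")` yields ԵՎ. Titlecasing is left out because the proposal above leaves it as an open question.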



Markus Scherer
October 9, 2018, 9:38 PM

There is now a new language subtag, so we need not check for the variant:

Type: language
Subtag: hy
Description: Armenian
Added: 2005-10-16
Suppress-Script: Armn
Comments: see also hyw

Type: language
Subtag: hyw
Description: Western Armenian
Added: 2018-03-08
Comments: see also hy

Type: variant
Subtag: arevela
Description: Eastern Armenian
Added: 2006-09-18
Deprecated: 2018-03-24
Preferred-Value: hy
Prefix: hy

Type: variant
Subtag: arevmda
Description: Western Armenian
Added: 2006-09-18
Deprecated: 2018-03-24
Preferred-Value: hyw
Prefix: hy

Markus Scherer
July 29, 2020, 10:52 PM

Newer GoogleIssue:146899260

Jungshik reported this to Unicode on 2013-oct-04. It was discussed in the 2013q4 UTC meeting, and the UTC decided at that time not to change the Unicode Standard case mappings, and punted the item to the following meeting. The 2014q1 meeting was busy and this topic was not taken up. There had been no recorded disposition.

This has come up again recently, but I had forgotten about the existing CLDR and ICU tickets. I took it to Unicode again, we got some expert feedback, and I wrote about it in the UTC #164 report from the properties & algorithms group. We discussed it today in the UTC #164 meeting.

Changing the default (root) case mapping behavior in Unicode is problematic because it affects the identity of the ech-yiwn character, and because the uppercase and titlecase mappings would be inconsistent with the Decomposition_Mapping and the Case_Folding which cannot be changed due to stability policies.
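The constraint can be observed from Python's standard library, used here only as a convenient view of the Unicode Character Database. This is an illustrative sketch, not part of any ICU change; the results depend on the Unicode data bundled with the interpreter.

```python
import unicodedata

ECH_YIWN = "\u0587"  # և ARMENIAN SMALL LIGATURE ECH YIWN

# Decomposition_Mapping: compatibility decomposition to ech + yiwn.
decomposition = unicodedata.decomposition(ECH_YIWN)

# Case_Folding (full): also ech + yiwn.
folded = ECH_YIWN.casefold()

# Default uppercase: capital ech + capital yiwn (ԵՒ).
default_upper = ECH_YIWN.upper()

# The form used in Armenia: capital ech + capital vew (ԵՎ).
armenia_upper = "\u0535\u054E"

# The default uppercase folds back to the same string as և itself;
# the ech + vew form does not, which is the inconsistency at issue.
default_round_trips = default_upper.casefold() == folded
armenia_round_trips = armenia_upper.casefold() == folded
```

Here `default_round_trips` is true and `armenia_round_trips` is false: a root mapping to ech+vew would no longer agree with the case folding of և.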

Changing the root case mapping behavior in ICU would make ICU root case mappings not conformant with Unicode.

The recommendation is to do what I originally proposed in this ticket: change the ICU case mapping for language “hy” (only) to ech+vew. This makes ICU technically non-conformant with Unicode for this language. However, we already have special case mappings for Dutch and Greek that are not covered by Unicode SpecialCasing.txt, UTC participants feel that this is acceptable, and SpecialCasing.txt already has a disclaimer leaving tailorings up to CLDR.

Note that the best CLDR could do is provide transform rules with such behaviors as a reference implementation that is not necessarily intended for runtime use.

Note also that there is a newer language subtag “hyw” for Western Armenian, so we need not look at variant subtags. I do not plan to exempt “hy-AREVMDA” from the new behavior.
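One way to picture the selection rule is a check that looks only at the primary language subtag. This is a hedged Python sketch; the function name `uses_ech_vew_uppercase` is hypothetical and does not correspond to any ICU API.

```python
def uses_ech_vew_uppercase(language_tag: str) -> bool:
    """Hypothetical check: only the primary language subtag is examined."""
    # Accept both BCP 47 ("hy-AM") and ICU-style ("hy_AM") separators.
    primary = language_tag.replace("_", "-").split("-")[0].lower()
    # Variant subtags such as "arevmda" are deliberately ignored;
    # Western Armenian is expected to use the separate subtag "hyw".
    return primary == "hy"
```

Under this assumption, “hy”, “hy-AM” and “hy-AREVMDA” all select the ech+vew mapping, while “hyw” and any other language keep the root behavior.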

Markus Scherer
July 29, 2020, 11:00 PM

For reference: (minutes)

[137-A45] Action item for Roozbeh Pournader: Report back to Jungshik Shin re his feedback on Oct 2 2013, about classification of comma variants, and Armenian.

[137-A46] Action item for Rick McGowan: Copy all feedback on the Armenian issue only (Jungshik Shin, dated Oct 4, 2013) into another document for review at the next UTC meeting. Include a reference to CLDR ticket 6736, for the committee. (feedback, deferred item from last time; first error report here)

Markus Scherer
July 30, 2020, 8:59 PM

Also GoogleIssue:131433116

Markus Scherer
August 29, 2020, 9:33 PM

