Likely region for Malay should be Malaysia, not Togo

Description

While working on https://unicode-org.atlassian.net/browse/CLDR-9994#icft=CLDR-9994 I noticed that the likely subtag for Malay [zlm] looks very suspicious. Currently, we have the following:

<likelySubtag from="zlm" to="zlm_Latn_TG"/> <!--{ Malay (individual language); ?; ? } => { Malay (individual language); Latin; Togo }-->

However, Ethnologue writes in https://www.ethnologue.com/language/zlm:

  • Malay: A language of Malaysia

  • Alternate Names: Bahasa Malayu, Colloquial Malay, Informal Malay, Local Malay, Malayu, Melayu

  • Population: 10,500,000 in Malaysia (2004 census). 10 million in Peninsular Malaysia, 506,000 in Sarawak, and 30,000 in Labuan. L2 users: 3,000,000 in Malaysia. Total users in all countries: 18,877,700 (as L1: 15,877,700; as L2: 3,000,000).

  • Writing: Arabic script, Naskh variant [Arab]. Latin script [Latn].

Instead of the above, we should thus probably have the following:

<likelySubtag from="zlm" to="zlm_Latn_MY"/> <!--{ Malay (individual language); ?; ? } => { Malay (individual language); Latin; Malaysia }-->

Activity


Michelle Perham March 25, 2025 at 6:13 PM

All tickets need a reviewer. Assigning Sascha since she originally reported the issue.

Michelle Perham March 26, 2025 at 6:21 PM

At the end of a release we have a BRS item to confirm that all Done/Fixed issues have a reviewer assigned in Jira. I caught a few this week and in most cases assigned the person who reviewed the PR in GitHub.

Conrad Nied March 10, 2025 at 9:06 PM

und_Arab_SD is handled but does not appear in the list. ICU asked us to compress likelySubtags.xml; in this case und_Arab_SD returns the same result as und_SD, so only the und_SD entry is kept.

und_SD → ar_Arab_SD. Standard Arabic is preferred in written internet context over vernacular Sudanese Arabic.
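
A minimal sketch of why the explicit und_Arab_SD entry is redundant, assuming a simplified lookup order (full tag, then language plus region, then language plus script, then language alone). The `maximize` helper and the two-entry table are hypothetical stand-ins for the real likelySubtags.xml data and the UTS #35 procedure, which may differ in detail:

```python
# Hypothetical excerpt of the likely-subtags data: only entries named in this ticket.
LIKELY = {
    "und_SD": "ar_Arab_SD",
    "zlm": "zlm_Latn_MY",
}


def maximize(tag, table=LIKELY):
    """Simplified add-likely-subtags: fill missing fields from the first matching key.

    Assumes tags look like lang[_Script][_REGION] and that every table value
    has all three fields. Returns None when nothing matches.
    """
    parts = tag.split("_")
    lang = parts[0]
    script = next((p for p in parts[1:] if len(p) == 4), "")
    region = next((p for p in parts[1:] if len(p) in (2, 3)), "")

    # Assumed lookup order, most specific key first.
    candidates = [
        (lang, script, region),
        (lang, "", region),
        (lang, script, ""),
        (lang, "", ""),
    ]
    for cand in candidates:
        key = "_".join(p for p in cand if p)
        if key in table:
            m_lang, m_script, m_region = table[key].split("_")
            # Keep the fields the caller supplied; fill only the missing ones.
            return "_".join([
                m_lang if lang == "und" else lang,
                script or m_script,
                region or m_region,
            ])
    return None


# Without an explicit und_Arab_SD entry, the und_SD entry supplies the answer.
compressed = maximize("und_Arab_SD")

# Adding the explicit entry back does not change the result, so it can be dropped.
explicit = maximize("und_Arab_SD", {**LIKELY, "und_Arab_SD": "ar_Arab_SD"})
```

Under these assumptions both calls return the same tag, which is the sense in which the und_Arab_SD listing carries no extra information.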

Mihai Nita March 10, 2025 at 8:43 PM

Shouldn’t

<likelySubtag from="und_Arab_TG" to="apd_Arab_TG"/> <!--?‧Arabic‧Togo ➡ Sudanese Arabic‧Arabic‧Togo-->

become

<likelySubtag from="und_Arab_SD" to="apd_Arab_SD"/> <!--?‧Arabic‧Sudan ➡ Sudanese Arabic‧Arabic‧Sudan-->

instead of being deleted?

Conrad Nied October 1, 2024 at 6:06 PM

Landed the fix in the DDL branch ddl/v47

David Rowe September 5, 2024 at 7:46 PM

Documentation (https://en.wikipedia.org/wiki/Malay_language , https://www.ethnologue.com/language/zlm/ ) indicates that “zlm” should have region “MY”, as the reporter suggests.

The apparent cause is that GenerateMaximalLocales.java has an entry “zlm_Latn_TG”. It’s not clear whether this should be deleted or changed to “zlm_Latn_MY”. Can you advise on how this should be fixed?

Note that exemplars/main/ contains exemplar information for zlm.xml and zlm_Arab.xml.

Note: SIL’s langtags also lists zlm-Latn-TG, but gets that information from CLDR.

Fixed

Details

Phase: pre-sub

Created January 11, 2019 at 4:57 AM
Updated March 26, 2025 at 6:21 PM
Resolved October 1, 2024 at 6:06 PM