ICU4C default locale for "LANG=C.UTF-8" changed in v64 to "c" instead of "en_US_POSIX"

Description

With ICU4C version 63 on Linux, "C" and "C.UTF-8" are treated the same: both are mapped to "en_US_POSIX".

With version 64, only "C" is mapped to "en_US_POSIX". Now "C.UTF-8" maps to just "c".

This means that “C.UTF-8” is effectively no different from a default locale of “bogus” with ICU 64, and thus everything would fall back to root.

You can reproduce the issue by running the following commands and comparing the value of "<param name="locale.default">" in the output of the “icuinfo” program:

LANG=C.UTF-8 icuinfo

vs

LANG=C icuinfo
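To compare the two runs without reading the full output, the reproduction can be scripted. This is only a minimal sketch: the helper name extract_default_locale is hypothetical, and it assumes icuinfo prints the "locale.default" parameter and its value on a single line. It also assumes LC_ALL and LC_MESSAGES are unset, since they may take precedence over LANG.

```shell
#!/bin/sh
# Hypothetical helper: read icuinfo-style output on stdin and print only
# the value that follows <param name="locale.default">.
# Assumption: the tag and its value appear on the same line.
extract_default_locale() {
    sed -n 's/.*<param name="locale\.default">\([^<]*\).*/\1/p'
}

# Run icuinfo once per LANG value and print the reported default locale.
# Nothing is printed if icuinfo is not on PATH.
if command -v icuinfo >/dev/null 2>&1; then
    for lang in C C.UTF-8; do
        printf '%s -> %s\n' "$lang" \
            "$(LANG="$lang" icuinfo 2>/dev/null | extract_default_locale)"
    done
fi
```

Running the script against different ICU installations should make a change in the "C.UTF-8" row easy to spot.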

Here is the difference in table form, for 57, 63 and 64:

LANG      ICU 57        ICU 63        ICU 64
C         en_US_POSIX   en_US_POSIX   en_US_POSIX
C.UTF-8   en_US_POSIX   en_US_POSIX   c

Looking at the changes in ICU 64, this appears to be caused by the change for ICU-20187.

However, it is unclear if this is intentional or not.

Based on the comments on the ICU 64 API Proposal doc, it seems the intent was to keep some of the POSIX mappings, but I'm not sure whether this included the C locale with the codepage as part of the name. (It does seem odd to support a mapping for "C" but not for "C.UTF-8".)

Status

Assignee: Steven R. Loomis
Reporter: Jeff Genovy
Labels: None
Reviewer: Jeff Genovy
Time Needed: Minutes
Start date: None
Components:
Fix versions:
Priority: major