Fix gap in unit preferences (CLDR)

Description

The unit preferences spec and code do not handle two cases:

  1. There is no preference data for a given Quantity, such as for the unit ampere.

  2. There is no Quantity in CLDR for a given unit, like megabyte-per-minute (these are perfectly legitimate units, and can be converted, e.g., to bit-per-second).

In these cases, CLDR and ICU both returned null when the usage was set. That is clearly not a good solution: it requires users to check for null (currently in ICU), and then guess an appropriate unit to map to.

Just as we fall back gracefully to “default” if there is no matching usage, and fall back gracefully to “001” if there is no specific region preference, I think we should fall back gracefully if there is no preference data for a unit, to what is effectively “001” behavior (base units). Those will be metric, and correspond to scientific usage and most non-US usage, which is the most likely meaningful result if there is no preference data.
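The fallback chain described above can be sketched roughly as follows. This is a minimal illustration, not CLDR's or ICU's actual implementation: the tables `PREFERENCES`, `QUANTITY`, and `BASE_UNIT` and the function `preferred_unit` are hypothetical names with a few hand-picked entries, and real preference data also carries thresholds and skeletons that are omitted here.

```python
# Hypothetical, heavily simplified preference data: quantity -> usage -> region -> unit.
PREFERENCES = {
    "temperature": {
        "default": {"001": "celsius", "US": "fahrenheit"},
    },
}

# Hypothetical unit -> quantity map. A unit with no entry has no Quantity (case 2).
QUANTITY = {
    "fahrenheit": "temperature",
    "ampere": "electric-current",        # quantity exists, but no preference data (case 1)
    "kilocandela": "luminous-intensity",  # quantity exists, but no preference data (case 1)
}

# Hypothetical unit -> base unit map; in practice this comes from unit conversion data.
BASE_UNIT = {
    "fahrenheit": "kelvin",
    "ampere": "ampere",
    "kilocandela": "candela",
    "candela-per-byte": "candela-per-bit",
}


def preferred_unit(unit: str, usage: str, region: str) -> str:
    """Return a preferred unit identifier, never None."""
    quantity = QUANTITY.get(unit)            # None when the unit has no Quantity
    by_usage = PREFERENCES.get(quantity)     # None when there is no preference data
    if by_usage is None:
        # Proposed behavior: fall back to base units instead of returning null.
        return BASE_UNIT[unit]
    by_region = by_usage.get(usage) or by_usage["default"]  # unknown usage -> "default"
    return by_region.get(region) or by_region["001"]        # unknown region -> "001"
```

With these assumed tables, `preferred_unit("ampere", "default", "US")` yields `ampere` and `preferred_unit("candela-per-byte", "default", "US")` yields `candela-per-bit`, so the caller never has to check for null.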

Of course, we can add more preference data to cover these cases over time, if and when we find out that base units are not the best for some region / usage.

This corresponds to an ICU ticket. On the CLDR side, it involves spec and test generation.


I noticed this while working on CLDR-15954, and added a test in a CLDR PR for CLDR’s behavior (and ICU’s, to check what was happening there). I fixed that gap on the CLDR side, but ICU also needs a fix.

Error: (TestUnits.java:4581) : ICU unit pref, ampere 2.5 default en: expected "ampere", got null
Warning: (TestUnits.java:4589) #  an input unit whose quantity has no preference data should get base units
Error: (TestUnits.java:4581) : ICU unit pref, kilocandela 1.0 default en: expected "candela", got null
Warning: (TestUnits.java:4589) #  an input unit whose quantity has no preference data should get base units
Error: (TestUnits.java:4581) : ICU unit pref, candela-per-byte 1.0 default en: expected "candela-per-bit", got "This unit does not has a categorynull"
Warning: (TestUnits.java:4589) #  an input unit that has no quantity should get base units
Error: (TestUnits.java:4581) : ICU unit pref, candela-per-cubic-foot 1.0 default en: expected "candela-per-cubic-meter", got "This unit does not has a categorynull"
Warning: (TestUnits.java:4589) #  an input unit that has no quantity should get base units

The fix for https://unicode-org.atlassian.net/browse/CLDR-15954 will land a new test data file that can be used in ICU tests in the future (it was used to generate the output above). For now, it would be good to fall back gracefully for the two known cases.

The test data file currently reads:

# Format:
#   input-unit; amount; usage; languageTag; expected-unit; expected-amount # comment
#
#   • The amounts are both rationals
#   • The comment is optional (if it isn't present the # can be omitted)
#
# Use: Convert the Input amount & unit according to the Usage and Locale.
#   The result should match the Expected amount and unit.
#
# The input and expected output units are unit identifers; in particular, the output does not have further processing:
#   • no localization
fahrenheit; 1; default; en-u-rg-uszzzz-ms-ussystem-mu-celsius; celsius; -155/9 # mu > ms > rg > (likely) region
fahrenheit; 1; default; en-u-rg-uszzzz-ms-ussystem-mu-celsius; celsius; -155/9
fahrenheit; 1; default; en-u-rg-uszzzz-ms-metric; celsius; -155/9
fahrenheit; 1; default; en-u-rg-dezzzz; celsius; -155/9
fahrenheit; 1; default; en-DE; celsius; -155/9 # explicit region > likely region
fahrenheit; 1; default; en-US; fahrenheit; 1
fahrenheit; 1; default; en; fahrenheit; 1 # likely region = US
gallon-imperial; 2.5; fluid; en-u-rg-uszzzz-ms-metric; liter; 11.365225
gallon-imperial; 2.5; fluid; en-u-rg-dezzzz; liter; 11.365225
gallon-imperial; 2.5; fluid; en-DE; liter; 11.365225
gallon-imperial; 2.5; fluid; en-US-u-rg-uszzzz-ms-uksystem; gallon-imperial; 2.5 # ms-uksystem should behave like GB
gallon-imperial; 2.5; fluid; en-u-rg-gbzzzz; gallon-imperial; 2.5
gallon-imperial; 2.5; fluid; en-GB; gallon-imperial; 2.5
gallon-imperial; 2.5; fluid; en-u-rg-uszzzz-ms-ussystem; gallon; 1,420,653,125/473176473
gallon-imperial; 2.5; fluid; en-u-rg-uszzzz; gallon; 1,420,653,125/473176473
gallon-imperial; 2.5; fluid; en-US; gallon; 1,420,653,125/473176473
gallon-imperial; 2.5; fluid; en; gallon; 1,420,653,125/473176473 # likely region = US
ampere; 2.5; default; en; ampere; 2.5 # an input unit whose quantity has no preference data should get base units
pound-force-foot; 12,345; default; en; kilowatt-hour; 0.004649325714486427205
kilocandela; 1; default; en; candela; 1,000 # an input unit whose quantity has no preference data should get base units
candela-per-byte; 1; default; en; candela-per-bit; 0.125 # an input unit that has no quantity should get base units
candela-per-cubic-foot; 1; default; en; candela-per-cubic-meter; 1,953,125,000/55306341 # an input unit that has no quantity should get base units
foot; 1; default; de-u-mu-celsius; centimeter; 30.48 # a -mu unit that is not convertible from the input unit should get ignored
#pound; 28; default; en-u-mu-stone; stone; 2 # only temperature units are supported
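For anyone wanting to consume this file in a test harness, the format described in its header can be read with a small parser along these lines. This is a sketch under assumptions, not the code CLDR or ICU uses: `UnitPrefCase`, `parse_amount`, and `parse_line` are hypothetical names, and it assumes that commas in amounts are only digit-group separators and that `#` never appears inside a field.

```python
from fractions import Fraction
from typing import NamedTuple, Optional


class UnitPrefCase(NamedTuple):
    """One data line: input-unit; amount; usage; languageTag; expected-unit; expected-amount."""
    input_unit: str
    amount: Fraction
    usage: str
    language_tag: str
    expected_unit: str
    expected_amount: Fraction
    comment: str


def parse_amount(text: str) -> Fraction:
    """Parse a rational such as '2.5', '-155/9' or '1,420,653,125/473176473'."""
    # Assumption: commas are grouping separators only, so they can be dropped.
    return Fraction(text.replace(",", "").strip())


def parse_line(line: str) -> Optional[UnitPrefCase]:
    """Return a parsed case, or None for blank lines and comment-only lines."""
    data, _, comment = line.partition("#")  # everything after the first '#' is a comment
    data = data.strip()
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    unit, amount, usage, tag, expected_unit, expected_amount = fields
    return UnitPrefCase(
        unit, parse_amount(amount), usage, tag,
        expected_unit, parse_amount(expected_amount), comment.strip(),
    )
```

Keeping the amounts as `Fraction` values lets a test compare results exactly. For example, converting 1 fahrenheit to celsius as `(Fraction(1) - 32) * Fraction(5, 9)` equals the expected `-155/9` from the first data line without any floating-point tolerance.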

Activity

Mark Davis March 28, 2024 at 3:55 PM

Annemarie Apple 🍎 March 27, 2024 at 4:38 PM

Accepted per CLDR TC meeting 2024-03-27

Fixed by Other Ticket

Created March 18, 2024 at 5:58 PM
Updated March 28, 2024 at 3:55 PM
Resolved March 28, 2024 at 3:55 PM