Clarify interpretation of missing or duplicate types for -u- extension keys


Deleted Component: xxx-spec

RFC 6067 specifies that key subtags in a Unicode extension subtag sequence may be followed by zero or more type subtags. UTS 35 specifies the permissible type subtag values for each key subtag, and also specifies the interpretation of a type subtag sequence following the vt key.

However, I haven't been able to find a specification saying
(a) what should happen if a key subtag isn't followed by any type subtag,
(b) what should happen if a key subtag other than vt is followed by multiple valid type subtags.
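
To make the two cases concrete, here is a minimal sketch of how a parser might collect the keywords of a -u- extension. The function name parse_u_keywords is hypothetical, and the subtag length rules (2-character keys, longer type subtags, single-character singletons ending the extension) are my reading of RFC 6067, not a reference implementation.

```python
def parse_u_keywords(tag):
    """Return {key: [type subtags]} for the -u- extension of a language tag.

    Sketch only: attributes (subtags before the first key) are ignored,
    and a repeated key appends its types to the same list.
    """
    keywords = {}
    current_key = None
    in_u = False
    for subtag in tag.lower().split("-")[1:]:
        if len(subtag) == 1:
            # A singleton ends the -u- extension; "x" starts private use.
            if in_u or subtag == "x":
                break
            in_u = subtag == "u"
            continue
        if not in_u:
            continue
        if len(subtag) == 2:
            # Two-character subtags inside the extension are keys.
            current_key = subtag
            keywords.setdefault(current_key, [])
        elif current_key is not None:
            # Longer subtags after a key are its type subtags.
            keywords[current_key].append(subtag)
    return keywords


print(parse_u_keywords("en-u-kn"))                   # case (a): key with no type
print(parse_u_keywords("en-u-co-phonebk-standard"))  # case (b): two types after co
```

The parser itself has no trouble producing an empty list for case (a) or a two-element list for case (b); what is missing is a statement of what those results mean.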

I originally contacted Mark with the comments above, and he replied with the following information (and asked me to file this ticket):

The values for each key are specified. Currently all of the keys require exactly one value, with the following exceptions. However, that is not clearly stated in a single place in the LDML doc.

no value:

Key/type definitions are discussed below. For information on the process for adding a new key/type, see [LocaleProject]. If the type is not included, and one of the possible type values is "true", then that value is assumed. Note that the default for a key with a possible "true" value is often "false", but may not always be.
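
The rule quoted above could be applied roughly as follows. This is a sketch under assumptions: effective_type and keys_accepting_true are hypothetical names, the set of keys that permit "true" has to be supplied by the caller from the UTS 35 data, and kn is used in the example on the assumption that "true" is one of its permitted values.

```python
def effective_type(key, types, keys_accepting_true):
    """Resolve the type for a key given the type subtags that followed it.

    keys_accepting_true is a caller-supplied set (hypothetical here) of the
    keys whose permitted type values include "true".
    """
    if types:
        # Explicit type subtags are joined back into one type value.
        return "-".join(types)
    if key in keys_accepting_true:
        # Missing type, and "true" is a possible value: assume "true".
        return "true"
    # Missing type with no "true" value: the quoted text does not say.
    return None


print(effective_type("kn", [], {"kn"}))         # type omitted
print(effective_type("kn", ["false"], {"kn"}))  # type given explicitly
```

Returning None for the last case is only a placeholder; it marks the situation that question (a) asks the specification to settle.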

more than one value:

The type name "CODEPOINTS" is reserved for a variable representing Unicode code point(s). The syntax is:
In addition, no codepoint may exceed 10FFFF. For example, "00A0", "300b", "10D40C" and "00C1-00E1" are valid, but "A0", "U060C" and "110000" are not.
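
The examples in that passage can be checked with a small validator. Since the syntax itself is not reproduced above, the grammar used here (each code point is 4 to 6 hex digits, with multiple code points separated by hyphens) is an assumption inferred from the listed examples; is_valid_codepoints is a hypothetical name.

```python
import re

# Assumed from the examples: a code point is 4 to 6 hexadecimal digits.
_CODEPOINT = re.compile(r"[0-9A-Fa-f]{4,6}")


def is_valid_codepoints(value):
    """Check a CODEPOINTS value: hyphen-separated code points, none above 10FFFF."""
    return all(
        _CODEPOINT.fullmatch(part) is not None and int(part, 16) <= 0x10FFFF
        for part in value.split("-")
    )


for example in ["00A0", "300b", "10D40C", "00C1-00E1", "A0", "U060C", "110000"]:
    print(example, is_valid_codepoints(example))
```

Under that assumed grammar, the first four examples are accepted and the last three are rejected, matching the quoted text.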






May 10, 2019, 5:17 AM
Trac Comment 2 by —2012-02-02T18:50:42.000Z

There was a description of the missing-type case in the "Language/Locale Field Definitions" table in section 3.
I added some explanations of valid and invalid uses of multi-subtag type values.





Yoshito Umaoka



Mark Davis
