Handling of type and tfield in Locale canonicalization
This is needed to address https://unicode-org.atlassian.net/browse/ICU-21367
Here is the problem. The text says, without any condition:
"If the type is not included, then the type value "true" is assumed."
and then also:
"Any type or tfield value "true" is removed."
The issue is: if the input locale contains a type or tfield that is not "valid", what should we do?
For example, in the test cases of
<key name="ka" description="Collation parameter key for alternate handling" alias="colAlternate">
<type name="noignore" description="Variable collation elements are not reset to ignorable" alias="non-ignorable"/>
<type name="shifted" description="Variable collation elements are reset to zero at levels one through three"/>
What should the locale be canonicalized into, given that the only valid types for "ka" are "noignore" and "shifted"?
We should document that canonicalization doesn’t change an invalid locale into a valid one, and use this as an example.
Yes, I understand those are invalid. The question is: what should we do with a well-formed but invalid locale during the locale canonicalization process?
Thanks for the report. I do think we need to clarify this, but the status is derivable from the text.
"und-u-ka-yes" — invalid, since ‘yes’ is not a valid value for ka
"und-u-ka-true" — invalid, since ‘true’ is not a valid value for ka
"und-u-ka" — invalid, since the value “true” is assumed whenever there is no value, and ‘true’ is not a valid value for ka