require (sep alphanum{1,8}) for pu_extensions and other_extensions

Description

According to
http://www.unicode.org/reports/tr35/#pu_extensions
we have
pu_extensions = sep [xX] (sep alphanum{1,8})* ;
and
other_extensions = [alphanum-[tTuUxX]] (sep alphanum{2,8})* ;

Since the repetition is a * rather than a +, the syntax permits an extension with no subtags. That means we may need to support
appending "-x" to "en" to form "en-x" as a pu_extension, or
appending "-3" or "-a" to "en" to form "en-3" or "en-a" as other_extensions.

Markus Scherer wrote on Thu, 27 Dec 2018, 12:19:
"In my opinion, while the locale ID syntax permits empty extensions, they are useless (add no information), and we need not support building them."

Mark Davis wrote on 9 Jan 2019, 07:32, to me, Markus, Fredrik, icu-team:
"
Frank, can you file a ticket for that?

I agree that there isn't any need to have empty extensions. Two options are: forbid it syntactically, or indicate in the spec that the normalized form removes empty extensions."
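The second option (normalizing away empty extensions) could look roughly like the following Python sketch. It is not ICU or CLDR code: the function name `drop_empty_extensions` is made up for illustration, and it makes simplifying assumptions: any single-character subtag after the first subtag is treated as an extension singleton, everything after "x" is treated as private use, and the output always uses "-" as the separator.

```python
import re

def drop_empty_extensions(locale_id):
    """Remove extension singletons that carry no subtags (sketch only)."""
    subtags = re.split(r"[-_]", locale_id)
    result = []
    i = 0
    while i < len(subtags):
        tag = subtags[i]
        if i > 0 and len(tag) == 1:
            if tag in "xX":
                # Private use: every remaining subtag belongs to it.
                end = len(subtags)
            else:
                # Other singleton: its subtags run up to the next singleton.
                end = i + 1
                while end < len(subtags) and len(subtags[end]) > 1:
                    end += 1
            payload = subtags[i + 1:end]
            if payload:
                # Keep the extension only when it has at least one subtag.
                result.append(tag)
                result.extend(payload)
            i = end
        else:
            result.append(tag)
            i += 1
    return "-".join(result)

print(drop_empty_extensions("en-x"))        # en
print(drop_empty_extensions("en-a-x-foo"))  # en-x-foo
```

With this approach the syntax stays permissive, and "en-x" and "en" simply normalize to the same identifier.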

So I suggest we make the following change

- pu_extensions = sep [xX] (sep alphanum{1,8})* ;
- other_extensions = [alphanum-[tTuUxX]] (sep alphanum{2,8})* ;
+ pu_extensions = sep [xX] (sep alphanum{1,8})+ ;
+ other_extensions = [alphanum-[tTuUxX]] (sep alphanum{2,8})+ ;
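To illustrate the effect of the change, the old and new rules can be approximated with Python regular expressions. This is only a sketch under stated assumptions: `alphanum` is taken to be ASCII letters and digits, `sep` is taken to be "-" or "_", other_extensions is assumed to be preceded by a `sep` (the rule as quoted above has none), and the names `PU_STAR`, `PU_PLUS`, `OTHER_STAR`, `OTHER_PLUS`, and `accepts` are invented for this example.

```python
import re

# Assumptions (see lead-in): ASCII alphanum, sep is "-" or "_",
# and other_extensions starts with a sep like pu_extensions does.
SEP = r"[-_]"
ALNUM = r"[0-9A-Za-z]"
SINGLETON = r"[0-9A-SVWYZa-svwyz]"  # alphanum minus t, T, u, U, x, X

# Current rules: "*" allows zero subtags after the singleton.
PU_STAR = re.compile(rf"{SEP}[xX]({SEP}{ALNUM}{{1,8}})*")
OTHER_STAR = re.compile(rf"{SEP}{SINGLETON}({SEP}{ALNUM}{{2,8}})*")

# Proposed rules: "+" requires at least one subtag.
PU_PLUS = re.compile(rf"{SEP}[xX]({SEP}{ALNUM}{{1,8}})+")
OTHER_PLUS = re.compile(rf"{SEP}{SINGLETON}({SEP}{ALNUM}{{2,8}})+")

def accepts(rule, suffix):
    """Return True if the whole extension suffix matches the rule."""
    return rule.fullmatch(suffix) is not None

print(accepts(PU_STAR, "-x"), accepts(PU_PLUS, "-x"))        # True False
print(accepts(OTHER_STAR, "-a"), accepts(OTHER_PLUS, "-a"))  # True False
print(accepts(PU_PLUS, "-x-abc"))                            # True
```

Under the proposed rules, the suffixes "-x", "-3", and "-a" no longer match on their own, so "en-x" and "en-a" would be syntactically invalid.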

Be aware that there is another issue about ABNF and EBNF, so please also put down the correct version of the ABNF when fixing this. Thanks.

xpath

None

locale

None

Status

Priority

major

Assignee

Mark Davis

Reporter

TracBot

tracReporter

ftang@1d5920f4b44b27a8

Reviewer

Yoshito Umaoka

Labels

None

Components

Fix versions

phase

spec-beta