collation rules: last variable after setting maxVariable

Description

Deleted Component: xxx-spec

Consider the rules `&[variable|last]<x [symbol|maxVariable] &[variable|last]<y`. (This uses the future `maxVariable` syntax; the equivalent in ICU 51 is tailoring `[top|variable]`.)

Should both x and y go into the same "punct" reordering group, or should y go into the "symbol" group?

ICU currently uses the root collator's default variableTop even if a preceding rule has modified it. The LDML spec agrees: "The value can be further changed by using the variable-top setting. This takes effect, however, after the rules have been built, and does not affect any characters that are reset relative to the last-variable value when the rules are being built."

However, this seems wrong. The core principle of collation tailoring is that each rule builds on the current state after the preceding rules, so a reset to [variable|last] or [regular|first] should take the current maxVariable/variableTop into account.

It looks easy to make the builder code follow the core principle.
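
To make the question concrete, here is a toy model of the two candidate behaviors. It is only a sketch: the root order, the group contents, and the names `ROOT`, `last_in_groups` and `build` are made up for illustration and are not ICU builder code. With `follow_current_state=False` the reset always uses the root default last variable; with `True` it takes the current maxVariable into account.

```python
# Toy reordering groups in root order (group names as in the ticket).
GROUPS = ["space", "punct", "symbol", "currency", "digit", "letter"]

# Hypothetical miniature root order: (string, group). Not real UCA/CLDR data.
ROOT = [(" ", "space"), ("-", "punct"), ("!", "punct"),
        ("+", "symbol"), ("=", "symbol"), ("$", "currency"),
        ("1", "digit"), ("a", "letter")]
ROOT_MAX_VARIABLE = "punct"  # assumed root default for this sketch


def last_in_groups(order, max_group):
    """Index of the last element whose group is at or below max_group."""
    limit = GROUPS.index(max_group)
    return max(i for i, (_, g) in enumerate(order) if GROUPS.index(g) <= limit)


def build(follow_current_state):
    """Apply &[variable|last]<x [symbol|maxVariable] &[variable|last]<y."""
    order = list(ROOT)
    max_variable = ROOT_MAX_VARIABLE
    root_last_variable = ROOT[last_in_groups(ROOT, ROOT_MAX_VARIABLE)]

    def tailor_after_last_variable(s):
        if follow_current_state:
            # Reset position reflects earlier rules and the current maxVariable.
            i = last_in_groups(order, max_variable)
        else:
            # Reset position is always the root default last variable.
            i = order.index(root_last_variable)
        # The new string inherits the group of the element it follows.
        order.insert(i + 1, (s, order[i][1]))

    tailor_after_last_variable("x")  # &[variable|last]<x
    max_variable = "symbol"          # [symbol|maxVariable]
    tailor_after_last_variable("y")  # &[variable|last]<y
    return order


fixed = build(follow_current_state=False)
current = build(follow_current_state=True)
print(dict(fixed)["y"])    # punct
print(dict(current)["y"])  # symbol
```

In this model the fixed policy puts both x and y into "punct" (with y sorting directly after the root last variable, before x), while the state-following policy puts y into "symbol".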

xpath

None

locale

None

Activity

TracBot
May 10, 2019, 3:08 AM
Trac Comment 6 by —2013-06-21T15:19:42.820Z

Mark and I talked in person recently, and I have thought about implementation possibilities. I will try to make `&[variable|last]`, `&[primary ignorable|last]`, etc. adjust to whatever is first or last after the preceding rules, as normal resets do and unlike the current ICU behavior. If it is indeed not too onerous to do so, I will propose it for the LDML spec.

I still plan to drop the setting of `[top|variable]` via rules.

I still plan to add the maxVariable setting. It will not affect `&[variable|last]`; it will take effect only at runtime, as if set via the runtime API.
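
A rough sketch of this plan, again as a toy model rather than ICU code (the class `ToyCollator`, its methods, and the miniature root order are all hypothetical): the builder tracks the last variable element as rules are applied, and `max_variable` is only consulted when classifying strings at runtime.

```python
GROUPS = ["space", "punct", "symbol", "currency", "digit", "letter"]

# Hypothetical miniature root order: (string, group). Not real UCA/CLDR data.
ROOT = [(" ", "space"), ("-", "punct"), ("!", "punct"),
        ("+", "symbol"), ("=", "symbol"), ("$", "currency"),
        ("1", "digit"), ("a", "letter")]


class ToyCollator:
    def __init__(self):
        self.order = list(ROOT)
        self.max_variable = "punct"  # assumed root default for this sketch
        limit = GROUPS.index(self.max_variable)
        # Builder state: last root element at or below the default maxVariable.
        self.last_variable = [s for s, g in ROOT if GROUPS.index(g) <= limit][-1]

    def tailor_after_last_variable(self, s):
        """Model &[variable|last]<s; s becomes the new last variable."""
        i = [t for t, _ in self.order].index(self.last_variable)
        self.order.insert(i + 1, (s, self.order[i][1]))
        self.last_variable = s

    def is_variable(self, s):
        """Runtime check: only here does max_variable matter."""
        group = dict(self.order)[s]
        return GROUPS.index(group) <= GROUPS.index(self.max_variable)


c = ToyCollator()
c.tailor_after_last_variable("x")  # &[variable|last]<x
c.max_variable = "symbol"          # [symbol|maxVariable]: stored, ignored by the builder
c.tailor_after_last_variable("y")  # &[variable|last]<y
print([s for s, _ in c.order])     # [' ', '-', '!', 'x', 'y', '+', '=', '$', '1', 'a']
print(c.is_variable("+"), c.is_variable("$"))  # True False
```

Here y follows x in "punct" because the reset adjusts to the preceding rule, while the maxVariable change only widens what counts as variable at runtime.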

TracBot
May 10, 2019, 3:08 AM
Trac Comment 7 by —2013-07-19T04:42:00.085Z

I implemented the desired behavior (comment:6) in my ICU collv2 branch; it required a small-ish amount of extra code. I added the following test cases.

TracBot
May 10, 2019, 3:08 AM
Trac Comment 9 by —2013-09-04T06:11:54.689Z

Will document `maxVariable` in CLDR 25, together with the DTD changes for the new `<settings>` attribute.

TracBot
May 10, 2019, 3:08 AM
Trac Comment 11 by —2014-02-21T21:19:41.061Z

No specific changeset here:

Done via r9755. As for comment:6, the spec already says: "Each special reset position adjusts to the effects of preceding rules, just like normal reset position strings. For example, if a tailoring rule creates a new collation element after `&[variable|last]` (via explicit tailoring after that, or via tailoring after the relevant character), then this new CE becomes the new last variable CE, and is used in following resets to `[variable|last]`."
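
The "tailoring after the relevant character" case from that quote can be illustrated with a small toy model. The root order and the helpers `last_variable` and `tailor_after` are assumptions made for this sketch, not spec or ICU definitions.

```python
# Hypothetical miniature root order: (string, is_variable). Illustrative only.
order = [(" ", True), ("-", True), ("!", True), ("+", False), ("a", False)]


def last_variable(order):
    """Return the last string currently marked variable."""
    return [s for s, v in order if v][-1]


def tailor_after(order, anchor, s):
    """Model `&anchor < s`: insert s right after anchor, inheriting its class."""
    i = [t for t, _ in order].index(anchor)
    order.insert(i + 1, (s, order[i][1]))


tailor_after(order, "!", "z")                   # tailoring after the relevant character
tailor_after(order, last_variable(order), "y")  # &[variable|last]<y now lands after z
print([s for s, _ in order])                    # [' ', '-', '!', 'z', 'y', '+', 'a']
```

Because z was tailored after the character that used to be the last variable, the following reset to `[variable|last]` resolves to z, not to the root element.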

TracBot
May 10, 2019, 3:08 AM
Trac Comment 14 by —2014-04-22T20:38:05.505Z

Milestone 25rc deleted

Priority

medium

Assignee

Markus Scherer

Reporter

Markus Scherer

Reviewer

Peter Edberg

Labels

Components

None

Fix versions

None

Phase

None