Consider the rules `&[variable|last]<x [symbol|maxVariable] &[variable|last]<y`. (Using the future `maxVariable` syntax; equivalent to tailoring the `[top|variable]` in ICU 51.)
Should both x and y go into the same "punct" reordering group, or should y go into the "symbol" group?
ICU currently uses the root collator's default variableTop even if a preceding rule has modified it. The LDML spec agrees: "The value can be further changed by using the variable-top setting. This takes effect, however, after the rules have been built, and does not affect any characters that are reset relative to the last-variable value when the rules are being built."
However, this seems wrong. The core principle of collation tailoring is that each rule builds on the current state after the preceding rules, so a reset to [variable|last] or [regular|first] should take the current maxVariable/variableTop into account.
It looks easy to make the builder code follow the core principle.
Mark and I talked in person recently, and I have thought about implementation possibilities. I will try to make `&[variable|last]`, `&[primary ignorable|last]`, etc. adjust to whatever is first/last after the preceding rules, like normal rules but unlike the current ICU behavior. If that turns out not to be too onerous, I will propose it for the LDML spec.
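To make the difference concrete, here is a toy model in Python of the two ways a builder could resolve `&[variable|last]`. It is only a sketch: the four-element "root" order, the `tailor` function, and the `adjust` flag are made up for illustration and are not ICU data or ICU API.

```python
# Toy model only: ROOT, tailor() and the adjust flag are hypothetical,
# not real ICU data structures or API.
ROOT = [("_", "variable"), ("-", "variable"), ("a", "regular"), ("b", "regular")]


def last_variable_index(order):
    """Index of the last element currently marked as variable."""
    return max(i for i, (_, kind) in enumerate(order) if kind == "variable")


def tailor(new_elements, adjust):
    """Apply one `&[variable|last] < e` rule per element, in sequence.

    adjust=False: every reset resolves to the root's last variable element.
    adjust=True:  every reset resolves to the last variable element of the
                  order as modified by the preceding rules.
    """
    order = list(ROOT)
    root_last_name = ROOT[last_variable_index(ROOT)][0]
    for name in new_elements:
        if adjust:
            anchor = last_variable_index(order)
        else:
            anchor = [n for n, _ in order].index(root_last_name)
        # The new element is inserted right after the reset position
        # and is itself variable.
        order.insert(anchor + 1, (name, "variable"))
    return [n for n, _ in order]


print(tailor(["x", "y"], adjust=False))  # ['_', '-', 'y', 'x', 'a', 'b']
print(tailor(["x", "y"], adjust=True))   # ['_', '-', 'x', 'y', 'a', 'b']
```

In this model, the non-adjusting variant puts `y` between the root's last variable element and `x`, while the adjusting variant treats `x` as the new last variable element and puts `y` after it.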
I still plan to drop the setting of `[top|variable]` via rules.
I still plan to add the `maxVariable` setting, but it will not affect `&[variable|last]`; it will take effect only at runtime, as if set via the runtime API.
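A rough Python sketch of that split between build time and runtime, applied to the rules from the description. Everything here is a simplification I made up for illustration: the group list, the tiny root order, the `build` and `is_variable` helpers, and the assumption that the root default for maxVariable is the punct group.

```python
# Toy model only: all names and data below are hypothetical, not ICU API.
GROUP_ORDER = ["space", "punct", "symbol", "currency", "digit"]
ROOT_DEFAULT_MAX_VARIABLE = "punct"  # assumed root default for this sketch
ROOT = [(" ", "space"), ("-", "punct"), ("+", "symbol"),
        ("$", "currency"), ("1", "digit"), ("a", "letter")]


def last_variable_index(order, max_variable):
    """Index of the last element whose group is at or below max_variable."""
    variable_groups = GROUP_ORDER[:GROUP_ORDER.index(max_variable) + 1]
    return max(i for i, (_, g) in enumerate(order) if g in variable_groups)


def build(rules):
    """Apply rules in order; maxVariable is only recorded as a setting."""
    order = list(ROOT)
    max_variable = ROOT_DEFAULT_MAX_VARIABLE
    for kind, value in rules:
        if kind == "maxVariable":
            max_variable = value  # no effect on later resets in this model
        elif kind == "after_last_variable":
            # The reset ignores the maxVariable setting but does see
            # elements added by preceding rules.
            anchor = last_variable_index(order, ROOT_DEFAULT_MAX_VARIABLE)
            order.insert(anchor + 1, (value, order[anchor][1]))
    return order, max_variable


def is_variable(order, max_variable, name):
    """Runtime check: is this element variable under the given setting?"""
    group = dict(order)[name]
    return (group in GROUP_ORDER
            and GROUP_ORDER.index(group) <= GROUP_ORDER.index(max_variable))


# &[variable|last]<x [symbol|maxVariable] &[variable|last]<y
order, max_variable = build([("after_last_variable", "x"),
                             ("maxVariable", "symbol"),
                             ("after_last_variable", "y")])
print(order[:5])  # [(' ', 'space'), ('-', 'punct'), ('x', 'punct'), ('y', 'punct'), ('+', 'symbol')]
print(is_variable(order, max_variable, "+"))               # True
print(is_variable(order, ROOT_DEFAULT_MAX_VARIABLE, "+"))  # False
```

In this sketch both `x` and `y` end up in the punct group, and the `maxVariable` setting only changes which elements the runtime check treats as variable.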
I implemented the desired behavior (comment:6) in my ICU collv2 branch; it required a small-ish amount of extra code. I added the following test cases.
I will document `maxVariable` in CLDR 25, together with the DTD changes for the new `<settings>` attribute.
No specific changeset here:
Done via r9755. As for comment:6, the spec already says: "Each special reset position adjusts to the effects of preceding rules, just like normal reset position strings. For example, if a tailoring rule creates a new collation element after `&[variable|last]` (via explicit tailoring after that, or via tailoring after the relevant character), then this new CE becomes the new last variable CE, and is used in following resets to `[variable|last]`."
Milestone 25rc deleted