Modify the GenerateSubdivisions tool for stability


We need to modify the tool to read in the previous version's data, and keep all old IDs (in case ISO changes them).

This doesn't have to be done in CLDR 28 (since it is our first time), but MUST be done for CLDR 29.
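A rough sketch of the intended stability rule, not the actual tool code: every ID from the previous release is carried forward, and any ID that ISO no longer lists is marked deprecated rather than dropped. The function name, the status strings, and the sample IDs are all assumptions made up for illustration.

```python
def merge_subdivision_ids(previous_ids, current_iso_ids):
    """Return {id: status}, keeping every ID seen in the previous release.

    Hypothetical helper: status is "regular" if ISO still lists the ID,
    otherwise "deprecated". No ID from previous_ids is ever dropped.
    """
    previous_ids = set(previous_ids)
    current_iso_ids = set(current_iso_ids)
    merged = {}
    for sid in sorted(previous_ids | current_iso_ids):
        merged[sid] = "regular" if sid in current_iso_ids else "deprecated"
    return merged


# Made-up IDs: "xx-a" was dropped by ISO, "xx-c" is new.
previous = {"xx-a", "xx-b"}
current = {"xx-b", "xx-c"}
result = merge_subdivision_ids(previous, current)
```

With these inputs, "xx-a" stays in the output as deprecated instead of disappearing.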






May 10, 2019, 12:03 AM
Trac Comment 8 by —2016-02-27T07:54:44.524Z

Spot-checked a couple cases; looks good.

May 10, 2019, 12:03 AM
Trac Comment 7 by —2016-01-19T13:07:26.403Z

For review, probably the most interesting are:

  • trunk/common/validity/subdivision.xml: all valid IDs. Old IDs from the previous release should remain, but be deprecated.

  • trunk/common/supplemental/supplementalMetadata.xml: replacements for deprecated IDs.

  • trunk/common/supplemental/subdivisions.xml: containment information. Deprecated IDs are dropped.

  • trunk/common/subdivisions/en.xml: English names. Names for deprecated subdivisions are retained.

In trunk/common/supplemental/supplementalMetadata.xml, the commented-out lines are generated cases where replacements could be added after some research (and the line uncommented).

We'd do that if:
a) the new and old code mean essentially the same thing: it was just a name change,
b) the old code was broken into 2 or more new codes,
c) an old code was merged into a new code.

Where lines are just redrawn, and there is no superset/subset/equality relation, we don't add any replacements.
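To make the three cases concrete, here is a minimal sketch. It assumes each code's territory can be modeled as a set of arbitrary area units, which is purely an illustration device; in practice this is a manual research step, and nothing here reflects how the tool decides. All names are hypothetical.

```python
def find_replacements(old_area, new_areas):
    """Return the new IDs that could replace an old code, or [] if none.

    old_area:  set of area units covered by the deprecated code (assumed model)
    new_areas: dict mapping new ID -> set of area units
    """
    old_area = set(old_area)
    # (a) same territory, so effectively just a code or name change
    for new_id, area in sorted(new_areas.items()):
        if set(area) == old_area:
            return [new_id]
    # (b) old code split into 2 or more new codes that together cover it
    parts = sorted(nid for nid, area in new_areas.items() if set(area) < old_area)
    if len(parts) >= 2:
        covered = set().union(*(new_areas[p] for p in parts))
        if covered == old_area:
            return parts
    # (c) old code merged into a larger new code
    for new_id, area in sorted(new_areas.items()):
        if old_area < set(area):
            return [new_id]
    # Lines redrawn with no equality/subset/superset relation: no replacement
    return []


renamed = find_replacements({1, 2}, {"n1": {1, 2}})
split = find_replacements({1, 2, 3}, {"n1": {1}, "n2": {2, 3}})
merged = find_replacements({1}, {"n1": {1, 2}})
redrawn = find_replacements({1, 2}, {"n1": {2, 3}, "n2": {1, 4}})
```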

May 10, 2019, 12:03 AM
Trac Comment 5 by —2016-01-16T22:32:27.211Z

Also, need to research the deprecated codes, to see if any can be mapped to new codes (that is, whether the form of the code changed, but the designation remained).

May 10, 2019, 12:03 AM
Trac Comment 4 by —2016-01-16T22:22:27.055Z

Did the ordering.

However, forgot to mention that we need to modify GenerateSubdivisions so that when it generates the English file:

1. It uses the old English name if there was one.
2. Otherwise, it uses the new ISO name.
3. The generated data can then be diffed to see where the ISO name needs to be modified for consistency of style.
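The name selection above could look roughly like this. It is a sketch under assumptions: names are held in plain dicts keyed by subdivision ID, and the function name and sample data are invented. IDs that only exist in the old data keep their old name, on the assumption that names for deprecated subdivisions are retained.

```python
def choose_english_names(old_names, iso_names):
    """Pick an English name per ID: the old name if any, else the ISO name.

    Returns (chosen, from_iso), where from_iso lists the IDs whose name
    came from ISO, i.e. the ones worth checking for style consistency.
    """
    chosen = {}
    from_iso = []
    for sid in sorted(set(old_names) | set(iso_names)):
        if sid in old_names:
            chosen[sid] = old_names[sid]      # step 1: keep the old name
        else:
            chosen[sid] = iso_names[sid]      # step 2: fall back to ISO
            from_iso.append(sid)
    return chosen, from_iso


# Made-up data for illustration only.
old = {"xx-a": "Alpha", "xx-b": "Beta"}
iso = {"xx-b": "BETA (Province)", "xx-c": "Gamma"}
names, review = choose_english_names(old, iso)
```

The `review` list is a cheap stand-in for step 3: it points at the entries where the ISO wording was taken as-is.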

May 10, 2019, 12:03 AM
Trac Comment 3 by —2016-01-16T22:05:14.967Z

As it turns out, the easiest modification was to GenerateValidity instead, with smaller changes to GenerateSubdivisions.

Also modified the following (in yellow), so that needs review.

Still to do:

  • Add a test that all identifiers in the validity/X.xml file from the previous release are present in the subsequent release. (Same for BCP47).

  • Alphabetize the subdivision containment. The ISO order apparently doesn't mean anything, since they don't maintain it.

  • Redo everything after the Language Subtag registry is updated, to make sure that works.
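The first to-do item (the ID stability test) might be sketched as follows. This assumes the IDs from each release's validity files have already been loaded into sets keyed by type; the loader is not shown, and the function name and sample data are hypothetical.

```python
def missing_ids_by_type(previous, current):
    """Return {type: sorted IDs present in the previous release but absent now}.

    previous, current: dict mapping a validity type (e.g. "subdivision")
    to a set of IDs. Types with nothing missing are omitted.
    """
    missing = {}
    for id_type, old_ids in previous.items():
        gone = sorted(set(old_ids) - set(current.get(id_type, ())))
        if gone:
            missing[id_type] = gone
    return missing


# Made-up data: "xx-a" vanished between releases, which the test should flag.
prev_release = {"subdivision": {"xx-a", "xx-b"}, "region": {"XX"}}
next_release = {"subdivision": {"xx-b", "xx-c"}, "region": {"XX", "YY"}}
problems = missing_ids_by_type(prev_release, next_release)
```

A real test would fail whenever `problems` is non-empty; new IDs in the later release are fine.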




Mark Davis


Mark Davis


Sascha Brawer

