XML data files may be broken due to the CLDR GitHub migration


contains a link http://www.unicode.org/repos/cldr/trunk/common/dtd/ldmlICU.dtd to DTD file in CLDR.

The link points to the old SVN repository, which has since been migrated to GitHub. It now returns an HTTP 302 redirect to https://github.com/unicode-org/cldr/blob/master/common/dtd/ldml.dtd, which appears to be an HTML page rather than the DTD itself.

Here is the error. Changing the link to https://raw.githubusercontent.com/unicode-org/cldr/master/common/dtd/ldmlICU.dtd looks like it would fix the error.
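
For illustration, here is a rough Python sketch of the proposed change: swapping the old SVN URL for the raw GitHub one in a file's text, plus a crude check for whether a fetched "DTD" is really an HTML page. The two URLs come from this report; the function names and the sample DOCTYPE line are hypothetical, and the exact layout of the real data files is an assumption.

```python
# URLs taken from the report above.
OLD_DTD_URL = "http://www.unicode.org/repos/cldr/trunk/common/dtd/ldmlICU.dtd"
NEW_DTD_URL = (
    "https://raw.githubusercontent.com/unicode-org/cldr/master/common/dtd/ldmlICU.dtd"
)


def rewrite_dtd_url(xml_text: str) -> str:
    """Return xml_text with the old SVN DTD URL replaced by the raw GitHub URL."""
    return xml_text.replace(OLD_DTD_URL, NEW_DTD_URL)


def looks_like_html(body: str) -> bool:
    """Crude heuristic: does the fetched body start like an HTML page?"""
    head = body.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


# Hypothetical sample file content (assumed DOCTYPE layout).
sample = (
    '<?xml version="1.0" encoding="UTF-8" ?>\n'
    '<!DOCTYPE ldml SYSTEM "' + OLD_DTD_URL + '">\n'
    "<ldml/>\n"
)
fixed = rewrite_dtd_url(sample)
print(fixed)
```

This only edits text in memory. Applying it across the repository would mean reading and writing each XML file, which is left out here.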


David Beaumont
September 18, 2019, 5:39 PM

TL;DR: No, it’s fine.

That version number relates to the version of CLDR from which the DTD information for the ICU “specials” comes. It’s not necessarily the same as the version of the data being processed, and it doesn’t need to be updated unless the ICU “specials” data needs to access new things in its schema.

While not finished yet, there’s some progress towards decoupling the core CLDR data and the non-CLDR “specials” data in terms of DTDs and other semantics.

Neil Fuller
September 18, 2019, 7:49 AM

I think the commit has added a bit of a maintenance headache: the path encoded in all the files is CLDR version-specific. Updating to CLDR 35.2 / 36 will mean changing all the XML files again...?

It could also make version control ops like cherry-picking from one branch to the next more difficult.

