Add collation test to verify that specific characters are present in zh stroke & pinyin

Description

Recent problems in the tools that process Unihan data to generate CLDR collation/zh.xml and transforms/Han-Latin.xml have occasionally caused certain basic characters to go missing from the stroke and pinyin collations and from the Han-Latin transform; see https://unicode.org/cldr/trac/ticket/10497.
The affected characters are often ones in the 4E00-9FFF block that resemble radicals in the CJK radicals block (2E80-2EFF).

This ticket is to add a sanity-check test that at least some of these characters are present in the CJK stroke and pinyin collations. For example:
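
The kind of presence check described above could be sketched roughly as follows. This is an illustration only, not the test data added for this ticket. It assumes the collation rules sit in `<collation type="..."><cr>` elements, and it uses a hypothetical, heavily shortened stand-in for collation/zh.xml. The function name `missing_chars` and the list of required characters are likewise made up for the example, and the plain substring check assumes each character is written out literally in the rules rather than covered by a range.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened stand-in for CLDR collation/zh.xml.
# The real file is far larger; the rule strings here are invented.
SAMPLE_ZH_XML = """\
<ldml>
  <collations>
    <collation type="pinyin"><cr><![CDATA[&[last regular]<*阿呵一乙人]]></cr></collation>
    <collation type="stroke"><cr><![CDATA[&[last regular]<*一丨乙人]]></cr></collation>
  </collations>
</ldml>
"""

# Illustrative choice of basic ideographs that resemble radicals
# (U+4E00, U+4E28, U+4E59, U+4EBA); not the ticket's actual list.
REQUIRED_CHARS = "一丨乙人"


def missing_chars(xml_text, collation_type, required):
    """Return the required characters that do not appear in the rules
    of the given collation type."""
    root = ET.fromstring(xml_text)
    for coll in root.iter("collation"):
        if coll.get("type") == collation_type:
            # Concatenate all rule text found under this collation element.
            rules = "".join(coll.itertext())
            return [ch for ch in required if ch not in rules]
    raise KeyError(f"no collation of type {collation_type!r}")


if __name__ == "__main__":
    for ctype in ("stroke", "pinyin"):
        print(ctype, missing_chars(SAMPLE_ZH_XML, ctype, REQUIRED_CHARS))
```

In the sample data the pinyin rules deliberately omit 丨, so the check reports it as missing for pinyin while stroke passes. A check against the real data would read the generated file instead of the inline sample.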

Activity

Peter Edberg
March 25, 2020, 3:56 PM

Yeah, the CLDR process is different; I was using that, sorry.

Markus Scherer
March 25, 2020, 3:16 PM

FYI: there is no need to put the ticket into “Reviewing” state if you have a PR that covers it. When the PR is approved and merged, simply close the ticket yourself.

http://site.icu-project.org/repository/gitdev#TOC-Review-commit-to-Unicode-master

  • If this was the last commit to finish work on the ticket, then go to Jira and close the ticket as Fixed.

  • You can optionally have someone (probably the same person as your PR assignee) review the ticket as well, but that's not normally necessary.

  • (We normally use ticket reviews for non-code changes, such as a non-coding task or a web site update for the User Guide etc.)

 

TracBot
July 1, 2018, 12:11 AM
Trac Comment 1 by —2018-05-23T18:23:09.018Z

Add lines to data driven collation test

Fixed

Assignee

Peter Edberg

Reporter

Peter Edberg

Components

Reviewer

Markus Scherer

Priority

medium

Time Needed

Hours

Fix versions