(This is the emoji component just because that's where the annotations are)
I noticed that the keywords for the symbols added in v38 are pretty bad. For example,
<annotation cp="↸">north west arrow long bar</annotation>
<annotation cp="↸" type="tts">north west arrow long bar</annotation>
This means a search for that character will only match if the entire name (north west arrow long bar) is matched. It should be something like:
<annotation cp="↹">leftwards | arrow | bar | rightwards</annotation>
The names could also be improved, e.g. "north west" => "northwest". Remember also that names and keywords are not restricted to ASCII letters.
The file is:
https://github.com/unicode-org/cldr/blob/master/common/annotations/en.xml
The relevant items appear to be all before line 721, and have non-tts lines without | in them.
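A rough way to list the affected entries, based on the criterion above (non-tts annotations with no | in them). This is only a sketch: the inline sample stands in for en.xml, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed sample in the shape of common/annotations/en.xml (not the real file).
SAMPLE = """<ldml><annotations>
<annotation cp="↸">north west arrow long bar</annotation>
<annotation cp="↸" type="tts">north west arrow long bar</annotation>
<annotation cp="↹">leftwards | arrow | bar | rightwards</annotation>
</annotations></ldml>"""

def unsplit_keyword_annotations(xml_text):
    """Return (cp, text) for non-tts annotations whose text has no '|'."""
    root = ET.fromstring(xml_text)
    return [
        (a.get("cp"), a.text)
        for a in root.iter("annotation")
        if a.get("type") != "tts" and "|" not in (a.text or "")
    ]

candidates = unsplit_keyword_annotations(SAMPLE)
for cp, text in candidates:
    print(cp, text)
```

Note that this criterion would also list legitimate single-keyword entries, so the output still needs a manual look.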
This issue has been bulk punted to 48 and will be re-triaged. If you feel strongly it should be fixed, please advocate for it.
CLDR TC accepted, 2024-05-08
Moving to esub for now; needs to be assessed for version and phase
Conclusion from design group:
To make things simple, we could change space to “ | ” in all non-emoji search keywords. That would at least be better than current behavior.
Then fix up odd cases like north | west => northwest
Split out a ticket for Andrew Glass to improve readability of tts for non-emoji.
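The keyword part of the conclusion above could look roughly like this. It is a sketch, not the agreed tooling: the function name and the MERGE_PAIRS table are hypothetical, and only the north/west pair comes from this ticket.

```python
# Hypothetical fix-up table for the "odd cases"; only north/west is from the ticket.
MERGE_PAIRS = {("north", "west"): "northwest"}

def split_keywords(text):
    """Turn a space-separated name into ' | '-separated keywords.

    Text that already contains '|' is returned unchanged.
    """
    if "|" in text:
        return text
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in MERGE_PAIRS:
            # Merge the two words into one keyword instead of splitting them.
            out.append(MERGE_PAIRS[pair])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return " | ".join(out)

print(split_keywords("north west arrow long bar"))
```

Merging the pair before joining avoids producing "north | west" first and then having to patch it up in a second pass.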
Bulk moving all issues to the next version which aren't in component type: brs, charts, docs, docs-spec