Open issues

Word Break rules, $dictionary should not include Hangul characters
ICU-20812
ICU4C: Add support for building the samples with ARM and ARM64 (non-UWP)
ICU-20788
umutex.h:52:17: error: 'dllexport' attribute ignored on explicit instantiation definition [-Werror,-Wignored-attributes]
ICU-20770
Build failure with /Zc:twoPhase and vs2017
ICU-20733
Save supplementalMetadata aliases instead of ICU deprecates in locale_dependencies?
ICU-20697
Change transliterator IDs from keys to values in resource bundle
ICU-20695
u_sprintf bug
ICU-20691
Build script fails at multiple points in Windows Cygwin/MSVC environment
ICU-20677
Transliterator createFromRules should work with no translit data
ICU-20673
Add support to generate ICU data DLL for Windows arm64
ICU-20670
Add data/resource tracing to ICU4J
ICU-20656
uloc.cpp _findIndex should use binary search
ICU-20644
Add valgrind continuous build
ICU-20626
Building on Windows fails (Cygwin with MSVC compiler, not GCC or Clang)
ICU-20614
CLONE - C API for UNumberRangeFormatter
ICU-20583
Provide a static method to enumerate all possible KEYWORDS
ICU-20546
Compilation with MSYS2 + MSVC 2019 "Couldn't create the udata mappings/cns-11643-1992.cnv"
ICU-20545
udata_setCommonData checks should be more robust around versioning
ICU-20504
Make tzID in TimeZone::createTimeZone case insensitive
ICU-20483
Change StaticUnicodeSets to automatically load new parseLenients
ICU-20428
icu4c 63.1 date format is significantly slower as compared to icu4c 56.
ICU-20427
Enhance tests for U_HIDE_ conditionals to catch issues like ICU-20306
ICU-20403
Add ECMAScript-specific APIs to NumberFormatter
ICU-20349
Enable IEEE floating-point interpretation of doubles
ICU-20341
Locale::forLanguageTag() lost other value in -x while there are "lvariant" in it.
ICU-20327
Allow Formattable to wrap user-defined objects
ICU-20275
Calendar.fieldDifference() bug
ICU-20268
C API for UNumberRangeFormatter
ICU-20261
Use EUC-JP table (for JIS X 208 portion) in ISO-2022-JP converter
ICU-20251
Add support for custom currencies and currency symbols to NumberFormatter
ICU-20218
Add cherry-pick tooling to jira-ticket web UI
ICU-20201
@internal ≠ do not use (redux)
ICU-20190
Change hard coded "no data" data
ICU-20131
Add validation of LanguageTags to ICU4C (C and C++)
ICU-20105
Add 32 bit ICU4C build for Linux to the CI build system
ICU-20101
Document ownership behavior of ures_getByKeyWithFallback and related methods
ICU-20064
Cygwin build in bad shape
ICU-20060
"ucnv_toUChars" function returns different output (destLength) in ICU 61.1 from ICU 59.1
ICU-20045
CollatorServiceShim$CService holds strong reference of collation data permanently
ICU-20031
Currencies with $ decimal (PTE, CVE) have wrong field types
ICU-20028
Improve plurals support on compact decimal notation
ICU-13836
Add Normalizer2.regionMatches
ICU-13819
Limit environment test to selected locales
ICU-13806
Add getLocale method to LocalizedNumberFormatter?
ICU-13767
Portable NativeClient double conversion support
ICU-13750
API to undo forced Java Locale usage of deprecated language codes
ICU-13715
Add a C wrapper for TimeZone class
ICU-13706
Move BasicTimeZone::getOffsetFromLocal() to its parent TimeZone
ICU-13705
SPELLOUT formatting regression for fractional digits in pl-PL in ICU4J 57.1
ICU-13688
Add an option to uconv to reverse transliteration
ICU-13680

Word Break rules, $dictionary should not include Hangul characters

Description

The definition of the $dictionary characters in the word break rules includes Hangul characters. It probably should not, as there is no associated dictionary for Hangul. (The $dictionary set defines the characters that the RBBI engine dispatches to dictionary-based breaking when it encounters them during rule-based breaking.)
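
To illustrate the dispatch described above, here is a minimal toy model in Python. It is not ICU code: the function names and the simplified character ranges are assumptions made for illustration, and the real $dictionary definition in word.txt is broader. It only shows which runs of text would be handed to dictionary-based breaking with and without Hangul syllables in the set.

```python
# Toy model of the dictionary hand-off; NOT the ICU implementation.
# Assumption: the two character tests below are simplified stand-ins for
# the sets defined in word.txt.

def is_hangul_syllable(ch):
    # Hangul Syllables block, U+AC00..U+D7A3.
    return 0xAC00 <= ord(ch) <= 0xD7A3

def is_han(ch):
    # Only the basic CJK Unified Ideographs block, for brevity.
    return 0x4E00 <= ord(ch) <= 0x9FFF

def dictionary_runs(text, in_dictionary_set):
    """Return (start, end) spans of maximal runs of characters in the set.

    Each span models a stretch of text that a rule-based engine would
    hand off to dictionary-based breaking.
    """
    runs = []
    start = None
    for i, ch in enumerate(text):
        if in_dictionary_set(ch):
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(text)))
    return runs

def with_hangul(ch):
    # Models the current situation: Hangul is part of the set.
    return is_han(ch) or is_hangul_syllable(ch)

def without_hangul(ch):
    # Models the proposed change: Hangul is left to the ordinary rules.
    return is_han(ch)

sample = "한국어 and 中文"
runs_current = dictionary_runs(sample, with_hangul)
runs_proposed = dictionary_runs(sample, without_hangul)
```

In this model, `runs_current` contains a span for the Hangul run at the start of `sample` as well as one for the Han run at the end, while `runs_proposed` contains only the Han span. Whether the resulting word boundaries differ in ICU itself still has to be checked against the actual break iterator.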

See https://github.com/unicode-org/icu/blob/master/icu4c/source/data/brkitr/rules/word.txt#L64

Some additional investigation is needed to see exactly how Hangul sequences are broken into words now, whether it is correct, and whether it changes when changing the $dictionary definition.

Status

Assignee

Craig Cornelius

Reporter

Andy Heninger

Labels

None

Reviewer

None

Time Needed

Days

Start date

None

Components

Fix versions

Priority

assess