The dictionary-based break iterator code could use some refactoring and cleanup:
Give the Java and C++ code the same structure and organization, so that the two implementations can be compared directly and changes to one can be ported easily to the other. As it stands, the class structure differs, as does the way functionality is distributed across methods.
Reduce the amount of nearly identical code that has been copied and pasted into each of the break engine classes, one each for Thai, Lao and Khmer. (The CJK engine is significantly different.) The duplication makes maintenance error-prone, because any fix has to be replicated manually in each class.
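One possible shape for the de-duplication is sketched below, purely as an illustration. The names SharedDictionaryEngine, ThaiEngineSketch, LaoEngineSketch and KhmerEngineSketch are hypothetical and are not the existing ICU classes, and the greedy longest-match loop is a simplified stand-in for the real word-selection logic. The point is only that the shared algorithm lives in one base class, while each language class supplies the script-specific part (here, just the Unicode block it handles).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical base class: holds the logic that would otherwise be
// copied into every per-language engine.
abstract class SharedDictionaryEngine {
    private final Set<String> dictionary;
    private final int maxWordLength;

    protected SharedDictionaryEngine(Set<String> dictionary) {
        this.dictionary = dictionary;
        int max = 1;
        for (String word : dictionary) {
            max = Math.max(max, word.length());
        }
        this.maxWordLength = max;
    }

    // The only per-script hook in this sketch (an assumption; the real
    // engines also differ in tuning constants and character classes).
    protected abstract boolean handles(int codePoint);

    // Simplified greedy longest match. Returns the end offset of each word
    // found from the start of the text, stopping at the first character
    // the engine does not handle.
    final List<Integer> findBreaks(String text) {
        List<Integer> breaks = new ArrayList<>();
        int pos = 0;
        while (pos < text.length() && handles(text.codePointAt(pos))) {
            // Fallback when no dictionary word matches: advance one code point.
            int end = pos + Character.charCount(text.codePointAt(pos));
            int limit = Math.min(text.length(), pos + maxWordLength);
            for (int candidate = limit; candidate > end; candidate--) {
                if (dictionary.contains(text.substring(pos, candidate))) {
                    end = candidate;
                    break;
                }
            }
            breaks.add(end);
            pos = end;
        }
        return breaks;
    }
}

// Each language class shrinks to its script-specific data.
final class ThaiEngineSketch extends SharedDictionaryEngine {
    ThaiEngineSketch(Set<String> dictionary) { super(dictionary); }
    @Override protected boolean handles(int cp) { return cp >= 0x0E00 && cp <= 0x0E7F; }
}

final class LaoEngineSketch extends SharedDictionaryEngine {
    LaoEngineSketch(Set<String> dictionary) { super(dictionary); }
    @Override protected boolean handles(int cp) { return cp >= 0x0E80 && cp <= 0x0EFF; }
}

final class KhmerEngineSketch extends SharedDictionaryEngine {
    KhmerEngineSketch(Set<String> dictionary) { super(dictionary); }
    @Override protected boolean handles(int cp) { return cp >= 0x1780 && cp <= 0x17FF; }
}
```

With this layout, a fix to findBreaks is made once and applies to all three languages, and the same class split could be mirrored in C++ so that the two code bases line up method for method.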