Some ideas for reducing the size of break iterator rule files
Use bytes rather than 16-bit values in the state table when a byte is enough, which it is for our standard rule types. (The ICU 60 line break table is 59 character classes by 171 states, a possible savings of about 10 kB.)
Remove fluff from the stored rule string: strip extra spaces and unescape \u-escaped non-syntax characters in the rules. Possibly store it as UTF-8.
Markus is considering a byte-valued Trie table, which again would be enough for our standard break types.
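To make the 10 kB estimate concrete, here is a rough sketch of the arithmetic, assuming one cell per (state, character class) pair and ignoring any per-row bookkeeping fields or table headers. The function names are hypothetical and are not part of ICU.

```python
def state_table_bytes(num_classes, num_states, cell_width):
    """Approximate size of the transition cells only, in bytes."""
    return num_classes * num_states * cell_width


def narrowest_cell_width(table):
    """Return 1 if every cell fits in a byte, otherwise 2 (16-bit cells)."""
    largest = max((cell for row in table for cell in row), default=0)
    return 1 if largest <= 0xFF else 2


# ICU 60 line break table dimensions quoted in this ticket.
wide = state_table_bytes(59, 171, 2)    # 16-bit cells: 20178 bytes
narrow = state_table_bytes(59, 171, 1)  # 8-bit cells: 10089 bytes
savings = wide - narrow                 # 10089 bytes, roughly 10 kB
```

With 171 states, every next-state value fits in a byte, so a check like narrowest_cell_width could choose the narrow layout when the table is built.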
Nice!
I suggest we keep using this ticket for follow-up PRs after #1100 unless they have their own, very specific tickets already.
I think we should close this bug for what we already did in 68, since we made a lot of changes in this area in 68. If there are other minor improvements we could make after 68, we should file a different bug for those.
I agree with Frank about closing this bug as fixed, especially since all of the original suggestions are done.