A few years ago (ticket #8322), I changed the UnicodeString structure to maximize the number of characters that we can store inside the object. I used a union of either a stack buffer or a struct of pointer/capacity/length fields, with some of the stack buffer inside the union and the remainder following it; when writing a short string with more than 8 characters, it overflows from the union array to the "rest" array.
Address sanitizers/memory checkers do not like such overflows; for example, see https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow. Also, a C++ compiler is allowed to reorder the fields (although there would be a bunch of code that would fail with that), or do something else.
Try to find a more acceptable way to keep the object size and internal capacity the same.
Rather than split the stack buffer, duplicate the fShortLength and fFlags, so that they are at the same offsets in each half of the union. This is important: The fFlags tell us which part of the union to use. If they are inside each half of the union, then they have to be at the same offset so that we can read them without already knowing which half to use. Same for fShortLength. We want to read the length without determining the storage type.
We could add some compile-time check or a U_ASSERT to make sure that both versions of fShortLength and fFlags are at the same offsets.
I see that this is exactly what Andy suggested in 2011-Feb, see #comment:1.
Note: Further compiler optimizations may be possible if the fStackFields.fBuffer begins on a word (pointer-size-aligned) boundary, but then the fShortLength and fFlags would have to go after the stack buffer and we would need to add explicit padding in fFields to achieve that alignment. Not sure if optimizations on very short strings would be worth the harder-to-understand structure. It would probably also fail when an address sanitizer adds "poisoned gaps (redzones) between fields in the class".
Committed to trunk, and Andy already approved the changes.