
source/common/unames.c assumes 8-bit structure padding


The function calcGroupNameSetsLengths in source/common/unames.c includes the following:

groups=(uint16_t *)((char *)uCharNames+uCharNames->groupsOffset);
group=(Group *)groups;

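Later code advances through the group table with ++group, which moves the pointer forward by sizeof(Group) bytes. This works only if sizeof(Group) is 6, i.e. exactly the three 16-bit fields with no padding. The Group structure is defined as follows: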

typedef struct {
    uint16_t groupMSB,
             offsetHigh, offsetLow; /* avoid padding */
} Group;
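To make the dependency concrete, here is a minimal sketch that measures how far ++group actually moves and compares it with the 6 bytes the code expects. The names PACKED_GROUP_SIZE, groupStride and strideMatchesData are hypothetical and are not part of unames.c; only the Group definition is taken from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t groupMSB,
             offsetHigh, offsetLow; /* avoid padding */
} Group;

/* Size of one record as the code expects it: three 16-bit fields
   (hypothetical name, not from unames.c). */
#define PACKED_GROUP_SIZE (3 * sizeof(uint16_t))

/* Number of bytes that ++group advances, measured on a two-element array. */
static size_t groupStride(void) {
    Group pair[2];
    Group *group = pair;
    ++group;
    return (size_t)((char *)group - (char *)pair);
}

/* Nonzero only when ++group lands exactly on the next 6-byte record. */
static int strideMatchesData(void) {
    /* The struct may be padded, but it is never smaller than its three fields. */
    assert(sizeof(Group) >= PACKED_GROUP_SIZE);
    return groupStride() == PACKED_GROUP_SIZE;
}
```

On a platform where sizeof(Group) is 8, groupStride() returns 8 and strideMatchesData() returns 0, so every ++group would drift 2 bytes further away from the real record boundaries.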

At least on Debian (I am the Debian maintainer for ICU), sizeof(Group) on the ARM platform is 8, because the structure is padded to a 32-bit boundary. Compiling on ARM with -mstructure-size-boundary=8 is sufficient to work around this problem, but the flag has to be added to CFLAGS manually at configure time. I'm not sure whether padding structures to 32-bit boundaries is specific to Debian on ARM, but either way it is not clean for ICU to assume that the size of this structure is 6. Ideally, the code should be modified so that it does not make that assumption. Alternatively, there should at least be an assert(sizeof(Group) == 6). The code could also be modified to add explicit padding and assume a size of 8, but that is not much better.
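One way the first option could look, as a rough sketch rather than a proposed patch: treat the group table as a plain uint16_t array and step through it with a fixed stride of three units, so that sizeof(Group) never enters the picture. The enum constants and helper functions below are hypothetical names. The sketch also assumes that the caller passes a pointer to the first group record and that offsetHigh and offsetLow are the two halves of a 32-bit offset, which is what the field names suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Positions of the fields inside one record, in 16-bit units
   (hypothetical names; assumes the three-field layout of Group). */
enum {
    GROUP_MSB = 0,
    GROUP_OFFSET_HIGH = 1,
    GROUP_OFFSET_LOW = 2,
    GROUP_LENGTH = 3
};

/* Pointer to record number `index`; the stride is always GROUP_LENGTH
   16-bit units, whatever padding the compiler applies to structs. */
static const uint16_t *getGroupRecord(const uint16_t *groups, int32_t index) {
    assert(index >= 0);
    return groups + (int32_t)GROUP_LENGTH * index;
}

/* Combine the two 16-bit halves into one 32-bit offset
   (assumed meaning of offsetHigh/offsetLow). */
static uint32_t getGroupOffset(const uint16_t *group) {
    return ((uint32_t)group[GROUP_OFFSET_HIGH] << 16) | group[GROUP_OFFSET_LOW];
}
```

With this shape, a loop would advance with group += GROUP_LENGTH instead of ++group on a Group pointer, and field reads become group[GROUP_MSB] and so on. The trade-off is that the struct's named members are replaced by index constants.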

I'm not aware of other places in the code that make similar assumptions about structure sizes. I found this particular case while debugging an ARM-only segmentation fault reported by a Debian user (see http://bugs.debian.org/484138). The Debian ICU packages now pass -mstructure-size-boundary=8 explicitly on ARM.

(Although I have observed this with 3.8.1, it seems to be the case on the svn trunk as well.)



Markus Scherer



