uset_addString bug

Description

Currently, we have:
    U_CAPI void U_EXPORT2
    uset_addString(USet* set, const UChar* str, int32_t strLen) {
        UnicodeString s(strLen==-1, str, strLen);
        ((UnicodeSet*) set)->add(s);
    }

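This is wrong: the constructor used here makes s a read-only alias of the
caller's buffer rather than a copy of it. Even when s is copied (as happens
downstream), the copy is still an alias, so the set keeps pointing at memory
that the caller owns and may overwrite. The following snippet shows the problem: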
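    UErrorCode status = U_ZERO_ERROR;
    UChar a[4];
    a[0] = 0x61;                        /* 'a' */
    a[1] = 0x62;                        /* 'b' */
    a[2] = 0x63;                        /* 'c' */
    a[3] = 0;
    USet *s = uset_open(1, 0);          /* start > end, so the set starts empty */
    uset_addString(s, a, 3);
    memset(a, 0xFE, 4*sizeof(UChar));   /* overwrite the caller's buffer */
    UChar be[50];
    uset_toPattern(s, be, 50, TRUE, &status);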

Afterwards, be should contain "[{abc}]", but it contains
"[{\uFEFE\uFEFE\uFEFE}]" instead. A temporary solution is to change the
constructor used in the function to a non-aliasing one.
We should also explore the possibility of adding a copy constructor that
removes aliasing.
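
The difference between aliasing and copying can be sketched without ICU. The following is only an illustrative model, not ICU code: AliasString, storeAlias, storeCopy and aliasGoesStaleCopyDoesNot are hypothetical names invented for this sketch, and std::u16string stands in for a string that owns its storage. It mirrors the snippet above: the caller's buffer is overwritten after the string has been stored.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for a read-only alias: it only remembers the
// caller's pointer and length and owns no storage of its own.
struct AliasString {
    const char16_t* data;
    int length;
};

// Models the buggy behavior: keep a pointer into the caller's buffer.
inline AliasString storeAlias(const char16_t* str, int len) {
    return AliasString{str, len};
}

// Models a non-aliasing constructor: copy the characters into owned storage.
inline std::u16string storeCopy(const char16_t* str, int len) {
    return std::u16string(str, len);
}

// Returns true when overwriting the caller's buffer changes what the alias
// sees while the owned copy still holds "abc".
inline bool aliasGoesStaleCopyDoesNot() {
    char16_t a[4] = {0x61, 0x62, 0x63, 0};
    AliasString alias = storeAlias(a, 3);
    std::u16string copy = storeCopy(a, 3);

    // Same overwrite as in the report: every 16-bit unit becomes 0xFEFE.
    std::memset(a, 0xFE, sizeof(a));

    bool aliasChanged = alias.data[0] == 0xFEFE && alias.data[2] == 0xFEFE;
    bool copyIntact = (copy == u"abc");
    return aliasChanged && copyIntact;
}
```

In this model, copying an AliasString value copies only the pointer, which corresponds to the report's observation that copying an aliasing string downstream does not remove the aliasing.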

Status

Assignee

weivsara@gmail.com

Reporter

TracBot

Labels

Reviewer

None

Time Needed

None

Start date

None

Components

Fix versions

Priority

blocker