uset_addString bug

Description

Currently, we have:
U_CAPI void U_EXPORT2
uset_addString(USet* set, const UChar* str, int32_t strLen) {
UnicodeString s(strLen==-1, str, strLen);
((UnicodeSet*) set)->add(s);
}

This is wrong: the constructor used here makes s a read-only alias of the
caller's buffer instead of a copy. Even when s is copied (as is done
downstream), the copy is still an alias. The problem is visible in the
following snippet:
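UErrorCode status = U_ZERO_ERROR;
UChar a[4];
a[0] = 0x61;
a[1] = 0x62;
a[2] = 0x63;
a[3] = 0;
USet *s = uset_open(1, 0);  /* start > end, so the set starts out empty */
uset_addString(s, a, 3);
memset(a, 0xFE, 4*sizeof(UChar));  /* overwrite the caller's buffer */
UChar be[50];
uset_toPattern(s, be, 50, TRUE, &status);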

After this, be should contain "[{abc}]", but it contains
"[{\uFEFE\uFEFE\uFEFE}]" instead. A temporary solution is to change the
constructor in the function to a non-aliasing one.
We should also explore the possibility of adding a copy constructor that
removes aliasing.
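
The difference between aliasing and copying can be sketched without ICU. The
following is only an illustrative model, not ICU code: AliasString, AliasingSet,
CopyingSet and firstStringAfterOverwrite are hypothetical names invented for
this example. It mirrors the snippet above: a string is added to a set, the
caller's buffer is overwritten, and the stored string is read back.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a read-only alias: it stores only a pointer and a
// length, so it never owns the characters it refers to.
struct AliasString {
    const char16_t* data;
    int32_t length;
};

// Hypothetical set that stores aliases. Copying an AliasString into the
// vector copies the pointer, so the stored element is still an alias.
struct AliasingSet {
    std::vector<AliasString> strings;
    void addString(const char16_t* s, int32_t len) {
        strings.push_back(AliasString{s, len});
    }
    std::u16string first() const {
        return std::u16string(strings[0].data, strings[0].length);
    }
};

// Hypothetical set that makes a deep copy of the characters on insertion.
struct CopyingSet {
    std::vector<std::u16string> strings;
    void addString(const char16_t* s, int32_t len) {
        strings.emplace_back(s, len);
    }
    std::u16string first() const { return strings[0]; }
};

// Adds "abc" to a set, overwrites the caller's buffer, then reads the stored
// string back while the buffer is still in scope.
template <typename Set>
std::u16string firstStringAfterOverwrite() {
    char16_t a[4] = {0x61, 0x62, 0x63, 0};
    Set set;
    set.addString(a, 3);
    std::memset(a, 0xFE, sizeof(a));  // every element becomes 0xFEFE
    return set.first();
}
```

In this model, firstStringAfterOverwrite<CopyingSet>() still yields "abc",
while firstStringAfterOverwrite<AliasingSet>() yields three 0xFEFE code units,
which corresponds to the corrupted pattern reported above.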

Environment

Status

Assignee

weivsara@gmail.com

Reporter

TracBot

Labels

tracCreated

Aug 30, 2002, 1:20 AM

tracOwner

weiv

tracProject

ICU4C,ICU4J and ICU4JNI

tracReporter

weiv@a95c9666650cfc8d

tracResolution

fixed

tracReviewer

markus

tracStatus

closed

Components

Fix versions

Priority

blocker