On Windows, file access should be done through 'W' APIs

Description

APIs like u_setDataDirectory accept a path as |char *|. On Windows 2000 or later, this means that the path cannot contain any character outside the repertoire of the current OS default codepage, even though the OS itself can handle any Unicode character in a path.

Here's a scenario where this is problematic:

Suppose your product is installed under the user's "home directory" (on Win2k/XP, somewhere under C:\Documents and Settings\<username>\Local Settings\Application Data\foo) and the ICU data file is in the same directory. Suppose also that the OS default codepage is CP1252, but <username> is a Russian name written in Cyrillic. Even though you have the path "C:\Doc...\<username>\Lo....\A...Data\foo" in UTF-16 (wchar_t*/WCHAR* on Windows), you have to convert it to the OS codepage before calling u_setDataDirectory, losing '<username>' in the conversion.
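
To make the loss concrete, here is a minimal portable sketch of what a UTF-16 to CP1252 conversion does to such a path. It is not the real Windows conversion (WideCharToMultiByte); to_cp1252_lossy is a hypothetical helper that only passes through the code points CP1252 shares with Latin-1 (U+0000..U+007F and U+00A0..U+00FF) and, as a deliberate simplification, treats everything else as unmappable, including the few characters CP1252 encodes in 0x80..0x9F.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for a UTF-16 -> CP1252 conversion (assumption: this is
 * an illustration, not the mapping Windows actually applies).
 * Code units in U+0000..U+007F and U+00A0..U+00FF are copied through;
 * every other code unit is replaced with '?'.
 * dst must have room for len + 1 bytes.
 * Returns the number of code units that were replaced.
 */
static size_t to_cp1252_lossy(const uint16_t *src, size_t len, char *dst)
{
    size_t lost = 0;
    size_t i;

    for (i = 0; i < len; ++i) {
        uint16_t c = src[i];
        if (c < 0x80 || (c >= 0xA0 && c <= 0xFF)) {
            dst[i] = (char)(unsigned char)c;   /* representable: keep it */
        } else {
            dst[i] = '?';                      /* e.g. Cyrillic: lost */
            ++lost;
        }
    }
    dst[len] = '\0';
    return lost;
}
```

With a hypothetical user name of four Cyrillic letters, for example the UTF-16 path C:\Иван\foo, the helper reports four replaced characters and produces C:\????\foo, which no longer names the directory that holds the data file.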

With the ICU data in a DLL, LoadLibrary (the 'W' version, called explicitly or selected implicitly by defining 'UNICODE') can be used together with GetProcAddress and udata_setCommonData to avoid this problem. This works, but it requires an extra build step to generate a DLL instead of just using the .dat file on all platforms.
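
The DLL workaround described above could look roughly like the sketch below. Several things here are assumptions rather than anything stated in this report: the helper names icu_data_symbol and load_icu_data_from_dll are hypothetical, and the exported symbol is assumed to follow the icudt<version>_dat pattern, which should be checked against the ICU build in use. The Windows-specific part is guarded so that the file still compiles elsewhere.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Builds the name of the data entry point exported by the ICU data DLL,
 * e.g. "icudt74_dat" for version 74. The naming pattern is an assumption;
 * verify it against your ICU build. Returns 1 on success, 0 if buf is too small.
 */
static int icu_data_symbol(char *buf, size_t size, int icu_version)
{
    int n = snprintf(buf, size, "icudt%d_dat", icu_version);
    return n > 0 && (size_t)n < size;
}

#ifdef _WIN32
#include <windows.h>
#include <unicode/udata.h>

/*
 * Loads the ICU data DLL from a UTF-16 path, so the path is never converted
 * to the OS codepage, and hands the data to ICU.
 * Returns 1 on success, 0 on failure.
 */
static int load_icu_data_from_dll(const wchar_t *dll_path, int icu_version)
{
    char symbol[32];
    HMODULE module;
    const void *data;
    UErrorCode status = U_ZERO_ERROR;

    if (!icu_data_symbol(symbol, sizeof symbol, icu_version)) {
        return 0;
    }
    module = LoadLibraryW(dll_path);          /* 'W' API: full Unicode path */
    if (module == NULL) {
        return 0;
    }
    /* The export is a data symbol, so the returned address is used as data. */
    data = (const void *)GetProcAddress(module, symbol);
    if (data == NULL) {
        FreeLibrary(module);
        return 0;
    }
    udata_setCommonData(data, &status);       /* call before other ICU use */
    if (U_FAILURE(status)) {
        FreeLibrary(module);
        return 0;
    }
    return 1;  /* module stays loaded on purpose: ICU keeps pointing into it */
}
#endif
```

The DLL is intentionally never unloaded on success, since ICU continues to read from the mapped data for the lifetime of the process.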

BTW, I didn't make up the above scenario. There were bug reports against Firefox in exactly this situation, back when it was not yet a Unicode application.

Status

Assignee

Markus Scherer

Reporter

Jungshik Shin

Labels

Reviewer

None

Time Needed

Days

Start date

None

Components

Priority

assess