On Windows, file access should be done through 'W' APIs

Description

APIs like u_setDataDirectory accept a path as |char *|. On Windows 2000 or later, this means that the path cannot contain a character outside the repertoire of the current OS default codepage, even though the OS itself can handle any Unicode character in a path.

Here's a scenario where this is problematic:

Suppose your product is installed in the "home directory" (on Win2k/XP, somewhere under C:\Documents and Settings\<username>\Local Settings\Application Data\foo) and the ICU data file is in the same directory. Suppose also that the OS default codepage is CP1252, but <username> is a Russian name written in Cyrillic. Even though you have the path "C:\Doc...\<username>\Lo....\A...Data\foo" in UTF-16 (wchar_t*/WCHAR* on Windows), you have to convert it to the OS codepage before calling u_setDataDirectory, losing '<username>' in the conversion.

With the ICU data in a DLL, LoadLibrary (the 'W' version, called explicitly or selected implicitly by defining 'UNICODE') can be used with GetProcAddress and udata_setCommonData to avoid this problem. This works, but it requires an extra build step to generate a DLL instead of just using a .dat file on all platforms.

BTW, I didn't make up the above scenario. There were bug reports against Firefox in exactly this situation, back when it was not yet a Unicode application.

Activity

TracBot
June 30, 2018, 11:53 PM
Trac Comment 2 by —2009-05-26T00:01:46.000Z

Note: Using udata_setCommonData() does not require a data DLL. You can manually memory-map the .dat package and pass the pointer to udata_setCommonData().

ICU "only" offers DLL packaging for its data because it provides a popular no-configuration way of using ICU. (You can just dump all of the DLLs in a common folder and ICU finds its data.) If you don't care for no-configuration, we recommend using .dat packages.
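A minimal sketch of that workaround, assuming C++17 and reading the .dat package into a buffer instead of memory-mapping it. The names loadCommonData and Setter are hypothetical and not part of ICU. The point is that the file is opened from a std::filesystem::path, which holds wchar_t on Windows, so the path is never converted to the OS default codepage.

```cpp
// Sketch only: loadCommonData and its Setter callback are hypothetical
// names for illustration; they are not ICU APIs.
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <vector>

// Reads the whole file at `path` into `storage` and passes a pointer to the
// first byte to `setter`. Returns false if the file cannot be read or is
// empty; otherwise returns whatever `setter` returns.
// `storage` must stay alive for as long as the consumer uses the data.
template <typename Setter>
bool loadCommonData(const std::filesystem::path& path,
                    std::vector<char>& storage,
                    Setter setter) {
    // Opening from a filesystem::path uses the native path representation
    // (wchar_t on Windows), so no lossy codepage conversion takes place.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return false;

    const std::streamoff size = in.tellg();  // opened at end: file size
    if (size <= 0) return false;

    storage.resize(static_cast<std::size_t>(size));
    in.seekg(0, std::ios::beg);
    if (!in.read(storage.data(), size)) return false;
    assert(in.gcount() == size);  // a successful read delivered every byte

    return setter(static_cast<const void*>(storage.data()));
}
```

In a real caller, the setter would presumably be a small lambda that declares a UErrorCode initialized to U_ZERO_ERROR, calls udata_setCommonData(data, &status), and returns U_SUCCESS(status). This should happen before any other ICU call that needs data, and the buffer has to remain valid while ICU is in use. A memory-mapped view, as the comment suggests, also avoids the copy and gives page alignment; whether a heap buffer is aligned strictly enough for ICU is an assumption to verify.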

TracBot
June 30, 2018, 11:53 PM
Trac Comment 5 by nuskooler@f74d39fa044aa309—2013-04-17T15:40:39.497Z

This bug has been around forever – is anyone planning on working on it? This can make it quite hard to use ICU in an internationalized product on modern Windows OSes.

TracBot
June 30, 2018, 11:53 PM
Trac Comment 5.6 by —2013-04-17T21:40:07.098Z

Replying to (Comment 5 nuskooler@…):

This bug has been around forever – is anyone planning on working on it? This can make it quite hard to use ICU in an internationalized product on modern Windows OSes.

Not super hard: See the workarounds documented in the ticket description and in comment 2.

TracBot
June 30, 2018, 11:53 PM
Trac Comment 8 by —2014-09-22T22:58:44.378Z

Chrome now uses what Markus wrote in comment 2. That is, we use udata_setCommonData() to load the data from the ICU .dat file on Windows and all other platforms.

Assignee

Jeff Genovy

Reporter

Jungshik Shin

Components

Reviewer

None

Priority

assess

Time Needed

Days
