Unit tests fail when ICU data includes pseudolocale data

Description

When pseudolocale data is generated for ICU4C, a unit test fails:

ERROR: locale ar_XB, expected j resolved char H to occur in short time pattern 'h:mm a' for gregorian (best pattern: 'HH')
ERROR: locale en_XA, expected j resolved char H to occur in short time pattern '[h:mm a 'one']' for gregorian (best pattern: '[HH 'one']')

} ERRORS (2) in testJjMapping (276ms)
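
Reading the two error messages, the check appears to be that the hour letter in the best pattern for skeleton "j" must also occur in the locale's short time pattern. A minimal Python sketch of that reading follows. The function names are hypothetical and this is not the actual ICU test code.

```python
HOUR_LETTERS = set("hHkK")


def pattern_letters(pattern):
    """Return the set of ASCII pattern letters outside single-quoted literals."""
    letters = set()
    in_quote = False
    for ch in pattern:
        if ch == "'":
            in_quote = not in_quote
        elif not in_quote and ch.isascii() and ch.isalpha():
            letters.add(ch)
    return letters


def hour_letter(pattern):
    """Return the first unquoted hour letter in the pattern, or None."""
    in_quote = False
    for ch in pattern:
        if ch == "'":
            in_quote = not in_quote
        elif not in_quote and ch in HOUR_LETTERS:
            return ch
    return None


def jj_mapping_ok(short_time_pattern, best_pattern_for_j):
    """True if the hour letter resolved from 'j' occurs in the short time pattern."""
    resolved = hour_letter(best_pattern_for_j)
    return resolved is not None and resolved in pattern_letters(short_time_pattern)
```

With the patterns from the errors above, jj_mapping_ok("h:mm a", "HH") is false because "H" never occurs in "h:mm a".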

Activity

David Beaumont
September 3, 2020, 10:05 PM

Note that nothing in the pseudo locales is inherited from “ar”, it’s all coming from “en”.

There’s already an enum for each pseudo locale, mainly to handle special cases, so whatever’s needed for this should probably be added there. I’m not savvy enough to know exactly what the best fix is yet, but when a decision is made, let me know.

Norbert Runge
September 3, 2020, 9:35 PM

That worked, thanks! No more failures related to the en_XA/ar_XB pseudolocales. What remains is to customize the pseudolocale generator accordingly.

Markus Scherer
September 3, 2020, 8:48 PM

When you change from “h” (12-hour format) to “H” (24-hour format) you will also want to remove the “a” field (AM/PM marker).

https://www.unicode.org/reports/tr35/tr35-dates.html#dfst-period
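
As an illustration of that advice, here is a minimal Python sketch that switches a time pattern from "h" to "H" and drops the "a" field, leaving single-quoted literals untouched. The helper name to_24_hour is hypothetical. It only handles an "a" field that follows the time fields, as in the patterns in this ticket, and it is not the pseudolocale generator's code.

```python
import re

# Split a pattern into single-quoted literals and unquoted runs.
_SEGMENTS = re.compile(r"'[^']*'|[^']+")


def to_24_hour(pattern):
    """Rewrite a 12-hour time pattern to 24-hour form (hypothetical helper)."""
    out = []
    for seg in _SEGMENTS.findall(pattern):
        if seg.startswith("'"):
            # Quoted literal text: copy through unchanged.
            out.append(seg)
            continue
        # Drop the AM/PM field together with the whitespace before it.
        seg = re.sub(r"\s*a+", "", seg)
        # Switch the hour field from 12-hour to 24-hour.
        seg = seg.replace("h", "H")
        out.append(seg)
    return "".join(out)


print(to_24_hour("h:mm a"))
print(to_24_hour("[h:mm a 'one']"))
```

Splitting on quoted literals first matters for pseudolocale data, since literal text such as 'one' must not be mistaken for pattern letters.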

Norbert Runge
September 3, 2020, 8:40 PM

The H vs. h issue comes from the entry

gregorian{
    ...
    DateTimePatterns{
        ...
        "[h:mm a 'one']"

in locales/en_XA.txt, which is easy enough to pinpoint. Setting it to "[H:mm a 'one']" (manually for now; the pseudolocale generator's filter() may be able to do it) makes testJjMapping pass.

Unfortunately, with "[H:mm a 'one']" another test fails:

DateFormatRoundTripTest {

TestDateFormatRoundTrip {

FAIL: Pattern: [[MMM d, y 'one' 'two'], [H:mm a 'one'] 'one' 'two'] in Locale: en_XA

get dmatch: 2 (expected max 2), smatch: 2 (expected max 1)

Wed Dec 24 03:36:04.542 PST 2594 AD F> [[[[0xd0][0xe9][0xe7] one] 24, 2594 one two], [3:36 [[0xc5][0x1e40] one] one] one two] d=19\

722483364542.043

P> Wed Dec 24 00:36:00.000 PST 2594 AD F> [[[[0xd0][0xe9][0xe7] one] 24, 2594 one two], [0:36 [[0xc5][0x1e40] one] one] one two] d=19\

722472560000

P> Wed Dec 24 00:36:00.000 PST 2594 AD F> [[[[0xd0][0xe9][0xe7] one] 24, 2594 one two], [0:36 [[0xc5][0x1e40] one] one] one two] d=19\

722472560000 d== s==

} ERRORS (4) in TestDateFormatRoundTrip (838ms)

Markus Scherer
September 3, 2020, 5:03 PM

Peter:

Would it be a problem for the pseudolocale generator to generate standard time formats using 'H'?

Do you mean that the generator should read the timeFormats and the dateTimeFormats and change the contents of the pattern strings?

The generator code is here:

There is logic to parse & treat pattern strings specially, to preserve the syntax, but not yet to rewrite a subset of these patterns.

It looks like it should be possible in the filter() function to match on leaf values (pattern strings) under timeFormats and dateTimeFormats, change the string values, and return new, replacement value objects for them.
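
The idea can be sketched as follows, under loud assumptions: the resource data is modeled as nested Python dicts and lists, and filter_patterns is a hypothetical function. The real generator is not written in Python, and the signature of its filter() function is not shown in this ticket.

```python
# Keys whose string leaves are treated as time pattern strings (from the comment above).
TARGET_KEYS = {"timeFormats", "dateTimeFormats"}


def filter_patterns(node, rewrite, under_target=False):
    """Return a copy of the tree with string leaves under TARGET_KEYS rewritten."""
    if isinstance(node, dict):
        return {
            key: filter_patterns(value, rewrite, under_target or key in TARGET_KEYS)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [filter_patterns(value, rewrite, under_target) for value in node]
    if under_target and isinstance(node, str):
        # Leaf pattern string: return a new, replacement value.
        return rewrite(node)
    return node


# Usage with made-up sample data; the rewrite here only swaps the hour letter.
data = {
    "dates": {
        "timeFormats": {"short": "h:mm a"},
        "dateFormats": {"short": "M/d/yy"},
    }
}
result = filter_patterns(data, lambda p: p.replace("h", "H"))
print(result["dates"]["timeFormats"]["short"])
```

Returning new objects rather than mutating in place mirrors the "return new, replacement value objects" wording, and leaves the input data untouched.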

Fixed

Assignee

Norbert Runge

Reporter

Norbert Runge

Components

Priority

medium

Time Needed

Hours

Fix versions