Change to fr_CH number symbols breaks date parsing with generic NumberFormat
Description
Normally the ICU4J SimpleDateFormat parsing uses a special DateNumberFormat to parse the numeric parts of a date format. However, it is possible to set an arbitrary NumberFormat into a DateFormat object to use for parsing. ICU4J NumberFormatRegressionTest.java had a TestJ691 which (until r39710) did this:
In cldrbug:9370 the following was removed from fr_CH
leaving it with the decimal and grouping separator values from fr:
The fr_CH standard short date format remains as
However, TestJ691 parses with a generic NumberFormat using lenient parsing. Because "." is no longer the decimal separator, it is no longer removed from the set of candidate grouping separators. Date parsing therefore treats the entire string "11.10.2000" as a single numeric value with grouping separators and parses it as a numeric day value 11102000. Not only is this out of range, but the pattern parse then fails to find the expected pattern literal text '.' following this "day" value.
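The effect described above can be reproduced in a minimal sketch. This uses the JDK's java.text classes rather than ICU4J, and it is an assumption of the sketch that setting '.' explicitly as the grouping separator is a fair stand-in for ICU's lenient candidate set; the class and method names (GroupingParseDemo, dotGroupingFormat, parseLeadingNumber) are hypothetical and not part of ICU.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

class GroupingParseDemo {
    // Hypothetical helper: a JDK DecimalFormat whose grouping separator is '.',
    // standing in for a parser that accepts '.' as a grouping separator.
    static DecimalFormat dotGroupingFormat() {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(',');
        symbols.setGroupingSeparator('.');
        return new DecimalFormat("#,##0.###", symbols);
    }

    // Parses the number at the start of text; returns {value, characters consumed}.
    static long[] parseLeadingNumber(DecimalFormat nf, String text) {
        ParsePosition pos = new ParsePosition(0);
        Number n = nf.parse(text, pos);
        return new long[] { n.longValue(), pos.getIndex() };
    }

    public static void main(String[] args) {
        DecimalFormat generic = dotGroupingFormat();
        long[] greedy = parseLeadingNumber(generic, "11.10.2000");
        // Grouping enabled: the whole string is swallowed as one number.
        System.out.println(greedy[0] + " consumed=" + greedy[1]); // 11102000 consumed=10

        DecimalFormat restricted = dotGroupingFormat();
        restricted.setGroupingUsed(false);
        restricted.setParseIntegerOnly(true);
        long[] dayOnly = parseLeadingNumber(restricted, "11.10.2000");
        // Grouping disabled: parsing stops at the first '.', leaving it for the pattern literal.
        System.out.println(dayOnly[0] + " consumed=" + dayOnly[1]); // 11 consumed=2
    }
}
```

With grouping enabled the number parser consumes all ten characters, so nothing is left for the date pattern's '.' literal. With grouping disabled and integer-only parsing, it stops after "11".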
For the time being I have fixed this in the test by adding
but we may want to explore other options, such as having DateFormat clone any passed-in NumberFormat and update it with setGroupingUsed(false) and setParseIntegerOnly(true).
Or perhaps the test in TestJ691 is just not valid...
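The clone-and-restrict option mentioned above might look roughly like the following sketch. It is not the ICU4J implementation: DefensiveNumberFormatHolder, parseField and dotGroupingFormat are hypothetical names, and the sketch uses the JDK's java.text.NumberFormat so that it is self-contained.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

class DefensiveNumberFormatHolder {
    private NumberFormat numberFormat;

    // Stores a private, restricted copy so the caller's object is never modified
    // and later changes by the caller cannot affect date-field parsing.
    public void setNumberFormat(NumberFormat nf) {
        NumberFormat copy = (NumberFormat) nf.clone();
        copy.setGroupingUsed(false);
        copy.setParseIntegerOnly(true);
        this.numberFormat = copy;
    }

    // Parses one numeric date field starting at pos and advances pos past it.
    public long parseField(String text, ParsePosition pos) {
        Number n = numberFormat.parse(text, pos);
        if (n == null) {
            throw new IllegalArgumentException("no number at index " + pos.getErrorIndex());
        }
        return n.longValue();
    }

    // Hypothetical helper for the demo: a format that treats '.' as a grouping separator.
    static NumberFormat dotGroupingFormat() {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(',');
        symbols.setGroupingSeparator('.');
        return new DecimalFormat("#,##0.###", symbols);
    }

    public static void main(String[] args) {
        NumberFormat callerFormat = dotGroupingFormat();
        DefensiveNumberFormatHolder holder = new DefensiveNumberFormatHolder();
        holder.setNumberFormat(callerFormat);

        ParsePosition pos = new ParsePosition(0);
        long day = holder.parseField("11.10.2000", pos);
        System.out.println(day + " next index=" + pos.getIndex());
        System.out.println("caller grouping still on: " + callerFormat.isGroupingUsed());
    }
}
```

The trade-off of cloning is that a caller who later mutates the NumberFormat they passed in will no longer see that change reflected in date parsing.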
Activity
UnicodeBot
June 30, 2018 at 11:26 PM
Trac Comment 4 by —2017-06-23T06:26:51.856Z
Added the cloning in ICU4J. ICU4C has setNumberFormat which already clones, and adoptNumberFormat which doesn't (and which is called by setNumberFormat after cloning, to do the work).
UnicodeBot
June 30, 2018 at 11:26 PM
Trac Comment 2 by —2017-03-01T19:25:19.124Z
See setNumberFormat:
I think the test case was written for this case. It looks like we should also call setGroupingUsed(false) in addition to setParseIntegerOnly(true).
The question is whether we should create a defensive clone. My preference is to create a clone always...
UnicodeBot
June 30, 2018 at 11:26 PM
Trac Comment 1 by —2017-03-01T19:19:43.802Z
df.setNumberFormat should clone the NumberFormat (if it does not already) and do setParseIntegerOnly(true) and setGroupingUsed(false).