Performance issues with SimpleDateFormat.
The com.ibm.icu.text.SimpleDateFormat class has the following performance issues compared to the java.text.SimpleDateFormat class:
1) Creating a new com.ibm.icu.text.SimpleDateFormat object is ~12x slower than creating a java.text.SimpleDateFormat object. For example (see the attached SimpleDateFormatPerformanceTest.java):
2) Parsing a string with com.ibm.icu.text.SimpleDateFormat is ~137x slower than parsing a string with java.text.SimpleDateFormat. For example (see the attached SimpleDateFormatPerformanceTest.java):
Combined (creation/parsing), com.ibm.icu.text.SimpleDateFormat is ~26x slower than java.text.SimpleDateFormat. For example (see the attached SimpleDateFormatPerformanceTest.java):
Times are in milliseconds. Tests were run using the following configuration:
Note, this defect was found using ICU v4.1.0 as shipped with Eclipse 3.5.
These numbers are misleading, because there is no warmup and no iteration (both are needed to collect enough data to get a reasonable average). The test needs to be restructured to get better data.
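For illustration, a restructured test could look roughly like the sketch below. This is not the attached SimpleDateFormatPerformanceTest.java; the class name, method names, pattern, sample string, and iteration counts are all made up for the example. It runs against java.text.SimpleDateFormat only, so it compiles without ICU4J on the classpath; the same loops would be repeated with com.ibm.icu.text.SimpleDateFormat for the comparison.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical harness: untimed warmup runs, then an average over many timed runs.
public class WarmedUpDateFormatBenchmark {
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss"; // assumed pattern
    static final String SAMPLE = "2009-06-24 12:34:56";  // assumed input

    /** Average nanoseconds per constructor call, measured after the warmup runs. */
    static double averageCreateNanos(int warmup, int iterations) {
        long sink = 0; // accumulate a result so the loop body is not optimized away
        for (int i = 0; i < warmup; i++) {
            sink += new SimpleDateFormat(PATTERN, Locale.US).hashCode();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += new SimpleDateFormat(PATTERN, Locale.US).hashCode();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) System.out.println(sink); // keeps sink live
        return (double) elapsed / iterations;
    }

    /** Average nanoseconds per parse() call, measured after the warmup runs. */
    static double averageParseNanos(int warmup, int iterations) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.US);
        long sink = 0;
        try {
            for (int i = 0; i < warmup; i++) {
                sink += format.parse(SAMPLE).getTime();
            }
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                sink += format.parse(SAMPLE).getTime();
            }
            long elapsed = System.nanoTime() - start;
            if (sink == Long.MIN_VALUE) System.out.println(sink); // keeps sink live
            return (double) elapsed / iterations;
        } catch (ParseException e) {
            throw new IllegalStateException("sample string did not match pattern", e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("create: %.0f ns/op%n", averageCreateNanos(1000, 10000));
        System.out.printf("parse:  %.0f ns/op%n", averageParseNanos(1000, 10000));
    }
}
```

The warmup loop absorbs class loading, data loading, and JIT compilation, so the timed loop reflects steady-state cost. A dedicated harness such as JMH would give more reliable numbers than a hand-rolled loop like this one.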
Note that this may be due to the time needed to load the timezone data; that part is being worked on.
Tried the attached test case. On my relatively old system, the output is -
with ICU trunk. So, to create a SimpleDateFormat with the test case: JDK 94 ms vs. ICU 1203 ms. However, if I run the test twice (calling testJava() and testICU() twice each), the second iteration takes less than 1 ms in both cases. So this gap comes from the initial data loading (time zone data, locale data for date formatting, etc.).
By deferring some data initialization that is not immediately required and optimizing the initialization code, the initialization part was slightly improved. Most of the overhead comes from code outside DateFormat: for example, accessing the resource bundle, initializing the numbering system used for number formatting, and creating a calendar instance. We'll continue to look into the performance problems outside DateFormat.
I ran a modified version of your performance test over 200 locales twice.
This test is for opening of the SimpleDateFormat object
This test is for parsing
I modified the test to ignore the first call to "new SimpleDateFormat" and "parse" for both ICU4J and Java. As mentioned earlier, most of the performance hit comes from loading the data. After the data is loaded (even when creating new SimpleDateFormat objects with different locales), the performance is comparable to Java.
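The idea of setting the first call aside can be sketched as follows. This is only an approximation of the approach, not the modified test itself: the class name, method name, pattern, and use of Locale.getAvailableLocales() are assumptions for the example, and it uses java.text.SimpleDateFormat so it runs without ICU4J.

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical sketch: report the first constructor call separately from the rest.
public class FirstCallVersusSteadyState {

    /** Returns {nanos for the first locale, average nanos over the remaining locales}. */
    static long[] timeCreation(Locale[] locales) {
        long first = 0;
        long restTotal = 0;
        int restCount = 0;
        long sink = 0; // use each object so its creation is not optimized away
        for (int i = 0; i < locales.length; i++) {
            long start = System.nanoTime();
            SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", locales[i]);
            long elapsed = System.nanoTime() - start;
            sink += format.hashCode();
            if (i == 0) {
                first = elapsed;      // includes one-time class and data loading
            } else {
                restTotal += elapsed; // later calls, each with a different locale
                restCount++;
            }
        }
        if (sink == Long.MIN_VALUE) System.out.println(sink); // keeps sink live
        long average = restCount == 0 ? 0 : restTotal / restCount;
        return new long[] { first, average };
    }

    public static void main(String[] args) {
        long[] result = timeCreation(Locale.getAvailableLocales());
        System.out.println("first call:       " + result[0] + " ns");
        System.out.println("average of rest:  " + result[1] + " ns");
    }
}
```

Reporting the two figures side by side makes it visible how much of the total is one-time loading and how much is the per-object cost.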