Deduplicate C & Java testdata to a common directory

Description

(Cloned from https://unicode-org.atlassian.net/browse/ICU-21205, which had a pull request merged for ICU 76.)

There are some testdata files that are identical between ICU4J and ICU4C. This ticket proposes deduplicating these into a common directory, also giving us a place to add any future testdata shared between implementations.
(First task: check whether this is reasonable to do for the Java test framework.)

Proposed: testdata/, next to icu4c/ and icu4j/.

Example testdata files that are identical between C and J, with suggested new paths. I am not sure about including "core" in the output path, but including it would keep files out of the top-level testdata/ directory, so that no files sit next to the testdata/cldr-testData directory:

CLDR Testdata from ticket ICU-21066: something like testdata/cldr-testData/…

icu4c/source/test/testdata/localeMatcherTest.txt
icu4j/main/tests/core/src/com/ibm/icu/dev/test/util/data/localeMatcherTest.txt
-> testdata/core/util/localeMatcherTest.txt ?

icu4c/source/test/testdata/break_rules/$filename
icu4j/main/tests/core/src/com/ibm/icu/dev/test/rbbi/break_rules/$filename
-> testdata/core/break_rules/$filename ?
for filename in: [ grapheme.txt line_cj.txt line_loose_cj.txt line_normal_cj.txt line_normal.txt line_loose.txt sentence.txt word.txt word_POSIX.txt ]

icu4c/source/test/testdata/$filename
icu4j/main/tests/core/src/com/ibm/icu/dev/data/unicode/$filename
-> testdata/core/unicode/$filename ?
for filename in: [ BidiCharacterTest.txt BidiTest.txt IdnaTestV2.txt ]

icu4c/source/test/testdata/$filename
icu4j/main/tests/core/src/com/ibm/icu/dev/data/$filename
-> testdata/core/$filename ?
for filename in: [ numberformattestspecification.txt numberpermutationtest.txt ]

icu4c/source/test/testdata/$filename
icu4j/main/tests/collate/src/com/ibm/icu/dev/data/$filename
-> testdata/collate/$filename
for filename in: [ CollationTest_NON_IGNORABLE_SHORT.txt CollationTest_SHIFTED_SHORT.txt collationtest.txt ]
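
Before moving anything, it may be worth confirming that each C/J pair really is identical. The following is a minimal sketch of such a check, not part of any existing ICU tooling: the names MAPPINGS and plan_moves are hypothetical, the directories and filenames are copied from the list above, and the script assumes it is run from the repository root. It compares bytes, so files that differ only in line endings are reported as differing.

```python
import filecmp
from pathlib import Path

# (icu4c dir, icu4j dir, proposed shared dir, filenames), copied from the ticket.
# The target directories are the ticket's suggestions, not settled paths.
MAPPINGS = [
    ("icu4c/source/test/testdata",
     "icu4j/main/tests/core/src/com/ibm/icu/dev/test/util/data",
     "testdata/core/util",
     ["localeMatcherTest.txt"]),
    ("icu4c/source/test/testdata/break_rules",
     "icu4j/main/tests/core/src/com/ibm/icu/dev/test/rbbi/break_rules",
     "testdata/core/break_rules",
     ["grapheme.txt", "line_cj.txt", "line_loose_cj.txt", "line_normal_cj.txt",
      "line_normal.txt", "line_loose.txt", "sentence.txt", "word.txt",
      "word_POSIX.txt"]),
    ("icu4c/source/test/testdata",
     "icu4j/main/tests/core/src/com/ibm/icu/dev/data/unicode",
     "testdata/core/unicode",
     ["BidiCharacterTest.txt", "BidiTest.txt", "IdnaTestV2.txt"]),
    ("icu4c/source/test/testdata",
     "icu4j/main/tests/core/src/com/ibm/icu/dev/data",
     "testdata/core",
     ["numberformattestspecification.txt", "numberpermutationtest.txt"]),
    ("icu4c/source/test/testdata",
     "icu4j/main/tests/collate/src/com/ibm/icu/dev/data",
     "testdata/collate",
     ["CollationTest_NON_IGNORABLE_SHORT.txt", "CollationTest_SHIFTED_SHORT.txt",
      "collationtest.txt"]),
]


def plan_moves(repo_root, mappings=MAPPINGS):
    """Return (movable, skipped) without touching any file.

    movable: list of (filename, proposed target path) for byte-identical pairs.
    skipped: list of (filename, reason), where reason is "missing" or "differs".
    """
    root = Path(repo_root)
    movable, skipped = [], []
    for c_dir, j_dir, target_dir, names in mappings:
        for name in names:
            c_file = root / c_dir / name
            j_file = root / j_dir / name
            if not (c_file.is_file() and j_file.is_file()):
                skipped.append((name, "missing"))
            elif filecmp.cmp(c_file, j_file, shallow=False):
                # shallow=False forces a content comparison, not just a stat check.
                movable.append((name, (Path(target_dir) / name).as_posix()))
            else:
                skipped.append((name, "differs"))
    return movable, skipped


if __name__ == "__main__":
    movable, skipped = plan_moves(".")
    for name, target in movable:
        print(f"identical: {name} -> {target}")
    for name, reason in skipped:
        print(f"skipped ({reason}): {name}")
```

The sketch only reports a plan; any pair listed as "differs" would need a manual look before it could be deduplicated.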

Activity

Mihai Nita 
February 14, 2025 at 11:17 PM

Created a clone ticket with a more comprehensive plan that covers more than MF2.

Mihai Nita 
February 14, 2025 at 11:15 PM

The PR unified the test data for MessageFormat 2, but was then reverted to unblock development.

Markus Scherer 
February 7, 2025 at 8:07 PM

There is a merged PR for this ticket.

Please close this ticket by Wed, Feb 12, and adjust its description to say what the PR did.

I assume that you will want to clone this once more.

Markus Scherer 
September 26, 2024 at 6:05 PM

Fixed

Details

Time Needed

Days

Created September 26, 2024 at 6:03 PM
Updated February 19, 2025 at 10:14 PM
Resolved February 19, 2025 at 10:14 PM