LocaleMatcher: set threshold, and non-static isMatch()

Description

I am looking at LocaleMatcher call sites and trying to figure out why some of them call match(desired, supported) rather than building a matcher instance and calling getBestMatch() as intended. I have been able to change some of them using newer API, especially setDirection(only-one-way) and setNoDefaultLocale().

A number of call sites work with a custom threshold, usually to narrow down to closer matches than the default. We do have internalSetThresholdDistance(int) in Java, but I think that exposing raw numbers for default and custom thresholds is too fragile (and that's why I kept that setter "internal"): they are subject to change with implementation details such as CLDR data, algorithm, and ICU code.

Some of these call sites already call match(desired, supported) with a sample pair of languages and set the threshold to that, or (mostly) implement their own loop over supported languages, call match(), and compare with their custom threshold.

I propose that we add a Builder setter like setMaxDistance(desired, supported). For matching up to and including that distance, the code would actually set the threshold one higher than that distance. (The implementation accepts matches with a distance less than the threshold.)
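To illustrate the off-by-one between "maximum accepted distance" and "threshold", here is a minimal, self-contained sketch. It does not use ICU: the class MaxDistanceSketch, the distance table, the default threshold of 50, and the accepts() helper are all hypothetical and exist only to show the arithmetic. Only the setter name setMaxDistance(desired, supported) comes from the proposal above.

```java
import java.util.Map;

// Toy model only: the distances are made-up numbers, not CLDR data or ICU code.
final class MaxDistanceSketch {
    // Hypothetical distance table keyed by "desired>supported".
    private static final Map<String, Integer> TOY_DISTANCES = Map.of(
            "en-GB>en-US", 5,
            "da>no", 12,
            "de>fr", 80);

    // Identical tags have distance 0; unknown pairs get a large toy distance.
    static int distance(String desired, String supported) {
        if (desired.equals(supported)) {
            return 0;
        }
        return TOY_DISTANCES.getOrDefault(desired + ">" + supported, 100);
    }

    static final class Builder {
        private int thresholdDistance = 50; // hypothetical default

        // Accept matches up to and including the distance of the sample pair:
        // the stored threshold is one higher than that distance.
        Builder setMaxDistance(String desired, String supported) {
            thresholdDistance = distance(desired, supported) + 1;
            return this;
        }

        int thresholdDistance() {
            return thresholdDistance;
        }
    }

    // A pair is accepted when its distance is strictly less than the threshold.
    static boolean accepts(int thresholdDistance, String desired, String supported) {
        return distance(desired, supported) < thresholdDistance;
    }

    public static void main(String[] args) {
        int threshold = new Builder().setMaxDistance("en-GB", "en-US").thresholdDistance();
        System.out.println(accepts(threshold, "en-GB", "en-US"));
        System.out.println(accepts(threshold, "da", "no"));
    }
}
```

With the sample pair en-GB/en-US, the toy threshold becomes one more than that pair's distance, so the sample pair itself still matches while the more distant da/no pair does not. The caller never sees or hard-codes the number.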

Of course, even this is somewhat fragile. Like all LocaleMatcher behavior, the relative match distances for various pairs of locales are subject to change. But relative distances seem better than magic threshold numbers.


These and other call sites usually look for a match/no-match result. Since all we offer is match() returning a double, call sites have to compare the result with their custom threshold. Also, in Java match() takes two versions each of the desired and supported locales (half of which are now ignored), and is static. As a static function it cannot be tweaked via Builder options.

I propose that we add a non-static boolean isMatch(desired, supported) for call sites that still can't use getBestMatch() or similar, and have it work according to relevant options (direction and threshold).
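As a rough sketch of what an instance-level isMatch() could look like from the caller's side, the toy class below keeps a threshold and a direction setting as instance state and returns a plain boolean. It is not ICU code: IsMatchSketch, its constructor, the requireTwoWay flag, and the distance table are hypothetical, and treating the two-way case as "the larger of the two directional distances" is an assumption made for illustration.

```java
import java.util.Map;

// Toy model only: made-up distances, not CLDR data or the ICU implementation.
final class IsMatchSketch {
    // Hypothetical one-directional distances keyed by "desired>supported".
    private static final Map<String, Integer> TOY_DISTANCES = Map.of(
            "en-GB>en-US", 5,
            "en-US>en-GB", 5,
            "gsw>de", 8); // deliberately one-way in this toy table

    private final int thresholdDistance;
    private final boolean requireTwoWay;

    // In a real matcher these would come from Builder options.
    IsMatchSketch(int thresholdDistance, boolean requireTwoWay) {
        this.thresholdDistance = thresholdDistance;
        this.requireTwoWay = requireTwoWay;
    }

    private static int distance(String desired, String supported) {
        if (desired.equals(supported)) {
            return 0;
        }
        return TOY_DISTANCES.getOrDefault(desired + ">" + supported, 100);
    }

    // Non-static: the answer depends on this instance's threshold and direction.
    boolean isMatch(String desired, String supported) {
        int d = distance(desired, supported);
        if (requireTwoWay) {
            // Assumption: a two-way match needs both directions to be close.
            d = Math.max(d, distance(supported, desired));
        }
        return d < thresholdDistance;
    }

    public static void main(String[] args) {
        System.out.println(new IsMatchSketch(50, false).isMatch("gsw", "de"));
        System.out.println(new IsMatchSketch(50, true).isMatch("gsw", "de"));
    }
}
```

The point of the sketch is that the same desired/supported pair can yield different answers on differently configured instances, which a static match() returning a double cannot express without each call site repeating the comparison itself.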

Activity

Markus Scherer
June 2, 2020, 10:52 PM

do you agree with the two additions as described here? Ok with the API names?

Markus Scherer
June 3, 2020, 6:38 PM

(I chatted with Mark yesterday; he did agree.)

Markus Scherer
June 4, 2020, 11:45 PM

API proposal sent today 2020-jun-04.

Fixed

Assignee

Markus Scherer

Reporter

Markus Scherer

Components

Labels

None

Reviewer

None

Priority

major

Time Needed

Hours

Fix versions
