levels for Arabic numerals are not resolved correctly

Description

(I suppose this might be a contributing factor to ; filing a new issue as this is more narrowly focused.)

I don’t understand the results that I’m getting from the ubidi implementation; they seem to disagree with my reading of UAX#9, and with the results given by the reference implementations.

Consider the simple text “A١B”, with paragraph level 0. What level should result for the Arabic numeral “١” here?

It has bidi type AN, so I would expect it to end up with level 2, as a result of rule I1 in UAX#9: “I1. For all characters with an even (left-to-right) embedding level, those of type R go up one level and those of type AN or EN go up two levels.”
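To make that expectation concrete, here is a minimal sketch of rule I1 on its own. It is not ICU code: the `BidiType` enum and the `apply_rule_i1` function are hypothetical names invented for this illustration, and it assumes the W and N rules have already run (for “A١B” they leave the types L, AN, L unchanged).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for this sketch (not ICU's API): only the resolved
   bidi types that rule I1 distinguishes. */
typedef enum { BIDI_L, BIDI_R, BIDI_EN, BIDI_AN } BidiType;

/* Apply UAX#9 rule I1 in place. Assumes types[] holds the types left
   after the W and N rules, and levels[] holds the embedding levels. */
void apply_rule_i1(const BidiType *types, uint8_t *levels, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (levels[i] & 1) {
            continue;                 /* odd level: I1 does not apply */
        }
        if (types[i] == BIDI_R) {
            levels[i] += 1;           /* R goes up one level */
        } else if (types[i] == BIDI_AN || types[i] == BIDI_EN) {
            levels[i] += 2;           /* AN and EN go up two levels */
        }
        /* L stays at its even level */
    }
}
```

Called with types {L, AN, L} and levels {0, 0, 0}, this leaves the levels at {0, 2, 0}, which is the result I would expect for “A١B”.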

Indeed, this is what the reference implementation shows:

But when I use ICU4C, calling ubidi_setPara and then ubidi_getLevels, the levels of all three characters remain 0. Is ICU4C failing to apply I1 for some reason?

Created December 10, 2023 at 6:11 PM
Updated May 16, 2024 at 5:46 PM