Transliterator getSource, getTarget problems

Description

getSourceSet and getTargetSet are incorrectly implemented. Whenever a string is
affected, but not all of its constituent characters are, only the string should
be added, not the constituents. While this cannot be exact, it should be much
closer than it is now.

For rule-based (RB) transliterators, the way to do this (e.g. with getSource)
is to walk through the rules and store elements as follows:

Step A

Case 1
ab <set> c > ...
where <set> is some non-trivial UnicodeSet, or quantified element like a*
To the result add each of "ab", <set>, "c".

Case 2
ab ($v1*) c > $1

To the result add each of "ab", "c". Don't add $v1, since it is preserved in the
output. Case 2 might be too hard initially; if so, do later.
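
The two cases above can be sketched roughly as follows. This is only an illustrative model, not ICU code: the `Literal`, `CharSet`, `Segment` and `Rule` classes and the `collect_source` function are hypothetical names invented for the example, and a plain `frozenset` of characters stands in for a UnicodeSet or quantified element. It assumes a captured segment counts as "preserved" whenever its number appears in the rule's output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    text: str              # a literal run in the rule source, e.g. "ab"


@dataclass(frozen=True)
class CharSet:
    chars: frozenset       # stand-in for a UnicodeSet or quantified element


@dataclass(frozen=True)
class Segment:
    number: int            # capture number, e.g. 1 for $1
    inner: tuple           # elements inside the parentheses


@dataclass(frozen=True)
class Rule:
    source: tuple                              # elements left of ">"
    copied_segments: frozenset = frozenset()   # segments echoed in the output


def collect_source(rules):
    """Collect the source set, keeping affected strings whole."""
    result = set()

    def visit(elements, copied):
        for el in elements:
            if isinstance(el, Literal):
                # Add the string itself, not its constituent characters.
                result.add(el.text)
            elif isinstance(el, CharSet):
                # Add the members of the set.
                result.update(el.chars)
            elif isinstance(el, Segment):
                # Case 2: skip segments that are copied to the output.
                if el.number not in copied:
                    visit(el.inner, copied)

    for rule in rules:
        visit(rule.source, rule.copied_segments)
    return result


# Case 1: ab <set> c > ...
case1 = Rule((Literal("ab"), CharSet(frozenset("xyz")), Literal("c")))

# Case 2: ab ($v1*) c > $1
case2 = Rule(
    (Literal("ab"), Segment(1, (CharSet(frozenset("xyz")),)), Literal("c")),
    copied_segments=frozenset({1}),
)
```

In this model, Case 1 contributes "ab", "c" and the members of the set, while Case 2 contributes only "ab" and "c" because segment 1 is copied through to the output.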

Status

Assignee

Mark Davis

Reporter

TracBot

Labels

Reviewer

None

Time Needed

Days

Start date

None

Components

Priority

major