does CollationElementIterator (need to) support switching next/previous direction on the fly?

Description

We have a conflict in documentation. We need to clarify what we need and what we support.

For implementation simplicity, hopefully we can disallow direction switching in Java like we document in C/C++.

  • ICU4C CollationElementIterator documents: "The Collation Element Iterator moves only in one direction between calls to CollationElementIterator::reset. That is, CollationElementIterator::next() and CollationElementIterator:revious can not be inter-used. [...] If a change of direction is done without a CollationElementIterator::reset(), the result is undefined."

  • Same text in C ucoleitr.h file documentation.

  • Presumably, the direction can also change after calling setOffset().

  • ICU4J CollationElementIterator documents: "... when you change direction while iterating (i.e., call next() and then call previous(), or call previous() and then call next()), you'll get back the same element twice."

Activity

Show:
TracBot
June 30, 2018, 11:31 PM
Trac Comment 1 by —2012-02-09T16:23:34.681Z

Make an experiment: Add temporary instrumentation into the CollationElementIterator code to see if the test suites rely on changing direction.

TracBot
June 30, 2018, 11:31 PM
Trac Comment 2 by —2012-02-22T20:18:58.233Z

Work with dbesevic on this.

TracBot
June 30, 2018, 11:31 PM
Trac Comment 6 by —2013-04-09T04:10:23.151Z

I did some investigation if this additional restriction in ICU4J is acceptable in ICU4J implementation code. By prohibiting next() followed by previous(), or previous() followed by next() without calling reset(), following ICU4J test cases are failed.

All test case failures under Collate/CollationTest/* are caused by how those test cases are written. They directly calls next() until NULL_ORDER is returned, followed by previous() until NULL_ORDER is returned (or vise versa). No test cases are trying to access collation element in the middle back and forth. These can be updated by just putting reset() between two loops.

Test cases under Collate/SearchTest/* are failed because of StringSearch implementation. StringSearch keeps an instance of CollationElementIterator and calls next() and previous() in different locations. However, it looks all of these failing code path actually calls one of following methods between next() and previous()

setText is replacing the source string with new one. So it does not depend on previous state. setOffset and setExactOffset internally has the similar effect with reset() and it actually initialize the internal structure necessary for tracking forward/backward iteration. So, as long as we allow iteration direction change with setOffset/setText (and package private setExactOffset) in addition to reset(), the current StringSearch implementation is not a blocker for the behavior change discussed in this ticket.

TracBot
June 30, 2018, 11:31 PM
Trac Comment 8 by —2013-04-25T21:14:49.008Z

Note from the ICU PMC meeting on 2013-04-24

  • ICU4C API doc to specify setOffset/setText also have the same effect with reset() for changing direction.

  • ICU4J API doc to include setText as well.

  • ICU4J to document next()/previous() without reset() (etc) is not guaranteed, not proactively check at run time.

I put run time "assertions" for checking above to make sure our implementation/test codes are OK, although our consensus was not to check this proactively.

Fixed

Assignee

Yoshito Umaoka

Reporter

Markus Scherer

Components

Labels

None

Reviewer

None

Priority

major

Time Needed

Hours

Fix versions