add multiple paragraph support to ubidi APIs

Description

Other bidi packages support running bidi on multiple paragraphs at a
time. Clients porting from these packages to ICU currently have to
implement this support themselves. This proposal adds to and extends
the ICU ubidi APIs so that they support multiple paragraphs.

The approach taken is to enable multiple paragraphs by default.
Formerly, the ICU bidi implementation operated on only one paragraph
at a time; more than one paragraph could be passed to it, but the
results were undefined. Now the results are defined: by default, the
levels returned will be the same as those returned if each paragraph
(including its terminating paragraph separator) were passed separately
and the results concatenated. This means the output of the bidi code
will change, but only for input that was not formerly defined as
acceptable.
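
The "as if each paragraph were passed separately" rule depends on where
the paragraph boundaries fall. The following is a minimal sketch of that
segmentation, not ICU's implementation. The helper name
find_paragraph_limits is hypothetical, the separator set is the BMP code
points with Bidi class B, and keeping CR LF together as one terminator is
an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BMP code points with Bidi class B (paragraph separator). */
static int is_paragraph_separator(uint16_t c) {
    return c == 0x000A || c == 0x000D ||
           (c >= 0x001C && c <= 0x001E) ||
           c == 0x0085 || c == 0x2029;
}

/*
 * Hypothetical helper, not an ICU function. Records the limit (exclusive
 * end index) of each paragraph in text[0..length). Each paragraph includes
 * its terminating separator. Returns the number of paragraphs found; only
 * the first `capacity` limits are stored.
 */
int32_t find_paragraph_limits(const uint16_t *text, int32_t length,
                              int32_t *limits, int32_t capacity) {
    int32_t count = 0;
    assert(length >= 0 && (text != NULL || length == 0));
    for (int32_t i = 0; i < length; ++i) {
        int at_end = (i == length - 1);
        if (!is_paragraph_separator(text[i]) && !at_end) {
            continue;   /* still inside the current paragraph */
        }
        if (text[i] == 0x000D && !at_end && text[i + 1] == 0x000A) {
            continue;   /* CR LF: let the LF close the paragraph */
        }
        if (limits != NULL && count < capacity) {
            limits[count] = i + 1;
        }
        ++count;
    }
    return count;
}
```

Under this model, the levels for the whole text would be the levels of
text[0..limits[0]), text[limits[0]..limits[1]), and so on, each computed
on its own and then concatenated. Text that does not end in a separator
still yields a final paragraph, and empty text yields none (a choice made
by this sketch).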

For compatibility with other implementations, an API is provided
to ensure that the paragraphs end up in left-to-right order if
reordered according to the levels array: this option sets the levels
corresponding to paragraph separators to level 0. This is state
that can be set on the Bidi object, similar to the setInverse option.
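
The effect of that option on the levels array can be sketched as a
post-processing pass. This is only an illustration under stated
assumptions, not ICU's code: the function name
force_separator_levels_to_zero is hypothetical, and the separator set is
the BMP code points with Bidi class B.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BMP code points with Bidi class B (paragraph separator). */
static int is_paragraph_separator(uint16_t c) {
    return c == 0x000A || c == 0x000D ||
           (c >= 0x001C && c <= 0x001E) ||
           c == 0x0085 || c == 0x2029;
}

/*
 * Hypothetical helper, not an ICU function. Forces the embedding level of
 * every paragraph separator in text[0..length) to 0 and leaves all other
 * levels unchanged.
 */
void force_separator_levels_to_zero(const uint16_t *text, uint8_t *levels,
                                    int32_t length) {
    assert(length >= 0 && ((text != NULL && levels != NULL) || length == 0));
    for (int32_t i = 0; i < length; ++i) {
        if (is_paragraph_separator(text[i])) {
            levels[i] = 0;
        }
    }
}
```

The intuition is that reordering reverses contiguous runs at or above a
given level. With every separator at level 0, no such run can span a
paragraph boundary, so each paragraph is reordered internally while the
paragraphs themselves keep their logical, left-to-right sequence.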

APIs are added to return the number of paragraphs, to get information
about the paragraph containing the character at index n, and to get
information about the nth paragraph. These are modeled on the APIs that
return information about logical and visual runs.

In addition, error checking will be enhanced.

The semantics of ubidi_getParaLevel are modified somewhat now that
multiple paragraphs are possible:

 * @return The paragraph level. If there are multiple paragraphs, their
 *         level may vary if the required paraLevel is UBIDI_DEFAULT_LTR or
 *         UBIDI_DEFAULT_RTL. In that case, the level of the first paragraph
 *         is returned.
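
To see why levels can differ between paragraphs, here is a rough sketch of
default-level resolution in the spirit of UBA rules P2 and P3: the first
strong character in a paragraph decides its level, and the requested
default applies only when there is none. Everything here is an
assumption made for illustration. The names are hypothetical, the
SKETCH_DEFAULT_* constants merely stand in for UBIDI_DEFAULT_LTR and
UBIDI_DEFAULT_RTL, and the strong-character test is a toy that knows only
ASCII, Hebrew, and Arabic letters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for UBIDI_DEFAULT_LTR / UBIDI_DEFAULT_RTL in this sketch. */
enum { SKETCH_DEFAULT_LTR = 0xFE, SKETCH_DEFAULT_RTL = 0xFF };

/*
 * Toy classifier: returns 0 for a strong left-to-right character, 1 for a
 * strong right-to-left character, and -1 for everything else. A real
 * implementation would consult the Unicode Bidi_Class property.
 */
static int strong_direction(uint16_t c) {
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return 0;
    if (c >= 0x05D0 && c <= 0x05EA) return 1;   /* Hebrew letters */
    if (c >= 0x0627 && c <= 0x064A) return 1;   /* Arabic letters */
    return -1;
}

/*
 * Hypothetical helper, not an ICU function. Resolves the level of the
 * paragraph text[start..limit). An explicit level is returned unchanged;
 * a default request is decided by the first strong character, falling
 * back to the direction named by the default.
 */
uint8_t resolve_paragraph_level(const uint16_t *text, int32_t start,
                                int32_t limit, uint8_t requested) {
    assert(text != NULL && start >= 0 && start <= limit);
    if (requested != SKETCH_DEFAULT_LTR && requested != SKETCH_DEFAULT_RTL) {
        return requested;
    }
    for (int32_t i = start; i < limit; ++i) {
        int d = strong_direction(text[i]);
        if (d >= 0) {
            return (uint8_t)d;
        }
    }
    return (uint8_t)(requested & 1);
}
```

With a Latin first paragraph and a Hebrew second paragraph, this sketch
resolves levels 0 and 1 respectively; per the documentation above,
ubidi_getParaLevel would report the level of the first paragraph.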

The proposed new APIs are as follows:

/**
 * Get the number of paragraphs.
 *
 * @param pBiDi is the paragraph or line <code>UBiDi</code> object.
 *
 * @return The number of paragraphs.
 * @stable ICU 3.4
 */
U_STABLE int32_t U_EXPORT2
ubidi_countParagraphs(UBiDi *pBiDi);

/**
 * Get a paragraph, given a position within the paragraph.
 * This function returns information about a paragraph.<p>
 *
 * @param pBiDi is the paragraph or line <code>UBiDi</code> object.
 *
 * @param charIndex is the index of a character within the text, in the
 *        range <code>[0..ubidi_getLength(pBiDi)-1]</code>.
 *
 * @param pParaStart will receive the index of the first character in
 *        the paragraph.
 *        This pointer can be <code>NULL</code> if this
 *        value is not necessary.
 *
 * @param pParaLimit will receive the limit of the paragraph.
 *        The l-value that you point to here may be the
 *        same expression (variable) as the one for
 *        <code>charIndex</code>.
 *        This pointer can be <code>NULL</code> if this
 *        value is not necessary.
 *
 * @param pParaLevel will receive the level of the paragraph.
 *        This pointer can be <code>NULL</code> if this
 *        value is not necessary.
 *
 * @param pErrorCode must be a valid pointer to an error code value.
 *
 * @return The index of the paragraph containing the specified position.
 * @stable ICU 3.4
 */
U_STABLE int32_t U_EXPORT2
ubidi_getParagraph(const UBiDi *pBiDi, int32_t charIndex, int32_t *pParaStart,
                   int32_t *pParaLimit, UBiDiLevel *pParaLevel,
                   UErrorCode *pErrorCode);

/**
 * Get a paragraph, given the index of this paragraph.
 *
 * This function returns information about a paragraph.<p>
 *
 * @param pBiDi is the paragraph <code>UBiDi</code> object.
 *
 * @param paraIndex is the number of the paragraph, in the
 *        range <code>[0..ubidi_countParagraphs(pBiDi)-1]</code>.
 *
 * @param pParaStart will receive the index of the first character in
 *        the paragraph.
 *        This pointer can be <code>NULL</code> if this
 *        value is not necessary.
 *
 * @param pParaLimit will receive the limit of the paragraph.
 *        This pointer can be <code>NULL</code> if this
 *        value is not necessary.
 *
 * @param pParaLevel will receive the level of the paragraph.
 *        This pointer can be <code>NULL</code> if this
 *        value is not necessary.
 *
 * @param pErrorCode must be a valid pointer to an error code value.
 *
 * @stable ICU 3.4
 */
U_STABLE void U_EXPORT2
ubidi_getParagraphByIndex(const UBiDi *pBiDi, int32_t paraIndex,
                          int32_t *pParaStart, int32_t *pParaLimit,
                          UBiDiLevel *pParaLevel, UErrorCode *pErrorCode);
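
The contract of the two lookup functions can be modeled over a plain
table of paragraph limits. This is a self-contained sketch, not the ICU
implementation: the ParagraphTable type and both function names are
hypothetical, and errors are reported through return values rather than
a UErrorCode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model. Paragraph i covers [start, limits[i]), where start
 * is 0 for the first paragraph and limits[i - 1] otherwise.
 */
typedef struct {
    const int32_t *limits;  /* ascending paragraph limits */
    const uint8_t *levels;  /* one resolved level per paragraph */
    int32_t count;          /* number of paragraphs */
} ParagraphTable;

/* Lookup by paragraph index. Returns 0 on success, -1 if out of range. */
int paragraph_by_index(const ParagraphTable *t, int32_t paraIndex,
                       int32_t *pStart, int32_t *pLimit, uint8_t *pLevel) {
    assert(t != NULL);
    if (paraIndex < 0 || paraIndex >= t->count) {
        return -1;
    }
    /* Each output pointer is optional, as in the documented API. */
    if (pStart != NULL) *pStart = (paraIndex == 0) ? 0 : t->limits[paraIndex - 1];
    if (pLimit != NULL) *pLimit = t->limits[paraIndex];
    if (pLevel != NULL) *pLevel = t->levels[paraIndex];
    return 0;
}

/*
 * Lookup by character position. Returns the index of the paragraph that
 * contains charIndex, or -1 if charIndex is outside the text.
 */
int32_t paragraph_containing(const ParagraphTable *t, int32_t charIndex,
                             int32_t *pStart, int32_t *pLimit,
                             uint8_t *pLevel) {
    assert(t != NULL);
    if (charIndex < 0) {
        return -1;
    }
    for (int32_t i = 0; i < t->count; ++i) {
        if (charIndex < t->limits[i]) {
            paragraph_by_index(t, i, pStart, pLimit, pLevel);
            return i;
        }
    }
    return -1;
}
```

Because charIndex is passed by value, a caller can pass the address of
the same variable as the limit output and walk the text one paragraph at
a time, for example
while (paragraph_containing(&table, pos, NULL, &pos, NULL) >= 0) { ... }.
This mirrors the note in the ubidi_getParagraph documentation that
pParaLimit may point to the same variable as charIndex.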

Activity

TracBot
June 30, 2018, 11:47 PM
Trac Comment by auditor—1970-01-01T01:16:44.000Z
  • Sun Jan 23 08:01:40 2005 mati changed notes2: summmary: "" to "add multiple paragraph support to ubidi APIs ",

  • Sun Jan 23 21:17:42 2005 grhoten changed notes2: review: "dougfelt" to "doug", summmary: "add multiple paragraph support to ubidi APIs" to "",

  • Sun Jan 23 21:17:42 2005 grhoten moved from incoming to others

  • Thu Feb 17 12:49:20 2005 dougfelt moved from others to fixed

  • Wed Aug 3 11:36:39 2005 weiv moved from fixed to closed

Fixed

Assignee: matitiahu.allouche@gmail.com
Reporter: TracBot
Labels: None
Reviewer: None
Priority: minor
Time Needed: None