  1. ICU-1122

RFE: wordbreaking rule for Chinese Ideograph

    Details

    • Type: Enhancement
    • Status: Done
    • Priority: major
    • Resolution: Won't Fix [deprecated]
    • Affects versions: None
    • Fix versions: 3.2
    • Components: unknown
    • Labels:
    • Time Needed:
      Hours
    • tracOwner:
      andy
    • tracReporter:
    • tracResolution:
      wontfix
    • tracStatus:
      closed
    • tracWeeks:
      0.1

      Description

      The word-breaking rules for Chinese ideographs in ICU4J include the following:

      + "$kanji=[\u3005\u4e00-\u9fa5\uf900-\ufa2d];"
      // keep together runs of Kanji
      + "$kanji*;"

      The rule above affects every locale that uses Chinese characters (CJKT).
      In Chinese text, as written in China and Taiwan, an entire line could be
      selected as a single word, because such text usually has no spaces or
      symbols between words.
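
The effect described above can be sketched with a plain Java regular expression. This is only an approximation for illustration: it does not use the ICU4J break iterator, and the class name KanjiRunDemo, the method kanjiRuns, and the sample sentence are hypothetical. It uses the same character set as $kanji and `+` in place of `*` so that empty matches are skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KanjiRunDemo {
    // Same character set as $kanji in the rule quoted in this ticket.
    // The regex engine itself interprets the \\uXXXX escapes.
    private static final Pattern KANJI_RUN =
        Pattern.compile("[\\u3005\\u4e00-\\u9fa5\\uf900-\\ufa2d]+");

    // Returns every maximal run of ideographs, which is what a
    // "keep together runs of Kanji" rule would treat as one word.
    static List<String> kanjiRuns(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = KANJI_RUN.matcher(text);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        // A seven-character Chinese sentence with no spaces
        // (hypothetical sample text, roughly "we go to Beijing today").
        String line = "\u6211\u4eec\u4eca\u5929\u53bb\u5317\u4eac";
        List<String> runs = kanjiRuns(line);
        System.out.println(runs.size() + " run(s): " + runs);
    }
}
```

Because the sample sentence contains no spaces or punctuation, the whole sentence comes back as a single run, even though it is made up of several words.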

              People

              • Assignee:
                andy.heninger Andy Heninger
                Reporter:
                apibot TracBot
              • Votes:
                0
                Watchers:
                1
