Line Break of paired « » quotes poor for French; request improvements.

Description

Hi,

The following program examines the line-break positions within
the text "AB « CD » EF" for the locale "fr_FR":

    UBreakIterator *brit;
    UErrorCode status = U_ZERO_ERROR;
    char cStringToExamine[] = "AB « CD » EF";
    UChar stringToExamine[sizeof(cStringToExamine)+1];
    int it;

    u_uastrcpy(stringToExamine, cStringToExamine);

    brit = ubrk_open(UBRK_LINE, "fr_FR", stringToExamine, -1, &status);
    if (U_FAILURE(status)) {
        printf("ubrk_open error: %s\n", u_errorName(status));
        exit(1);
    }

    it = ubrk_first(brit);
    while (it != UBRK_DONE) {
        it = ubrk_next(brit);
        printf("it=%d\n",it);
    }
    ubrk_close(brit);

The result is:

    it=3
    it=5
    it=8
    it=10
    it=12
    it=-1

The correct result would be

    it=3
    it=10
    it=12
    it=-1

which is the same result as with the text "AB ( CD ) EF".

In French, U+00AB («) behaves like an opening bracket and U+00BB (»)
like a closing bracket.
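
The requested behaviour can be illustrated with a minimal sketch in plain C. It is not ICU's algorithm and uses no ICU API: the helpers is_opening, is_closing and expected_breaks are hypothetical names introduced only for this illustration. It assumes a deliberately simplified rule: a break is allowed after a run of spaces, unless that run follows an opening character or is followed by a closing character, and the end of the text is always a break.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not ICU API. Only the characters from this report
 * are classified: '(' and U+00AB open, ')' and U+00BB close. */
static int is_opening(uint16_t c) { return c == '(' || c == 0x00AB; }
static int is_closing(uint16_t c) { return c == ')' || c == 0x00BB; }

/*
 * Stores the break offsets for the UTF-16 code units text[0..len) in out
 * (at most cap entries) and returns how many were stored.
 *
 * Simplified rule (an assumption for illustration, not the full line
 * breaking algorithm): break after a run of spaces, except when the run
 * follows an opening character or precedes a closing character.
 */
size_t expected_breaks(const uint16_t *text, size_t len, int *out, size_t cap)
{
    size_t n = 0;

    for (size_t i = 1; i < len; i++) {
        /* Only the position right after a run of spaces is a candidate. */
        if (text[i - 1] != ' ' || text[i] == ' ')
            continue;

        /* Find the start of the run of spaces that ends at i. */
        size_t j = i - 1;
        while (j > 0 && text[j - 1] == ' ')
            j--;

        if (j > 0 && is_opening(text[j - 1]))
            continue;               /* no break after "« " or "( " */
        if (is_closing(text[i]))
            continue;               /* no break before " »" or " )" */

        if (n < cap)
            out[n++] = (int)i;
    }

    /* The end of the text is always a break position. */
    if (len > 0 && n < cap)
        out[n++] = (int)len;

    return n;
}
```

Called on the twelve UTF-16 code units of "AB « CD » EF", this sketch yields the offsets 3, 10 and 12, and it yields the same offsets for "AB ( CD ) EF".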

Kurt Stützer

Environment:
Status:
Assignee: Andy Heninger
Reporter: TracBot
Labels:
Time Needed: Days
tracCreated: Dec 12, 2011, 6:59 PM
tracOwner: andy
tracProject: ICU4C
tracReporter: kurt@26c001cff1b556ff
tracStatus: accepted
tracWeeks: 1
Components:
Priority: medium