
Regex, behavior of \cx (Control-X) different from Java and Perl

Description

\cX in a regular expression pattern means Control-X.

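ICU ANDs X with 0x1f, using the ICU Unescape() function.

Java XORs X with 0x40. This gives the same result for A-Z (and for the other characters in the range 0x40-0x5F), but differs for everything else. For example, under these rules \ca is 0x01 in ICU but 0x21 in Java.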
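The difference between the two rules can be checked with plain arithmetic. The sketch below is not ICU or Java code. The function names ctrl_and and ctrl_xor are hypothetical and only model the AND and XOR rules described above, applied to a single ASCII character.

```python
def ctrl_and(ch: str) -> int:
    """Model of the rule described for ICU: AND the code point with 0x1f."""
    return ord(ch) & 0x1F


def ctrl_xor(ch: str) -> int:
    """Model of the rule described for Java: XOR the code point with 0x40."""
    return ord(ch) ^ 0x40


# ASCII characters for which the two rules produce the same value.
agree = [chr(c) for c in range(0x80) if ctrl_and(chr(c)) == ctrl_xor(chr(c))]

if __name__ == "__main__":
    print("rules agree on:", "".join(agree))
    print("\\ca  AND rule: 0x%02x  XOR rule: 0x%02x" % (ctrl_and("a"), ctrl_xor("a")))
```

Under these assumptions the two rules agree only on 0x40-0x5F, so lowercase letters, digits, and punctuation below 0x40 are where the engines would diverge.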

PCRE first uppercases X, then XORs with 0x40. I don't know whether the uppercasing works outside the ASCII range.

Perl uppercases, then XORs with 0x40. Non-ASCII behavior is unknown.
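The uppercase-then-XOR rule described for PCRE and Perl can be modeled the same way. This is a minimal sketch, not code from either engine. The name ctrl_upper_xor is hypothetical, and only ASCII a-z is folded, since the non-ASCII behavior is unknown.

```python
def ctrl_upper_xor(ch: str) -> int:
    """Model of the rule described for PCRE and Perl: uppercase, then XOR with 0x40."""
    code = ord(ch)
    # Assumption: only ASCII a-z is uppercased; non-ASCII input is left as-is
    # because the report does not say how the engines treat it.
    if ord("a") <= code <= ord("z"):
        code -= 0x20
    return code ^ 0x40


if __name__ == "__main__":
    for ch in "Aa[1":
        print("\\c%s -> 0x%02x" % (ch, ctrl_upper_xor(ch)))
```

With this rule, upper- and lowercase letters map to the same control character, which matches the AND 0x1f rule for letters. Characters below 0x40 still differ: for the digit 1, the XOR gives 0x71 while an AND with 0x1f gives 0x11.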

Perl and Java differing may be a Java bug. Ask Sun.

Environment

Status

Assignee

Andy Heninger

Reporter

Andy Heninger

Time Needed

Hours

tracCreated

Dec 06, 2007, 7:12 PM

tracOwner

andy

tracProject

ICU4C

tracReporter

andy

tracStatus

accepted

tracWeeks

0.1

Components

Priority

minor