Fixes to Hiragana-Katakana for U+309B, U+309C and NFKC (NFC)

Description

Per e-mail discussion, a couple of fixes to the Hiragana-Katakana transform:

1. Exclude the spacing dakuten and han-dakuten (U+309B, U+309C) from the transform so that NFKC does not compatibility-decompose them to U+0020 plus a combining mark. To do this, subtract U+309B and U+309C from the following filter set, which appears at both the top and the bottom of the transform:

:: [\u0000-\u007E 、。 ゙-゜ ァ-ー 。-゚ー[:Hiragana:] [:Katakana:] [:nonspacing mark:]] ;

2. At the end of the transform, use NFC to balance the starting NFKC. Change the following

:: NFKC (); // at the top
...
:: (NFKC) ; // at the bottom

to

:: NFKC (NFC); // at the top
...
:: NFC (NFKC) ; // at the bottom
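
The normalization behavior behind both fixes can be checked outside the transform. The following is a minimal sketch using Python's standard unicodedata module rather than the transform engine itself, so it illustrates only the Unicode normalization forms involved, not the Hiragana-Katakana rules. The helper names (nfkc, nfc, codepoints) are hypothetical and exist only for this illustration.

```python
import unicodedata

# Spacing marks named in fix 1 and their NFKC results:
# each becomes U+0020 (space) followed by the matching combining mark.
SPACING_MARKS = {
    "\u309B": "\u0020\u3099",  # spacing dakuten -> space + combining dakuten
    "\u309C": "\u0020\u309A",  # spacing han-dakuten -> space + combining han-dakuten
}


def nfkc(text):
    """Apply Unicode Normalization Form KC."""
    return unicodedata.normalize("NFKC", text)


def nfc(text):
    """Apply Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def codepoints(text):
    """Render a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Fix 1: NFKC rewrites the spacing marks, which is why they are
# subtracted from the filter set. NFC leaves them untouched.
for mark, decomposed in SPACING_MARKS.items():
    assert nfkc(mark) == decomposed
    assert nfc(mark) == mark
    print(codepoints(mark), "-> NFKC ->", codepoints(nfkc(mark)))

# Fix 2: NFC recomposes a base kana plus combining mark, so output
# left decomposed by the rules ends up in composed form.
ka_plus_dakuten = "\u304B\u3099"  # HIRAGANA KA + combining dakuten
assert nfc(ka_plus_dakuten) == "\u304C"  # HIRAGANA GA
print(codepoints(ka_plus_dakuten), "-> NFC ->", codepoints(nfc(ka_plus_dakuten)))
```

The first loop shows why U+309B and U+309C need to stay outside the filter: once NFKC sees them, they turn into a space plus a combining mark. The last check shows the effect of ending with NFC: a kana followed by a combining dakuten is recomposed into the single precomposed character where one exists.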

 

xpath

None

locale

None

Status

Priority

medium

Assignee

Peter Edberg

Reporter

Peter Edberg

tracReporter

None

Reviewer

Mark Davis

Labels

None

Components

Fix versions

phase

rc