possible collation rules for Dutch

Description

Hmm, as far as I know, Dutch collates ij as a variant of y.
I've also included some rules from EOR.

%% Dutch (nl, nl_*), minimal tailoring (more may be needed)

%% Inherited (and pseudo-implied) from EOR-1; far from updated for EOR-2 (and EOR-2 still (Nov. 2001) needs a final review)

" & \u0020" % from EOR-2

" < \u0024" % DOLLAR SIGN
" < \u00A2" % CENT SIGN
" < \u00A3" % POUND SIGN
" < \u00A4" % CURRENCY SIGN
" < \u00A5" % YEN SIGN
" < \u20A1" % COLON SIGN
" < \u20A2" % CRUZEIRO SIGN
" < \u20A3" % FRENCH FRANC SIGN
" < \u20A4" % LIRA SIGN
" < \u20A5" % MILL SIGN
" < \u20A6" % NAIRA SIGN
" < \u20A7" % PESETA SIGN
" < \u20A8" % RUPEE SIGN
" < \u20A9" % WON SIGN
" < \u20AA" % NEW SHEQEL SIGN
" < \u20AB" % DONG SIGN
" < \u20AC" % EURO SIGN
" < \u20AD" % KIP SIGN
" < \u20AE" % TUGRIK SIGN
" < \u20AF" % DRACHMA SIGN
" < \u20A0" % EURO-CURRENCY SIGN (NOT the EURO) (not in EOR-2)
" < \u09F2" % BENGALI RUPEE MARK (not in EOR-2)
" < \u09F3" % BENGALI RUPEE SIGN (not in EOR-2)
" < \u093F" % THAI CURRENCU SYMBOL BATH (not in EOR-2)
" < \u17DB" % KHMER CURRENCY SYMBOL RIEL (not in EOR-2)
" < \uFDFC" % RIAL SIGN (not in EOR-2)

" < \u02BB" % MODIFIER LETTER TURNED COMMA
" < \u02BD" % MODIFIER LETTER REVERSED COMMA
" < \u02BC" % MODIFIER LETTER APOSTROPHE
" < \u02BF" % MODIFIER LETTER LEFT HALF RING
" < \u02D1" % MODIFIER LETTER HALF TRIANGULAR COLON
" < \u02D0" % MODIFIER LETTER TRIANGULAR COLON
" < \u02D1" % MODIFIER LETTER HALF TRIANGULAR COLON
" < \u02EE" % MODIFIER LETTER DOUBLE APOSTROPHE
" < \u0559" % ARMENIAN MODIFIER LETTER LEFT HALF RING
% more?

% Each of the characters above is to be ignored, according to EOR-2 (but is not ignored in the 14651 CTT);
% I'm very far from sure on how to express that in ICU. [variable top]???
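
To make the question above concrete, here is a rough sketch in plain Python (not ICU) of what "ignored" would mean for sorting: the listed characters contribute nothing to the first-level comparison and only break ties afterwards. The `IGNORED` set (a small subset of the characters above) and the `sort_key` helper are hypothetical names for this illustration, and comparing lowercased strings by code point is only a stand-in for real primary weights.

```python
# Sketch only: emulates "ignore these characters at the first level".
# This is not ICU; IGNORED and sort_key are illustrative names.
IGNORED = set("\u0024\u00A3\u20AC\u02BC")  # dollar, pound, euro, modifier apostrophe

def sort_key(text):
    # First level: the text with ignored characters dropped, lowercased.
    primary = "".join(ch for ch in text if ch not in IGNORED).lower()
    # Tie-breaker: the original text, so "$b" and "b" still compare unequal.
    return (primary, text)

words = ["c", "$b", "a", "b"]
print(sorted(words, key=sort_key))  # "$b" lands next to "b" instead of first
```

How to get the same effect from an ICU tailoring is exactly the open question in the comment above; this sketch only shows the intended ordering.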

" & AD" % stay true to history, at least for some select characters
" << \u0040" % COMMERCIAL AT, @ (ad ligature) (not tailored in EOR-2)

" & D" % from EOR-1
" << \u0111" % LATIN SMALL LETTER D WITH STROKE (Sami, pronounced the same as
ETH is)
" <<< \u0112" % LATIN CAPITAL LETTER D WITH STROKE
" << \u00F0" % LATIN SMALL LETTER ETH
" <<< \u00D0" % LATIN CAPITAL LETTER ETH

" & ET" % stay true to history, at least for some select characters
" << \u0026" % AMPERSAND, & (et ligature) (not tailored in EOR-2)

" & E" % from EOR-1
" << \u0259" % LATIN SMALL LETTER SCHWA
" <<< \u018F" % LATIN CAPITAL LETTER SCHWA

" & F" % from EOR-1
" << \u0192" % LATIN SMALL LETTER F WITH HOOK
" <<< \u0191" % LATIN CAPITAL LETTER F WITH HOOK (EOR-2)

" & G" % from EOR-1
" << \u01E5" % LATIN SMALL LETTER G WITH STROKE
" <<< \u01E4" % LATIN CAPITAL LETTER G WITH STROKE

" & H" % from EOR-1
" << \u0127" % LATIN SMALL LETTER H WITH STROKE
" <<< \u0126" % LATIN CAPITAL LETTER H WITH STROKE

" & I" % from EOR-1
" << \u0131" % LATIN SMALL LETTER DOTLESS I

" & L" % from EOR-1
" << \u0142" % LATIN SMALL LETTER L WITH STROKE
" <<< \u0141" % LATIN CAPITAL LETTER L WITH STROKE
" << \u0140" % LATIN SMALL LETTER L WITH MIDDLE DOT
" <<< \u013F" % LATIN CAPITAL LETTER L WITH MIDDLE DOT

" & LB" % stay true to history, at least for some select characters (not in
EOR-2)
" << \u2114" % L B BAR SYMBOL
" << \u0023" % NUMBER SIGN

" & N" % from EOR-1
" << \u014B" % LATIN SMALL LETTER ENG
" <<< \u014A" % LATIN CAPITAL LETTER ENG
" << \u0149" % LATIN SMALL LETTER N PRECEDED BY APOSTROPHE

" & P" % order Weierstrass p as a variant of P
" << \u2118" % SCRIPT CAPITAL P, a calligraphic small p really (not in
EOR-2)

" & R" % from EOR-1
" << \u027C" % LATIN SMALL LETTER R WITH LONG LEG

" & T" % from EOR-1
" << \u0167" % LATIN SMALL LETTER T WITH STROKE (pronounced the same as THORN
is)
" <<< \u0166" % LATIN CAPITAL LETTER T WITH STROKE

" & W" % from EOR-2
" << \u01BF" % LATIN LETTER WYNN
" <<< \u01F7" % LATIN CAPITAL LETTER WYNN

" & Z" % from EOR-1
" << \u0292" % LATIN SMALL LETTER EZH
" <<< \u01B7" % LATIN SMALL LETTER EZH

" & Y" % order IJ as a variant of Y
" << ij"
" <<< \u0133" % LATIN SMALL LIGATURE IJ
% Ij and iJ are miscapitalisations w.r.t. Dutch and are not handled here.
" <<< IJ"
" <<< \u0132" % LATIN CAPITAL LIGATURE IJ

Assignee: Mark Davis
Reporter: TracBot
tracCreated: Nov 06, 2001, 1:44 PM
tracOwner: mark
tracReporter: kentk@e2be6508970570f4
tracResolution: wontfix
tracStatus: closed
tracWeeks: 0.1
Priority: major