The LDML spec's Collation Settings table says of the kk=normalization attribute: "If off, then all strings that are in [FCD] will sort correctly."
This is not quite true. If a string (FCD or not) contains one of the Tibetan precomposed vowels (U+0F73, U+0F75 or U+0F81), then the precomposed vowel must be decomposed, or the string might not sort correctly. The problem is that any contraction with the second part of the vowel's decomposition needs to skip the first part (discontiguous contraction matching: UCA algorithm steps S2.1.1-S2.1.3). The DUCET itself has such contractions: the precomposed vowels’ decompositions themselves.
Suggestion: Change the normalization attribute spec to say "If off, then all strings that are in [FCD] and do not contain U+0F73, U+0F75 or U+0F81 will sort correctly."
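For reference, the relevant character properties can be inspected with Python's `unicodedata` module. This is only an illustration: `decomposition_cccs` is a helper name made up here, and the values come from whatever Unicode version the Python build ships. Each precomposed vowel has ccc=0 itself but decomposes into two non-starters, which is what allows a contraction on the second part to be discontiguous.

```python
import unicodedata

def decomposition_cccs(ch):
    """Return (code point, ccc) for each character in ch's canonical decomposition."""
    parts = [chr(int(cp, 16)) for cp in unicodedata.decomposition(ch).split()]
    return [(f"{ord(p):04X}", unicodedata.combining(p)) for p in parts]

# The three Tibetan precomposed vowels named above.
for ch in "\u0F73\u0F75\u0F81":
    # Columns: code point, its own ccc, then its decomposition with per-part ccc.
    print(f"{ord(ch):04X}", unicodedata.combining(ch), decomposition_cccs(ch))
```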
We might need to add U+0344 to this list.
Richard Wordingham provided an example today on the Unicode list ("FCD and Collation") where U+0344 COMBINING GREEK DIALYTIKA TONOS, which is canonically equivalent to <0308 0301> (both ccc=230), can also cause incorrect results despite FCD input.
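To check that an input really is FCD, a minimal test can be sketched as follows. It assumes the usual definition (for each adjacent pair of characters, the lead ccc of the second character's NFD form is 0 or not less than the trail ccc of the first character's NFD form). `lead_trail_ccc` and `is_fcd` are names invented for this sketch, not an ICU API.

```python
import unicodedata

def lead_trail_ccc(ch):
    """ccc of the first and last code points of ch's NFD form."""
    nfd = unicodedata.normalize("NFD", ch)
    return unicodedata.combining(nfd[0]), unicodedata.combining(nfd[-1])

def is_fcd(text):
    """True if no adjacent pair of characters would need reordering."""
    prev_trail = 0
    for ch in text:
        lead, trail = lead_trail_ccc(ch)
        if lead != 0 and prev_trail > lead:
            return False
        prev_trail = trail
    return True

# The input string used in the example below: FCD, but not NFD.
s = "\u03B1\u0359\u0344\u0345"
print(is_fcd(s), unicodedata.normalize("NFD", s) == s)
```

With U+0344 placed before U+0359 instead, the trail ccc of 230 precedes a lead ccc of 220, so that string would not be FCD.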
Consider a tailoring with contractions 03B1+0308 and 0301+0345.
Assume a builder that adds further contractions to cover overlaps between contractions and decompositions. It would add 03B1+0344, 03B1+0344+0345 and 0344+0345.
Input string: 03B1 0359 0344 0345 (with U+0359 COMBINING ASTERISK BELOW as an example of any character with ccc<230). Discontiguous-contraction matching processes this as 03B1+0344+0345, 0359
... but when processing the NFD form 03B1 0359 0308 0301 0345 we get 03B1+0308, 0359, 0301+0345. Note the different position of the 0359.
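The two matching results above can be reproduced with a simplified model of UCA steps S2.1 to S2.1.3. This is only a sketch under stated assumptions: `CONTRACTIONS`, `has_mapping`, `match_units` and `fmt` are names made up here, every single code point is assumed to have its own mapping, and none of it is ICU's actual builder or matcher.

```python
import unicodedata

CONTRACTIONS = {
    "\u03B1\u0308", "\u0301\u0345",                          # the tailoring
    "\u03B1\u0344", "\u03B1\u0344\u0345", "\u0344\u0345",    # added by the assumed builder
}

def has_mapping(s):
    # Assumption: every single code point maps; longer strings only if contracted.
    return len(s) == 1 or s in CONTRACTIONS

def match_units(text):
    """Split text into the units that would each be mapped to collation elements."""
    chars = list(text)
    units = []
    while chars:
        # S2.1: longest contiguous prefix that has a mapping.
        n = max(k for k in range(1, len(chars) + 1)
                if has_mapping("".join(chars[:k])))
        s = "".join(chars[:n])
        del chars[:n]
        # S2.1.1-S2.1.3: try to extend s with unblocked non-starters.
        i = 0
        max_skipped = 0  # highest ccc among the non-starters skipped so far
        while i < len(chars):
            ccc = unicodedata.combining(chars[i])
            if ccc == 0:
                break  # a starter blocks everything after it
            if ccc > max_skipped and s + chars[i] in CONTRACTIONS:
                s += chars[i]      # S2.1.3: extend the match
                del chars[i]       # ... and remove the character from the input
            else:
                max_skipped = max(max_skipped, ccc)
                i += 1
        units.append(s)
    return units

def fmt(units):
    return ["+".join(f"{ord(c):04X}" for c in u) for u in units]

fcd_input = "\u03B1\u0359\u0344\u0345"
print(fmt(match_units(fcd_input)))
print(fmt(match_units(unicodedata.normalize("NFD", fcd_input))))
```

In this model the FCD input yields 03B1+0344+0345 followed by 0359, while its NFD form yields 03B1+0308, 0359, 0301+0345, so the 0359 lands in a different position.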
The full set of problematic characters (link to Mark's demo) appears to be `[\u0344\u0F73\u0F75\u0F81]`.
(I am copying this text into as well.)