toLanguageTag() is defined as following the BCP47 spec. As far as I can tell, BCP47 does not specify everything that the Unicode LDML specification at
http://www.unicode.org/reports/tr35/ does. However, some things specified in the latter are compatible with BCP47, and I think they should be implemented when canonicalizing language tags:
1) Sorting of variants: "en-scouse-fonipa" -> "en-fonipa-scouse"
2) Dropping of "true" in u extensions:
"und-u-foo-bar-nu-thai-ca-buddhist-kk-true" -> "und-u-bar-foo-ca-buddhist-kk-nu-thai"
Both changes are required for toLanguageTag() to conform to the LDML specification on this point.
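The two canonicalization steps above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `TagCanonSketch` and `canonicalize` are hypothetical names, not ICU or JDK APIs, the input is assumed to be a well-formed, lowercase tag, and only the `u` extension is handled (any other extension or private-use subtags are passed through untouched).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of ICU or the JDK.
final class TagCanonSketch {
    static String canonicalize(String tag) {
        String[] p = tag.split("-");
        List<String> out = new ArrayList<>();
        int i = 0;
        out.add(p[i++]); // language subtag
        // Optional script (4 letters) and region (2 letters or 3 digits) keep their position.
        if (i < p.length && p[i].matches("[A-Za-z]{4}")) out.add(p[i++]);
        if (i < p.length && p[i].matches("[A-Za-z]{2}|[0-9]{3}")) out.add(p[i++]);

        // 1) Variants: every subtag up to the first singleton, sorted alphabetically.
        List<String> variants = new ArrayList<>();
        while (i < p.length && p[i].length() > 1) variants.add(p[i++]);
        Collections.sort(variants);
        out.addAll(variants);

        if (i < p.length && p[i].equals("u")) {
            out.add(p[i++]);
            // Attributes (longer than 2 characters) precede the first 2-character key.
            List<String> attributes = new ArrayList<>();
            while (i < p.length && p[i].length() > 2) attributes.add(p[i++]);
            Collections.sort(attributes);
            out.addAll(attributes);

            // Keywords: a TreeMap keeps the keys in sorted order.
            Map<String, List<String>> keywords = new TreeMap<>();
            while (i < p.length && p[i].length() == 2) {
                List<String> type = new ArrayList<>();
                keywords.put(p[i++], type);
                while (i < p.length && p[i].length() > 2) type.add(p[i++]);
            }
            for (Map.Entry<String, List<String>> kw : keywords.entrySet()) {
                out.add(kw.getKey());
                // 2) A type of exactly "true" is dropped; any other type is kept.
                if (!kw.getValue().equals(Collections.singletonList("true"))) {
                    out.addAll(kw.getValue());
                }
            }
        }
        // Anything left (other extensions, private use) is passed through unchanged.
        while (i < p.length) out.add(p[i++]);
        return String.join("-", out);
    }
}
```

Calling `TagCanonSketch.canonicalize` on the two inputs from the examples above should reproduce the sorted forms. A real implementation would also need case normalization and full extension handling.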
Other differences between Unicode LDML Locale Identifiers and the specifically-BCP47 implementation of forLanguageTag/toLanguageTag are listed below. Some might be "working as intended"; others might require either a specification fix or improvements to code behaviour:
toLanguageTag can return "root", which is not a valid language subtag. (It's a special case, which the LDML spec replaces with "und" when producing canonical Unicode BCP47 Locale Identifiers.)
forLanguageTag doesn't support underscores.
forLanguageTag accepts the zero-length string as valid (which does not seem to be a valid language tag) and produces "und" for it.
forLanguageTag does not permit "en-a", "en-z", or "en-x". The LDML spec rejects "en-t" and "en-u" (these extensions cannot be empty), but it allows empty extensions for the other 24 singletons.
Handling of deprecated items (details below).
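To make the "root", underscore, and empty-string points concrete, here is a minimal sketch of a pre-processing step that a caller could apply before handing a string to forLanguageTag. The class and method names (`LocaleIdSketch`, `toBcp47Syntax`) are hypothetical, and rejecting the empty string is an assumption about the desired behaviour, not something the current implementation does.

```java
// Hypothetical helper for illustration only; not part of ICU or the JDK.
final class LocaleIdSketch {
    static String toBcp47Syntax(String localeId) {
        // Assumption: the zero-length string should be rejected rather than mapped to "und".
        if (localeId.isEmpty()) {
            throw new IllegalArgumentException("empty string is not a language tag");
        }
        // Unicode locale identifiers may use '_' as a separator; BCP47 only allows '-'.
        String tag = localeId.replace('_', '-');
        // "root" is not a valid language subtag; use "und" in its place.
        String[] p = tag.split("-", 2);
        if (p[0].equalsIgnoreCase("root")) {
            tag = (p.length == 1) ? "und" : "und-" + p[1];
        }
        return tag;
    }
}
```

With this sketch, `toBcp47Syntax("en_US")` yields "en-US" and `toBcp47Syntax("root")` yields "und". Whether such normalization belongs in forLanguageTag itself or in a separate entry point is exactly the open question raised above.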
For deprecated items:
The LDML spec speaks of <deprecatedItems> at http://www.unicode.org/reports/tr35/#Deprecated_Structure
There is no <deprecatedItems> tag in https://www.unicode.org/repos/cldr/tags/latest/common/supplemental/supplementalMetadata.xml
Perhaps in place of deprecatedItems, the spec speaks of languageAlias and territoryAlias at http://www.unicode.org/reports/tr35/#Language_Tag_to_Locale_Identifier
The spec gives some script and variant deprecations in tables. These deprecations are in supplementalMetadata.xml as scriptAlias and variantAlias; perhaps that is worth mentioning in the spec? (And how about subdivisionAlias and zoneAlias?)
Except for POSIX, the variant deprecations aren't implemented.
The spec suggests languageAlias should be able to influence more than just language subtags (e.g. "mo" -> "ro-MD"), but the implementation replaces only language subtags ("mo" -> "ro").
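The difference can be illustrated with a small sketch of an alias replacement that may contribute a region as well as a language. Everything here is an assumption for illustration: `LanguageAliasSketch` and `applyLanguageAlias` are hypothetical names, the alias table holds only the "mo" -> "ro-MD" entry quoted above rather than real CLDR data, and the rule that a region already present in the tag wins over the region in the replacement is my reading of the spec.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of ICU or the JDK.
final class LanguageAliasSketch {
    // Single illustrative entry taken from the example above, not loaded from CLDR.
    static final Map<String, String> LANGUAGE_ALIAS = Collections.singletonMap("mo", "ro-MD");

    static String applyLanguageAlias(String tag) {
        String[] p = tag.split("-");
        String replacement = LANGUAGE_ALIAS.get(p[0]);
        if (replacement == null) return tag; // no alias for this language

        // Assumption: the replacement has the form "lang" or "lang-REGION".
        String[] r = replacement.split("-");
        List<String> out = new ArrayList<>();
        out.add(r[0]); // replacement language
        int i = 1;
        // Keep the tag's own script, if any.
        if (i < p.length && p[i].matches("[A-Za-z]{4}")) out.add(p[i++]);
        if (i < p.length && p[i].matches("[A-Za-z]{2}|[0-9]{3}")) {
            out.add(p[i++]);   // the tag's own region takes precedence
        } else if (r.length > 1) {
            out.add(r[1]);     // otherwise take the region from the alias
        }
        // Variants, extensions, and private use are copied unchanged.
        while (i < p.length) out.add(p[i++]);
        return String.join("-", out);
    }
}
```

Under these assumptions, "mo" becomes "ro-MD", while "mo-RO" becomes "ro-RO" because the explicit region is preserved. The current implementation corresponds to always ignoring everything after the language part of the replacement.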