Future of CLDR JSON distribution?

Description

Currently, there is a separate GitHub organization, https://github.com/unicode-cldr/, with 41 repositories.

Only one of those (`cldr-json`) really has any original content in it (an ant build script, which could be folded into the main cldr repo). The rest have pieces of CLDR JSON data in them.

I think the goal was for users to be able to browse, clone, or download subsets of the data.

Additionally, for context, each of the 40 other repositories is published to `npm`. (They are also installable via `bower`, although bower is now deprecated.) These packages are very popular: https://www.npmjs.com/package/cldr-core, for example, gets 156,000 downloads weekly.

Experimentally, I attached a .zip file (85 MB) to the CLDR beta release: https://github.com/unicode-org/cldr/releases/tag/release-38-beta

Problem Statement

`unicode-cldr` does not have any official Unicode status, and until the other week only had two members. ( and myself)
41 repositories is a lot of repositories. That's a lot of security policies, branches, and pull requests to manage.
Having a separate organization makes Unicode oversight/management more complex. However, merging all 41 into https://github.com/unicode-org would double the number of repositories there and make things more complicated for all other projects.

Proposal

Eliminate all 40 data repositories. Perhaps move them to "archived" status so they are still around, at least for a transition period. But after a while, they would become misleading and could obscure access to current data.
Merge `cldr-json` into the main cldr repo. It's just a build script.

Options for data generation

1. We could put cldr-json data into the cldr-staging repo. It is derived from the staging data, but sometimes has a different release cycle.

2. We could have a separate cldr-json-staging repo, with all of the data in one repo.

3. We could not check the actual data in anywhere, but go straight from generation to zipfile + npm publication. The downside here is that the data couldn't be inspected easily without downloading. However, this is 100% derived data, and so it seems to me that it doesn't need to be handled under change control.

I am leaning towards option 3.
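To make option 3 more concrete, here is a minimal Python sketch of the "generation straight to zipfile" step. It is illustrative only: the existing build is an ant script, and the function name `zip_generated_json` and the directory layout are assumptions, not part of the actual CLDR tooling. It packs whatever the generator wrote under an output directory into a single archive that could be attached to a release, without checking the data in anywhere.

```python
import zipfile
from pathlib import Path


def zip_generated_json(json_root: Path, zip_path: Path) -> list[str]:
    """Pack every file under json_root into zip_path.

    Hypothetical helper: json_root is assumed to be the directory the
    JSON generator wrote to. Returns the archive member names.
    """
    names = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Sort so the archive layout is stable from one run to the next.
        for path in sorted(json_root.rglob("*")):
            # Skip directories, and skip the archive itself if it happens
            # to live inside json_root.
            if path.is_file() and path.resolve() != zip_path.resolve():
                arcname = path.relative_to(json_root).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names
```

A release job could call something like `zip_generated_json(Path("out/json"), Path("cldr-json.zip"))` (both paths hypothetical) and then run the npm publication from the same output directory, so the archive and the packages come from one generation run.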

Linked work items

is duplicated by

CLDR-14317

CLDR 38 is not tagged in the JSON repos

CLDR-13799

Publish single full NPM package with all CLDR JSON data

CLDR-13352

Reorganize JSON data in Github

Activity

Shane Carr
April 22, 2021 at 8:52 PM

The repos in the old org have all been archived. Marking as fixed.

Steven R. Loomis
December 19, 2020 at 11:54 PM

Now publishing https://github.com/unicode-org/cldr-json/releases/tag/38.1.0-BETA4

Steven R. Loomis
December 16, 2020 at 9:52 PM

Can you PR our top-level README to mention cdnjs? Maybe the USERS.md file?

Steven R. Loomis
December 16, 2020 at 9:51 PM

OK everyone, please check out https://github.com/unicode-org/cldr-json/releases/tag/38.1.0-BETA3 - also published to npm as the same.

Matt Cowley
December 16, 2020 at 5:30 PM
(edited)

Hey – for cdnjs, we’d either need the JSON files in an NPM package, or in a tagged git repo.

It looks like https://github.com/unicode-org/cldr-json/ has recently been updated and appears to have all the JSON files in a single tagged repo now, which should work perfectly for us!

Fixed

Details

Priority

medium

Assignee

Steven R. Loomis

Reporter

Steven R. Loomis

Reviewer

Shane Carr

Fix versions

Components

Labels

Infrastructure

Time tracking

17m logged

Created October 26, 2020 at 4:46 PM

Updated October 11, 2022 at 8:23 PM

Resolved April 22, 2021 at 8:53 PM