Hi

With the exception of the musl translation itself, I think most parts
are doable. My problem at the moment is that I am not a C hero like you
guys and don't know exactly what these locale files should look like
(file format, content). As a consequence, fully answering your questions
is non-trivial at the moment. I can do some research and some of the
dirty work, but first I would need a sample locale file in the musl
format, or some documentation, to get started. I have worked on, and
even created, locale files for glibc and CLDR in the past, so I am at
least not a complete newbie on the topic. Unfortunately I don't have
enough time to act as a maintainer, but I could help out periodically if
someone steps up and takes the lead.

For the translation of musl itself: do you plan to add a *.pot file to
the musl repository?

Regards
Kevin

On Sun, Jul 27, 2014 at 5:27 AM, Rich Felker wrote:
> On Sat, Jul 26, 2014 at 11:27:38PM +0200, Wermut wrote:
>> Hi
>>
>> I don't like the idea of an entirely new tree of locale data written
>> from scratch. Glibc has one (with a lot of unmaintained data) and then
>> there is also the CLDR repository which aims to be the central source
>> for such data, maintained by Unicode. The CLDR data is also used as a
>> basis for the Microsoft and Apple locale files and is often maintained
>> by national language experts. What I could offer is an effort to write
>> some magic code that imports the actual CLDR data and converts the
>> relevant information to the musl formatted ones. The CLDR data is
>> freely available from: http://cldr.unicode.org/index/downloads
>
> I have no objection to using data from CLDR if there's no restrictive
> license, but at first glance it looks like most of the data is outside
> the scope of the C/POSIX locale system. What we need is:

CLDR license (bottom of the page): http://unicode.org/copyright.html
In my eyes this is a BSD-like license. If somebody thinks the license is
not OK, please say so. A copy is attached to this mail.

>
> 1. Weekday and month names (full and abbreviated) - these should
>    almost certainly be available from CLDR or other public sources.
>
> 2. Time format strings for strftime - unless CLDR has C-oriented data
>    like that, these might not be available in a form that's easy to
>    automatically adapt. Research on this topic is welcome.
>
> 3. Regexes for yes and no responses - seems unlikely to be in CLDR,
>    but again I'd be happy for someone to prove me wrong.
>
> 4. Translations of the message strings in libc. Note that musl's
>    strings already deviate some from the legacy strings used on glibc
>    and other systems. For example the strerror strings are adjusted to
>    align more closely with the POSIX description and the actual
>    situations they arise in than the legacy strings (like "Not a
>    typewriter"). I'd like to aim to have our translated strings
>    equally modernized. And before really spending a lot of work on
>    these we should review the English strings again for possible
>    improvements and missing messages (I think some newer error codes
>    may be missing).
>
> 5. Collation rules - these almost certainly can come from Unicode/CLDR
>    but musl does not even support collation yet.
>
> 6. Monetary formatting and currency names - these almost certainly can
>    come from CLDR or other public sources, but again the code to use
>    the data isn't there yet.
>
>> Contribution is not completely open, but normally interested
>> people get access if they want it. I got mine within a week.
>>
>> This is only a suggestion open to discussion. What do you guys think about it?
>
> Overall I like it. But I think we still need a maintainer to manage
> pulling the data, maintaining string translations for messages, etc.
> Any comments on my items 1-6 above?
>
> Rich
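
For reference, items 1 to 3 in the quoted list correspond to data that POSIX exposes through nl_langinfo(), so whatever an importer produces would ultimately have to feed these lookups. Below is a minimal sketch of the consumer side, not musl's implementation and not a description of its locale file format. The helper names (weekday_name, weekday_index, format_datetime, is_yes) are hypothetical and exist only for illustration; the langinfo constants, strftime, regcomp and regexec are standard POSIX.

```c
#include <langinfo.h>
#include <regex.h>
#include <string.h>
#include <time.h>

/* POSIX does not promise that DAY_1..DAY_7 are consecutive values,
 * so the items are listed explicitly. DAY_1 is Sunday. */
static const nl_item day_items[7] = {
    DAY_1, DAY_2, DAY_3, DAY_4, DAY_5, DAY_6, DAY_7
};

/* Hypothetical helper (item 1): full weekday name in the current
 * locale, 0 = Sunday. Returns "" for an out-of-range index. */
const char *weekday_name(int idx)
{
    if (idx < 0 || idx > 6)
        return "";
    return nl_langinfo(day_items[idx]);
}

/* Hypothetical helper (item 1): reverse lookup of a full weekday
 * name. Returns 0..6, or -1 if the name is unknown in this locale. */
int weekday_index(const char *name)
{
    for (int i = 0; i < 7; i++) {
        if (strcmp(name, nl_langinfo(day_items[i])) == 0)
            return i;
    }
    return -1;
}

/* Hypothetical helper (item 2): format a broken-down time with the
 * locale's date-and-time format string (D_T_FMT). Returns the number
 * of bytes written, or 0 if the buffer is too small. */
size_t format_datetime(char *buf, size_t size, const struct tm *tm)
{
    return strftime(buf, size, nl_langinfo(D_T_FMT), tm);
}

/* Hypothetical helper (item 3): match a user response against the
 * locale's YESEXPR, which is an extended regular expression.
 * Returns 1 on match, 0 on no match, -1 if the regex does not compile. */
int is_yes(const char *answer)
{
    regex_t re;
    if (regcomp(&re, nl_langinfo(YESEXPR), REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, answer, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

A caller would normally run setlocale(LC_ALL, "") from <locale.h> first. Without that call the process stays in the "C" locale, where weekday_name(0) returns "Sunday". Abbreviated weekdays (ABDAY_1..ABDAY_7), month names (MON_1..MON_12, ABMON_1..ABMON_12) and NOEXPR follow the same pattern.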