mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [musl] Locale support considered harmful noise
Date: Tue, 18 Feb 2020 22:36:04 -0500
Message-ID: <20200219033604.GZ1663@brightrain.aerifal.cx>
In-Reply-To: <alpine.DEB.2.00.2002181937050.23857@ny4.eemta.org>

On Tue, Feb 18, 2020 at 07:38:29PM +0000, Jacob Welsh wrote:
> Hello,
> 
> In TMSR we've made extensive use of musl, due to the very welcome
> dose of clear and concise code it provides as compared to the
> competition [1]. For example we have a static Ada compiler [2], the
> Bitcoin reference implementation [3], a reproducible and
> self-contained Gentoo system [4], and not least of all my own
> distribution [5] used in my consulting business [6].
> 
> However, the apparent goal of aggressive expansion of Unicode and
> localization "features" in musl sets off alarms; for instance, on
> the roadmap [7] I see:

I think you're rather under-informed on this topic. Basically none of
the following add any complexity:

> >Unicode 12.1 update and related character handling work

This was (1) an update of existing tables and (2) the removal of
hand-written case mapping code that made lots of fragile assumptions,
had to be updated by hand with every addition of new case mappings,
and got slower with each addition. It was replaced with a table-based
approach I'd designed a year or so ago that's more like the rest of
the character tables and admits automatic generation.
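
To illustrate the general idea of a table-based case mapping, here is
a minimal sketch. It is not musl's actual table format or code: the
struct layout and the names case_range, to_upper_table and
sketch_toupper are hypothetical, and only a handful of real Unicode
ranges are listed. The point is that a lookup over a sorted table
costs a binary search no matter how many mappings exist, and the rows
are simple enough to generate automatically.

```c
#include <stddef.h>

/* Hypothetical table entry: a run of `len` consecutive code points
 * starting at `first`, each mapped to upper case by adding `delta`.
 * This layout is an illustration, not musl's real table format. */
struct case_range {
	unsigned first;
	unsigned len;
	int delta;
};

/* Must stay sorted by `first`. Only a few real ranges are shown. */
static const struct case_range to_upper_table[] = {
	{ 0x0061, 26, -32 },	/* Latin a..z */
	{ 0x00E0, 23, -32 },	/* U+00E0..U+00F6 */
	{ 0x00F8,  7, -32 },	/* U+00F8..U+00FE */
	{ 0x03B1, 17, -32 },	/* Greek alpha..rho */
	{ 0x0430, 32, -32 },	/* Cyrillic U+0430..U+044F */
};

/* Binary search for the range containing c; code points that are
 * not covered by any range are returned unchanged. */
unsigned sketch_toupper(unsigned c)
{
	size_t lo = 0;
	size_t hi = sizeof to_upper_table / sizeof to_upper_table[0];

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct case_range *r = &to_upper_table[mid];

		if (c < r->first)
			hi = mid;
		else if (c - r->first >= r->len)
			lo = mid + 1;
		else
			return (unsigned)((int)c + r->delta);
	}
	return c;
}
```

With this table, sketch_toupper(0x00E9) returns 0x00C9, and a code
point outside every listed range, such as U+00F7, comes back
unchanged. Supporting a new mapping means adding a row, not writing
new branching code.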

> >Locale support overhaul.

This is not adding anything new but fixing bugs where the code that's
already there doesn't work as intended.

> >Hostname resolver support for non-ASCII domains (IDN)
> 
> >LC_COLLATE support for collation orders other than simple codepoint order

These have been serious missing functionality since the beginning.
There is no change here. If you missed them being on the roadmap for
the past 6+ years, you weren't looking very closely.

> >Support for LC_MONETARY and LC_NUMERIC properties.

This is the only item that's controversial, but you don't seem to be
coming from a good position to have input on it.

> >Message translation support for dynamic linker

This has also been on the agenda for a long time. The dynamic linker
is the only place in musl where format strings containing
natural-language text are used. Format strings are not candidates for
translation because that would be unsafe (translation data could
replace format specifiers with incompatible ones), so the dynamic
linker's messages cannot currently be translated. That makes it
inconsistent with the rest of musl, which does have message
translation support.
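
One way to picture the safe alternative is to keep the format string
fixed in the code and translate only argument-free message text. The
following is a minimal sketch under that assumption. It is not the
dynamic linker's actual code: the catalog array and the functions
sketch_translate and sketch_report are hypothetical stand-ins for a
real message catalog lookup.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory catalog standing in for translation data. */
struct catalog_entry {
	const char *msgid;
	const char *msgstr;
};

static const struct catalog_entry catalog[] = {
	{ "Library not found", "Bibliothek nicht gefunden" },
};

/* Return the translation of msgid, or msgid itself if none exists. */
const char *sketch_translate(const char *msgid)
{
	size_t i;

	for (i = 0; i < sizeof catalog / sizeof catalog[0]; i++)
		if (!strcmp(catalog[i].msgid, msgid))
			return catalog[i].msgstr;
	return msgid;
}

/* The format string is a constant in the code. Translated text is
 * only ever passed as a %s argument, so nothing in the catalog can
 * change which conversions snprintf performs. */
int sketch_report(char *buf, size_t n, const char *name, const char *msgid)
{
	return snprintf(buf, n, "%s: %s", name, sketch_translate(msgid));
}
```

Because the catalog text is treated purely as data, even a message
that happens to contain "%n" or "%d" is copied through literally
rather than being interpreted as a conversion.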

> >Locale data and libc message translations

This is purely a matter of creating data to be used with functionality
that already exists.

> We think this is such a bad idea that it threatens to undermine
> musl's otherwise substantial virtues. This kind of bloat imposes
> real costs on the users that matter - namely the literate ones, who
> value predictable, stable and bug-free code - in exchange for
> entirely unclear benefits.

If you think the above imply bloat, musl must already be bloated.

You should probably be aware that first-class support for all
characters in Unicode (vs glibc's bloated gconv-plugin layer for UTF-8
which originally made GNU grep over 100x slower than in 8-bit codepage
locales) was _THE_ original motivation for what became musl. None of
this is new. Not treating users like they're "illiterate" if they want
to be able to write their own name has always been the most important
core value of the project, and your attitude towards the matter here
does not make me interested in going out of my way to cater to you. I
suspect others in this community feel similarly.

> Especially considering the rate at which bugs are still turning up,
> there is no justification for this added complexity. In any event we
> will not be using "upgrades" that import additional nonsense into
> this critical system component.

If you want to stick with old versions and maintain them yourself or
pay someone else to do so, that's your choice.

Rich

