mailing list of musl libc
From: Rich Felker <dalias@aerifal.cx>
To: musl@lists.openwall.com
Cc: nsz@port70.net
Subject: Re: iconv Korean and Traditional Chinese research so far
Date: Mon, 5 Aug 2013 08:54:57 -0400
Message-ID: <20130805125456.GG221@brightrain.aerifal.cx>
In-Reply-To: <20130805090343.6d2f9f00@ralda.gmx.de>

On Mon, Aug 05, 2013 at 09:03:43AM +0200, Harald Becker wrote:
> The only code that gets a bit larger is the file system search.
> This depends on whether we try only a single location or walk
> through a search path list. But this is the cost of the flexibility
> to dynamically load character set conversions (which I would really
> prefer for seldom used char sets).

The only "seldom used char sets" are either extremely small (8-bit
codepages) or simply encoding variants of an existing CJK DBCS (in
which case it's just a matter of code, not large data tables, to
support them).
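
As an illustration of "a matter of code, not large data tables", here
is a minimal sketch using one well-known pair of variants: Shift_JIS
and EUC-JP both index the same JIS X 0208 grid, so a two-byte character
can be re-encoded by arithmetic alone. This is not taken from musl; the
function name sjis_to_eucjp is hypothetical, and the sketch checks only
byte ranges, not whether the grid position is actually assigned.

```c
#include <stdbool.h>

/* Re-encode one two-byte Shift_JIS character as EUC-JP using only
 * arithmetic on the shared JIS X 0208 row/cell grid (94 x 94).
 * Returns false if the bytes are outside the two-byte Shift_JIS ranges.
 * Hypothetical helper for illustration; single-byte characters
 * (ASCII, half-width katakana) are not handled here. */
bool sjis_to_eucjp(unsigned char s1, unsigned char s2, unsigned char out[2])
{
    bool lead_ok  = (s1 >= 0x81 && s1 <= 0x9F) || (s1 >= 0xE0 && s1 <= 0xEF);
    bool trail_ok = s2 >= 0x40 && s2 <= 0xFC && s2 != 0x7F;
    if (!lead_ok || !trail_ok)
        return false;

    /* Each lead byte covers two consecutive rows; close the 0xA0..0xDF gap. */
    unsigned pair = s1 < 0xA0 ? s1 - 0x81u : s1 - 0xC1u;   /* 0..46 */

    /* Trail bytes run 0x40..0xFC with a hole at 0x7F. */
    unsigned pos = s2 - 0x40u;
    if (s2 > 0x7F)
        pos--;                                             /* 0..187 */

    unsigned row  = 2 * pair + (pos >= 94 ? 1u : 0u);      /* 0..93 */
    unsigned cell = pos % 94;                              /* 0..93 */

    /* EUC-JP stores row and cell directly, each offset by 0xA1. */
    out[0] = (unsigned char)(0xA1 + row);
    out[1] = (unsigned char)(0xA1 + cell);
    return true;
}
```

Passing the Shift_JIS bytes 0x82 0xA0 (HIRAGANA LETTER A) should yield
the EUC-JP bytes 0xA4 0xA2; no per-character table is consulted at any
point.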

> ... and for the application writer it is only more work if he wants
> to add some charset tables to his program that are not in the
> static libc.

This is only helpful if the application writer is designing around
musl. This is a practice we explicitly discourage.

> The problem is, all tables in libc need to be linked into your
> program if you include iconv. So each added charset conversion
> increases the size of your program ... and I definitely won't include
> Japanese, Chinese or Korean charsets in my program. Not that I
> ignore those people's needs; I just won't need it, so I don't like
> to add those conversions to programs sitting on my disk.

How many programs do you intend to use iconv in that _don't_ need to
support arbitrary encodings including ones you might not be using
yourself? Even if you don't read Korean, if a Korean user sends you an
email containing non-ASCII punctuation, Greek letters like epsilon,
etc. there's a fair chance their MUA will choose to encode with a
legacy Korean encoding rather than UTF-8, and then you need the
conversion.
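
A minimal sketch of what that conversion looks like from the
application side, using the standard POSIX iconv interface. It assumes
the iconv implementation in use recognizes the charset name "EUC-KR"
(names and coverage vary between implementations); the helper name
to_utf8 is hypothetical.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Convert `inlen` bytes of `in`, encoded as `from_charset`, to UTF-8.
 * Writes a NUL-terminated result into `out` (capacity `outcap`).
 * Returns the number of UTF-8 bytes written (excluding the NUL), or
 * (size_t)-1 on failure: unknown charset, invalid or truncated input,
 * or an output buffer that is too small.
 * Hypothetical helper for illustration only. */
size_t to_utf8(const char *from_charset, const char *in, size_t inlen,
               char *out, size_t outcap)
{
    if (outcap == 0)
        return (size_t)-1;

    iconv_t cd = iconv_open("UTF-8", from_charset);
    if (cd == (iconv_t)-1)
        return (size_t)-1;          /* this iconv lacks the charset */

    char *inp = (char *)in;         /* iconv() takes char **, not const */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap - 1;    /* reserve room for the NUL */

    size_t r = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (r != (size_t)-1)
        r = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush any state */
    iconv_close(cd);

    if (r == (size_t)-1)
        return (size_t)-1;

    *outp = '\0';
    return (size_t)(outp - out);
}
```

For example, a Greek small epsilon sent as the EUC-KR bytes 0xA5 0xE5
should come back as the two UTF-8 bytes 0xCE 0xB5. If the charset is
missing from the iconv in use, iconv_open fails and the recipient has
no way to read the text, which is the situation described above.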

It would be nice if everybody encoded everything in UTF-8 so the
recipient was not responsible for supporting a wide range of legacy
encodings, but that's not the reality today.

> If the definition of the iconv virtual state machine is modified,
> you need to take extra care on update (delete old charset files,
> install new lib, install new charset files, restart system) ...
> but this is only required on a major update. As soon as the

Even if there were really good reasons for the design you're
proposing, such a violation of the stability and atomic upgrade policy
would require a strong overriding justification. We don't have that
here.

Rich

