mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: Build option to disable locale [was: Byte-based C locale, draft 1]
Date: Tue, 9 Jun 2015 00:27:30 -0400
Message-ID: <20150609042730.GH17573@brightrain.aerifal.cx>
In-Reply-To: <20150609032025.GA1605@localhost>

On Mon, Jun 08, 2015 at 08:20:26PM -0700, Isaac Dunham wrote:
> On Mon, Jun 08, 2015 at 04:46:42AM +0200, Harald Becker wrote:
> > On 08.06.2015 02:33, Rich Felker wrote:
> > >So aside from iconv, the above seem to total around 19k, and at least
> > >6k of that is mandatory if you want to be able to claim to support
> > >UTF-8. So the topic at hand seems to be whether you can save <13k of
> > >libc.so size by hacking out character handling/locale related features
> > >that are non-essential to basic UTF-8 support...
> > 
> > I would like to get a stripped-down version that eliminates all the
> > character set handling code that is unnecessary on dedicated systems, but
> > stripping it out by hand on every release is too much work.
> > 
> > The benefit may be for:
> > 
> > - embedded systems
> > - small initramfs based systems
> > - container systems
> > - minimal chroot environments
> 
> Somehow it sounds like you may not have gotten what Rich was asking.
> 
> IIRC, the goals of musl include full native support for UTF-8; keeping 
> the time complexity to a minimum; and clean, correct code.
> 
> Dropping out 'legacy' charsets doesn't really sacrifice those goals.
> But the other changes have a much bigger impact on them.
> So you're probably going to have to convince Rich that there *is* a
> major benefit ('is' != 'could be').
> 
> For container systems or minimal chroot environments, you're dealing
> with something that doesn't have a hard size limit, and if a chroot
> or container runs ~6 MB ordinarily, you might be able to run 0.3% more
> on the same hardware. That's probably not enough of a case.
> For initramfs-based systems, you've got a similar situation but no
> chance to multiply the effect, unless you're using a VM or hypervisor.
> 
> Now, since embedded systems have hard limits on size, you might be
> able to make a case there. But you will need to come up with something
> more specific, such as "I have a system where I could upgrade the kernel
> to 2.6.xx *if* musl were ~20k smaller than building with a minimal
> iconv" or "If we did this, there would be enough space to switch XYZ
> router firmware from telnetd to dropbear".

Yes, this is roughly what I was saying. Thank you for expressing it
better than I could.

And along those lines, if you really need to minimize libc.so for such
a special case, the solution is not manually maintaining extra knobs
and #ifdefs, but changing the way libc.so is generated. Instead of
linking all the object files directly, put them in a .a file first,
then link with something like:

$CC -shared -o libc.so -Wl,-u,sym1 -Wl,-u,sym2 ... libc_so.a

where the list sym1, sym2, ... is generated from 'nm' output for all
the binaries you need to run, plus a few mandatory libc-internal
symbols that need to be linked. This will produce the minimal libc.so
needed for your exact set of programs.
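
As a rough sketch of how the sym1, sym2, ... list might be generated:
this assumes GNU nm and dynamically linked binaries. The helper names
undef_syms and u_flags and the /path/to/rootfs paths are made up for
illustration. The mandatory libc-internal symbols are not covered here
and would still have to be appended by hand.

```shell
# Hypothetical helpers, not part of musl's build system.

# Read `nm -D --undefined-only` output on stdin and print each undefined
# symbol once, one per line. Lines that are not "U name" (file headers,
# blank lines, weak references) are skipped; any "@VERSION" suffix is
# dropped.
undef_syms() {
    awk '$1 == "U" { sub(/@.*/, "", $2); print $2 }' | sort -u
}

# Turn a list of symbols (one per line) into -Wl,-u,SYM linker flags.
u_flags() {
    sed 's/^/-Wl,-u,/' | tr '\n' ' '
}

# Example use (paths are placeholders):
#   flags=$(nm -D --undefined-only /path/to/rootfs/bin/* | undef_syms | u_flags)
#   $CC -shared -o libc.so $flags libc_so.a
```

Only strong undefined references ("U") are kept in this sketch; whether
weak references should also be forced in depends on the programs.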

In the specific case of UTF-8 and locale-related code, I believe that
if none of your programs call setlocale or use any of the wchar
functions, regex/fnmatch/glob, or iconv explicitly, the only code that
we discussed that would get linked into libc.so is mbtowc.c and
wcrtomb.c, for a total of about 550 bytes. Even these would be omitted
if you don't use printf or scanf (printf needs wcrtomb; scanf needs
mbtowc). Using fnmatch/glob/regex would pull in another ~9k for the
character class and case mapping functions.
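
To check whether a given set of programs falls into this case, the
undefined-symbol list could be filtered for the interfaces named above.
This is only a sketch: locale_syms is a made-up helper name, and the
pattern is a rough approximation (the prefix match for wide-character
and multibyte names is neither complete nor exact).

```shell
# Hypothetical helper: read symbol names (one per line) on stdin and print
# those that look like setlocale, iconv, regex/fnmatch/glob, or
# multibyte/wide-character functions. The prefix match is approximate.
locale_syms() {
    grep -E '^(setlocale|iconv(_open|_close)?|reg(comp|exec|free|error)|fnmatch|glob(free)?|(wc|mb|isw|tow)[a-z0-9_]*)$' || true
}

# Example use (path is a placeholder):
#   nm -D --undefined-only /path/to/rootfs/bin/* \
#     | awk '$1 == "U" { print $2 }' | sort -u | locale_syms
```

An empty result would correspond to the case described above, where
only what printf and scanf need gets linked.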

Rich

