mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [PATCH] split __libc_start_main.c into two files (Wasm)
Date: Tue, 19 Dec 2017 10:56:35 -0500
Message-ID: <20171219155635.GS1627@brightrain.aerifal.cx>
In-Reply-To: <VI1PR0502MB38850DDE3B01E1AFCD5F8E9EE70F0@VI1PR0502MB3885.eurprd05.prod.outlook.com>

On Tue, Dec 19, 2017 at 11:04:33AM +0000, Nicholas Wilson wrote:
> On 19 December 2017 01:08, Rich Felker wrote:
> > I still don't see any good reason to call __init_libc instead of
> > __libc_start_main. While the crt1 entry point (_start) itself is not a
> > a normal C function but something specific to the ELF entry
> > conventions, __libc_start_main is a perfectly good C function that is
> > not "ELF specific".
> 
> There are two ELF-specific bits to __libc_start_main:
> 
> 1) The first thing is that it calls main and exit! Wasm users don't
> want to call exit. (This is different from the discussion we had
> before as to whether it's OK to *link in* exit when it's never
> called. This is about whether it's actually OK to call exit.) exit
> destroys C++ global variables, and Wasm users don't want that. We
> need to be able to call into the Wasm module repeatedly from a
> JavaScript web page.

This is not ELF-specific, and it's not different from the discussion
we had before. You seem to be under the impression that exit "gets
called" because __libc_start_main is called. This is not true. exit is
only called if main returns. (In fact, if you have LTO, the linker
will even optimize out exit like you wanted if main does not return.)

As for why it's not ELF-specific, returning from main producing
behavior as if exit were called is part of the C language.
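
To make the control flow concrete, here is a minimal sketch of the last
step of a start routine. It is not musl's code: sketch_start_main,
sample_main and record_status are hypothetical names, and the exit step
is passed in as a function pointer only so the sketch can be exercised
without ending the process.

```c
#include <stdlib.h>

typedef int main_fn(int, char **, char **);
typedef void exit_fn(int);

/* Hypothetical stand-in for the tail of a start routine.
 * on_return is reached only after m has returned; if m never
 * returns, on_return is never called. */
static void sketch_start_main(main_fn *m, exit_fn *on_return,
                              int argc, char **argv, char **envp)
{
    on_return(m(argc, argv, envp));
}

/* Test double for exit: records the status instead of terminating. */
static int recorded_status = -1;
static void record_status(int status) { recorded_status = status; }

/* Hypothetical application main that does return. */
static int sample_main(int argc, char **argv, char **envp)
{
    (void)argv;
    (void)envp;
    return argc + 41;
}
```

Passing exit itself as on_return gives the usual hosted behavior, where
returning from main acts as if exit had been called with main's return
value. A main that hands control to the host and never returns would
leave that call unreached.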

> Maybe think of a Wasm (statically-linked) binary as being a bit more
> like a shared library: instead of exporting a single "entrypoint"
> for the kernel to run (like a Unix process), it exports a list of
> functions for the JavaScript environment to run repeatedly. We're
> not using the language of "shared library" though, because we're
> using that terminology to mean Wasm libraries calling each other,
> and we don't have all the tooling in place for that in web browsers
> yet.

Are you saying you don't want main to get called at all?

> 2) This second point is just for your interest. It doesn't affect
> the rest of Musl, since Wasm can provide its own crt/wasm/crt1.c.
> (See:
> https://github.com/NWilson/musl/blob/musl-wasm-native/crt/wasm/crt1.c)
> 
> Another difference is that __libc_start_init uses the
> __init_array_start/end symbols, which come from special
> compiler-provided crtbegin.o/crtend.o objects on ELF platforms. The
> linker has to arrange for a table of function pointers for Musl to
> call, and it does that using the magic .init_array ELF section (plus
> some priority fiddling).

These are provided by the linker itself, not by crtbegin/crtend. (The
latter are legacy cruft used by the compiler for implementing the Java
runtime and an old, deprecated way of invoking C++ ctors. They're not
needed by musl or modern toolchains.)

> In Wasm, most functions are not callable by pointer: a function
> pointer is only assigned to code that has its address taken. For
> security, a decision was taken *not* to use a table of function
> pointers for the startup code, since that would mean taking the
> address of those functions, and they otherwise would be unlikely to
> need to go in the function pointer table. It's a hardening measure,
> to reduce the number of functions that can be used as the target of
> an indirect call.
> 
> Hence, the linker synthesises the init-function as a list of
> call-instructions, rather than synthesising a list of function
> pointers for a pre-written function to iterate over.
> 
> This decision wasn't mine, but I'm sympathetic to it. We did discuss
> whether it would be unhelpful to diverge from ELF like this, but the
> consensus was that keeping the table of function pointers small was
> more worthwhile.

I'm not sure it's a good decision, but it doesn't mean you have to
replace __libc_start_init. Note that __libc_start_init also calls
_init() and this is the symbol that's expected to call into the
legacy "string of init function calls" mechanism you're using. The
__init_array_{start,end} symbols can just be defined both as null, or
as aliases for the same dummy object, and the init array loop becomes
a no-op. (And with LTO it would even be removed completely.)
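
As a rough illustration of that shape (a sketch with hypothetical names,
not the musl source): an init hook made of direct calls runs first, then
a loop walks the range between two bounds. When both bounds are null, or
both point at the same dummy object, the range is empty and the loop
body never runs.

```c
#include <stddef.h>

typedef void init_fn(void);

static int direct_calls; /* constructors reached through sketch_init */
static int table_calls;  /* constructors reached through the array loop */

static void ctor_a(void) { direct_calls++; }
static void ctor_b(void) { direct_calls++; }
static void table_ctor(void) { table_calls++; }

/* Stand-in for an _init built as a plain sequence of calls, so no
 * constructor needs its address taken. */
static void sketch_init(void)
{
    ctor_a();
    ctor_b();
}

/* Stand-in for the start-init step: call the init hook, then walk
 * [start, end). Returns how many range entries were called. */
static size_t sketch_start_init(init_fn *const *start, init_fn *const *end)
{
    size_t n = 0;
    sketch_init();
    for (init_fn *const *p = start; p != end; p++, n++)
        (*p)();
    return n;
}

/* A single object that both bounds can refer to for an empty range. */
static init_fn *const dummy_entry[1] = { table_ctor };
```

Calling sketch_start_init(dummy_entry, dummy_entry) or
sketch_start_init(NULL, NULL) runs only the direct calls; passing
dummy_entry and dummy_entry + 1 also runs the one table entry.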

> Note - I realise that we could override the ELF-y __libc_start_init
> since it's weak.

This is the intended usage (for musl internals, not as an interface
boundary between musl proper and arch code) and how the dynamic linker
works. But as stated above I don't think there's any reason to
override it here.

> However, given that we need to call __init_libc
> directly anyway, we may as well save some code and just register
> __init_libc with the Wasm start mechanism.

From my perspective, doing things in gratuitously arch-dependent ways
to "save some code" doesn't make sense when you're trading a small
(trivial relative size) amount of code for a permanent interface
boundary and maintenance burden.

> > It does require a pointer to the args/environment
> > formatted as an array of:
> > ...
> > but __init_libc and other code in musl also requires such an array to
> > be present.
> 
> Yes, I've got an array of argv/envp for keeping __init_libc happy.
> That's OK, that's an ELF convention that we're quite happy to follow
> to reduce friction with Musl. In general, where possible I don't
> have a problem with emulating ELF.
> 
> Thanks for your patience and for asking questions. I hope the
> answers help!

No problem. Hope this continues to be helpful.

Rich


  parent reply	other threads:[~2017-12-19 15:56 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-07 14:51 Nicholas Wilson
2017-12-07 17:03 ` Rich Felker
2017-12-15  4:19   ` Rich Felker
2017-12-15 11:34     ` Nicholas Wilson
2017-12-15 12:33       ` Szabolcs Nagy
2017-12-15 13:04         ` Nicholas Wilson
2017-12-15 17:23           ` Rich Felker
2017-12-15 17:43             ` Nicholas Wilson
2017-12-15 17:56               ` Rich Felker
2017-12-16 13:21                 ` Nicholas Wilson
2017-12-19  1:08                   ` Rich Felker
2017-12-19 11:04                     ` Nicholas Wilson
2017-12-19 15:27                       ` Szabolcs Nagy
2017-12-19 15:56                       ` Rich Felker [this message]
2017-12-19 17:46                         ` Nicholas Wilson
2017-12-19 17:54                           ` Alexander Monakov
2017-12-19 18:03                             ` Nicholas Wilson
2017-12-19 21:03                           ` Rich Felker
