From: Paul Schutte
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: static linking and dlopen
Date: Sun, 9 Dec 2012 01:29:45 +0200
To: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
In-Reply-To: <20121208225237.GV20323@brightrain.aerifal.cx>

Thanks for the comprehensive answer, this is the answer I was looking for.
There is currently no situation that requires me to do this. I was just thinking about it and decided to ask the experts.

On Sun, Dec 9, 2012 at 12:52 AM, Rich Felker <dalias@aerifal.cx> wrote:
On Sat, Dec 08, 2012 at 08:09:35PM +0200, Paul Schutte wrote:
> Hi,
>
> I have a strong preference towards static linking these days because the
> running program uses so much less memory.
>
> When I link a binary statically and that binary then uses dlopen, would that
> work 100%?

Presently, it does not work at all. At best, it loses all the
advantages of static linking.

> What would happen if the shared object that was dlopened wants to call
> functions in other shared libraries?

Dependencies of any loaded library also get loaded.

> I understand that when using dynamic linking those libraries would just get
> loaded, but I am not sure what would happen with static linking.

With static linking, they would have to be loaded too. This means a
static-linked program using dlopen would have to contain the entire
dynamic linker logic. What's worse, it would also have to contain at
least the entire libc, and if you were using static versions of any
other library in the main program, and a loaded module referenced a
dynamic version of the same library, you'd probably run into
unpredictable crashing when the versions do not match exactly.
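
For readers who have not used the interface, the following is a minimal sketch of the call sequence being discussed, using the standard POSIX dlopen/dlsym/dlclose functions. The helper name call_plugin and the assumption that the module exports an entry point of type int (*)(void) are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load the module at `path`, look up `symbol`,
 * call it as int (*)(void) and return its result.
 * Returns -1 if loading or lookup fails (a module that itself returns
 * -1 is indistinguishable from failure in this sketch). */
int call_plugin(const char *path, const char *symbol)
{
    assert(path != NULL && symbol != NULL);

    /* RTLD_NOW asks for all of the module's undefined symbols to be
     * resolved immediately; this is the point at which the module's
     * own dependencies have to be found and loaded as well. */
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        const char *msg = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", msg ? msg : "(no message)");
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    void *sym = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL || sym == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "(symbol is NULL)");
        dlclose(handle);
        return -1;
    }

    /* POSIX requires that the object pointer returned by dlsym can be
     * converted to a function pointer. */
    int (*entry)(void) = (int (*)(void))sym;
    int rc = entry();

    dlclose(handle);
    return rc;
}
```

Every step here (locating the file, mapping it, resolving its symbols and those of its dependencies) is work the dynamic linker normally does, which is why a static binary offering dlopen would need to carry that logic itself. On older glibc versions this needs -ldl at link time.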

The source of all these problems is basically the same as the benefit
of static linking: the fact that the linker resolves, statically at
link time, which object files are needed to satisfy the needs of the
program. With dlopen, however, there is no static answer; *any* object
is potentially-needed, not directly by the main program, but possibly
by loaded modules. Consider what happens now if you only link part of
libc into the main program statically: additional modules loaded at
runtime won't necessarily have all the stuff they need, so dlopen
would also have to load libc.so. But now you're potentially using two
different versions of libc in the same program; if
implementation-internal data structures like FILE or the pthread
structure are not identical between the 2 versions, you'll hit an ABI
incompatibility, despite the fact that these data structures were
intended to be implementation-internal and never affect ABI. Even
without that issue, you have issues like potentially 2 copies of
malloc trying to manage the heap without being aware of one another,
and thus clobbering it.
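
The heap-clobbering scenario can be imitated in miniature. The sketch below is a toy model and not how any real malloc works: two independent bump allocators (the names struct bump, bump_alloc and two_allocators_clobber are hypothetical) each keep private bookkeeping while assuming they own the same arena, much as two copies of malloc would each assume ownership of the process heap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One shared region standing in for the process heap. */
static unsigned char arena[64];

/* Each "copy of malloc" has its own private bookkeeping and no
 * knowledge of the other copy's state. */
struct bump {
    size_t used;
};

static void *bump_alloc(struct bump *b, size_t n)
{
    if (n > sizeof arena - b->used)
        return NULL;
    void *p = arena + b->used;
    b->used += n;
    return p;
}

/* Returns 1 if a block handed out by allocator A was overwritten by a
 * write through a block handed out by allocator B, 0 otherwise. */
int two_allocators_clobber(void)
{
    struct bump a = {0};
    struct bump b = {0};

    char *pa = bump_alloc(&a, 16);
    char *pb = bump_alloc(&b, 16);
    assert(pa != NULL && pb != NULL);

    /* Both strings are 15 characters plus the terminating NUL. */
    memcpy(pa, "owned by copy A", 16);
    memcpy(pb, "owned by copy B", 16);

    /* Both allocators started from offset 0, so they returned the
     * same address and A's data has been silently replaced. */
    return pa == pb && strcmp(pa, "owned by copy A") != 0;
}
```

Calling two_allocators_clobber() returns 1: neither allocator did anything wrong by its own rules, yet the data is corrupted because the bookkeeping is not shared.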

For libc, the issues are all fixable by making sure that a static
version of dlopen depends on every single function in libc, so that
another copy never needs to get loaded. However, for other static
libraries pulled into the main program, there is really no fix without
help from the linker (it would have to pull in the entire library, and
somehow leave a note for dlopen to see that library is already loaded
and avoid loading it dynamically too).
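
One way to picture "depends on every single function" is a table of references that forces the static linker to pull the corresponding object files out of libc.a. This is only an illustrative sketch under that assumption, not how musl does or would implement it; the names generic_fn, libc_keep_table and libc_keep_count are hypothetical, and a real table would have to cover all of libc rather than four functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Any function pointer may be converted to another function pointer
 * type, so one generic type is enough for a table of references. */
typedef void (*generic_fn)(void);

/* External linkage, so the table and its relocations end up in the
 * object file. Each entry is an undefined reference that the static
 * linker must satisfy by pulling in the defining archive member. */
const generic_fn libc_keep_table[] = {
    (generic_fn)printf,
    (generic_fn)malloc,
    (generic_fn)free,
    (generic_fn)strlen,
};

/* Returns the number of functions pinned by the table. */
size_t libc_keep_count(void)
{
    size_t n = sizeof libc_keep_table / sizeof libc_keep_table[0];
    for (size_t i = 0; i < n; i++)
        assert(libc_keep_table[i] != NULL);
    return n;
}
```

For libraries other than libc there is no equivalent table the program author controls, which is the part that needs help from the linker.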

Note that, even if we could get this working with a reasonable level
of robustness, almost all the advantages of static linking would be
gone. Static-linked programs using dlopen would be huge and ugly.

If you really want to make single-file binaries with no dependencies
and dlopen support, I think the solution is to first build them
dynamically linked, then merge the main program and all .so files into
a single ELF file. I don't know of any tools capable of doing this,
but in principle it's possible to write one. There are at least 2
different approaches to this. One is to process the ELF files and
merge their list of LOAD segments, symbol and relocation tables, etc.
all into a single ELF file, leaving the relocations in place for the
dynamic linker to perform at startup. This would require some
modification to the dynamic linker still. The other approach is the
equivalent of emacs' unexec dumper: place some kind of hook to run
after the dynamic linker loads everything, but before any other
application code runs, which dumps the entire memory space to an ELF
file which, when run, will reconstruct itself.
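
As a starting point for the first approach, a merging tool would need to enumerate the LOAD segments of each input file. The following is a rough sketch of that one step only, assuming a Linux system with <elf.h>, a 64-bit ELF file, and a file byte order that matches the host; the function name count_load_segments is hypothetical, and nothing here performs any merging.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Count the PT_LOAD program headers of the 64-bit ELF file at `path`,
 * printing each one. Returns -1 if the file cannot be read or is not
 * a 64-bit ELF file. Assumes file and host byte order match. */
long count_load_segments(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    Elf64_Ehdr eh;
    long count = -1;

    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64 &&
        eh.e_phentsize == sizeof(Elf64_Phdr) &&
        fseek(f, (long)eh.e_phoff, SEEK_SET) == 0) {
        count = 0;
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            if (fread(&ph, sizeof ph, 1, f) != 1) {
                count = -1;
                break;
            }
            if (ph.p_type == PT_LOAD) {
                printf("LOAD vaddr=%#llx filesz=%llu memsz=%llu\n",
                       (unsigned long long)ph.p_vaddr,
                       (unsigned long long)ph.p_filesz,
                       (unsigned long long)ph.p_memsz);
                count++;
            }
        }
    }

    fclose(f);
    return count;
}
```

On Linux, count_load_segments("/proc/self/exe") inspects the running program itself. A real merging tool would additionally have to assign non-overlapping addresses to the segments and combine the symbol and relocation tables, which this sketch does not attempt.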

Rich
