mailing list of musl libc
* [musl] Running musl executables without a preinstalled dynamic linker
@ 2022-08-15 21:35 Colin Cross
  2022-08-20  9:43 ` Szabolcs Nagy
  0 siblings, 1 reply; 10+ messages in thread
From: Colin Cross @ 2022-08-15 21:35 UTC (permalink / raw)
  To: musl; +Cc: Ryan Prichard

I would like to distribute dynamic binaries built against musl to
systems that do not have the musl dynamic linker installed in any
known location (e.g. /lib/ld-musl-$ARCH.so.1).  I have two prototypes
that enable this, and I’d like to gauge whether either is something
that would be of interest to check in to musl, or whether it would be
something we should keep in our project.

The first solution is based on the embedded linker we use to test
bionic libc on non-Android systems.  The dynamic linker is compiled as
usual, then the resulting elf file is embedded as raw data into
Scrt1.o and the PT_INTERP segment removed.  The entry point is changed
to point to a trampoline that modifies AT_BASE, AT_ENTRY and AT_PHDR
to simulate how the kernel would initialize them if the dynamic linker
was mapped separately by the kernel instead of as part of the main
executable, and then jumps to the dynamic linker.
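
As an illustration of the auxiliary vector rewrite described above, here
is a minimal sketch.  It is not taken from the prototype: the names
aux_pair, aux_set and aux_redirect are hypothetical, and it assumes the
usual Linux layout in which the vector is an array of (type, value)
pairs of native words terminated by AT_NULL.  Which values the real
trampoline stores is an assumption here.

```c
#include <assert.h>
#include <elf.h>     /* AT_NULL, AT_PHDR, AT_BASE, AT_ENTRY */
#include <stddef.h>

/* One auxiliary-vector slot: a (type, value) pair of native words. */
struct aux_pair {
	unsigned long type;
	unsigned long val;
};

/* Overwrite the first entry of the given type.
 * Returns 1 if such an entry was found, 0 otherwise. */
int aux_set(struct aux_pair *auxv, unsigned long type, unsigned long val)
{
	assert(auxv != NULL);
	for (; auxv->type != AT_NULL; auxv++) {
		if (auxv->type == type) {
			auxv->val = val;
			return 1;
		}
	}
	return 0;
}

/* Hypothetical helper: make the vector look as if the kernel had mapped
 * the linker at linker_base, with the program headers and entry point of
 * the real executable.  Returns 1 only if all three entries were present. */
int aux_redirect(struct aux_pair *auxv, unsigned long linker_base,
                 unsigned long exe_phdr, unsigned long exe_entry)
{
	/* Bitwise & so that every entry is attempted even if one is missing. */
	return aux_set(auxv, AT_BASE, linker_base)
	     & aux_set(auxv, AT_PHDR, exe_phdr)
	     & aux_set(auxv, AT_ENTRY, exe_entry);
}
```

The sketch only edits entries that already exist; a vector that lacks
one of them is reported to the caller rather than extended.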

This embedded linker solution works relatively well, except that the
dynamic linker’s elf sections are inside the main executable’s elf
sections, which can break reasonable assumptions.  For example, musl’s
dladdr fails to find symbols in the embedded linker, and gdb has
trouble finding debug information from the linker.  Musl’s reuse of
libc.so as the linker means that these problems apply to everything in
libc.so, and also increases the size of every binary by including all
of libc.so.

These problems with the embedded linker could be somewhat mitigated by
splitting the dynamic linker out of libc.so when using the embedded
linker.  That requires compiling the ldso sources against a statically
linked libc.a, tweaking some of the initialization, and forwarding the
dl* calls from libc.so to the separate linker.  The changes are
relatively small, but result in a pretty big difference in musl’s
internals with and without the embedded linker that may be hard to
maintain.

The second solution we call “relinterp”.  It was originally designed
by Ryan Prichard as a standalone trampoline that could be used with
musl, glibc or bionic, but I’ve more tightly integrated it with musl
in order to reuse CRTJMP for architecture portability and some of
musl’s string functions to reduce the size of the code.  It uses a
similar trampoline in Scrt1.o, but with a much larger implementation
that reads DT_RUNPATH to construct a path to the dynamic linker that
is relative to the executable.  It then maps the dynamic linker as the
kernel would, modifies AT_BASE, AT_ENTRY and AT_PHDR, and jumps to the
dynamic linker.
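
One step in mapping the dynamic linker "as the kernel would" is working
out how much address space its PT_LOAD segments span, so that the whole
range can be reserved before the individual segments are mapped into
it.  The following is a rough sketch of that calculation only, assuming
64-bit ELF headers; load_span and compute_load_span are hypothetical
names and are not from the prototype.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Address range covering all PT_LOAD segments, rounded to page boundaries. */
struct load_span {
	uint64_t start; /* lowest p_vaddr, rounded down to a page */
	uint64_t end;   /* highest p_vaddr + p_memsz, rounded up to a page */
};

/* page_size must be a nonzero power of two (e.g. the AT_PAGESZ value).
 * Returns 0 on success, -1 if the table contains no PT_LOAD entry. */
int compute_load_span(const Elf64_Phdr *phdr, size_t phnum,
                      uint64_t page_size, struct load_span *out)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	uint64_t lo = UINT64_MAX, hi = 0;
	for (size_t i = 0; i < phnum; i++) {
		if (phdr[i].p_type != PT_LOAD) continue;
		uint64_t seg_start = phdr[i].p_vaddr;
		uint64_t seg_end = seg_start + phdr[i].p_memsz;
		if (seg_start < lo) lo = seg_start;
		if (seg_end > hi) hi = seg_end;
	}
	if (lo == UINT64_MAX) return -1;

	out->start = lo & ~(page_size - 1);
	out->end = (hi + page_size - 1) & ~(page_size - 1);
	return 0;
}
```

The difference end - start is the length to reserve; each segment then
lands at its p_vaddr offset from the chosen base address.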

The current prototype of relinterp is tricky to compile, as it
requires using -fvisibility=hidden and ld -r partial linking to build
a Scrt1.o file that uses some of the src/string/*.c sources without
any relocations, and then objcopy --keep-global-symbol to hide the
string symbols.  It’s only useful if DT_RUNPATH contains $ORIGIN so
that the dynamic linker can be distributed alongside the executable,
so it is probably never going to be suitable for setuid binaries.
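
To make the $ORIGIN dependency concrete, here is a small sketch of how
a single DT_RUNPATH entry beginning with $ORIGIN could be turned into a
loader path next to the executable.  It is illustrative only:
origin_loader_path is a hypothetical name, the ${ORIGIN} spelling is
not handled, and it uses <string.h> for brevity, whereas code that runs
before relocation cannot call into libc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build "<directory of exe_path><rest of entry>/<loader>" from one
 * DT_RUNPATH entry that begins with "$ORIGIN".
 * Returns 0 on success, -1 if the entry does not start with "$ORIGIN",
 * exe_path contains no '/', or the result does not fit in out. */
int origin_loader_path(const char *exe_path, const char *entry,
                       const char *loader, char *out, size_t out_size)
{
	static const char token[] = "$ORIGIN";
	const size_t token_len = sizeof token - 1;

	assert(exe_path && entry && loader && out);

	if (strncmp(entry, token, token_len) != 0) return -1;
	const char *rest = entry + token_len;        /* e.g. "/../lib" or "" */
	if (*rest != '\0' && *rest != '/') return -1; /* reject "$ORIGINAL" etc. */

	const char *slash = strrchr(exe_path, '/');
	if (!slash) return -1;
	size_t dir_len = (size_t)(slash - exe_path);  /* no trailing '/' */

	/* directory + rest + '/' + loader + terminating NUL */
	size_t need = dir_len + strlen(rest) + 1 + strlen(loader) + 1;
	if (need > out_size) return -1;

	memcpy(out, exe_path, dir_len);
	out[dir_len] = '\0';
	strcat(out, rest);
	strcat(out, "/");
	strcat(out, loader);
	return 0;
}
```

For example, an executable at /opt/app/bin/tool with the entry
$ORIGIN/../lib and the loader name libc.so yields
/opt/app/bin/../lib/libc.so.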

If relinterp were going to be included with musl I’d refactor it to
reuse the __dls* bootstrapping from dynlink.c so that it can link
against libc.a and not worry about avoiding any relocations.

An alternative solution to these two would be to distribute statically
linked binaries, which precludes the use of dlopen, or to wrap every
executable in a shell script that runs the dynamic linker directly.

Do either of these prototypes seem interesting enough to clean up and
post as upstream patches, or should I keep them as a side project that
I can bolt on to musl with minimal invasive changes?

Colin


* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-08-15 21:35 [musl] Running musl executables without a preinstalled dynamic linker Colin Cross
@ 2022-08-20  9:43 ` Szabolcs Nagy
  2022-08-23  0:22   ` Colin Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2022-08-20  9:43 UTC (permalink / raw)
  To: Colin Cross; +Cc: musl, Ryan Prichard

* Colin Cross <ccross@android.com> [2022-08-15 14:35:33 -0700]:
> I would like to distribute dynamic binaries built against musl to
> systems that do not have the musl dynamic linker installed in any
> known location (e.g. /lib/ld-musl-$ARCH.so.1).  I have two prototypes
> that enable this, and I’d like to gauge whether either is something
> that would be of interest to check in to musl, or whether it would be
> something we should keep in our project.
> 
> The first solution is based on the embedded linker we use to test
> bionic libc on non-Android systems.  The dynamic linker is compiled as
> usual, then the resulting elf file is embedded as raw data into
> Scrt1.o and the PT_INTERP segment removed.  The entry point is changed
> to point to a trampoline that modifies AT_BASE, AT_ENTRY and AT_PHDR
> to simulate how the kernel would initialize them if the dynamic linker
> was mapped separately by the kernel instead of as part of the main
> executable, and then jumps to the dynamic linker.
> 
> This embedded linker solution works relatively well, except that the
> dynamic linker’s elf sections are inside the main executable’s elf
> sections, which can break reasonable assumptions.  For example, musl’s
> dladdr fails to find symbols in the embedded linker, and gdb has
> trouble finding debug information from the linker.  Musl’s reuse of
> libc.so as the linker means that these problems apply to everything in
> libc.so, and also increases the size of every binary by including all
> of libc.so.
> 
> These problems with the embedded linker could be somewhat mitigated by
> splitting the dynamic linker out of libc.so when using the embedded
> linker.  That requires compiling the ldso sources against a statically
> linked libc.a, tweaking some of the initialization, and forwarding the
> dl* calls from libc.so to the separate linker.  The changes are
> relatively small, but result in a pretty big difference in musl’s
> internals with and without the embedded linker that may be hard to
> maintain.
> 

that breaks atomic update of the libc and introduces libc internal abi.
(i.e. bad for long term security and maintainability)

> The second solution we call “relinterp”.  It was originally designed
> by Ryan Prichard as a standalone trampoline that could be used with
> musl, glibc or bionic, but I’ve more tightly integrated it with musl
> in order to reuse CRTJMP for architecture portability and some of
> musl’s string functions to reduce the size of the code.  It uses a
> similar trampoline in Scrt1.o, but with a much larger implementation
> that reads DT_RUNPATH to construct a path to the dynamic linker that
> is relative to the executable.  It then maps the dynamic linker as the
> kernel would, modifies AT_BASE, AT_ENTRY and AT_PHDR, and jumps to the
> dynamic linker.
> 

i think this is a better approach.

i would not use Scrt1.o though, the same toolchain should be
usable for normal linking and relinterp linking, just use a
different name like Xcrt1.o.

> The current prototype of relinterp is tricky to compile, as it
> requires using -fvisibility=hidden and ld -r partial linking to build
> a Scrt1.o file that uses some of the src/string/*.c sources without
> any relocations, and then objcopy --keep-global-symbol to hide the
> string symbols.  It’s only useful if DT_RUNPATH contains $ORIGIN so
> that the dynamic linker can be distributed alongside the executable,
> so it is probably never going to be suitable for setuid binaries.
> 
> If relinterp were going to be included with musl I’d refactor it to
> reuse the __dls* bootstrapping from dynlink.c so that it can link
> against libc.a and not worry about avoiding any relocations.
> 

i would make Xcrt1.o self-contained and size optimized: it only
runs at start up, this is a different requirement from the -O3
build of normal string functions. and then there is no dependency
on libc internals (which may have various instrumentations that
does not work in Xcrt1.o).

> An alternative solution to these two would be to distribute statically
> linked binaries, which precludes the use of dlopen, or to wrap every
> executable in a shell script that runs the dynamic linker directly.
> 

i think it is possible to support static linking such that if dlopen
is linked then the entire libc gets linked into the main exe with
libc apis exported. then dlopen can work from an otherwise static exe.
(may not be easy to implement in practice though)

> Do either of these prototypes seem interesting enough to clean up and
> post as upstream patches, or should I keep them as a side project that
> I can bolt on to musl with minimal invasive changes?
> 
> Colin


* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-08-20  9:43 ` Szabolcs Nagy
@ 2022-08-23  0:22   ` Colin Cross
  2022-08-23  8:18     ` Szabolcs Nagy
  0 siblings, 1 reply; 10+ messages in thread
From: Colin Cross @ 2022-08-23  0:22 UTC (permalink / raw)
  To: Colin Cross, musl, Ryan Prichard

On Sat, Aug 20, 2022 at 2:43 AM Szabolcs Nagy <nsz@port70.net> wrote:
>
> * Colin Cross <ccross@android.com> [2022-08-15 14:35:33 -0700]:
> > I would like to distribute dynamic binaries built against musl to
> > systems that do not have the musl dynamic linker installed in any
> > known location (e.g. /lib/ld-musl-$ARCH.so.1).  I have two prototypes
> > that enable this, and I’d like to gauge whether either is something
> > that would be of interest to check in to musl, or whether it would be
> > something we should keep in our project.
> >
> > The first solution is based on the embedded linker we use to test
> > bionic libc on non-Android systems.  The dynamic linker is compiled as
> > usual, then the resulting elf file is embedded as raw data into
> > Scrt1.o and the PT_INTERP segment removed.  The entry point is changed
> > to point to a trampoline that modifies AT_BASE, AT_ENTRY and AT_PHDR
> > to simulate how the kernel would initialize them if the dynamic linker
> > was mapped separately by the kernel instead of as part of the main
> > executable, and then jumps to the dynamic linker.
> >
> > This embedded linker solution works relatively well, except that the
> > dynamic linker’s elf sections are inside the main executable’s elf
> > sections, which can break reasonable assumptions.  For example, musl’s
> > dladdr fails to find symbols in the embedded linker, and gdb has
> > trouble finding debug information from the linker.  Musl’s reuse of
> > libc.so as the linker means that these problems apply to everything in
> > libc.so, and also increases the size of every binary by including all
> > of libc.so.
> >
> > These problems with the embedded linker could be somewhat mitigated by
> > splitting the dynamic linker out of libc.so when using the embedded
> > linker.  That requires compiling the ldso sources against a statically
> > linked libc.a, tweaking some of the initialization, and forwarding the
> > dl* calls from libc.so to the separate linker.  The changes are
> > relatively small, but result in a pretty big difference in musl’s
> > internals with and without the embedded linker that may be hard to
> > maintain.
> >
>
> that breaks atomic update of the libc and introduces libc internal abi.
> (i.e. bad for long term security and maintainability)

The intent was to distribute the binary with the embedded linker
alongside a matching copy of libc, but yes, this would increase the
chances of an unintended version skew.

> > The second solution we call “relinterp”.  It was originally designed
> > by Ryan Prichard as a standalone trampoline that could be used with
> > musl, glibc or bionic, but I’ve more tightly integrated it with musl
> > in order to reuse CRTJMP for architecture portability and some of
> > musl’s string functions to reduce the size of the code.  It uses a
> > similar trampoline in Scrt1.o, but with a much larger implementation
> > that reads DT_RUNPATH to construct a path to the dynamic linker that
> > is relative to the executable.  It then maps the dynamic linker as the
> > kernel would, modifies AT_BASE, AT_ENTRY and AT_PHDR, and jumps to the
> > dynamic linker.
> >
>
> i think this is a better approach.

Agreed, I generally prefer this approach.

> i would not use Scrt1.o though, the same toolchain should be
> usable for normal linking and relinterp linking, just use a
> different name like Xcrt1.o.

Is there some way to get gcc/clang to use Xcrt1.o without using
-nostdlib and passing all the crtbegin/end objects manually?

> > The current prototype of relinterp is tricky to compile, as it
> > requires using -fvisibility=hidden and ld -r partial linking to build
> > a Scrt1.o file that uses some of the src/string/*.c sources without
> > any relocations, and then objcopy --keep-global-symbol to hide the
> > string symbols.  It’s only useful if DT_RUNPATH contains $ORIGIN so
> > that the dynamic linker can be distributed alongside the executable,
> > so it is probably never going to be suitable for setuid binaries.
> >
> > If relinterp were going to be included with musl I’d refactor it to
> > reuse the __dls* bootstrapping from dynlink.c so that it can link
> > against libc.a and not worry about avoiding any relocations.
> >
>
> i would make Xcrt1.o self-contained and size optimized: it only
> runs at start up, this is a different requirement from the -O3
> build of normal string functions. and then there is no dependency
> on libc internals (which may have various instrumentations that
> does not work in Xcrt1.o).

Doesn't this same logic apply to most of the code in dynlink.c?  My
main worry with a self contained implementation is that it requires
reimplementations of various string functions that are easy to get
wrong.  The current prototype reuses the C versions of musl's string
functions, but implements its own syscall wrappers to avoid
interactions with musl internals like errno.

> > An alternative solution to these two would be to distribute statically
> > linked binaries, which precludes the use of dlopen, or to wrap every
> > executable in a shell script that runs the dynamic linker directly.
> >
>
> i think it is possible to support static linking such that if dlopen
> is linked then the entire libc gets linked into the main exe with
> libc apis exported. then dlopen can work from an otherwise static exe.
> (may not be easy to implement in practice though)
>
> > Do either of these prototypes seem interesting enough to clean up and
> > post as upstream patches, or should I keep them as a side project that
> > I can bolt on to musl with minimal invasive changes?
> >
> > Colin


* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-08-23  0:22   ` Colin Cross
@ 2022-08-23  8:18     ` Szabolcs Nagy
  2022-09-26 22:38       ` Colin Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2022-08-23  8:18 UTC (permalink / raw)
  To: Colin Cross; +Cc: musl, Ryan Prichard

* Colin Cross <ccross@android.com> [2022-08-22 17:22:06 -0700]:
> On Sat, Aug 20, 2022 at 2:43 AM Szabolcs Nagy <nsz@port70.net> wrote:
> > i would not use Scrt1.o though, the same toolchain should be
> > usable for normal linking and relinterp linking, just use a
> > different name like Xcrt1.o.
> 
> Is there some way to get gcc/clang to use Xcrt1.o without using
> -nostdlib and passing all the crtbegin/end objects manually?

this requires compiler changes (new cmdline flag) but then i think
the code is upstreamable.

> > i would make Xcrt1.o self-contained and size optimized: it only
> > runs at start up, this is a different requirement from the -O3
> > build of normal string functions. and then there is no dependency
> > on libc internals (which may have various instrumentations that
> > does not work in Xcrt1.o).
> 
> Doesn't this same logic apply to most of the code in dynlink.c?  My
> main worry with a self contained implementation is that it requires
> reimplementations of various string functions that are easy to get
> wrong.  The current prototype reuses the C versions of musl's string
> functions, but implements its own syscall wrappers to avoid
> interactions with musl internals like errno.

dynlink is in libc.so so it can use code from there.

but moving libc code into the executable has different constraints.
so you will have to make random decisions that string functions are
in but errno is out, wrt which libc internal makes sense in the exe.

i would just keep a separate implementation (or at least compile
the code separately). string functions are easy to implement if
you dont try to optimize them imo. then you have full control over
what is going on in the exe entry code.


* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-08-23  8:18     ` Szabolcs Nagy
@ 2022-09-26 22:38       ` Colin Cross
  2022-09-26 22:42         ` Colin Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Colin Cross @ 2022-09-26 22:38 UTC (permalink / raw)
  To: Colin Cross, musl, Ryan Prichard

On Tue, Aug 23, 2022 at 1:18 AM Szabolcs Nagy <nsz@port70.net> wrote:
>
> * Colin Cross <ccross@android.com> [2022-08-22 17:22:06 -0700]:
> > On Sat, Aug 20, 2022 at 2:43 AM Szabolcs Nagy <nsz@port70.net> wrote:
> > > i would not use Scrt1.o though, the same toolchain should be
> > > usable for normal linking and relinterp linking, just use a
> > > different name like Xcrt1.o.
> >
> > Is there some way to get gcc/clang to use Xcrt1.o without using
> > -nostdlib and passing all the crtbegin/end objects manually?
>
> this requires compiler changes (new cmdline flag) but then i think
> the code is upstreamable.

I've used the name relinterp.o for now, and it is selected instead of
Scrt1.o in musl-gcc.specs and ld.musl-clang.

>
> > > i would make Xcrt1.o self-contained and size optimized: it only
> > > runs at start up, this is a different requirement from the -O3
> > > build of normal string functions. and then there is no dependency
> > > on libc internals (which may have various instrumentations that
> > > does not work in Xcrt1.o).
> >
> > Doesn't this same logic apply to most of the code in dynlink.c?  My
> > main worry with a self contained implementation is that it requires
> > reimplementations of various string functions that are easy to get
> > wrong.  The current prototype reuses the C versions of musl's string
> > functions, but implements its own syscall wrappers to avoid
> > interactions with musl internals like errno.
>
> dynlink is in libc.so so it can use code from there.
>
> but moving libc code into the executable has different constraints.
> so you will have to make random decisions that string functions are
> in but errno is out, wrt which libc internal makes sense in the exe.
>
> i would just keep a separate implementation (or at least compile
> the code separately). string functions are easy to implement if
> you dont try to optimize them imo. then you have full control over
> what is going on in the exe entry code.

I left the reimplementations of string functions and syscalls as
suggested.  Patch attached.


* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-09-26 22:38       ` Colin Cross
@ 2022-09-26 22:42         ` Colin Cross
  2022-09-26 23:02           ` Rich Felker
  0 siblings, 1 reply; 10+ messages in thread
From: Colin Cross @ 2022-09-26 22:42 UTC (permalink / raw)
  To: Colin Cross, musl, Ryan Prichard

[-- Attachment #1: Type: text/plain, Size: 2208 bytes --]

On Mon, Sep 26, 2022 at 3:38 PM Colin Cross <ccross@android.com> wrote:
>
> On Tue, Aug 23, 2022 at 1:18 AM Szabolcs Nagy <nsz@port70.net> wrote:
> >
> > * Colin Cross <ccross@android.com> [2022-08-22 17:22:06 -0700]:
> > > On Sat, Aug 20, 2022 at 2:43 AM Szabolcs Nagy <nsz@port70.net> wrote:
> > > > i would not use Scrt1.o though, the same toolchain should be
> > > > usable for normal linking and relinterp linking, just use a
> > > > different name like Xcrt1.o.
> > >
> > > Is there some way to get gcc/clang to use Xcrt1.o without using
> > > -nostdlib and passing all the crtbegin/end objects manually?
> >
> > this requires compiler changes (new cmdline flag) but then i think
> > the code is upstreamable.
>
> I've used the name relinterp.o for now, and it is selected instead of
> Scrt1.o in musl-gcc.specs and ld.musl-clang.
>
> >
> > > > i would make Xcrt1.o self-contained and size optimized: it only
> > > > runs at start up, this is a different requirement from the -O3
> > > > build of normal string functions. and then there is no dependency
> > > > on libc internals (which may have various instrumentations that
> > > > does not work in Xcrt1.o).
> > >
> > > Doesn't this same logic apply to most of the code in dynlink.c?  My
> > > main worry with a self contained implementation is that it requires
> > > reimplementations of various string functions that are easy to get
> > > wrong.  The current prototype reuses the C versions of musl's string
> > > functions, but implements its own syscall wrappers to avoid
> > > interactions with musl internals like errno.
> >
> > dynlink is in libc.so so it can use code from there.
> >
> > but moving libc code into the executable has different constraints.
> > so you will have to make random decisions that string functions are
> > in but errno is out, wrt which libc internal makes sense in the exe.
> >
> > i would just keep a separate implementation (or at least compile
> > the code separately). string functions are easy to implement if
> > you dont try to optimize them imo. then you have full control over
> > what is going on in the exe entry code.
>
> I left the reimplementations of string functions and syscalls as
> suggested.  Patch attached.

[-- Attachment #2: 0001-Add-entry-point-to-find-dynamic-loader-relative-to-t.patch --]
[-- Type: text/x-patch, Size: 36213 bytes --]

From 0df460188b95f79272003bd0e5c12bceb2a3c25f Mon Sep 17 00:00:00 2001
From: Colin Cross <ccross@android.com>
Date: Thu, 22 Sep 2022 19:14:01 -0700
Subject: [PATCH] Add entry point to find dynamic loader relative to the
 executable

Distributing binaries built against musl to systems that don't already
have musl is problematic due to the hardcoded absolute path to the
dynamic loader (e.g. /lib/ld-musl-$ARCH.so.1) in the PT_INTERP header.
This patch adds a feature to avoid the problem by leaving out PT_INTERP
and replacing Scrt1.o with an entry point that can find the dynamic
loader using DT_RUNPATH or LD_LIBRARY_PATH.

The entry point is in crt/relinterp.c.  It uses auxval to get the
program headers and find the load address of the binary, then
searches LD_LIBRARY_PATH or DT_RUNPATH for the dynamic loader.
Once found, it mmaps the loader similar to the way the kernel
does when PT_INTERP is set.  The musl loader uses PT_INTERP to set
the path to the loader in the shared library info exported to the
debugger, so relinterp creates a copy of the program headers
with the PT_INTERP entry added pointing to the found location of
the dynamic loader.  It updates AT_BASE to point to the address
of the dynamic loader, then jumps to the loader's entry point.

The dynamic loader then loads shared libraries and handles
relocations before jumping to the executable's entry point, which is
the entry point in relinterp.c again.  Relinterp detects that
relocations have been performed and calls __libc_start_main, the
same way Scrt1.o would have.

Since relinterp runs before relocations have been performed it has
to avoid referencing any libc functions.  That means reimplementing
the few syscalls and string functions that it uses, and avoiding
implicit calls to memcpy and memset that may be inserted by the
compiler.

Enabling relinterp is handled in the spec file for gcc and in
the linker script for clang via a -relinterp argument.

Normally gdb and lldb look for a symbol named "_dl_debug_state" in
the interpreter to get notified when the dynamic loader has modified
the list of shared libraries.  When using relinterp the debugger is
not aware of the interpreter (at process launch PT_INTERP is unset
and auxv AT_BASE is 0) so it doesn't know where to look for the symbol.

They fall back to looking in the executable, so we can provide a symbol
in relinterp.c for them to find.  The dynamic loader is then modified
to also find the symbol in the executable and to call it from its own
_dl_debug_state function.

The same tests in libc_test pass with or without LDFLAGS += -relinterp
with both musl-gcc and musl-clang.

Ryan Prichard (rprichard@google.com) authored the original prototype
of relinterp.
---
 Makefile                |   2 +-
 crt/relinterp.c         | 896 ++++++++++++++++++++++++++++++++++++++++
 ldso/dynlink.c          |   9 +
 tools/ld.musl-clang.in  |  26 +-
 tools/musl-clang.in     |  17 +-
 tools/musl-gcc.specs.sh |   4 +-
 6 files changed, 948 insertions(+), 6 deletions(-)
 create mode 100644 crt/relinterp.c

diff --git a/Makefile b/Makefile
index e8cc4436..9b38024b 100644
--- a/Makefile
+++ b/Makefile
@@ -113,7 +113,7 @@ obj/crt/crt1.o obj/crt/scrt1.o obj/crt/rcrt1.o obj/ldso/dlstart.lo: $(srcdir)/ar
 
 obj/crt/rcrt1.o: $(srcdir)/ldso/dlstart.c
 
-obj/crt/Scrt1.o obj/crt/rcrt1.o: CFLAGS_ALL += -fPIC
+obj/crt/Scrt1.o obj/crt/rcrt1.o obj/crt/relinterp.o: CFLAGS_ALL += -fPIC
 
 OPTIMIZE_SRCS = $(wildcard $(OPTIMIZE_GLOBS:%=$(srcdir)/src/%))
 $(OPTIMIZE_SRCS:$(srcdir)/%.c=obj/%.o) $(OPTIMIZE_SRCS:$(srcdir)/%.c=obj/%.lo): CFLAGS += -O3
diff --git a/crt/relinterp.c b/crt/relinterp.c
new file mode 100644
index 00000000..fa68bd46
--- /dev/null
+++ b/crt/relinterp.c
@@ -0,0 +1,896 @@
+/*
+ * Copyright (C) 2021 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <elf.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <link.h>
+#include <stdalign.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <sys/mman.h>
+#include <sys/param.h>
+#include <sys/syscall.h>
+#include <sys/user.h>
+#include <unistd.h>
+
+#include "reloc.h"
+#include "syscall.h"
+
+#ifndef LOADER_PATH
+#define LOADER_PATH "libc.so"
+#endif
+
+typedef void EntryFunc(void);
+
+// arm64 doesn't have a constant page size and has to use the value from AT_PAGESZ.
+#ifndef PAGE_SIZE
+#define PAGE_SIZE g_page_size
+#endif
+
+#define PAGE_START(x) ((x) & (~(PAGE_SIZE-1)))
+#define PAGE_END(x) PAGE_START((x) + (PAGE_SIZE - 1))
+
+#define START "_start"
+#include "crt_arch.h"
+
+int main();
+weak void _init();
+weak void _fini();
+int __libc_start_main(int (*)(), int, char **, void (*)(), void(*)(), void(*)());
+
+static ElfW(Phdr) replacement_phdr_table[64];
+static char replacement_interp[PATH_MAX];
+
+static bool g_debug = false;
+static const char* g_prog_name = NULL;
+static uintptr_t g_page_size = 0;
+static int g_errno = 0;
+
+__attribute__((visibility("hidden"))) extern ElfW(Dyn) _DYNAMIC[];
+
+__attribute__((used))
+static long ri_set_errno(unsigned long val) {
+	if (val > -4096UL) {
+		g_errno = -val;
+		return -1;
+	}
+	return val;
+}
+
+#define ri_syscall(...) ri_set_errno(__syscall(__VA_ARGS__))
+
+static ssize_t ri_write(int fd, const void* buf, size_t amt) {
+	return ri_syscall(SYS_write, fd, buf, amt);
+}
+
+__attribute__((noreturn))
+static void ri_exit(int status) {
+	ri_syscall(SYS_exit, status);
+	__builtin_unreachable();
+}
+
+static int ri_open(const char* path, int flags, mode_t mode) {
+	return ri_syscall(SYS_openat, AT_FDCWD, path, flags, mode);
+}
+
+static int ri_close(int fd) {
+	return ri_syscall(SYS_close, fd);
+}
+
+static off_t ri_lseek(int fd, off_t offset, int whence) {
+	return ri_syscall(SYS_lseek, fd, offset, whence);
+}
+
+static ssize_t ri_readlink(const char* path, char* buf, size_t size) {
+	return ri_syscall(SYS_readlinkat, AT_FDCWD, path, buf, size);
+}
+
+static void* ri_mmap(void* addr, size_t length, int prot, int flags, int fd, off_t offset) {
+#ifdef SYS_mmap2
+	return (void*)ri_syscall(SYS_mmap2, addr, length, prot, flags, fd, offset/SYSCALL_MMAP2_UNIT);
+#else
+	return (void*)ri_syscall(SYS_mmap, addr, length, prot, flags, fd, offset);
+#endif
+}
+
+static int ri_munmap(void* addr, size_t length) {
+	return ri_syscall(SYS_munmap, addr, length);
+}
+
+static int ri_mprotect(void* addr, size_t len, int prot) {
+	return ri_syscall(SYS_mprotect, addr, len, prot);
+}
+
+static size_t ri_strlen(const char* src) {
+	for (size_t len = 0;; ++len) {
+		if (src[len] == '\0') return len;
+	}
+}
+
+static char* ri_strcpy(char* dst, const char* src) {
+	char* result = dst;
+	while ((*dst = *src) != '\0') {
+		++dst;
+		++src;
+	}
+	return result;
+}
+
+static char* ri_strcat(char* dst, const char* src) {
+	ri_strcpy(dst + ri_strlen(dst), src);
+	return dst;
+}
+
+static void* ri_memset(void* dst, int val, size_t len) {
+	for (size_t i = 0; i < len; ++i) {
+		((char*)dst)[i] = val;
+	}
+	return dst;
+}
+
+__attribute__ ((unused))
+static void* ri_memcpy(void* dst, const void* src, size_t len) {
+	for (size_t i = 0; i < len; ++i) {
+		((char*)dst)[i] = ((char*)src)[i];
+	}
+	return dst;
+}
+
+static int ri_strncmp(const char* x, const char *y, size_t maxlen) {
+	for (size_t i = 0;; ++i) {
+		if (i == maxlen) return 0;
+		int result = (unsigned char)x[i] - (unsigned char)y[i];
+		if (result != 0) return result;
+		if (x[i] == '\0') return 0;
+	}
+}
+
+static int ri_strcmp(const char* x, const char *y) {
+	return ri_strncmp(x, y, SIZE_MAX);
+}
+
+static char* ri_strrchr(const char* str, int ch) {
+	char* result = NULL;
+	while (true) {
+		if (*str == ch) result = (char*)str;
+		if (*str == '\0') break;
+		++str;
+	}
+	return result;
+}
+
+static char* ri_strchr(const char* str, int ch) {
+	while (*str) {
+		if (*str == ch) return (char*)str;
+		++str;
+	}
+	return NULL;
+}
+
+static void ri_dirname(char* path) {
+	char* last_slash = ri_strrchr(path, '/');
+	if (last_slash == NULL) {
+		path[0] = '.';   // returns "."
+		path[1] = '\0';
+	} else if (last_slash == path) {
+		path[1] = '\0';  // returns "/"
+	} else {
+		*last_slash = '\0';
+	}
+}
+
+static void out_str_n(const char* str, size_t n) {
+	ri_write(STDERR_FILENO, str, n);
+}
+
+static void out_str(const char* str) {
+	out_str_n(str, ri_strlen(str));
+}
+
+static char* ul_to_str(unsigned long i, char* out, unsigned char base) {
+	char buf[65];
+	char* cur = &buf[65];
+	*--cur = '\0';
+	do {
+		*--cur = "0123456789abcdef"[i % base];
+		i /= base;
+	} while (i > 0);
+	return ri_strcpy(out, cur);
+}
+
+static char* l_to_str(long i, char* out, unsigned char base) {
+	if (i < 0) {
+		*out = '-';
+		ul_to_str(-(unsigned long)i, out + 1, base);
+		return out;
+	} else {
+		return ul_to_str(i, out, base);
+	}
+}
+
+static const char* ri_strerror(int err) {
+	switch (err) {
+	case EPERM: return "Operation not permitted";
+	case ENOENT: return "No such file or directory";
+	case EIO: return "I/O error";
+	case ENXIO: return "No such device or address";
+	case EAGAIN: return "Try again";
+	case ENOMEM: return "Out of memory";
+	case EACCES: return "Permission denied";
+	case ENODEV: return "No such device";
+	case ENOTDIR: return "Not a directory";
+	case EINVAL: return "Invalid argument";
+	case ENFILE: return "File table overflow";
+	case EMFILE: return "Too many open files";
+	case ESPIPE: return "Illegal seek";
+	case ENAMETOOLONG: return "File name too long";
+	case ELOOP: return "Too many symbolic links encountered";
+	}
+	static char buf[64];
+	ri_strcpy(buf, "Unknown error ");
+	l_to_str(err, buf + ri_strlen(buf), 10);
+	return buf;
+}
+
+static void outv(const char *fmt, va_list ap) {
+	char buf[65];
+	while (true) {
+		if (fmt[0] == '\0') break;
+
+#define NUM_FMT(num_fmt, type, func, base)                  \
+		if (!ri_strncmp(fmt, num_fmt, sizeof(num_fmt) - 1)) {   \
+			out_str(func(va_arg(ap, type), buf, base));           \
+			fmt += sizeof(num_fmt) - 1;                           \
+			continue;                                             \
+		}
+		NUM_FMT("%d",  int,           l_to_str,  10);
+		NUM_FMT("%ld", long,          l_to_str,  10);
+		NUM_FMT("%u",  unsigned int,  ul_to_str, 10);
+		NUM_FMT("%lu", unsigned long, ul_to_str, 10);
+		NUM_FMT("%zu", size_t,        ul_to_str, 10);
+		NUM_FMT("%x",  unsigned int,  ul_to_str, 16);
+		NUM_FMT("%lx", unsigned long, ul_to_str, 16);
+		NUM_FMT("%zx", size_t,        ul_to_str, 16);
+#undef NUM_FMT
+
+		if (!ri_strncmp(fmt, "%p", 2)) {
+			out_str(ul_to_str((unsigned long)va_arg(ap, void*), buf, 16));
+			fmt += 2;
+		} else if (!ri_strncmp(fmt, "%s", 2)) {
+			const char* arg = va_arg(ap, const char*);
+			out_str(arg ? arg : "(null)");
+			fmt += 2;
+		} else if (!ri_strncmp(fmt, "%%", 2)) {
+			out_str("%");
+			fmt += 2;
+		} else if (fmt[0] == '%') {
+			buf[0] = fmt[1];
+			buf[1] = '\0';
+			out_str("relinterp error: unrecognized output specifier: '%");
+			out_str(buf);
+			out_str("'\n");
+			ri_exit(1);
+		} else {
+			size_t len = 0;
+			while (fmt[len] != '\0' && fmt[len] != '%') ++len;
+			out_str_n(fmt, len);
+			fmt += len;
+		}
+	}
+}
+
+__attribute__((format(printf, 1, 2)))
+static void debug(const char* fmt, ...) {
+	if (!g_debug) return;
+	out_str("relinterp: ");
+
+	va_list ap;
+	va_start(ap, fmt);
+	outv(fmt, ap);
+	va_end(ap);
+	out_str("\n");
+}
+
+__attribute__((format(printf, 1, 2), noreturn))
+static void fatal(const char* fmt, ...) {
+	out_str("relinterp: ");
+	if (g_prog_name) {
+		out_str(g_prog_name);
+		out_str(": ");
+	}
+	out_str("fatal error: ");
+
+	va_list ap;
+	va_start(ap, fmt);
+	outv(fmt, ap);
+	va_end(ap);
+	out_str("\n");
+	ri_exit(1);
+}
+
+static void* optimizer_barrier(void* val) {
+	__asm__ volatile ("nop" :: "r"(&val) : "memory");
+	return val;
+}
+
+typedef struct {
+	unsigned long key;
+	unsigned long value;
+} AuxEntry;
+
+typedef struct {
+	int argc;
+	char **argv;
+	char **envp;
+	size_t envp_count;
+	AuxEntry* auxv;
+	size_t auxv_count;
+} KernelArguments;
+
+static KernelArguments read_args(void* raw_args) {
+	KernelArguments result;
+	result.argc = *(long*)raw_args;
+	result.argv = (char**)((void**)raw_args + 1);
+	result.envp = result.argv + result.argc + 1;
+
+	char** envp = result.envp;
+	while (*envp != NULL) ++envp;
+	result.envp_count = envp - result.envp;
+	++envp;
+
+	result.auxv = (AuxEntry*)envp;
+	size_t count = 0;
+	while (result.auxv[count].key != 0) {
+		++count;
+	}
+	result.auxv_count = count;
+	return result;
+}
+
+static void dump_auxv(const KernelArguments* args) {
+	for (size_t i = 0; i < args->auxv_count; ++i) {
+		const char* name = "";
+		switch (args->auxv[i].key) {
+		case AT_BASE: name = " [AT_BASE]"; break;
+		case AT_EGID: name = " [AT_EGID]"; break;
+		case AT_ENTRY: name = " [AT_ENTRY]"; break;
+		case AT_EUID: name = " [AT_EUID]"; break;
+		case AT_GID: name = " [AT_GID]"; break;
+		case AT_PAGESZ: name = " [AT_PAGESZ]"; break;
+		case AT_PHDR: name = " [AT_PHDR]"; break;
+		case AT_PHENT: name = " [AT_PHENT]"; break;
+		case AT_PHNUM: name = " [AT_PHNUM]"; break;
+		case AT_SECURE: name = " [AT_SECURE]"; break;
+		case AT_SYSINFO: name = " [AT_SYSINFO]"; break;
+		case AT_SYSINFO_EHDR: name = " [AT_SYSINFO_EHDR]"; break;
+		case AT_UID: name = " [AT_UID]"; break;
+		}
+		debug("  %lu => 0x%lx%s", args->auxv[i].key, args->auxv[i].value, name);
+	}
+}
+
+static unsigned long ri_getauxval(const KernelArguments* args, unsigned long kind,
+                                  bool allow_missing) {
+	for (size_t i = 0; i < args->auxv_count; ++i) {
+		if (args->auxv[i].key == kind) return args->auxv[i].value;
+	}
+	if (!allow_missing) fatal("could not find aux vector entry %lu", kind);
+	return 0;
+}
+
+static int elf_flags_to_prot(int flags) {
+	int result = 0;
+	if (flags & PF_R) result |= PROT_READ;
+	if (flags & PF_W) result |= PROT_WRITE;
+	if (flags & PF_X) result |= PROT_EXEC;
+	return result;
+}
+
+typedef struct {
+	int fd;
+	char path[PATH_MAX];
+} OpenedLoader;
+
+typedef struct {
+	void* base_addr;
+	EntryFunc* entry;
+} LoadedInterp;
+
+static LoadedInterp load_interp(const OpenedLoader *loader, ElfW(Ehdr)* hdr) {
+	ElfW(Phdr)* phdr = (ElfW(Phdr)*)((char*)hdr + hdr->e_phoff);
+	size_t phdr_count = hdr->e_phnum;
+
+	size_t max_vaddr = 0;
+
+	// Find the virtual address extent.
+	for (size_t i = 0; i < phdr_count; ++i) {
+		if (phdr[i].p_type == PT_LOAD) {
+			max_vaddr = PAGE_END(MAX(max_vaddr, phdr[i].p_vaddr + phdr[i].p_memsz));
+		}
+	}
+
+	// Map an area to fit the loader.
+	void* loader_vaddr = ri_mmap(NULL, max_vaddr, PROT_READ | PROT_WRITE,
+	                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (loader_vaddr == (void*)MAP_FAILED) {
+		fatal("reservation mmap of 0x%zx bytes for %s failed: %s", max_vaddr, loader->path,
+		      ri_strerror(g_errno));
+	}
+
+	// Map each PT_LOAD.
+	for (size_t i = 0; i < phdr_count; ++i) {
+		if (phdr[i].p_type == PT_LOAD) {
+			size_t start = PAGE_START(phdr[i].p_vaddr);
+			const size_t end = PAGE_END(phdr[i].p_vaddr + phdr[i].p_memsz);
+			if (phdr[i].p_filesz > 0) {
+				const size_t file_end = phdr[i].p_vaddr + phdr[i].p_filesz;
+				void* tmp = ri_mmap((char*)loader_vaddr + start,
+				                    file_end - start,
+				                    elf_flags_to_prot(phdr[i].p_flags),
+				                    MAP_PRIVATE | MAP_FIXED, loader->fd, PAGE_START(phdr[i].p_offset));
+				if (tmp == (void*)MAP_FAILED) {
+					fatal("PT_LOAD mmap failed (%s segment #%zu): %s", loader->path, i,
+					      ri_strerror(g_errno));
+				}
+				start = file_end;
+				if (phdr[i].p_flags & PF_W) {
+					// The bytes between p_filesz and PAGE_END(p_filesz) currently come from the file mapping,
+					// but they need to be zeroed. (Apparently this zeroing isn't necessary if the segment isn't
+					// writable, and zeroing a non-writable page would be inconvenient.)
+					ri_memset((char*)loader_vaddr + start, '\0', PAGE_END(start) - start);
+				}
+				start = PAGE_END(start);
+			}
+			if (start < end) {
+				// The memory is already zeroed, because it comes from the anonymous reservation mapping.
+				// Just set the protections correctly.
+				int result = ri_mprotect((char*)loader_vaddr + start, end - start,
+				                         elf_flags_to_prot(phdr[i].p_flags));
+				if (result != 0) {
+					fatal("mprotect of PT_LOAD failed (%s segment #%zu): %s", loader->path, i,
+					      ri_strerror(g_errno));
+				}
+			}
+		}
+	}
+
+	return (LoadedInterp) {
+		.base_addr = loader_vaddr,
+		.entry = (EntryFunc*)((uintptr_t)loader_vaddr + hdr->e_entry),
+	};
+}
+
+typedef struct {
+	ElfW(Phdr)* phdr;
+	size_t phdr_count;
+	uintptr_t load_bias;
+	uintptr_t page_size;
+	char* search_paths;
+	ElfW(Ehdr)* ehdr;
+	ElfW(Phdr)* first_load;
+	bool secure;
+} ExeInfo;
+
+static ExeInfo get_exe_info(const KernelArguments* args) {
+	ExeInfo result = { 0 };
+	result.phdr = (ElfW(Phdr)*)ri_getauxval(args, AT_PHDR, false);
+	result.phdr_count = ri_getauxval(args, AT_PHNUM, false);
+	result.page_size = ri_getauxval(args, AT_PAGESZ, false);
+
+	unsigned long uid = ri_getauxval(args, AT_UID, false);
+	unsigned long euid = ri_getauxval(args, AT_EUID, false);
+	unsigned long gid = ri_getauxval(args, AT_GID, false);
+	unsigned long egid = ri_getauxval(args, AT_EGID, false);
+	unsigned long secure = ri_getauxval(args, AT_SECURE, true);
+	result.secure = uid != euid || gid != egid || secure;
+
+	debug("orig phdr     = %p", (void*)result.phdr);
+	debug("orig phnum    = %zu", result.phdr_count);
+
+	for (size_t i = 0; i < result.phdr_count; ++i) {
+		if (result.phdr[i].p_type == PT_DYNAMIC) {
+			result.load_bias = (uintptr_t)&_DYNAMIC - result.phdr[i].p_vaddr;
+		}
+	}
+	debug("load_bias     = 0x%lx", (unsigned long)result.load_bias);
+
+	for (size_t i = 0; i < result.phdr_count; ++i) {
+		ElfW(Phdr)* phdr = &result.phdr[i];
+		if (phdr->p_type != PT_LOAD) continue;
+		result.first_load = phdr;
+		if (phdr->p_offset != 0) {
+			fatal("expected zero p_offset for first PT_LOAD, found 0x%zx instead",
+			      (size_t)phdr->p_offset);
+		}
+		result.ehdr = (ElfW(Ehdr)*)(phdr->p_vaddr + result.load_bias);
+		break;
+	}
+	debug("ehdr          = %p", (void*)result.ehdr);
+
+	ElfW(Word) runpath_offset = -1;
+	char* strtab = NULL;
+	for (ElfW(Dyn)* dyn = _DYNAMIC; dyn->d_tag != DT_NULL; dyn++) {
+		switch (dyn->d_tag) {
+		case DT_RUNPATH:
+			runpath_offset = dyn->d_un.d_val;
+			break;
+		case DT_RPATH:
+			if (runpath_offset == -1) runpath_offset = dyn->d_un.d_val;
+			break;
+		case DT_STRTAB:
+			strtab = (char*)(dyn->d_un.d_ptr + result.load_bias);
+			break;
+		}
+	}
+
+	if (strtab && runpath_offset != -1) {
+		result.search_paths = strtab + runpath_offset;
+		debug("dt_runpath    = %s", result.search_paths);
+	}
+	return result;
+}
+
+// Loaders typically read the PT_INTERP of the executable, e.g. to set a pathname on the loader.
+// glibc insists on the executable having PT_INTERP, and aborts if it's missing.  Musl passes it
+// to debuggers to find symbols for the loader, which includes all the libc symbols.
+//
+// Make a copy of the phdr table and insert PT_INTERP into the copy.
+//
+static void insert_pt_interp_into_phdr_table(const KernelArguments* args, const ExeInfo* exe,
+                                             const char* loader_realpath) {
+	// Reserve extra space for the inserted PT_PHDR and PT_INTERP segments and a null terminator.
+	if (exe->phdr_count + 3 > sizeof(replacement_phdr_table) / sizeof(replacement_phdr_table[0])) {
+		fatal("too many phdr table entries in executable");
+	}
+
+	ElfW(Phdr) newPhdr = {
+		.p_type = PT_PHDR,
+		// The replacement phdr is in the BSS section, which has no file location.
+		// Use 0 for the offset.  If this causes a problem the replacement phdr could
+		// be moved to the data section and the correct p_offset calculated.
+		.p_offset = 0,
+		.p_vaddr = (uintptr_t)&replacement_phdr_table - exe->load_bias,
+		.p_paddr = (uintptr_t)&replacement_phdr_table - exe->load_bias,
+		.p_memsz = (exe->phdr_count + 1) * sizeof(ElfW(Phdr)),
+		.p_filesz = (exe->phdr_count + 1) * sizeof(ElfW(Phdr)),
+		.p_flags = PF_R,
+		.p_align = alignof(ElfW(Phdr)),
+	};
+
+	ElfW(Phdr*) cur = replacement_phdr_table;
+	if (exe->phdr[0].p_type != PT_PHDR) {
+		// ld.bfd does not insert a PT_PHDR if there is no PT_INTERP, so fake one.
+		// It has to be first.  We're adding an entry, so increase memsz and filesz.
+		newPhdr.p_memsz += sizeof(ElfW(Phdr));
+		newPhdr.p_filesz += sizeof(ElfW(Phdr));
+		*cur = newPhdr;
+		++cur;
+	}
+
+	for (size_t i = 0; i < exe->phdr_count; ++i) {
+		switch (exe->phdr[i].p_type) {
+		case 0:
+			fatal("unexpected null phdr entry at index %zu", i);
+			break;
+		case PT_PHDR:
+			*cur = newPhdr;
+			break;
+		default:
+			*cur = exe->phdr[i];
+		}
+		++cur;
+	}
+
+	// Insert PT_INTERP at the end.
+	cur->p_type = PT_INTERP;
+	cur->p_offset = 0;
+	cur->p_vaddr = (uintptr_t)&replacement_interp - exe->load_bias;
+	cur->p_paddr = cur->p_vaddr;
+	cur->p_filesz = ri_strlen(loader_realpath) + 1;
+	cur->p_memsz = ri_strlen(loader_realpath) + 1;
+	cur->p_flags = PF_R;
+	cur->p_align = 1;
+	++cur;
+
+	ri_strcpy(replacement_interp, loader_realpath);
+
+	debug("new phdr      = %p", (void*)&replacement_phdr_table);
+	debug("new phnum     = %zu", cur - replacement_phdr_table);
+
+	// Update the aux vector with the new phdr+phnum.
+	for (size_t i = 0; i < args->auxv_count; ++i) {
+		if (args->auxv[i].key == AT_PHDR) {
+			args->auxv[i].value = (unsigned long)&replacement_phdr_table;
+		} else if (args->auxv[i].key == AT_PHNUM) {
+			args->auxv[i].value = cur - replacement_phdr_table;
+		}
+	}
+
+	// AT_PHDR and AT_PHNUM are now updated to point to the replacement program
+	// headers, but e_phoff and e_phnum in the ELF header still describe the
+	// original program headers.  dynlink.c doesn't use the e_phoff value from
+	// the main application's ELF header.  The e_phoff and e_phnum values could
+	// be updated, but that would require using mprotect to allow modifications
+	// to the read-only first page.
+}
+
+static void realpath_fd(int fd, const char* orig_path, char* out, size_t len) {
+	char path[64];
+	ul_to_str(fd, path + ri_strlen(ri_strcpy(path, "/proc/self/fd/")), 10);
+	ssize_t result = ri_readlink(path, out, len);
+	if (result == -1) fatal("could not get realpath of %s: %s", orig_path, ri_strerror(g_errno));
+	if ((size_t)result >= len) fatal("realpath of %s too long", orig_path);
+	out[result] = '\0';  // readlink() does not NUL-terminate its output
+}
+
+static int open_loader(const char* path, OpenedLoader* loader) {
+	debug("trying to open '%s'", path);
+	loader->fd = ri_open(path, O_RDONLY, 0);
+	if (loader->fd < 0) {
+		debug("could not open loader %s: %s", path, ri_strerror(g_errno));
+		return -1;
+	}
+
+	realpath_fd(loader->fd, path, loader->path, sizeof(loader->path));
+
+	return 0;
+}
+
+static int open_rel_loader(const char* dir, const char* rel, OpenedLoader* loader) {
+	char buf[PATH_MAX];
+
+	size_t dir_len = ri_strlen(dir);
+
+	if (dir_len + (dir_len == 0 ? 1 : 0) + ri_strlen(rel) + 2 > sizeof(buf)) {
+		debug("path to loader exceeds PATH_MAX: %s/%s", dir, rel);
+		return -1;
+	}
+
+	if (dir_len == 0) {
+		ri_strcpy(buf, "./");
+	} else {
+		ri_strcpy(buf, dir);
+		if (dir[dir_len-1] != '/') {
+			ri_strcat(buf, "/");
+		}
+	}
+	ri_strcat(buf, rel);
+
+	return open_loader(buf, loader);
+}
+
+static void get_origin(char* buf, size_t buf_len) {
+	ssize_t len = ri_readlink("/proc/self/exe", buf, buf_len);
+	if (len <= 0 || (size_t)len >= buf_len) {
+		fatal("could not readlink /proc/self/exe: %s", ri_strerror(g_errno));
+	}
+	buf[len] = '\0';
+
+	ri_dirname(buf);
+}
+
+static int search_path_list_for_loader(const char* loader_rel_path, const char* search_path,
+                                       const char* search_path_name, bool expand_origin, OpenedLoader *loader) {
+	char origin_buf[PATH_MAX];
+	char* origin = NULL;
+
+	const char* p = search_path;
+	while (p && p[0]) {
+		const char* start = p;
+		const char* end = ri_strchr(p, ':');
+		if (end == NULL) {
+			end = start + ri_strlen(p);
+			p = NULL;
+		} else {
+			p = end + 1;
+		}
+		size_t n = end - start;
+		char search_path_entry[PATH_MAX];
+		if (n >= sizeof(search_path_entry)) {
+			// Too long, skip.
+			debug("%s entry too long: %s", search_path_name, start);
+			continue;
+		}
+
+		ri_memcpy(search_path_entry, start, n);
+		search_path_entry[n] = '\0';
+
+		char buf[PATH_MAX];
+		char* d = NULL;
+		if (expand_origin) {
+			d = ri_strchr(search_path_entry, '$');
+		}
+		if (d && (!ri_strncmp(d, "$ORIGIN", 7) || !ri_strncmp(d, "${ORIGIN}", 9))) {
+			if (!origin) {
+				get_origin(origin_buf, sizeof(origin_buf));
+				origin = origin_buf;
+			}
+
+			size_t s = 7;
+			if (d[1] == '{') {
+				s += 2;
+			}
+			ri_memcpy(buf, search_path_entry, d - search_path_entry);
+			buf[d - search_path_entry] = '\0';
+			if (ri_strlen(buf) + ri_strlen(origin) + ri_strlen(d+s) >= sizeof(buf)) {
+				debug("path to loader %s%s%s too long", buf, origin, d+s);
+				continue;
+			}
+
+			ri_strcat(buf, origin);
+			ri_strcat(buf, d+s);
+			debug("trying loader %s at %s", loader_rel_path, buf);
+		} else {
+			ri_strcpy(buf, search_path_entry);
+		}
+		if (!open_rel_loader(buf, loader_rel_path, loader)) {
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
+static int find_and_open_loader(const ExeInfo* exe, const char* ld_library_path, OpenedLoader* loader) {
+	const char* loader_rel_path = LOADER_PATH;
+
+	if (loader_rel_path[0] == '/') {
+		return open_loader(loader_rel_path, loader);
+	}
+
+	if (exe->secure) {
+		fatal("relinterp not supported for secure executables");
+	}
+
+	if (!search_path_list_for_loader(loader_rel_path, ld_library_path, "LD_LIBRARY_PATH", false, loader)) {
+		return 0;
+	}
+
+	if (!exe->search_paths || ri_strlen(exe->search_paths) == 0) {
+		// If there is no DT_RUNPATH or DT_RPATH, search relative to the exe.
+		char origin[PATH_MAX];
+		get_origin(origin, sizeof(origin));
+		return open_rel_loader(origin, loader_rel_path, loader);
+	}
+
+	if (!search_path_list_for_loader(loader_rel_path, exe->search_paths, "rpath", true, loader)) {
+		return 0;
+	}
+
+	fatal("unable to find loader %s in rpath %s", loader_rel_path, exe->search_paths);
+}
+
+// Use a trick to determine whether the executable has been relocated yet. This variable points to
+// a variable in libc. It will be NULL if and only if the program hasn't been linked yet. This
+// should accommodate these situations:
+//  - The program was actually statically-linked instead.
+//  - Either a PIE or non-PIE dynamic executable.
+//  - Any situation where the loader calls the executable's _start:
+//     - In normal operation, the kernel calls the executable's _start, _start jumps to the loader's
+//       entry point, which jumps to _start again after linking it.
+//     - The executable actually has its PT_INTERP set after all.
+//     - The user runs the loader, passing it the path of the executable.
+// This C file must always be compiled as PIC, or else the linker will use a COPY relocation and
+// duplicate "environ" into the executable.
+static bool is_exe_relocated(void) {
+	// Use the GOT to get the address of environ.
+	extern char** environ;
+	void* read_environ = optimizer_barrier(&environ);
+	debug("read_environ = %p", read_environ);
+	return read_environ != NULL;
+}
+
+void _start_c(long* raw_args) {
+	const KernelArguments args = read_args(raw_args);
+	const char* ld_library_path = NULL;
+
+	for (size_t i = 0; i < args.envp_count; ++i) {
+		if (!ri_strcmp(args.envp[i], "RELINTERP_DEBUG=1")) {
+			g_debug = true;
+		}
+		if (!ri_strncmp(args.envp[i], "LD_LIBRARY_PATH=", 16)) {
+			ld_library_path = args.envp[i] + 16;
+		}
+	}
+	if (args.argc >= 1) {
+		g_prog_name = args.argv[0];
+	}
+
+	if (is_exe_relocated()) {
+		debug("exe is already relocated, starting main executable");
+		int argc = raw_args[0];
+		char **argv = (void *)(raw_args+1);
+		__libc_start_main(main, argc, argv, _init, _fini, 0);
+	}
+
+	debug("entering relinterp");
+
+	const ExeInfo exe = get_exe_info(&args);
+	g_page_size = exe.page_size;
+
+	OpenedLoader loader;
+	if (find_and_open_loader(&exe, ld_library_path, &loader)) {
+		fatal("failed to open loader");
+	}
+	off_t len = ri_lseek(loader.fd, 0, SEEK_END);
+	if (len == (off_t)-1) fatal("lseek on %s failed: %s", loader.path, ri_strerror(g_errno));
+
+	void* loader_data = ri_mmap(NULL, len, PROT_READ, MAP_PRIVATE, loader.fd, 0);
+	if (loader_data == (void*)MAP_FAILED) {
+		fatal("could not mmap %s: %s", loader.path, ri_strerror(g_errno));
+	}
+
+	LoadedInterp interp = load_interp(&loader, (ElfW(Ehdr)*)loader_data);
+	if (ri_munmap(loader_data, len) != 0) fatal("munmap failed: %s", ri_strerror(g_errno));
+
+	debug("original auxv:");
+	dump_auxv(&args);
+
+	// Create a virtual phdr table that includes PT_INTERP, for the benefit of loaders that read the
+	// executable PT_INTERP.
+	insert_pt_interp_into_phdr_table(&args, &exe, loader.path);
+	ri_close(loader.fd);
+
+	// TODO: /proc/pid/auxv isn't updated with the new auxv vector. Is it possible to update it?
+	// XXX: If we try to update it, we'd use prctl(PR_SET_MM, PR_SET_MM_AUXV, &vec, size, 0)
+	// Maybe updating it would be useful as a way to communicate the loader's base to a debugger.
+	// e.g. lldb uses AT_BASE in the aux vector, but it caches the values at process startup, so
+	// it wouldn't currently notice a changed value.
+
+	// The loader uses AT_BASE to locate itself, so search for the entry and update it. Although it is
+	// zero without PT_INTERP, the kernel still includes the entry[0]. If this changes (or we want
+	// to make weaker assumptions about the kernel's behavior), then we can copy the kernel arguments
+	// onto the stack (e.g. using alloca) before jumping to the loader's entry point.
+	// [0] https://github.com/torvalds/linux/blob/v5.13/fs/binfmt_elf.c#L263
+	for (size_t i = 0; i < args.auxv_count; ++i) {
+		if (args.auxv[i].key == AT_BASE) {
+			args.auxv[i].value = (unsigned long)interp.base_addr;
+			debug("new auxv:");
+			dump_auxv(&args);
+			debug("transferring to real loader");
+			CRTJMP(interp.entry, raw_args);
+		}
+	}
+	fatal("AT_BASE not found in aux vector");
+}
+
+
+// Normally gdb and lldb look for a symbol named "_dl_debug_state" in the
+// interpreter to get notified when the dynamic loader has modified the
+// list of shared libraries.  When using relinterp, the debugger is not
+// aware of the interpreter (PT_INTERP is unset and auxv AT_BASE is 0) so it
+// doesn't know where to look for the symbol.  It falls back to looking in the
+// executable, so provide a symbol for it to find.  The dynamic loader will
+// need to forward its calls to its own _dl_debug_state symbol to this one.
+//
+// This has to be defined in a .c file because lldb looks for a symbol with
+// DWARF language type DW_LANG_C.
+extern void _dl_debug_state() {
+}
diff --git a/ldso/dynlink.c b/ldso/dynlink.c
index 03f5fd59..526a3971 100644
--- a/ldso/dynlink.c
+++ b/ldso/dynlink.c
@@ -150,6 +150,7 @@ static struct fdpic_loadmap *app_loadmap;
 static struct fdpic_dummy_loadmap app_dummy_loadmap;
 
 struct debug *_dl_debug_addr = &debug;
+static void (*exe_dl_debug_state)(void) = 0;
 
 extern hidden int __malloc_replaced;
 
@@ -1582,6 +1583,8 @@ void __libc_start_init(void)
 
 static void dl_debug_state(void)
 {
+	if (exe_dl_debug_state)
+		exe_dl_debug_state();
 }
 
 weak_alias(dl_debug_state, _dl_debug_state);
@@ -2007,6 +2010,12 @@ void __dls3(size_t *sp, size_t *auxv)
 	if (find_sym(head, "aligned_alloc", 1).dso != &ldso)
 		__aligned_alloc_replaced = 1;
 
+	/* Determine if another DSO is providing the _dl_debug_state symbol
+	 * and forward calls to it. */
+	struct symdef debug_sym = find_sym(head, "_dl_debug_state", 1);
+	if (debug_sym.dso != &ldso)
+		exe_dl_debug_state = (void (*)(void))laddr(debug_sym.dso, debug_sym.sym->st_value);
+
 	/* Switch to runtime mode: any further failures in the dynamic
 	 * linker are a reportable failure rather than a fatal startup
 	 * error. */
diff --git a/tools/ld.musl-clang.in b/tools/ld.musl-clang.in
index 93763d6b..aec1c6ac 100644
--- a/tools/ld.musl-clang.in
+++ b/tools/ld.musl-clang.in
@@ -3,10 +3,19 @@ cc="@CC@"
 libc_lib="@LIBDIR@"
 ldso="@LDSO@"
 cleared=
+relinterp=
 shared=
 userlinkdir=
 userlink=
 
+for x ; do
+    case "$x" in
+	-relinterp)
+	    relinterp=1
+	    ;;
+    esac
+done
+
 for x ; do
     test "$cleared" || set -- ; cleared=1
 
@@ -29,6 +38,13 @@ for x ; do
         crtbegin*.o|crtend*.o)
             set -- "$@" $($cc -print-file-name=$x)
             ;;
+	*/Scrt1.o|*/crt1.o)
+            if test $relinterp; then
+                set -- "$@" $libc_lib/relinterp.o
+            else
+                set -- "$@" "$x"
+            fi
+            ;;
         -lgcc|-lgcc_eh)
             file=lib${x#-l}.a
             set -- "$@" $($cc -print-file-name=$file)
@@ -36,6 +52,8 @@ for x ; do
         -l*)
             test "$userlink" && set -- "$@" "$x"
             ;;
+	-relinterp)
+	    ;;
         -shared)
             shared=1
             set -- "$@" -shared
@@ -48,4 +66,10 @@ for x ; do
     esac
 done
 
-exec $($cc -print-prog-name=ld) -nostdlib "$@" -lc -dynamic-linker "$ldso"
+if test $relinterp; then
+    dynamic_linker_flags="-no-dynamic-linker"
+else
+    dynamic_linker_flags="-dynamic-linker $ldso"
+fi
+
+exec $($cc -print-prog-name=ld) -nostdlib "$@" -lc $dynamic_linker_flags
diff --git a/tools/musl-clang.in b/tools/musl-clang.in
index 623de6f6..e71283ee 100644
--- a/tools/musl-clang.in
+++ b/tools/musl-clang.in
@@ -8,10 +8,22 @@ thisdir="`cd "$(dirname "$0")"; pwd`"
 # prevent clang from running the linker (and erroring) on no input.
 sflags=
 eflags=
+
+cleared=
+
 for x ; do
+    test "$cleared" || set -- ; cleared=1
     case "$x" in
-        -l*) input=1 ;;
-        *) input= ;;
+        -l*)
+	    input=1
+	    set -- "$@" "$x"
+	    ;;
+	-relinterp)
+	    relinterp_flags=-Wl,-relinterp ;;
+        *)
+	    input=
+	    set -- "$@" "$x"
+	    ;;
     esac
     if test "$input" ; then
         sflags="-l-user-start"
@@ -21,6 +33,7 @@ for x ; do
 done
 
 exec $cc \
+    $relinterp_flags \
     -B"$thisdir" \
     -fuse-ld=musl-clang \
     -static-libgcc \
diff --git a/tools/musl-gcc.specs.sh b/tools/musl-gcc.specs.sh
index 30492574..095273e0 100644
--- a/tools/musl-gcc.specs.sh
+++ b/tools/musl-gcc.specs.sh
@@ -17,13 +17,13 @@ cat <<EOF
 libgcc.a%s %:if-exists(libgcc_eh.a%s)
 
 *startfile:
-%{!shared: $libdir/Scrt1.o} $libdir/crti.o crtbeginS.o%s
+%{!shared: %{relinterp: $libdir/relinterp.o} %{!relinterp: $libdir/Scrt1.o}} $libdir/crti.o crtbeginS.o%s
 
 *endfile:
 crtendS.o%s $libdir/crtn.o
 
 *link:
--dynamic-linker $ldso -nostdlib %{shared:-shared} %{static:-static} %{rdynamic:-export-dynamic}
+%{relinterp: -no-dynamic-linker} %{!relinterp: -dynamic-linker $ldso} -nostdlib %{shared:-shared} %{static:-static} %{rdynamic:-export-dynamic}
 
 *esp_link:
 
-- 
2.37.3.998.g577e59143f-goog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-09-26 22:42         ` Colin Cross
@ 2022-09-26 23:02           ` Rich Felker
  2022-09-26 23:12             ` Colin Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Rich Felker @ 2022-09-26 23:02 UTC (permalink / raw)
  To: Colin Cross; +Cc: musl, Ryan Prichard

On Mon, Sep 26, 2022 at 03:42:01PM -0700, Colin Cross wrote:
> On Mon, Sep 26, 2022 at 3:38 PM Colin Cross <ccross@android.com> wrote:
> >
> > On Tue, Aug 23, 2022 at 1:18 AM Szabolcs Nagy <nsz@port70.net> wrote:
> > >
> > > * Colin Cross <ccross@android.com> [2022-08-22 17:22:06 -0700]:
> > > > On Sat, Aug 20, 2022 at 2:43 AM Szabolcs Nagy <nsz@port70.net> wrote:
> > > > > i would not use Scrt1.o though, the same toolchain should be
> > > > > usable for normal linking and relinterp linking, just use a
> > > > > different name like Xcrt1.o.
> > > >
> > > > Is there some way to get gcc/clang to use Xcrt1.o without using
> > > > -nostdlib and passing all the crtbegin/end objects manually?
> > >
> > > this requires compiler changes (new cmdline flag) but then i think
> > > the code is upstreamable.
> >
> > I've used relinterp.o for now, and selected it instead of Scrt1.o in
> > musl-gcc.specs and ld.musl-clang.
> >
> > >
> > > > > i would make Xcrt1.o self-contained and size optimized: it only
> > > > > runs at start up, this is a different requirement from the -O3
> > > > > build of normal string functions. and then there is no dependency
> > > > > on libc internals (which may have various instrumentations that
> > > > > does not work in Xcrt1.o).
> > > >
> > > > Doesn't this same logic apply to most of the code in dynlink.c?  My
> > > > main worry with a self contained implementation is that it requires
> > > > reimplementations of various string functions that are easy to get
> > > > wrong.  The current prototype reuses the C versions of musl's string
> > > > functions, but implements its own syscall wrappers to avoid
> > > > interactions with musl internals like errno.
> > >
> > > dynlink is in libc.so so it can use code from there.
> > >
> > > but moving libc code into the executable has different constraints.
> > > so you will have to make random decisions that string functions are
> > > in but errno is out, wrt which libc internal makes sense in the exe.
> > >
> > > i would just keep a separate implementation (or at least compile
> > > the code separately). string functions are easy to implement if
> > > you dont try to optimize them imo. then you have full control over
> > > what is going on in the exe entry code.
> >
> > I left the reimplementations of string functions and syscalls as
> > suggested.  Patch attached.

> From 0df460188b95f79272003bd0e5c12bceb2a3c25f Mon Sep 17 00:00:00 2001
> From: Colin Cross <ccross@android.com>
> Date: Thu, 22 Sep 2022 19:14:01 -0700
> Subject: [PATCH] Add entry point to find dynamic loader relative to the
>  executable
> 
> Distributing binaries built against musl to systems that don't already
> have musl is problematic due to the hardcoded absolute path to the
> dynamic loader (e.g. /lib/ld-musl-$ARCH.so.1) in the PT_INTERP header.
> This patch adds a feature to avoid the problem by leaving out PT_INTERP
> and replacing Scrt1.o with an entry point that can find the dynamic
> loader using DT_RUNPATH or LD_LIBRARY_PATH.
> 
> The entry point is in crt/relinterp.c.  It uses auxval to get the
> program headers and find the load address of the binary, then
> searches LD_LIBRARY_PATH or DT_RUNPATH for the dynamic loader.
> Once found, it mmaps the loader similar to the way the kernel
> does when PT_INTERP is set.  The musl loader uses PT_INTERP to set
> the path to the loader in the shared library info exported to the
> debugger, so relinterp creates a copy of the program headers
> with the PT_INTERP entry added pointing to the found location of
> the dynamic loader.  It updates AT_BASE to point to the address
> of the dynamic loader, then jumps to the loader's entry point.
> 
> The dynamic loader then loads shared libraries and handles
> relocations before jumping to the executable's entry point, which is
> the entry point in relinterp.c again.  Relinterp detects that
> relocations have been performed and calls __libc_start_main, the
> same way Scrt1.o would have.
> 
> Since relinterp runs before relocations have been performed it has
> to avoid referencing any libc functions.  That means reimplementing
> the few syscalls and string functions that it uses, and avoiding
> implicit calls to memcpy and memset that may be inserted by the
> compiler.
> 
> Enabling relinterp is handled in the spec file for gcc and in
> the linker script for clang via a -relinterp argument.
> 
> Normally gdb and lldb look for a symbol named "_dl_debug_state" in
> the interpreter to get notified when the dynamic loader has modified
> the list of shared libraries.  When using relinterp the debugger is
> not aware of the interpreter (at process launch PT_INTERP is unset
> and auxv AT_BASE is 0) so it doesn't know where to look for the symbol.
> 
> They fall back to looking in the executable, so we can provide a symbol
> in relinterp.c for them to find.  The dynamic loader is then modified
> to also find the symbol in the executable and to call it from its own
> _dl_debug_state function.
> 
> The same tests in libc_test pass with or without LDFLAGS += -relinterp
> with both musl-gcc and musl-clang.
> 
> Ryan Prichard (rprichard@google.com) authored the original prototype
> of relinterp.

Have you looked at https://www.openwall.com/lists/musl/2020/03/29/9
where this has already been done? It's not upstream but my
understanding is that the author has been using it successfully for a
long time, and it's been through some review and as I recall was at
least close to acceptable for upstream.

Rich
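
As a rough illustration of the search step in the quoted patch
description (walking LD_LIBRARY_PATH or DT_RUNPATH without calling
into libc), a minimal sketch might look like the following.  This is
not code from relinterp.c or dcrt1.c; every name is made up for the
example, and skipping empty list entries is a simplification.

```c
#include <stddef.h>

/* Hypothetical helpers; none of these names come from the patch. */

/* Byte-at-a-time strlen, so no libc symbol is referenced. */
static size_t crt_strlen(const char *s)
{
	size_t n = 0;
	while (s[n]) n++;
	return n;
}

/* Length of the list entry starting at list[pos]: bytes up to the
 * next ':' or the terminating NUL. */
static size_t crt_entry_len(const char *list, size_t pos)
{
	size_t n = 0;
	while (list[pos + n] && list[pos + n] != ':') n++;
	return n;
}

/* Write "<dir>/<name>" into out.  Returns the string length, or 0 if
 * the result (including the NUL) does not fit in outsz bytes. */
static size_t crt_join(char *out, size_t outsz,
                       const char *dir, size_t dirlen, const char *name)
{
	size_t namelen = crt_strlen(name), i, k = 0;
	if (dirlen + 1 + namelen + 1 > outsz) return 0;
	for (i = 0; i < dirlen; i++) out[k++] = dir[i];
	out[k++] = '/';
	for (i = 0; i < namelen; i++) out[k++] = name[i];
	out[k] = 0;
	return k;
}

/* Build the idx-th candidate path from a colon-separated list.
 * Empty entries are skipped (a simplification for this sketch).
 * Returns the length written, or 0 when the list is exhausted or
 * the candidate does not fit. */
static size_t crt_candidate(const char *list, const char *name,
                            size_t idx, char *out, size_t outsz)
{
	size_t pos = 0;
	if (!list) return 0;
	for (;;) {
		size_t len = crt_entry_len(list, pos);
		if (len) {
			if (!idx) return crt_join(out, outsz, list + pos, len, name);
			idx--;
		}
		if (!list[pos + len]) return 0;
		pos += len + 1;
	}
}
```

A caller would try idx = 0, 1, 2, ... and attempt to open each
candidate with a raw syscall until one succeeds.  Note that a compiler
may still turn byte loops like these into memcpy calls, which is the
problem the description mentions; a flag such as GCC's
-fno-tree-loop-distribute-patterns may help, but that is something to
verify per toolchain.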

* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-09-26 23:02           ` Rich Felker
@ 2022-09-26 23:12             ` Colin Cross
  2022-09-27  4:33               ` Colin Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Colin Cross @ 2022-09-26 23:12 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl, Ryan Prichard

On Mon, Sep 26, 2022 at 4:02 PM Rich Felker <dalias@libc.org> wrote:
>
> On Mon, Sep 26, 2022 at 03:42:01PM -0700, Colin Cross wrote:
> > On Mon, Sep 26, 2022 at 3:38 PM Colin Cross <ccross@android.com> wrote:
> > >
> > > On Tue, Aug 23, 2022 at 1:18 AM Szabolcs Nagy <nsz@port70.net> wrote:
> > > >
> > > > * Colin Cross <ccross@android.com> [2022-08-22 17:22:06 -0700]:
> > > > > On Sat, Aug 20, 2022 at 2:43 AM Szabolcs Nagy <nsz@port70.net> wrote:
> > > > > > i would not use Scrt1.o though, the same toolchain should be
> > > > > > usable for normal linking and relinterp linking, just use a
> > > > > > different name like Xcrt1.o.
> > > > >
> > > > > Is there some way to get gcc/clang to use Xcrt1.o without using
> > > > > -nostdlib and passing all the crtbegin/end objects manually?
> > > >
> > > > this requires compiler changes (new cmdline flag) but then i think
> > > > the code is upstreamable.
> > >
> > > > I've used relinterp.o for now, and selected it instead of Scrt1.o in
> > > musl-gcc.specs and ld.musl-clang.
> > >
> > > >
> > > > > > i would make Xcrt1.o self-contained and size optimized: it only
> > > > > > runs at start up, this is a different requirement from the -O3
> > > > > > build of normal string functions. and then there is no dependency
> > > > > > on libc internals (which may have various instrumentations that
> > > > > > does not work in Xcrt1.o).
> > > > >
> > > > > Doesn't this same logic apply to most of the code in dynlink.c?  My
> > > > > main worry with a self contained implementation is that it requires
> > > > > reimplementations of various string functions that are easy to get
> > > > > wrong.  The current prototype reuses the C versions of musl's string
> > > > > functions, but implements its own syscall wrappers to avoid
> > > > > interactions with musl internals like errno.
> > > >
> > > > dynlink is in libc.so so it can use code from there.
> > > >
> > > > but moving libc code into the executable has different constraints.
> > > > so you will have to make random decisions that string functions are
> > > > in but errno is out, wrt which libc internal makes sense in the exe.
> > > >
> > > > i would just keep a separate implementation (or at least compile
> > > > the code separately). string functions are easy to implement if
> > > > you dont try to optimize them imo. then you have full control over
> > > > what is going on in the exe entry code.
> > >
> > > I left the reimplementations of string functions and syscalls as
> > > suggested.  Patch attached.
>
> > From 0df460188b95f79272003bd0e5c12bceb2a3c25f Mon Sep 17 00:00:00 2001
> > From: Colin Cross <ccross@android.com>
> > Date: Thu, 22 Sep 2022 19:14:01 -0700
> > Subject: [PATCH] Add entry point to find dynamic loader relative to the
> >  executable
> >
> > Distributing binaries built against musl to systems that don't already
> > have musl is problematic due to the hardcoded absolute path to the
> > dynamic loader (e.g. /lib/ld-musl-$ARCH.so.1) in the PT_INTERP header.
> > This patch adds a feature to avoid the problem by leaving out PT_INTERP
> > and replacing Scrt1.o with an entry point that can find the dynamic
> > loader using DT_RUNPATH or LD_LIBRARY_PATH.
> >
> > The entry point is in crt/relinterp.c.  It uses auxval to get the
> > program headers and find the load address of the binary, then
> > searches LD_LIBRARY_PATH or DT_RUNPATH for the dynamic loader.
> > Once found, it mmaps the loader similar to the way the kernel
> > does when PT_INTERP is set.  The musl loader uses PT_INTERP to set
> > the path to the loader in the shared library info exported to the
> > debugger, so relinterp creates a copy of the program headers
> > with the PT_INTERP entry added pointing to the found location of
> > the dynamic loader.  It updates AT_BASE to point to the address
> > of the dynamic loader, then jumps to the loader's entry point.
> >
> > The dynamic loader then loads shared libraries and handles
> > relocations before jumping to the executable's entry point, which is
> > the entry point in relinterp.c again.  Relinterp detects that
> > relocations have been performed and calls __libc_start_main, the
> > same way Scrt1.o would have.
> >
> > Since relinterp runs before relocations have been performed it has
> > to avoid referencing any libc functions.  That means reimplementing
> > the few syscalls and string functions that it uses, and avoiding
> > implicit calls to memcpy and memset that may be inserted by the
> > compiler.
> >
> > Enabling relinterp is handled in the spec file for gcc and in
> > the linker script for clang via a -relinterp argument.
> >
> > Normally gdb and lldb look for a symbol named "_dl_debug_state" in
> > the interpreter to get notified when the dynamic loader has modified
> > the list of shared libraries.  When using relinterp the debugger is
> > not aware of the interpreter (at process launch PT_INTERP is unset
> > and auxv AT_BASE is 0) so it doesn't know where to look for the symbol.
> >
> > They fall back to looking in the executable, so we can provide a symbol
> > in relinterp.c for them to find.  The dynamic loader is then modified
> > to also find the symbol in the executable and to call it from its own
> > _dl_debug_state function.
> >
> > The same tests in libc_test pass with or without LDFLAGS += -relinterp
> > with both musl-gcc and musl-clang.
> >
> > Ryan Prichard (rprichard@google.com) authored the original prototype
> > of relinterp.
>
> Have you looked at https://www.openwall.com/lists/musl/2020/03/29/9
> where this has already been done? It's not upstream but my
> understanding is that the author has been using it successfully for a
> long time, and it's been through some review and as I recall was at
> least close to acceptable for upstream.
>
> Rich

No, I had not seen that, and it looks to have identical functionality.
I'll experiment with switching to it.
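
For context on the auxv handling that the quoted description relies
on (reading AT_PHDR, and rewriting AT_BASE before jumping to the
loader), here is a small sketch of walking the auxiliary vector from
the initial stack pointer.  It is an illustration only, not code from
relinterp.c or dcrt1.c: the function and macro names are invented,
while the tag values are the standard Linux ones from <elf.h>.

```c
#include <stddef.h>

/* Standard Linux auxiliary-vector tag values (see <elf.h>); the
 * AUX_* spellings are local to this sketch. */
#define AUX_NULL  0   /* AT_NULL: end of vector */
#define AUX_PHDR  3   /* AT_PHDR: address of program headers */
#define AUX_BASE  7   /* AT_BASE: interpreter load address */
#define AUX_ENTRY 9   /* AT_ENTRY: executable entry point */

/* Locate auxv from the initial stack pointer.  Layout assumed:
 * sp[0] = argc, then argv[0..argc-1], NULL, envp..., NULL, auxv. */
static size_t *aux_from_stack(size_t *sp)
{
	size_t i = sp[0] + 2;   /* skip argc, argv entries, argv's NULL */
	while (sp[i]) i++;      /* skip environment pointers */
	return sp + i + 1;      /* step over envp's NULL */
}

/* Return a pointer to the value slot for tag, or a null pointer if
 * the tag is absent.  auxv is a list of (tag, value) pairs ending
 * with AT_NULL. */
static size_t *aux_find(size_t *auxv, size_t tag)
{
	for (; auxv[0] != AUX_NULL; auxv += 2)
		if (auxv[0] == tag) return &auxv[1];
	return 0;
}
```

Returning a pointer to the slot rather than the value makes the
"update AT_BASE" step a plain store, e.g.
*aux_find(auxv, AUX_BASE) = loader_base, where loader_base is a
hypothetical variable holding the address the loader was mapped at.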

* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-09-26 23:12             ` Colin Cross
@ 2022-09-27  4:33               ` Colin Cross
  2022-09-27  5:09                 ` Ridley Combs
  0 siblings, 1 reply; 10+ messages in thread
From: Colin Cross @ 2022-09-27  4:33 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl, Ryan Prichard, Rodger Combs

[-- Attachment #1: Type: text/plain, Size: 6425 bytes --]

On Mon, Sep 26, 2022 at 4:12 PM Colin Cross <ccross@android.com> wrote:
>
> No, I had not seen that, and it looks to have identical functionality.
> I'll experiment with switching to it.

dcrt1 seems to be a perfect drop-in replacement for this.  Quick
testing shows it works for my needs on x86_64, x86, aarch64 and arm
architectures.  I've attached a minor patch that fixes some unused
variable and label warnings.

[-- Attachment #2: 0001-Fix-unused-variable-and-label-warnings-in-dcrt1.c.patch --]
[-- Type: text/x-patch, Size: 1607 bytes --]

From ef6dad5b158a0a7acbed6b6979964cdc228bc79f Mon Sep 17 00:00:00 2001
From: Colin Cross <ccross@android.com>
Date: Mon, 26 Sep 2022 21:26:20 -0700
Subject: [PATCH] Fix unused variable and label warnings in dcrt1.c

Change-Id: I54add45f0525046c72e6291392571475d3164089
---
 crt/dcrt1.c    | 5 -----
 ldso/dlstart.c | 2 ++
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/crt/dcrt1.c b/crt/dcrt1.c
index 5e844048..eb1d6652 100644
--- a/crt/dcrt1.c
+++ b/crt/dcrt1.c
@@ -116,9 +116,6 @@ static char *crt_getenv(const char *name, char **environ)
 
 static inline void *map_library(int fd)
 {
-	void *allocated_buf=0;
-	size_t allocated_buf_size;
-	size_t phsize;
 	size_t addr_min=SIZE_MAX, addr_max=0;
 	size_t this_min, this_max;
 	off_t off_start = 0;
@@ -219,7 +216,6 @@ static size_t find_linker(char *outbuf, size_t bufsize, const char *this_path, s
 {
 	const char *paths[2]; // envpath, rpath/runpath
 	size_t i;
-	int fd;
 
 	// In the suid/secure case, skip everything and use the fixed path
 	if (secure)
@@ -293,7 +289,6 @@ hidden _Noreturn void __dls2(unsigned char *base, size_t *p)
 	char **argv = (void *)(p+1);
 	int fd;
 	int secure;
-	int prot = PROT_READ;
 	Ehdr *loader_hdr;
 	Phdr *new_hdr;
 	void *entry;
diff --git a/ldso/dlstart.c b/ldso/dlstart.c
index 49e6a992..d8218211 100644
--- a/ldso/dlstart.c
+++ b/ldso/dlstart.c
@@ -163,7 +163,9 @@ hidden void _dlstart_c(size_t *sp, size_t *dynv)
 #endif
 
 	stage2_func dls2;
+#ifdef DL_DNI
 skip_relocs:
+#endif
 	GETFUNCSYM(&dls2, __dls2, base+dyn[DT_PLTGOT]);
 	dls2((void *)base, sp);
 }
-- 
2.37.3.998.g577e59143f-goog
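
The dlstart.c hunk above silences an unused-label warning by emitting
the label only when DL_DNI is defined, presumably because the matching
goto is compiled under the same condition.  A standalone sketch of
that pattern, with an invented FAST_PATH macro and function standing
in for the real code:

```c
/* Hypothetical stand-in for the pattern in the dlstart.c hunk: the
 * label is a goto target only when FAST_PATH is defined, so it is
 * only emitted in that configuration.  Without the #ifdef around the
 * label, -Wall (via -Wunused-label) warns when FAST_PATH is unset. */
static int setup(int already_done)
{
	int steps = 0;
#ifdef FAST_PATH
	if (already_done) goto skip_work;
#else
	(void)already_done;   /* parameter unused in this configuration */
#endif
	steps++;              /* work that the fast path skips */
#ifdef FAST_PATH
skip_work:
#endif
	steps++;              /* work that always runs */
	return steps;
}
```

Built without -DFAST_PATH, setup() always performs both steps and
compiles cleanly under -Wall; with -DFAST_PATH the first step is
skipped when already_done is nonzero.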


* Re: [musl] Running musl executables without a preinstalled dynamic linker
  2022-09-27  4:33               ` Colin Cross
@ 2022-09-27  5:09                 ` Ridley Combs
  0 siblings, 0 replies; 10+ messages in thread
From: Ridley Combs @ 2022-09-27  5:09 UTC (permalink / raw)
  To: musl; +Cc: Rich Felker, Ryan Prichard, Colin Cross


> On Sep 26, 2022, at 23:33, Colin Cross <ccross@android.com> wrote:
> 
> dcrt1 seems to be a perfect drop-in replacement for this.  Quick
> testing shows it works for my needs on x86_64, x86, aarch64 and arm
> architectures.  I've attached a minor patch that fixes some unused
> variable and label warnings.
> <0001-Fix-unused-variable-and-label-warnings-in-dcrt1.c.patch>

Here's the current version of the patch from my fork; it already has the changes you mentioned: https://github.com/rcombs/musl/commit/740155e21f7057a33e75a9ed4cb6fbf07f75d2a7

Also notable, I have in fact tested this against glibc and it works fine with that as well; I'd imagine there's a decent chance it'd also work with bionic.

I haven't kept up with upstreaming it since it's fairly self-contained and unlikely to regularly have substantial conflicts with upstream, so it's far easier to just rebase a fork now and then than to deal with mailing-list crap, but if anybody wants to merge this, go right on ahead; I'll be happy to answer any questions or discuss any improvements on IRC (rcombs on the relevant networks), Discord (rcombs#1111), Twitter (@11rcombs), GitHub (rcombs), or wherever else is convenient. If none of those work, feel free to reach out to me directly at this address and we can figure something out.

I won't be sending any additional emails on this thread (the sender-forging the listserv will do after I send this one alone is already going to subject me to plenty of DMARC rejection notices, thank you very much). Again, happy to help in any way I can, just not via mailing list. I'm making an exception this once since you reached out directly about something I'd worked on, but in general I do my best not to interact with these things unless somebody's paying me to.

--Ridley



Thread overview: 10+ messages
2022-08-15 21:35 [musl] Running musl executables without a preinstalled dynamic linker Colin Cross
2022-08-20  9:43 ` Szabolcs Nagy
2022-08-23  0:22   ` Colin Cross
2022-08-23  8:18     ` Szabolcs Nagy
2022-09-26 22:38       ` Colin Cross
2022-09-26 22:42         ` Colin Cross
2022-09-26 23:02           ` Rich Felker
2022-09-26 23:12             ` Colin Cross
2022-09-27  4:33               ` Colin Cross
2022-09-27  5:09                 ` Ridley Combs
