[-- Attachment #1: Type: text/plain, Size: 556 bytes --] I'm hitting an issue where some test infrastructure is setting LD_LIBRARY_PATH to a list that contains both 32-bit and 64-bit libraries because it is unsure whether the code under test is going to execute 32-bit or 64-bit processes or both. When using musl the dynamic loader takes the first library with a matching name and then fails to load it if it is for the wrong elf class. The attached patch verifies the elf machine and class when searching the path list, continuing the search if a valid elf header with an incorrect machine or class is found. [-- Attachment #2: 0001-Ignore-incorrect-elf-architecture-libraries.patch --] [-- Type: text/x-patch, Size: 5112 bytes --] From 373f0731f8740cd9be4c8d429ee602316be9ee68 Mon Sep 17 00:00:00 2001 From: Colin Cross <ccross@android.com> Date: Tue, 7 Feb 2023 14:35:57 -0800 Subject: [PATCH] Ignore incorrect elf architecture libraries In multilib systems LD_LIBRARY_PATH is sometimes set to a list that contains both the lib32 and lib64 directories, for example when running code that may execute 32-bit or 64-bit subprocesses or both. Musl's dynamic loader aborts searching a path list when it finds any library that is not loadable, preventing searching the other multilib directory. Modify the loader to continue searching when a file is found that is a valid elf file but for the an elf machine or class that differs from that of the loader itself. --- ldso/dynlink.c | 77 +++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 61 insertions(+), 16 deletions(-) diff --git a/ldso/dynlink.c b/ldso/dynlink.c index 24fe47cf..3c2a7179 100644 --- a/ldso/dynlink.c +++ b/ldso/dynlink.c @@ -65,6 +65,8 @@ struct dso { size_t *dynv; struct dso *next, *prev; + int elfmachine; + int elfclass; Phdr *phdr; int phnum; size_t phentsize; @@ -644,6 +646,19 @@ static void unmap_library(struct dso *dso) } } +static int verify_elf_magic(const Ehdr* eh) { + return eh->e_ident[0] == ELFMAG0 && + eh->e_ident[1] == ELFMAG1 && + eh->e_ident[2] == ELFMAG2 && + eh->e_ident[3] == ELFMAG3; +} + +/* Verifies that an elf header's machine and class match the loader */ +static int verify_elf_arch(const Ehdr* eh) { + return eh->e_machine == ldso.elfmachine && + eh->e_ident[EI_CLASS] == ldso.elfclass; +} + static void *map_library(int fd, struct dso *dso) { Ehdr buf[(896+sizeof(Ehdr))/sizeof(Ehdr)]; @@ -666,6 +681,10 @@ static void *map_library(int fd, struct dso *dso) if (l<0) return 0; if (l<sizeof *eh || (eh->e_type != ET_DYN && eh->e_type != ET_EXEC)) goto noexec; + if (!verify_elf_magic(eh)) goto noexec; + if (!verify_elf_arch(eh)) goto noexec; + dso->elfmachine = eh->e_machine; + dso->elfclass = eh->e_ident[EI_CLASS]; phsize = eh->e_phentsize * eh->e_phnum; if (phsize > sizeof buf - sizeof *eh) { allocated_buf = malloc(phsize); @@ -830,29 +849,53 @@ error: return 0; } -static int path_open(const char *name, const char *s, char *buf, size_t buf_size) +static int path_open_library(const char *name, const char *s, char *buf, size_t buf_size) { size_t l; int fd; + const char *p; for (;;) { s += strspn(s, ":\n"); + p = s; l = strcspn(s, ":\n"); if (l-1 >= INT_MAX) return -1; - if (snprintf(buf, buf_size, "%.*s/%s", (int)l, s, name) < buf_size) { - if ((fd = open(buf, O_RDONLY|O_CLOEXEC))>=0) return fd; - switch (errno) { - case ENOENT: - case ENOTDIR: - case EACCES: - case ENAMETOOLONG: - break; - default: - /* Any negative value but -1 will inhibit - * futher path search. */ + s += l; + if (snprintf(buf, buf_size, "%.*s/%s", (int)l, p, name) < buf_size) { + fd = open(buf, O_RDONLY|O_CLOEXEC); + if (fd < 0) { + switch (errno) { + case ENOENT: + case ENOTDIR: + case EACCES: + case ENAMETOOLONG: + /* Keep searching in path list. */ + continue; + default: + /* Any negative value but -1 will + * inhibit further path search in + * load_library. */ + return -2; + } + } + Ehdr eh; + ssize_t n = pread(fd, &eh, sizeof(eh), 0); + /* If the elf file is invalid return -2 to inhibit + * further path search in load_library. */ + if (n < 0 || + n != sizeof eh || + !verify_elf_magic(&eh)) { + close(fd); return -2; } + /* If the elf file has a valid header but is for the + * wrong architecture ignore it and keep searching the + * path list. */ + if (!verify_elf_arch(&eh)) { + close(fd); + continue; + } + return fd; } - s += l; } } @@ -1076,12 +1119,12 @@ static struct dso *load_library(const char *name, struct dso *needed_by) } if (strlen(name) > NAME_MAX) return 0; fd = -1; - if (env_path) fd = path_open(name, env_path, buf, sizeof buf); + if (env_path) fd = path_open_library(name, env_path, buf, sizeof buf); for (p=needed_by; fd == -1 && p; p=p->needed_by) { if (fixup_rpath(p, buf, sizeof buf) < 0) fd = -2; /* Inhibit further search. */ if (p->rpath) - fd = path_open(name, p->rpath, buf, sizeof buf); + fd = path_open_library(name, p->rpath, buf, sizeof buf); } if (fd == -1) { if (!sys_path) { @@ -1120,7 +1163,7 @@ static struct dso *load_library(const char *name, struct dso *needed_by) } } if (!sys_path) sys_path = "/lib:/usr/local/lib:/usr/lib"; - fd = path_open(name, sys_path, buf, sizeof buf); + fd = path_open_library(name, sys_path, buf, sizeof buf); } pathname = buf; } @@ -1694,6 +1737,8 @@ hidden void __dls2(unsigned char *base, size_t *sp) ldso.phnum = ehdr->e_phnum; ldso.phdr = laddr(&ldso, ehdr->e_phoff); ldso.phentsize = ehdr->e_phentsize; + ldso.elfmachine = ehdr->e_machine; + ldso.elfclass = ehdr->e_ident[EI_CLASS]; kernel_mapped_dso(&ldso); decode_dyn(&ldso); -- 2.39.1.519.gcb327c4b5f-goog
On Tue, Feb 07, 2023 at 02:58:37PM -0800, Colin Cross wrote:
> I'm hitting an issue where some test infrastructure is setting
> LD_LIBRARY_PATH to a list that contains both 32-bit and 64-bit
> libraries because it is unsure whether the code under test is going to
> execute 32-bit or 64-bit processes or both. When using musl the
> dynamic loader takes the first library with a matching name and then
> fails to load it if it is for the wrong elf class.
>
> The attached patch verifies the elf machine and class when searching
> the path list, continuing the search if a valid elf header with an
> incorrect machine or class is found.
While it requires some consideration to ensure that this yields safe &
consistent behavior, I think it at least admits that; I haven't
checked the actual code, but conceptually, it should be equivalent to
treating finding a mismatched-arch library as a conclusive result
whose behavior is searching the remainder of the search path.
I'm a little bit skeptical of the motivation though. In general, it's
not safe to just set LD_LIBRARY_PATH and run programs that might
invoke other programs, since the math might contain outdated or
mismatched libraries relative to what those other programs might want.
On a system with multiple libcs present, the libraries found could
even be for the wrong one. Really, LD_LIBRARY_PATH should just be set
for invoking a single program (or family of binaries) that ships with
its own versions of libraries or when you're overriding certain
libraries for it, etc. This is contrary to how the environment works,
and one reason it's probably better to use ldso --library-path=...
rather than LD_LIBRARY_PATH when overriding libraries for a particular
program invocation.
If your goal is to change the default path that gets searched for all
programs, the right place is /etc/ld-musl-$ARCH.path. It's already
per-arch and non-conflicting with a potentially coexisting alternate
libc.
Please don't take any of this as a rejection of the proposed change. I
think it's probably okay, and probably an improvement on the usability
of it. But I do think the apparent need suggests that .path files were
likely a better solution to the problem at hand.
Rich
On 2023-02-08 03:49, Rich Felker wrote:
> On Tue, Feb 07, 2023 at 02:58:37PM -0800, Colin Cross wrote:
>> I'm hitting an issue where some test infrastructure is setting
>> LD_LIBRARY_PATH to a list that contains both 32-bit and 64-bit
>> libraries because it is unsure whether the code under test is going to
>> execute 32-bit or 64-bit processes or both. When using musl the
>> dynamic loader takes the first library with a matching name and then
>> fails to load it if it is for the wrong elf class.
>>
>> The attached patch verifies the elf machine and class when searching
>> the path list, continuing the search if a valid elf header with an
>> incorrect machine or class is found.
>
> While it requires some consideration to ensure that this yields safe &
> consistent behavior, I think it at least admits that; I haven't
> checked the actual code, but conceptually, it should be equivalent to
> treating finding a mismatched-arch library as a conclusive result
> whose behavior is searching the remainder of the search path.
>
> I'm a little bit skeptical of the motivation though. In general, it's
> not safe to just set LD_LIBRARY_PATH and run programs that might
> invoke other programs, since the math might contain outdated or
> mismatched libraries relative to what those other programs might want.
> On a system with multiple libcs present, the libraries found could
> even be for the wrong one. Really, LD_LIBRARY_PATH should just be set
> for invoking a single program (or family of binaries) that ships with
> its own versions of libraries or when you're overriding certain
> libraries for it, etc. This is contrary to how the environment works,
> and one reason it's probably better to use ldso --library-path=...
> rather than LD_LIBRARY_PATH when overriding libraries for a particular
> program invocation.
>
It's sometimes convenient to use LD_PRELOAD in conjunction with
LD_LIBRARY_PATH to override libc functions in a specific process tree
containing binaries for different architectures, e.g. 32-bit/64-bit on
x86 systems or native/foreign (run via qemu-user). I don't know how to
achieve that with current musl other than by overriding
exec*/posix_spawn functions and manually checking the architecture of
the binary (and not forgetting to process "#!" for scripts), so I'd be
glad to see the proposed functionality in musl.
In general, it would still not work on systems with multiple libcs, but
I'd expect the mixed-libc case to be much rarer than the mixed-arch one.
Alexey
On Tue, Feb 7, 2023 at 4:49 PM Rich Felker <dalias@libc.org> wrote: > > On Tue, Feb 07, 2023 at 02:58:37PM -0800, Colin Cross wrote: > > I'm hitting an issue where some test infrastructure is setting > > LD_LIBRARY_PATH to a list that contains both 32-bit and 64-bit > > libraries because it is unsure whether the code under test is going to > > execute 32-bit or 64-bit processes or both. When using musl the > > dynamic loader takes the first library with a matching name and then > > fails to load it if it is for the wrong elf class. > > > > The attached patch verifies the elf machine and class when searching > > the path list, continuing the search if a valid elf header with an > > incorrect machine or class is found. > > While it requires some consideration to ensure that this yields safe & > consistent behavior, I think it at least admits that; I haven't > checked the actual code, but conceptually, it should be equivalent to > treating finding a mismatched-arch library as a conclusive result > whose behavior is searching the remainder of the search path. > > I'm a little bit skeptical of the motivation though. In general, it's > not safe to just set LD_LIBRARY_PATH and run programs that might > invoke other programs, since the math might contain outdated or > mismatched libraries relative to what those other programs might want. > On a system with multiple libcs present, the libraries found could > even be for the wrong one. Really, LD_LIBRARY_PATH should just be set > for invoking a single program (or family of binaries) that ships with > its own versions of libraries or when you're overriding certain > libraries for it, etc. This is contrary to how the environment works, > and one reason it's probably better to use ldso --library-path=... > rather than LD_LIBRARY_PATH when overriding libraries for a particular > program invocation. We intend to distribute binaries built against musl using relinterp (or Xcrt1 if that gets merged), which finds the dynamic loader relative to the binary. These will be used on end user systems that normally use glibc. The testing environment is taking a hermetic bundle of musl and the test binaries built against musl, and are intended to be hermetic so they shouldn't be executing anything that would use a non-musl libc. Normally the binaries use relative DT_RUNPATH entries to find their libraries, but some older branches don't have correct DT_RUNPATH entries for the layout of the tests vs. their libraries, and so the test infrastructure uses LD_LIBRARY_PATH. I'd like to remove the use of LD_LIBRARY_PATH at least in the cases that have correct DT_RUNPATH entries, but this still seemed like a reasonable enhancement to musl and will solve my immediate problems. > If your goal is to change the default path that gets searched for all > programs, the right place is /etc/ld-musl-$ARCH.path. It's already > per-arch and non-conflicting with a potentially coexisting alternate > libc. We need to search for libraries relative to the binary, but if I'm reading the code correctly relative paths in ../etc/ld-musl-$ARCH.path would search relative to PWD. The test infrastructure could write a custom ../etc/ld-musl-$ARCH.path with absolute paths to wherever it happened to install the bundle, but currently it is agnostic to the details of the libc included in the bundle, and LD_LIBRARY_PATH works for both glibc and musl (after this patch). > Please don't take any of this as a rejection of the proposed change. I > think it's probably okay, and probably an improvement on the usability > of it. But I do think the apparent need suggests that .path files were > likely a better solution to the problem at hand. > > Rich
On Tue, Feb 07, 2023 at 09:28:10PM -0800, Colin Cross wrote: > On Tue, Feb 7, 2023 at 4:49 PM Rich Felker <dalias@libc.org> wrote: > > > > On Tue, Feb 07, 2023 at 02:58:37PM -0800, Colin Cross wrote: > > > I'm hitting an issue where some test infrastructure is setting > > > LD_LIBRARY_PATH to a list that contains both 32-bit and 64-bit > > > libraries because it is unsure whether the code under test is going to > > > execute 32-bit or 64-bit processes or both. When using musl the > > > dynamic loader takes the first library with a matching name and then > > > fails to load it if it is for the wrong elf class. > > > > > > The attached patch verifies the elf machine and class when searching > > > the path list, continuing the search if a valid elf header with an > > > incorrect machine or class is found. > > > > While it requires some consideration to ensure that this yields safe & > > consistent behavior, I think it at least admits that; I haven't > > checked the actual code, but conceptually, it should be equivalent to > > treating finding a mismatched-arch library as a conclusive result > > whose behavior is searching the remainder of the search path. > > > > I'm a little bit skeptical of the motivation though. In general, it's > > not safe to just set LD_LIBRARY_PATH and run programs that might > > invoke other programs, since the math might contain outdated or > > mismatched libraries relative to what those other programs might want. > > On a system with multiple libcs present, the libraries found could > > even be for the wrong one. Really, LD_LIBRARY_PATH should just be set > > for invoking a single program (or family of binaries) that ships with > > its own versions of libraries or when you're overriding certain > > libraries for it, etc. This is contrary to how the environment works, > > and one reason it's probably better to use ldso --library-path=... > > rather than LD_LIBRARY_PATH when overriding libraries for a particular > > program invocation. > > We intend to distribute binaries built against musl using relinterp > (or Xcrt1 if that gets merged), which finds the dynamic loader > relative to the binary. These will be used on end user systems that > normally use glibc. The testing environment is taking a hermetic > bundle of musl and the test binaries built against musl, and are > intended to be hermetic so they shouldn't be executing anything that > would use a non-musl libc. Normally the binaries use relative > DT_RUNPATH entries to find their libraries, but some older branches > don't have correct DT_RUNPATH entries for the layout of the tests vs. > their libraries, and so the test infrastructure uses LD_LIBRARY_PATH. > I'd like to remove the use of LD_LIBRARY_PATH at least in the cases > that have correct DT_RUNPATH entries, but this still seemed like a > reasonable enhancement to musl and will solve my immediate problems. OK, this makes a lot of sense. Thanks for explaining! > > If your goal is to change the default path that gets searched for all > > programs, the right place is /etc/ld-musl-$ARCH.path. It's already > > per-arch and non-conflicting with a potentially coexisting alternate > > libc. > > We need to search for libraries relative to the binary, but if I'm > reading the code correctly relative paths in ../etc/ld-musl-$ARCH.path > would search relative to PWD. The test infrastructure could write a > custom ../etc/ld-musl-$ARCH.path with absolute paths to wherever it > happened to install the bundle, but currently it is agnostic to the > details of the libc included in the bundle, and LD_LIBRARY_PATH works > for both glibc and musl (after this patch). OK, that seems to be an important oversight: the inability to have $ORIGIN relative (or, ideally, ldso-relative) library paths in the ld-musl-*.path files. Maybe rpath kinda reduces the need for that, since you can just put the rpaths in the application binaries, but it still seems like it'd be more elegant (and more flexible) to have it in the path files. I think it's also worth asking if there should be different default library path (in the absence of a path file) when ldso is not in /lib. Possible reasonable-sounding behaviors would be no path at all (everything fails without an explicit path file or LD_LIBRARY_PATH or rpath) or having a default path that's just the same directory as ldso resides in. (In some sense, the latter is what we probably should have done all along even for installation in /lib, but having /usr/lib and /usr/local/lib also included was a nice convenience... :/) Rich