mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: dlsym(handle) may search in unrelated libraries
Date: Thu, 7 Feb 2019 16:29:57 -0500
Message-ID: <20190207212957.GS23599@brightrain.aerifal.cx>
In-Reply-To: <20190207174312.GE5469@voyager>

On Thu, Feb 07, 2019 at 06:43:12PM +0100, Markus Wichmann wrote:
> On Thu, Feb 07, 2019 at 04:42:15PM +0300, Alexey Izbyshev wrote:
> > I think the easiest way is simply to modify load_deps() to always traverse
> > DT_NEEDED in breadth-first order without relying on the dso list in the
> > outer loop. load_deps() already effectively maintains a queue (deps) that
> > can be used for BFS, so no recursion is needed.
> 
> OK, since we have to implement a BFS, that does in fact work. So I
> implemented that. Still needs testing, though.

Comments below:

> One side effect is, patch 7 from the previous mail was reverted.
> 
> Another is that now load_deps() depends on the deps array as loop
> structure. I was almost as far as just using the runtime code and adding
> an a_crash() in case of allocation failure during loadtime, but then I
> decided to just split loadtime and runtime apart. So
> load_deps_loadtime() is just a copy of load_deps(), refactored with the
> assumption runtime==0, and load_deps_runtime() is a copy of load_deps(),
> with the patch discussed here and refactored under the assumption
> runtime!=0.

The error() function sets ldso_fail to true, which prevents the
program from running. So it should work just fine to keep the two
paths together, and to build the deps lists at program start time if
we want (which is still an open question, I think).
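
A minimal sketch of that idea, not musl's actual code: one shared
failure path that defers the error at load time and unwinds at
runtime. The names report(), load_one() and open_at_runtime() are
hypothetical; runtime, ldso_fail and rtld_fail only mirror the
variables discussed in this thread.

```c
#include <setjmp.h>
#include <stdio.h>

static int runtime;        /* 0 during program startup, 1 afterwards */
static int ldso_fail;      /* deferred failure flag, checked before the program starts */
static jmp_buf *rtld_fail; /* recovery point for runtime (dlopen-style) callers */

/* Hypothetical stand-in for error(): at load time, only record the failure. */
static void report(const char *what)
{
	fprintf(stderr, "error: %s\n", what);
	if (!runtime) ldso_fail = 1;
}

/* Shared path: ok==0 simulates a failed load or a failed allocation. */
static void load_one(int ok)
{
	if (ok) return;
	report("cannot load dependency");
	if (runtime) longjmp(*rtld_fail, 1); /* unwind back to the caller */
	/* load time: fall through; ldso_fail stops the program later */
}

/* Hypothetical runtime entry point that installs the recovery point. */
static int open_at_runtime(int ok)
{
	jmp_buf jb;
	rtld_fail = &jb;
	if (setjmp(jb)) return -1;
	load_one(ok);
	return 0;
}
```

With this shape, an allocation failure while building deps at load
time needs no a_crash(): it is reported once and the program is never
started.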

> I had noticed, during the refactoring, that this means that app->deps ==
> {0}, always. So I wondered if that might bite us. However, the only
> normal way to obtain a handle to the app itself is to call
> dlopen(NULL). That is one of the special handles that will load symbols
> from the given DSO and all following ones in the symbol list. And all of
> the main app's dependencies are immediately added to the symbol list
> after the first load_deps() call (now load_deps_loadtime()).

This seems correct.

> For Rich's comfort, I am attaching patch 6 again, so all relevant
> patches are in one mail.

One thing to fix in it; see below.

> From e823910d69ff56ffccecaa9b29fd4b67b901798a Mon Sep 17 00:00:00 2001
> From: Markus Wichmann <nullplan@gmx.net>
> Date: Wed, 6 Feb 2019 16:51:53 +0100
> Subject: [PATCH 6/9] Make libc and vdso explicitly have no deps.
> 
> Alexey Izbyshev reported that without this, dlopen("libc.so") returns a
> handle that is capable of finding every symbol in libraries loaded as
> dependencies, since dso->deps == 0 usually means dependencies haven't
> been loaded.
> ---
>  ldso/dynlink.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/ldso/dynlink.c b/ldso/dynlink.c
> index ec921dfd..6ffeca85 100644
> --- a/ldso/dynlink.c
> +++ b/ldso/dynlink.c
> @@ -1244,6 +1244,7 @@ static void reloc_all(struct dso *p)
>  static void kernel_mapped_dso(struct dso *p)
>  {
>  	size_t min_addr = -1, max_addr = 0, cnt;
> +	static const struct dso *sentinel = 0;

This fragment is unused and looks like cruft left over from an earlier
idea.

>  	Phdr *ph = p->phdr;
>  	for (cnt = p->phnum; cnt--; ph = (void *)((char *)ph + p->phentsize)) {
>  		if (ph->p_type == PT_DYNAMIC) {
> @@ -1428,6 +1429,7 @@ hidden void __dls2(unsigned char *base, size_t *sp)
>  	ldso.phdr = laddr(&ldso, ehdr->e_phoff);
>  	ldso.phentsize = ehdr->e_phentsize;
>  	kernel_mapped_dso(&ldso);
> +	ldso.deps = (struct dso**)&nodeps_dummy;
>  	decode_dyn(&ldso);
>  
>  	if (DL_FDPIC) makefuncdescs(&ldso);
> @@ -1675,6 +1677,7 @@ _Noreturn void __dls3(size_t *sp)
>  		vdso.prev = tail;
>  		tail->next = &vdso;
>  		tail = &vdso;
> +		vdso.deps = (struct dso**)&nodeps_dummy;
>  	}

Style nit: (struct dso **)

> From 18008eb03acd59f6cbaa82c607f1969c70707e21 Mon Sep 17 00:00:00 2001
> From: Markus Wichmann <nullplan@gmx.net>
> Date: Thu, 7 Feb 2019 18:17:25 +0100
> Subject: [PATCH 9/9] Fix runtime dependency accounting in dlopen().
> 
> As Alexey Izbyshev pointed out, the library given as argument to
> dlopen() does not necessarily have to be the last in the chain. Nor do
> any of the dependencies. Therefore it is wrong to assume that walking
> the chain of libraries from any of them forward will only walk over
> dependencies of the freshly loaded library.
> 
> I had to split the runtime and loadtime paths of load_deps() apart, or
> otherwise I would have had to allocate the deps array for the
> application at loadtime. And then I would have needed a resolution for
> allocation failure, which would have been a crash.
> ---
>  ldso/dynlink.c | 43 ++++++++++++++++++++++++++++---------------
>  1 file changed, 28 insertions(+), 15 deletions(-)
> 
> diff --git a/ldso/dynlink.c b/ldso/dynlink.c
> index 6ffeca85..66e6f18b 100644
> --- a/ldso/dynlink.c
> +++ b/ldso/dynlink.c
> @@ -1136,10 +1136,10 @@ static struct dso *load_library(const char *name, struct dso *needed_by)
>  	return p;
>  }
>  
> -static void load_deps(struct dso *p)
> -{
> -	size_t i, ndeps=0;
> -	struct dso ***deps = &p->deps, **tmp, *dep;
> +static void load_deps_loadtime(struct dso *p) {
> +	size_t i;
> +	struct dso *dep;
> +	p->deps = (struct dso**)&nodeps_dummy;
>  	for (; p; p=p->next) {
>  		for (i=0; p->dynv[i]; i+=2) {
>  			if (p->dynv[i] != DT_NEEDED) continue;
> @@ -1147,19 +1147,32 @@ static void load_deps(struct dso *p)
>  			if (!dep) {
>  				error("Error loading shared library %s: %m (needed by %s)",
>  					p->strings + p->dynv[i+1], p->name);
> -				if (runtime) longjmp(*rtld_fail, 1);
> -				continue;
>  			}
> -			if (runtime) {
> -				tmp = realloc(*deps, sizeof(*tmp)*(ndeps+2));
> -				if (!tmp) longjmp(*rtld_fail, 1);
> -				tmp[ndeps++] = dep;
> -				tmp[ndeps] = 0;
> -				*deps = tmp;
> +		}
> +	}
> +}
> +
> +static void load_deps_runtime(struct dso *p)
> +{
> +	size_t i, ndeps=0, j=0;
> +	struct dso ***deps = &p->deps, **tmp, *dep;
> +	for (; p; p=(*deps)[j++]) {
> +		for (i=0; p->dynv[i]; i+=2) {
> +			if (p->dynv[i] != DT_NEEDED) continue;
> +			dep = load_library(p->strings + p->dynv[i+1], p);
> +			if (!dep) {
> +				error("Error loading shared library %s: %m (needed by %s)",
> +					p->strings + p->dynv[i+1], p->name);
> +				longjmp(*rtld_fail, 1);
>  			}
> +			tmp = realloc(*deps, sizeof(*tmp)*(ndeps+2));
> +			if (!tmp) longjmp(*rtld_fail, 1);
> +			tmp[ndeps++] = dep;
> +			tmp[ndeps] = 0;
> +			*deps = tmp;
>  		}

Aside from the above remark about not splitting the two versions, I
don't think the algorithm works. In the case of circular dependencies,
which are awful but do happen in the wild, the loop will run forever
and keep appending to the deps array. I think this can be fixed by
comparing each new dep against the prior slots (linear time for each
addition, so quadratic overall) to avoid adding it more than once. Or,
since we hold a lock, a tag could be added to struct dso to mark which
ones we've already hit (constant time). Since load_library is already
a linear search with a larger value of N than the number of
dependencies, I don't really see any advantage to avoiding the linear
search here, and would just go with it since it's simpler and less
invasive.
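
A minimal sketch of the linear-search variant, not the actual patch.
It uses a toy struct node in place of struct dso and a ready-made
needed list in place of load_library() on DT_NEEDED entries; node,
needed, build_deps() and no_deps are all hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct dso. */
struct node {
	const char *name;
	struct node **needed; /* null-terminated direct deps, or 0 for none */
	struct node **deps;   /* null-terminated BFS closure, filled in below */
};

/* Shared empty list, playing the role of nodeps_dummy. */
static struct node *no_deps[1];

/* Build root->deps breadth-first; returns 0 on success, -1 if out of memory. */
static int build_deps(struct node *root)
{
	size_t i, k, ndeps = 0, j = 0;
	struct node **deps = 0, **tmp, *cur, *dep;

	assert(root);
	/* The deps array doubles as the BFS queue; j is the read position. */
	for (cur = root; cur; cur = deps ? deps[j++] : 0) {
		for (i = 0; cur->needed && cur->needed[i]; i++) {
			dep = cur->needed[i];
			if (dep == root) continue; /* cycle back to the root */
			/* Linear scan of prior slots: skip anything already queued. */
			for (k = 0; k < ndeps && deps[k] != dep; k++);
			if (k < ndeps) continue;
			tmp = realloc(deps, sizeof *tmp * (ndeps + 2));
			if (!tmp) {
				free(deps);
				return -1;
			}
			deps = tmp;
			deps[ndeps++] = dep;
			deps[ndeps] = 0;
		}
	}
	root->deps = deps ? deps : no_deps;
	return 0;
}
```

Because a dep is appended only the first time it is seen, the queue
stops growing once every reachable node is in it, so the outer loop
ends even when the graph has cycles.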

>  	}
> -	if (!*deps) *deps = (struct dso **)&nodeps_dummy;
> +	if (!*deps) *deps = (struct dso**)&nodeps_dummy;

Spurious style regression. :)

Rich

