mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Markus Wichmann <nullplan@gmx.net>
To: musl@lists.openwall.com
Subject: Re: [musl] Deadlock in dynamic linker?
Date: Tue, 27 May 2025 18:59:12 +0200	[thread overview]
Message-ID: <aDXvYJ6S4WBo63N0@voyager> (raw)
In-Reply-To: <20250527152007.GO1827@brightrain.aerifal.cx>

Am Tue, May 27, 2025 at 11:20:07AM -0400 schrieb Rich Felker:
> On Sat, May 24, 2025 at 07:45:45AM +0200, Markus Wichmann wrote:
> > I'm thinking something like this: Thread A initializes liba.so. liba.so
> > has initializers and finalizers, so thread A adds liba.so to the fini
> > list before calling the initializers. The liba initializer calls
> > dlopen("libb.so"). libb.so also has initializers.
> > 
> > While thread A is not holding the init_fini_lock, thread B calls exit().
> > That progresses until __libc_exit_fini() sets shutting_down to 1. Then
> > it tries to destroy all the libraries, but the loop stops when it comes
> > to liba.
> > 
> > liba.so has a ctor_visitor, namely thread A, so thread B cannot advance.
> > Thread A meanwhile is hanging in the infinite wait loop trying to
> > initialize libb.so. The situation cannot change, and the process hangs
> > indefinitely.
> 
> I see. In particular you're assuming the dlopen of libb happened after
> the exit started.
> 

I had completely neglected to look at the global ldso lock, actually.
But looking at it again, I am actually assuming that the dlopen() is
*starting* before the __libc_exit_fini() (so that thread B hangs waiting
for the lock), but that thread B then overtakes thread A between the
latter's release of the global lock and the taking of the init_fini_lock.

This does mean that taking the init_fini_lock before releasing the
global lock would entirely prevent the issue. Not sure if that's
acceptable, though.

> > A simple way out of this pickle could be to add liba.so to the fini list
> > only after it was initialized. That way, thread B cannot hang on it, or
> > more generally, the finalizing thread cannot be halted by an incomplete
> > initialization in another thread. This might change the order of nodes
> > on the fini list, but only to account for dynamic dependencies. Isn't
> > that a good thing?
> 
> No, I think it's non-conforming, and also unsafe, as it can result in
> failure to run a dtor for something whose ctor already ran but did not
> finish. This is a worse outcome than a deadlock in a situation that's
> arguably undefined to begin with.
> 

But __libc_exit_fini() refuses to destroy libraries that haven't been
constructed completely. If p->constructed is zero, a node is skipped
even if it is on the fini list. And that flag is set in do_init_fini()
only after all constructors have returned.

By adding nodes to the fini list only when construction has finished, we
would also get rid of the need to have __libc_exit_fini() wait for
construction to finish first.

> What might be acceptable, though, is moving the setting of
> shutting_down to take place after the last dtor is peeled off the
> list. However, this probably requires splitting shutting_down into two
> variables, due to lock order issues. The value is needed under the
> global ldso lock in dlopen() to make dlopen return with an error if
> exit has already begun (this one should be kept before the dtor loop,
> I think), and the value is needed in do_init_fini to block execution
> of new ctors (this one should only take effect after all dtors have
> been run).
> 
> Does that sound right?
> 

After __libc_exit_fini() has run its course, there is no need to record
anything, because it keeps the init_fini_lock. Anything that could query
shutting_down at that point would already hang taking that lock.

Ciao,
Markus

  parent reply	other threads:[~2025-05-27 16:59 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-24  5:45 Markus Wichmann
     [not found] ` <20250527152007.GO1827@brightrain.aerifal.cx>
2025-05-27 16:59   ` Markus Wichmann [this message]
2025-05-27 18:26     ` Rich Felker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aDXvYJ6S4WBo63N0@voyager \
    --to=nullplan@gmx.net \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).