From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: [musl] Serious missing synchronization bug between internal locks and threads_minus_1 1->0 transition
Date: Fri, 22 May 2020 12:21:42 -0400
Message-ID: <20200522162142.GV1079@brightrain.aerifal.cx>

This was reported on #musl yesterday as a problem in malloc
(oldmalloc). When the second-to-last thread exits (leaving the process
single-threaded again), it decrements libc.threads_minus_1. The
remaining thread can then observe this decrement with a relaxed-order
atomic read, which is fine in itself, but it then uses the observed
single-threadedness to skip locks. This means it also skips
synchronization with any changes the exiting thread made to memory.
The race goes like this (L = exiting thread, M = remaining thread):

M accesses and caches data D
                                    L takes lock protecting D
                                    L modifies D
                                    L releases lock protecting D
                                    L calls pthread_exit
                                    L locks thread list
                                    L decrements threads_minus_1
                                    L exits & releases list
M observes threads_minus_1==0
M skips taking lock protecting D
M accesses D, using cached value
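
For concreteness, the skip-lock idiom at fault looks roughly like
this (a simplified sketch from memory, using the internal names, not
the exact source):

    static inline void lock(volatile int *lk)
    {
        /* threads_minus_1 is read with no ordering and no compiler
         * barrier; reading 0 here skips the lock entirely, and with
         * it all synchronization with the exiting thread */
        if (libc.threads_minus_1)
            while (a_swap(lk, 1)) __wait(lk, lk+1, 1, 1);
    }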

Note that while one might expect this not to happen on x86 with its
strong memory ordering, the lack of a memory barrier here also means
the lack of a *compiler barrier*, and in the observed breakage it was
actually the compiler, not the CPU, caching D between the first and
last lines of the above example.

The simple, safe fix for this is to stop using libc.threads_minus_1
to skip locking and instead use libc.threaded, which is
permanent-once-set. This is the first change I plan to commit, to
have a working baseline, but it will be something of a performance
regression for mostly-single-threaded programs that only occasionally
use threads, since they'll be stuck using locks all the time.
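
In terms of the sketch above, the baseline change amounts to
something like:

    static inline void lock(volatile int *lk)
    {
        /* libc.threaded is set at the first pthread_create and
         * never cleared, so there is no 1->0 transition whose
         * visibility could race with skipping the lock */
        if (libc.threaded)
            while (a_swap(lk, 1)) __wait(lk, lk+1, 1, 1);
    }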

I've been trying to work out a better fix, based on SYS_membarrier
from the exiting thread, but that's complicated by the fact that
membarrier can only impose memory barriers, not compiler barriers.
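
The shape of that idea would be something like the following in the
exit path, after the decrement (sketch only; the command choice is a
guess, and kernels lacking membarrier would need a fallback):

    #include <linux/membarrier.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* force a memory barrier on all other running threads, so none
     * can act on threads_minus_1==0 without having synchronized */
    syscall(SYS_membarrier, MEMBARRIER_CMD_GLOBAL, 0);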

Anyway, first fix coming soon. This will be important for distros to
pick up.

Rich
