From: Rich Felker
Subject: Re: __synccall: deadlock and reliance on racy /proc/self/task
Date: Sat, 9 Feb 2019 23:01:50 -0500
Message-ID: <20190210040150.GC23599@brightrain.aerifal.cx>
In-Reply-To: <20190210012032.GB23599@brightrain.aerifal.cx>
To: musl@lists.openwall.com
Cc: Alexey Izbyshev
Reply-To: musl@lists.openwall.com
User-Agent: Mutt/1.5.21 (2010-09-15)

On Sat, Feb 09, 2019 at 08:20:32PM -0500, Rich Felker wrote:
> On Sun, Feb 10, 2019 at 02:16:23AM +0100, Szabolcs Nagy wrote:
> > * Rich Felker [2019-02-09 19:52:50 -0500]:
> > > On Sat, Feb 09, 2019 at 10:40:45PM +0100, Szabolcs Nagy wrote:
> > > > the assumption is that if /proc/self/task is read twice such that
> > > > all tids in it seem to be active and caught, then all the active
> > > > threads of the process are caught (no new threads that are already
> > > > started but not yet visible there)
> > >
> > > I'm skeptical of whether this can work in principle. If the first
> > > scan of /proc/self/task misses tid J, and during the next scan, tid J
> > > creates tid K then exits, it seems like we could see the same set of
> > > tids on both scans.
> > >
> > > Maybe it's salvageable though. Since __block_new_threads is true, in
> > > order for this to happen, tid J must have been between the
> > > __block_new_threads check in pthread_create and the clone syscall at
> > > the time __synccall started. The number of threads in such a state
> > > seems to be bounded by some small constant (like 2) times
> > > libc.threads_minus_1+1, computed at any point after
> > > __block_new_threads is set to true, so sufficiently heavy
> > > presignaling (heavier than we have now) might suffice to guarantee
> > > that all are captured.
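To make the mechanism concrete, the scan under discussion amounts to
something like the following minimal sketch (an illustration only, not
musl's actual __synccall code; the fixed-size buffer and atoi-based
parsing are simplifications):

	#include <dirent.h>
	#include <stdlib.h>
	#include <sys/types.h>

	/* Collect the tids currently visible in /proc/self/task. The race
	 * described above is that a tid missed by one scan can create a
	 * new thread and exit before the next scan, so two scans that
	 * return the same set do not by themselves prove that every live
	 * thread has been seen. */
	static size_t scan_tids(pid_t *tids, size_t max)
	{
		DIR *d = opendir("/proc/self/task");
		struct dirent *de;
		size_t n = 0;
		if (!d) return 0;
		while (n < max && (de = readdir(d)))
			if (de->d_name[0] != '.')
				tids[n++] = atoi(de->d_name);
		closedir(d);
		return n;
	}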
> >
> > heavier presignaling may catch more threads, but we don't
> > know how long we should wait until all signal handlers are
> > invoked (to ensure that all tasks are enqueued on the call
> > serializer chain before we start walking that list)
>
> That's why reading /proc/self/task is still necessary. However, it
> seems useful to be able to prove you've queued enough signals that at
> least as many threads as could possibly exist are already in a state
> where they cannot return from a syscall with signals unblocked without
> entering the signal handler. In that case you would know there's no
> more racing going on to create new threads, so reading /proc/self/task
> is purely to get the list of threads you're waiting to enqueue
> themselves on the chain, not to find new threads you need to signal.

One thing to note: SYS_kill is not required to queue an unlimited
number of signals, and might not report failure to do so. We should
probably be using SYS_rt_sigqueueinfo, counting the number of signals
successfully queued, and continuing to send them during the loop that
monitors progress building the chain until the necessary number have
been successfully sent, if we're going to rely on the above properties
to guarantee that we've caught every thread.
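For concreteness, queuing one signal with a reliable error report might
look like the sketch below. This illustrates the syscall's semantics,
not a proposed patch: SIGSYNCCALL, check_chain_progress, and the
budget-counting loop in the trailing comment are stand-ins for whatever
__synccall would actually use.

	#include <errno.h>
	#include <signal.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	/* Queue one instance of sig to the process. Unlike kill, which may
	 * silently coalesce pending standard signals, rt_sigqueueinfo
	 * reports failure to queue (EAGAIN once the pending-signal limit
	 * is reached), so the caller can count how many were really sent. */
	static int queue_signal(int sig)
	{
		siginfo_t si;
		memset(&si, 0, sizeof si);
		si.si_signo = sig;
		si.si_code = SI_QUEUE; /* mark as user-queued, as sigqueue(3) would */
		si.si_pid = getpid();
		si.si_uid = getuid();
		return syscall(SYS_rt_sigqueueinfo, getpid(), sig, &si);
	}

	/* Sketch of the counting loop described above:
	 *
	 *   while (queued < budget) {
	 *       if (!queue_signal(SIGSYNCCALL)) queued++;
	 *       else if (errno != EAGAIN) break;
	 *       check_chain_progress();
	 *   }
	 */

Rich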