From: Rich Felker
To: Jens Gustedt
Cc: musl@lists.openwall.com
Subject: Re: Multi-threaded performance progress
Date: Tue, 26 Aug 2014 16:26:43 -0400
Message-ID: <20140826202643.GI12888@brightrain.aerifal.cx>
In-Reply-To: <1409081653.8054.60.camel@eris.loria.fr>

On Tue, Aug 26, 2014 at 09:34:13PM +0200, Jens Gustedt wrote:
> On Tuesday, 2014-08-26 at 15:05 -0400, Rich Felker wrote:
> > On Tue, Aug 26, 2014 at 08:30:39PM +0200, Jens Gustedt wrote:
> > > Or do you mean that I should use an atomic store at the other end?
> >
> > Yes. With an atomic store at the other end, I think it could be
> > correct, but I'd need to review it further to be sure.
>
> ok, it shouldn't be difficult to use atomic ops, then.

Based on what you've said below, though, I think there's still a big
problem.

> > > > Note that the performance of the code in which you're trying to
> > > > avoid the lock does not matter in the slightest except when a
> > > > race happens between a thread acting on cancellation or timeout
> > > > and a signaler (since that's the only time it runs). I expect
> > > > this is extremely rare, so unless we determine otherwise, I'd
> > > > rather not add complexity here.
> > >
> > > If we have a broadcaster working a long list of waiters, this
> > > might still happen sufficiently often. And the "complexity" is
> > > hidden in the execution pattern of the current version, where
> > > control and handling of the list alternates between different
> > > threads, potentially as many times as there are waiters in the
> > > list.
> >
> > Does your code eliminate that all the time? If so it's more
> > interesting. I'll look at it more.
>
> Yes, it will be the signaling or broadcasting thread that will be
> working on the integrity of the list while it is holding the lock.
> At the end, those that it detected to be leaving will be isolated
> list items, those that are to be signaled form one consecutive list
> that is detached from the cv, and the ones that remain (in the
> signaling case) form a valid cv-with-list.
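
For concreteness, the one-pass partitioning described above might look
roughly like this; the names and the singly-linked layout are made up
for illustration, not the actual fields in your patch or in musl:

enum waiter_state { WAITING, SIGNALED, LEAVING };

struct waiter {
	struct waiter *next;
	int state;		/* written with atomic ops in the real code */
};

struct cv {
	int lock;		/* stands in for the cv-internal _c_lock */
	struct waiter *head;	/* waiter list, protected by lock */
};

/* Run by the signaling/broadcasting thread with cv->lock held: walk
 * the list once, isolating waiters already marked as leaving and
 * detaching up to n waiters to signal into a private list. Whatever
 * is left on cv->head remains a valid cv-with-list. */
static struct waiter *partition(struct cv *cv, int n)
{
	struct waiter *sig = 0, **sigtail = &sig;
	struct waiter **p = &cv->head;

	while (*p && n) {
		struct waiter *w = *p;
		*p = w->next;			/* unlink w in either case */
		w->next = 0;
		if (w->state == LEAVING)
			continue;		/* leaver: now an isolated item */
		w->state = SIGNALED;		/* append w to the detached list */
		*sigtail = w;
		sigtail = &w->next;
		n--;
	}
	return sig;	/* to-signal list; cv->head keeps the remainder */
}

A signal would call this with n==1 and a broadcast with the full waiter
count; either way, exactly one thread touches the list, under the lock,
in a single pass.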
> The only small gap that remains (and that annoys me) is the leaving
> thread that sneaks in:
>
> - it marks itself as leaving before the end of the CS
> - it asks for _c_lock only *after* the signaling thread has left
>   its CS
>
> This is our old problem of late access to the cv variable revisited,
> but here it is condensed into a very narrow time frame. Both threads
> must be active for this to happen, so my hope is that when both are
> spinning for some time (on _c_lock for the waiter, on ref for the
> signaler), neither will "ever" be forced into a futex wait.

That's a bug that needs to be fixed before going forward with this,
since it's essentially a use-after-free.

Now that you mention it, avoiding use-after-free was one of my main
motivations for having such waiters synchronize with the signaling
thread. That should have been documented in a comment somewhere, but
the point seems to have slipped my mind sometime between the design
phase and writing the code and comments.

Do you see any solution whereby a waiter that's waking up can know
reliably, without accessing the cv, whether a signaling thread is
there to take responsibility for removing it from the list? I'm not
seeing any solution to that problem.

I'm also still skeptical that there's a problem to be solved here:
for it to matter, the incidence of such races would need to be pretty
high. If you think it's going to matter, perhaps you could work on a
test case that shows performance problems under load with lots of
timedwait expirations (or cancellations, though I think worrying
about cancellation performance is somewhat silly to begin with). Or,
if you don't have time to spend on side projects like test cases,
maybe someone else could test it?

Rich
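
P.S. To make the hazard concrete, here's a rough sketch of the
waiter-side exit path with the gap described above; the names are
again illustrative rather than the actual internals:

#include <stdatomic.h>

enum { WAITING, SIGNALED, LEAVING };

struct waiter { _Atomic int state; };
struct cv { _Atomic int lock; struct waiter *head; };

/* Waiter-side path on timeout or cancellation. */
static void waiter_timeout_path(struct cv *cv, struct waiter *self)
{
	atomic_store(&self->state, LEAVING);	/* step 1: mark as leaving */

	/* An arbitrary scheduling delay can occur here. A signaler can
	 * run its whole critical section in the gap, see the LEAVING
	 * mark, drop all responsibility for this waiter, and return --
	 * after which its caller may legitimately destroy and free the
	 * cv (POSIX permits destruction once no thread is blocked). */

	while (atomic_exchange(&cv->lock, 1))	/* step 2: use-after-free */
		;				/* if the cv is now gone  */
	/* ...unlink self from cv->head under the lock... */
	atomic_store(&cv->lock, 0);
}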