From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [PATCH 2/2] avoid taking _c_lock if we know it isn't necessary
Date: Wed, 27 Aug 2014 17:48:53 -0400
Message-ID: <20140827214853.GV12888@brightrain.aerifal.cx>
In-Reply-To: <1409175026.4476.71.camel@eris.loria.fr>
References: <1409133335.4476.30.camel@eris.loria.fr> <20140827200756.GS12888@brightrain.aerifal.cx> <1409175026.4476.71.camel@eris.loria.fr>
Reply-To: musl@lists.openwall.com

On Wed, Aug 27, 2014 at 11:30:26PM +0200, Jens Gustedt wrote:
> > I also have some other potential changes to this
> > code based on my latest comments to:
> >
> > http://austingroupbugs.net/view.php?id=609
> >
> > regarding things they seem to deem as requirements, and which musl
> > does not satisfy, that are specified in non-normative text. So there's
> > likely to be more cond var work to do before the release still...
>
> Ah, the cancelation stuff. As if condition variables wouldn't be
> complicated enough already, without cancelation. We already have two
> different ordered sequences of events, those on the cv and those on
> the mutex. The discussion (and our implementation struggles) already
> shows how difficult it is to get these two linear sequences ordered in
> a convenient way. If you add a third set of events that are neither
> ordered among themselves (cancelation to different threads are
> asynchronous) nor with any of the two sequences, the semantics aren't
> clear at all. (This is why I think that generally thread cancelation
> is not a good idea, and why it is not very widely used. It contributes
> for more than 50% to the complexity of the implementation of
> pthreads.)
>
> But with the current implementation, I would think that it basically
> fulfills (or can be easily made to fulfill) the requirement that
> cancelation would not be "consuming" a signal when some other thread
> is available. We are marking threads as WAITING, LEAVING or SIGNALED
> and only for WAITING, a thread can be considered "blocked" on the
> cv. The transition between these is atomic, and so once a signaler
> marked a thread SIGNALED, it is not blocked and has rightly consumed
> the signal.

Yet this transition to SIGNALED can happen when the waiter is already
executing the cancellation cleanup handler, before the a_cas there. In
this case, it has "consumed the signal", but __timedwait never returns
(the __syscall_cp in timedwait never returns). I have a patch which
solves this problem via setjmp in pthread_cond_timedwait and longjmp
in unwait when SIGNALED won the a_cas race, but it has noticeable
performance cost (due to unconditional setjmp on each call).

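To make the race concrete, here is a minimal single-threaded sketch. It
is not musl's code: it assumes C11 atomics in place of a_cas, and the
names waiter_init, try_signal, try_leave and
signal_lost_to_cancellation are hypothetical, chosen only for
illustration. It replays the bad ordering: the signaler's compare-and-
swap lands first, so the CAS in the cancellation cleanup path fails and
the waiter has consumed a signal even though it will never return
normally.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical waiter states, named after those in the discussion. */
enum { WAITING, SIGNALED, LEAVING };

struct waiter {
    atomic_int state;
};

static void waiter_init(struct waiter *w)
{
    atomic_init(&w->state, WAITING);
}

/* Signaler side: claim the waiter only if it is still WAITING.
 * Returns 1 if this waiter consumed the signal, 0 otherwise. */
static int try_signal(struct waiter *w)
{
    int expected = WAITING;
    return atomic_compare_exchange_strong(&w->state, &expected, SIGNALED);
}

/* Waiter side (timeout or cancellation cleanup): try to leave.
 * Returns 1 if the waiter left without consuming a signal,
 * 0 if a signaler already won the race. */
static int try_leave(struct waiter *w)
{
    int expected = WAITING;
    return atomic_compare_exchange_strong(&w->state, &expected, LEAVING);
}

/* Replays the problematic ordering sequentially: the signaler marks
 * the waiter SIGNALED, then the cleanup path attempts its CAS and
 * loses. Returns 1 when the signal went to a waiter that is already
 * on its way out through cancellation. */
static int signal_lost_to_cancellation(void)
{
    struct waiter w;
    waiter_init(&w);
    int signaled = try_signal(&w); /* signaler wins the race */
    int left = try_leave(&w);      /* cleanup CAS now fails */
    assert(signaled && !left);
    return signaled && !left;
}
```

In the opposite ordering try_leave succeeds first and try_signal then
fails, so the signaler can move on to another waiter. Only the ordering
replayed in signal_lost_to_cancellation needs the extra handling
described above.
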
The ideal solution would be to implement the cancellation variant I've
been wanting to add for some time now: a cancellation mode where the
cancelled function returns with ECANCELED rather than acting on
cancellation immediately. This can be implemented by having the
cancellation signal handler not just check the program counter, but
also modify it, when this mode is in effect, so that returning from
the signal handler skips the syscall and instead returns -ECANCELED.

With that done, all of the nasty libc-internal use of cancellation
cleanup handlers could be replaced with temporarily changing the
cancellation mode and simply checking return values/errno for
ECANCELED. And it allows us to implement things like the cond var
behavior where deciding whether to act on cancellation or leave it
pending should take place in userspace after the syscall returns.

We can also expose this behavior as an experimental public interface
and propose it for standardization, but there are a lot of corner
cases I'd want to analyze in more detail before doing so to make sure
they're done right.

Rich