From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/13545 Path: news.gmane.org!.POSTED!not-for-mail From: Markus Wichmann Newsgroups: gmane.linux.lib.musl.general Subject: Re: sem_wait and EINTR Date: Sun, 9 Dec 2018 07:50:33 +0100 Message-ID: <20181209065033.GB2554@voyager> References: <20181205194759.GA32233@voyager> <20181205212716.sx6ra2xqhuei735q@core.my.home> <20181205215826.GX23599@brightrain.aerifal.cx> <20181206024340.202e0fc4@orivej.orivej.org> <20181206031756.GZ23599@brightrain.aerifal.cx> <20181206155756.GB32233@voyager> <20181206162336.GB23599@brightrain.aerifal.cx> <20181206170359.GC32233@voyager> <20181206173337.GD32233@voyager> <20181209025140.GL23599@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: blaine.gmane.org 1544338149 12869 195.159.176.226 (9 Dec 2018 06:49:09 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Sun, 9 Dec 2018 06:49:09 +0000 (UTC) User-Agent: Mutt/1.10.1 (2018-07-13) To: musl@lists.openwall.com Original-X-From: musl-return-13561-gllmg-musl=m.gmane.org@lists.openwall.com Sun Dec 09 07:49:05 2018 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from ) id 1gVstj-0003CQ-3y for gllmg-musl@m.gmane.org; Sun, 09 Dec 2018 07:49:03 +0100 Original-Received: (qmail 9726 invoked by uid 550); 9 Dec 2018 06:51:11 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 9704 invoked from network); 9 Dec 2018 06:51:10 -0000 Content-Disposition: inline In-Reply-To: <20181209025140.GL23599@brightrain.aerifal.cx> X-Provags-ID: V03:K1:J1VX0pdUnmIL6o7LS6UBdRZNjLGJaCxlCLykQgC2WPDhruQzvIu QqPjGmJsLibirlICmwLkIiFIvls2PaOvuIeOdJ9AgT4hIC1zKL6G0lu7dI0s0JW4GZ+xUf8 uecWBP+mLG4QqklvVI/pD8yi+o/cmtd6xQTRLb59JqqhrsvIqjBBe7kzp4MvZLV8iFSIShh uZ3yoM4fXeAhktKH1pYcw== X-UI-Out-Filterresults: notjunk:1;V03:K0:x/eXUqfLpZU=:g2GPz56ICxno+6wx/sP5gc Eo/aQZJVIbeT7Z+gtmVjNsF2MrNuuro79BklOHCrvejse0C1WmNV5caxdSczWMG/+o6g78GRC 4RoU7uqaw+GDw7Hiv3yPcYLyjdT+kzIDPtQLDu37fWVx6ehVH42anF68stXw8K+mdIeuICFft uSEFocP6gbltms2PiyDPTJamubKuJs7YZNpQGPAA4c+vplWezhBKOQ5n/eIXmDpEhgR7UrNr8 xpcArnkb96QeDgdYg2AYSDeDCTfiDhMRsH2MqIKWdrg85+3tiOC+nE4z4PTdxQP1jZFWLCi+W hUrDD16pyC8/nYfgKjz7STC2aaG60wMmXIUnPNsK5ftzdXHOI6MqtTbfYpBd4ZMphRCAmI/yR /K4S0cps9FOFZdqoaFyXV06ZkEyRGDEWECDuGJ886gdKpruR157FbVo7PjQEhnBOWwRhet/AG fvl8Wb1MtftnRgPDX/qSZ+5WFHIiGQFGn+be/7AhyEBCYfTSAXlr27mKWY5x9EyM6aUs7VwxQ A95HgmqenHR80kqSKHlY/ICoPhkrwp24qZkzZqXlPSXwXhFRK41W6m3/kxZyE7XKQ2uPCjODy bZ2pxTJ3OgcgIcKQafLlBLIvwskSPETMokz7m9dj6yc7FAAqu2zaojW4A0A2MeOSZObUJIWfI 0IybBcoWb0PC2JWjpql7ilCraGlcsDRBKC76lR0RZSSXJ0AN2yj9zWez451t5wqbyA3zc1m+h 7wtd5oms39Ji6P/IDeqKN58KGf061Dfh6PFlYzK/PTV7ToxrSAOu58X7BIZInAKW9H8R/VLx Xref: news.gmane.org gmane.linux.lib.musl.general:13545 Archived-At: On Sat, Dec 08, 2018 at 09:51:40PM -0500, Rich Felker wrote: > Conceptually the logic looks correct, but I'm trying to phase out the > __libc structure, Oh? Well, I suppose it does offer no benefit the linker doesn't already give you. I suppose we could make the flag a static int in sigaction(), with a hidden getter function. > and more importantly, introducing a bitfield here is > not safe unless access to all bitfield members is controlled by a > common mutex or other synchronization mechanism. There is a data race > if any access to the other fields is accessed when sigaction is > callable from another thread, and the impact of a data race on any of > these is critical. > The flags can_do_thrads, threaded, and secure are all set before the program becomes multi-threaded and they keep their values throughout the rest of the program. So the only accesses are read accesses. handling_sigs on the other hand might be written by multiple threads at the same time, but to the same value. Can that ever be an issue? Wait... I think I figured it out. On some architectures you can have incoherent caches. Threads need to synchronize to access shared ressources, and those synchronization functions take care of that problem, but I circumvented them. Now, if the caches are set to write-back mode (which is common), then writing to handling_sigs only writes to the cache line for the address of handling_sigs, but not to main memory. So another thread on another core will not see the update. Worse: If by a bad roll of the dice the writing thread's cache line is evicted before the reading thread's cache line is (it might have written somewhere else in that cache line), that latter eviction will overwrite handling_sigs back to 0. Am I close? > Unfortunately, even without the bitfield, there is a data race between > access to the new flag from sem_timedwait and from sigaction. In > theory, this potentially results in a race window where sem_wait > retries on EINTR because it doesn't yet see the updated handling_sigs. > In that case, the sem_wait() and the sigaction() would be unserialized, and any correct program cannot depend on the order of operations. In particular, the caller if the sem_wait() can't depend on the sigaction() having installed the signal handler yet. > The "right" fix would be to use an AS-safe lock to protect the > handling_sigs object, or make it atomic (a_store on the sigaction > side, a_barrier before load on the sem side). Alternatively we could > assume the above reasoning about sequencing of sigaction before EINTR > and just use a lock on the sigaction side, but the atomic is probably > simplest and "safer". > Plus, a lock for a single int that can only ever go from 0 to 1 would be a bit overkill. > A couple other small nits: comments especially should not exceed 80 > columns (preferably <75 or so) assuming tab width 8; code should avoid > it too but I see it slightly does in a few places there. Spaces should > not be used for indention. Ah crap. I forgot the comment the first time I was in the file, and when I started vim again, I forgot to re-apply the musl settings. > And the commit message should reflect the > change made; it's not adding a workaround, but reducing the scope of a > previous workaround (which should be cited by commit id) to fix a > conformance bug. The name "handling_sigs" is also a bit misleading; > what it's indicating is that interrupting signal handlers are or have > been present. > > I'm happy to fix up these issues and commit if there aren't mistakes > in the above comments that still need to be addressed. > > Rich Well, it certainly was a learning experience for me. Just goes to show that you never learn as much as when you're making a mistake and have to figure it out. Ciao, Markus