From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-3.4 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,NICE_REPLY_A,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 20094 invoked from network); 28 Feb 2022 15:51:07 -0000 Received: from mother.openwall.net (195.42.179.200) by inbox.vuxu.org with ESMTPUTF8; 28 Feb 2022 15:51:07 -0000 Received: (qmail 7972 invoked by uid 550); 28 Feb 2022 15:51:04 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 7940 invoked from network); 28 Feb 2022 15:51:03 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zhasha.com; s=wirbelwind; t=1646063451; bh=F/uBitIVoIf3GdPsiQCR4vjdqnW3Uuy8NCuK0QSOdEI=; h=Date:From:To:Subject:In-Reply-To:References; b=lKz/N3ZfCR1YNTb3RHd/AAku3I/O89ukJEcM/3X57SI/synyb75xpxRv+wt79oA2N 5BGcL51XB1XlgcMxymbWH9GINfAI2pVay4y5Kqqao04AcfaoBDisY+7H616o9rS0gg XyLVutz7Q/XXco6uKlQfi7W+oMSqSPsHHOLyH/zc= Date: Mon, 28 Feb 2022 16:50:51 +0100 From: Joakim Sindholt To: musl@lists.openwall.com Message-Id: <20220228165051.dd53c432ccd85517cb9513cd@zhasha.com> In-Reply-To: References: <20220221174223.GA2079@voyager> <20220223185746.GB2079@voyager> <20220226123855.392c22acb208a966210c7af2@zhasha.com> <20220228094814.be34ddcf7be25724e8f8c21b@zhasha.com> X-Mailer: Sylpheed 3.7.0 (GTK+ 2.24.33; x86_64-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="Multipart=_Mon__28_Feb_2022_16_50_51_+0100_Wn5s_naupArf4dOe" Subject: Re: [musl] Suggestion for thread safety This is a multi-part message in MIME format. --Multipart=_Mon__28_Feb_2022_16_50_51_+0100_Wn5s_naupArf4dOe Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit On Mon, 28 Feb 2022 14:43:36 +0000, Lee Shallis wrote: > Seems the wait just wasn't long enough, at about 4 yields onwards the > results become consistent success, I've attached the file I did the > experiments in, I even tried it under -O3 and no exits were > encountered, so yes my method works, just needs a bit more wait time > for extreme cases Between the lines > if ( !(shared->tid) ) and > shared->tid = tid; the kernel might suspend the running thread and allow the other to run, or you might simply get unlucky and have the two threads do the checks close enough to simultaneously that the memory hasn't been synchronized yet. Either way you end up with both threads seeing that shared->tid is zero and both of them writing their tids to it, and thus both enter the critical section at the same time. And so the lock fails at the very first hurdle: mutual exclusion. No amount of sleeping will make the bug go away, only slightly more difficult to trigger. The point of the clock_nanosleep call was to force a reschedule while holding the lock. This also increases the runtime inside the lock which in this case increases the likelihood that the thread trying to take the lock will be waiting for it and end up racing with the thread that currently has it when it unlocks and tries to relock it. Now that you've inserted lots of sched_yield()s your lock is not only still broken (in more ways than the one we've been trying to get you to understand) but also extremely slow. As a hint for your future education: the first (and far from only) thing you'll need is compare-and-swap, aka. CAS. You can read up on this class of bugs if you'd like. It's called "Time Of Check to Time Of Use" or "TOCTOU" for short. I didn't even need to poke at the code this time as the code you sent breaks just the same on my machine. I hope you'll learn from this. --Multipart=_Mon__28_Feb_2022_16_50_51_+0100_Wn5s_naupArf4dOe Content-Type: text/plain; name="not-on-my-machine.txt" Content-Disposition: attachment; filename="not-on-my-machine.txt" Content-Transfer-Encoding: 7bit zhasha@wirbelwind /home/zhasha ; cat lock.c #define _GNU_SOURCE #include #include #include #include #include #include #include #include typedef unsigned int uint; typedef struct _LOCK { uint num; pid_t tid; } LOCK; void LockSiData( LOCK *shared ) { pid_t tid = gettid(); uint const yield = 4; uint i; /* struct timespec ts = {0}; ts.tv_nsec = 1; */ while ( shared->tid != tid ) { if ( !(shared->tid) ) shared->tid = tid; //clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, 0); for ( i = 0; i < yield; ++i ) sched_yield(); } shared->num++; } void FreeSiData( LOCK *shared ) { pid_t tid = gettid(); if ( shared->tid != tid ) return; shared->num--; if ( shared->num ) return; shared->tid = (pid_t)0; //sched_yield(); } volatile uint quit = 0; volatile uint data = 0; LOCK lock = {0}; void* thread( void *ud ) { uint i; struct timespec ts = {0}; ts.tv_nsec = 1; (void)ud; while ( quit < CLOCKS_PER_SEC ) { LockSiData( &lock ); i = data++; if (i != 0) { flockfile( stdout ); printf("var = %u, expected 0\n", i); funlockfile( stdout ); FreeSiData( &lock ); exit(1); } /* Why is needed for the test? */ clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, 0); i = data--; if (i != 1) { flockfile( stdout ); printf("var = %u, expected 1\n", i); funlockfile( stdout ); FreeSiData( &lock ); exit(1); } ++quit; FreeSiData( &lock ); } return ud; } int main() { pthread_t pt; int i; for (i = 0; i < 2; i++) { if ((errno = pthread_create(&pt, 0, thread, 0)) != 0 ) { flockfile( stdout ); printf("pthread_create failed: %m\n"); funlockfile( stdout ); return 1; } } pthread_exit(0); } zhasha@wirbelwind /home/zhasha ; gcc -lpthread -O3 lock.c zhasha@wirbelwind /home/zhasha ; ./a.out var = 1, expected 0 zhasha@wirbelwind /home/zhasha ; ./a.out var = 1, expected 0 var = 2, expected 1 zhasha@wirbelwind /home/zhasha ; --Multipart=_Mon__28_Feb_2022_16_50_51_+0100_Wn5s_naupArf4dOe--