Date: Wed, 19 Feb 2020 09:13:35 -0500
From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [musl] race condition in sem_wait
Message-ID: <20200219141335.GF1663@brightrain.aerifal.cx>
In-Reply-To: <706da236-bee2-eb6d-9dcb-f55c2b69b6e6@dd-wrt.com>

On Wed, Feb 19, 2020 at 09:26:30AM +0100, Sebastian Gottschall wrote:
>
> On 19.02.2020 at 04:39, Rich Felker wrote:
> >On Wed, Feb 19, 2020 at 01:46:34AM +0100, Sebastian Gottschall wrote:
> >>Hello
> >>
> >>i discovered recently a race condition while playing with threads
> >>and sem_wait/sem_post
> >>sem_wait may fail with errno set EAGAIN which is not valid since
> >>only sem_trywait is able to set that errno code.
> >>this was causing a bug with a later select() and accept() which
> >>failed since accept does not work if errno is set to EAGAIN.
> >>from my point of view the bug is in sem_timedwait.c
> >>
> >>         if (!sem_trywait(sem)) return 0;
> >>
> >>         int spins = 100;
> >>         while (spins-- && sem->__val[0] <= 0 && !sem->__val[1]) a_spin();
> >>
> >>         while (sem_trywait(sem)) {
> >>
> >>
> >>the fist sem_trywait will fail with -1 and sets EAGAIN. but the
> >>second sem_trywait will not fail and does return 0. the problem now
> >>is that errno is still present and not reset.
> >>this may cause if sem_post is called from a second thread on the
> >>same semaphore.
> >>of course the same bug affects sem_timedwait itself.
> >>so i assume sem_wait is not thread safe which is bad and is not
> >>follow the posix specification
> >>
> >>or am i wrong here?
> >errno is only meaningful on failure; unless specified otherwise (a few
> >functions are special because you can't [easily] distinguish success
> >from failure for them without examining errno), any standard function
> >may have changed the value of errno when it returns with success. The
> >only thing it's not allowed to do is clear it (set it to 0).
> the problem is the posix manual specifies exclicit that EAGAIN
> cannot be returned by sem_wait and in my code sample
>
> the following happens
>
> sem_wait(semaphort)
> select(....)
> socket = accept(....)  -> fails
>
> accept fails because sem_wait did set errno to EAGAIN and accept
> will fail if errno is set to EAGAIN
> i use sem_wait to limit the number of threads in my webserver. on
> the thread itself i call sem_post.
> but to make it work correct i have to set errno=0 before calling
> accept since accept will not work if errno is set to EAGAIN
> if you read the posix man for accept, you will find out that accept
> will read errno unconditional and this is also the case for the musl
> implementation

accept does not use errno as input. Unless I'm forgetting something,
no interfaces in libc except perror, syslog (%m), and *printf (%m
extension) use errno as input.

If accept is failing (returning -1) with errno==EAGAIN it's not
because errno was EAGAIN before you called it but because your
listening socket is in non-blocking mode and there is no pending
connection to accept.

Rich
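
A minimal sketch of the point above, for anyone who wants to check it
locally. This is not code from musl or from the original report; the
helper names make_nonblocking_listener and accept_errno_with_preset are
hypothetical and exist only for illustration. Assuming a POSIX system
with IPv4 sockets available, it sets errno to a chosen value, calls
accept() on a non-blocking listening socket with no pending connection,
and reports the errno that accept() left behind:

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a non-blocking TCP listening socket on an
 * ephemeral port. Returns the descriptor, or -1 on failure. */
static int make_nonblocking_listener(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY); /* demo only */
    addr.sin_port = 0;                        /* kernel picks a free port */

    int flags = fcntl(fd, F_GETFL);
    if (flags < 0
        || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0
        || bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0
        || listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: set errno to `preset`, call accept(), and return
 * the errno reported by a failed accept(). Returns 0 if accept()
 * unexpectedly succeeds (the new connection is closed). */
static int accept_errno_with_preset(int listen_fd, int preset)
{
    errno = preset;
    int conn = accept(listen_fd, NULL, NULL);
    if (conn >= 0) {
        close(conn);
        return 0;
    }
    /* errno is only meaningful here, after accept() returned -1. */
    return errno;
}
```

Calling accept_errno_with_preset(fd, 0) and
accept_errno_with_preset(fd, EINVAL) on the same idle listener should
both report EAGAIN (or EWOULDBLOCK, which POSIX allows to be the same
value), so clearing errno before accept() changes nothing. What needs
checking is the return value of each call, with errno consulted only
after a failure.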