From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/14735 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: [PATCH] time: Fix bug in timer_create Date: Wed, 25 Sep 2019 23:41:47 -0400 Message-ID: <20190926034147.GB9017@brightrain.aerifal.cx> References: <5114FFD1D7758546892A923CB3078CD4121624A3@dggemi523-mbx.china.huawei.com> <20190925171905.GW9017@brightrain.aerifal.cx> <20190925173424.GX9017@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="178215"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Mutt/1.5.21 (2010-09-15) To: musl@lists.openwall.com Original-X-From: musl-return-14751-gllmg-musl=m.gmane.org@lists.openwall.com Thu Sep 26 05:42:05 2019 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.89) (envelope-from ) id 1iDKfM-000kGM-Du for gllmg-musl@m.gmane.org; Thu, 26 Sep 2019 05:42:04 +0200 Original-Received: (qmail 28129 invoked by uid 550); 26 Sep 2019 03:42:01 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 28111 invoked from network); 26 Sep 2019 03:42:00 -0000 Content-Disposition: inline In-Reply-To: <20190925173424.GX9017@brightrain.aerifal.cx> Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:14735 Archived-At: On Wed, Sep 25, 2019 at 01:34:24PM -0400, Rich Felker wrote: > On Wed, Sep 25, 2019 at 01:19:05PM -0400, Rich Felker wrote: > > On Tue, Sep 24, 2019 at 11:16:41AM +0000, changdiankang wrote: > > > time: Fix bug in timer_create > > > > > > When create a SIGEV_THREAD timer, in helper thread "start", > > > "id" is assigned with self->timer_id immediately after pthread_create. > > > But the real timer_id is assigned to self->timer_id after SYS_timer_create. > > > So "id" in "start" always be 0, and timer_delete will always fail. > > > Put assignement of "id" after pthread_barrier_wait can fix this bug. > > > > > > Signed-off-by: Chang Diankang > > > > > > > diff --git a/src/time/timer_create.c b/src/time/timer_create.c > > > index c5e40a1..4172b9e 100644 > > > --- a/src/time/timer_create.c > > > +++ b/src/time/timer_create.c > > > @@ -48,13 +48,14 @@ static void *start(void *arg) > > > { > > > pthread_t self = __pthread_self(); > > > struct start_args *args = arg; > > > - int id = self->timer_id; > > > + int id; > > > jmp_buf jb; > > > > > > void (*notify)(union sigval) = args->sev->sigev_notify_function; > > > union sigval val = args->sev->sigev_value; > > > > > > pthread_barrier_wait(&args->b); > > > + id = self->timer_id; > > > for (;;) { > > > siginfo_t si; > > > while (sigwaitinfo(SIGTIMER_SET, &si) < 0); > > > > > > > Thanks! In the future, please send patches as attachments rather than > > inline since it looks like your mailer software is corrupting the > > patch (converting tabs to spaces, maybe other problems too). I can > > apply it manually though. > > Actually, I don't think this patch is a complete fix. After the > barrier wait passes, the caller can immediately call timer_delete, > which sets the sign bit of ->timer_id, possibly before the start > function reads the initial value of ->timer_id and saves it. > > I'll extend this to a full fix. Just masking off the bit would be a > hack that would work, but would still leave the formal data race, so I > think it would make sense to just drop use of the barrier and instead > use a semaphore that both sides have to post (timer thread does > wait->post, calling thread does post->wait). That would be nice in > terms of getting rid of linking dependency on pthread barriers too. It turns out it was easiest to just move the load of timer_id to the deletion code path. All of this will go away at some point anyway when we stop using kernel timer objects for SIGEV_THREAD timers and instead just use clock_nanosleep and manual overrun counting. Rich