From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL,SUBJ_OBFU_PUNCT_FEW autolearn=ham autolearn_force=no version=3.4.2 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by inbox.vuxu.org (OpenSMTPD) with SMTP id 8bf542a1 for ; Mon, 17 Feb 2020 16:29:58 +0000 (UTC) Received: (qmail 17604 invoked by uid 550); 17 Feb 2020 16:29:56 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 17586 invoked from network); 17 Feb 2020 16:29:55 -0000 Date: Mon, 17 Feb 2020 10:15:27 -0500 From: Rich Felker To: zuotina Cc: musl@lists.openwall.com Message-ID: <20200217151527.GV1663@brightrain.aerifal.cx> References: <19e6adf5.abdf.170487c3154.Coremail.zuotingyang@126.com> <20200215172726.GP1663@brightrain.aerifal.cx> <23f3de92.ba55.17051fd5c6b.Coremail.zuotingyang@126.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <23f3de92.ba55.17051fd5c6b.Coremail.zuotingyang@126.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: Rich Felker Subject: Re: [musl] Re:Re: [musl] [timer] timer_delete function async problem On Mon, Feb 17, 2020 at 03:12:03PM +0800, zuotina wrote: > The pseudo-code example program as follows. There are two processes A and B. > process A: > void timer_handler(union sigval arg) > { > pctx = (struct ctx *)arg.sival_ptr; > if (pctx == NULL) > return; > // Here may be scheduled to timer_release of thread B. > // When scheduled again, the pctx was illeagal, so panic. > lock(pctx->lock); > /* do something with pctx */ > unlock(pctx->lock); > } > void create_timer_function() > { > struct sigevent sig; > sig.sigev_notify = SIGEV_THREAD; > sig.sigev_value.sival_ptr = pctx; > sig.sigev_notify_function = timer_handler; > timer_create(CLOCK_REALTIME, &sig, &pctx->timerid); > } > > > process B: > void timer_release() > { > lock(pctx->lock); > timer_delete(pctx->timerid); > unlock(pctx->lock); > // here will do something... > free(pctx); > } > > > After the call to timer_delete is made but before the signal is sent, > a new handler which is not already-running will be coming at uncertain time. > I've tried synchronization and mutual between 'timer_release' and 'timer_handler', > neither can be resolved. > How to solve the panic in the example, hope to give some suggestions. As written, the code has use-after-free, independent of what timer_delete does. It's always possible that the timer thread is at the entry point of timer_handler when thread B calls timer_delete. Then thread B frees an object which the timer thread is accessing. Nothing timer_delete could do could prevent this situation from arising. At first glance, it seems like this is a fundamental design flaw in the SIGEV_THREAD timer API -- there's no way to determine whether there's still (at least) one thread that will run, since the state of having started but not executed any application code in the handler yet is not observable/distinguishable from not having run at all. If this analysis is correct, then it seems like the timer handler thread must use at least one object of permanent lifetime (i.e. that's never freed) to synchronize and determine if it can access other objects. If you can assume (as is true in the musl implementation, but doesn't seem to be guaranteed) that at most one timer handler thread for a given timer is runnable at once, then you can simply have the timer handler thread be responsible for deleting the timer and freeing the associated data object based on a flag that another thread set inside the object (under appropriate locking). If you can't assume this, you can make it true by not using auto-rearming timers (i.e. no it_interval) and having the timer thread re-arm itself (it_value) every time it runs. None of this is nice, and suggests that SIGEV_THREAD timers (and the POSIX timers API in general) is poorly designed and difficult to use safely. Really, the right solution here is just making your own thread that calls clock_nanosleep (probably with TIMER_ABSTIME) in a loop and runs the code you want each time the sleep finishes. This is far easier to get right and doesn't have complex lifetime issues to deal with. Rich