Date: Mon, 6 Jul 2020 18:22:02 -0400
From: Rich Felker
To: Hydro Flask
Cc: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
Message-ID: <20200706222202.GN6430@brightrain.aerifal.cx>
References: <20200630044323.GD6430@brightrain.aerifal.cx>
 <20200630092644.GE6430@brightrain.aerifal.cx>
 <20200630145851.GD13001@voyager>
 <275470aa6820d420339929a1fe409d89@yqxmail.com>
 <477cc243-b950-3363-e9f4-4c8a203b6bea@samersoff.net>
 <20200630195409.GH6430@brightrain.aerifal.cx>
 <300a1bdb9e9041bf05312ef032cbdc66@yqxmail.com>
 <20200706220024.GJ6430@brightrain.aerifal.cx>
Subject: Re: [musl] Potential deadlock in pthread_kill()

On Mon, Jul 06, 2020 at 03:14:43PM -0700, Hydro Flask wrote:
> On 2020-07-06 15:00, Rich Felker wrote:
> >Yes, I see it clearly now. Sorry it took a while. I have prepared the
> >attached patch which I'll push soon if there are no problems.
>
> Needs one more tiny tweak. I noticed that pthread_cancel() calls
> pthread_kill(). That means pthread_kill() must be async-cancel-safe.
> If an asynchronous cancellation happens after pthread_kill() grabs
> the killlock, then it will deadlock because the asynchronous
> pthread_exit(PTHREAD_CANCELED) call will then recursively try to
> grab killlock.
>
> The solution as far as I can tell is to not just block app signals
> when grabbing killlock, but all signals.

Indeed. It'd also work to disable async cancellation for the duration
of the pthread_cancel call, but that's almost surely more work.

Are you in agreement that it suffices for only pthread_kill to block
all signals? (Still blocking just app signals everywhere else) The
scheduling functions could be changed too, but I'm hesitant to change
pthread_exit without thinking about it further since it has a lot more
subtleties. And I think only pthread_kill needs it since it's the only
one that needs to be AC-safe.

Rich