From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_MSPIKE_H2 autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 18088 invoked from network); 21 Nov 2022 23:42:37 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 21 Nov 2022 23:42:37 -0000 Received: (qmail 30188 invoked by uid 550); 21 Nov 2022 23:42:33 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 30153 invoked from network); 21 Nov 2022 23:42:32 -0000 Date: Mon, 21 Nov 2022 18:42:17 -0500 From: Rich Felker To: Markus Wichmann Cc: musl@lists.openwall.com Message-ID: <20221121234216.GP29905@brightrain.aerifal.cx> References: <20221121205413.GC2497@voyager> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20221121205413.GC2497@voyager> User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: [musl] Stack error through asynchronous cancellation On Mon, Nov 21, 2022 at 09:54:13PM +0100, Markus Wichmann wrote: > Hi all, > > I noticed that when a thread is cancelled while asynchronous > cancellation is in effect, then execution is redirected to __cp_cancel, > but __cp_cancel is a label only meant to undo the stack manipulation of > __syscall_cp, then jump to __cancel. This means, during asynchronous > cancellation, invalid values will be read from stack, and the stack > pointer may be set to a misaligned value before continuing to > __cp_cancel. For the most part, it will not matter, since __cancel will > not return if asynchronous cancellation is in effect, but there is still > something iffy about this whole behavior. > > I wonder if we maybe need a new label __cp_cancel_async, that > transitions to __cancel from unknown legal processor state. On x86_64, > for example, it is possible (though unlikely) that the direction flag is > set, thus leading to undefined behavior when __cancel is invoked with > the flag set, even though the ABI says it is not supposed to be set on > entry to any function. > > On architectures that do actually adjust the stack in __syscall_cp, the > changes to the stack pointer could also mean overwriting a thread > cancellation handler. And while pthread_cleanup_push() and ..._pop() are > not async-cancel-safe, can they not be invoked under deferred > cancellation and then have asynchronous cancellation be used between > them? Thanks! This was also reported to me off-list recently, and I have a fix; I just hadn't pushed it yet. The behavior of changing the return address is only valid and only makes sense for synchronous cancellation at a cancellation point. For async, we just need to go back to the behavior prior to commit 102f6a01e249ce4495f1119ae6d963a2a4a53ce5, where __cancel is called from the cancellation signal handler. There are other ways but they would require a lot of arch-specific knowledge (like DF you mentioned) to construct a valid frame to return into, and that's just not needed. Rich