From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-3.1 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 Received: from second.openwall.net (second.openwall.net [193.110.157.125]) by inbox.vuxu.org (Postfix) with SMTP id 3E5D92A37E for ; Sun, 18 Feb 2024 02:34:25 +0100 (CET) Received: (qmail 7695 invoked by uid 550); 18 Feb 2024 01:31:12 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 7659 invoked from network); 18 Feb 2024 01:31:12 -0000 Date: Sat, 17 Feb 2024 20:34:28 -0500 From: Rich Felker To: Valery Ushakov Cc: musl@lists.openwall.com, Rob Landley , toybox Message-ID: <20240218013428.GJ4163@brightrain.aerifal.cx> References: <349f4e17-8027-c521-eeb3-aa69e8f2b5a4@landley.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: [musl] Re: Not sure how to debug this one. On Sun, Feb 18, 2024 at 12:45:35AM +0300, Valery Ushakov wrote: > On Fri, Feb 16, 2024 at 19:48:27 -0600, Rob Landley wrote: > > > https://git.musl-libc.org/cgit/musl/tree/src/signal/sh/sigsetjmp.s > > I haven't touched superh asm in a while and the code has zero > comments (*ugh*), but I *think* sigsetjmp clobbers caller's r8. > > r8 is callee saved. sigsetjmp wants to use it to save its env > argument across the call to __setjmp. So it saves caller's r8 and > uses r8 to save its env b/c __setjmp it's about to call will clobber > it. Then __setjmp saves this r8 = env in the jump buf, not caller's > r8. The instruction in the delay slot of the tail call to > __sigsetjmp_tail vaguely looks like it might have been intended to > patch it, but it loads r8, not stores it. I'm not sure why it would > want to load r8 at that point. > > Sorry, I only skimmed through the code and as I said, there're no > comments (which for asm code is borderline criminal, IMHO :) so I > might be completely misinterpreting what this code does... I think you've stumbled across the bug even if you don't realize it. :-) The idea of the sigsetjmp implementation is explained in commit 583e55122e767b1586286a0d9c35e2a4027998ab. As you've noted, comments would be nice, but I'm not sure where they belong since it's the same logic on every arch. sigsetjmp wants to save a position in itself for longjmp to return to, but because sigsetjmp will be returning, it can't save any state on the stack; it doesn't have a stack frame. The only storage it has available is call-saved registers, which longjmp will necessarily restore for it. So, it picks one of its caller's call-saved registers (r8) and stores that inside the sigjmp_buf, then uses that register to preserve the pointer to the sigjmp_buf to access after setjmp returns (either the first time or the second time). The only problem with the sh implementation is that, due to limited displacement range for load/store instructions, we have to add 60 to the sigjmp_buf base address to get the base register for accessing the saved return address and r8. This takes place in a temp register r6. However, the last line of this code path, the delay slot after the tail call to __sigsetjmp_tail, erroneously uses r4 (the base sigjmp_buf pointer) as the base for restoring r8: mov.l @(4+8,r4), r8 This corrupts the caller's saved register file. If I'm not mistaken, it overwrites the caller's r8 with the caller's r11. Presumably Rob didn't actually hit a crash in sigsetjmp, but just inferred the crash was there via printf debugging. The actual crash must be in the caller once it gets back control with the wrong value in r8. This has been an area I've wanted to have testing for for a long time, but I don't have a really good idea for how to implement the test. We would want the compiler to generate code that puts lots of intermediates in as many call-saved registers as possible, calls setjmp, then calls another function that *also* puts pressure on the register allocation and then calls longjmp. If it's okay for the test to be arch-specific, this could just work via __asm__ register constraints and a call to setjmp from asm, but I'm not sure if that would be a good way to do things in libc-test... In any case, thanks to everyone who participated in this. This is a rather bad bug and a nice one to get fixed up in the release window! Rich