On Sat, Feb 17, 2024 at 08:34:28PM -0500, Rich Felker wrote: > On Sun, Feb 18, 2024 at 12:45:35AM +0300, Valery Ushakov wrote: > > On Fri, Feb 16, 2024 at 19:48:27 -0600, Rob Landley wrote: > > > > > https://git.musl-libc.org/cgit/musl/tree/src/signal/sh/sigsetjmp.s > > > > I haven't touched superh asm in a while and the code has zero > > comments (*ugh*), but I *think* sigsetjmp clobbers caller's r8. > > > > r8 is callee saved. sigsetjmp wants to use it to save its env > > argument across the call to __setjmp. So it saves caller's r8 and > > uses r8 to save its env b/c __setjmp it's about to call will clobber > > it. Then __setjmp saves this r8 = env in the jump buf, not caller's > > r8. The instruction in the delay slot of the tail call to > > __sigsetjmp_tail vaguely looks like it might have been intended to > > patch it, but it loads r8, not stores it. I'm not sure why it would > > want to load r8 at that point. > > > > Sorry, I only skimmed through the code and as I said, there're no > > comments (which for asm code is borderline criminal, IMHO :) so I > > might be completely misinterpreting what this code does... > > I think you've stumbled across the bug even if you don't realize it. > :-) > > The idea of the sigsetjmp implementation is explained in commit > 583e55122e767b1586286a0d9c35e2a4027998ab. As you've noted, comments > would be nice, but I'm not sure where they belong since it's the same > logic on every arch. > > sigsetjmp wants to save a position in itself for longjmp to return to, > but because sigsetjmp will be returning, it can't save any state on > the stack; it doesn't have a stack frame. The only storage it has > available is call-saved registers, which longjmp will necessarily > restore for it. So, it picks one of its caller's call-saved registers > (r8) and stores that inside the sigjmp_buf, then uses that register to > preserve the pointer to the sigjmp_buf to access after setjmp returns > (either the first time or the second time). > > The only problem with the sh implementation is that, due to limited > displacement range for load/store instructions, we have to add 60 to > the sigjmp_buf base address to get the base register for accessing the > saved return address and r8. This takes place in a temp register r6. > However, the last line of this code path, the delay slot after the > tail call to __sigsetjmp_tail, erroneously uses r4 (the base > sigjmp_buf pointer) as the base for restoring r8: > > mov.l @(4+8,r4), r8 > > This corrupts the caller's saved register file. If I'm not mistaken, > it overwrites the caller's r8 with the caller's r11. > > Presumably Rob didn't actually hit a crash in sigsetjmp, but just > inferred the crash was there via printf debugging. The actual crash > must be in the caller once it gets back control with the wrong value > in r8. > > This has been an area I've wanted to have testing for for a long time, > but I don't have a really good idea for how to implement the test. We > would want the compiler to generate code that puts lots of > intermediates in as many call-saved registers as possible, calls > setjmp, then calls another function that *also* puts pressure on the > register allocation and then calls longjmp. If it's okay for the test > to be arch-specific, this could just work via __asm__ register > constraints and a call to setjmp from asm, but I'm not sure if that > would be a good way to do things in libc-test... > > In any case, thanks to everyone who participated in this. This is a > rather bad bug and a nice one to get fixed up in the release window! The attached patch should fix it. Rich