From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <musl-return-20413-ml=inbox.vuxu.org@lists.openwall.com>
Received: from second.openwall.net (second.openwall.net [193.110.157.125])
	by inbox.vuxu.org (Postfix) with SMTP id 31B0B214E5
	for <ml@inbox.vuxu.org>; Sun, 18 Feb 2024 21:33:03 +0100 (CET)
Received: (qmail 9715 invoked by uid 550); 18 Feb 2024 20:29:48 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Reply-To: musl@lists.openwall.com
Received: (qmail 9683 invoked from network); 18 Feb 2024 20:29:48 -0000
Date: Sun, 18 Feb 2024 15:33:06 -0500
From: Rich Felker <dalias@libc.org>
To: Valery Ushakov <uwe@stderr.spb.ru>
Cc: musl@lists.openwall.com, toybox <toybox@lists.landley.net>
Message-ID: <20240218203306.GM4163@brightrain.aerifal.cx>
References: <349f4e17-8027-c521-eeb3-aa69e8f2b5a4@landley.net>
 <ZdEo_58yYjO6wl8y@snips.stderr.spb.ru>
 <20240218013428.GJ4163@brightrain.aerifal.cx>
 <20240218014049.GK4163@brightrain.aerifal.cx>
 <ZdH-SF2_dzVR4hJe@snips.stderr.spb.ru>
 <20240218143312.GL4163@brightrain.aerifal.cx>
 <ZdIdBj000y2-RpOS@snips.stderr.spb.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <ZdIdBj000y2-RpOS@snips.stderr.spb.ru>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [musl] Re: Not sure how to debug this one.

On Sun, Feb 18, 2024 at 06:06:46PM +0300, Valery Ushakov wrote:
> On Sun, Feb 18, 2024 at 09:33:13 -0500, Rich Felker wrote:
> 
> > On Sun, Feb 18, 2024 at 03:55:36PM +0300, Valery Ushakov wrote:
> > > On Sat, Feb 17, 2024 at 20:40:50 -0500, Rich Felker wrote:
> > > 
> > > > due to incorrect base address register when attempting to reload the
> > > > saved value of r8, the caller's value of r8 was not preserved.
> > > > ---
> > > >  src/signal/sh/sigsetjmp.s | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/src/signal/sh/sigsetjmp.s b/src/signal/sh/sigsetjmp.s
> > > > index 1e2270be..f0f604e2 100644
> > > > --- a/src/signal/sh/sigsetjmp.s
> > > > +++ b/src/signal/sh/sigsetjmp.s
> > > > @@ -27,7 +27,7 @@ __sigsetjmp:
> > > >  
> > > >  	mov.l 3f, r0
> > > >  4:	braf r0
> > > > -	 mov.l @(4+8,r4), r8
> > > > +	 mov.l @(4+8,r6), r8
> > > >  
> > > >  9:	mov.l 5f, r0
> > > >  6:	braf r0
> > > 
> > > That takes care of restoring caller's r8 for the first return from
> > > sigsetjmp, but isn't there still the problem that the jump buffer
> > > contains the wrong one, so on the second return from sigsetjmp the
> > > caller will have clobbered r8?
> > > 
> > > Sorry for a drive-by reply.  I'll try to take a closer look in the
> > > evening.
> > 
> > No, that's the return path for both returns.
> >
> > The whole reason a call-saved register like r8 is used here is so
> > that we can return twice into the body of sigsetjmp, in order to
> > tailcall __sigsetjmp_tail at both the first return and subsequent
> > return.
> 
> Doh, right!  Sorry.  A comment to that effect to alert the reader
> would certainly have helped :) Neat trick that I missed on the quick
> reading.

Yes. Perhaps a single comment in each asm file pointing to a common
document location (the dummy sigsetjmp.c file would be a good
candidate) would be a good approach. This could also document what
needs to be done when writing a new port.

> > This is what makes it possible to restore the signal mask from the
> > returned-to frame rather than the returning-from frame (which is why
> > the attached doesn't crash with stack overflow on musl like it does
> > on glibc).
> 
> Restoring the context in siglongjmp should not be a problem per-se.
> NetBSD libc does that and the example code doesn't crash there (quick
> unscientific test on a ppc that I happen to have a terminal open on).
> But then NetBSD libc doesn't bother to carefully factor that code to
> minimize the need for MD asm.
> 
> Thanks, and sorry for the noise.

If you restore the signal mask from the returning context rather than
in the returned-to context, there's always the possibility of stack
overflow; in the worst case, this happens on the sigaltstack where
you're specifically taking measures to avoid stack overflow being a
fatal error. The test program is artificial, but the real-world way
this would happen is getting a flood of signals like SIGINT or SIGTSTP
or something coming in faster than you can respond to them, so that
every time you try to return via siglongjmp, you actually consume
another stack frame on the signal stack.
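
To make the flood scenario concrete, here is a minimal sketch (not the
test program attached earlier in the thread). All names in it
(run_flood, on_usr1, ALT_STACK_SIZE, spread) are made up for
illustration, and the alternate stack is deliberately oversized and the
signal count bounded so that it completes even where handler frames
pile up. Each handler run leaves one more SIGUSR1 pending before
calling siglongjmp, so the next signal is delivered the moment the mask
is restored. If the mask is restored in the returned-to frame, each
handler should start at the same place on the alternate stack and
spread should come out as 0; if it is restored while still in the
handler's frame, one would expect spread to grow with n.

```c
#define _XOPEN_SOURCE 700
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size, chosen far larger than needed so that nested
 * handler frames cannot overflow in this bounded demonstration. */
#define ALT_STACK_SIZE (1 << 20)

static sigjmp_buf env;
static volatile sig_atomic_t handled;   /* handler invocations so far */
static volatile sig_atomic_t limit;     /* stop re-raising after this many */
static volatile uintptr_t first_frame;  /* stack address seen by first handler */
static volatile uintptr_t last_frame;   /* stack address seen by latest handler */

static void on_usr1(int sig)
{
	volatile char marker;
	uintptr_t here = (uintptr_t)&marker;

	if (handled == 0)
		first_frame = here;
	last_frame = here;
	handled++;

	/* SIGUSR1 is blocked while its handler runs, so this only marks
	 * it pending; it is delivered once the saved mask is restored. */
	if (handled < limit)
		raise(sig);

	siglongjmp(env, 1);
}

/* Returns the number of handler invocations, or -1 if setup failed.
 * *spread receives the distance between the first and last handler
 * frame on the alternate stack. */
long run_flood(int n, uintptr_t *spread)
{
	static char *stack_mem;
	stack_t ss;
	struct sigaction sa;

	if (!stack_mem && !(stack_mem = malloc(ALT_STACK_SIZE)))
		return -1;
	ss.ss_sp = stack_mem;
	ss.ss_size = ALT_STACK_SIZE;
	ss.ss_flags = 0;
	if (sigaltstack(&ss, 0))
		return -1;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = on_usr1;
	sa.sa_flags = SA_ONSTACK;       /* run the handler on the alternate stack */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, 0))
		return -1;

	handled = 0;
	limit = n;
	first_frame = last_frame = 0;

	/* Save the current mask (SIGUSR1 unblocked) along with the context. */
	if (sigsetjmp(env, 1) == 0)
		raise(SIGUSR1);

	/* Reached only after the final handler has jumped back here. */
	*spread = first_frame > last_frame ? first_frame - last_frame
	                                   : last_frame - first_frame;
	return handled;
}
```

Calling run_flood(8, &spread) from main and printing spread is enough
to compare implementations. With a realistic SIGSTKSZ-sized stack and
an unbounded flood, the nested case is the one that eventually runs off
the end of the signal stack.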

If NetBSD didn't crash, maybe it just has a much larger default stack
size limit? Or maybe they reload sp before calling sigprocmask? That
would work too, but the reason musl doesn't do it that way is that our
setjmp/longjmp are compatible with an old ABI where there is no extra
space in the jmp_buf for sigjmp_buf stuff.

Rich