mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: Rob Landley <rob@landley.net>
Cc: toybox <toybox@lists.landley.net>, musl <musl@lists.openwall.com>
Subject: Re: [musl] Not sure how to debug this one.
Date: Sat, 17 Feb 2024 12:02:06 -0500
Message-ID: <20240217170205.GH4163@brightrain.aerifal.cx>
In-Reply-To: <349f4e17-8027-c521-eeb3-aa69e8f2b5a4@landley.net>

On Fri, Feb 16, 2024 at 07:48:27PM -0600, Rob Landley wrote:
> While grinding away at release prep, I hit a WEIRD one. The qemu-system-sh4
> target got broken by commit 3e0e8c687eee (PID 1 exits trying to run the init
> script), which is the commit that changed the stdout buffering type.
> 
> It's not the kernel, if I use the last release kernel with the new root
> filesystem I see the problem, and newly built kernel from today's git with last
> release's initramfs.cpio.gz boots to a shell prompt.
> 
> The actual _problem_ is that sigsetjmp() is faulting (in sh.c function
> run_command()), for NO OBVIOUS REASON. Calling memset() to zero the struct
> before the sigsetjmp() works fine, but the sigsetjmp() call (built against
> musl-libc) never returns.
> 
> Not siglongjmp, _sigsetjmp_. Which means it's failing somewhere in:
> 
> https://git.musl-libc.org/cgit/musl/tree/src/signal/sh/sigsetjmp.s
> 
> And I dunno how to stick a printf into superh assembly code.

Rather than "stick a printf in there", can you identify (with gdb or
strace or qemu user execution tracing) exactly which instruction it's
crashing at, and the register values at the point of crash?

Provided it was called with a valid pointer to the sigjmp_buf, there
should be no way the initial call to sigsetjmp can segfault. The only
memory accesses it makes are to that object. It does make a call to
setjmp, which in theory could clobber the call-saved r8 containing
sigjmp_buf address, but setjmp does not do that. It's possible that,
on second return, this has been clobbered; even a single-byte buffer
overflow into the sigjmp_buf would do that, and sh may be unique in
having the relevant register at the beginning of the buffer, which
could explain it happening only on sh. But that would affect the second
return, not the first.
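
To make the first-return versus second-return distinction concrete, here
is a minimal C sketch. It is not taken from toybox or musl, and the
function name second_return_value is hypothetical. sigsetjmp returns 0
when called directly and returns a second time, with the value passed to
siglongjmp, when the saved context is restored. The contents of the
sigjmp_buf are only read back on that second path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>

/* Hypothetical helper: returns the value observed on the second return. */
int second_return_value(void)
{
    sigjmp_buf env;
    /* volatile: modified between sigsetjmp and siglongjmp, read afterwards */
    volatile int returns = 0;

    switch (sigsetjmp(env, 1)) {
    case 0:                     /* first return: direct call, env is written */
        returns++;
        assert(returns == 1);
        siglongjmp(env, 42);    /* env is read here; control does not come back */
    case 42:                    /* second return: resumed via siglongjmp */
        returns++;
        assert(returns == 2);
        return 42;
    default:                    /* not expected in this sketch */
        return -1;
    }
}
```

Calling second_return_value() from a test program should yield 42 if both
returns happen as described.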

> The sigjmp_buf lives on the stack, but I confirmed it's 8 byte aligned, and not
> even straddling a page boundary. I can access variables I stick before and after
> it, so it can't be some kind of "fault due to guard page" weirdness? (I suppose
> the optimizer may be invalidating that test, I could try adding "volatile"...)
> 
> While debugging I made the problem GO AWAY more than once by sticking printfs()
> and similar into the code, but that's not FIXING it. Adding another sigjmp_buf
> declaration and call to sigsetjmp() right at the start of the function works
> fine (although the other one in the place it's in now still fails). I confirmed

This all suggests that there's a buffer overflow and shuffling things
around on the stack is preventing it. Have you tried running (even on
unaffected archs) under valgrind to look for such errors?
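
Where valgrind is not available for the target, one alternative that
could be tried (an assumption on my part, not something proposed in this
thread) is to bracket the suspect buffer with a canary word and check it
before the jump buffer is used. The sketch below is hypothetical: the
struct guarded_frame layout, the CANARY value, and the helper names are
made up for illustration, and the overrun is simulated through a byte
pointer to the whole struct so the demo itself stays well-defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <string.h>

#define CANARY 0xA5A5A5A5u      /* arbitrary illustrative pattern */

/* Hypothetical layout: a small buffer ahead of the jump buffer, with a
   canary word between them to catch writes that run past name[]. */
struct guarded_frame {
    char name[8];
    unsigned canary;
    sigjmp_buf env;
};

static void frame_init(struct guarded_frame *f)
{
    memset(f, 0, sizeof *f);
    f->canary = CANARY;
}

static int frame_intact(const struct guarded_frame *f)
{
    return f->canary == CANARY;
}

/* Simulates a one-byte overrun by zeroing the first byte of the canary.
   Returns 1 if the check notices the damage, 0 otherwise. */
int overrun_is_detected(void)
{
    struct guarded_frame f;
    unsigned char *bytes = (unsigned char *)&f;

    frame_init(&f);
    assert(frame_intact(&f));   /* untouched frame passes the check */

    bytes[offsetof(struct guarded_frame, canary)] = 0;
    return !frame_intact(&f);
}
```

A check like frame_intact() placed just before siglongjmp would turn a
silent clobber into an explicit failure, at the cost of only catching
overruns that pass through the canary.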

Rich
