mailing list of musl libc
* [musl] SIGSEGV/stack overflow in pthread_create - race condition?
@ 2024-09-13 11:30 Lukas Zeller
  2024-09-13 15:25 ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Lukas Zeller @ 2024-09-13 11:30 UTC (permalink / raw)
  To: musl

Hello list,

I hope this is the right place to post the following.

Using OpenWrt 22.03 with musl 1.2.3, I *sometimes* get the following on *some* RPi devices (the faster the device, the more likely):

> Thread 2 "debugtarget" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 4993.5022]
> 0xb6ec42f0 in pk_parse_kite_request () from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> (gdb) bt
> #0  0xb6ec42f0 in pk_parse_kite_request ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #1  0xb6ec457c in pk_parse_pagekite_response ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #2  0xb6ec4b1c in pk_connect_ai ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #3  0xb6ec8494 in pkm_reconnect_all ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #4  0xb6ec79d4 in pkb_check_tunnels ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #5  0xb6ec7b94 in pkb_run_blocker ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #6  0xb6fd0af4 in start (p=0xb6adfd68) at src/thread/pthread_create.c:203
> #7  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #8  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #9  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #10 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #11 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #12 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #13 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #14 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #15 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #16 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #17 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #18 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #19 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #20 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #21 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #22 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #23 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #24 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #25 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #26 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> [... thousands of iterations ...]

Searching the internet, I found that this is not specific to my setup, OpenWrt, or libpagekite, but happens in otherwise completely unrelated setups, such as https://github.com/mikebrady/shairport-sync/issues/388 or https://github.com/void-linux/void-packages/issues/980.

I could not spot any conclusive findings. In the second example they apparently just made the stack bigger to "solve" it, which suggests that the race can sometimes come to a benign end and unwind the stack before it explodes.

Since musl 1.2.3 is not the current version, I applied the changes to pthread_create() between 1.2.3 and current master, which amount to a single commit, "d64148a - fix potential unsynchronized access to killlock state at thread exit". Applying this did not make any difference.

Any ideas on how to start digging deeper? I guess I'm out of my depth here, familiar neither with musl internals nor with pagekitec's (to hack a workaround).

Thanks in advance!

Lukas

--
Lukas Zeller, plan44.ch
luz@plan44.ch - https://plan44.ch






^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-13 11:30 [musl] SIGSEGV/stack overflow in pthread_create - race condition? Lukas Zeller
@ 2024-09-13 15:25 ` Rich Felker
  2024-09-13 15:34   ` alice
  2024-09-13 19:28   ` Lukas Zeller
  0 siblings, 2 replies; 13+ messages in thread
From: Rich Felker @ 2024-09-13 15:25 UTC (permalink / raw)
  To: Lukas Zeller; +Cc: musl

On Fri, Sep 13, 2024 at 01:30:00PM +0200, Lukas Zeller wrote:
> Hello list,
> 
> I hope this is the right place to post the following.
> 
> Using OpenWrt 22.03 with musl 1.2.3, *some* times, on *some* RPi devices (the faster, the more likely) I get the following:
> 
> > Thread 2 "debugtarget" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 4993.5022]
> > 0xb6ec42f0 in pk_parse_kite_request () from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > (gdb) bt
> > #0  0xb6ec42f0 in pk_parse_kite_request ()
> >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #1  0xb6ec457c in pk_parse_pagekite_response ()
> >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #2  0xb6ec4b1c in pk_connect_ai ()
> >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #3  0xb6ec8494 in pkm_reconnect_all ()
> >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #4  0xb6ec79d4 in pkb_check_tunnels ()
> >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #5  0xb6ec7b94 in pkb_run_blocker ()
> >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #6  0xb6fd0af4 in start (p=0xb6adfd68) at src/thread/pthread_create.c:203
> > #7  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > [... thousands of iterations ...]
> 
> Searching the internet i found that this is not specific to my
> setup, OpenWrt or libpagekite, but happens in different, otherwise
> completely unrelated setups, such as
> https://github.com/mikebrady/shairport-sync/issues/388 or
> https://github.com/void-linux/void-packages/issues/980.
> 
> I could not spot any conclusive findings - in the second example,
> apparently they just made the stack bigger to "solve" it, which
> indicates that maybe the race can come to a benign end eventually
> and unwind the stack before it explodes.

Why do you expect this to be a race condition? The backtrace is not
sufficient to show it, but my default assumption would be that this is
simply a stack overflow in the application code, i.e. allocating too
much on the stack (in automatic-storage local variables).

You can increase the default thread stack size at link time with
-Wl,-z,stack-size=N, where N is the size you want (the default is
128k, so increase from there), or make the program explicitly request
the amount of space it needs with the pthread attribute functions.

Rich


* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-13 15:25 ` Rich Felker
@ 2024-09-13 15:34   ` alice
  2024-09-13 19:26     ` Lukas Zeller
  2024-09-13 19:28   ` Lukas Zeller
  1 sibling, 1 reply; 13+ messages in thread
From: alice @ 2024-09-13 15:34 UTC (permalink / raw)
  To: musl, Lukas Zeller

On Fri Sep 13, 2024 at 5:25 PM CEST, Rich Felker wrote:
> On Fri, Sep 13, 2024 at 01:30:00PM +0200, Lukas Zeller wrote:
> > Hello list,
> > 
> > I hope this is the right place to post the following.
> > 
> > Using OpenWrt 22.03 with musl 1.2.3, *some* times, on *some* RPi devices (the faster, the more likely) I get the following:
> > 
> > > Thread 2 "debugtarget" received signal SIGSEGV, Segmentation fault.
> > > [Switching to Thread 4993.5022]
> > > 0xb6ec42f0 in pk_parse_kite_request () from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > > (gdb) bt
> > > #0  0xb6ec42f0 in pk_parse_kite_request ()
> > >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > > #1  0xb6ec457c in pk_parse_pagekite_response ()
> > >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > > #2  0xb6ec4b1c in pk_connect_ai ()
> > >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > > #3  0xb6ec8494 in pkm_reconnect_all ()
> > >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > > #4  0xb6ec79d4 in pkb_check_tunnels ()
> > >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > > #5  0xb6ec7b94 in pkb_run_blocker ()
> > >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > > #6  0xb6fd0af4 in start (p=0xb6adfd68) at src/thread/pthread_create.c:203
> > > #7  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > > [... thousands of iterations ...]
> > 
> > Searching the internet i found that this is not specific to my
> > setup, OpenWrt or libpagekite, but happens in different, otherwise
> > completely unrelated setups, such as
> > https://github.com/mikebrady/shairport-sync/issues/388 or
> > https://github.com/void-linux/void-packages/issues/980.
> > 
> > I could not spot any conclusive findings - in the second example,
> > apparently they just made the stack bigger to "solve" it, which
> > indicates that maybe the race can come to a benign end eventually
> > and unwind the stack before it explodes.
>
> Why do you expect this is a race condition? The backtrace is not
> sufficient to show it, but my default assumption would just be that
> this is just a stack overflow in the application code, i.e. allocating
> too much on the stack (in automatic storage local variables).
>
> You can increase the default stack size at link time with
> -Wl,-z,stack-size=N where N is the size you want (default 128k so
> increase from there), or make the program explicitly request the
> amount of space it needs with pthread attribute functions.
>
> Rich

something that is probably leading to confusion here is the infinite backtrace
in clone. iirc this was related to the lack of cfi directives on arm for gdb to
unwind correctly? or old debugger, it's been a long time..

the actual code does not loop there forever; all the actual stack use is in the
frames above in the application. it's just an unwinder shortcoming.

something you can check in this case (from memory with lldb on a coredump or the
application when it crashes):

f 0 #crashed frame
reg read $sp #stack pointer
f 7 #clone entry
reg read $sp #stack pointer

and then subtract the two values you get for $sp. you'll see something like
'131000' or bigger; and that means the code overflowed the 128k default musl
stack size.

if you compile the project with gcc -fstack-usage, you can probably look at the
generated .su files and find a function that takes like 400k of stack on entry
(and causes this overflow)



* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-13 15:34   ` alice
@ 2024-09-13 19:26     ` Lukas Zeller
  2024-09-13 19:32       ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Lukas Zeller @ 2024-09-13 19:26 UTC (permalink / raw)
  To: alice, musl

Hi Alice,

> On 13 Sep 2024, at 17:34, alice <alice@ayaya.dev> wrote:
> 
> something that is probably leading to confusion here is the infinite backtrace
> in clone.

Confusing indeed ;-)

> iirc this was related to the lack of cfi directives on arm for gdb to
> unwind correctly? or old debugger, it's been a long time..

Oh, thanks - so that might just be a gdb display artifact?
I am using the toolset of OpenWrt 22.03, which is gcc-11.2.0. So not really old, but definitely not latest.

> the actual code does not loop there forever;

Not forever; it obviously exits from clone() into the thread function pkb_run_blocker() via the start() wrapper eventually, which then hits the stack limit.

> all the actual stack use is in the
> frames above in the application. it's just an unwinder shortcoming.

Maybe, but I'm not sure; see below.

> something you can check in this case (from memory with lldb on a coredump or the
> application when it crashes):
> 
> f 0 #crashed frame
> reg read $sp #stack pointer
> f 7 #clone entry
> reg read $sp #stack pointer
> 
> and then subtract the two values you get for $sp. you'll see something like
> '131000' or bigger; and that means the code overflowed the 128k default musl
> stack size.

Ok, here's what I got:

> (gdb) f 0
> #0  0xb6ec42f0 in pk_parse_kite_request ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> (gdb) info reg $sp
> sp             0xb6adbbd0          0xb6adbbd0
> (gdb) f 7
> #7  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> 23 bl 3f
> (gdb) info reg $sp
> sp             0xb6adfd60          0xb6adfd60

0xb6adfd60 - 0xb6adbbd0 = 0x4190 = 16784

So that child thread put only ~16k on the stack between starting and crashing.


Interestingly, however, each of the previous (bogus?) stack frames seems to have consumed 16 bytes.
This is what I would expect from "stmfd sp!,{r4,r5,r6,r7}" on line 7 of clone.s.

> (gdb) f 8
> #8  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> 23 bl 3f
> (gdb) info reg $sp
> sp             0xb6adfd70          0xb6adfd70
> (gdb) f 9
> #9  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> 23 bl 3f
> (gdb) info reg $sp
> sp             0xb6adfd80          0xb6adfd80
> (gdb) 
> 
> ...
> 
> (gdb) f 1000
> #1000 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> 23 bl 3f
> (gdb) info reg $sp
> sp             0xb6ae3b70          0xb6ae3b70
> (gdb) 

0xb6ae3b70 - 0xb6adfd80 = 0x3DF0 -> ~16k consumed by 1000 "frames"

Unless the shown $sp values are gdb display artifacts as well, these "iterations" DO consume stack space and look very much like a real recursion happening within __clone().

The only path I can imagine would be start() in pthread_create.c:194 somehow causing __clone() to get called again on the same thread instead of calling the actual thread function passed to pthread_create() (pagekite's pkb_run_blocker). That would push those 4 regs / 16 bytes onto the stack.

Why the clone syscall would then not create a new thread, yet still return 0 indicating a new child, I have no idea. I do not understand the preamble to calling the thread function in start(), but I can see that delicate things happen there with a_cas(), __wait() and SYS_set_tid_address.

Does this make any sense to you?

One piece of context that might be relevant: I found that libpagekite calls pthread_create() up to 16 times (MAX_BLOCKING_THREADS) in a row in a quite tight loop:

> int pkb_start_blockers(struct pk_manager *pkm, int n)
> {
>   int i;
>   for (i = 0; i < MAX_BLOCKING_THREADS && n > 0; i++) {
>     if (pkm->blocking_threads[i] == NULL) {
>       pkm->blocking_threads[i] = malloc(sizeof(struct pk_blocker));
>       pkm->blocking_threads[i]->manager = pkm;
>       if (0 > pthread_create(&(pkm->blocking_threads[i]->thread), NULL,
>                              pkb_run_blocker,
>                              (void *) pkm->blocking_threads[i])) {
>         pk_log(PK_LOG_MANAGER_ERROR, "Failed to start blocking thread.");
>         free(pkm->blocking_threads[i]);
>         pkm->blocking_threads[i] = NULL;
>         return (pk_error = ERR_NO_THREAD);
>       }
>       n--;
>     }
>     else {
>       pk_log(PK_LOG_MANAGER_ERROR, "Blocking thread %d already started?", i);
>     }
>   }
>   return 0;
> }

PS: having gdb do "f 1000" took a minute, "f 2000" about ten, and "f 4000" took an hour...

--
Lukas Zeller, plan44.ch
luz@plan44.ch - https://plan44.ch







* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-13 15:25 ` Rich Felker
  2024-09-13 15:34   ` alice
@ 2024-09-13 19:28   ` Lukas Zeller
  2024-09-13 19:54     ` Rich Felker
  1 sibling, 1 reply; 13+ messages in thread
From: Lukas Zeller @ 2024-09-13 19:28 UTC (permalink / raw)
  To: musl, Rich Felker

Hi Rich,

> On 13 Sep 2024, at 17:25, Rich Felker <dalias@libc.org> wrote:
> 
> Why do you expect this is a race condition? The backtrace is not
> sufficient to show it, but my default assumption would just be that
> this is just a stack overflow in the application code, i.e. allocating
> too much on the stack (in automatic storage local variables).

Sorry, I might be misled due to not really understanding the details.

I thought of a race condition because it seems dependent on CPU execution speed: I can swap the SD card from an RPi 2, where it never happens and libpagekite is in heavy use, to an RPi 3 or 4, where it happens in almost 100% of the startups of the pagekite app.

But please see my other message in reply to alice; from the tests I did in the meantime, I don't think this can be a regular stack overflow.

--
Lukas Zeller, plan44.ch
luz@plan44.ch - https://plan44.ch







* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-13 19:26     ` Lukas Zeller
@ 2024-09-13 19:32       ` Rich Felker
  0 siblings, 0 replies; 13+ messages in thread
From: Rich Felker @ 2024-09-13 19:32 UTC (permalink / raw)
  To: Lukas Zeller; +Cc: alice, musl

On Fri, Sep 13, 2024 at 09:26:27PM +0200, Lukas Zeller wrote:
> Hi Alice,
> 
> > On 13 Sep 2024, at 17:34, alice <alice@ayaya.dev> wrote:
> > 
> > something that is probably leading to confusion here is the infinite backtrace
> > in clone.
> 
> Confusing indeed ;-)
> 
> > iirc this was related to the lack of cfi directives on arm for gdb to
> > unwind correctly? or old debugger, it's been a long time..
> 
> Oh, thanks - so that might just be a gdb display artifact?
> I am using the toolset of OpenWrt 22.03, which is gcc-11.2.0. So not really old, but definitely not latest.
> 
> > the actual code does not loop there forever;
> 
> Not forever, it obviously eventually exits from clone() into the thread function pkb_run_blocker() via the start() wrapper.
> Which then hits the stack limit. 
> 
> > all the actual stack use is in the
> > frames above in the application. it's just an unwinder shortcoming.
> 
> Maybe, but I'm not sure, see below
> 
> > something you can check in this case (from memory with lldb on a coredump or the
> > application when it crashes):
> > 
> > f 0 #crashed frame
> > reg read $sp #stack pointer
> > f 7 #clone entry
> > reg read $sp #stack pointer
> > 
> > and then subtract the two values you get for $sp. you'll see something like
> > '131000' or bigger; and that means the code overflowed the 128k default musl
> > stack size.
> 
> Ok, here's what I got:
> 
> > (gdb) f 0
> > #0  0xb6ec42f0 in pk_parse_kite_request ()
> >    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > (gdb) info reg $sp
> > sp             0xb6adbbd0          0xb6adbbd0
> > (gdb) f 7
> > #7  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > 23 bl 3f
> > (gdb) info reg $sp
> > sp             0xb6adfd60          0xb6adfd60
> 
> 0xb6adfd60 - 0xb6adbbd0 = 0x4190 = 16784
> 
> So that child thread has put only 16k on the stack between starting and when it crashes.

That is only going to work if gdb correctly recovered the frame state
for __clone.

> Interestingly, however, the previous (bogus?) stack frames each seem to have consumed 16 bytes.
> This is what I would expect from "stmfd sp!,{r4,r5,r6,r7}" on line 7 of clone.s
> 
> > (gdb) f 8
> > #8  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > 23 bl 3f
> > (gdb) info reg $sp
> > sp             0xb6adfd70          0xb6adfd70
> > (gdb) f 9
> > #9  0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > 23 bl 3f
> > (gdb) info reg $sp
> > sp             0xb6adfd80          0xb6adfd80
> > (gdb) 
> > 
> > ...
> > 
> > (gdb) f 1000
> > #1000 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > 23 bl 3f
> > (gdb) info reg $sp
> > sp             0xb6ae3b70          0xb6ae3b70
> > (gdb) 
> 
> 0xb6ae3b70 - 0xb6adfd80 = 0x3DF0 -> ~16k consumed by 1000 "frames"
> 
> Unless the shown $sp values are gdb display artifacts as well, these
> "iterations" DO consume stack space and very much look like a real
> recursion happening within __clone().

They are not. It is most definitely gdb misinterpreting the process
state and making up these stack offsets to agree with its
interpretation of the call stack.

If you want to see the original stack pointer for the thread, it will
be very close to the value of the thread pointer and you could look
that up.

Rich


* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-13 19:28   ` Lukas Zeller
@ 2024-09-13 19:54     ` Rich Felker
  2024-09-14 11:27       ` Lukas Zeller
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2024-09-13 19:54 UTC (permalink / raw)
  To: Lukas Zeller; +Cc: musl

On Fri, Sep 13, 2024 at 09:28:59PM +0200, Lukas Zeller wrote:
> Hi Rich,
> 
> > On 13 Sep 2024, at 17:25, Rich Felker <dalias@libc.org> wrote:
> > 
> > Why do you expect this is a race condition? The backtrace is not
> > sufficient to show it, but my default assumption would just be that
> > this is just a stack overflow in the application code, i.e. allocating
> > too much on the stack (in automatic storage local variables).
> 
> Sorry, I might be mislead due to not really understanding the details.
> 
> I thought of a race condition because it seems CPU execution time
> dependent - I can swap the SD card from an RPi2 where it never
> happens and libpagekit is in heavy use, and an Rpi3 or 4 where it
> happens in almost 100% of the startups of the pagekite app.
> 
> But please see other message in reply to alice, I don't think this
> can be a regular stack overflow from the tests I did in the
> meantime.

Can you dump the disassembly (disasm command) at the point of crash?
That will show what's attempting to be accessed and what "type of
segfault" it is.

If there's an unpredictable crash, indeed it seems less likely to be
stack overflow. I don't see any way it could be a race condition on
musl's side but it could be one in the application/libpagekite code,
or it could be any sort of hardware or kernel fault.

Rich


* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-13 19:54     ` Rich Felker
@ 2024-09-14 11:27       ` Lukas Zeller
  2024-09-17 10:37         ` Lukas Zeller
  0 siblings, 1 reply; 13+ messages in thread
From: Lukas Zeller @ 2024-09-14 11:27 UTC (permalink / raw)
  To: Rich Felker, musl

> On 13 Sep 2024, at 21:54, Rich Felker <dalias@libc.org> wrote:
> 
> [...]
> Can you dump the disassembly (disasm command) at the point of crash?
> That will show what's attempting to be accessed and what "type of
> segfault" it is.

I will, as soon as I have access to that RPi again (next week), and will post the results.

> If there's an unpredictable crash, indeed it seems less likely to be
> stack overflow. I don't see any way it could be a race condition on
> musl's side but it could be one in the application/libpagekite code,
> or it could be any sort of hardware or kernel fault.

A hardware fault is unlikely - I see the same behaviour on at least two separate RPi 3s. One is in the field; that is how we ran into the problem at all, because it was supposed to be remotely reachable via pagekite and wasn't, until we found that the pagekite daemon crashes every time shortly after start. Then I tried the same locally and got the same behaviour.

I'll investigate further and let you know. Thanks for the help so far!

>> 0xb6adfd60 - 0xb6adbbd0 = 0x4190 = 16784
>> 
>> So that child thread has put only 16k on the stack between starting and when it crashes.
> 
> That is only going to work if gdb correctly recovered the frame state
> for __clone.

I can imagine gdb starting to get confused once it traces back as far as __clone. However, it's a backtrace, so the sp at the crash is real, and unwinding starts from there. I would be very surprised if those first few displayed stack frames were wrong, in particular the sp itself.

After all, isn't the sp the backbone of a backtrace - walking back from the current sp and trying to make sense (frames) of the data found on the way? While tracing the pc or other registers might derail, there's no room for interpretation of the value of sp itself from frame to frame, or is there?

Lukas







* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-14 11:27       ` Lukas Zeller
@ 2024-09-17 10:37         ` Lukas Zeller
  2024-09-17 12:12           ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Lukas Zeller @ 2024-09-17 10:37 UTC (permalink / raw)
  To: Rich Felker, musl; +Cc: alice

Hello Rich,

bottom line: problem solved, you were right :-)

It was a use-after-free problem in libpagekite.

I created PRs on libpagekite [1] and openwrt packages [2] to fix it.

I was mostly misled into "stack overflow" thinking by the gdb backtrace
artifacts looking like a __clone() recursion, and by using that red
herring to search for other, similar-looking problems, some of which
actually *were* stack overflow related.

Thanks for the help and sorry for wasting some of your time.

Details for the record:

> On 13 Sep 2024, at 21:54, Rich Felker <dalias@libc.org> wrote:
> 
> [...]
> Can you dump the disassembly (disasm command) at the point of crash?
> That will show what's attempting to be accessed and what "type of
> segfault" it is.

Did that; it eventually revealed the faulty access, which looks like this
in C (pkproto.c:765):

  /* Cleanup */
  free(copy);

  /* Pass judgement */
  if ('\0' == *public_domain) return pk_err_null(ERR_PARSE_NO_KITENAME);

Unfortunately, public_domain was a pointer into the `copy` buffer freed
on the line before. A very short window, but apparently, together with
the fact that 16 of these threads were fired up in rapid succession,
the page could get unmapped in between.

Regarding the gdb bt artifact: before last night I had never thought
about how difficult stack backtracing gets with the ARM link register
compared to traditional return-address-on-stack. And I learned about
.cfi. Would it be possible to instrument clone.s with .cfi such
that gdb realizes __clone is the origin of a new thread on a new
stack, with nothing to backtrace beyond?
Probably a puzzle for another time...

Best Regards
Lukas

[1] https://github.com/pagekite/libpagekite/pull/79
[2] https://github.com/openwrt/packages/pull/24982

--
Lukas Zeller, plan44.ch
luz@plan44.ch - https://plan44.ch







* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-17 10:37         ` Lukas Zeller
@ 2024-09-17 12:12           ` Rich Felker
  2024-09-17 13:35             ` Markus Wichmann
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2024-09-17 12:12 UTC (permalink / raw)
  To: Lukas Zeller; +Cc: musl, alice

On Tue, Sep 17, 2024 at 12:37:01PM +0200, Lukas Zeller wrote:
> Hello Rich,
> 
> bottom line: problem solved, you were right :-)
> 
> It was a use-after-free problem in libpagekite.
> 
> I created PRs on libpagekite [1] and openwrt packages [2] to fix it.
> 
> I was mostly mislead into "stack overflow" thinking by the gdb backtrace
> artifacts looking like __clone() recursion and using that red herring
> to search for other similar problems, some of which actually *were*
> stack overflow related.
> 
> Thanks for the help and sorry for wasting some of your time.

No problem!

> Details for the record:
> 
> > On 13 Sep 2024, at 21:54, Rich Felker <dalias@libc.org> wrote:
> > 
> > [...]
> > Can you dump the disassembly (disasm command) at the point of crash?
> > That will show what's attempting to be accessed and what "type of
> > segfault" it is.
> 
> Did that, eventually revealed the faulty access, which looks like this in C
> (pkproto.c:765):
> 
>   /* Cleanup */
>   free(copy);
> 
>   /* Pass judgement */
>   if ('\0' == *public_domain) return pk_err_null(ERR_PARSE_NO_KITENAME);
> 
> Unfortunately, public_domain was a pointer into the `copy` buffer freed
> on the line before. A very short window, but with 16 of these threads
> fired up in rapid succession, the page evidently could get unmapped in
> between.

I don't think you were racing with the freeing (which would happen
deterministically in the same thread) but rather had a
nondeterministic order of allocations, where sometimes the freed
object would be the last one in its group (thereby resulting in free
unmapping it) and other times would have other live objects still in
the same allocation group (which would prevent unmapping it).

> Regarding the gdb bt artifact - before last night I had never thought
> about how difficult stack backtracing gets with the ARM link register
> compared to traditional return-address-on-stack. And learned about
> .cfi. Would it be possible to instrument clone.s with .cfi such
> that gdb would realize that __clone is the origin of a new thread
> on a new stack and there is nothing to backtrace beyond?
> Probably a puzzle for another time...

Yes, getting minimal annotation needed for the debugger to handle this
right is an open todo item. Ideally it would be contingent on whether
you enabled debugging so that it wouldn't end up in no-debug-info
builds. Slimming down the asm source files to the minimum that
actually have to be asm source files (like clone, I think), which is
also a wishlist item, would make it less of an invasive change, which
is probably why I've held off on digging into it.

Rich


* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-17 12:12           ` Rich Felker
@ 2024-09-17 13:35             ` Markus Wichmann
  2024-09-17 18:02               ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Wichmann @ 2024-09-17 13:35 UTC (permalink / raw)
  To: musl

On Tue, Sep 17, 2024 at 08:12:41AM -0400, Rich Felker wrote:
> Yes, getting minimal annotation needed for the debugger to handle this
> right is an open todo item. Ideally it would be contingent on whether
> you enabled debugging so that it wouldn't end up in no-debug-info
> builds. Slimming down the asm source files to the minimum that
> actually have to be asm source files (like clone, I think), which is
> also a wishlist item, would make it less of an invasive change, which
> is probably why I've held off on digging into it.
>
> Rich

Would it not be enough to clear fp in the clone child? That should
initialize the frame pointer chain, right? OK, it is one extra
instruction for everyone, but I doubt the impact would even be
measurable.
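Spelled out, the suggestion amounts to a one-instruction change in the child path of clone — register use illustrative here, not the actual musl source:

```asm
@ child lands here after the clone syscall, on its fresh stack
1:	mov fp, #0    @ terminate the frame pointer chain for backtracers
	mov r0, r6    @ hypothetical: argument for the start function
	blx r5        @ hypothetical: call fn(arg)
```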

Ciao,
Markus


* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-17 13:35             ` Markus Wichmann
@ 2024-09-17 18:02               ` Rich Felker
  2024-09-18 14:25                 ` Markus Wichmann
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2024-09-17 18:02 UTC (permalink / raw)
  To: Markus Wichmann; +Cc: musl

On Tue, Sep 17, 2024 at 03:35:20PM +0200, Markus Wichmann wrote:
> On Tue, Sep 17, 2024 at 08:12:41AM -0400, Rich Felker wrote:
> > Yes, getting minimal annotation needed for the debugger to handle this
> > right is an open todo item. Ideally it would be contingent on whether
> > you enabled debugging so that it wouldn't end up in no-debug-info
> > builds. Slimming down the asm source files to the minimum that
> > actually have to be asm source files (like clone, I think), which is
> > also a wishlist item, would make it less of an invasive change, which
> > is probably why I've held off on digging into it.
> 
> Would it not be enough to clear fp in the clone child? That should
> initialize the frame pointer chain, right? OK, it is one extra
> instruction for everyone, but I doubt the impact would even be
> measurable.

Oh, maybe it's not a cfi issue at all, just failure to clear frame
pointer? Fixing that would be perfectly acceptable.

Rich


* Re: [musl] SIGSEGV/stack overflow in pthread_create - race condition?
  2024-09-17 18:02               ` Rich Felker
@ 2024-09-18 14:25                 ` Markus Wichmann
  0 siblings, 0 replies; 13+ messages in thread
From: Markus Wichmann @ 2024-09-18 14:25 UTC (permalink / raw)
  To: musl

On Tue, Sep 17, 2024 at 02:02:50PM -0400, Rich Felker wrote:
> On Tue, Sep 17, 2024 at 03:35:20PM +0200, Markus Wichmann wrote:
> > On Tue, Sep 17, 2024 at 08:12:41AM -0400, Rich Felker wrote:
> > > Yes, getting minimal annotation needed for the debugger to handle this
> > > right is an open todo item. Ideally it would be contingent on whether
> > > you enabled debugging so that it wouldn't end up in no-debug-info
> > > builds. Slimming down the asm source files to the minimum that
> > > actually have to be asm source files (like clone, I think), which is
> > > also a wishlist item, would make it less of an invasive change, which
> > > is probably why I've held off on digging into it.
> >
> > Would it not be enough to clear fp in the clone child? That should
> > initialize the frame pointer chain, right? OK, it is one extra
> > instruction for everyone, but I doubt the impact would even be
> > measurable.
>
> Oh, maybe it's not a cfi issue at all, just failure to clear frame
> pointer? Fixing that would be perfectly acceptable.
>
> Rich

I should add that I don't know if that would fix it, because I don't
know ARM very well. However, the ARM CRT code does initialize fp and the
ARM __clone code does not.

Ciao,
Markus


end of thread, other threads:[~2024-09-18 14:25 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-13 11:30 [musl] SIGSEGV/stack overflow in pthread_create - race condition? Lukas Zeller
2024-09-13 15:25 ` Rich Felker
2024-09-13 15:34   ` alice
2024-09-13 19:26     ` Lukas Zeller
2024-09-13 19:32       ` Rich Felker
2024-09-13 19:28   ` Lukas Zeller
2024-09-13 19:54     ` Rich Felker
2024-09-14 11:27       ` Lukas Zeller
2024-09-17 10:37         ` Lukas Zeller
2024-09-17 12:12           ` Rich Felker
2024-09-17 13:35             ` Markus Wichmann
2024-09-17 18:02               ` Rich Felker
2024-09-18 14:25                 ` Markus Wichmann

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
