* SIGSEGV related to threads since 1.1.20?
From: Sebastian Kemper @ 2018-11-10 23:31 UTC (permalink / raw)
To: musl
Hello all,
I've got an issue with mariadb segfaulting. And apparently it has to do
with the switch from musl 1.1.19 to 1.1.20.
First off, I'm not a programmer, so the info below might be warped a
bit.
I maintain the mariadb package on OpenWrt. There was a report on the
issues tracker about a segfault:
https://github.com/openwrt/packages/issues/7230
I installed a current openwrt snapshot today, then installed
mariadb-server. Afterwards I ran
mysql_install_db --force --basedir=/usr
to init the database. And then there was a segfault:
Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144829] do_page_fault(): sending SIGSEGV to mysqld for invalid write access to 00000000
Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144839] epc = 77fc2058 in libc.so[77f4a000+93000]
Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144863] ra = 77fc1fa0 in libc.so[77f4a000+93000]
The messages look the same as in the report, although the reporter got
there a different way (he attempted to connect to the running server,
whereas I tried to create a DB).
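A note on reading the kernel lines above: the bracketed part after
libc.so appears to be the mapping's start address plus its length, so
subtracting the start from epc (or ra) should give an offset into
libc.so that can be looked up in an unstripped copy of the library. A
minimal sketch of that arithmetic follows; the helper name
module_offset is hypothetical and only illustrates the calculation.

```c
#include <stdint.h>

/*
 * Hypothetical helper: if addr lies inside the mapping
 * [base, base + size), store its offset from base in *off and return 1.
 * Otherwise return 0 and leave *off untouched.
 */
static int module_offset(uint32_t addr, uint32_t base, uint32_t size,
                         uint32_t *off)
{
    if (addr < base || addr - base >= size)
        return 0;               /* address is outside this mapping */
    *off = addr - base;
    return 1;
}
```

With the values from the log, epc = 77fc2058 and a mapping of
77f4a000+93000 give an offset of 0x78058 into libc.so, and ra = 77fc1fa0
gives 0x77fa0.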
This is on an old dlink router (mips_24kc, ar71xx). The reporter used
something else (mips32r2, mir3g).
I went and compiled mariadb with debug symbols and installed the
unstripped binaries. Then I ran gdbserver on the mips device and
connected to it from my laptop. When I ran the commands in gdb I got
this output:
(gdb) c
Continuing.
Thread 2 "mysqld" received signal SIGSEGV, Segmentation fault.
__pthread_timedjoin_np (t=0x6bdced60, res=0x0, at=0x0) at src/thread/pthread_join.c:15
15 if (state >= DT_DETACHED) a_crash();
(gdb) bt
#0 __pthread_timedjoin_np (t=0x6bdced60, res=0x0, at=0x0) at src/thread/pthread_join.c:15
#1 0x006bf754 in handle_bootstrap_impl (thd=<optimized out>) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:950
#2 0x006bfd58 in do_handle_bootstrap (thd=<optimized out>) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:1094
#3 0x006bfdfc in handle_bootstrap (arg=0x1dc7448) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:1077
#4 0x77fd10fc in start (p=0x77fd10fc <start+100>) at src/thread/pthread_create.c:147
#5 0x77f6702c in __clone () at src/thread/mips/clone.s:32
Backtrace stopped: frame did not save the PC
So apparently __pthread_timedjoin_np gets some NULL input and then the
program segfaults. I reran this with a breakpoint on the function and it
got called before the segfault and in these calls the args were not
NULL.
Anyway. I checked on openwrt's github what happened to musl in the past
months. And on Sep 21 musl was upgraded from 1.1.19 to 1.1.20. So I
reverted this commit and compiled 1.1.19. I then just downgraded musl on
the router (on-the-fly). That caused some programs like dropbear to stop
working properly due to missing symbols. OK, expected.
But when I ran
mysql_install_db --force --basedir=/usr
it completed without errors. And once I upgraded to musl 1.1.20 I got
the segfault again.
I was hoping that maybe you could take a look at this :)
Kind regards,
Seb
* Re: SIGSEGV related to threads since 1.1.20?
From: Sebastian Kemper @ 2018-11-10 23:35 UTC (permalink / raw)
To: musl
Oh, please keep me in the loop as I'm not subscribed!
Kind regards,
Seb
* Re: SIGSEGV related to threads since 1.1.20?
From: Rich Felker @ 2018-11-10 23:42 UTC (permalink / raw)
To: Sebastian Kemper; +Cc: musl
On Sun, Nov 11, 2018 at 12:31:45AM +0100, Sebastian Kemper wrote:
> Hello all,
>
> I've got an issue with mariadb segfaulting. And apparently it has to do
> with the switch from musl 1.1.19 to 1.1.20.
>
> First off, I'm not a programmer, so the info below might be warped a
> bit.
>
> I maintain the mariadb package on OpenWrt. There was a report on the
> issues tracker about a segfault:
> https://github.com/openwrt/packages/issues/7230
>
> I installed a current openwrt snapshot today, then installed
> mariadb-server. Afterwards I ran
>
> mysql_install_db --force --basedir=/usr
>
> to init the database. And then there was a segfault:
>
> Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144829] do_page_fault(): sending SIGSEGV to mysqld for invalid write access to 00000000
> Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144839] epc = 77fc2058 in libc.so[77f4a000+93000]
> Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144863] ra = 77fc1fa0 in libc.so[77f4a000+93000]
>
> The messages look the same as in the report, although the reporter got
> there a different way (he attempted to connect to the running server,
> whereas I tried to create a DB).
>
> This is on an old dlink router (mips_24kc, ar71xx). The reporter used
> something else (mips32r2, mir3g).
>
> I went and compiled mariadb with debug symbols and installed the
> unstripped binaries. Then I ran gdbserver on the mips device and
> connected to it from my laptop. When I ran the commands in gdb I got
> this output:
>
> (gdb) c
> Continuing.
>
> Thread 2 "mysqld" received signal SIGSEGV, Segmentation fault.
> __pthread_timedjoin_np (t=0x6bdced60, res=0x0, at=0x0) at src/thread/pthread_join.c:15
> 15 if (state >= DT_DETACHED) a_crash();
> (gdb) bt
> #0 __pthread_timedjoin_np (t=0x6bdced60, res=0x0, at=0x0) at src/thread/pthread_join.c:15
> #1 0x006bf754 in handle_bootstrap_impl (thd=<optimized out>) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:950
> #2 0x006bfd58 in do_handle_bootstrap (thd=<optimized out>) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:1094
> #3 0x006bfdfc in handle_bootstrap (arg=0x1dc7448) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:1077
> #4 0x77fd10fc in start (p=0x77fd10fc <start+100>) at src/thread/pthread_create.c:147
> #5 0x77f6702c in __clone () at src/thread/mips/clone.s:32
> Backtrace stopped: frame did not save the PC
>
> So apparently __pthread_timedjoin_np gets some NULL input and then the
> program segfaults. I reran this with a breakpoint on the function and it
> got called before the segfault and in these calls the args were not
> NULL.
This is an intentional trap for undefined behavior when the caller
attempts to join a detached thread or detach a thread that was not
joinable (already detached or already being joined by another thread).
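To illustrate the rule in the paragraph above, here is a minimal sketch
of the two usage patterns that stay within defined behavior: a joinable
thread that is joined exactly once, and a thread created detached that
is never joined or detached again. This is not MariaDB's code and not
the Alpine patch; the names doubler, run_joinable, struct report,
reporter and run_detached are hypothetical and exist only for this
example.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical worker: hands back twice its argument as the exit value. */
static void *doubler(void *arg)
{
    return (void *)((intptr_t)arg * 2);
}

/* Joinable pattern: default attributes, joined exactly once.
 * Returns 0 on success and stores the worker's result in *out. */
static int run_joinable(intptr_t n, intptr_t *out)
{
    pthread_t t;
    void *res = NULL;
    int err = pthread_create(&t, NULL, doubler, (void *)n);
    if (err) return err;
    err = pthread_join(t, &res);        /* the only join on t */
    if (err) return err;
    *out = (intptr_t)res;
    return 0;
}

/* A detached thread cannot be joined, so its result is reported through
 * shared state guarded by a mutex and condition variable instead. */
struct report {
    pthread_mutex_t lock;
    pthread_cond_t done_cv;
    int done;
    intptr_t input, value;
};

static struct report g_report = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0, 0
};

static void *reporter(void *arg)
{
    struct report *r = arg;
    pthread_mutex_lock(&r->lock);
    r->value = r->input * 2;
    r->done = 1;
    pthread_cond_signal(&r->done_cv);
    pthread_mutex_unlock(&r->lock);
    return NULL;
}

/* Detached pattern: the thread is created detached and is never passed
 * to pthread_join() or pthread_detach() afterwards. Not re-entrant,
 * because it uses the single static g_report. */
static int run_detached(intptr_t n, intptr_t *out)
{
    pthread_t t;
    pthread_attr_t attr;
    int err;

    pthread_mutex_lock(&g_report.lock);
    g_report.done = 0;
    g_report.input = n;
    pthread_mutex_unlock(&g_report.lock);

    err = pthread_attr_init(&attr);
    if (err) return err;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    err = pthread_create(&t, &attr, reporter, &g_report);
    pthread_attr_destroy(&attr);
    if (err) return err;

    pthread_mutex_lock(&g_report.lock);
    while (!g_report.done)              /* wait for the worker's signal */
        pthread_cond_wait(&g_report.done_cv, &g_report.lock);
    *out = g_report.value;
    pthread_mutex_unlock(&g_report.lock);
    return 0;
}
```

Calling pthread_join() or pthread_detach() on the thread started by
run_detached(), or joining the thread in run_joinable() a second time,
would be the kind of misuse the trap is meant to catch.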
In the case of mariadb, it was reported as:
https://jira.mariadb.org/browse/MDEV-17200
and the corresponding Alpine Linux bug:
https://bugs.alpinelinux.org/issues/9407
The patch is available in Alpine Linux's aport repo:
https://git.alpinelinux.org/cgit/aports/tree/main/mariadb/fix-pthread-detach.patch
Rich
* Re: SIGSEGV related to threads since 1.1.20?
From: Sebastian Kemper @ 2018-11-10 23:46 UTC (permalink / raw)
To: Rich Felker; +Cc: musl
On Sat, Nov 10, 2018 at 06:42:59PM -0500, Rich Felker wrote:
> This is an intentional trap for undefined behavior when the caller
> attempts to join a detached thread or detach a thread that was not
> joinable (already detached or already being joined by another thread).
>
> In the case of mariadb, it was reported as:
>
> https://jira.mariadb.org/browse/MDEV-17200
>
> and the corresponding Alpine Linux bug:
>
> https://bugs.alpinelinux.org/issues/9407
>
> The patch is available in Alpine Linux's aport repo:
>
> https://git.alpinelinux.org/cgit/aports/tree/main/mariadb/fix-pthread-detach.patch
>
> Rich
Hello Rich,
Thank you very much!!! I'll take a look at this :-)
Have a great weekend!
Kind regards,
Seb