* [BUG] Crash due to malloc call in signal handler
From: Antoine C. @ 2019-12-12 18:28 UTC
To: Zsh Workers List

Hello,

I finally found the cause of the frequent crashes I reported one
year ago ( https://www.zsh.org/mla/workers/2019/msg00059.html ).

This is due to malloc calls from signal handler, for instance:

#0  tcache_get (tc_idx=17) at malloc.c:2943
#1  __GI___libc_malloc (bytes=296) at malloc.c:3050
#2  0x000055c2217b27b5 in malloc (size=8) at ./main.c:255
#3  0x000055c2218166f9 in zalloc (size=8) at mem.c:966
#4  0x000055c221806da2 in addbgstatus (pid=11959, status=0) at jobs.c:2192
#5  0x000055c2218478e7 in wait_for_processes () at signals.c:583
#6  0x000055c221847cdc in zhandler (sig=17) at signals.c:648
#7  <signal handler called>
#8  0x00007f8895b69209 in __GI___sigsuspend (set=0x7ffe759b7160) at ../sysdeps/unix/sysv/linux/sigsuspend.c:26
#9  0x000055c221847376 in signal_suspend (sig=17, wait_cmd=1) at signals.c:393
#10 0x000055c2218054e8 in waitforpid (pid=11953, wait_cmd=1) at jobs.c:1551
#11 0x000055c221807a10 in bin_fg (name=0x7f8896af4798 "wait", argv=0x7f8896af4830, ops=0x7ffe759b75c0, func=4) at jobs.c:2371

Not all the backtraces I get show a signal, and I get a lot of
various errors occurring in either a malloc or a free. However, I
have been debugging this problem by enabling mcheck(), and in that
case all the crashes occur within freehook(). When I trace back the
associated malloc(), I find it always occurs during two interleaved
malloc() calls, one from the main context and one from the signal
context.

I can provide more info to reproduce the problem.

Antoine

* Re: [BUG] Crash due to malloc call in signal handler
From: Peter Stephenson @ 2019-12-13 9:40 UTC
To: zsh-workers

On Thu, 2019-12-12 at 19:28 +0100, Antoine C. wrote:
> Hello,
>
> I finally found the cause of the frequent crashes I reported one
> year ago ( https://www.zsh.org/mla/workers/2019/msg00059.html ).
>
> This is due to malloc calls from signal handler, for instance:
>
> #0  tcache_get (tc_idx=17) at malloc.c:2943
> #1  __GI___libc_malloc (bytes=296) at malloc.c:3050
> #2  0x000055c2217b27b5 in malloc (size=8) at ./main.c:255
> #3  0x000055c2218166f9 in zalloc (size=8) at mem.c:966
> #4  0x000055c221806da2 in addbgstatus (pid=11959, status=0) at jobs.c:2192
> #5  0x000055c2218478e7 in wait_for_processes () at signals.c:583
> #6  0x000055c221847cdc in zhandler (sig=17) at signals.c:648
> #7  <signal handler called>
> #8  0x00007f8895b69209 in __GI___sigsuspend (set=0x7ffe759b7160) at ../sysdeps/unix/sysv/linux/sigsuspend.c:26
> #9  0x000055c221847376 in signal_suspend (sig=17, wait_cmd=1) at signals.c:393
> #10 0x000055c2218054e8 in waitforpid (pid=11953, wait_cmd=1) at jobs.c:1551
> #11 0x000055c221807a10 in bin_fg (name=0x7f8896af4798 "wait", argv=0x7f8896af4830, ops=0x7ffe759b75c0, func=4) at jobs.c:2371

The main shell is suspended, waiting for a child to finish, so the fact
it's in the signal handler isn't saying anything.

From the look of it, some memory corruption must already have occurred
at this point to get the malloc to fail.

pws

* Re: [BUG] Crash due to malloc call in signal handler
From: Roman Perepelitsa @ 2019-12-13 9:45 UTC
To: Peter Stephenson; +Cc: Zsh hackers list

On Fri, Dec 13, 2019 at 10:40 AM Peter Stephenson
<p.stephenson@samsung.com> wrote:
> The main shell is suspended, waiting for a child to finish, so the fact
> it's in the signal handler isn't saying anything.
>
> From the look of it, some memory corruption must already have occurred
> at this point to get the malloc to fail.

malloc is not async signal safe. It's illegal to call it from a signal
handler. I'm not saying this is what's causing a crash.

Roman.
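
For readers unfamiliar with the async-signal-safety point raised above, the conventional way to stay within the rules is to have the handler do nothing but set a flag, and to do any allocation later from ordinary code. The following is a minimal sketch of that pattern, not code from zsh; the names got_signal, on_signal, install_handler and consume_signal, and the choice of SIGUSR1, are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; none of this is taken from the zsh source. */
static volatile sig_atomic_t got_signal = 0;

/* Async-signal-safe: the handler only stores to a volatile sig_atomic_t. */
static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Install the handler for SIGUSR1; returns 0 on success, -1 on error. */
int install_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called from ordinary (non-handler) context, so malloc is fine here.
 * Returns a heap-allocated record if a signal was seen, else NULL.
 * The caller frees the record. */
int *consume_signal(void)
{
    if (!got_signal)
        return NULL;
    got_signal = 0;
    int *record = malloc(sizeof *record);
    if (record)
        *record = SIGUSR1;
    return record;
}
```

The trade-off is that the work is deferred until the main code next polls the flag, which is why a shell that wants to react to SIGCHLD promptly may prefer a different scheme.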

* Re: [BUG] Crash due to malloc call in signal handler
From: Peter Stephenson @ 2019-12-13 10:16 UTC
To: Zsh hackers list

On Fri, 2019-12-13 at 10:45 +0100, Roman Perepelitsa wrote:
> On Fri, Dec 13, 2019 at 10:40 AM Peter Stephenson
> <p.stephenson@samsung.com> wrote:
> >
> > The main shell is suspended, waiting for a child to finish, so the fact
> > it's in the signal handler isn't saying anything.
> >
> > From the look of it, some memory corruption must already have occurred
> > at this point to get the malloc to fail.
> malloc is not async signal safe. It's illegal to call it from a signal
> handler. I'm not saying this is what's causing a crash.

In zsh, this is handled by queuing interrupts and only allowing them to
run in a few places in the code. Obviously, waiting for a child to
exit is one of those places.

pws
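
The queuing idea described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not zsh's actual implementation: the names hold_signals, release_signals, process_signal and the fixed-size pending array are hypothetical, and SIGUSR1 stands in for whichever signals are being managed. While signals are held, the handler only records the signal number; the allocating work runs later, from ordinary code, with the signal blocked so that nothing can interleave.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* All names here are hypothetical; this is not zsh's implementation. */
#define PENDING_MAX 16

static volatile sig_atomic_t holding = 0;          /* nonzero: defer work */
static volatile sig_atomic_t pending_len = 0;
static volatile sig_atomic_t pending[PENDING_MAX];
static int processed = 0;

/* The "real" handler body. It allocates, so it must never interleave
 * with allocator calls made by the main code. */
static void process_signal(int sig)
{
    int *rec = malloc(sizeof *rec);
    if (rec) {
        *rec = sig;
        free(rec);
    }
    processed++;
}

static void queueing_handler(int sig)
{
    if (holding) {
        /* Unsafe point: just remember the signal (dropped if full). */
        if (pending_len < PENDING_MAX)
            pending[pending_len++] = sig;
        return;
    }
    /* Reached only where the main code has declared it safe. */
    process_signal(sig);
}

int install_queueing_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = queueing_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

void hold_signals(void)
{
    holding = 1;
}

void release_signals(void)
{
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    /* Block delivery while draining so the handler cannot interleave. */
    sigprocmask(SIG_BLOCK, &block, &old);
    for (int i = 0; i < pending_len; i++)
        process_signal(pending[i]);
    pending_len = 0;
    holding = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);
}

int processed_count(void)
{
    return processed;
}
```

The scheme is only as sound as the placement of the hold and release calls: any allocator use outside a held region reopens the window for the kind of interleaving reported at the start of the thread.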

* Re: [BUG] Crash due to malloc call in signal handler
From: Roman Perepelitsa @ 2019-12-13 10:19 UTC
To: Peter Stephenson; +Cc: Zsh hackers list

On Fri, Dec 13, 2019 at 11:17 AM Peter Stephenson
<p.stephenson@samsung.com> wrote:
>
> On Fri, 2019-12-13 at 10:45 +0100, Roman Perepelitsa wrote:
> > On Fri, Dec 13, 2019 at 10:40 AM Peter Stephenson
> > <p.stephenson@samsung.com> wrote:
> > >
> > > The main shell is suspended, waiting for a child to finish, so the fact
> > > it's in the signal handler isn't saying anything.
> > >
> > > From the look of it, some memory corruption must already have occurred
> > > at this point to get the malloc to fail.
> > malloc is not async signal safe. It's illegal to call it from a signal
> > handler. I'm not saying this is what's causing a crash.
>
> In zsh, this is handled by queuing interrupts and only allowing them to
> run in a few places in the code. Obviously, waiting for a child to
> exit is one of those places.

The stack trace shows malloc being called from zhandler. zhandler is a
signal handler. What am I missing?

Roman.

* Re: [BUG] Crash due to malloc call in signal handler
From: Peter Stephenson @ 2019-12-13 10:31 UTC
To: Zsh hackers list

On Fri, 2019-12-13 at 11:19 +0100, Roman Perepelitsa wrote:
> On Fri, Dec 13, 2019 at 11:17 AM Peter Stephenson
> <p.stephenson@samsung.com> wrote:
> >
> > On Fri, 2019-12-13 at 10:45 +0100, Roman Perepelitsa wrote:
> > > On Fri, Dec 13, 2019 at 10:40 AM Peter Stephenson
> > > <p.stephenson@samsung.com> wrote:
> > > >
> > > > The main shell is suspended, waiting for a child to finish, so the fact
> > > > it's in the signal handler isn't saying anything.
> > > >
> > > > From the look of it, some memory corruption must already have occurred
> > > > at this point to get the malloc to fail.
> > > malloc is not async signal safe. It's illegal to call it from a signal
> > > handler. I'm not saying this is what's causing a crash.
> > In zsh, this is handled by queuing interrupts and only allowing them to
> > run in a few places in the code. Obviously, waiting for a child to
> > exit is one of those places.
> The stack trace shows malloc being called from zhandler. zhandler is a
> signal handler. What am I missing?

You're not missing anything there; that's how it works.

Interrupts are queued so they don't normally go off. In certain places
they are allowed to take place; one of these is when we are sitting
waiting for a child to exit. At that point the signal handler will
run.

Thus the signal handler is supposed not to be running when any memory
management is taking place underneath. So it's not asynchronous with
respect to code actually running in the main shell (despite being run
from a signal handler, which can formally occur anywhere, but we make
sure it doesn't).

Of course, there's the possibility of bugs in this, but the stack in
this case doesn't show evidence of that at the point in question.

You'll find long discussions of this in the mail archive going back
some years.

pws
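
The situation in the backtrace, where the handler fires only while the main code sits in sigsuspend(), can be reproduced in miniature. The sketch below is an illustration under stated assumptions, not zsh code: run_child_and_wait, on_sigchld and last_status are hypothetical names. SIGCHLD is kept blocked with sigprocmask() during normal execution and is unblocked only for the duration of sigsuspend(), so the handler's malloc cannot interrupt an allocator call made by the main code. As far as I understand POSIX, undefined behaviour arises when a handler calls an unsafe function after interrupting another unsafe function; sigsuspend() itself is on the async-signal-safe list.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names; a sketch of the pattern, not zsh's code. */
static volatile sig_atomic_t child_done = 0;
static int *last_status = NULL;   /* heap record written by the handler */

static void on_sigchld(int sig)
{
    int status;
    (void)sig;
    while (waitpid(-1, &status, WNOHANG) > 0) {
        /* This malloc is tolerable only because SIGCHLD is blocked
         * everywhere except inside sigsuspend(). */
        int *rec = malloc(sizeof *rec);
        if (rec) {
            *rec = status;
            free(last_status);
            last_status = rec;
        }
        child_done = 1;
    }
}

/* Fork a child that exits with exit_code and wait for it.
 * Returns the child's exit status, or -1 on error. */
int run_child_and_wait(int exit_code)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    int result = -1;
    pid_t pid;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    /* Keep SIGCHLD blocked during normal execution. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;

    child_done = 0;
    pid = fork();
    if (pid == 0)
        _exit(exit_code);

    if (pid > 0) {
        /* The only place the handler may run: inside sigsuspend(). */
        wait_mask = old;
        sigdelset(&wait_mask, SIGCHLD);
        while (!child_done)
            sigsuspend(&wait_mask);

        if (last_status && WIFEXITED(*last_status))
            result = WEXITSTATUS(*last_status);
        free(last_status);
        last_status = NULL;
    }

    /* Restore the default disposition, then the original mask. */
    sa.sa_handler = SIG_DFL;
    sigaction(SIGCHLD, &sa, NULL);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return result;
}
```

A call such as run_child_and_wait(7) should return 7 once the child has been reaped. If SIGCHLD were ever left unblocked while the main code was inside malloc or free, the same handler would produce exactly the interleaving the original report describes.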