zsh-workers
 help / color / mirror / code / Atom feed
* deadlock in free() called from a signal handler
@ 2015-02-19 11:06 Kamil Dudka
  2015-02-19 17:23 ` Bart Schaefer
  0 siblings, 1 reply; 4+ messages in thread
From: Kamil Dudka @ 2015-02-19 11:06 UTC (permalink / raw)
  To: zsh-workers

We have a bug report about deadlock in zsh due to a call to free() from
a signal handler.  I have discovered a similar issue here on the list:

http://www.zsh.org/mla/workers/2014/msg01402.html

However, the above comment does not sound correct to me.  zfree() contains
calls to do signal queueing, only if zsh is compiled with ZSH_MEM, which
is not the default configuration.  Is this on purpose?

Would it make sense to surround also the plain free() wrapper by the signal
queueing macros?  I would be happy to provide a patch...

A backtrace of the deadlock (captured with zsh-4.3.10-9.el6.x86_64, but the
customer claims the issue is still reproducible with upstream zsh-5.0.7)
follows:

#0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97
#1  0x000000380167d0a0 in _L_lock_5189 () from /lib64/libc-2.12.so
#2  0x00000038016789fb in _int_free (av=0x380198fe80, p=0x289d910, have_lock=0) at malloc.c:4959
#3  0x0000000000441d9d in freejob (jn=0x28854a0, deleting=1) at jobs.c:1031
#4  0x0000000000442d55 in printjob (jn=0x28854a0, lng=0, synch=<value optimized out>) at jobs.c:994
#5  0x00000000004458f1 in update_job (jn=0x28854a0) at jobs.c:460
#6  0x0000000000471fa3 in zhandler (sig=<value optimized out>) at signals.c:532
#7  <signal handler called>
#8  0x0000003801678723 in _int_free (av=0x380198fe80, p=0x289d2c0, have_lock=0) at malloc.c:4969
#9  0x00000000004228b5 in setunderscore (str=0x7f25cd27bd60 "ACTIONS\213\060\061") at exec.c:2162
#10 0x0000000000427851 in execcmd (state=<value optimized out>, input=0, output=0, how=0, last1=2) at exec.c:2571
#11 0x000000000042a396 in execpline2 (state=0x7fff909bd190, pcode=<value optimized out>, how=2, input=0, output=0, last1=0) at exec.c:1569
#12 0x000000000042a756 in execpline (state=0x7fff909bd190, slcode=<value optimized out>, how=2, last1=0) at exec.c:1355
#13 0x000000000042b9df in execlist (state=0x7fff909bd190, dont_change_job=1, exiting=0) at exec.c:1152
#14 0x0000000000449902 in execif (state=0x7fff909bd190, do_exec=0) at loop.c:515
#15 0x0000000000429274 in execcmd (state=<value optimized out>, input=0, output=0, how=0, last1=2) at exec.c:3016
#16 0x000000000042a396 in execpline2 (state=0x7fff909bd190, pcode=<value optimized out>, how=2, input=0, output=0, last1=0) at exec.c:1569
#17 0x000000000042a756 in execpline (state=0x7fff909bd190, slcode=<value optimized out>, how=2, last1=0) at exec.c:1355
#18 0x000000000042b9df in execlist (state=0x7fff909bd190, dont_change_job=1, exiting=0) at exec.c:1152
#19 0x000000000042bcb3 in execode (p=0x2898980, dont_change_job=1, exiting=0) at exec.c:980
#20 0x000000000042be34 in runshfunc (prog=0x2898980, wrap=0x0, name=0x7f25cd27b3a8 "PROCESS_CONTROLFILE") at exec.c:4469
#21 0x0000000000425e44 in doshfunc (shfunc=0x2898850, doshargs=<value optimized out>, noreturnval=<value optimized out>) at exec.c:4363
#22 0x00000000004262ad in execshfunc (shf=0x2898850, args=0x7f25cd27b290) at exec.c:4089
#23 0x0000000000429dd2 in execcmd (state=<value optimized out>, input=0, output=0, how=0, last1=2) at exec.c:3064
#24 0x000000000042a396 in execpline2 (state=0x7fff909c0310, pcode=<value optimized out>, how=18, input=0, output=0, last1=0) at exec.c:1569
...
#53 0x0000000000429274 in execcmd (state=<value optimized out>, input=0, output=0, how=0, last1=2) at exec.c:3016
#54 0x000000000042a396 in execpline2 (state=0x7fff909c0310, pcode=<value optimized out>, how=18, input=0, output=0, last1=0) at exec.c:1569
#55 0x000000000042a756 in execpline (state=0x7fff909c0310, slcode=<value optimized out>, how=18, last1=0) at exec.c:1355
#56 0x000000000042b9df in execlist (state=0x7fff909c0310, dont_change_job=0, exiting=0) at exec.c:1152
#57 0x000000000042bcb3 in execode (p=0x7f25cd282c98, dont_change_job=0, exiting=0) at exec.c:980
#58 0x000000000043c657 in loop (toplevel=1, justonce=0) at init.c:183
#59 0x000000000043e1fe in zsh_main (argc=<value optimized out>, argv=<value optimized out>) at init.c:1471
#60 0x000000380161ed1d in __libc_start_main (main=0x40db10 <main>, argc=2, ubp_av=0x7fff909c0548, init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fff909c0538) at libc-start.c:226
#61 0x000000000040da49 in _start ()


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: deadlock in free() called from a signal handler
  2015-02-19 11:06 deadlock in free() called from a signal handler Kamil Dudka
@ 2015-02-19 17:23 ` Bart Schaefer
  2015-02-21  2:28   ` PATCH " Bart Schaefer
  0 siblings, 1 reply; 4+ messages in thread
From: Bart Schaefer @ 2015-02-19 17:23 UTC (permalink / raw)
  To: Kamil Dudka, zsh-workers

On Feb 19, 12:06pm, Kamil Dudka wrote:
} Subject: deadlock in free() called from a signal handler
}
} We have a bug report about deadlock in zsh due to a call to free() from
} a signal handler.

In every case I can remember so far, when this happens it means that we
ought to be using the signal queuing macros at a scope outside the call
to the malloc library.

} I have discovered a similar issue here on the list:
} 
} http://www.zsh.org/mla/workers/2014/msg01402.html
} 
} However, the above comment does not sound correct to me.  zfree() contains
} calls to do signal queueing, only if zsh is compiled with ZSH_MEM, which
} is not the default configuration.

Yes, I believe this eventually got noted in another thread (or another
branch of that thread) because we later came upon a couple of other
cases where signal queuing was needed around some global state updates.

} Is this on purpose?

It's not by accident, but it's been that way forever, so the reasoning
is lost in time.  I believe that at one point there was no zfree() and
ZSH_MEM simply redefined free() directly, so the other branch may at
one time have been empty and was only filled in with the minimal code
when the zfree symbol was introduced.

} Would it make sense to surround also the plain free() wrapper by the
} signal queueing macros? I would be happy to provide a patch...

I was going to say no, it's not, because we don't wrap malloc() calls
similarly ... but on a quick examination there is only one direct call
to malloc() left in the source anywhere outside of mem.c, and mem.c
does wrap queue_signals() around all calls to malloc().

So ... per my very first remark above, it's probably worth examining the
context in execcmd() to see if signal queuing is appropriate there, or
if we just need it in setunderscore().  But it's may also make sense to
replace that one last malloc() and put queuing in zfree()/zsfree() too.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* PATCH Re: deadlock in free() called from a signal handler
  2015-02-19 17:23 ` Bart Schaefer
@ 2015-02-21  2:28   ` Bart Schaefer
  2015-02-23 14:15     ` Kamil Dudka
  0 siblings, 1 reply; 4+ messages in thread
From: Bart Schaefer @ 2015-02-21  2:28 UTC (permalink / raw)
  To: Kamil Dudka, zsh-workers

On Feb 19,  9:23am, Bart Schaefer wrote:
}
} In every case I can remember so far, when this happens it means that we
} ought to be using the signal queuing macros at a scope outside the call
} to the malloc library.
}  [...]
} So ... per my very first remark above, it's probably worth examining the
} context in execcmd() to see if signal queuing is appropriate there, or
} if we just need it in setunderscore().  But it may also make sense to
} replace that one last malloc() and put queuing in zfree()/zsfree() too.

(Above edited for silly typo.)

There doesn't appear to be an appropriate scope for queuing signals in
execcmd().  However, setunderscore() and all of the functions in text.c 
manipulate global state, so they should be considered non-reentrant.

The following patch does not yet put signal queuing into zfree(), but
does wrap it around setunderscore() and the three non-static functions
in text.c, as well as replace that one remaining malloc() with zalloc().
Hopefully this doesn't slow down execution noticeably.


diff --git a/Src/exec.c b/Src/exec.c
index 947b815..1a6149a 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -2337,6 +2337,7 @@ addvars(Estate state, Wordcode pc, int addflags)
 void
 setunderscore(char *str)
 {
+    queue_signals();
     if (str && *str) {
 	int l = strlen(str) + 1, nl = (l + 31) & ~31;
 
@@ -2354,6 +2355,7 @@ setunderscore(char *str)
 	*zunderscore = '\0';
 	underscoreused = 1;
     }
+    unqueue_signals();
 }
 
 /* These describe the type of expansions that need to be done on the words
@@ -5319,7 +5321,7 @@ execsave(void)
 {
     struct execstack *es;
 
-    es = (struct execstack *) malloc(sizeof(struct execstack));
+    es = (struct execstack *) zalloc(sizeof(struct execstack));
     es->list_pipe_pid = list_pipe_pid;
     es->nowait = nowait;
     es->pline_level = pline_level;
diff --git a/Src/text.c b/Src/text.c
index 75f642c..b58c251 100644
--- a/Src/text.c
+++ b/Src/text.c
@@ -173,6 +173,8 @@ getpermtext(Eprog prog, Wordcode c, int start_indent)
 {
     struct estate s;
 
+    queue_signals();
+
     if (!c)
 	c = prog->prog;
 
@@ -193,6 +195,9 @@ getpermtext(Eprog prog, Wordcode c, int start_indent)
     *tptr = '\0';
     freeeprog(prog);		/* mark as unused */
     untokenize(tbuf);
+
+    unqueue_signals();
+
     return tbuf;
 }
 
@@ -206,6 +211,8 @@ getjobtext(Eprog prog, Wordcode c)
 
     struct estate s;
 
+    queue_signals();
+
     if (!c)
 	c = prog->prog;
 
@@ -224,6 +231,9 @@ getjobtext(Eprog prog, Wordcode c)
     *tptr = '\0';
     freeeprog(prog);		/* mark as unused */
     untokenize(jbuf);
+
+    unqueue_signals();
+
     return jbuf;
 }
 
@@ -883,6 +893,9 @@ getredirs(LinkList redirs)
 	">", ">|", ">>", ">>|", "&>", "&>|", "&>>", "&>>|", "<>", "<",
 	"<<", "<<-", "<<<", "<&", ">&", NULL /* >&- */, "<", ">"
     };
+
+    queue_signals();
+
     taddchr(' ');
     for (n = firstnode(redirs); n; incnode(n)) {
 	Redir f = (Redir) getdata(n);
@@ -970,4 +983,6 @@ getredirs(LinkList redirs)
 	}
     }
     tptr--;
+
+    unqueue_signals();
 }

-- 
Barton E. Schaefer


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: PATCH Re: deadlock in free() called from a signal handler
  2015-02-21  2:28   ` PATCH " Bart Schaefer
@ 2015-02-23 14:15     ` Kamil Dudka
  0 siblings, 0 replies; 4+ messages in thread
From: Kamil Dudka @ 2015-02-23 14:15 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

On Friday 20 February 2015 18:28:12 Bart Schaefer wrote:
> On Feb 19,  9:23am, Bart Schaefer wrote:
> }
> } In every case I can remember so far, when this happens it means that we
> } ought to be using the signal queuing macros at a scope outside the call
> } to the malloc library.
> }  [...]
> } So ... per my very first remark above, it's probably worth examining the
> } context in execcmd() to see if signal queuing is appropriate there, or
> } if we just need it in setunderscore().  But it may also make sense to
> } replace that one last malloc() and put queuing in zfree()/zsfree() too.
> 
> (Above edited for silly typo.)
> 
> There doesn't appear to be an appropriate scope for queuing signals in
> execcmd().  However, setunderscore() and all of the functions in text.c
> manipulate global state, so they should be considered non-reentrant.
> 
> The following patch does not yet put signal queuing into zfree(), but
> does wrap it around setunderscore() and the three non-static functions
> in text.c, as well as replace that one remaining malloc() with zalloc().
> Hopefully this doesn't slow down execution noticeably.

Thank you for the patch!  We will give it a try.

Kamil


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-02-23 14:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-19 11:06 deadlock in free() called from a signal handler Kamil Dudka
2015-02-19 17:23 ` Bart Schaefer
2015-02-21  2:28   ` PATCH " Bart Schaefer
2015-02-23 14:15     ` Kamil Dudka

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).