From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm
List-Id: Zsh Workers List
X-Seq: 36007
From: Bart Schaefer
Message-Id: <150806085451.ZM402@torch.brasslantern.com>
Date: Thu, 6 Aug 2015 08:54:51 -0700
Comments: In reply to Mathias Fredriksson
        "Re: Deadlock when receiving kill-signal from child process" (Aug 6, 11:24am)
References: <150803085228.ZM24837@torch.brasslantern.com>
        <150803135818.ZM24977@torch.brasslantern.com>
        <150804235400.ZM9958@torch.brasslantern.com>
        <150805085258.ZM17673@torch.brasslantern.com>
        <150805115249.ZM7158@torch.brasslantern.com>
        <150805132014.ZM7746@torch.brasslantern.com>
        <150805220656.ZM18545@torch.brasslantern.com>
X-Mailer: OpenZMail Classic
        (0.9.2 24April2005)
To: zsh-workers@zsh.org
Subject: Re: Deadlock when receiving kill-signal from child process
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Aug 6, 11:24am, Mathias Fredriksson wrote:
} Subject: Re: Deadlock when receiving kill-signal from child process
}
} On Thu, Aug 6, 2015 at 8:06 AM, Bart Schaefer wrote:
} }
} } I played around with this a bit by hacking loop() but the effect is
} } that with the test script Mathias provided, most of the USR1 signals
} } are just thrown away (they collapse into a single call to the trap
} } handler).  Not sure if that's actually the desired effect.
}
} I would imagine some might rely on every signal being handled, e.g.
} keeping a count.

Even without spending most of the time in queuing, *some* of the signals
get dropped at the OS level.  The only way I can get them all to be
tallied is to remove the "sleep" from the trap function.

} The following traces have the last patches applied (I did multiple
} runs to see if I could hit different states):
}
} #15 0x000000010df38e63 in runshfunc ()
} #16 0x000000010df38936 in doshfunc ()

This confirms my suspicion about doshfunc().  Sadly it's called all
over the place, sometimes with signals explicitly un-queued and other
times with no change to the surrounding context.

} #0 0x00007fff8abfe166 in __psynch_mutexwait ()
} #1 0x00007fff8e4b578a in _pthread_mutex_lock ()
} #2 0x00007fff82ce5750 in fputc ()
} #9
} #22
} #32
} #34 0x00007fff8e4b5714 in _pthread_mutex_lock ()
} #35 0x00007fff82ce43a3 in ferror ()

This is the stdio thing again.  Is anyone reading this familiar enough
with the POSIX or C standards to point to whether stdio is required to
be signal-safe with pthreads?  I.e., is this our bug or someone else's?
(Not that zsh is using threads, but stdio is using pthread mutexes.)

} setopt NO_ASYNC_TRAPS:

NO_TRAPS_ASYNC ?  Anyway, same two issues as above, just slightly
different paths (no multiple signal handlers in the stdio case, but one
is enough).
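
For anyone who wants to see the OS-level dropping in isolation, here is a
minimal C sketch.  It is not zsh code, and count_coalesced_usr1() is a
made-up helper name.  It raises SIGUSR1 several times while the signal is
blocked and reports how many times the handler ran after unblocking.
POSIX leaves it implementation-defined whether repeated occurrences of a
pending ordinary signal are delivered more than once.  On typical Linux
and macOS systems they collapse into a single delivery.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Bumped by the handler; sig_atomic_t is the type safe to touch there. */
static volatile sig_atomic_t handler_calls = 0;

static void on_usr1(int sig)
{
    (void)sig;
    handler_calls++;
}

/*
 * Hypothetical helper (not part of zsh): raise SIGUSR1 n times while it
 * is blocked, unblock it, and return how many times the handler ran.
 */
int count_coalesced_usr1(int n)
{
    struct sigaction sa, old_sa;
    sigset_t usr1, old_mask;
    int i, calls;

    assert(n > 0);
    handler_calls = 0;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, &old_sa);

    /* Block SIGUSR1 so every raise() below only marks it pending. */
    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    sigprocmask(SIG_BLOCK, &usr1, &old_mask);

    for (i = 0; i < n; i++)
        raise(SIGUSR1);

    /* Unblocking delivers whatever is pending before this call returns. */
    sigprocmask(SIG_UNBLOCK, &usr1, NULL);
    calls = (int)handler_calls;

    /* Put the caller's mask and handler back. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return calls;
}
```

Calling count_coalesced_usr1(5) from a small main() is enough to try it.
A trap function that sleeps widens the window in which later signals find
one already pending, which would fit the observation that removing the
"sleep" lets them all be tallied.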

As with the previous dotrapargs() patch, I'm a little nervous about the
dont_queue_signals() bits, but that's the only safe way to do the
disabling part of signal queueing when the enabling part is not in local
scope.

diff --git a/Src/exec.c b/Src/exec.c
index 7612d43..2886785 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -4820,11 +4833,9 @@ execshfunc(Shfunc shf, LinkList args)
     if ((osfc = sfcontext) == SFC_NONE)
         sfcontext = SFC_DIRECT;
     xtrerr = stderr;
-    unqueue_signals();
 
     doshfunc(shf, args, 0);
 
-    queue_signals();
     sfcontext = osfc;
     free(cmdstack);
     cmdstack = ocs;
@@ -5039,6 +5050,8 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
     static int funcdepth;
 #endif
 
+    queue_signals();    /* Lots of memory and global state changes coming */
+
     pushheap();
 
     oargv0 = NULL;
@@ -5261,6 +5274,8 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
     }
     popheap();
 
+    unqueue_signals();
+
     /*
      * Exit with a tidy up.
      * Only leave if we're at the end of the appropriate function ---
@@ -5296,7 +5311,7 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
 mod_export void
 runshfunc(Eprog prog, FuncWrap wrap, char *name)
 {
-    int cont, ouu;
+    int cont, ouu, q = queue_signal_level();
     char *ou;
 
     ou = zalloc(ouu = underscoreused);
@@ -5305,7 +5320,9 @@ runshfunc(Eprog prog, FuncWrap wrap, char *name)
 
     while (wrap) {
         wrap->module->wrapper++;
+        dont_queue_signals();
         cont = wrap->handler(prog, wrap->next, name);
+        restore_queue_signals(q);
         wrap->module->wrapper--;
 
         if (!wrap->module->wrapper &&
@@ -5320,7 +5337,9 @@ runshfunc(Eprog prog, FuncWrap wrap, char *name)
         wrap = wrap->next;
     }
     startparamscope();
+    dont_queue_signals();
     execode(prog, 1, 0, "shfunc");
+    restore_queue_signals(q);
     if (ou) {
         setunderscore(ou);
         zfree(ou, ouu);
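
To make the save/disable/restore pattern in runshfunc() easier to follow,
here is a toy model in C.  It is only a sketch: the toy_* names are
invented for illustration, and the counters are an assumption about how a
nesting-depth queue could behave, not a copy of zsh's actual macros in
Src/signals.h.

```c
#include <assert.h>

/* Toy stand-ins, assumed for illustration; NOT zsh's real implementation. */
static int toy_queueing = 0;  /* nesting depth; > 0 means "defer handlers" */
static int toy_pending = 0;   /* signals that arrived while deferring */
static int toy_handled = 0;   /* signals whose handlers have run */

/* Run every deferred handler. */
static void toy_run_pending(void)
{
    toy_handled += toy_pending;
    toy_pending = 0;
}

/* A signal arrives: defer it if queueing is on, otherwise handle it now. */
void toy_signal_arrives(void)
{
    if (toy_queueing > 0)
        toy_pending++;
    else
        toy_handled++;
}

void toy_queue_signals(void) { toy_queueing++; }

/* Leaving the outermost queued region flushes the deferred signals. */
void toy_unqueue_signals(void)
{
    assert(toy_queueing > 0);  /* unqueue without a matching queue is a bug */
    if (--toy_queueing == 0)
        toy_run_pending();
}

int toy_queue_signal_level(void) { return toy_queueing; }

/* Force queueing off regardless of depth and flush what is pending. */
void toy_dont_queue_signals(void)
{
    toy_queueing = 0;
    toy_run_pending();
}

/* Return to a depth saved earlier with toy_queue_signal_level(). */
void toy_restore_queue_signals(int level) { toy_queueing = level; }

int toy_handled_count(void) { return toy_handled; }
int toy_pending_count(void) { return toy_pending; }

/*
 * Same shape as the patched runshfunc(): the matching queue_signals()
 * calls live in callers, so save the depth, drop to zero while the
 * function body runs, then put the saved depth back.
 */
void toy_run_body(void (*body)(void))
{
    int q = toy_queue_signal_level();

    toy_dont_queue_signals();
    body();
    toy_restore_queue_signals(q);
}
```

In this model the body always runs with queueing off, however deeply the
caller had nested it, and the caller's own unqueue calls still balance
afterwards because the saved depth is restored rather than decremented.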