From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 2673 invoked by alias); 30 Aug 2014 23:25:35 -0000
Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm
Precedence: bulk
X-No-Archive: yes
List-Id: Zsh Workers List
List-Post:
List-Help:
X-Seq: 33075
Received: (qmail 24553 invoked from network); 30 Aug 2014 23:25:35 -0000
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on f.primenet.com.au
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.2
From: Bart Schaefer
Message-id: <140830162545.ZM27219@torch.brasslantern.com>
Date: Sat, 30 Aug 2014 16:25:45 -0700
In-reply-to: <20140827162834.GQ7356@sym.noone.org>
Comments: In reply to Axel Beckert "Re: 5.0.5-dev-3" (Aug 27, 6:28pm)
References: <20140824183909.75d04bf0@pws-pc.ntlworld.com> <20140827162834.GQ7356@sym.noone.org>
X-Mailer: OpenZMail Classic (0.9.2 24April2005)
To: Axel Beckert, zsh-workers@zsh.org
Subject: "5 seconds to fail" Re: 5.0.5-dev-3
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii

On Aug 27, 6:28pm, Axel Beckert wrote:
}
} Hi,
}
} On Sun, Aug 24, 2014 at 06:39:09PM +0100, Peter Stephenson wrote:
} > I think the only remaining open matter (excluding longer term issues
} > that aren't going to be dealt with immediately) is test failures in
} > some automated tests.
}
} We again got a few build-failures on the build daemons, but this time
} on different architectures than with 5.0.5-dev-2, namely amd64 (Linux)
} and s390x while kfreebsd-amd64 worked fine this time:
}
} https://buildd.debian.org/status/package.php?p=zsh&suite=experimental
}
} On amd64 (aka x86_64) the testsuite was hanging and killed after 150
} minutes:
} https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=amd64&ver=5.0.5-dev-3-1&stamp=1409099297
}
} On s390x, gcc was killed after 150 minutes of inactivity:
} https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=s390x&ver=5.0.5-dev-3-1&stamp=1409118602

I was able by accident to reproduce this in a foreground build on CentOS.
Can't say for sure it's exactly the same thing, but here's the stack trace
I got:

(gdb) where
#0  0x0086e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x009611ce in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
#2  0x008efc0b in _L_mutex_lock_4191 () from /lib/tls/libc.so.6
#3  0x08010000 in ?? ()
#4  0x00000000 in ?? ()

That's literally all of it, no zsh source files at all, so I suspect the
job has already exited and is a zombie stuck on that mutex.

The parent "runtests.zsh" script waiting for it has this trace:

(gdb) where
#0  0x0086e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x008afe8c in sigsuspend () from /lib/tls/libc.so.6
#2  0x080b6062 in signal_suspend (sig=17, wait_cmd=0)
    at ../../zsh-5.0/Src/signals.c:375
#3  0x08084b65 in zwaitjob (job=2, wait_cmd=0)
    at ../../zsh-5.0/Src/jobs.c:1454
#4  0x08084d3e in waitjobs () at ../../zsh-5.0/Src/jobs.c:1499
#5  0x08063753 in execpline (state=0xbfffc8b0, slcode=7170, how=2, last1=0)
    at ../../zsh-5.0/Src/exec.c:1554
#6  0x08062c57 in execlist (state=0xbfffc8b0, dont_change_job=1, exiting=0)
    at ../../zsh-5.0/Src/exec.c:1261
#7  0x0808c54b in execfor (state=0xbfffc8b0, do_exec=0)
    at ../../zsh-5.0/Src/loop.c:164
#8  0x080682ac in execcmd (state=0xbfffc8b0, input=0, output=0, how=18,
    last1=2) at ../../zsh-5.0/Src/exec.c:3232
#9  0x08063fda in execpline2 (state=0xbfffc8b0, pcode=771, how=18, input=0,
    output=0, last1=0) at ../../zsh-5.0/Src/exec.c:1691
#10 0x0806337f in execpline (state=0xbfffc8b0, slcode=48130, how=18, last1=0)
    at ../../zsh-5.0/Src/exec.c:1478
#11 0x08062c57 in execlist (state=0xbfffc8b0, dont_change_job=0, exiting=0)
    at ../../zsh-5.0/Src/exec.c:1261
#12 0x080626aa in execode (p=0xb7d1f778, dont_change_job=0, exiting=0,
    context=0x813c823 "toplevel") at ../../zsh-5.0/Src/exec.c:1070
#13 0x0807d636 in loop (toplevel=1, justonce=0)
    at ../../zsh-5.0/Src/init.c:185
#14 0x0808094b in zsh_main (argc=4, argv=0xbfffca04)
    at ../../zsh-5.0/Src/init.c:1625
#15 0x0804c0a6 in main (argc=4, argv=0xbfffca04)
    at ../../zsh-5.0/Src/main.c:93