zsh-workers
* 5.0.5-dev-3
@ 2014-08-24 17:39 Peter Stephenson
  2014-08-26 17:11 ` 5.0.5-dev-3 Dominic Hopf
  2014-08-27 16:28 ` 5.0.5-dev-3 Axel Beckert
  0 siblings, 2 replies; 16+ messages in thread
From: Peter Stephenson @ 2014-08-24 17:39 UTC (permalink / raw)
  To: Zsh Hackers' List

Uploaded 5.0.5-dev-3 to http://www.zsh.org/pub/development/ .  The tag
is "zsh-5.0.5-dev-3".

I think the only remaining open matter (excluding longer term issues
that aren't going to be dealt with immediately) is test failures in
some automated tests.  Unless something definitive turns up I'm not
going to regard this as blocking the release.

pws



* Re: 5.0.5-dev-3
  2014-08-24 17:39 5.0.5-dev-3 Peter Stephenson
@ 2014-08-26 17:11 ` Dominic Hopf
  2014-08-27 16:28 ` 5.0.5-dev-3 Axel Beckert
  1 sibling, 0 replies; 16+ messages in thread
From: Dominic Hopf @ 2014-08-26 17:11 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh Hackers' List

Packages for Fedora available here:

    https://dmaphy.fedorapeople.org/zsh-dev/


On Sun, Aug 24, 2014 at 7:39 PM, Peter Stephenson <
p.w.stephenson@ntlworld.com> wrote:

> Uploaded 5.0.5-dev-3 to http://www.zsh.org/pub/development/ .  The tag
> is "zsh-5.0.5-dev-3".
>
> I think the only remaining open matter (excluding longer term issues
> that aren't going to be dealt with immediately) is test failures in
> some automated tests.  Unless something definitive turns up I'm not
> going to regard this as blocking the release.
>
> pws
>



-- 
This mail is not signed with GPG because I wrote it from the web interface.


* Re: 5.0.5-dev-3
  2014-08-24 17:39 5.0.5-dev-3 Peter Stephenson
  2014-08-26 17:11 ` 5.0.5-dev-3 Dominic Hopf
@ 2014-08-27 16:28 ` Axel Beckert
  2014-08-28 19:54   ` 5.0.5-dev-3 Axel Beckert
  2014-08-30 23:25   ` "5 seconds to fail" 5.0.5-dev-3 Bart Schaefer
  1 sibling, 2 replies; 16+ messages in thread
From: Axel Beckert @ 2014-08-27 16:28 UTC (permalink / raw)
  To: zsh-workers

Hi,

On Sun, Aug 24, 2014 at 06:39:09PM +0100, Peter Stephenson wrote:
> Uploaded 5.0.5-dev-3 to http://www.zsh.org/pub/development/ .  The tag
> is "zsh-5.0.5-dev-3".

Built fine locally and on our Jenkins. Uploaded it to Debian
Experimental yesterday.

> I think the only remaining open matter (excluding longer term issues
> that aren't going to be dealt with immediately) is test failures in
> some automated tests.

We again got a few build-failures on the build daemons, but this time
on different architectures than with 5.0.5-dev-2, namely amd64 (Linux)
and s390x while kfreebsd-amd64 worked fine this time:

https://buildd.debian.org/status/package.php?p=zsh&suite=experimental

On amd64 (aka x86_64) the testsuite was hanging and killed after 150
minutes:
https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=amd64&ver=5.0.5-dev-3-1&stamp=1409099297

On s390x, gcc was killed after 150 minutes of inactivity:
https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=s390x&ver=5.0.5-dev-3-1&stamp=1409118602

At the moment the latter looks more like a platform issue to me as the
same happened with a rebuild of 5.0.5 due to glibc bugfixes while it
didn't fail two months ago with a previous glibc version. I'll talk to
the s390 porters about this.

> Unless something definitive turns up I'm not going to regard this as
> blocking the release.

Since they seem to show up only under some (as yet unclear) conditions,
and even then only intermittently, I don't consider them
release-critical, although I need to find a way to avoid them
deterministically in Debian before I upload it to "unstable" instead of
"experimental".

		Kind regards, Axel
-- 
/~\  Plain Text Ribbon Campaign                   | Axel Beckert
\ /  Say No to HTML in E-Mail and News            | abe@deuxchevaux.org  (Mail)
 X   See http://www.nonhtmlmail.org/campaign.html | abe@noone.org (Mail+Jabber)
/ \  I love long mails: http://email.is-not-s.ms/ | http://noone.org/abe/ (Web)



* Re: 5.0.5-dev-3
  2014-08-27 16:28 ` 5.0.5-dev-3 Axel Beckert
@ 2014-08-28 19:54   ` Axel Beckert
  2014-08-30 23:46     ` 5.0.5-dev-3 Axel Beckert
  2014-08-30 23:25   ` "5 seconds to fail" 5.0.5-dev-3 Bart Schaefer
  1 sibling, 1 reply; 16+ messages in thread
From: Axel Beckert @ 2014-08-28 19:54 UTC (permalink / raw)
  To: zsh-workers

Hi,

JFYI:

On Wed, Aug 27, 2014 at 06:28:34PM +0200, Axel Beckert wrote:
> We again got a few build-failures on the build daemons, but this time
> on different architectures than with 5.0.5-dev-2, namely amd64 (Linux)
> and s390x while kfreebsd-amd64 worked fine this time:
> 
> https://buildd.debian.org/status/package.php?p=zsh&suite=experimental
> 
> On amd64 (aka x86_64) the testsuite was hanging and killed after 150
> minutes:
> https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=amd64&ver=5.0.5-dev-3-1&stamp=1409099297

A second build (by just putting the package back into the build queue)
succeeded in this case.

So these test suite hangs are very likely non-deterministic, or at
least not related to the build process in general. The two builds ran
on different build daemon machines, but by now I doubt that this makes
a difference, as I hit the issue once locally and the very next build
succeeded again.

> On s390x, gcc was killed after 150 minutes of inactivity:
> https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=s390x&ver=5.0.5-dev-3-1&stamp=1409118602
> 
> At the moment the latter looks more like a platform issue to me as the
> same happened with a rebuild of 5.0.5 due to glibc bugfixes while it
> didn't fail two months ago with a previous glibc version. I'll talk to
> the s390 porters about this.

Has been confirmed and a currently running second build is also bound
to fail: gcc seems subtly broken in Debian Unstable on s390x currently
for all packages which use sigjmp_buf directly or in a build
dependency.

Not our job. ;-)

		Kind regards, Axel

* "5 seconds to fail" Re: 5.0.5-dev-3
  2014-08-27 16:28 ` 5.0.5-dev-3 Axel Beckert
  2014-08-28 19:54   ` 5.0.5-dev-3 Axel Beckert
@ 2014-08-30 23:25   ` Bart Schaefer
  2014-09-29  1:20     ` "5 seconds to fail" Axel Beckert
  1 sibling, 1 reply; 16+ messages in thread
From: Bart Schaefer @ 2014-08-30 23:25 UTC (permalink / raw)
  To: Axel Beckert, zsh-workers

On Aug 27,  6:28pm, Axel Beckert wrote:
}
} Hi,
} 
} On Sun, Aug 24, 2014 at 06:39:09PM +0100, Peter Stephenson wrote:
} > I think the only remaining open matter (excluding longer term issues
} > that aren't going to be dealt with immediately) is test failures in
} > some automated tests.
} 
} We again got a few build-failures on the build daemons, but this time
} on different architectures than with 5.0.5-dev-2, namely amd64 (Linux)
} and s390x while kfreebsd-amd64 worked fine this time:
} 
} https://buildd.debian.org/status/package.php?p=zsh&suite=experimental
} 
} On amd64 (aka x86_64) the testsuite was hanging and killed after 150
} minutes:
} https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=amd64&ver=5.0.5-dev-3-1&stamp=1409099297
} 
} On s390x, gcc was killed after 150 minutes of inactivity:
} https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=s390x&ver=5.0.5-dev-3-1&stamp=1409118602


I was able by accident to reproduce this in a foreground build on CentOS.
Can't say for sure it's exactly the same thing, but here's the stack trace
I got:

(gdb) where
#0  0x0086e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x009611ce in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
#2  0x008efc0b in _L_mutex_lock_4191 () from /lib/tls/libc.so.6
#3  0x08010000 in ?? ()
#4  0x00000000 in ?? ()

That's literally all of it, no zsh source files at all, so I suspect
the job has already exited and is a zombie stuck on that mutex.  The
parent "runtests.zsh" script waiting for it has this trace:

(gdb) where
#0  0x0086e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x008afe8c in sigsuspend () from /lib/tls/libc.so.6
#2  0x080b6062 in signal_suspend (sig=17, wait_cmd=0)
    at ../../zsh-5.0/Src/signals.c:375
#3  0x08084b65 in zwaitjob (job=2, wait_cmd=0) at ../../zsh-5.0/Src/jobs.c:1454
#4  0x08084d3e in waitjobs () at ../../zsh-5.0/Src/jobs.c:1499
#5  0x08063753 in execpline (state=0xbfffc8b0, slcode=7170, how=2, last1=0)
    at ../../zsh-5.0/Src/exec.c:1554
#6  0x08062c57 in execlist (state=0xbfffc8b0, dont_change_job=1, exiting=0)
    at ../../zsh-5.0/Src/exec.c:1261
#7  0x0808c54b in execfor (state=0xbfffc8b0, do_exec=0)
    at ../../zsh-5.0/Src/loop.c:164
#8  0x080682ac in execcmd (state=0xbfffc8b0, input=0, output=0, how=18, 
    last1=2) at ../../zsh-5.0/Src/exec.c:3232
#9  0x08063fda in execpline2 (state=0xbfffc8b0, pcode=771, how=18, input=0, 
    output=0, last1=0) at ../../zsh-5.0/Src/exec.c:1691
#10 0x0806337f in execpline (state=0xbfffc8b0, slcode=48130, how=18, last1=0)
    at ../../zsh-5.0/Src/exec.c:1478
#11 0x08062c57 in execlist (state=0xbfffc8b0, dont_change_job=0, exiting=0)
    at ../../zsh-5.0/Src/exec.c:1261
#12 0x080626aa in execode (p=0xb7d1f778, dont_change_job=0, exiting=0, 
    context=0x813c823 "toplevel") at ../../zsh-5.0/Src/exec.c:1070
#13 0x0807d636 in loop (toplevel=1, justonce=0) at ../../zsh-5.0/Src/init.c:185
#14 0x0808094b in zsh_main (argc=4, argv=0xbfffca04)
    at ../../zsh-5.0/Src/init.c:1625
#15 0x0804c0a6 in main (argc=4, argv=0xbfffca04) at ../../zsh-5.0/Src/main.c:93



* Re: 5.0.5-dev-3
  2014-08-28 19:54   ` 5.0.5-dev-3 Axel Beckert
@ 2014-08-30 23:46     ` Axel Beckert
  0 siblings, 0 replies; 16+ messages in thread
From: Axel Beckert @ 2014-08-30 23:46 UTC (permalink / raw)
  To: zsh-workers

Hi,

On Thu, Aug 28, 2014 at 09:54:49PM +0200, Axel Beckert wrote:
> > On s390x, gcc was killed after 150 minutes of inactivity:
> > https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=s390x&ver=5.0.5-dev-3-1&stamp=1409118602
> > 
> > At the moment the latter looks more like a platform issue to me as the
> > same happened with a rebuild of 5.0.5 due to glibc bugfixes while it
> > didn't fail two months ago with a previous glibc version. I'll talk to
> > the s390 porters about this.
> 
> Has been confirmed and a currently running second build is also bound
> to fail: gcc seems subtly broken in Debian Unstable on s390x currently
> for all packages which use sigjmp_buf directly or in a build
> dependency.

Solved, or at least worked around: Not using -fstack-protector-strong
(but keeping -fstack-protector) on s390x solves the hanging gcc 4.9.
(gcc 4.8 doesn't even know this option.) This doesn't seem to affect
other architectures. Details are at https://bugs.debian.org/759870

Will adapt Debian's s390x builds of zsh accordingly.
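
The flag swap described above could be sketched as a small shell
helper. This is only an illustration and not the change that actually
went into the Debian packaging (see the bug report for that); the
function name adjust_cflags and its calling convention are made up
here.

```sh
# adjust_cflags ARCH FLAG...
# Prints the flags unchanged, except that on s390x every
# -fstack-protector-strong is downgraded to plain -fstack-protector.
# (Hypothetical helper, for illustration only.)
adjust_cflags() {
    arch=$1
    shift
    if [ "$arch" = s390x ]; then
        printf '%s\n' "$*" | sed 's/-fstack-protector-strong/-fstack-protector/g'
    else
        printf '%s\n' "$*"
    fi
}
```

A possible call, assuming dpkg-architecture is available:

    CFLAGS=$(adjust_cflags "$(dpkg-architecture -qDEB_HOST_ARCH)" $CFLAGS)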

		Kind regards, Axel

* Re: "5 seconds to fail"
  2014-08-30 23:25   ` "5 seconds to fail" 5.0.5-dev-3 Bart Schaefer
@ 2014-09-29  1:20     ` Axel Beckert
  2014-09-29  1:51       ` [Pkg-zsh-devel] Bug#760061: " Axel Beckert
  2014-09-30 17:40       ` [Pkg-zsh-devel] " Axel Beckert
  0 siblings, 2 replies; 16+ messages in thread
From: Axel Beckert @ 2014-09-29  1:20 UTC (permalink / raw)
  To: zsh-workers; +Cc: ivodd, 760061

Hi again,

On Sat, Aug 30, 2014 at 04:25:45PM -0700, Bart Schaefer wrote:
> On Aug 27,  6:28pm, Axel Beckert wrote:
> } On Sun, Aug 24, 2014 at 06:39:09PM +0100, Peter Stephenson wrote:
> } > I think the only remaining open matter (excluding longer term issues
> } > that aren't going to be dealt with immediately) is test failures in
> } > some automated tests.
> } 
> } We again got a few build-failures on the build daemons, but this time
> } on different architectures than with 5.0.5-dev-2, namely amd64 (Linux)
> } and s390x while kfreebsd-amd64 worked fine this time:
> } 
> } https://buildd.debian.org/status/package.php?p=zsh&suite=experimental
> } 
> } On amd64 (aka x86_64) the testsuite was hanging and killed after 150
> } minutes:
> } https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=amd64&ver=5.0.5-dev-3-1&stamp=1409099297
> } 
> } On s390x, gcc was killed after 150 minutes of inactivity:
> } https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=s390x&ver=5.0.5-dev-3-1&stamp=1409118602
> 
> I was able by accident to reproduce this in a foreground build on CentOS.

Thanks for the confirmation that this is likely not debian-specific.

I still don't understand, though, why this happens far more often on
the buildds than outside of them.

Since we ran into the issue again with the upload one week ago, and we
have to do another upload before the upcoming freeze for Debian's next
stable release, I did another run of building the zsh package again and
again (without a connected terminal, by using "ssh localhost" without
"-t" to get closer to the buildd environment) until I ran into the
issue.
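
A repeat-until-it-hangs loop like the one described above could look
roughly as follows. This is a minimal sketch, not the script that was
actually used: the function name run_until_hang and its arguments are
invented for illustration, and treating a run that exceeds a time limit
as a hang (via coreutils timeout) is an assumption.

```sh
# run_until_hang MAX_RUNS LIMIT_SECONDS COMMAND [ARG...]
# Runs COMMAND repeatedly. If a run exceeds LIMIT_SECONDS it is treated
# as a hang: the number of that run is printed and 0 is returned.
# Returns 1 if MAX_RUNS runs complete without hitting the limit.
run_until_hang() {
    max_runs=$1
    limit=$2
    shift 2
    run=1
    while [ "$run" -le "$max_runs" ]; do
        # coreutils timeout exits with status 124 when the limit is hit
        timeout "$limit" "$@" >/dev/null 2>&1
        if [ $? -eq 124 ]; then
            echo "$run"
            return 0
        fi
        run=$((run + 1))
    done
    return 1
}
```

For example, "run_until_hang 100 9000 debuild -uc -us -B -j5" would
give up after 100 builds and use the same 150 minutes the buildds
allow. Note that timeout kills the hanging command, so for attaching
gdb afterwards the hung process would have to be left running instead.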

And this time I had some luck after something like 70 to 100
successful package builds in a row, but interestingly with different
results than Bart:

> Can't say for sure it's exactly the same thing, but here's the stack trace
> I got:
> 
> (gdb) where
> #0  0x0086e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x009611ce in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
> #2  0x008efc0b in _L_mutex_lock_4191 () from /lib/tls/libc.so.6
> #3  0x08010000 in ?? ()
> #4  0x00000000 in ?? ()
> 
> That's literally all of it, no zsh source files at all, so I suspect
> the job has already exited and is a zombie stuck on that mutex.

~ # ps auxwwwf | fgrep -B10 -A2 Src/zsh
abe      18988  0.0  0.1  41920  9216 ?        SNs  01:53   0:00          \_ /usr/bin/perl /usr/bin/debuild -uc -us -B -j5
abe      19029  0.0  0.0   8964  1388 ?        SN   01:53   0:00              \_ tee ../zsh_5.0.6-2_amd64.build
abe      19030  0.0  0.1  41788 11232 ?        SN   01:53   0:00              \_ /usr/bin/perl /usr/bin/dpkg-buildpackage -rfakeroot -D -us -uc -i -j3 -B -j5
abe      13060  0.0  0.0  22632  3220 ?        SN   01:54   0:00                  \_ /bin/bash /usr/bin/fakeroot debian/rules binary-arch
abe      13075  0.0  0.0  13444  2424 ?        SN   01:54   0:00                      \_ /usr/bin/make -f debian/rules binary-arch
abe      22865  0.0  0.0   8476   808 ?        SN   01:55   0:00                          \_ /bin/sh -c HOME="/home/abe/zsh/zsh/obj-static/testhome" dh_auto_test -B obj-static --parallel || true
abe      22866  0.0  0.1  47996 13312 ?        SN   01:55   0:00                              \_ /usr/bin/perl -w /usr/bin/dh_auto_test -B obj-static --parallel
abe      22875  0.0  0.0  13444  2464 ?        SN   01:55   0:00                                  \_ make -j5 test
abe      22876  0.0  0.0   8476   828 ?        SN   01:55   0:00                                      \_ /bin/sh -c cd Test ; make check
abe      22877  0.0  0.0  13444  2284 ?        SN   01:55   0:00                                          \_ make check
abe      22879  0.0  0.0   8476   924 ?        SN   01:55   0:00                                              \_ /bin/sh -c if ZTST_testlist="`for f in ../../Test/*.ztst; \            do echo $f; done`" \  ZTST_srcdir="../../Test" \  ZTST_exe=../Src/zsh \  ../Src/zsh +Z -f ../../Test/runtests.zsh; then \  stat=0; \ else \  stat=1; \ fi; \ sleep 1; \ rm -rf Modules .zcompdump; \ exit $stat
abe      22881  0.0  0.0   7428  2120 ?        SN   01:55   0:00                                                  \_ ../Src/zsh +Z -f ../../Test/runtests.zsh
abe      23664  0.0  0.0   8084  2812 ?        SN   01:55   0:00                                                      \_ ../Src/zsh +Z -f ../../Test/ztst.zsh ../../Test/A05execution.ztst

This means the issue occurred while running the test suite against the
static build of zsh.

Tomorrow I'll check whether that was also the case for the other failed
builds. If so, I'll disable the test suite for the static build
completely and hope that's the fix for now. (We currently ignore the
result for the static build anyway since C02cond.ztst always fails
there. See some other mail from me from several months, if not a year,
ago.)

So what differs from Bart's case are the backtraces, most recent child
first:

~ # gdb -p 23664
GNU gdb (Debian 7.7.1+dfsg-3) 7.7.1
[...]
Attaching to process 23664
Reading symbols from /home/abe/zsh/zsh/obj-static/Src/zsh...done.
0x000000000054f0eb in __lll_lock_wait_private ()
(gdb) bt
#0  0x000000000054f0eb in __lll_lock_wait_private ()
#1  0x000000000050d551 in _L_lock_3682 ()
#2  0x0000000000508b1b in _int_free ()
#3  0x00000000004349cd in freejob (jn=jn@entry=0x131e430, deleting=deleting@entry=1) at ../../Src/jobs.c:1240
#4  0x0000000000433578 in deletejob (jn=jn@entry=0x131e430, disowning=disowning@entry=0) at ../../Src/jobs.c:1295
#5  0x00000000004339ba in printjob (jn=jn@entry=0x131e430, lng=0, synch=synch@entry=0) at ../../Src/jobs.c:1146
#6  0x00000000004348d0 in update_job (jn=0x131e430) at ../../Src/jobs.c:565
#7  0x00000000004636e1 in wait_for_processes () at ../../Src/signals.c:515
#8  0x0000000000463df5 in zhandler (sig=17) at ../../Src/signals.c:594
#9  <signal handler called>
#10 0x0000000000509038 in _int_free ()
#11 0x0000000000417f92 in execshfunc (shf=0x1374c10, args=0x7f847b7567c8) at ../../Src/exec.c:4434
#12 0x00000000004184b0 in execshfunc (args=0x7f847b7567c8, shf=0x1374c10) at ../../Src/exec.c:4310
#13 execfuncdef (state=0x7fff6eb724f0, do_exec=<optimized out>) at ../../Src/exec.c:4319
#14 0x0000000000415cdd in execsimple (state=state@entry=0x7fff6eb724f0) at ../../Src/exec.c:1120
#15 0x000000000041cfb0 in execlist (state=state@entry=0x7fff6eb724f0, dont_change_job=dont_change_job@entry=1, exiting=exiting@entry=0) at ../../Src/exec.c:1259
#16 0x000000000041d540 in execode (p=p@entry=0x1339300, dont_change_job=dont_change_job@entry=1, exiting=exiting@entry=0, context=context@entry=0x59b98b "shfunc") at ../../Src/exec.c:1070
#17 0x0000000000417610 in runshfunc (prog=0x1339300, wrap=0x0, name=0x7f847b75b5e0 "ZTST_execchunk") at ../../Src/exec.c:4895
#18 0x0000000000417bd6 in doshfunc (shfunc=shfunc@entry=0x13390d0, doshargs=doshargs@entry=0x7f847b75b498, noreturnval=noreturnval@entry=0) at ../../Src/exec.c:4775
#19 0x0000000000417f80 in execshfunc (shf=0x13390d0, args=0x7f847b75b498) at ../../Src/exec.c:4432
#20 0x000000000041b7d9 in execshfunc (args=<optimized out>, shf=<optimized out>) at ../../Src/exec.c:4400
#21 execcmd (state=0x7fff6eb73720, input=input@entry=0, output=20401232, output@entry=0, how=20401232, how@entry=2, last1=0) at ../../Src/exec.c:3269
#22 0x000000000041bcfe in execpline2 (state=state@entry=0x7fff6eb73720, pcode=pcode@entry=5507, how=how@entry=2, input=0, output=0, last1=last1@entry=0) at ../../Src/exec.c:1691
#23 0x000000000041c0c6 in execpline (state=state@entry=0x7fff6eb73720, slcode=<optimized out>, how=how@entry=2, last1=0) at ../../Src/exec.c:1478
#24 0x000000000041d271 in execlist (state=state@entry=0x7fff6eb73720, dont_change_job=dont_change_job@entry=1, exiting=exiting@entry=0) at ../../Src/exec.c:1261
#25 0x000000000043d471 in execif (state=0x7fff6eb73720, do_exec=0) at ../../Src/loop.c:525
#26 0x000000000041ac06 in execcmd (state=0x7fff6eb73720, input=input@entry=0, output=20401232, output@entry=0, how=20401232, how@entry=2, last1=0) at ../../Src/exec.c:3232
#27 0x000000000041bcfe in execpline2 (state=state@entry=0x7fff6eb73720, pcode=pcode@entry=5123, how=how@entry=2, input=0, output=0, last1=last1@entry=0) at ../../Src/exec.c:1691
#28 0x000000000041c0c6 in execpline (state=state@entry=0x7fff6eb73720, slcode=<optimized out>, how=how@entry=2, last1=0) at ../../Src/exec.c:1478
#29 0x000000000041d271 in execlist (state=state@entry=0x7fff6eb73720, dont_change_job=dont_change_job@entry=1, exiting=exiting@entry=0) at ../../Src/exec.c:1261
#30 0x000000000043d0bd in execwhile (state=0x7fff6eb73720, do_exec=<optimized out>) at ../../Src/loop.c:420
#31 0x000000000041ac06 in execcmd (state=0x7fff6eb73720, input=input@entry=0, output=20401232, output@entry=0, how=20401232, how@entry=2, last1=0) at ../../Src/exec.c:3232
#32 0x000000000041bcfe in execpline2 (state=state@entry=0x7fff6eb73720, pcode=pcode@entry=323, how=how@entry=2, input=0, output=0, last1=last1@entry=0) at ../../Src/exec.c:1691
#33 0x000000000041c0c6 in execpline (state=state@entry=0x7fff6eb73720, slcode=<optimized out>, how=how@entry=2, last1=0) at ../../Src/exec.c:1478
#34 0x000000000041d271 in execlist (state=state@entry=0x7fff6eb73720, dont_change_job=dont_change_job@entry=1, exiting=exiting@entry=0) at ../../Src/exec.c:1261
#35 0x000000000041d540 in execode (p=p@entry=0x1339440, dont_change_job=dont_change_job@entry=1, exiting=exiting@entry=0, context=context@entry=0x59b98b "shfunc") at ../../Src/exec.c:1070
#36 0x0000000000417610 in runshfunc (prog=0x1339440, wrap=0x0, name=0x7f847b75a910 "ZTST_test") at ../../Src/exec.c:4895
#37 0x0000000000417bd6 in doshfunc (shfunc=shfunc@entry=0x1338820, doshargs=doshargs@entry=0x7f847b75a8b0, noreturnval=noreturnval@entry=0) at ../../Src/exec.c:4775
#38 0x0000000000417f80 in execshfunc (shf=0x1338820, args=0x7f847b75a8b0) at ../../Src/exec.c:4432
#39 0x000000000041b7d9 in execshfunc (args=<optimized out>, shf=<optimized out>) at ../../Src/exec.c:4400
#40 execcmd (state=0x7fff6eb74980, input=input@entry=0, output=20401232, output@entry=0, how=20401232, how@entry=2, last1=0) at ../../Src/exec.c:3269
#41 0x000000000041bcfe in execpline2 (state=state@entry=0x7fff6eb74980, pcode=pcode@entry=32515, how=how@entry=2, input=0, output=0, last1=last1@entry=0) at ../../Src/exec.c:1691
#42 0x000000000041c0c6 in execpline (state=state@entry=0x7fff6eb74980, slcode=<optimized out>, how=how@entry=2, last1=0) at ../../Src/exec.c:1478
#43 0x000000000041d271 in execlist (state=state@entry=0x7fff6eb74980, dont_change_job=dont_change_job@entry=1, exiting=0) at ../../Src/exec.c:1261
#44 0x000000000043d5e7 in execcase (state=0x7fff6eb74980, do_exec=0) at ../../Src/loop.c:603
#45 0x000000000041ac06 in execcmd (state=0x7fff6eb74980, input=input@entry=0, output=20401232, output@entry=0, how=20401232, how@entry=18, last1=0) at ../../Src/exec.c:3232
#46 0x000000000041bcfe in execpline2 (state=state@entry=0x7fff6eb74980, pcode=pcode@entry=31491, how=how@entry=18, input=0, output=0, last1=last1@entry=0) at ../../Src/exec.c:1691
#47 0x000000000041c0c6 in execpline (state=state@entry=0x7fff6eb74980, slcode=<optimized out>, how=how@entry=18, last1=0) at ../../Src/exec.c:1478
#48 0x000000000041d271 in execlist (state=state@entry=0x7fff6eb74980, dont_change_job=dont_change_job@entry=1, exiting=exiting@entry=0) at ../../Src/exec.c:1261
#49 0x000000000043d0bd in execwhile (state=0x7fff6eb74980, do_exec=<optimized out>) at ../../Src/loop.c:420
#50 0x000000000041ac06 in execcmd (state=0x7fff6eb74980, input=input@entry=0, output=20401232, output@entry=0, how=20401232, how@entry=18, last1=0) at ../../Src/exec.c:3232
#51 0x000000000041bcfe in execpline2 (state=state@entry=0x7fff6eb74980, pcode=pcode@entry=31427, how=how@entry=18, input=0, output=0, last1=last1@entry=0) at ../../Src/exec.c:1691
#52 0x000000000041c0c6 in execpline (state=state@entry=0x7fff6eb74980, slcode=<optimized out>, how=how@entry=18, last1=0) at ../../Src/exec.c:1478
#53 0x000000000041d271 in execlist (state=state@entry=0x7fff6eb74980, dont_change_job=dont_change_job@entry=0, exiting=exiting@entry=0) at ../../Src/exec.c:1261
#54 0x000000000041d540 in execode (p=p@entry=0x7f847b75a230, dont_change_job=dont_change_job@entry=0, exiting=exiting@entry=0, context=context@entry=0x59cf89 "toplevel") at ../../Src/exec.c:1070
#55 0x000000000042e00c in loop (toplevel=toplevel@entry=1, justonce=justonce@entry=0) at ../../Src/init.c:185
#56 0x00000000004313b6 in zsh_main (argc=<optimized out>, argv=<optimized out>) at ../../Src/init.c:1625
#57 0x00000000004f1840 in __libc_start_main ()
#58 0x0000000000401c87 in _start ()
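
One way to read the backtrace above: frame #10 is inside _int_free when
the signal handler (zhandler, sig=17, i.e. SIGCHLD on Linux amd64) is
called, and the handler path reaches _int_free again via freejob in
frame #2, where it waits on a lock. A small filter can flag this
pattern in saved "bt" output by listing functions that appear both
above and below the "<signal handler called>" line. This is a rough
sketch; the function name reentered_functions is invented here, the
input is assumed to look like the gdb output above, and a match is only
a hint, not proof of a deadlock.

```sh
# reentered_functions
# Reads gdb "bt" output on stdin and prints each function name that
# occurs both in the signal handler's frames (above the marker) and in
# the interrupted frames (below the marker).
reentered_functions() {
    awk '
        /^[#][0-9]+ +<signal handler called>/ { interrupted = 1; next }
        /^[#][0-9]+ / {
            # frames look like "#N 0xADDR in NAME (...)" or "#N NAME (...)"
            name = ($3 == "in") ? $4 : $2
            if (!interrupted)
                handler[name] = 1
            else if ((name in handler) && !(name in seen)) {
                seen[name] = 1
                print name
            }
        }
    '
}
```

It could be fed directly from gdb, e.g.
"gdb -p 23664 -batch -ex bt 2>/dev/null | reentered_functions".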

~ # gdb -p 22881
GNU gdb (Debian 7.7.1+dfsg-3) 7.7.1
[...]
Attaching to process 22881
Reading symbols from /home/abe/zsh/zsh/obj-static/Src/zsh...done.
0x00000000004f8746 in sigsuspend ()
(gdb) bt
#0  0x00000000004f8746 in sigsuspend ()
#1  0x0000000000463296 in signal_suspend (sig=sig@entry=17, wait_cmd=wait_cmd@entry=0) at ../../Src/signals.c:375
#2  0x0000000000434e46 in zwaitjob (job=<optimized out>, wait_cmd=0) at ../../Src/jobs.c:1454
#3  0x0000000000435527 in waitjobs () at ../../Src/jobs.c:1499
#4  0x000000000041c6eb in execpline (state=state@entry=0x7fffd409efc0, slcode=<optimized out>, how=<optimized out>, how@entry=2, last1=0) at ../../Src/exec.c:1554
#5  0x000000000041d271 in execlist (state=state@entry=0x7fffd409efc0, dont_change_job=dont_change_job@entry=1, exiting=0) at ../../Src/exec.c:1261
#6  0x000000000043c581 in execfor (state=0x7fffd409efc0, do_exec=0) at ../../Src/loop.c:164
#7  0x000000000041ac06 in execcmd (state=0x7fffd409efc0, input=8, input@entry=0, output=output@entry=0, how=-1, how@entry=18, last1=368) at ../../Src/exec.c:3232
#8  0x000000000041bcfe in execpline2 (state=state@entry=0x7fffd409efc0, pcode=pcode@entry=771, how=how@entry=18, input=0, output=0, last1=last1@entry=0) at ../../Src/exec.c:1691
#9  0x000000000041c0c6 in execpline (state=state@entry=0x7fffd409efc0, slcode=<optimized out>, how=how@entry=18, last1=0) at ../../Src/exec.c:1478
#10 0x000000000041d271 in execlist (state=state@entry=0x7fffd409efc0, dont_change_job=dont_change_job@entry=0, exiting=exiting@entry=0) at ../../Src/exec.c:1261
#11 0x000000000041d540 in execode (p=p@entry=0x7fbe3410a948, dont_change_job=dont_change_job@entry=0, exiting=exiting@entry=0, context=context@entry=0x59cf89 "toplevel") at ../../Src/exec.c:1070
#12 0x000000000042e00c in loop (toplevel=toplevel@entry=1, justonce=justonce@entry=0) at ../../Src/init.c:185
#13 0x00000000004313b6 in zsh_main (argc=<optimized out>, argv=<optimized out>) at ../../Src/init.c:1625
#14 0x00000000004f1840 in __libc_start_main ()
#15 0x0000000000401c87 in _start ()

Hope this helps!

For now the process is still there and I can still do some more
investigation -- if it survives my Thinkpad going to Suspend-to-RAM
when I go to work in a few hours.

		Kind regards, Axel

* Re: [Pkg-zsh-devel] Bug#760061: "5 seconds to fail"
  2014-09-29  1:20     ` "5 seconds to fail" Axel Beckert
@ 2014-09-29  1:51       ` Axel Beckert
  2014-09-29  6:25         ` Bart Schaefer
  2014-09-30 17:40       ` [Pkg-zsh-devel] " Axel Beckert
  1 sibling, 1 reply; 16+ messages in thread
From: Axel Beckert @ 2014-09-29  1:51 UTC (permalink / raw)
  To: zsh-workers, 760061

Hi again,

Axel Beckert wrote:
> This means the issue occurred while running the test suite against the
> static build of zsh.
> 
> Tomorrow I'll check whether that was also the case for the other failed
> builds.

I was too curious, so I checked all buildd logs with hanging test
suites immediately after my last mail. We seem to have either a bunch
of different cases or (IMHO more likely) hangs at non-deterministic
places in the test suite (which again would be more in line with
Bart's findings a month ago).

1) Hangs in the static build at

../../Test/A04redirect.ztst: starting.

https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=armhf&ver=5.0.6-1&stamp=1409479814
https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=armhf&ver=5.0.6-2&stamp=1411273633

2) Hangs in the static build at

This test takes 5 seconds to fail...

https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=kfreebsd-amd64&ver=5.0.6-1&stamp=1409578643

This is the type I currently have here for debugging. It's on amd64
Linux, i.e. not Debian GNU/kFreeBSD on amd64 as in the log mentioned
above.

3) Hangs in the static build at

../../Test/A05execution.ztst: starting.

https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=mipsel&ver=5.0.6-1&stamp=1409529387

4) Hangs in the non-static build at

../../Test/X02zlevi.ztst: starting.

https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=amd64&ver=5.0.5-dev-3-1&stamp=1409099297

5) Hangs in the non-static build at

This test takes 5 seconds to fail...

https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=kfreebsd-amd64&ver=5.0.5-dev-2-1&stamp=1407974756
https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=kfreebsd-amd64&ver=5.0.5-dev-2-1&stamp=1407936338

6) Hangs in the non-static build at

../../Test/A05execution.ztst: starting.

https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=kfreebsd-amd64&ver=5.0.6-1&stamp=1409502508

So it's clearly not always the test suite of the static build. But I
plan to disable the test suite run on the statically built zsh for now
anyway, to reduce the probability of running into that issue, as we
currently ignore its result.
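
The classification above boils down to finding the last test file a
build log reports as started. A minimal sketch, assuming a log has been
saved locally; the function name last_started_test is made up for this
example, and it relies only on lines of the form
"../../Test/A04redirect.ztst: starting." as quoted above.

```sh
# last_started_test
# Reads a build log on stdin and prints the test file named on the
# last "<file>.ztst: starting." line, i.e. the test that was running
# when the log ended.
last_started_test() {
    sed -n 's/^\(.*\.ztst\): starting\.$/\1/p' | tail -n 1
}
```

Usage would be "last_started_test < build.log", where build.log is a
hypothetical local copy of one of the buildd logs. Messages such as
"This test takes 5 seconds to fail..." are printed from within a test
file, so they are attributed to whichever file started last.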

Additionally, I've got some more information about the hanging process
in case that helps:

Console output:

HOME="/home/abe/zsh/zsh/obj-static/testhome" dh_auto_test -B obj-static --parallel || true
make[1]: Entering directory '/home/abe/zsh/zsh/obj-static'
cd Test ; make check
make[2]: Entering directory '/home/abe/zsh/zsh/obj-static/Test'
if test -n ""; then \
  cd .. && DESTDIR= \
  make MODDIR=`pwd`/Test/Modules install.modules > /dev/null; \
fi
if ZTST_testlist="`for f in ../../Test/*.ztst; \
           do echo $f; done`" \
 ZTST_srcdir="../../Test" \
 ZTST_exe=../Src/zsh \
 ../Src/zsh +Z -f ../../Test/runtests.zsh; then \
 stat=0; \
else \
 stat=1; \
fi; \
sleep 1; \
rm -rf Modules .zcompdump; \
exit $stat
../../Test/A01grammar.ztst: starting.
This test hangs the shell when it fails...
../../Test/A01grammar.ztst: all tests successful.
../../Test/A02alias.ztst: starting.
This test hangs the shell when it fails...
../../Test/A02alias.ztst: all tests successful.
../../Test/A03quoting.ztst: starting.
../../Test/A03quoting.ztst: all tests successful.
../../Test/A04redirect.ztst: starting.
../../Test/A04redirect.ztst: all tests successful.
../../Test/A05execution.ztst: starting.
Unable to change MONITOR option
This test takes 5 seconds to fail...
<Hangs here>

Open files, again latest child first:

~ # lsof -p 23664
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
zsh     23664  abe  cwd    DIR  253,3     4096  1053779 /home/abe/zsh/zsh/obj-static/Test/command.tmp
zsh     23664  abe  rtd    DIR  253,1     4096        2 /
zsh     23664  abe  txt    REG  253,3  5567456  1053778 /home/abe/zsh/zsh/obj-static/Src/zsh
zsh     23664  abe  mem    REG  253,1   151984   942005 /usr/lib/locale/C.UTF-8/LC_CTYPE
zsh     23664  abe  mem    REG  253,1       50   942004 /usr/lib/locale/C.UTF-8/LC_NUMERIC
zsh     23664  abe  mem    REG  253,1     2454   942993 /usr/lib/locale/C.UTF-8/LC_TIME
zsh     23664  abe  mem    REG  253,1  1501202   941630 /usr/lib/locale/C.UTF-8/LC_COLLATE
zsh     23664  abe  mem    REG  253,1      270   942576 /usr/lib/locale/C.UTF-8/LC_MONETARY
zsh     23664  abe  mem    REG  253,1       48   942574 /usr/lib/locale/C.UTF-8/LC_MESSAGES/SYS_LC_MESSAGES
zsh     23664  abe  mem    REG  253,1       34   941654 /usr/lib/locale/C.UTF-8/LC_PAPER
zsh     23664  abe  mem    REG  253,1       62   938215 /usr/lib/locale/C.UTF-8/LC_NAME
zsh     23664  abe  mem    REG  253,1      131   938217 /usr/lib/locale/C.UTF-8/LC_ADDRESS
zsh     23664  abe  mem    REG  253,1       47   942270 /usr/lib/locale/C.UTF-8/LC_TELEPHONE
zsh     23664  abe  mem    REG  253,1       23   942273 /usr/lib/locale/C.UTF-8/LC_MEASUREMENT
zsh     23664  abe  mem    REG  253,1    26258  1010059 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
zsh     23664  abe  mem    REG  253,1      168   942571 /usr/lib/locale/C.UTF-8/LC_IDENTIFICATION
zsh     23664  abe  mem    REG  253,1  3124784   930019 /usr/lib/locale/locale-archive
zsh     23664  abe    0r   REG   0,32        0 36967738 /tmp/zsh.ztst.in.23664
zsh     23664  abe    1w   REG   0,32        9 36958604 /tmp/zsh.ztst.tout.23664
zsh     23664  abe    2w   REG   0,32        0 36958605 /tmp/zsh.ztst.terr.23664
zsh     23664  abe   10r   REG  253,3    14300  2101965 /home/abe/zsh/zsh/Test/ztst.zsh
zsh     23664  abe   11w  FIFO    0,8      0t0 36921442 pipe
zsh     23664  abe   12r   REG  253,3     4802  2101923 /home/abe/zsh/zsh/Test/A05execution.ztst
zsh     23664  abe   13r  FIFO    0,8      0t0 36920973 pipe
zsh     23664  abe   14w  FIFO    0,8      0t0 36921442 pipe
zsh     23664  abe   15w  FIFO    0,8      0t0 36921442 pipe
zsh     23664  abe   16r  FIFO    0,8      0t0 36968504 pipe
zsh     23664  abe   19w  FIFO    0,8      0t0 36968505 pipe

~ # lsof -p 22881
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
zsh     22881  abe  cwd    DIR  253,3     4096  1053452 /home/abe/zsh/zsh/obj-static/Test
zsh     22881  abe  rtd    DIR  253,1     4096        2 /
zsh     22881  abe  txt    REG  253,3  5567456  1053778 /home/abe/zsh/zsh/obj-static/Src/zsh
zsh     22881  abe  mem    REG  253,1   151984   942005 /usr/lib/locale/C.UTF-8/LC_CTYPE
zsh     22881  abe  mem    REG  253,1       50   942004 /usr/lib/locale/C.UTF-8/LC_NUMERIC
zsh     22881  abe  mem    REG  253,1     2454   942993 /usr/lib/locale/C.UTF-8/LC_TIME
zsh     22881  abe  mem    REG  253,1  1501202   941630 /usr/lib/locale/C.UTF-8/LC_COLLATE
zsh     22881  abe  mem    REG  253,1      270   942576 /usr/lib/locale/C.UTF-8/LC_MONETARY
zsh     22881  abe  mem    REG  253,1       48   942574 /usr/lib/locale/C.UTF-8/LC_MESSAGES/SYS_LC_MESSAGES
zsh     22881  abe  mem    REG  253,1       34   941654 /usr/lib/locale/C.UTF-8/LC_PAPER
zsh     22881  abe  mem    REG  253,1       62   938215 /usr/lib/locale/C.UTF-8/LC_NAME
zsh     22881  abe  mem    REG  253,1      131   938217 /usr/lib/locale/C.UTF-8/LC_ADDRESS
zsh     22881  abe  mem    REG  253,1       47   942270 /usr/lib/locale/C.UTF-8/LC_TELEPHONE
zsh     22881  abe  mem    REG  253,1       23   942273 /usr/lib/locale/C.UTF-8/LC_MEASUREMENT
zsh     22881  abe  mem    REG  253,1    26258  1010059 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
zsh     22881  abe  mem    REG  253,1      168   942571 /usr/lib/locale/C.UTF-8/LC_IDENTIFICATION
zsh     22881  abe  mem    REG  253,1  3124784   930019 /usr/lib/locale/locale-archive
zsh     22881  abe    0r  FIFO    0,8      0t0 36920973 pipe
zsh     22881  abe    1w  FIFO    0,8      0t0 36921442 pipe
zsh     22881  abe    2w  FIFO    0,8      0t0 36921442 pipe
zsh     22881  abe   10r   REG  253,3      758  2101964 /home/abe/zsh/zsh/Test/runtests.zsh
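
One way to read the two listings is to group the FIFO rows by their NODE
number, which shows which descriptors are ends of the same pipe. The
following is only a sketch: the helper name pipes_by_node is made up for
illustration, and the input is just the FIFO rows copied from the listings
above.

```shell
# Group lsof FIFO rows by inode (NODE column) and list "PID:FD" per pipe.
# Columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
pipes_by_node() {
    awk '$5 == "FIFO" { fds[$8] = fds[$8] " " $2 ":" $4 }
         END { for (n in fds) print n ":" fds[n] }' | sort
}

pipes_by_node <<'EOF'
zsh     23664  abe   11w  FIFO    0,8      0t0 36921442 pipe
zsh     23664  abe   13r  FIFO    0,8      0t0 36920973 pipe
zsh     23664  abe   14w  FIFO    0,8      0t0 36921442 pipe
zsh     23664  abe   15w  FIFO    0,8      0t0 36921442 pipe
zsh     23664  abe   16r  FIFO    0,8      0t0 36968504 pipe
zsh     23664  abe   19w  FIFO    0,8      0t0 36968505 pipe
zsh     22881  abe    0r  FIFO    0,8      0t0 36920973 pipe
zsh     22881  abe    1w  FIFO    0,8      0t0 36921442 pipe
zsh     22881  abe    2w  FIFO    0,8      0t0 36921442 pipe
EOF
```

Read this way, pipe 36921442 is stdout/stderr of 22881 and is also held
open by 23664 as 11w, 14w and 15w, while 36968504 and 36968505 appear only
in 23664.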

Contents of the open files in /tmp/:

~ # head -100 /tmp/zsh.ztst.*
==> /tmp/zsh.ztst.err.23664 <==

==> /tmp/zsh.ztst.in.23664 <==

==> /tmp/zsh.ztst.out.23664 <==
1
2
done

==> /tmp/zsh.ztst.terr.23664 <==

==> /tmp/zsh.ztst.tout.23664 <==
1
2
done


HTH.

		Kind regards, Axel
-- 
/~\  Plain Text Ribbon Campaign                   | Axel Beckert
\ /  Say No to HTML in E-Mail and News            | abe@deuxchevaux.org  (Mail)
 X   See http://www.nonhtmlmail.org/campaign.html | abe@noone.org (Mail+Jabber)
/ \  I love long mails: http://email.is-not-s.ms/ | http://noone.org/abe/ (Web)



* Re: [Pkg-zsh-devel] Bug#760061: "5 seconds to fail"
  2014-09-29  1:51       ` [Pkg-zsh-devel] Bug#760061: " Axel Beckert
@ 2014-09-29  6:25         ` Bart Schaefer
  2014-09-29  9:02           ` [Pkg-zsh-devel] Bug#760061: " Axel Beckert
  0 siblings, 1 reply; 16+ messages in thread
From: Bart Schaefer @ 2014-09-29  6:25 UTC (permalink / raw)
  To: 760061, zsh-workers

On Sep 29,  3:51am, Axel Beckert wrote:
}
} 1) Hangs in the static build at
} 
} ../../Test/A04redirect.ztst: starting.

This doesn't mean very much; none of the tests in A04 produce any output
(unless they fail) so it could be stuck anywhere.

To refine, the "make check" needs to be run with ZTST_verbose=1 in the
environment.  See Test/ztst.zsh for more.
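
A minimal sketch of what "in the environment" means here, assuming a
POSIX-like shell: a prefix assignment exports the variable to that one
command only. In the build tree the command would be "make check"; the
"sh -c" child below is just a stand-in so the snippet runs anywhere.

```shell
# In the zsh build tree one would run (not executed here):
#   ZTST_verbose=1 make check
#
# Stand-in demo: the child process sees the variable because the prefix
# assignment places it in that command's environment.
verbose_seen=$(ZTST_verbose=1 sh -c 'printf "%s" "${ZTST_verbose:-unset}"')
echo "child saw ZTST_verbose=$verbose_seen"
```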

} 2) Hangs in the static build at
} 
} This test takes 5 seconds to fail...

This is the only one to which my previous discussions apply.

I've since reproduced this several times, never intentionally, but it
always happens immediately after recompiling the binaries.

} 3) Hangs in the static build at
} 
} ../../Test/A05execution.ztst: starting.

Again this could be in any test, if possible retry with ZTST_verbose=1
 
} 4) Hangs in the non-static build at
} 
} ../../Test/X02zlevi.ztst: starting.

Once more, anywhere.  X02 manipulates zpty terminals to emulate an
interactive session, so is a likely place for race conditions.

} 5) Hangs in the non-static build at
} 
} This test takes 5 seconds to fail...

See (2).

} 6) Hangs in the non-static build at
} 
} ../../Test/A05execution.ztst: starting.

See (3).



* Re: [Pkg-zsh-devel] Bug#760061:  Bug#760061: "5 seconds to fail"
  2014-09-29  6:25         ` Bart Schaefer
@ 2014-09-29  9:02           ` Axel Beckert
  0 siblings, 0 replies; 16+ messages in thread
From: Axel Beckert @ 2014-09-29  9:02 UTC (permalink / raw)
  To: zsh-workers; +Cc: 760061

Hi Bart,

thanks for the feedback.

Bart Schaefer wrote:
> Again this could be in any test, if possible retry with ZTST_verbose=1

Will do. Let's see how quickly I manage to reproduce the issue again.

		Kind regards, Axel
-- 
/~\  Plain Text Ribbon Campaign                   | Axel Beckert
\ /  Say No to HTML in E-Mail and News            | abe@deuxchevaux.org  (Mail)
 X   See http://www.nonhtmlmail.org/campaign.html | abe@noone.org (Mail+Jabber)
/ \  I love long mails: http://email.is-not-s.ms/ | http://noone.org/abe/ (Web)



* Re: [Pkg-zsh-devel] Bug#760061: "5 seconds to fail"
  2014-09-29  1:20     ` "5 seconds to fail" Axel Beckert
  2014-09-29  1:51       ` [Pkg-zsh-devel] Bug#760061: " Axel Beckert
@ 2014-09-30 17:40       ` Axel Beckert
  2014-10-01  7:12         ` Bart Schaefer
  1 sibling, 1 reply; 16+ messages in thread
From: Axel Beckert @ 2014-09-30 17:40 UTC (permalink / raw)
  To: zsh-workers, 760061; +Cc: ivodd

Hi again,

Axel Beckert wrote:
> And this time I had some luck after something like 70 to 100
> successful package builds in one row, […]

It was actually 118 successful builds in a row before the first test
suite run hung.

Axel Beckert wrote:
> Bart Schaefer wrote:
> > Again this could be in any test, if possible retry with ZTST_verbose=1
> 
> Will do. […]

I wasn't able to reproduce the issue since I did that. I'm now at 237
builds with ZTST_verbose=1 in a row without hangs. The latest 10
builds even ran the test suite 10 times in a row per build. I'll let
the builds continue for at least a few more days to check if I can
still reproduce the issue.
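
For anyone who wants to run a similar marathon, a rough sketch of such a
loop follows. It is not the script used for the builds above: the function
name run_until_hang and its arguments are made up for illustration, and it
assumes the GNU coreutils "timeout" command, which exits with status 124
when the time limit is reached.

```shell
# run_until_hang MAX SECS CMD...
# Run CMD up to MAX times, each under a SECS-second limit, and stop at the
# first run that does not finish in time. A run that merely fails (exits
# non-zero on its own) is not counted as a hang.
run_until_hang() {
    max=$1 secs=$2
    shift 2
    i=1
    while [ "$i" -le "$max" ]; do
        rc=0
        timeout "$secs" "$@" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq 124 ]; then      # 124: GNU timeout hit the limit
            echo "run $i hung (no exit within ${secs}s)"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no hang in $max runs"
}

# Harmless demonstration; in a build tree the command would be something
# like "make check" with ZTST_verbose=1 set.
run_until_hang 3 5 true
```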

But I have a slight hope that enabling verbose output avoids some race
condition which is triggering this issue occasionally.

If setting ZTST_verbose=1 really avoids the issue, this may already
help to get rid of the build failures as I plan to also run the
official package builds with ZTST_verbose=1 -- mainly to get some more
details where they fail exactly.

		Regards, Axel
-- 
 ,''`.  |  Axel Beckert <abe@debian.org>, http://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE
  `-    |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5



* Re: [Pkg-zsh-devel] Bug#760061: "5 seconds to fail"
  2014-09-30 17:40       ` [Pkg-zsh-devel] " Axel Beckert
@ 2014-10-01  7:12         ` Bart Schaefer
  2014-10-01  8:51           ` Axel Beckert
  2014-10-02  7:10           ` Axel Beckert
  0 siblings, 2 replies; 16+ messages in thread
From: Bart Schaefer @ 2014-10-01  7:12 UTC (permalink / raw)
  To: Axel Beckert, zsh-workers, 760061; +Cc: ivodd

On Sep 30,  7:40pm, Axel Beckert wrote:
}
} But I have a slight hope that enabling verbose output avoids some race
} condition which is triggering this issue occasionally.

That's quite likely.

Also the patch in zsh-workers/33298 may resolve some deadlocks caused by
signals interrupting memory management.



* Re: [Pkg-zsh-devel] Bug#760061: "5 seconds to fail"
  2014-10-01  7:12         ` Bart Schaefer
@ 2014-10-01  8:51           ` Axel Beckert
  2014-10-02  7:10           ` Axel Beckert
  1 sibling, 0 replies; 16+ messages in thread
From: Axel Beckert @ 2014-10-01  8:51 UTC (permalink / raw)
  To: zsh-workers, 760061; +Cc: ivodd

Hi Bart,

Bart Schaefer wrote:
> On Sep 30,  7:40pm, Axel Beckert wrote:
> } But I have a slight hope that enabling verbose output avoids some race
> } condition which is triggering this issue occasionally.
> 
> That's quite likely.

Perfect. :-)

> Also the patch in zsh-workers/33298 may resolve some deadlocks caused by
> signals interrupting memory management.

JFTR for those not reading zsh-workers, Bart refers to this mail:
http://www.zsh.org/mla/workers/2014/msg01086.html

Yep, saw that. Already wondered if that's related when I saw Vincent's
bug report (http://www.zsh.org/mla/workers/2014/msg01083.html). And I
was surely happy to see a fix for it in such a short time. Thanks!

I'll likely cherry-pick it for the Debian package.

		Regards, Axel
-- 
 ,''`.  |  Axel Beckert <abe@debian.org>, http://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE
  `-    |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5



* Re: [Pkg-zsh-devel] Bug#760061: "5 seconds to fail"
  2014-10-01  7:12         ` Bart Schaefer
  2014-10-01  8:51           ` Axel Beckert
@ 2014-10-02  7:10           ` Axel Beckert
  2014-10-02 15:30             ` Bart Schaefer
  1 sibling, 1 reply; 16+ messages in thread
From: Axel Beckert @ 2014-10-02  7:10 UTC (permalink / raw)
  To: zsh-workers; +Cc: pkg-zsh-devel

Hi,

On Wed, Oct 01, 2014 at 12:12:01AM -0700, Bart Schaefer wrote:
> On Sep 30,  7:40pm, Axel Beckert wrote:
> }
> } But I have a slight hope that enabling verbose output avoids some race
> } condition which is triggering this issue occasionally.
> 
> That's quite likely.
> 
> Also the patch in zsh-workers/33298 may resolve some deadlocks caused by
> signals interrupting memory management.

Despite these two things (I cherry-picked the patch in
zsh-workers/33298 from git) and the fact that I no longer run the test
suite against the static build (which should halve the chance of a
hang), the test suite hung again on one of the build daemons. So far
this has only happened on kfreebsd-amd64, which is also where it
happened by far most often, so there may be something
architecture-specific, too.

Relevant part of the log:

../../Test/A05execution.ztst: starting.
Running test: ./prog execution
Test successful.
Running test: path (1)
Test successful.
Running test: path (2)
Test successful.
Running test: function argument passing
Test successful.
Running test: Aliases in functions
Test successful.
Running test: EXIT trap environment
Test successful.
Running test: return (1)
Test successful.
Running test: return (2)
Test successful.
Running test: autoloading (1)
Test successful.
Running test: autoloading with initialization
Test successful.
Running test: autoloading via -X
Test successful.
Running test: chpwd
Test successful.
Running test: chpwd_functions
Test successful.
Running test: TRAPEXIT
Test successful.
Running test: TRAPDEBUG
Test successful.
Running test: trap DEBUG
Test successful.
Running test: TRAPZERR
Test successful.
Running test: trap ZERR
Test successful.
Running test: Status reset by starting a backgrounded command
E: Caught signal ‘Terminated’: terminating immediately
make[1]: *** [test] Terminated
make: *** [build-arch] Terminated
Makefile:264: recipe for target 'test' failed
debian/rules:54: recipe for target 'build-arch' failed
make[2]: *** [check] Terminated
Makefile:188: recipe for target 'check' failed
Build killed with signal TERM after 150 minutes of inactivity

Full log at https://buildd.debian.org/status/fetch.php?pkg=zsh&arch=kfreebsd-amd64&ver=5.0.6-3&stamp=1412220279

		Kind regards, Axel
-- 
/~\  Plain Text Ribbon Campaign                   | Axel Beckert
\ /  Say No to HTML in E-Mail and News            | abe@deuxchevaux.org  (Mail)
 X   See http://www.nonhtmlmail.org/campaign.html | abe@noone.org (Mail+Jabber)
/ \  I love long mails: http://email.is-not-s.ms/ | http://noone.org/abe/ (Web)



* Re: [Pkg-zsh-devel] Bug#760061: "5 seconds to fail"
  2014-10-02  7:10           ` Axel Beckert
@ 2014-10-02 15:30             ` Bart Schaefer
  2014-10-02 15:57               ` Axel Beckert
  0 siblings, 1 reply; 16+ messages in thread
From: Bart Schaefer @ 2014-10-02 15:30 UTC (permalink / raw)
  To: Axel Beckert, zsh-workers

[Removed debian from the reply for now.]

On Oct 2,  9:10am, Axel Beckert wrote:
}
} Running test: Status reset by starting a backgrounded command
} E: Caught signal `Terminated': terminating immediately

Well, THAT'S a bit unexpected.  Unless for some reason the output has
not been flushed, this means it's hanging here:

  false
  sleep 1000 &
  print $?
  kill $!


That's the third-from-last test in A05.  I don't suppose you could move
it to the very end and see if that changes the results?  E.g., if it
then fails at "trap ZERR" then lack of output flushing is masking the
real problem.
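
For readers who don't have A05 at hand, here is a portable sketch of what
that test exercises: starting a background job resets $? to 0 even though
the preceding command failed. It uses "echo" in place of "print" so it also
runs in other POSIX-like shells, and wraps the sequence in "sh -c" to
capture the result; neither detail comes from the test file.

```shell
# Same command sequence as the A05 test quoted above, run in a child shell.
status=$(sh -c '
    false                    # leaves $? at 1
    sleep 1000 >/dev/null &  # launching a background job resets $? to 0
    echo $?
    kill $!                  # stop the sleeper so nothing lingers
')
echo "status after backgrounding: $status"
```

None of these four commands should block, which is why a hang at this
point is surprising.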



* Re: [Pkg-zsh-devel] Bug#760061: "5 seconds to fail"
  2014-10-02 15:30             ` Bart Schaefer
@ 2014-10-02 15:57               ` Axel Beckert
  0 siblings, 0 replies; 16+ messages in thread
From: Axel Beckert @ 2014-10-02 15:57 UTC (permalink / raw)
  To: zsh-workers

Hi Bart,

On Thu, Oct 02, 2014 at 08:30:20AM -0700, Bart Schaefer wrote:
> [Removed debian from the reply for now.]

That's ok. I just want to track that we're working on it because the
Debian Release Team has an eye on that issue...

> On Oct 2,  9:10am, Axel Beckert wrote:
> } Running test: Status reset by starting a backgrounded command
> } E: Caught signal `Terminated': terminating immediately
> 
> Well, THAT'S a bit unexpected.  Unless for some reason the output has
> not been flushed, this means it's hanging here:

I wondered about flushed/non-flushed output, too. Usually it's saved
via "| tee"...

> That's the third-from-last test in A05.  I don't suppose you could move
> it to the very end and see if that changes the results?

Since the recent build only hung on the architecture where it has
happened most often in the past, I plan to run my next build marathon
on that architecture to see if I can reproduce it more easily there.

I can do that modification for that run.

		Kind regards, Axel
-- 
/~\  Plain Text Ribbon Campaign                   | Axel Beckert
\ /  Say No to HTML in E-Mail and News            | abe@deuxchevaux.org  (Mail)
 X   See http://www.nonhtmlmail.org/campaign.html | abe@noone.org (Mail+Jabber)
/ \  I love long mails: http://email.is-not-s.ms/ | http://noone.org/abe/ (Web)



end of thread, other threads:[~2014-10-02 15:57 UTC | newest]

Thread overview: 16+ messages
2014-08-24 17:39 5.0.5-dev-3 Peter Stephenson
2014-08-26 17:11 ` 5.0.5-dev-3 Dominic Hopf
2014-08-27 16:28 ` 5.0.5-dev-3 Axel Beckert
2014-08-28 19:54   ` 5.0.5-dev-3 Axel Beckert
2014-08-30 23:46     ` 5.0.5-dev-3 Axel Beckert
2014-08-30 23:25   ` "5 seconds to fail" 5.0.5-dev-3 Bart Schaefer
2014-09-29  1:20     ` "5 seconds to fail" Axel Beckert
2014-09-29  1:51       ` [Pkg-zsh-devel] Bug#760061: " Axel Beckert
2014-09-29  6:25         ` Bart Schaefer
2014-09-29  9:02           ` [Pkg-zsh-devel] Bug#760061: " Axel Beckert
2014-09-30 17:40       ` [Pkg-zsh-devel] " Axel Beckert
2014-10-01  7:12         ` Bart Schaefer
2014-10-01  8:51           ` Axel Beckert
2014-10-02  7:10           ` Axel Beckert
2014-10-02 15:30             ` Bart Schaefer
2014-10-02 15:57               ` Axel Beckert

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/
