zsh-workers
* zsh froze under Mac OS X
@ 2010-11-22 14:49 Vincent Lefevre
  2010-11-22 15:39 ` Bart Schaefer
  0 siblings, 1 reply; 5+ messages in thread
From: Vincent Lefevre @ 2010-11-22 14:49 UTC (permalink / raw)
  To: zsh-workers

Hi,

I had the following problem with zsh 4.3.10-dev-2 under Mac OS X
(it includes two patches that were posted here: one for
Src/Zle/computil.c and one for Src/jobs.c).

I did the following:
1. I quit a program (mutt) that had been running for several days in screen.
2. A few minutes later, I reran the same program (but I don't think
   zsh started it, because nothing happened).
3. I pressed Ctrl-C, but the prompt never came back.

But as the machine became very slow after (1), I suspect that the
problems started at that point.

When I looked with htop after (3), I could see that zsh was taking
100% CPU time with more and more memory (I killed it with "kill -9",
while it was at 40% memory, according to htop).

AFAIK, this is the first time I have seen this exact problem. But note
that I often get zsh crashes when an ssh process started by zsh and
running for several days terminates.

Output of "sample 1539 10 10" while zsh was running:

Analysis of sampling pid 1539 every 10.000000 milliseconds
Call graph:
    993 Thread_100b
      993 start
        993 _start
          993 zsh_main
            993 zexit
              993 sourcehome
                993 source
                  993 loop
                    993 parse_event
                      993 par_event
                        993 par_sublist
                          993 par_sublist2
                            993 par_pline
                              993 par_cmd
                                992 par_simple
                                  989 zshlex
                                    978 gettokstr
                                      973 hcalloc
                                        965 zhalloc
                                          936 zhalloc
                                          29 zalloc
                                            28 malloc
                                              28 szone_malloc
                                                28 large_and_huge_malloc
                                                  26 vm_allocate
                                                    26 mach_msg
                                                      26 mach_msg_trap
                                                        26 mach_msg_trap
                                                  2 large_and_huge_malloc
                                            1 zalloc
                                        8 __bzero
                                          8 __bzero
                                      4 inungetc
                                        2 zshcalloc
                                          1 malloc
                                            1 malloc
                                          1 szone_malloc
                                            1 szone_malloc
                                        1 inputsetline
                                          1 inputsetline
                                        1 malloc
                                          1 malloc
                                      1 inpush
                                        1 inpush
                                    6 ingetc
                                      3 inpoptop
                                        2 free
                                          1 free
                                          1 szone_size
                                            1 szone_size
                                        1 inpoptop
                                      2 free
                                        2 free
                                      1 ingetc
                                    2 exalias
                                      1 exalias
                                      1 gethashnode
                                        1 gethashnode
                                    2 zshlex
                                    1 0x87440
                                      1 inungetc
                                        1 inungetc
                                  2 ecstrcode
                                    2 dyld_stub_strlen
                                      2 dyld_stub_strlen
                                  1 ecadd
                                    1 ecadd
                                1 ecadd
                                  1 ecadd

Total number in stack (recursive counted multiple, when >=5):

Sort by top of stack, same collapsed (when >= 5):
        zhalloc        936
        mach_msg_trap        26
        __bzero        8
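
As a reading aid for the summary above, the top-of-stack counts can be
expressed as shares of the 993 samples. This is only a sketch: the
helper name sample_share is hypothetical, and the input is simply the
three summary lines copied from the sample output.

```shell
# Hypothetical helper: print each top-of-stack entry as a share of all samples.
# Usage: sample_share TOTAL < lines of "function count"
sample_share() {
  awk -v total="$1" 'NF == 2 { printf "%s %.1f%%\n", $1, 100 * $2 / total }'
}

# The three lines from "Sort by top of stack" above, with 993 total samples.
sample_share 993 <<'EOF'
        zhalloc        936
        mach_msg_trap        26
        __bzero        8
EOF
```

On these figures about 94% of the samples have zhalloc on top of the
stack, which fits the steadily growing memory seen in htop.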

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)



* Re: zsh froze under Mac OS X
  2010-11-22 14:49 zsh froze under Mac OS X Vincent Lefevre
@ 2010-11-22 15:39 ` Bart Schaefer
  2010-11-23 11:08   ` Peter Stephenson
  2010-11-23 12:03   ` Vincent Lefevre
  0 siblings, 2 replies; 5+ messages in thread
From: Bart Schaefer @ 2010-11-22 15:39 UTC (permalink / raw)
  To: zsh-workers

On Nov 22,  3:49pm, Vincent Lefevre wrote:
}
} Call graph:
}             993 zexit
}               993 sourcehome
}                 993 source
}                   993 loop

This indicates that it's reading your .zlogout file.  Anything there
that could account for the behavior?

This ...

}                                     2 exalias
}                                       1 exalias
}                                       1 gethashnode
}                                         1 gethashnode

... makes me suspect you've got a recursively-expanding alias involved,
but that's much less certain than that it's .zlogout related.
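
A rough way to check the alias suspicion, offered as a heuristic sketch
rather than a diagnosis: list the aliases whose expansion starts with
the name of another alias. The helper name alias_chains is hypothetical,
and it only looks at the first word of each definition as printed by
the alias builtin.

```shell
# Hypothetical helper: read "name=value" alias definitions on stdin and
# report aliases whose expansion begins with the name of another alias.
alias_chains() {
  awk -v q="'" '
    {
      line = $0
      sub(/^alias /, "", line)          # some shells prefix each line with "alias "
      eq = index(line, "=")
      if (eq == 0) next
      name = substr(line, 1, eq - 1)
      value = substr(line, eq + 1)
      # Drop one layer of surrounding single or double quotes.
      gsub("^[" q "\"]|[" q "\"]$", "", value)
      split(value, words, " ")
      first[name] = words[1]
    }
    END {
      # An alias that starts with its own name is common and is skipped.
      for (name in first)
        if ((first[name] in first) && first[name] != name)
          print name " -> " first[name]
    }
  ' | sort
}

# Usage in an interactive shell:
#   alias | alias_chains
```

An empty result does not rule aliases out; it only says no alias
begins with another alias.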

-- 



* Re: zsh froze under Mac OS X
  2010-11-22 15:39 ` Bart Schaefer
@ 2010-11-23 11:08   ` Peter Stephenson
  2010-11-23 13:40     ` Vincent Lefevre
  2010-11-23 12:03   ` Vincent Lefevre
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Stephenson @ 2010-11-23 11:08 UTC (permalink / raw)
  To: zsh-workers

On Mon, 22 Nov 2010 07:39:00 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Nov 22,  3:49pm, Vincent Lefevre wrote:
> }
> } Call graph:
> }             993 zexit
> }               993 sourcehome
> }                 993 source
> }                   993 loop
> 
> This indicates that it's reading your .zlogout file.  Anything there
> that could account for the behavior?

Obviously we need to see what it's executing to get further.  The
information in the static pointers in input.c, inbuf and inbufptr, ought
to have been revealing.  Could there be .zwc files involved?
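
One possible way to look at those two variables on a live process is to
attach a debugger. The sketch below makes several assumptions: GDB is
installed and recent enough to accept -p, -batch and -ex, the zsh
binary carries debug symbols (static variables are not visible
otherwise), and the user is allowed to attach. The helper name
zsh_input_state_cmd is hypothetical, and it only prints the command
line so that it can be reviewed before being run.

```shell
# Hypothetical helper: print a gdb command line that attaches to a running
# zsh, shows the input buffer state and a backtrace, then exits.
# Assumes a zsh binary with debug symbols; inbuf and inbufptr are the
# variables in input.c mentioned above.
zsh_input_state_cmd() {
  printf 'gdb -p %s -batch -ex "print inbuf" -ex "print inbufptr" -ex "bt"\n' "$1"
}

# PID 1539 is the one from the original report.
zsh_input_state_cmd 1539
```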

Strange things can happen if the file the shell is executing changes
under it, but the implication here is it started looking at the logout
file and immediately ran into problems, which sounds different.

> This ...
> 
> }                                     2 exalias
> }                                       1 exalias
> }                                       1 gethashnode
> }                                         1 gethashnode
> 
> ... makes me suspect you've got a recursively-expanding alias
> involved, but that's much less certain than that it's .zlogout
> related.

I don't know exactly what the numbers mean, but they're probably
something like the number of times this was called.  This suggests to me
it's only got here once, so it's unlikely to be a key part of the
problem.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK





* Re: zsh froze under Mac OS X
  2010-11-22 15:39 ` Bart Schaefer
  2010-11-23 11:08   ` Peter Stephenson
@ 2010-11-23 12:03   ` Vincent Lefevre
  1 sibling, 0 replies; 5+ messages in thread
From: Vincent Lefevre @ 2010-11-23 12:03 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1914 bytes --]

On 2010-11-22 07:39:00 -0800, Bart Schaefer wrote:
> On Nov 22,  3:49pm, Vincent Lefevre wrote:
> }
> } Call graph:
> }             993 zexit
> }               993 sourcehome
> }                 993 source
> }                   993 loop
> 
> This indicates that it's reading your .zlogout file.  Anything there
> that could account for the behavior?

I have the following:

----
source ~/.zdomain
----

which just has "domain=local.prunille".

----
if [[ -n $SSH_AUTH_SOCK ]] then
  if [[ `whence -w _call_sshagent` == '_call_sshagent: function' ]] then
    _call_sshagent -r
  elif [[ -n $SSH_AGENT_PID ]] then
    eval `ssh-agent -k`
  fi
fi
----

Here "_call_sshagent -r" was executed. I've attached the source of
this function.

----
[[ $OSTYPE == linux && $TTY == /dev/tty* ]] && clear

[[ $OSTYPE == linux && -n $SSH_CLIENT &&
   ${(M)${(f)"$(</proc/$PPID/status)"}:#Name:*} == Name:[[:blank:]]#sshd ]] &&
  kill -HUP $PPID 2>/dev/null
----

The OS isn't linux, so nothing should be done here.

----
true
----

So, it seems that zsh froze in _call_sshagent.
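
Since the call graph shows only parser frames below loop, markers
between the sections of .zlogout might help tell which section was the
last one reached. A minimal sketch, assuming the problem can be
reproduced at all, which is not established here; logout_mark and
ZLOGOUT_LOG are hypothetical names, not something zsh defines.

```shell
# Hypothetical helper: append a timestamped marker to a log file.
# ZLOGOUT_LOG is an assumed variable name; the default path is arbitrary.
logout_mark() {
  printf '%s %s\n' "$(date '+%H:%M:%S')" "$*" >> "${ZLOGOUT_LOG:-$HOME/.zlogout.log}"
}

# Intended placement in .zlogout (shown as comments to keep this inert):
#   logout_mark "before .zdomain"
#   source ~/.zdomain
#   logout_mark "before ssh-agent cleanup"
#   ...
#   logout_mark "end of .zlogout"
```

The last marker in the log would then bound where the shell stopped
making progress.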

> This ...
> 
> }                                     2 exalias
> }                                       1 exalias
> }                                       1 gethashnode
> }                                         1 gethashnode
> 
> ... makes me suspect you've got a recursively-expanding alias involved,
> but that's much less certain than that it's .zlogout related.

Well, problems started to occur before I tried to exit zsh. There
could have been a problem in .zlogout due to an earlier memory
corruption or something like that.

If this is memory corruption, then this could be the same bug I've
been noticing for years...

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

[-- Attachment #2: _call_sshagent --]
[-- Type: text/plain, Size: 3249 bytes --]

#!/usr/bin/env zsh

# Usage: _call_sshagent [ -l | -r ]
#   -l: try to use an existing ssh-agent and change SSH_AUTH_SOCK
#       accordingly. This is useful for some non-login shells (no
#       possible clean-up by the .zlogout).
#   -r: remove the socket associated with the current process and
#       kill ssh-agent if there are no sockets any longer.
#
# Note: You should execute _call_sshagent from your .zlogin and have
# the following code (or similar) in your .zlogout so that after you
# exit the last login shell, the authentication agent is killed.
#
# if [[ -n $SSH_AUTH_SOCK ]] then
#   if [[ `whence -w _call_sshagent` == '_call_sshagent: function' ]] then
#     _call_sshagent -r
#   elif [[ -n $SSH_AGENT_PID ]] then
#     eval `ssh-agent -k`
#   fi
# fi
#
# Also, if you use the "screen" utility and do SSH connections from
# it, the shells started by screen should be login shells (thanks to
# a line "shell -zsh" in your .screenrc) to make sure that ssh-agent
# will still be running after you exit all the other shells.

emulate -LR zsh

local link=/tmp/ssh-agent-$USER

local i=0
until (ln -s /dev/null $link.lock 2> /dev/null)
do
  [[ $i -eq 0 ]] && echo "$0: waiting for lock" >&2
  if [[ $((++i)) -eq 4 ]] then
    echo "$0: can't lock $link" >&2
    return
  fi
  sleep 2
done

local dir=`readlink $link`

if [[ $1 == -r ]] then

  if [[ -O $link && -d $dir && -O $dir && $SSH_AUTH_SOCK == $link/* ]] then
    local others
    rm -f $SSH_AUTH_SOCK
    unset SSH_AUTH_SOCK
    others=($dir/agent.*(N=))
    if [[ -z $others ]] then
      local pid=$(<$dir/ssh-agent.pid)
      rm -f $link $dir/ssh-agent.pid
      kill -TERM $pid
      kill_sshmasters
    fi
  else
    # Inconsistent data, try to kill ssh-agent in the standard way
    eval `ssh-agent -k`
  fi

elif [[ $1 == -l ]] then

  if [[ -O $link && -d $dir && -O $dir ]] then
    local old
    old=($link/agent.*(N=[1]))
    if [[ -S $old ]] then
      SSH_AUTH_SOCK=$old ssh-add -l >& /dev/null
      if [[ $? -ne 2 ]] then
        export SSH_AUTH_SOCK=$old
        unset SSH_AGENT_PID
      fi
    else
      echo "$0: $old isn't a socket" >&2
    fi
  fi

else

  if [[ -O $link && -d $dir && -O $dir ]] then
    local old
    old=($link/agent.*(N=[1]))
    if [[ -S $old ]] then
      SSH_AUTH_SOCK=$old ssh-add -l >& /dev/null
      if [[ $? -eq 2 ]] then
        # The agent could not be contacted, assume that it has died
        rm -f $dir/agent.*(N) $dir/ssh-agent.pid && rmdir $dir
        rm -f $link
        rm -f $link.lock
        $0
        return
      fi
      local new=$link/agent.$$
      if [[ $new == $old ]] || ln -f $old $new; then
        export SSH_AUTH_SOCK=$new
        unset SSH_AGENT_PID
      else
        echo "$0: can't link $new -> $old" >&2
      fi
    else
      echo "$0: $old isn't a socket" >&2
    fi
  elif eval `ssh-agent`; then
    if ln -fs $SSH_AUTH_SOCK:h $link; then
      local old=$SSH_AUTH_SOCK
      echo $SSH_AGENT_PID > $link/ssh-agent.pid
      rm -f $link.lock
      $0 && rm -f $old
      return
    else
      echo "$0: can't symlink $dir -> $SSH_AUTH_SOCK:h" >&2
    fi
  else
    echo "$0: can't call ssh-agent" >&2
  fi

fi

rm -f $link.lock

# $Id: _call_sshagent 14529 2006-10-22 10:13:13Z lefevre $


* Re: zsh froze under Mac OS X
  2010-11-23 11:08   ` Peter Stephenson
@ 2010-11-23 13:40     ` Vincent Lefevre
  0 siblings, 0 replies; 5+ messages in thread
From: Vincent Lefevre @ 2010-11-23 13:40 UTC (permalink / raw)
  To: zsh-workers

On 2010-11-23 11:08:54 +0000, Peter Stephenson wrote:
> On Mon, 22 Nov 2010 07:39:00 -0800
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > On Nov 22,  3:49pm, Vincent Lefevre wrote:
> > }
> > } Call graph:
> > }             993 zexit
> > }               993 sourcehome
> > }                 993 source
> > }                   993 loop
> > 
> > This indicates that it's reading your .zlogout file.  Anything there
> > that could account for the behavior?
> 
> Obviously we need to see what it's executing to get further.  The
> information in the static pointers in input.c, inbuf and inbufptr, ought
> to have been revealing.  Could there be .zwc files involved?

No, I don't have any such file.
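
For reference, a check along these lines can be scripted. This is a
sketch only: list_zwc is a hypothetical helper, it looks in a single
directory (by default the one holding the startup files), and it relies
on -maxdepth, which GNU and BSD find provide but POSIX does not require.

```shell
# Hypothetical helper: list compiled .zwc files directly inside a directory.
# Defaults to the directory zsh reads its startup files from.
list_zwc() {
  find "${1:-${ZDOTDIR:-$HOME}}" -maxdepth 1 -name '*.zwc' -print
}

# Usage:
#   list_zwc            # checks ${ZDOTDIR:-$HOME}
#   list_zwc /some/dir  # checks another directory
```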

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

