zsh-workers
 help / Atom feed
* segfault via completion menu
@ 2019-04-09 14:30 Wesley Schwengle
  2019-05-20 20:55 ` Oliver Kiddle
  0 siblings, 1 reply; 9+ messages in thread
From: Wesley Schwengle @ 2019-04-09 14:30 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

Hello all,

today I've experiences a segfault running on an update to date zsh
from git (6f35d6c0d0eeb80c0145e5226285a8a45ffb5f25)

I can trigger this in a fairly "large" git repo such as the one I have
at work, or using the repository of git itself. I've included the
output of gdb

I can trigger the bug with zsh -f and the attached zshrc:

* cd ~code/git # the git repo from git@github.com:git/git.git
* zsh -f
* source the zshrc provided in this e-mail
* vi Zaaksysteem::Bar::voo::vooo::voo<tab> <nowwaitafewsecs> <ctrl-c>
* You now get a message: Killed by signal in compadd after 0s
* vi <tab> # yields a segfault

Cheers,
Wesley

-- 
Wesley Schwengle, Developer
Mintlab B.V., https://www.zaaksysteem.nl
E: wesley@mintlab.nl
T:  +31 20 737 00 05

[-- Attachment #2: cmpadd --]
[-- Type: application/octet-stream, Size: 13467 bytes --]

[-- Attachment #3: minimal-zshrc --]
[-- Type: application/octet-stream, Size: 815 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: segfault via completion menu
  2019-04-09 14:30 segfault via completion menu Wesley Schwengle
@ 2019-05-20 20:55 ` Oliver Kiddle
  2019-05-21 21:58   ` Daniel Shahaf
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Kiddle @ 2019-05-20 20:55 UTC (permalink / raw)
  To: Wesley Schwengle; +Cc: zsh-workers

On 9 Apr, Wesley Schwengle wrote:
> today I've experiences a segfault running on an update to date zsh
> from git (6f35d6c0d0eeb80c0145e5226285a8a45ffb5f25)

Thanks for sending this to us.

> I can trigger this in a fairly "large" git repo such as the one I have
> at work, or using the repository of git itself. I've included the
> output of gdb
>
> I can trigger the bug with zsh -f and the attached zshrc:
>
> * cd ~code/git # the git repo from git@github.com:git/git.git
> * zsh -f
> * source the zshrc provided in this e-mail
> * vi Zaaksysteem::Bar::voo::vooo::voo<tab> <nowwaitafewsecs> <ctrl-c>
> * You now get a message: Killed by signal in compadd after 0s
> * vi <tab> # yields a segfault

This seems to be due to interrupting of pattern matching. I was able to
cut this down to something that doesn't involve completion:

  zsh -f
  setopt extendedglob
  () {
    TRAPINT() {
      return 1
    }
    : **/*~(#a10)Zaaksysteem::Bar::voo::vooo::voo
  }

  interrupt the function with Ctrl-C and now do something that involves
  pattern matching, e.g:

  [[ a = :*: ]]

The glob can be varied, it just needs to take long enough to give you
time to catch it with Ctrl-C so pick a big enough directory.

Older versions of zsh didn't have the problem so I've been able to
bisect it down to the change that introduced it:

[827d36077641ca87d1796b9c5cb05e7c44b01919] 36853: replace pushheap/popheap by NEWHEAPS/OLDHEAPS in doshfunc() to optimize memory management

Backing that out on top of master appears to fix the issue. As it was an
optimisation, that might be an option. From reading comments in mem.c,
it's not especially clear to me what newheaps/oldheaps do. There's only
the one other use.

Oliver

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: segfault via completion menu
  2019-05-20 20:55 ` Oliver Kiddle
@ 2019-05-21 21:58   ` Daniel Shahaf
  2019-05-21 22:19     ` Bart Schaefer
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Shahaf @ 2019-05-21 21:58 UTC (permalink / raw)
  To: Oliver Kiddle, Wesley Schwengle; +Cc: zsh-workers

Oliver Kiddle wrote on Mon, 20 May 2019 20:57 +00:00:
> Backing that out on top of master appears to fix the issue. As it was an
> optimisation, that might be an option. From reading comments in mem.c,
> it's not especially clear to me what newheaps/oldheaps do. There's only
> the one other use.

It's not too clear to me either.

The defining property of a heap, IIUC, is that freeheap() and popheap()
release all heap-allocated memory allocated since the last pushheap().
In this respect they're analogous to APR pools, which I'm familiar
with¹.

Then, what NEWHEAPS() seems to do is put away the entire stack
of heaps, create a new stack of heaps, and then OLDHEAPS() frees the
entire new stack of heaps and reverts to the old one.  How is this
better than simply doing pushheap() and popheap()?  

- Code between NEWHEAPS() and OLDHEAPS() doesn't have to be careful to
  match pushheap() and popheap() calls exactly, because OLDHEAPS() will
  clean up everything anyway.
  
- Anything allocated on heap between NEWHEAPS() and OLDHEAPS() becomes
  invalid once OLDHEAPS() is called.

- There might be some considerations about maximum depth of the stack or
  total number of bytes allocated by heaps in a single stack?

- (Anything else?)

HTH,

Daniel

¹ https://subversion.apache.org/docs/community-guide/conventions.html#apr-pools
  pushheap() ≈ { p = apr_pool_create(p); }
  popheap() ≈ { tmp = p->parent; apr_pool_destroy(p); p = tmp; }

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: segfault via completion menu
  2019-05-21 21:58   ` Daniel Shahaf
@ 2019-05-21 22:19     ` Bart Schaefer
  2019-05-22  8:49       ` Peter Stephenson
  0 siblings, 1 reply; 9+ messages in thread
From: Bart Schaefer @ 2019-05-21 22:19 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Oliver Kiddle, Wesley Schwengle, zsh-workers

On Tue, May 21, 2019 at 2:59 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Oliver Kiddle wrote on Mon, 20 May 2019 20:57 +00:00:
> > Backing that out on top of master appears to fix the issue. As it was an
> > optimisation, that might be an option. From reading comments in mem.c,
> > it's not especially clear to me what newheaps/oldheaps do. There's only
> > the one other use.
>
> It's not too clear to me either.

Operationally, the difference between pushheap() and NEWHEAPS() is
that pushing scans the entire list of existing heap blocks and moves
them out of the way in favor of new empty blocks.  This is
time-consuming and eats a lot of memory if you're in a tightly
recursive function (read back through the articles leading up to the
change).  In contrast, NEWHEAPS() just starts an entirely new chain of
heap blocks without reference to the existing chain.

If an error occurs as a result of NEWHEAPS()/OLDHEAPS() in the context
of workers/36853, it ought to be traceable to something leaking
heap-allocated storage across boundaries, and probably means there was
a memory leak when interrupting a pattern match before, which this has
turned into an error by freeing the previously leaked space.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: segfault via completion menu
  2019-05-21 22:19     ` Bart Schaefer
@ 2019-05-22  8:49       ` Peter Stephenson
  2019-05-23 16:34         ` Peter Stephenson
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2019-05-22  8:49 UTC (permalink / raw)
  To: zsh-workers

On Tue, 2019-05-21 at 15:19 -0700, Bart Schaefer wrote:
> On Tue, May 21, 2019 at 2:59 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> If an error occurs as a result of NEWHEAPS()/OLDHEAPS() in the context
> of workers/36853, it ought to be traceable to something leaking
> heap-allocated storage across boundaries, and probably means there was
> a memory leak when interrupting a pattern match before, which this has
> turned into an error by freeing the previously leaked space.

This is really just saying the same thing a different way, but ---
the original crash was in patmatch() when looking at the pattern code,
which is set up when we compile a pattern in a different
function.  So there's probably some path where it's possible not to
recompile a pattern or reuse a pattern without recompiling, or simply
hang on to it too long.

It'll be something in the prog passed into pattry() from evalcond() and
I'm guesing in this case the pprog in that function came from
stat->prog->pats[npat] so was fished out of the existing programme
rather than compiled locally.

In general (and fairly obviously), anything stored long term in compiled
code is in permanent (i.e. explicitly freed) storage and if the value is
on the heap it should only have been parsed or copied from permanent
storage for immediate use.

pws


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: segfault via completion menu
  2019-05-22  8:49       ` Peter Stephenson
@ 2019-05-23 16:34         ` Peter Stephenson
  2019-05-24 22:34           ` Peter Stephenson
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2019-05-23 16:34 UTC (permalink / raw)
  To: zsh-workers

On Wed, 2019-05-22 at 09:49 +0100, Peter Stephenson wrote:
> On Tue, 2019-05-21 at 15:19 -0700, Bart Schaefer wrote:
> It'll be something in the prog passed into pattry() from evalcond() and
> I'm guesing in this case the pprog in that function came from
> stat->prog->pats[npat] so was fished out of the existing programme
> rather than compiled locally.

If so, something in the following assumptions is being violated, but I
can't see what from looking at the code.

- Whenever we initialise or copy pats[*] we set it to point to a dummy
pattern, ensuring we recompile the pattern next time.

- If we compile the pattern, we only save it if the code it's
being saved into is not on the heap.  (Which it shouldn't be
here, as this is a shell function.)

- If we save it we always use permanent memory.

Possibly adding some debug flags to check the heapiness of patterns at
various points might show something.  (Guess we should check it
is coming from a previously saved version rather than a newly
compiled version, though I don't see how the latter could cause
a crash.)

pws


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: segfault via completion menu
  2019-05-23 16:34         ` Peter Stephenson
@ 2019-05-24 22:34           ` Peter Stephenson
  2019-05-31 12:00             ` Wesley Schwengle
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2019-05-24 22:34 UTC (permalink / raw)
  To: zsh-workers

On Thu, 2019-05-23 at 17:34 +0100, Peter Stephenson wrote:
> On Wed, 2019-05-22 at 09:49 +0100, Peter Stephenson wrote:
> > On Tue, 2019-05-21 at 15:19 -0700, Bart Schaefer wrote:
> > It'll be something in the prog passed into pattry() from evalcond() and
> > I'm guesing in this case the pprog in that function came from
> > stat->prog->pats[npat] so was fished out of the existing programme
> > rather than compiled locally.
> 
> If so, something in the following assumptions is being violated, but I
> can't see what from looking at the code.

Alternatively, there could be some internal state in the pattern
matching left over in the case of an error return --- there are
variables that are intended to be used in recursive matching but
currently aren't explicity initialised at the start of matching.  It's
not clear how that could happen, but it would be safest anyway to
sanitise always on entry.

pws

diff --git a/Src/pattern.c b/Src/pattern.c
index 737f5cdcb..3d30b013c 100644
--- a/Src/pattern.c
+++ b/Src/pattern.c
@@ -2030,6 +2030,16 @@ int errsfound;				/* Total error count so far */
 /**/
 int forceerrs;				/* Forced maximum error count */
 
+/*
+ * exactpos is used to remember how far down an exact string we have
+ * matched, if we are doing approximation and can therefore redo from
+ * the same point; we never need to otherwise.
+ *
+ * exactend is a pointer to the end of the string, which isn't
+ * null-terminated.
+ */
+static char *exactpos, *exactend;
+
 /**/
 void
 pattrystart(void)
@@ -2463,6 +2473,8 @@ pattryrefs(Patprog prog, char *string, int stringlen, int unmetalenin,
 
 	patinput = patinstart;
 
+	exactpos = exactend = NULL;
+	/* The only external call to patmatch --- all others are recursive */
 	if (patmatch((Upat)progstr)) {
 	    /*
 	     * we were lazy and didn't save the globflags if an exclusion
@@ -2652,16 +2664,6 @@ patmatchlen(void)
 #define CHARMATCH_EXPR(expr, chpa) \
 	(charmatch_cache = (expr), CHARMATCH(charmatch_cache, chpa))
 
-/*
- * exactpos is used to remember how far down an exact string we have
- * matched, if we are doing approximation and can therefore redo from
- * the same point; we never need to otherwise.
- *
- * exactend is a pointer to the end of the string, which isn't
- * null-terminated.
- */
-static char *exactpos, *exactend;
-
 /*
  * Main matching routine.
  *


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: segfault via completion menu
  2019-05-24 22:34           ` Peter Stephenson
@ 2019-05-31 12:00             ` Wesley Schwengle
  2019-05-31 13:29               ` Peter Stephenson
  0 siblings, 1 reply; 9+ messages in thread
From: Wesley Schwengle @ 2019-05-31 12:00 UTC (permalink / raw)
  To: zsh-workers

Op za 25 mei 2019 om 00:35 schreef Peter Stephenson
<p.w.stephenson@ntlworld.com>:

> [snip]

Hello, I'm back from my month vacation and see there has been progress
on this issue. I believe the bug is fixed now by commit
4b85edface379a3575273a2b712d80bd9420d4c9.

Thanks for the help!

Cheers,
Wesley

-- 
Wesley Schwengle, Developer
Mintlab B.V., https://www.zaaksysteem.nl
E: wesley@mintlab.nl
T:  +31 20 737 00 05

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: segfault via completion menu
  2019-05-31 12:00             ` Wesley Schwengle
@ 2019-05-31 13:29               ` Peter Stephenson
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Stephenson @ 2019-05-31 13:29 UTC (permalink / raw)
  To: zsh-workers

On Fri, 2019-05-31 at 14:00 +0200, Wesley Schwengle wrote:
> Op za 25 mei 2019 om 00:35 schreef Peter Stephenson
> <p.w.stephenson@ntlworld.com>:
> 
> > 
> > [snip]
> Hello, I'm back from my month vacation and see there has been progress
> on this issue. I believe the bug is fixed now by commit
> 4b85edface379a3575273a2b712d80bd9420d4c9.
> 
> Thanks for the help!

OK, that's good --- it looked plausible it was somewhere there but I
didn't know for sure so I'm glad to hear.

Cheers
pws


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, back to index

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-09 14:30 segfault via completion menu Wesley Schwengle
2019-05-20 20:55 ` Oliver Kiddle
2019-05-21 21:58   ` Daniel Shahaf
2019-05-21 22:19     ` Bart Schaefer
2019-05-22  8:49       ` Peter Stephenson
2019-05-23 16:34         ` Peter Stephenson
2019-05-24 22:34           ` Peter Stephenson
2019-05-31 12:00             ` Wesley Schwengle
2019-05-31 13:29               ` Peter Stephenson

zsh-workers

Archives are clonable: git clone --mirror http://inbox.vuxu.org/zsh-workers

Newsgroup available over NNTP:
	nntp://inbox.vuxu.org/vuxu.archive.zsh.workers


AGPL code for this site: git clone https://public-inbox.org/ public-inbox