zsh-workers
 help / color / mirror / code / Atom feed
* PATCH:  Crash bug on garbage input (previously reported to Debian)
@ 2015-02-14 18:25 Bart Schaefer
  2015-02-14 21:42 ` Peter Stephenson
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Schaefer @ 2015-02-14 18:25 UTC (permalink / raw)
  To: zsh-workers

Garbage input (nul bytes, etc.) can cause the newly-introduced $(...)
parser to become confused during look-ahead and back up the input too
far before attempting a different parse.

The patch below simply detects the problem and turns it into a parse
error with an appropriate warning.  It might be helpful to figure out
how the confusion originates but this prevents the crash.

diff --git a/Src/input.c b/Src/input.c
index 2ecac7b..9520fdd 100644
--- a/Src/input.c
+++ b/Src/input.c
@@ -393,12 +393,14 @@ inungetc(int c)
 	    if (((inbufflags & INP_LINENO) || !strin) && c == '\n')
 		lineno--;
 	}
-#ifdef DEBUG
         else if (!(inbufflags & INP_CONT)) {
+#ifdef DEBUG
 	    /* Just for debugging */
 	    fprintf(stderr, "Attempt to inungetc() at start of input.\n");
-	}
 #endif
+	    zerr("Garbled input at %c (binary file as commands?)", c);
+	    return;
+	}
 	else {
 	    /*
 	     * The character is being backed up from a previous input stack
diff --git a/Src/lex.c b/Src/lex.c
index 433c27f..91628d4 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -503,13 +503,15 @@ cmd_or_math(int cs_type)
     /* else unsuccessful: unget the whole thing */
     hungetc(c);
     lexstop = 0;
-    while (lexbuf.len > oldlen) {
+    while (lexbuf.len > oldlen && !errflag) {
 	lexbuf.len--;
 	hungetc(itok(*--lexbuf.ptr) ?
 		ztokens[*lexbuf.ptr - Pound] : *lexbuf.ptr);
     }
+    if (errflag)
+	return 2;
     hungetc('(');
-    return 0;
+    return errflag ? 2 : 0;
 }
 
 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH:  Crash bug on garbage input (previously reported to Debian)
  2015-02-14 18:25 PATCH: Crash bug on garbage input (previously reported to Debian) Bart Schaefer
@ 2015-02-14 21:42 ` Peter Stephenson
  2015-02-15 19:26   ` Bart Schaefer
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Stephenson @ 2015-02-14 21:42 UTC (permalink / raw)
  To: zsh-workers

On Sat, 14 Feb 2015 10:25:34 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> Garbage input (nul bytes, etc.) can cause the newly-introduced $(...)
> parser to become confused during look-ahead and back up the input too
> far before attempting a different parse.
> 
> The patch below simply detects the problem and turns it into a parse
> error with an appropriate warning.  It might be helpful to figure out
> how the confusion originates but this prevents the crash.

Hmmm... backup characters are simply matched with input characters.
Could it be something to do with multibyte?  If it's just invalid
characters, your fix is probably good enough in practice.  If it's a
problem with real multibyte characters we need to do more.e can

(I suspect we can do better with the jungle of input and history
character reading, which is a bit of a mess, though that's not really
relevant to the problem since I don't think the mess is causing any
problems in character counting.)

pws


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH:  Crash bug on garbage input (previously reported to Debian)
  2015-02-14 21:42 ` Peter Stephenson
@ 2015-02-15 19:26   ` Bart Schaefer
  2015-02-16 12:57     ` Peter Stephenson
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Schaefer @ 2015-02-15 19:26 UTC (permalink / raw)
  To: zsh-workers

On Feb 14,  9:42pm, Peter Stephenson wrote:
} Subject: Re: PATCH:  Crash bug on garbage input (previously reported to De
}
} On Sat, 14 Feb 2015 10:25:34 -0800
} Bart Schaefer <schaefer@brasslantern.com> wrote:
} > Garbage input (nul bytes, etc.) can cause the newly-introduced $(...)
} > parser to become confused during look-ahead and back up the input too
} > far before attempting a different parse.
} 
} Hmmm... backup characters are simply matched with input characters.
} Could it be something to do with multibyte?

I don't think it's multibyte, though it's possible.  The sample input file
that gives the error has a nested subtitution, in command position, with
"${(s!...!" but no closing paren for the parameter flags, inside which
is $( followed by 17 more open parens, one close paran, then a bunch of
random stuff, a newline, and some more random stuff.  The error occurs
within cmd_or_math() when trying to ungetc the newline.  (I think you
were sent a copy of the file off-list.)

One interesting tidbit:  Even with my patch, if I "source" the test file
from an interactive shell, all the garbage that comes AFTER the newline
ends up on the history stack where I can pull it up with up-history.  So
setting errflag may be a bit too aggressive, something is failing to
unwind its idea of the source of shell input.

Here's a much simpler input string that triggers the same error; put this
in a file and read with "source" so that the expression is ended by EOF
and never has a chance to prompt for more input:

${($((()$[(s:
this ends up in the history)}})

Each of ${( $(( and $[( seem to be crucial to tripping the problem, though
I didn't try replacing $[ with $((.

Oh, same issue right at the PS1 prompt with this:

torch% ${($((()$[(s:    
braceparam mathsubst mathsubst> this ends up in the history)}})))}
Attempt to inungetc() at start of input.
zsh: Garbled input at \n (binary file as commands?)
zsh: parse error near `)'                                                       

If you remove the string "this ends up in the history" then nothing ends up
in the history (but the error still occurs).

Incidentally, the approach of calling the math parser and then falling back
to subshell parsing if math fails causes some interesting double errors:

torch% print $((echo ) echo )
zsh: parse error near `echo'
zsh: parse error near `$((echo ) echo )'


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH:  Crash bug on garbage input (previously reported to Debian)
  2015-02-15 19:26   ` Bart Schaefer
@ 2015-02-16 12:57     ` Peter Stephenson
  2015-02-16 17:04       ` Peter Stephenson
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Stephenson @ 2015-02-16 12:57 UTC (permalink / raw)
  To: zsh-workers

On Sun, 15 Feb 2015 11:26:22 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> torch% ${($((()$[(s:    
> braceparam mathsubst mathsubst> this ends up in the history)}})))}
> Attempt to inungetc() at start of input.
> zsh: Garbled input at \n (binary file as commands?)
> zsh: parse error near `)'

I see basically what's happening.  Here's the simplest form --- and note
that actually this isn't garbled input at all.

% print $((echo foo
mathsubst> ); echo bar)
Attempt to inungetc() at start of input.
zsh: Garbled input at \n (binary file as commands?)
zsh: parse error near `)'

We try to parse this as arithmetic, but we find out it isn't --- and we
only find this out in the continuation line.  Normally, when reading
continuation lines we just forget the old line from the input, so we
can't back up over it.  Here, however, we need to back up to be able to
parse again as a command string.  The code for the lexer assumes we can
do this but down below in the raw input layer we can't.

It looks like we do allow newlines as whitespace in arithmetic, so I
think we do need to allow the back up.  That implies we somehow need to
detect we're doing the tentative looking for math mode and if so add the
new line as a continuation of the old one --- that's allowed by the
input mechanism and I don't see why that wouldn't work here (largely
because I haven't tried it yet).  So it "only" remains working out where
to detect this.

This will need a test, too.

pws


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH:  Crash bug on garbage input (previously reported to Debian)
  2015-02-16 12:57     ` Peter Stephenson
@ 2015-02-16 17:04       ` Peter Stephenson
  2015-02-16 19:39         ` Bart Schaefer
  2015-02-17  9:02         ` Mikael Magnusson
  0 siblings, 2 replies; 10+ messages in thread
From: Peter Stephenson @ 2015-02-16 17:04 UTC (permalink / raw)
  To: zsh-workers

On Mon, 16 Feb 2015 12:57:49 +0000
Peter Stephenson <p.stephenson@samsung.com> wrote:
> % print $((echo foo
> mathsubst> ); echo bar)
> Attempt to inungetc() at start of input.
> zsh: Garbled input at \n (binary file as commands?)
> zsh: parse error near `)'
>
> ...we somehow need to
> detect we're doing the tentative looking for math mode and if so add the
> new line as a continuation of the old one --- that's allowed by the
> input mechanism and I don't see why that wouldn't work here (largely
> because I haven't tried it yet).  So it "only" remains working out where
> to detect this.
> 
> This will need a test, too.

The current continuation / push expansion behaviour doesn't quite do
what we want since it assumes we've finished with previous input.

Here's a simple fix for appending to the input buffer instead of
replacing it for this case.  It's logically general, however I didn't
bother to optimise it since it's special to this particular very rare
case.  I've noted this in the code.  Optimising it now would be
premature --- it would just introduce lots of new and hard to test
special cases.  This version has the virtue of being virtually
independent of anything else going on.

I've add the "parse test from hell" where the first line looks like an
arithmetic substitution but it actually turns out to be a case statement
with unbalanced parentheses that needs the new parsing behaviour.

The original error now gives something that looks a bit more rational:

% ${($((()$[(s:
braceparam mathsubst mathsubst> this ends up in the history)}})
zsh: parse error near `$[(s:'
zsh: closing brace expected

The newly added error is probably sensible for robustness, though,
although it's also likely to be a sign that something else is broken.

pws

diff --git a/Src/input.c b/Src/input.c
index 9520fdd..f919e57 100644
--- a/Src/input.c
+++ b/Src/input.c
@@ -330,8 +330,37 @@ inputline(void)
 	}
     }
     isfirstch = 1;
-    /* Put this into the input channel. */
-    inputsetline(ingetcline, INP_FREE);
+    if ((inbufflags & INP_APPEND) && inbuf) {
+	/*
+	 * We need new input but need to be able to back up
+	 * over the old input, so append this line.
+	 * Pushing the line onto the stack doesn't have the right
+	 * effect.
+	 *
+	 * This is quite a simple and inefficient fix, but currently
+	 * we only need it when backing up over a multi-line $((...
+	 * that turned out to be a command substitution rather than
+	 * a math substitution, which is a very special case.
+	 * So it's not worth rewriting.
+	 */
+	char *oinbuf = inbuf;
+	int newlen = strlen(ingetcline);
+	int oldlen = (int)(inbufptr - inbuf) + inbufleft;
+	if (inbufflags & INP_FREE) {
+	    inbuf = realloc(inbuf, oldlen + newlen + 1);
+	    inbufptr += inbuf - oinbuf;
+	    strcpy(inbuf + oldlen, ingetcline);
+	} else {
+	    /* Paranoia: don't think this is used */
+	    DPUTS(1, "Appending to unallocated input line.");
+	}
+	inbufleft += newlen;
+	inbufct += newlen;
+	inbufflags |= INP_FREE;
+    } else {
+	/* Put this into the input channel. */
+	inputsetline(ingetcline, INP_FREE);
+    }
 
     return 0;
 }
diff --git a/Src/lex.c b/Src/lex.c
index 91628d4..0068485 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -483,9 +483,13 @@ cmd_or_math(int cs_type)
 {
     int oldlen = lexbuf.len;
     int c;
+    int oinflags = inbufflags;
 
     cmdpush(cs_type);
+    inbufflags |= INP_APPEND;
     c = dquote_parse(')', 0);
+    if (!(oinflags & INP_APPEND))
+	inbufflags &= ~INP_APPEND;
     cmdpop();
     *lexbuf.ptr = '\0';
     if (!c) {
diff --git a/Src/zsh.h b/Src/zsh.h
index 94e9ffc..dd946d2 100644
--- a/Src/zsh.h
+++ b/Src/zsh.h
@@ -410,6 +410,7 @@ enum {
 #define INP_CONT      (1<<3)	/* continue onto previously stacked input  */
 #define INP_ALCONT    (1<<4)	/* stack is continued from alias expn.     */
 #define INP_LINENO    (1<<5)    /* update line number                      */
+#define INP_APPEND    (1<<6)    /* Append new lines to allow backup        */
 
 /* Flags for metafy */
 #define META_REALLOC	0
diff --git a/Test/C01arith.ztst b/Test/C01arith.ztst
index ea87af2..09c0822 100644
--- a/Test/C01arith.ztst
+++ b/Test/C01arith.ztst
@@ -318,3 +318,38 @@
 # 0.75 is exactly representable, don't expect rounding error.
 >0
 >0.75
+
+  # The following tests for a bug that only happens when
+  # backing up over input read a line at a time, so we'll
+  # read the input from stdin.
+  $ZTST_testdir/../Src/zsh -f <<<'
+  print $((echo first command
+  ); echo second command)
+  print third command
+  '
+0:Backing up a line of input when finding out it's not arithmetic
+>first command second command
+>third command
+
+  $ZTST_testdir/../Src/zsh -f <<<'
+  print $((3 +
+  4))
+  print next line
+  '
+0:Not needing to back up a line when reading multiline arithmetic
+>7
+>next line
+
+  $ZTST_testdir/../Src/zsh -f <<<'
+  print $((case foo in
+  bar)
+  echo not this no, no
+  ;;
+  foo)
+  echo yes, this one
+  ;;
+  esac)
+  print after case in subshell)
+  '
+0:Non-arithmetic subst with command subsitution parse from hell
+>yes, this one after case in subshell


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH:  Crash bug on garbage input (previously reported to Debian)
  2015-02-16 17:04       ` Peter Stephenson
@ 2015-02-16 19:39         ` Bart Schaefer
  2015-02-17 12:57           ` Peter Stephenson
  2015-02-17  9:02         ` Mikael Magnusson
  1 sibling, 1 reply; 10+ messages in thread
From: Bart Schaefer @ 2015-02-16 19:39 UTC (permalink / raw)
  To: zsh-workers

On Feb 16,  5:04pm, Peter Stephenson wrote:
}
} I've add the "parse test from hell" where the first line looks like an
} arithmetic substitution but it actually turns out to be a case statement
} with unbalanced parentheses that needs the new parsing behaviour.

There's an interesting bit of this which I don't think is actually a
change in behavior but which might be worth noting.

Interactive shell:

torch% print $((case foo in
mathsubst> 

If you send-break (ctrl+c) at this point, the input is entirely discarded
and can't be recalled with up-line-or-history.  Incidentally, it would
probably be possible for the mathsubst parser to reject "case foo in"
as impossible syntax a lot sooner, because if you simply close with "))"
at that point you get

zsh: bad math expression: operator expected at `foo in'

but anyway:

torch% print $((case foo in
mathsubst> bar)
cmdsubst subsh case> 

If you send-break at THIS point, the incomplete input ends up in the
history and is recallable.  Also, if you begin with "{" or "(" so that
the PS2 level includes "subsh" or "cursh" then the input is recallable. 

Another tidbit:  Older versions of the shell would only have cmdsubst>
as the second PS2 prompt, so this is nice.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH: Crash bug on garbage input (previously reported to Debian)
  2015-02-16 17:04       ` Peter Stephenson
  2015-02-16 19:39         ` Bart Schaefer
@ 2015-02-17  9:02         ` Mikael Magnusson
  2015-02-17  9:39           ` Peter Stephenson
  1 sibling, 1 reply; 10+ messages in thread
From: Mikael Magnusson @ 2015-02-17  9:02 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh workers

On Mon, Feb 16, 2015 at 6:04 PM, Peter Stephenson
<p.stephenson@samsung.com> wrote:
> Here's a simple fix for appending to the input buffer instead of
> replacing it for this case.
>
> diff --git a/Src/input.c b/Src/input.c
> index 9520fdd..f919e57 100644
> --- a/Src/input.c
> +++ b/Src/input.c
> @@ -330,8 +330,37 @@ inputline(void)
>         }
>      }
>      isfirstch = 1;
> -    /* Put this into the input channel. */
> -    inputsetline(ingetcline, INP_FREE);
> +    if ((inbufflags & INP_APPEND) && inbuf) {
> +       /*
> +        * We need new input but need to be able to back up
> +        * over the old input, so append this line.
> +        * Pushing the line onto the stack doesn't have the right
> +        * effect.
> +        *
> +        * This is quite a simple and inefficient fix, but currently
> +        * we only need it when backing up over a multi-line $((...
> +        * that turned out to be a command substitution rather than
> +        * a math substitution, which is a very special case.
> +        * So it's not worth rewriting.
> +        */
> +       char *oinbuf = inbuf;
> +       int newlen = strlen(ingetcline);
> +       int oldlen = (int)(inbufptr - inbuf) + inbufleft;
> +       if (inbufflags & INP_FREE) {
> +           inbuf = realloc(inbuf, oldlen + newlen + 1);
> +           inbufptr += inbuf - oinbuf;
> +           strcpy(inbuf + oldlen, ingetcline);

Coverity complains that ingetcline is not freed in the above path.
+free(ingetcline); here?

> +       } else {
> +           /* Paranoia: don't think this is used */
> +           DPUTS(1, "Appending to unallocated input line.");
> +       }
> +       inbufleft += newlen;
> +       inbufct += newlen;
> +       inbufflags |= INP_FREE;
> +    } else {
> +       /* Put this into the input channel. */
> +       inputsetline(ingetcline, INP_FREE);
> +    }
>
>      return 0;
>  }


-- 
Mikael Magnusson


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH: Crash bug on garbage input (previously reported to Debian)
  2015-02-17  9:02         ` Mikael Magnusson
@ 2015-02-17  9:39           ` Peter Stephenson
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Stephenson @ 2015-02-17  9:39 UTC (permalink / raw)
  To: zsh workers

On Tue, 17 Feb 2015 10:02:14 +0100
Mikael Magnusson <mikachu@gmail.com> wrote:
> Coverity complains that ingetcline is not freed in the above path.
> +free(ingetcline); here?

Yes, that's missing, and we might as well handle the other branch now
it's working --- I didn't bother when fixing the bug but it's not
exactly difficult to do.

pws

diff --git a/Src/input.c b/Src/input.c
index f919e57..92b1ad1 100644
--- a/Src/input.c
+++ b/Src/input.c
@@ -348,12 +348,13 @@ inputline(void)
 	int oldlen = (int)(inbufptr - inbuf) + inbufleft;
 	if (inbufflags & INP_FREE) {
 	    inbuf = realloc(inbuf, oldlen + newlen + 1);
-	    inbufptr += inbuf - oinbuf;
-	    strcpy(inbuf + oldlen, ingetcline);
 	} else {
-	    /* Paranoia: don't think this is used */
-	    DPUTS(1, "Appending to unallocated input line.");
+	    inbuf = zalloc(oldlen + newlen + 1);
+	    memcpy(inbuf, oinbuf, oldlen);
 	}
+	inbufptr += inbuf - oinbuf;
+	strcpy(inbuf + oldlen, ingetcline);
+	free(ingetcline);
 	inbufleft += newlen;
 	inbufct += newlen;
 	inbufflags |= INP_FREE;


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH:  Crash bug on garbage input (previously reported to Debian)
  2015-02-16 19:39         ` Bart Schaefer
@ 2015-02-17 12:57           ` Peter Stephenson
  2015-02-17 17:13             ` Bart Schaefer
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Stephenson @ 2015-02-17 12:57 UTC (permalink / raw)
  To: zsh-workers

On Mon, 16 Feb 2015 11:39:23 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Feb 16,  5:04pm, Peter Stephenson wrote:
> }
> } I've add the "parse test from hell" where the first line looks like an
> } arithmetic substitution but it actually turns out to be a case statement
> } with unbalanced parentheses that needs the new parsing behaviour.
>
> There's an interesting bit of this which I don't think is actually a
> change in behavior but which might be worth noting.
>
> Interactive shell:
>
> torch% print $((case foo in
> mathsubst>
>
> If you send-break (ctrl+c) at this point, the input is entirely discarded
> and can't be recalled with up-line-or-history.

I'm not getting this now.  If I abort here, I can recall the line above,
though anything I type on that second line gets discarded (which I think
is standard).

pws


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: PATCH:  Crash bug on garbage input (previously reported to Debian)
  2015-02-17 12:57           ` Peter Stephenson
@ 2015-02-17 17:13             ` Bart Schaefer
  0 siblings, 0 replies; 10+ messages in thread
From: Bart Schaefer @ 2015-02-17 17:13 UTC (permalink / raw)
  To: zsh-workers

On Feb 17, 12:57pm, Peter Stephenson wrote:
} Subject: Re: PATCH:  Crash bug on garbage input (previously reported to De
}
} On Mon, 16 Feb 2015 11:39:23 -0800
} Bart Schaefer <schaefer@brasslantern.com> wrote:
} > torch% print $((case foo in
} > mathsubst>
} >
} > If you send-break (ctrl+c) at this point, the input is entirely discarded
} > and can't be recalled with up-line-or-history.
} 
} I'm not getting this now.  If I abort here, I can recall the line above,
} though anything I type on that second line gets discarded (which I think
} is standard).

Yep, confirmed.  Apparently that last patch for the memory leak also took
care of this, though I don't immediately see why.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-02-17 17:13 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-14 18:25 PATCH: Crash bug on garbage input (previously reported to Debian) Bart Schaefer
2015-02-14 21:42 ` Peter Stephenson
2015-02-15 19:26   ` Bart Schaefer
2015-02-16 12:57     ` Peter Stephenson
2015-02-16 17:04       ` Peter Stephenson
2015-02-16 19:39         ` Bart Schaefer
2015-02-17 12:57           ` Peter Stephenson
2015-02-17 17:13             ` Bart Schaefer
2015-02-17  9:02         ` Mikael Magnusson
2015-02-17  9:39           ` Peter Stephenson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).