* PATCH: Crash bug on garbage input (previously reported to Debian) @ 2015-02-14 18:25 Bart Schaefer 2015-02-14 21:42 ` Peter Stephenson 0 siblings, 1 reply; 10+ messages in thread From: Bart Schaefer @ 2015-02-14 18:25 UTC (permalink / raw) To: zsh-workers Garbage input (nul bytes, etc.) can cause the newly-introduced $(...) parser to become confused during look-ahead and back up the input too far before attempting a different parse. The patch below simply detects the problem and turns it into a parse error with an appropriate warning. It might be helpful to figure out how the confusion originates but this prevents the crash. diff --git a/Src/input.c b/Src/input.c index 2ecac7b..9520fdd 100644 --- a/Src/input.c +++ b/Src/input.c @@ -393,12 +393,14 @@ inungetc(int c) if (((inbufflags & INP_LINENO) || !strin) && c == '\n') lineno--; } -#ifdef DEBUG else if (!(inbufflags & INP_CONT)) { +#ifdef DEBUG /* Just for debugging */ fprintf(stderr, "Attempt to inungetc() at start of input.\n"); - } #endif + zerr("Garbled input at %c (binary file as commands?)", c); + return; + } else { /* * The character is being backed up from a previous input stack diff --git a/Src/lex.c b/Src/lex.c index 433c27f..91628d4 100644 --- a/Src/lex.c +++ b/Src/lex.c @@ -503,13 +503,15 @@ cmd_or_math(int cs_type) /* else unsuccessful: unget the whole thing */ hungetc(c); lexstop = 0; - while (lexbuf.len > oldlen) { + while (lexbuf.len > oldlen && !errflag) { lexbuf.len--; hungetc(itok(*--lexbuf.ptr) ? ztokens[*lexbuf.ptr - Pound] : *lexbuf.ptr); } + if (errflag) + return 2; hungetc('('); - return 0; + return errflag ? 2 : 0; } ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-14 18:25 PATCH: Crash bug on garbage input (previously reported to Debian) Bart Schaefer @ 2015-02-14 21:42 ` Peter Stephenson 2015-02-15 19:26 ` Bart Schaefer 0 siblings, 1 reply; 10+ messages in thread From: Peter Stephenson @ 2015-02-14 21:42 UTC (permalink / raw) To: zsh-workers On Sat, 14 Feb 2015 10:25:34 -0800 Bart Schaefer <schaefer@brasslantern.com> wrote: > Garbage input (nul bytes, etc.) can cause the newly-introduced $(...) > parser to become confused during look-ahead and back up the input too > far before attempting a different parse. > > The patch below simply detects the problem and turns it into a parse > error with an appropriate warning. It might be helpful to figure out > how the confusion originates but this prevents the crash. Hmmm... backup characters are simply matched with input characters. Could it be something to do with multibyte? If it's just invalid characters, your fix is probably good enough in practice. If it's a problem with real multibyte characters we need to do more.e can (I suspect we can do better with the jungle of input and history character reading, which is a bit of a mess, though that's not really relevant to the problem since I don't think the mess is causing any problems in character counting.) pws ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-14 21:42 ` Peter Stephenson @ 2015-02-15 19:26 ` Bart Schaefer 2015-02-16 12:57 ` Peter Stephenson 0 siblings, 1 reply; 10+ messages in thread From: Bart Schaefer @ 2015-02-15 19:26 UTC (permalink / raw) To: zsh-workers On Feb 14, 9:42pm, Peter Stephenson wrote: } Subject: Re: PATCH: Crash bug on garbage input (previously reported to De } } On Sat, 14 Feb 2015 10:25:34 -0800 } Bart Schaefer <schaefer@brasslantern.com> wrote: } > Garbage input (nul bytes, etc.) can cause the newly-introduced $(...) } > parser to become confused during look-ahead and back up the input too } > far before attempting a different parse. } } Hmmm... backup characters are simply matched with input characters. } Could it be something to do with multibyte? I don't think it's multibyte, though it's possible. The sample input file that gives the error has a nested subtitution, in command position, with "${(s!...!" but no closing paren for the parameter flags, inside which is $( followed by 17 more open parens, one close paran, then a bunch of random stuff, a newline, and some more random stuff. The error occurs within cmd_or_math() when trying to ungetc the newline. (I think you were sent a copy of the file off-list.) One interesting tidbit: Even with my patch, if I "source" the test file from an interactive shell, all the garbage that comes AFTER the newline ends up on the history stack where I can pull it up with up-history. So setting errflag may be a bit too aggressive, something is failing to unwind its idea of the source of shell input. Here's a much simpler input string that triggers the same error; put this in a file and read with "source" so that the expression is ended by EOF and never has a chance to prompt for more input: ${($((()$[(s: this ends up in the history)}}) Each of ${( $(( and $[( seem to be crucial to tripping the problem, though I didn't try replacing $[ with $((. Oh, same issue right at the PS1 prompt with this: torch% ${($((()$[(s: braceparam mathsubst mathsubst> this ends up in the history)}})))} Attempt to inungetc() at start of input. zsh: Garbled input at \n (binary file as commands?) zsh: parse error near `)' If you remove the string "this ends up in the history" then nothing ends up in the history (but the error still occurs). Incidentally, the approach of calling the math parser and then falling back to subshell parsing if math fails causes some interesting double errors: torch% print $((echo ) echo ) zsh: parse error near `echo' zsh: parse error near `$((echo ) echo )' ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-15 19:26 ` Bart Schaefer @ 2015-02-16 12:57 ` Peter Stephenson 2015-02-16 17:04 ` Peter Stephenson 0 siblings, 1 reply; 10+ messages in thread From: Peter Stephenson @ 2015-02-16 12:57 UTC (permalink / raw) To: zsh-workers On Sun, 15 Feb 2015 11:26:22 -0800 Bart Schaefer <schaefer@brasslantern.com> wrote: > torch% ${($((()$[(s: > braceparam mathsubst mathsubst> this ends up in the history)}})))} > Attempt to inungetc() at start of input. > zsh: Garbled input at \n (binary file as commands?) > zsh: parse error near `)' I see basically what's happening. Here's the simplest form --- and note that actually this isn't garbled input at all. % print $((echo foo mathsubst> ); echo bar) Attempt to inungetc() at start of input. zsh: Garbled input at \n (binary file as commands?) zsh: parse error near `)' We try to parse this as arithmetic, but we find out it isn't --- and we only find this out in the continuation line. Normally, when reading continuation lines we just forget the old line from the input, so we can't back up over it. Here, however, we need to back up to be able to parse again as a command string. The code for the lexer assumes we can do this but down below in the raw input layer we can't. It looks like we do allow newlines as whitespace in arithmetic, so I think we do need to allow the back up. That implies we somehow need to detect we're doing the tentative looking for math mode and if so add the new line as a continuation of the old one --- that's allowed by the input mechanism and I don't see why that wouldn't work here (largely because I haven't tried it yet). So it "only" remains working out where to detect this. This will need a test, too. pws ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-16 12:57 ` Peter Stephenson @ 2015-02-16 17:04 ` Peter Stephenson 2015-02-16 19:39 ` Bart Schaefer 2015-02-17 9:02 ` Mikael Magnusson 0 siblings, 2 replies; 10+ messages in thread From: Peter Stephenson @ 2015-02-16 17:04 UTC (permalink / raw) To: zsh-workers On Mon, 16 Feb 2015 12:57:49 +0000 Peter Stephenson <p.stephenson@samsung.com> wrote: > % print $((echo foo > mathsubst> ); echo bar) > Attempt to inungetc() at start of input. > zsh: Garbled input at \n (binary file as commands?) > zsh: parse error near `)' > > ...we somehow need to > detect we're doing the tentative looking for math mode and if so add the > new line as a continuation of the old one --- that's allowed by the > input mechanism and I don't see why that wouldn't work here (largely > because I haven't tried it yet). So it "only" remains working out where > to detect this. > > This will need a test, too. The current continuation / push expansion behaviour doesn't quite do what we want since it assumes we've finished with previous input. Here's a simple fix for appending to the input buffer instead of replacing it for this case. It's logically general, however I didn't bother to optimise it since it's special to this particular very rare case. I've noted this in the code. Optimising it now would be premature --- it would just introduce lots of new and hard to test special cases. This version has the virtue of being virtually independent of anything else going on. I've add the "parse test from hell" where the first line looks like an arithmetic substitution but it actually turns out to be a case statement with unbalanced parentheses that needs the new parsing behaviour. The original error now gives something that looks a bit more rational: % ${($((()$[(s: braceparam mathsubst mathsubst> this ends up in the history)}}) zsh: parse error near `$[(s:' zsh: closing brace expected The newly added error is probably sensible for robustness, though, although it's also likely to be a sign that something else is broken. pws diff --git a/Src/input.c b/Src/input.c index 9520fdd..f919e57 100644 --- a/Src/input.c +++ b/Src/input.c @@ -330,8 +330,37 @@ inputline(void) } } isfirstch = 1; - /* Put this into the input channel. */ - inputsetline(ingetcline, INP_FREE); + if ((inbufflags & INP_APPEND) && inbuf) { + /* + * We need new input but need to be able to back up + * over the old input, so append this line. + * Pushing the line onto the stack doesn't have the right + * effect. + * + * This is quite a simple and inefficient fix, but currently + * we only need it when backing up over a multi-line $((... + * that turned out to be a command substitution rather than + * a math substitution, which is a very special case. + * So it's not worth rewriting. + */ + char *oinbuf = inbuf; + int newlen = strlen(ingetcline); + int oldlen = (int)(inbufptr - inbuf) + inbufleft; + if (inbufflags & INP_FREE) { + inbuf = realloc(inbuf, oldlen + newlen + 1); + inbufptr += inbuf - oinbuf; + strcpy(inbuf + oldlen, ingetcline); + } else { + /* Paranoia: don't think this is used */ + DPUTS(1, "Appending to unallocated input line."); + } + inbufleft += newlen; + inbufct += newlen; + inbufflags |= INP_FREE; + } else { + /* Put this into the input channel. */ + inputsetline(ingetcline, INP_FREE); + } return 0; } diff --git a/Src/lex.c b/Src/lex.c index 91628d4..0068485 100644 --- a/Src/lex.c +++ b/Src/lex.c @@ -483,9 +483,13 @@ cmd_or_math(int cs_type) { int oldlen = lexbuf.len; int c; + int oinflags = inbufflags; cmdpush(cs_type); + inbufflags |= INP_APPEND; c = dquote_parse(')', 0); + if (!(oinflags & INP_APPEND)) + inbufflags &= ~INP_APPEND; cmdpop(); *lexbuf.ptr = '\0'; if (!c) { diff --git a/Src/zsh.h b/Src/zsh.h index 94e9ffc..dd946d2 100644 --- a/Src/zsh.h +++ b/Src/zsh.h @@ -410,6 +410,7 @@ enum { #define INP_CONT (1<<3) /* continue onto previously stacked input */ #define INP_ALCONT (1<<4) /* stack is continued from alias expn. */ #define INP_LINENO (1<<5) /* update line number */ +#define INP_APPEND (1<<6) /* Append new lines to allow backup */ /* Flags for metafy */ #define META_REALLOC 0 diff --git a/Test/C01arith.ztst b/Test/C01arith.ztst index ea87af2..09c0822 100644 --- a/Test/C01arith.ztst +++ b/Test/C01arith.ztst @@ -318,3 +318,38 @@ # 0.75 is exactly representable, don't expect rounding error. >0 >0.75 + + # The following tests for a bug that only happens when + # backing up over input read a line at a time, so we'll + # read the input from stdin. + $ZTST_testdir/../Src/zsh -f <<<' + print $((echo first command + ); echo second command) + print third command + ' +0:Backing up a line of input when finding out it's not arithmetic +>first command second command +>third command + + $ZTST_testdir/../Src/zsh -f <<<' + print $((3 + + 4)) + print next line + ' +0:Not needing to back up a line when reading multiline arithmetic +>7 +>next line + + $ZTST_testdir/../Src/zsh -f <<<' + print $((case foo in + bar) + echo not this no, no + ;; + foo) + echo yes, this one + ;; + esac) + print after case in subshell) + ' +0:Non-arithmetic subst with command subsitution parse from hell +>yes, this one after case in subshell ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-16 17:04 ` Peter Stephenson @ 2015-02-16 19:39 ` Bart Schaefer 2015-02-17 12:57 ` Peter Stephenson 2015-02-17 9:02 ` Mikael Magnusson 1 sibling, 1 reply; 10+ messages in thread From: Bart Schaefer @ 2015-02-16 19:39 UTC (permalink / raw) To: zsh-workers On Feb 16, 5:04pm, Peter Stephenson wrote: } } I've add the "parse test from hell" where the first line looks like an } arithmetic substitution but it actually turns out to be a case statement } with unbalanced parentheses that needs the new parsing behaviour. There's an interesting bit of this which I don't think is actually a change in behavior but which might be worth noting. Interactive shell: torch% print $((case foo in mathsubst> If you send-break (ctrl+c) at this point, the input is entirely discarded and can't be recalled with up-line-or-history. Incidentally, it would probably be possible for the mathsubst parser to reject "case foo in" as impossible syntax a lot sooner, because if you simply close with "))" at that point you get zsh: bad math expression: operator expected at `foo in' but anyway: torch% print $((case foo in mathsubst> bar) cmdsubst subsh case> If you send-break at THIS point, the incomplete input ends up in the history and is recallable. Also, if you begin with "{" or "(" so that the PS2 level includes "subsh" or "cursh" then the input is recallable. Another tidbit: Older versions of the shell would only have cmdsubst> as the second PS2 prompt, so this is nice. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-16 19:39 ` Bart Schaefer @ 2015-02-17 12:57 ` Peter Stephenson 2015-02-17 17:13 ` Bart Schaefer 0 siblings, 1 reply; 10+ messages in thread From: Peter Stephenson @ 2015-02-17 12:57 UTC (permalink / raw) To: zsh-workers On Mon, 16 Feb 2015 11:39:23 -0800 Bart Schaefer <schaefer@brasslantern.com> wrote: > On Feb 16, 5:04pm, Peter Stephenson wrote: > } > } I've add the "parse test from hell" where the first line looks like an > } arithmetic substitution but it actually turns out to be a case statement > } with unbalanced parentheses that needs the new parsing behaviour. > > There's an interesting bit of this which I don't think is actually a > change in behavior but which might be worth noting. > > Interactive shell: > > torch% print $((case foo in > mathsubst> > > If you send-break (ctrl+c) at this point, the input is entirely discarded > and can't be recalled with up-line-or-history. I'm not getting this now. If I abort here, I can recall the line above, though anything I type on that second line gets discarded (which I think is standard). pws ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-17 12:57 ` Peter Stephenson @ 2015-02-17 17:13 ` Bart Schaefer 0 siblings, 0 replies; 10+ messages in thread From: Bart Schaefer @ 2015-02-17 17:13 UTC (permalink / raw) To: zsh-workers On Feb 17, 12:57pm, Peter Stephenson wrote: } Subject: Re: PATCH: Crash bug on garbage input (previously reported to De } } On Mon, 16 Feb 2015 11:39:23 -0800 } Bart Schaefer <schaefer@brasslantern.com> wrote: } > torch% print $((case foo in } > mathsubst> } > } > If you send-break (ctrl+c) at this point, the input is entirely discarded } > and can't be recalled with up-line-or-history. } } I'm not getting this now. If I abort here, I can recall the line above, } though anything I type on that second line gets discarded (which I think } is standard). Yep, confirmed. Apparently that last patch for the memory leak also took care of this, though I don't immediately see why. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-16 17:04 ` Peter Stephenson 2015-02-16 19:39 ` Bart Schaefer @ 2015-02-17 9:02 ` Mikael Magnusson 2015-02-17 9:39 ` Peter Stephenson 1 sibling, 1 reply; 10+ messages in thread From: Mikael Magnusson @ 2015-02-17 9:02 UTC (permalink / raw) To: Peter Stephenson; +Cc: zsh workers On Mon, Feb 16, 2015 at 6:04 PM, Peter Stephenson <p.stephenson@samsung.com> wrote: > Here's a simple fix for appending to the input buffer instead of > replacing it for this case. > > diff --git a/Src/input.c b/Src/input.c > index 9520fdd..f919e57 100644 > --- a/Src/input.c > +++ b/Src/input.c > @@ -330,8 +330,37 @@ inputline(void) > } > } > isfirstch = 1; > - /* Put this into the input channel. */ > - inputsetline(ingetcline, INP_FREE); > + if ((inbufflags & INP_APPEND) && inbuf) { > + /* > + * We need new input but need to be able to back up > + * over the old input, so append this line. > + * Pushing the line onto the stack doesn't have the right > + * effect. > + * > + * This is quite a simple and inefficient fix, but currently > + * we only need it when backing up over a multi-line $((... > + * that turned out to be a command substitution rather than > + * a math substitution, which is a very special case. > + * So it's not worth rewriting. > + */ > + char *oinbuf = inbuf; > + int newlen = strlen(ingetcline); > + int oldlen = (int)(inbufptr - inbuf) + inbufleft; > + if (inbufflags & INP_FREE) { > + inbuf = realloc(inbuf, oldlen + newlen + 1); > + inbufptr += inbuf - oinbuf; > + strcpy(inbuf + oldlen, ingetcline); Coverity complains that ingetcline is not freed in the above path. +free(ingetcline); here? > + } else { > + /* Paranoia: don't think this is used */ > + DPUTS(1, "Appending to unallocated input line."); > + } > + inbufleft += newlen; > + inbufct += newlen; > + inbufflags |= INP_FREE; > + } else { > + /* Put this into the input channel. */ > + inputsetline(ingetcline, INP_FREE); > + } > > return 0; > } -- Mikael Magnusson ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PATCH: Crash bug on garbage input (previously reported to Debian) 2015-02-17 9:02 ` Mikael Magnusson @ 2015-02-17 9:39 ` Peter Stephenson 0 siblings, 0 replies; 10+ messages in thread From: Peter Stephenson @ 2015-02-17 9:39 UTC (permalink / raw) To: zsh workers On Tue, 17 Feb 2015 10:02:14 +0100 Mikael Magnusson <mikachu@gmail.com> wrote: > Coverity complains that ingetcline is not freed in the above path. > +free(ingetcline); here? Yes, that's missing, and we might as well handle the other branch now it's working --- I didn't bother when fixing the bug but it's not exactly difficult to do. pws diff --git a/Src/input.c b/Src/input.c index f919e57..92b1ad1 100644 --- a/Src/input.c +++ b/Src/input.c @@ -348,12 +348,13 @@ inputline(void) int oldlen = (int)(inbufptr - inbuf) + inbufleft; if (inbufflags & INP_FREE) { inbuf = realloc(inbuf, oldlen + newlen + 1); - inbufptr += inbuf - oinbuf; - strcpy(inbuf + oldlen, ingetcline); } else { - /* Paranoia: don't think this is used */ - DPUTS(1, "Appending to unallocated input line."); + inbuf = zalloc(oldlen + newlen + 1); + memcpy(inbuf, oinbuf, oldlen); } + inbufptr += inbuf - oinbuf; + strcpy(inbuf + oldlen, ingetcline); + free(ingetcline); inbufleft += newlen; inbufct += newlen; inbufflags |= INP_FREE; ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2015-02-17 17:13 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-02-14 18:25 PATCH: Crash bug on garbage input (previously reported to Debian) Bart Schaefer 2015-02-14 21:42 ` Peter Stephenson 2015-02-15 19:26 ` Bart Schaefer 2015-02-16 12:57 ` Peter Stephenson 2015-02-16 17:04 ` Peter Stephenson 2015-02-16 19:39 ` Bart Schaefer 2015-02-17 12:57 ` Peter Stephenson 2015-02-17 17:13 ` Bart Schaefer 2015-02-17 9:02 ` Mikael Magnusson 2015-02-17 9:39 ` Peter Stephenson
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).