From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from math.gatech.edu (euclid.skiles.gatech.edu [130.207.146.50])
    by werple.net.au (8.7/8.7.1) with SMTP id FAA12711
    for ; Tue, 7 Nov 1995 05:33:23 +1100 (EST)
Received: by math.gatech.edu (5.x/SMI-SVR4) id AA07571;
    Mon, 6 Nov 1995 13:20:20 -0500
Resent-Date: Mon, 6 Nov 1995 18:25:09 +0100 (MET)
Old-Return-Path:
From: Zoltan Hidvegi
Message-Id: <199511061725.SAA08785@bolyai.cs.elte.hu>
Subject: Re: Expansion/quoting quirks
To: kaefer@aglaia.snafu.de (Thorsten Meinecke)
Date: Mon, 6 Nov 1995 18:25:09 +0100 (MET)
In-Reply-To: from "Thorsten Meinecke" at Nov 5, 95 02:00:27 pm
X-Mailer: ELM [version 2.4 PL24]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: hzoli@cs.elte.hu
Resent-Message-Id: <"M5HPm1.0.Ds1.Z7bdm"@euclid>
Resent-From: zsh-workers@math.gatech.edu
X-Mailing-List: archive/latest/537
X-Loop: zsh-workers@math.gatech.edu
Precedence: list
Resent-Sender: zsh-workers-request@math.gatech.edu

Thorsten Meinecke wrote:
> echo `echo \\\\` # broken in hzoli, and in vanilla zsh if invoked as (k)sh

Yes, that's really broken.  However, echo is not good for testing here, since
shells differ in how they interpret escape sequences in echo.  It's better to
alias echo to 'print -r --'.  With zsh it is enough to set the bsdecho option.

This bug appeared with the input patches from Peter.  The following happens:
the \\\\ within `...` is parsed as Bnull\Bnull\.  The lexer is called again
with that, and it thinks that the first \ quotes the Bnull.  The last \ then
causes a parse error.

Below is a patch to input.c to drop tokens from the input.  Such returned
tokens caused some other bugs earlier, and they can be dangerous when a
script contains token characters.

> echo "$(echo \\\\)" # sh and ksh seem to differ here (bash would give `\\')

sh should give two backslashes.  The difference is probably in the escape
handling of sh.

> nargs ${undef-"a b"} # vanilla + hzoli: shouldn't split here

That's difficult.
sh_word_split splits the result of a parameter expansion.  Here the result
is 'a b', which is split into 'a' and 'b'.

> #% argc=3, argv=( 'a b' '' 'c' )
> nargs ${undef-"$@"} # hzoli: 'a b' shouldn't split into 'a' 'b'

Same as the previous example.

> #% argc=3, argv=( 'a b' '' 'c' )
> nargs "${undef-"$@"}" # hzoli: zsh: closing brace expected

That's because the second " closes the first.  It would be easy to fix.  My
problem is that I do not know what the standard behaviour is here.  My library
does not have the relevant POSIX papers.  It would be important to know how to
parse these things.  It seems that the lexer should be called on the body of
${...-...}.  I'll try to fix these if someone tells me what the standards say
here.  I have ksh93.  May I assume that ksh93 behaviour is the standard?

The most difficult part here is ${...##...}.  Here the body should be
interpreted as a pattern.  Here the expanded body should be parsed again for
quotes.  E.g.

    foo='te\s\t' bar='\s\t' ; echo ${foo%%$bar}

does not remove the tail of foo, since \ only escapes the s and t.  But

    foo='te"st"' bar='"??"' ; echo ${foo%$bar}

does remove the tail.

Bye,
  Zoltan

diff -c Src/input.c~ Src/input.c
*** Src/input.c~ Sat Nov 4 09:47:43 1995
--- Src/input.c Mon Nov 6 17:50:17 1995
***************
*** 109,115 ****
      if (inbufleft) {
          inbufleft--;
          inbufct--;
!         return lastc = (unsigned)*inbufptr++;
      }
      /*
       * No characters in input buffer.
--- 109,118 ----
      if (inbufleft) {
          inbufleft--;
          inbufct--;
!         lastc = (unsigned)*inbufptr++;
!         if (itok(lastc))
!             continue;
!         return lastc;
      }
      /*
       * No characters in input buffer.
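
For anyone who wants to repeat the quoted nargs tests: the nargs helper itself is not shown in this message, so the function below is only a guess at something equivalent (the name is Thorsten's, the body and the output format are assumptions). It prints the argument count and each argument in angle brackets, so that empty and space-containing words stay visible. It avoids echo for the reason given above. The results of the ${undef-...} lines are deliberately not listed, since they are exactly what differs between shells.

```shell
# Hypothetical stand-in for the nargs helper used in the quoted tests.
# Prints the number of arguments, then each argument as <arg>.
nargs() {
    printf 'argc=%d' "$#"
    for arg in "$@"; do
        printf ' <%s>' "$arg"
    done
    printf '\n'
}

# Positional parameters matching the "#% argc=3" lines in the quoted tests.
set -- 'a b' '' 'c'

nargs "$@"              # baseline: three words, the second one empty
nargs ${undef-"a b"}    # does the quoted default get split?
nargs ${undef-"$@"}     # does 'a b' survive as one word?
```

Running the same file under several shells (and under zsh with and without sh_word_split set) shows which of them split the default value.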