From: Bart Schaefer
Message-Id: <150307131008.ZM17180@torch.brasslantern.com>
Date: Sat, 7 Mar 2015 13:10:08 -0800
In-Reply-To: <20150307155252.75848f74@ntlworld.com>
To: zsh-workers@zsh.org
Subject: Re: Aliasing separators (Re: grammar triviality with '&&')
List-Id: Zsh Workers List
X-Seq: 34681

On Mar 7, 3:52pm, Peter Stephenson wrote:
} Subject: Re: Aliasing separators (Re: grammar triviality with '&&')
}
} On Fri, 6 Mar 2015 11:26:28 -0800
} Bart Schaefer wrote:
} >
} > In practice that means something like "only tokens that can be changed
} > by concatenating with another string using simple lexical pasting, may
} > be aliased." But that isn't a very satisfying way to say it (and it's
} > not 100% true because of "{" being a reserved word).
}
} Sure, but I'm not sure that's particularly useful for users. The rule
} looks something like "you can't alias it if it's one of them things
} where you don't need to put a space after it", or something like that.

Actually, that helped me to put some sort of explanation around my
intuition of how aliasing should work:

  A token may be expanded as an alias only if doing so cannot change
  the lexical interpretation of any tokens that may appear adjacent
  to it.

This emphasizes that "{this" expanding to "{ foothis" in the example at
the tail of my previous message, is a bug: The interpretation of "this"
has been changed, and the internal inconsistency is manifest when you
examine what has been stored in the history.

Also note "cannot" rather than "does not," which emphasizes why you can't
make an alias for "&&" even if you always write "x && y" rather than
"x&&y", and why you can't alias newline.
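The pasting property behind this rule can be seen with behaviour common to POSIX-style shells; nothing below is zsh-specific. It is only an illustration: the name "echook" is simply "echo" pasted onto "ok", and the sketch assumes no command of that name is installed.

```sh
#!/bin/sh
# "&&" is an operator token: the lexer recognizes it with or without
# surrounding whitespace, so both command lines below parse the same way.
joined=$(true&&echo ok)
spaced=$(true && echo ok)

# An ordinary word is different: pasting "ok" onto "echo" yields the new
# word "echook", which is looked up as its own command name.
# Assumption: no command called "echook" is installed.
glued=$(command -v echook 2>/dev/null || echo "not a command")

printf '%s\n' "joined=$joined" "spaced=$spaced" "glued=$glued"
```

The operator is recognized regardless of spacing, so its neighbours can always be written directly against it; that is the case the word "cannot" in the rule is meant to exclude. A plain word only matches when it stands alone.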

On the other hand, if aliasing were an entirely different stage rather
than occurring during regular lexing -- I think csh worked that way --
then I might be less concerned about this. That is, for csh I believe
it went something like:

1. Read in the line
2. Break the line into words using whitespace and quoting **
3. Check each word for alias expansion
4. Apply lexical analysis to the result

2 + 3 are why csh aliases are allowed to make \!:N history references to
the words in the line while being expanded. Since zsh does 1 and 4
simultaneously, the rules at 3 have to be different, but the basic
intention of applying the expansion to "words" was never meant to go
away as a result.

** Also, csh broke the line up into commands at ';' '&&' etc. before
applying aliases, so \!:N references don't cross command boundaries.

} > The other minor point is that this slows down lexical analysis a lot.
} > Many more things are going through checkalias(), including in some
} > cases (as you pointed out) every individual character.
}
} That's not necessarily that minor, actually; do you have numbers?

I "repaired" POSIXALIASES like so ...

diff --git a/Src/lex.c b/Src/lex.c
index 494ea88..33c5288 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -1739,7 +1739,7 @@ checkalias(void)
         return 0;
 
     if (!noaliases && isset(ALIASESOPT) &&
-        (!isset(POSIXALIASES) ||
+        (!isset(POSIXALIASES) || tokstr &&
          !reswdtab->getnode(reswdtab, zshlextext))) {
         char *suf;
 

... and then ran this:

    repeat 10
    do
      time Src/zsh -ic 'autoload +X -m \*'
      time Src/zsh -o posixaliases -ic 'autoload +X -m \*'
    done

The "autoload +X" has the effect of loading the entire completion suite,
which was the largest convenient bolus of lexing/parsing work I could
think to throw at it.

The difference with 30 aliases defined (none global) was insignificant,
so looking up reserved words is not really a factor.

I then backed out the lex.c changes and compared the old processing to
the new. There still wasn't much difference.
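
For a comparison of this kind, a small wrapper that totals the wall-clock time over N runs can be easier to read than twenty separate "time" reports. The following is a rough sketch in plain sh, not what was actually run: "time_runs" is a made-up helper name, and it assumes a date(1) that supports +%s, so the resolution is whole seconds only.

```sh
#!/bin/sh
# time_runs N CMD [ARGS...]
# Run CMD N times, discarding its output, and print the total elapsed
# wall-clock time in whole seconds.
# Assumption: date(1) understands +%s (seconds since the epoch).
time_runs() {
    n=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1    # exit status is ignored on purpose
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}
```

With the binaries from the message this would be invoked as time_runs 10 Src/zsh -ic 'autoload +X -m \*' and again with -o posixaliases added, comparing the two totals. Differences below a second per batch will not show up at this resolution.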

However, this is with an unstripped binary compiled for debugging, so
it's possible a larger difference would show up if optimization were
enabled.

} > The lexer can certainly be (and was, before, though it was not
} > explicitly stated) smart enough to know that any token that arrives
} > at that point with tokstr == NULL is not in fact something that
} > could be a command and therefore shouldn't be treated as one.
}
} Not sure what this means. "(" in command position is a token but is
} effectively a command

Maybe I could express it this way: A command is something such that if
you prefix it with a precommand modifier, it's still a command.

The old aliasing code effectively differentiated those kinds of somethings
from other arbitrary tokens that might appear in what the lexer calls
command position. It didn't do so in an obvious way, but looking for
tokstr == NULL has the equivalent effect.

} My point is that doesn't really makes sense unless you decide that "&&"
} at the start of the line isn't going to be a token *at all*, in the same
} way that "(" has effectively the reverse behaviour. In other words,
} either you lose the parse error on "&& foo" and treat "&&" as a string
} *within* the lexer when it occurs in command position, or there's no
} case to answer here because the formal distinction you're trying to make
} doesn't actually exist within the shell.

This helps me understand what you were getting at, but I would counter
that the existence of precommand modifiers shows that the shell does in
fact have that distinction -- internally the distinction is at another
level, but from an overall perspective "&&" appears only after commands,
and "(" appears before; neither occurs "in place of" a command.

Except for the -g case, aliases should only apply to things that really
can be "in place of" a command.

} If you did make that change, making "&&" a string rather than a token
} at the start of the line, then you could alias it willy-nilly.

It'd have to be not just at the start of a line, but after every ";" or
"|" and so on. But you can't really do that -- it has to be a token when
it is not an alias, otherwise you get "&&: command not found" instead
of the syntax error you should get.

So you'd have to lex it as a string, look to see if it's an alias, and
then if it is not, back up and re-lex it. Which is sort of what happens
already when it IS an alias, I suppose.

--
Barton E. Schaefer
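
The try-as-string-then-re-lex idea in the last paragraph of the message can be modelled roughly as below. This is a toy in plain sh, not zsh's lexer: ALIASES, has_alias, is_operator and classify are all hypothetical names, and only a handful of operators are listed.

```sh
#!/bin/sh
# Toy model only; zsh's real lexer (Src/lex.c) does not look like this.

# Hypothetical alias table: a space-separated list of aliased names.
ALIASES="ll"

has_alias() {
    case " $ALIASES " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

is_operator() {
    case $1 in
        '&&'|'||'|';'|'|'|'&') return 0 ;;
        *) return 1 ;;
    esac
}

# Decide what a token found in command position should become.
classify() {
    if has_alias "$1"; then
        echo "alias"           # first pass: treated as a string, alias found
    elif is_operator "$1"; then
        echo "syntax error"    # no alias: re-lex as a token, "&& foo" is rejected
    else
        echo "command"         # ordinary word: run it or report "command not found"
    fi
}

classify '&&'                  # prints "syntax error" with the table above
ALIASES="ll &&"
classify '&&'                  # prints "alias" once "&&" is in the table
```

The order of the checks is the point: the alias lookup has to come first, and the operator interpretation is the fallback, which is what keeps the syntax error in place of "&&: command not found".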