* Re: grammar triviality with '&&'
       [not found] ` <20150304144756.GA27231@ypig.lip.ens-lyon.fr>
@ 2015-03-04 15:18 ` Peter Stephenson
  2015-03-04 21:13   ` Daniel Shahaf
  2015-03-04 22:05   ` Mikael Magnusson
       [not found] ` <150304175112.ZM19818@torch.brasslantern.com>
  1 sibling, 2 replies; 19+ messages in thread
From: Peter Stephenson @ 2015-03-04 15:18 UTC (permalink / raw)
  To: Zsh Hackers' List

On Wed, 4 Mar 2015 15:47:56 +0100
Vincent Lefevre <vincent@vinc17.net> wrote:
> I've found a bug:
>
> % alias '&&=(){ return $? } && '
> % && echo OK
> zsh: parse error near `&&'

(Moved to zsh-workers)

I was keeping very quiet about this, but it looks like it's not as hairy
as I thought it might be and the new code is actually slightly cleaner...

Now waiting for obscure failures elsewhere...

pws

diff --git a/Src/lex.c b/Src/lex.c
index 307b6e9..a076614 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -1728,13 +1728,48 @@ gotword(void)
     }
 }
 
+/* Check if current lex text matches an alias: 1 if so, else 0 */
+
+static int
+checkalias(void)
+{
+    Alias an;
+
+    if (!noaliases && isset(ALIASESOPT) &&
+        (!isset(POSIXALIASES) ||
+         !reswdtab->getnode(reswdtab, zshlextext))) {
+        char *suf;
+
+        an = (Alias) aliastab->getnode(aliastab, zshlextext);
+        if (an && !an->inuse &&
+            ((an->node.flags & ALIAS_GLOBAL) || incmdpos || inalmore)) {
+            inpush(an->text, INP_ALIAS, an);
+            if (an->text[0] == ' ' && !(an->node.flags & ALIAS_GLOBAL))
+                aliasspaceflag = 1;
+            lexstop = 0;
+            return 1;
+        }
+        if ((suf = strrchr(zshlextext, '.')) && suf[1] &&
+            suf > zshlextext && suf[-1] != Meta &&
+            (an = (Alias)sufaliastab->getnode(sufaliastab, suf+1)) &&
+            !an->inuse && incmdpos) {
+            inpush(dupstring(zshlextext), INP_ALIAS, NULL);
+            inpush(" ", INP_ALIAS, NULL);
+            inpush(an->text, INP_ALIAS, an);
+            lexstop = 0;
+            return 1;
+        }
+    }
+
+    return 0;
+}
+
 /* expand aliases and reserved words */
 
 /**/
 int
 exalias(void)
 {
-    Alias an;
     Reswd rw;
 
     hwend();
@@ -1746,7 +1781,7 @@ exalias(void)
 
     if (!tokstr) {
         zshlextext = tokstrings[tok];
-        return 0;
+        return checkalias();
     } else {
         VARARR(char, copy, (strlen(tokstr) + 1));
 
@@ -1772,34 +1807,10 @@ exalias(void)
 
         if (tok == STRING) {
             /* Check for an alias */
-            if (!noaliases && isset(ALIASESOPT) &&
-                (!isset(POSIXALIASES) ||
-                 !reswdtab->getnode(reswdtab, zshlextext))) {
-                char *suf;
-
-                an = (Alias) aliastab->getnode(aliastab, zshlextext);
-                if (an && !an->inuse &&
-                    ((an->node.flags & ALIAS_GLOBAL) || incmdpos || inalmore)) {
-                    inpush(an->text, INP_ALIAS, an);
-                    if (an->text[0] == ' ' && !(an->node.flags & ALIAS_GLOBAL))
-                        aliasspaceflag = 1;
-                    lexstop = 0;
-                    if (zshlextext == copy)
-                        zshlextext = tokstr;
-                    return 1;
-                }
-                if ((suf = strrchr(zshlextext, '.')) && suf[1] &&
-                    suf > zshlextext && suf[-1] != Meta &&
-                    (an = (Alias)sufaliastab->getnode(sufaliastab, suf+1)) &&
-                    !an->inuse && incmdpos) {
-                    inpush(dupstring(zshlextext), INP_ALIAS, NULL);
-                    inpush(" ", INP_ALIAS, NULL);
-                    inpush(an->text, INP_ALIAS, an);
-                    lexstop = 0;
-                    if (zshlextext == copy)
-                        zshlextext = tokstr;
-                    return 1;
-                }
+            if (checkalias()) {
+                if (zshlextext == copy)
+                    zshlextext = tokstr;
+                return 1;
             }
 
             /* Then check for a reserved word */
diff --git a/Test/A02alias.ztst b/Test/A02alias.ztst
index 7121c50..36dfa24 100644
--- a/Test/A02alias.ztst
+++ b/Test/A02alias.ztst
@@ -42,3 +42,18 @@
   cat <(echo foo | cat)
 0:Alias expansion works at the end of parsed strings
 >foo
+
+  alias '&&=(){ return $?; } && '
+  alias not_the_print_command=print
+  eval 'print This is output
+  && print And so is this
+  && { print And this too; false; }
+  && print But not this
+  && print Nor this
+  true
+  && not_the_print_command And aliases are expanded'
+0:We can now alias special tokens. Woo hoo.
+>This is output
+>And so is this
+>And this too
+>And aliases are expanded

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: grammar triviality with '&&'
  2015-03-04 15:18 ` grammar triviality with '&&' Peter Stephenson
@ 2015-03-04 21:13   ` Daniel Shahaf
  2015-03-04 22:05   ` Mikael Magnusson
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel Shahaf @ 2015-03-04 21:13 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh Hackers' List

Peter Stephenson wrote on Wed, Mar 04, 2015 at 15:18:30 +0000:
> On Wed, 4 Mar 2015 15:47:56 +0100
> Vincent Lefevre <vincent@vinc17.net> wrote:
> > I've found a bug:
> >
> > % alias '&&=(){ return $? } && '
> > % && echo OK
> > zsh: parse error near `&&'
>
> (Moved to zsh-workers)
>
> I was keeping very quiet about this, but it looks like it's not as hairy
> as I thought it might be and the new code is actually slightly cleaner...
>
> Now waiting for obscure failures elsewhere...
>
> pws
>

The commit doesn't include the test part (even though the changelog
entry does mention it).

Daniel

> diff --git a/Test/A02alias.ztst b/Test/A02alias.ztst
> index 7121c50..36dfa24 100644
> --- a/Test/A02alias.ztst
> +++ b/Test/A02alias.ztst
> @@ -42,3 +42,18 @@
>    cat <(echo foo | cat)
>  0:Alias expansion works at the end of parsed strings
>  >foo
> +
> +  alias '&&=(){ return $?; } && '
> +  alias not_the_print_command=print
> +  eval 'print This is output
> +  && print And so is this
> +  && { print And this too; false; }
> +  && print But not this
> +  && print Nor this
> +  true
> +  && not_the_print_command And aliases are expanded'
> +0:We can now alias special tokens. Woo hoo.
> +>This is output
> +>And so is this
> +>And this too
> +>And aliases are expanded
* Re: grammar triviality with '&&'
  2015-03-04 15:18 ` grammar triviality with '&&' Peter Stephenson
  2015-03-04 21:13   ` Daniel Shahaf
@ 2015-03-04 22:05   ` Mikael Magnusson
  2015-03-05  9:46     ` Peter Stephenson
  1 sibling, 1 reply; 19+ messages in thread
From: Mikael Magnusson @ 2015-03-04 22:05 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh Hackers' List

On Wed, Mar 4, 2015 at 4:18 PM, Peter Stephenson
<p.stephenson@samsung.com> wrote:
> On Wed, 4 Mar 2015 15:47:56 +0100
> Vincent Lefevre <vincent@vinc17.net> wrote:
>> I've found a bug:
>>
>> % alias '&&=(){ return $? } && '
>> % && echo OK
>> zsh: parse error near `&&'
>
> (Moved to zsh-workers)
>
> I was keeping very quiet about this, but it looks like it's not as hairy
> as I thought it might be and the new code is actually slightly cleaner...
>
> Now waiting for obscure failures elsewhere...

All I have to do is press ctrl-c at a prompt, and it crashes:

Core was generated by `zsh -f'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000435477 in hasher (str=0x1 <error: Cannot access memory at address 0x1>) at hashtable.c:85
85          while ((c = *((unsigned char *) str++)))
(gdb) bt
#0  0x0000000000435477 in hasher (str=0x1 <error: Cannot access memory at address 0x1>) at hashtable.c:85
#1  0x0000000000435490 in gethashnode (ht=0x1a9b6a0, nam=0x0) at hashtable.c:231
#2  0x00000000004484fc in checkalias () at lex.c:1743
#3  0x000000000044ae74 in exalias () at lex.c:1784
#4  0x000000000044b0b7 in zshlex () at lex.c:272
#5  0x0000000000467d93 in parse_event (endtok=endtok@entry=37) at parse.c:538
#6  0x000000000043dc29 in loop (toplevel=toplevel@entry=1, justonce=justonce@entry=0) at init.c:145
#7  0x0000000000440a32 in zsh_main (argc=<optimized out>, argv=0x7fffef4307d8) at init.c:1674
#8  0x000000000040f1c6 in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:93

(Note that if you run it inside gdb, you need to change the settings
for 'handle SIGINT' with the command:
(gdb) handle SIGINT nostop noprint pass
or gdb will eat the ctrl-c)

-- 
Mikael Magnusson
* Re: grammar triviality with '&&'
  2015-03-04 22:05   ` Mikael Magnusson
@ 2015-03-05  9:46     ` Peter Stephenson
  0 siblings, 0 replies; 19+ messages in thread
From: Peter Stephenson @ 2015-03-05 9:46 UTC (permalink / raw)
  To: Zsh Hackers' List

On Wed, 4 Mar 2015 23:05:44 +0100
Mikael Magnusson <mikachu@gmail.com> wrote:
> > Now waiting for obscure failures elsewhere...
>
> All I have to do is press ctrl-c at a prompt, and it crashes:

It's amazing what you can do if you try.

It appears we have to avoid expanding null pointers.

One day, someone is going to have the delightful experience of writing
more interactive tests.

pws

diff --git a/Src/lex.c b/Src/lex.c
index a076614..494ea88 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -1735,6 +1735,9 @@ checkalias(void)
 {
     Alias an;
 
+    if (!zshlextext)
+        return 0;
+
     if (!noaliases && isset(ALIASESOPT) &&
         (!isset(POSIXALIASES) ||
          !reswdtab->getnode(reswdtab, zshlextext))) {
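
[Editorial illustration, not part of the original message.] To make the crash and its fix concrete, here is a minimal C sketch of the pattern, assuming a toy table rather than the real zsh hash tables. The names toy_alias, toy_table, toy_hasher and toy_lookup are hypothetical; only the shape of the loop (taken from the backtrace line at hashtable.c:85) and the early return (taken from the patch above) come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an alias table entry; NOT the real zsh Alias struct. */
struct toy_alias {
    const char *name;
    const char *text;
};

static const struct toy_alias toy_table[] = {
    { "ll", "ls -l" },
    { "&&", "(){ return $?; } && " },
};

/* Walks the key byte by byte, like the loop shown in the backtrace.
 * Passing NULL here would dereference a null pointer, which is the crash. */
static unsigned toy_hasher(const char *str)
{
    unsigned hashval = 0, c;

    assert(str != NULL);   /* precondition the caller must guarantee */
    while ((c = *((const unsigned char *) str++)))
        hashval += (hashval << 5) + c;   /* arbitrary toy hash */
    return hashval;
}

/* Returns the alias text for name, or NULL if there is none.  The early
 * return plays the role of "if (!zshlextext) return 0;" in the patch. */
static const char *toy_lookup(const char *name)
{
    size_t i;
    unsigned h;

    if (!name)
        return NULL;
    h = toy_hasher(name);
    for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++)
        if (toy_hasher(toy_table[i].name) == h &&
            strcmp(toy_table[i].name, name) == 0)
            return toy_table[i].text;
    return NULL;
}
```

Calling toy_lookup(NULL) returns NULL without ever reaching the hash loop, which is the behaviour the guard is meant to provide when the lexer is interrupted with no token text.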
[parent not found: <150304175112.ZM19818@torch.brasslantern.com>]
[parent not found: <20150305100638.55631238@pwslap01u.europe.root.pri>]
* Aliasing separators (Re: grammar triviality with '&&')
       [not found] ` <20150305100638.55631238@pwslap01u.europe.root.pri>
@ 2015-03-05 17:07 ` Bart Schaefer
  2015-03-05 17:40   ` Peter Stephenson
  0 siblings, 1 reply; 19+ messages in thread
From: Bart Schaefer @ 2015-03-05 17:07 UTC (permalink / raw)
  To: zsh-workers

Moved to zsh-workers.

On Mar 5, 10:06am, Peter Stephenson wrote:
}
} > Although I see PWS has already made a (broken?) stab at changing this,
} > I think that's a documentation omission rather than a code bug.  Some
} > things intentionally cannot be aliased.
}
} I don't think that makes sense: there's too much you already can alias.
} You can alias reserved words and arbitrary magic sequences like \&, for
} example, and consequently already have the power to do as much damage as
} you like.  Forbidding it in this case would just be providing an
} unmemorable list of special cases.

There has to be a line somewhere; you can't usefully alias whitespace,
for example.  And I think this particular case goes over that line.

Consider, with this patch in place:

torch% alias -g '&&'=foo
torch% set -x
torch% true&&false
+Src/zsh:3> true foofalse
torch% &&bar
+Src/zsh:4> foobar
zsh: command not found: foobar

That's CERTAINLY not the intended behavior -- separator tokens don't need
to be delineated by whitespace, but the intention of aliasing is that it
does NOT take place "inside" an unbroken string without whitespace around
it.

Aliasing only of STRING tokens is exactly the right thing and this change
is simply wrong.  The doc only says "before parsing" as a shorthand instead
of a long explanation about how the alias is replaced and then parsed all
over again.
* Re: Aliasing separators (Re: grammar triviality with '&&')
  2015-03-05 17:07 ` Aliasing separators (Re: grammar triviality with '&&') Bart Schaefer
@ 2015-03-05 17:40   ` Peter Stephenson
  2015-03-06  1:42     ` Bart Schaefer
  0 siblings, 1 reply; 19+ messages in thread
From: Peter Stephenson @ 2015-03-05 17:40 UTC (permalink / raw)
  To: zsh-workers

On Thu, 5 Mar 2015 09:07:20 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> Consider, with this patch in place:
>
> torch% alias -g '&&'=foo
> torch% set -x
> torch% true&&false
> +Src/zsh:3> true foofalse
> torch% &&bar
> +Src/zsh:4> foobar
> zsh: command not found: foobar
>
> That's CERTAINLY not the intended behavior

I would tend to agree with that, but that's largely because I have no
idea what the intended behaviour would be.  It doesn't surprise me you
can get gobbledygook if you alias tokens to something with a different
behaviour.  You get gobbledygook of a different kind if you alias
reserved words to something with a different behaviour.

('alias -g' has always been a disaster waiting to happen, but I think
the basic feature is there without that.)

> Aliasing only of STRING tokens is exactly the right thing and this change
> is simply wrong.  The doc only says "before parsing" as a shorthand instead
> of a long explaination about how the alias is replaced and then parsed all
> over again.

If you can produce an alternative patch describing the previous position
properly, go ahead.  I don't think anyone is actually screaming to use
this change.

Personally, I don't see why allowing someone to alias \& but not &&
is logical; either you give users enough rope to hang themselves (and we
do), or you limit it to non-metacharacters (and we don't).

alias -g '*=*; print nonexistent-file'
ls *

pws
* Re: Aliasing separators (Re: grammar triviality with '&&')
  2015-03-05 17:40   ` Peter Stephenson
@ 2015-03-06  1:42     ` Bart Schaefer
  2015-03-06  4:13       ` Mikael Magnusson
  2015-03-06  9:40       ` Peter Stephenson
  0 siblings, 2 replies; 19+ messages in thread
From: Bart Schaefer @ 2015-03-06 1:42 UTC (permalink / raw)
  To: zsh-workers

On Mar 5, 5:40pm, Peter Stephenson wrote:
}
} Personally, I don't see why allowing someone to alias \& but not &&
} is logical; either you give users enough rope to hang themselves (and we
} do), or you limit it to non-metacharacters (and we don't).

It has nothing to do with metacharacters and everything to do with tokens.

"*" is not intrinsically a token; it's only a token when delimited by
whitespace or other tokens.  Similarly "\&" is not intrinsically a token.
If you write "**" it becomes a different thing and any alias for "*" no
longer applies.

But (unless quoted, when aliasing doesn't apply anyway) "&&" always is
a token; further it's in the class of tokens that separate other parts
of the lexical space into tokens.  Allowing separators to be aliased
doesn't just change the outcome of a parse (in the way that, say, an
alias for "fi" would), it changes the rules for constructing other
aliasable tokens.

Furthermore, ignoring "alias -g" issues which I agree are an entirely
smellier kettle of fish, it changes the definition of "in command
position".  With input like

torch% &&bar

I would argue that the "&&" is NOT "in command position" because in the
normal lexical situation "command position" ENDS just to the left of any
separator.  There's NOTHING in "command position" in that example.

Either "&&" is a separator token and should act like one, or it isn't
and in that example the alias for "&&bar" should be looked up instead.

} > Aliasing only of STRING tokens is exactly the right thing and this
} > change is simply wrong.  The doc only says "before parsing" as a
} > shorthand instead of a long explanation about how the alias is
} > replaced and then parsed all over again.
}
} If you can produce an alternative patch describing the previous position
} properly, go ahead.  I don't think anyone is actually screaming to use
} this change.

I'll think about a documentation patch if that's what you mean.
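
[Editorial illustration, not part of the original message.] The distinction drawn above, that "&&" splits its neighbours into separate tokens even without whitespace and that (before the patch) only STRING tokens were alias candidates, can be sketched with a toy lexer. This is a simplified model under stated assumptions, not zsh's lexer: it knows only spaces and "&&", and the names toy_tok, toy_lex and toy_alias_candidate are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy token kinds; the real lexer has many more. */
enum toy_tok { TOY_END, TOY_STRING, TOY_ANDAND };

/* Reads one token from *pp into buf (capacity cap) and advances *pp.
 * "&&" always terminates the word before it, with or without spaces. */
static enum toy_tok toy_lex(const char **pp, char *buf, size_t cap)
{
    const char *p = *pp;
    size_t n = 0;

    assert(cap >= 3);               /* room for "&&" plus terminator */
    while (*p == ' ')
        p++;
    if (!*p) {
        *pp = p;
        buf[0] = '\0';
        return TOY_END;
    }
    if (p[0] == '&' && p[1] == '&') {
        strcpy(buf, "&&");
        *pp = p + 2;
        return TOY_ANDAND;
    }
    while (*p && *p != ' ' && !(p[0] == '&' && p[1] == '&') && n + 1 < cap)
        buf[n++] = *p++;
    buf[n] = '\0';
    *pp = p;
    return TOY_STRING;
}

/* The pre-patch rule as described in the thread: only STRING tokens
 * are considered for alias lookup. */
static int toy_alias_candidate(enum toy_tok t)
{
    return t == TOY_STRING;
}
```

Feeding "true&&false" to toy_lex in a loop yields the STRING "true", the separator "&&", and the STRING "false". Under the STRING-only rule the separator is never looked up, so "true" and "false" remain the only alias candidates.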
* Re: Aliasing separators (Re: grammar triviality with '&&')
  2015-03-06  1:42     ` Bart Schaefer
@ 2015-03-06  4:13       ` Mikael Magnusson
  2015-03-06 16:43         ` Vincent Lefevre
  2015-03-06  9:40       ` Peter Stephenson
  1 sibling, 1 reply; 19+ messages in thread
From: Mikael Magnusson @ 2015-03-06 4:13 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh workers

On Fri, Mar 6, 2015 at 2:42 AM, Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Mar 5, 5:40pm, Peter Stephenson wrote:
> }
> } Personally, I don't see why allowing someone to alias \& but not &&
> } is logical; either you give users enough rope to hang themselves (and we
> } do), or you limit it to non-metacharacters (and we don't).
>
> It has nothing to do with metacharacters and everything to do with tokens.
>
> "*" is not intrinsically a token; it's only a token when delimited by
> whitespace or other tokens.  Similarly "\&" is not intrinsically a token.
> If you write "**" it becomes a different thing and any alias for "*" no
> longer applies.
>
> But (unless quoted, when aliasing doesn't apply anyway) "&&" always is
> a token; further it's in the class of tokens that separate other parts
> of the lexical space into tokens.  Allowing separators to be aliased
> doesn't just change the outcome of a parse (in the way that, say, an
> alias for "fi" would), it changes the rules for constructing other
> aliasable tokens.
>
> Furthermore, ignoring "alias -g" issues which I agree are an entirely
> smellier kettle of fish, it changes the definition of "in command
> position".  With input like
>
> torch% &&bar
>
> I would argue that the "&&" is NOT "in command position" because in the
> normal lexical situation "command position" ENDS just to the left of any
> separator.  There's NOTHING in "command position" in that example.
>
> Either "&&" is a separator token and should act like one, or it isn't
> and in that example the alias for "&&bar" should be looked up instead.
>
> } > Aliasing only of STRING tokens is exactly the right thing and this
> } > change is simply wrong.  The doc only says "before parsing" as a
> } > shorthand instead of a long explanation about how the alias is
> } > replaced and then parsed all over again.
> }
> } If you can produce an alternative patch describing the previous position
> } properly, go ahead.  I don't think anyone is actually screaming to use
> } this change.
>
> I'll think about a documentation patch if that's what you mean.

You could argue that the documentation is already right, it uses
"before parsing", and && is interpreted in the "lexing" stage, right?
However, I don't think users can know that... :)

-- 
Mikael Magnusson
* Re: Aliasing separators (Re: grammar triviality with '&&')
  2015-03-06  4:13       ` Mikael Magnusson
@ 2015-03-06 16:43         ` Vincent Lefevre
  0 siblings, 0 replies; 19+ messages in thread
From: Vincent Lefevre @ 2015-03-06 16:43 UTC (permalink / raw)
  To: zsh-workers

On 2015-03-06 05:13:28 +0100, Mikael Magnusson wrote:
> You could argue that the documentation is already right, it uses
> "before parsing", and && is interpreted in the "lexing" stage, right?
> However, I don't think users can know that... :)

The zsh man pages don't mention anything about the lexing stage.
So, if this is the intended behavior, there's at least something
missing in the documentation.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
* Re: Aliasing separators (Re: grammar triviality with '&&')
  2015-03-06  1:42     ` Bart Schaefer
  2015-03-06  4:13       ` Mikael Magnusson
@ 2015-03-06  9:40       ` Peter Stephenson
  2015-03-06 19:26         ` Bart Schaefer
  1 sibling, 1 reply; 19+ messages in thread
From: Peter Stephenson @ 2015-03-06 9:40 UTC (permalink / raw)
  To: zsh-workers

OK, to state my basic position (though it's kind of moot --- as I said I
don't think anybody really needs the change) 1. tokenisation is part of
lexing 2. alias expansion comes between lexing and parsing 3. any
result of lexing is game for alias expansion, unless you make stricter
rules than zsh already has.  But this discussion isn't really going
anywhere.

On Thu, 5 Mar 2015 17:42:40 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> torch% &&bar
>
> I would argue that the "&&" is NOT "in command position" because in the
> normal lexical situation "command position" ENDS just to the left of any
> separator.  There's NOTHING in "command position" in that example.
>
> Either "&&" is a separator token and should act like one, or it isn't
> and in that example the alias for "&&bar" should be looked up instead.

Well, that's not how the lexer actually works.  It's been told it's in
command position and it fetches the next token.  So whatever comes at
the start of the line *must* be in command position.  The parser can
throw an error if it thinks it shouldn't be, but that's after alias
expansion (so much is uncontroversial).

Actually, come to think of it, I think you mean the opposite: normally,
when you encounter a "&&", you're expecting the continuation of the
current command; here, it's the reverse --- you're expecting the start
of a command, and encounter something which only occurs after a command.
So you might argue that you "turn off" "&&" analysis in the same way
that you "turn on" "(" analysis at the same point --- and that example's
relevant because when "(" is at the start of a line we only take one
character at a time, i.e.

  print ((foo))
  ((print foo); print bar)

are treated entirely differently.  (By the way, aliasing "(" here
therefore does exactly what you'd expect: in the second case it gets
replaced as a single token each time it occurs, in the first case not.
I don't know of a good use for this, which is a kind of motto for the
current discussion.)

But I don't really buy that; we know a ";" separator has to be detected
at this point whether there's a command there or not.  So there's not
really any sensible reason for not turning "&&" into a token.

Given that, in any case, no one is actually suggesting we change the
lexer to do something different with "&&" I don't think I see the
relevance anyway.  "&&" is a token and either expanded as an alias or
not, just as you get a parse error with ";;" because it's always treated
as a token whether we're in a case or not.

> } > Aliasing only of STRING tokens is exactly the right thing and this
> } > change is simply wrong.  The doc only says "before parsing" as a
> } > shorthand instead of a long explanation about how the alias is
> } > replaced and then parsed all over again.
> }
> } If you can produce an alternative patch describing the previous position
> } properly, go ahead.  I don't think anyone is actually screaming to use
> } this change.
>
> I'll think about a documentation patch if that's what you mean.

Yes, we certainly wouldn't need any code change in the other case.

pws
* Re: Aliasing separators (Re: grammar triviality with '&&')
  2015-03-06  9:40       ` Peter Stephenson
@ 2015-03-06 19:26         ` Bart Schaefer
  2015-03-07 15:52           ` Peter Stephenson
  0 siblings, 1 reply; 19+ messages in thread
From: Bart Schaefer @ 2015-03-06 19:26 UTC (permalink / raw)
  To: zsh-workers

On Mar 6, 9:40am, Peter Stephenson wrote:
} Subject: Re: Aliasing separators (Re: grammar triviality with '&&')
}
} OK, to state my basic position (though it's kind of moot --- as I said I
} don't think anybody really needs the change) 1. tokenisation is part of
} lexing 2. alias expansion comes between lexing and parsing 3. any
} result of lexing is game for alias expansion, unless you make stricter
} rules than zsh already has.  But this discussion isn't really going
} anywhere.

Understood, but (prior to 34641) zsh *did* have stricter rules, and (in
terms of the lexer, not in terms of explaining to end users) the rule
was very simple: only STRING tokens are subject to alias expansion.

In practice that means something like "only tokens that can be changed
by concatenating with another string using simple lexical pasting, may
be aliased."  But that isn't a very satisfying way to say it (and it's
not 100% true because of "{" being a reserved word).

As an aside, using zshlextext isn't really correct either if the real
intention is to allow aliasing of tokens.  Did you plan to allow the
aliasing of the NEWLIN token?  Because with 34641,

    alias $'\n'=...

does not work, but

    alias '\n'=...

actually does create an alias for hitting enter at a blank PS1 prompt.
The point being that for non-STRING tokens, zshlextext doesn't always
represent the actual input string.

The other minor point is that this slows down lexical analysis a lot.
Many more things are going through checkalias(), including in some
cases (as you pointed out) every individual character.

Finally it seems wrong that "setopt POSIXALIASES" disallows aliasing of
reserved words but (with 34641) still allows aliasing of other special
tokens.

} On Thu, 5 Mar 2015 17:42:40 -0800
} Bart Schaefer <schaefer@brasslantern.com> wrote:
} > torch% &&bar
} >
} > I would argue that the "&&" is NOT "in command position" because in the
} > normal lexical situation "command position" ENDS just to the left of any
} > separator.  There's NOTHING in "command position" in that example.
}
} Well, that's not how the lexer actually works.  It's been told it's in
} command position and it fetches the next token.  So whatever comes at
} the start of the line *must* be in command position.

This is curiously flipped around from the previous discussion; now you're
arguing from the strict lexer POV and I'm talking about what it ought to
mean to the end user.

The lexer can certainly be (and was, before, though it was not explicitly
stated) smart enough to know that any token that arrives at that point
with tokstr == NULL is not in fact something that could be a command and
therefore shouldn't be treated as one.

} Given that, in any case, no one is actually suggesting we change the
} lexer to do something different with "&&" I don't think I see the
} relevance anyway.  "&&" is a token and either expanded as an alias or
} not

It's relevant to "alias" vs. "alias -g".  If && at the start of the line
is not in command position, then it doesn't expand unless it has the
global-alias flag.

Incidentally here is a curious bug that is present both (long, long)
before and also after 34641:

torch% alias \{='print foo'
torch% { this is a test
foo this is a test

So far so good, now recall from history:

torch% { this is a test

Back up and delete the space after "{":

torch% {this is a test
foothis is a test

Now recall from history again:

torch% { foothis is a test

That seems like an oopsie.  I'd actually rather the disambiguating space
after the brace was inserted when the alias expands instead of pasting
up "foothis".  But let's figure out whether to undo/redo all or part of
34641 first.

-- 
Barton E. Schaefer
* Re: Aliasing separators (Re: grammar triviality with '&&')
  2015-03-06 19:26         ` Bart Schaefer
@ 2015-03-07 15:52           ` Peter Stephenson
  2015-03-07 17:18             ` Ray Andrews
  2015-03-07 21:10             ` Bart Schaefer
  0 siblings, 2 replies; 19+ messages in thread
From: Peter Stephenson @ 2015-03-07 15:52 UTC (permalink / raw)
  To: zsh-workers

On Fri, 6 Mar 2015 11:26:28 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Mar 6, 9:40am, Peter Stephenson wrote:
> } Subject: Re: Aliasing separators (Re: grammar triviality with '&&')
> }
> } OK, to state my basic position (though it's kind of moot --- as I said I
> } don't think anybody really needs the change) 1. tokenisation is part of
> } lexing 2. alias expansion comes between lexing and parsing 3. any
> } result of lexing is game for alias expansion, unless you make stricter
> } rules than zsh already has.  But this discussion isn't really going
> } anywhere.
>
> Understood, but (prior to 34641) zsh *did* have stricter rules, and (in
> terms of the lexer, not in terms of explaining to end users) the rule
> was very simple: only STRING tokens are subject to alias expansion.
>
> In practice that means something like "only tokens that can be changed
> by concatenating with another string using simple lexical pasting, may
> be aliased."  But that isn't a very satisfying way to say it (and it's
> not 100% true because of "{" being a reserved word).

Sure, but I'm not sure that's particularly useful for users.  The rule
looks something like "you can't alias it if it's one of them things
where you don't need to put a space after it", or something like that.

We can still document it somehow, though, so this isn't really
fundamental.

> As an aside, using zshlextext isn't really correct either if the real
> intention is to allow aliasing of tokens.  Did you plan to allow the
> aliasing of the NEWLIN token?  Because with 34641,
>
>     alias $'\n'=...
>
> does not work, but
>
>     alias '\n'=...
>
> actually does create an alias for hitting enter at a blank PS1 prompt.
> The point being that for non-STRING tokens, zshlextext doesn't always
> represent the actual input string.

Yes, (not necessarily here but in general) that has effects in other
cases e.g. text representation of syntactic structures, so it's
certainly something to think about.  The cases where it's different are
weird and wonderful enough it's not clear you'd want them to work, but
that's a fairly fuzzy target.  There'd certainly be room for more
detailed advice to users (though, to be honest, I'm less convinced than
I used to be about the merits of longer documentation).

> The other minor point is that this slows down lexical analysis a lot.
> Many more things are going through checkalias(), including in some
> cases (as you pointed out) every individual character.

That's not necessarily that minor, actually; do you have numbers?
Tokens are a small fraction of most commands but even there there may be
pathological cases.

(Hmm... entirely separately, what would we gain by optimising the case
where there are no global aliases, which a lot of us don't use, not to
search for them?  That looks straightforward --- count added or removed
global aliases, or, less bug-prone, scan the alias table when it changes,
and only search for aliases if the count is non-zero or incmdpos or
inalmore are set.  The major problem is the likelihood of an obscure bug
in the count rendering global aliases unusable.)

> Finally it seems wrong that "setopt POSIXALIASES" disallows aliasing of
> reserved words but (with 34641) still allows aliasing of other special
> tokens.

Yes, that looks like a real bug.

> } On Thu, 5 Mar 2015 17:42:40 -0800
> } Bart Schaefer <schaefer@brasslantern.com> wrote:
> } > torch% &&bar
> } >
> } > I would argue that the "&&" is NOT "in command position" because in the
> } > normal lexical situation "command position" ENDS just to the left of any
> } > separator.  There's NOTHING in "command position" in that example.
> }
> } Well, that's not how the lexer actually works.  It's been told it's in
> } command position and it fetches the next token.  So whatever comes at
> } the start of the line *must* be in command position.
>
> This is curiously flipped around from the previous discussion; now you're
> arguing from the strict lexer POV and I'm talking about what it ought to
> mean to the end user.

I'm under the impression I've been arguing about what actually happens,
which is complicated when different people have different views of it.

> The lexer can certainly be (and was, before, though it was not explicitly
> stated) smart enough to know that any token that arrives at that point
> with tokstr == NULL is not in fact something that could be a command and
> therefore shouldn't be treated as one.

Not sure what this means.  "(" in command position is a token but is
effectively a command meaning "enter a subshell and while you're there
do whatever I tell you next", and is handled as such by the exec.c
chain.

> } Given that, in any case, no one is actually suggesting we change the
> } lexer to do something different with "&&" I don't think I see the
> } relevance anyway.  "&&" is a token and either expanded as an alias or
> } not
>
> It's relevant to "alias" vs. "alias -g".  If && at the start of the line
> is not in command position, then it doesn't expand unless it has the
> global-alias flag.

My point is that doesn't really make sense unless you decide that "&&"
at the start of the line isn't going to be a token *at all*, in the same
way that "(" has effectively the reverse behaviour.  In other words,
either you lose the parse error on "&& foo" and treat "&&" as a string
*within* the lexer when it occurs in command position, or there's no
case to answer here because the formal distinction you're trying to make
doesn't actually exist within the shell.

If you did make that change, making "&&" a string rather than a token
at the start of the line, then you could alias it willy-nilly.  So I
still don't really see it as relevant to the behaviour of tokens.

I suspect I'm not explaining this point properly.

pws
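
[Editorial illustration, not part of the original message.] The optimisation floated in the parenthetical above (track how many global aliases exist, and skip the alias search when the count is zero and neither incmdpos nor inalmore is set) could look roughly like the following C sketch. This is a guess at the shape of the idea, not code from zsh; the counter and all toy_ names are hypothetical, and only the flag names incmdpos and inalmore come from the thread.

```c
#include <assert.h>

/* Hypothetical counter that would have to be kept in step with the
 * alias table whenever a global alias is defined or removed. */
static int toy_global_alias_count = 0;

static void toy_add_global_alias(void)
{
    toy_global_alias_count++;
}

static void toy_remove_global_alias(void)
{
    /* An underflow here is the kind of "obscure bug in the count"
     * the message warns about. */
    assert(toy_global_alias_count > 0);
    toy_global_alias_count--;
}

/* Search the alias table only when a hit is possible: either some
 * global alias exists, or one of the two flags named in the message
 * (incmdpos, inalmore) says an ordinary alias may apply here. */
static int toy_should_search_aliases(int incmdpos, int inalmore)
{
    return toy_global_alias_count > 0 || incmdpos || inalmore;
}
```

The alternative mentioned in the message, rescanning the table whenever it changes instead of counting increments and decrements, would replace the two update functions with one recount and trade a little work on alias definition for robustness.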
* Re: Aliasing separators (Re: grammar triviality with '&&')
  2015-03-07 15:52           ` Peter Stephenson
@ 2015-03-07 17:18             ` Ray Andrews
  2015-03-07 21:10             ` Bart Schaefer
  1 sibling, 0 replies; 19+ messages in thread
From: Ray Andrews @ 2015-03-07 17:18 UTC (permalink / raw)
  To: zsh-workers

On 03/07/2015 07:52 AM, Peter Stephenson wrote:
> If you did make that change, making "&&" a string rather than a token
> at the start of the line, then you could alias it willy-nilly.  So I
> still don't really see it as relevant to the behaviour of tokens.  I
> suspect I'm not explaining this point properly.  pws

As for me, I didn't realize I'd be guilty of being a party to the
creation of an ouroboros.  Can it even be contemplated to make
fundamental syntax aliasable?  Can I alias a backtick?  Can I alias the
word alias?  Not on this side of sanity.  I dunno, maybe what you guys
are contemplating makes sense, but it sure looks like the
time-traveler's paradox to me.  Allowing syntax to change its own
meaning?  Is there anywhere in this universe where aliasing '&&' is
useful, even if it didn't create paradox?  I myself am happy with Bart's
last explanation, it's robust, understandable, necessary, fundamental.

[ -e file1 ]\
&& do-this

isn't hard to type.  Pandora, meet Kurt Gödel, meet Doctor Who.

Lawrence, forgive me ;-)
* Re: Aliasing separators (Re: grammar triviality with '&&')
From: Bart Schaefer @ 2015-03-07 21:10 UTC
To: zsh-workers

On Mar 7,  3:52pm, Peter Stephenson wrote:
} Subject: Re: Aliasing separators (Re: grammar triviality with '&&')
}
} On Fri, 6 Mar 2015 11:26:28 -0800
} Bart Schaefer <schaefer@brasslantern.com> wrote:
} >
} > In practice that means something like "only tokens that can be changed
} > by concatenating with another string using simple lexical pasting, may
} > be aliased."  But that isn't a very satisfying way to say it (and it's
} > not 100% true because of "{" being a reserved word).
}
} Sure, but I'm not sure that's particularly useful for users.  The rule
} looks something like "you can't alias it if it's one of them things
} where you don't need to put a space after it", or something like that.

Actually, that helped me to put some sort of explanation around my
intuition of how aliasing should work:

    A token may be expanded as an alias only if doing so cannot change the
    lexical interpretation of any tokens that may appear adjacent to it.

This emphasizes that "{this" expanding to "{ foothis" in the example at
the tail of my previous message, is a bug:  The interpretation of "this"
has been changed, and the internal inconsistency is manifest when you
examine what has been stored in the history.

Also note "cannot" rather than "does not," which emphasizes why you can't
make an alias for "&&" even if you always write "x && y" rather than
"x&&y", and why you can't alias newline.

On the other hand, if aliasing were an entirely different stage rather
than occurring during regular lexing -- I think csh worked that way --
then I might be less concerned about this.  That is, for csh I believe
it went something like:

 1. Read in the line
 2. Break the line into words using whitespace and quoting **
 3. Check each word for alias expansion
 4. Apply lexical analysis to the result

2 + 3 are why csh aliases are allowed to make \!:N history references to
the words in the line while being expanded.  Since zsh does 1 and 4
simultaneously, the rules at 3 have to be different, but the basic
intention of applying the expansion to "words" was never meant to go
away as a result.

** Also, csh broke the line up into commands at ';' '&&' etc. before
applying aliases, so \!:N references don't cross command boundaries.

} > The other minor point is that this slows down lexical analysis a lot.
} > Many more things are going through checkalias(), including in some
} > cases (as you pointed out) every individual character.
}
} That's not necessarily that minor, actually; do you have numbers?

I "repaired" POSIXALIASES like so ...

diff --git a/Src/lex.c b/Src/lex.c
index 494ea88..33c5288 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -1739,7 +1739,7 @@ checkalias(void)
 	return 0;
 
     if (!noaliases && isset(ALIASESOPT) &&
-	(!isset(POSIXALIASES) ||
+	(!isset(POSIXALIASES) || tokstr &&
 	 !reswdtab->getnode(reswdtab, zshlextext))) {
 	char *suf;
 

... and then ran this:

    repeat 10
    do
      time Src/zsh -ic 'autoload +X -m \*'
      time Src/zsh -o posixaliases -ic 'autoload +X -m \*'
    done

The "autoload +X" has the effect of loading the entire completion suite,
which was the largest convenient bolus of lexing/parsing work I could
think to throw at it.  The difference with 30 aliases defined (none
global) was insignificant, so looking up reserved words is not really a
factor.

I then backed out the lex.c changes and compared the old processing to
the new.  There still wasn't much difference.  However, this is with an
unstripped binary compiled for debugging, so it's possible a larger
difference would show up if optimization were enabled.

} > The lexer can certainly be (and was, before, though it was not
} > explicitly stated) smart enough to know that any token that arrives
} > at that point with tokstr == NULL is not in fact something that
} > could be a command and therefore shouldn't be treated as one.
}
} Not sure what this means.  "(" in command position is a token but is
} effectively a command

Maybe I could express it this way:  A command is something such that if
you prefix it with a precommand modifier, it's still a command.

The old aliasing code effectively differentiated those kinds of
somethings from other arbitrary tokens that might appear in what the
lexer calls command position.  It didn't do so in an obvious way, but
looking for tokstr == NULL has the equivalent effect.

} My point is that doesn't really make sense unless you decide that "&&"
} at the start of the line isn't going to be a token *at all*, in the same
} way that "(" has effectively the reverse behaviour.  In other words,
} either you lose the parse error on "&& foo" and treat "&&" as a string
} *within* the lexer when it occurs in command position, or there's no
} case to answer here because the formal distinction you're trying to make
} doesn't actually exist within the shell.

This helps me understand what you were getting at, but I would counter
that the existence of precommand modifiers shows that the shell does in
fact have that distinction -- internally the distinction is at another
level, but from an overall perspective "&&" appears only after commands,
and "(" appears before; neither occurs "in place of" a command.  Except
for the -g case, aliases should only apply to things that really can be
"in place of" a command.

} If you did make that change, making "&&" a string rather than a token
} at the start of the line, then you could alias it willy-nilly.

It'd have to be not just at the start of a line, but after every ";" or
"|" and so on.  But you can't really do that -- it has to be a token
when it is not an alias, otherwise you get "&&: command not found"
instead of the syntax error you should get.  So you'd have to lex it as
a string, look to see if it's an alias, and then if it is not, back up
and re-lex it.  Which is sort of what happens already when it IS an
alias, I suppose.

-- 
Barton E. Schaefer
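The csh-style staging described above (steps 2 and 3: split into words first, then check whole words against the alias table) can be sketched as a toy in C. This is not csh or zsh code. The alias table and every `toy_` name are made up for illustration, quoting is ignored, and every word is checked rather than only command words. The point it tries to show is that "x && y" and "x&&y" are split into different words, so an alias on "&&" could only ever match the first form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table, for illustration only. */
struct toy_alias {
    const char *name;
    const char *text;
};

static const struct toy_alias toy_aliases[] = {
    { "ll", "ls -l" },
    { "&&", "AND" },
};

/* Return the expansion for word[0..len), or NULL if it is not an alias. */
static const char *toy_lookup(const char *word, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof(toy_aliases) / sizeof(toy_aliases[0]); i++) {
        if (strlen(toy_aliases[i].name) == len &&
            strncmp(toy_aliases[i].name, word, len) == 0)
            return toy_aliases[i].text;
    }
    return NULL;
}

/*
 * Split line on blanks, replace whole words that match an alias, and
 * rejoin the words with single spaces.  The result would then be handed
 * to the real lexer (step 4).  Returns 0, or -1 if out is too small.
 */
int toy_expand(const char *line, char *out, size_t outsz)
{
    size_t used = 0;

    assert(line != NULL && out != NULL && outsz > 0);
    out[0] = '\0';
    while (*line) {
        const char *start, *text;
        size_t len, tlen;

        while (*line == ' ' || *line == '\t')
            line++;
        if (!*line)
            break;
        start = line;
        while (*line && *line != ' ' && *line != '\t')
            line++;
        len = (size_t)(line - start);

        text = toy_lookup(start, len);
        tlen = text ? strlen(text) : len;
        if (!text)
            text = start;

        /* Room for an optional separator, the word and the final NUL. */
        if (used + (used ? 1 : 0) + tlen + 1 > outsz)
            return -1;
        if (used)
            out[used++] = ' ';
        memcpy(out + used, text, tlen);
        used += tlen;
        out[used] = '\0';
    }
    return 0;
}
```

With this table, "x && y" comes out as "x AND y" while "x&&y" is left alone because it is a single word. That difference is the reason the rule above says "cannot" rather than "does not".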
* Re: Aliasing separators (Re: grammar triviality with '&&')
From: Vincent Lefevre @ 2015-03-09 11:46 UTC
To: zsh-workers

On 2015-03-07 13:10:08 -0800, Bart Schaefer wrote:
> Actually, that helped me to put some sort of explanation around my
> intuition of how aliasing should work:
>
> A token may be expanded as an alias only if doing so cannot change the
> lexical interpretation of any tokens that may appear adjacent to it.
>
> This emphasizes that "{this" expanding to "{ foothis" in the example at
> the tail of my previous message, is a bug:  The interpretation of "this"
> has been changed, and the internal inconsistency is manifest when you
> examine what has been stored in the history.

Can't this be solved by adding a space during alias expansion?

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
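The pasting problem and the suggested extra space can be shown at the string level with a small C sketch. It assumes, hypothetically, that "{" is aliased to "{ foo", which is inferred from the quoted "{this" becoming "{ foothis". It only concatenates strings and does not model zsh's input stack or history, so it says nothing about whether a space would actually resolve the history inconsistency. `toy_paste` is an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Put an alias expansion in front of the unread remainder of the input,
 * optionally separating the two with a blank.
 * Returns 0 on success, or -1 if out is too small.
 */
int toy_paste(const char *expansion, const char *rest, int add_space,
              char *out, size_t outsz)
{
    size_t elen, rlen, need;

    assert(expansion != NULL && rest != NULL && out != NULL);
    elen = strlen(expansion);
    rlen = strlen(rest);
    need = elen + (add_space ? 1 : 0) + rlen + 1;
    if (need > outsz)
        return -1;
    memcpy(out, expansion, elen);
    if (add_space)
        out[elen++] = ' ';
    memcpy(out + elen, rest, rlen + 1);   /* also copies the final NUL */
    return 0;
}
```

Pasting "{ foo" in front of "this" without the blank gives "{ foothis", the changed interpretation of "this" that Bart calls a bug. With the blank it gives "{ foo this", which keeps "this" as a separate word.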
* Re: Aliasing separators (Re: grammar triviality with '&&')
From: Peter Stephenson @ 2015-03-09 16:33 UTC
To: zsh-workers

On Mon, 9 Mar 2015 12:46:29 +0100
Vincent Lefevre <vincent@vinc17.net> wrote:
> On 2015-03-07 13:10:08 -0800, Bart Schaefer wrote:
> > Actually, that helped me to put some sort of explanation around my
> > intuition of how aliasing should work:
> >
> > A token may be expanded as an alias only if doing so cannot change the
> > lexical interpretation of any tokens that may appear adjacent to it.
> >
> > This emphasizes that "{this" expanding to "{ foothis" in the example at
> > the tail of my previous message, is a bug:  The interpretation of "this"
> > has been changed, and the internal inconsistency is manifest when you
> > examine what has been stored in the history.
>
> Can't this be solved by adding a space during alias expansion?

What happens to white space certainly seems to be central to the things
Bart is most worried about --- both what goes into and what comes out of
the alias.  Possibly soluble with enough checks and management of
token expansion.  Are we motivated enough to look into the necessary
code to improve this?  We still have the option of backing it off
before anyone outside the mailing list notices.

("We" mostly meaning the people doing the work, which for present
purposes means Bart and me.  People not doing any work are much more
easily motivated :-/)

pws
* Re: Aliasing separators (Re: grammar triviality with '&&')
From: Bart Schaefer @ 2015-03-09 17:03 UTC
To: zsh-workers

On Mar 9,  4:33pm, Peter Stephenson wrote:
}
} What happens to white space certainly seems to be central to the things
} Bart is most worried about --- both what goes into and what comes out of
} the alias.  Possibly soluble with enough checks and management of
} token expansion.  Are we motivated enough to look into the necessary
} code to improve this?  We still have the option of backing it off
} before anyone outside the mailing list notices.

My vote (perhaps obviously) would be to back it out, fix the issue with
"{" / catenation and the history, and then consider whether aliasing of
other tokens can be re-introduced.
* Re: Aliasing separators (Re: grammar triviality with '&&')
From: Peter Stephenson @ 2015-03-09 17:39 UTC
To: zsh-workers

On Mon, 9 Mar 2015 10:03:18 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Mar 9,  4:33pm, Peter Stephenson wrote:
> }
> } What happens to white space certainly seems to be central to the things
> } Bart is most worried about --- both what goes into and what comes out of
> } the alias.  Possibly soluble with enough checks and management of
> } token expansion.  Are we motivated enough to look into the necessary
> } code to improve this?  We still have the option of backing it off
> } before anyone outside the mailing list notices.
>
> My vote (perhaps obviously) would be to back it out, fix the issue with
> "{" / catenation and the history, and then consider whether aliasing of
> other tokens can be re-introduced.

I'm not sure what the last bit means, but the only requirement for
backing it off remains updating the documentation so it properly
describes what can and can't be aliased.

pws
* Re: Aliasing separators (Re: grammar triviality with '&&')
From: Ray Andrews @ 2015-03-09 17:47 UTC
To: zsh-workers

On 03/09/2015 10:03 AM, Bart Schaefer wrote:
> My vote (perhaps obviously) would be to back it out, fix the issue with
> "{" / catenation and the history, and then consider whether aliasing of
> other tokens can be re-introduced.
>

I'm not competent to vote, but my hope will be that whatever happens
will make the code more consistent, reliable, understandable and
intuitive, and make usage more predictable and friendly and not open any
cans of worms.

    [ -e file1 ] &&
    do-this

... from the start, I understood that this was friendly ... the parser
might have been crabby and demanded a complete statement, but it is
considerate enough to wrap and see what's on the next line before
throwing an error.  In the same spirit I thought that:

    [ -e file1 ]
    && do-this

... would be the same sort of friendliness, with the '&&' automatically
checking the previous errorlevel.  I thought that it would be impossible
for it to parse as anything but its naive reading, therefore it need not
be an error.  But if not, it's hardly a real-world difficulty.

The 'alias' thing seems very scary.  I might one day understand aliasing
fundamental syntax, but for now the mind reels.  Are not things tricky
enough already?  I hope something good comes from my trivial question ;-)
end of thread, newest message ~2015-03-09 17:47 UTC

Thread overview: 19+ messages
     [not found] <54F33934.2070607@eastlink.ca>
     [not found] ` <13666281425228233@web7o.yandex.ru>
     [not found] ` <54F345D3.9010204@eastlink.ca>
     [not found] ` <D0509295-7DA9-4F18-9E3D-D50C0A756998@larryv.me>
     [not found] ` <20150302022754.GA7449@xvii.vinc17.org>
     [not found] ` <CABx2=D8efL3X2tfB+_+VweY2yye6EhaMNbJa3b3jJeVMp=7gaQ@mail.gmail.com>
     [not found] ` <20150302104619.GC6869@xvii.vinc17.org>
     [not found] ` <20150302110610.2e2c7e86@pwslap01u.europe.root.pri>
     [not found] ` <CAH+w=7YoHjN85hqOZVywOfYGZqvU74vZrbE84Ln+V2HQi-6nSA@mail.gmail.com>
     [not found] ` <20150304144756.GA27231@ypig.lip.ens-lyon.fr>
2015-03-04 15:18 ` grammar triviality with '&&' Peter Stephenson
2015-03-04 21:13 ` Daniel Shahaf
2015-03-04 22:05 ` Mikael Magnusson
2015-03-05  9:46 ` Peter Stephenson
     [not found] ` <150304175112.ZM19818@torch.brasslantern.com>
     [not found] ` <20150305100638.55631238@pwslap01u.europe.root.pri>
2015-03-05 17:07 ` Aliasing separators (Re: grammar triviality with '&&') Bart Schaefer
2015-03-05 17:40 ` Peter Stephenson
2015-03-06  1:42 ` Bart Schaefer
2015-03-06  4:13 ` Mikael Magnusson
2015-03-06 16:43 ` Vincent Lefevre
2015-03-06  9:40 ` Peter Stephenson
2015-03-06 19:26 ` Bart Schaefer
2015-03-07 15:52 ` Peter Stephenson
2015-03-07 17:18 ` Ray Andrews
2015-03-07 21:10 ` Bart Schaefer
2015-03-09 11:46 ` Vincent Lefevre
2015-03-09 16:33 ` Peter Stephenson
2015-03-09 17:03 ` Bart Schaefer
2015-03-09 17:39 ` Peter Stephenson
2015-03-09 17:47 ` Ray Andrews