zsh-workers
 help / color / mirror / code / Atom feed
* [BUG] In reference to patch 39815, about (z) flag and $( parse error
@ 2017-10-10 15:03 Sebastian Gniazdowski
  2017-10-10 16:19 ` Sebastian Gniazdowski
  0 siblings, 1 reply; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-10 15:03 UTC (permalink / raw)
  To: zsh-workers

Hello,
there's patch:

2016-11-03 4073a66 39815: Read input to end on parse error in $(...) inside a string.

It almost fixes following issue. There's syntax.txt:

asmcmds+=(${(o)$(cgasm -f '.*' | perl -alne '
                BEGIN{ my @cmds; }
                push @cmds, split(/ /, lc $F[0] =~ y|/| |r);
                END{ print join " ", @cmds;}'
        )})
dbpkgs+=(${(fo@)$(pacman -Qq)})

On 5.2:

% data=$(<syntax.txt)
% arr=( "${(Z+cn+)data}" )
% print -rl "${arr[@]}"
asmcmds+=(
${(o)$(cgasm -f '.*' | perl -alne '
                BEGIN{ my @cmds; }
                push @cmds, split(/ /, lc $F[0] =~ y|/| |r);
                END{ print join " ", @cmds;}'

%

On 5.3 (has the patch):

% data=$(<syntax.txt)
% arr=( "${(Z+cn+)data}" )
% print -rl "${arr[@]}"
asmcmds+=(
${(o)$(cgasm -f '.*' | perl -alne '
                BEGIN{ my @cmds; }
                push @cmds, split(/ /, lc $F[0] =~ y|/| |r);
                END{ print join " ", @cmds;}'
        )})
dbpkgs+=(${(fo@)$(pacman -Qq)})
%

Then, doing on 5.3: 

% a='dbpkgs+=(${(fo@)$(pacman -Qq)})'
% print -rl "${(Z+cn+)a}"
dbpkgs+=(
${(fo@)$(pacman -Qq)}
)

So something is still wrong?

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-10 15:03 [BUG] In reference to patch 39815, about (z) flag and $( parse error Sebastian Gniazdowski
@ 2017-10-10 16:19 ` Sebastian Gniazdowski
  2017-10-11  3:49   ` Bart Schaefer
       [not found]   ` <etPan.59dd94f1.69ba7f51.98a8@AirmailxGenerated.am>
  0 siblings, 2 replies; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-10 16:19 UTC (permalink / raw)
  To: zsh-workers

Spend some time on this (after debugging whole thing before). Following parses correctly with (z):

asmcmds+=(${(o)$(ls -1 | perl -alne 'echo foo')
})
dbpkgs+=(${(fo@)$(pacman -Qq)})

following doesn't:

asmcmds+=(${(o)$(ls -1 | perl -alne 'echo foo'
)})
dbpkgs+=(${(fo@)$(pacman -Qq)})

And looking further, it's about ")" being at the same line as $(, i.e. this doesn't parse with (z):

$(ls -1 | perl -alne 'echo foo'
)
dbpkgs+=(${(fo@)$(pacman -Qq)})

(last line to test if one gets following tokens or just text).
-- 
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-10 16:19 ` Sebastian Gniazdowski
@ 2017-10-11  3:49   ` Bart Schaefer
  2017-10-11  6:41     ` Peter Stephenson
       [not found]   ` <etPan.59dd94f1.69ba7f51.98a8@AirmailxGenerated.am>
  1 sibling, 1 reply; 23+ messages in thread
From: Bart Schaefer @ 2017-10-11  3:49 UTC (permalink / raw)
  To: Sebastian Gniazdowski, zsh-workers

On Oct 10,  6:19pm, Sebastian Gniazdowski wrote:
} Subject: Re: [BUG] In reference to patch 39815, about (z) flag and $( pars
}
} Following parses correctly with (z):
} 
} asmcmds+=(${(o)$(ls -1 | perl -alne 'echo foo')
} })
} dbpkgs+=(${(fo@)$(pacman -Qq)})
} 
} following doesn't:
} 
} asmcmds+=(${(o)$(ls -1 | perl -alne 'echo foo'
} )})
} dbpkgs+=(${(fo@)$(pacman -Qq)})
} 
} And looking further, it's about ")" being at the same line as $(

So the doc says that (Z:n:) "causes unquoted newlines to be treated as
ordinary whitespace" but in fact what it really does is cause newline
*not* to be converted into a separator; it is still labeled as its
own separate token, which confuses this parse.

The following fixes it, but it seems a hack -- is there a better way?

diff --git a/Src/lex.c b/Src/lex.c
index 8493d47..a7b71d6 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -308,8 +308,14 @@ zshlex(void)
 	isnewlin = 0;
     else
 	isnewlin = (inbufct) ? -1 : 1;
-    if (tok == SEMI || (tok == NEWLIN && !(lexflags & LEXFLAGS_NEWLINE)))
+    if (tok == SEMI)
 	tok = SEPER;
+    else if (tok == NEWLIN) {
+	if (lexflags & LEXFLAGS_NEWLINE)
+	    zshlex();
+	else
+	    tok = SEPER;
+    }
 }
 
 /**/


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-11  3:49   ` Bart Schaefer
@ 2017-10-11  6:41     ` Peter Stephenson
  0 siblings, 0 replies; 23+ messages in thread
From: Peter Stephenson @ 2017-10-11  6:41 UTC (permalink / raw)
  To: zsh-workers

On Tue, 10 Oct 2017 20:49:17 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> So the doc says that (Z:n:) "causes unquoted newlines to be treated as
> ordinary whitespace" but in fact what it really does is cause newline
> *not* to be converted into a separator; it is still labeled as its
> own separate token, which confuses this parse.
> 
> The following fixes it, but it seems a hack -- is there a better way?

To restrict it just to this case, you mean?  It ought to be possible to
detect specific flags.  But as long as it does something plausible I
don't think it matters.  The way it sits on top of the lexical analyser
is perpetual work in progress.

pws


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
       [not found]   ` <etPan.59dd94f1.69ba7f51.98a8@AirmailxGenerated.am>
@ 2017-10-11  8:31     ` Sebastian Gniazdowski
  2017-10-11 14:09       ` Sebastian Gniazdowski
  2017-10-11 17:02       ` Bart Schaefer
  0 siblings, 2 replies; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-11  8:31 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1607 bytes --]

On 11 Oct 2017 at 05:49:17, Bart Schaefer (schaefer@brasslantern.com) wrote:
> On Oct 10, 6:19pm, Sebastian Gniazdowski wrote:
> ...
> } And looking further, it's about ")" being at the same line as $(
>  
> So the doc says that (Z:n:) "causes unquoted newlines to be treated as
> ordinary whitespace" but in fact what it really does is cause newline
> *not* to be converted into a separator; it is still labeled as its
> own separate token, which confuses this parse.
>  
> The following fixes it, but it seems a hack -- is there a better way?
>  
> diff --git a/Src/lex.c b/Src/lex.c

I've tested the patch. It was easy because I have unit tests that check correctness of parsing of various zshrc files taken from Github. There's a problem with line:

ZSH_AUTOSUGGEST_CLEAR_WIDGETS+=("expand-or-complete")

Testing it manually is fine:

% data='ZSH_AUTOSUGGEST_CLEAR_WIDGETS+=("expand-or-complete")'; print -rl "${(Z+cn+)data}"
ZSH_AUTOSUGGEST_CLEAR_WIDGETS+=(
"expand-or-complete"
)

However one test fails on this line. I attach screenshot, left is dump of tokens on 5.3.1, right is on 5.4.2-dev-0 with the patch.

So basically the patch produces single token: ZSH_AUTOSUGGEST_CLEAR_WIDGETS+=("expand-or-complete").

One other problematic line is:   hosts=($h ${${${(@M)${(f)"$(cat ~/.ssh/config)"}:#Host *}#Host }:#*[*?]*})

The zshrc is:
https://github.com/zdharma/zplugin-crasis/blob/master/test/_test4/zshrc

The dump is easy to do: data=$(<zshrc); print -rl "${(Z+cn+)data}" > all-5.4.2-dev-0.txt, etc.

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org

[-- Attachment #2: tokdmp.png --]
[-- Type: image/png, Size: 60259 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-11  8:31     ` Sebastian Gniazdowski
@ 2017-10-11 14:09       ` Sebastian Gniazdowski
  2017-10-11 15:07         ` Sebastian Gniazdowski
  2017-10-11 17:02       ` Bart Schaefer
  1 sibling, 1 reply; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-11 14:09 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

Just checked that the zshrc parses correctly from zsh-5.0.6-dev-0 down, i.e. 5.0.3, 5.0.2, etc.

On 11 Oct 2017 at 10:31:00, Sebastian Gniazdowski (psprint@zdharma.org) wrote:
> ...
> The zshrc is:
> https://github.com/zdharma/zplugin-crasis/blob/master/test/_test4/zshrc  
> ...

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-11 14:09       ` Sebastian Gniazdowski
@ 2017-10-11 15:07         ` Sebastian Gniazdowski
  0 siblings, 0 replies; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-11 15:07 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

Sorry, I meant the original zshrc, which has the next-line $(-closing ')'.

    https://github.com/alyptik/dotfiles/blob/master/.zsh.d/.zshrc

So basically the Z+cn+ problem doesn't occur on zsh <= 5.0.6-dev-0.

On 11 Oct 2017 at 16:09:45, Sebastian Gniazdowski (psprint@zdharma.org) wrote:
> Just checked that the zshrc parses correctly from zsh-5.0.6-dev-0 down, i.e. 5.0.3,  
> 5.0.2, etc.
>  
> On 11 Oct 2017 at 10:31:00, Sebastian Gniazdowski (psprint@zdharma.org) wrote:
> > ...
> > The zshrc is:
> > https://github.com/zdharma/zplugin-crasis/blob/master/test/_test4/zshrc  
> > ...

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-11  8:31     ` Sebastian Gniazdowski
  2017-10-11 14:09       ` Sebastian Gniazdowski
@ 2017-10-11 17:02       ` Bart Schaefer
  2017-10-12 14:54         ` Sebastian Gniazdowski
  1 sibling, 1 reply; 23+ messages in thread
From: Bart Schaefer @ 2017-10-11 17:02 UTC (permalink / raw)
  To: zsh-workers

On Oct 11, 10:31am, Sebastian Gniazdowski wrote:
}
} However one test fails on this line.

Hmm, so the problem appears to be that when parsing a comment, we
actually need the NEWLIN token to mark the end of the comment.
When parsing code, we need NEWLIN to be ignored as whitespace and
not tokenized.

That is, in your example the ZSH_AUTOSUGGEST_CLEAR_WIDGETS line is
parsed incorrectly (with my patch) because the line preceding it is
a comment.

So I think the fix for this is going to have to be elsewhere, rather
than in zshlex().


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-11 17:02       ` Bart Schaefer
@ 2017-10-12 14:54         ` Sebastian Gniazdowski
  2017-10-12 15:26           ` Bart Schaefer
  0 siblings, 1 reply; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-12 14:54 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

On 11 października 2017 at 19:02:31, Bart Schaefer (schaefer@brasslantern.com) wrote:
> Hmm, so the problem appears to be that when parsing a comment, we
> actually need the NEWLIN token to mark the end of the comment.
> When parsing code, we need NEWLIN to be ignored as whitespace and
> not tokenized.
>  
> That is, in your example the ZSH_AUTOSUGGEST_CLEAR_WIDGETS line is
> parsed incorrectly (with my patch) because the line preceding it is
> a comment.
>  
> So I think the fix for this is going to have to be elsewhere, rather
> than in zshlex().

Have you any possible places for the fix? I've ran bisect, and the "<= 5.0.6 doesn't have Z+cn+ problem" holds up to this patch:

commit c0d01a6fe0c67911650730cf13a2b9a0db16e59b
Date:   Tue Jan 6 17:05:17 2015 +0000

    Fix command substitutions to parse contents as they are read in.

    Do this by refactoring misnamed lexsave()/lexrestore() to allow
    continuity of history and input.

So accurately it's 5.0.7-dev-1. Looks old, but 2015 isn't that far away.

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-12 14:54         ` Sebastian Gniazdowski
@ 2017-10-12 15:26           ` Bart Schaefer
  2017-10-12 15:50             ` Peter Stephenson
  2017-10-13  7:53             ` Sebastian Gniazdowski
  0 siblings, 2 replies; 23+ messages in thread
From: Bart Schaefer @ 2017-10-12 15:26 UTC (permalink / raw)
  To: zsh-workers

On Thu, Oct 12, 2017 at 7:54 AM, Sebastian Gniazdowski
<psprint@zdharma.org> wrote:
> On 11 października 2017 at 19:02:31, Bart Schaefer (schaefer@brasslantern.com) wrote:
>> Hmm, so the problem appears to be that when parsing a comment, we
>> actually need the NEWLIN token to mark the end of the comment.
>> When parsing code, we need NEWLIN to be ignored as whitespace and
>> not tokenized.
>
> Have you any possible places for the fix?

It'd most likely need to be in bufferwords(), which somehow needs to
know that the STRING token returned from the lexer is [or not] a
comment and that therefore it should [or not] ignore the subsequent
NEWLIN.  However, that cascades all the way down into gettok() which
is where the "#" character is recognized and everything from there up
to but not including the newline is consumed.

> I've ran bisect, and the "<= 5.0.6 doesn't have Z+cn+ problem" holds up to this patch:
>
> commit c0d01a6fe0c67911650730cf13a2b9a0db16e59b
> Date:   Tue Jan 6 17:05:17 2015 +0000
>
>     Fix command substitutions to parse contents as they are read in.

(PWS, it would be helpful if you could include workers article numbers
in your commit logs.)

That commit is this:

        * 34160: Src/init.c, Src/input.c, Src/lex.c, Src/parse.c,
        Src/zsh.h, Test/D08cmdsubst.ztst: fix the problem that command
        and similar substitutions weren't properly parsed so could end
        prematurely.  Use improved resolution in context save and restore
        to allow parsing the substitution while tracking the string.

So it's not that 5.0.6 was correct, it's that we've swapped one
problem [that affects more than just (Z:n:)] for another.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-12 15:26           ` Bart Schaefer
@ 2017-10-12 15:50             ` Peter Stephenson
  2017-10-13  7:56               ` Bart Schaefer
  2017-10-13  7:53             ` Sebastian Gniazdowski
  1 sibling, 1 reply; 23+ messages in thread
From: Peter Stephenson @ 2017-10-12 15:50 UTC (permalink / raw)
  To: zsh-workers

On Thu, 12 Oct 2017 08:26:35 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> It'd most likely need to be in bufferwords(), which somehow needs to
> know that the STRING token returned from the lexer is [or not] a
> comment and that therefore it should [or not] ignore the subsequent
> NEWLIN.  However, that cascades all the way down into gettok() which
> is where the "#" character is recognized and everything from there up
> to but not including the newline is consumed.

The traditional fix for this is a hacky state variable.  We'd
incremented it (or set it to 1?) on the comment character, and decrement
it (or set it to 0?) at the newline.  But, if I'm following, either we
might need to decrement it after the newline has been processed, or
tweak bufferwords() to add an extra state for detecting the newline
when it knows it's in that state, which implies an extra stage of the
dance.  It's not very elegant but there is lots of prior art in
terms of signalling between gettok() and the parser.

> (PWS, it would be helpful if you could include workers article numbers
> in your commit logs.)

Most of them do, except the ones anyone actually looks at.

pws


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-12 15:26           ` Bart Schaefer
  2017-10-12 15:50             ` Peter Stephenson
@ 2017-10-13  7:53             ` Sebastian Gniazdowski
  2017-10-13  8:04               ` Bart Schaefer
  1 sibling, 1 reply; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-13  7:53 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

On 12 Oct 2017 at 17:26:35, Bart Schaefer (schaefer@brasslantern.com) wrote:
> It'd most likely need to be in bufferwords(), which somehow needs to
> know that the STRING token returned from the lexer is [or not] a
> comment and that therefore it should [or not] ignore the subsequent
> NEWLIN. However, that cascades all the way down into gettok() which
> is where the "#" character is recognized and everything from there up
> to but not including the newline is consumed.

Sounds complicated. Good that parsing problem can be detected by buf+=$'\ntest' and checking if last element returned by (z)/(Z+cn+) is "test".

> So it's not that 5.0.6 was correct, it's that we've swapped one
> problem [that affects more than just (Z:n:)] for another.

I was hoping that the goal of this patch wasn't exactly equal to what it did with (Z+cn+), i.e. that it accidentally changed the flag's behavior.

-- 
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-12 15:50             ` Peter Stephenson
@ 2017-10-13  7:56               ` Bart Schaefer
  2017-10-13  9:36                 ` Peter Stephenson
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Schaefer @ 2017-10-13  7:56 UTC (permalink / raw)
  To: zsh-workers

On Thu, Oct 12, 2017 at 8:50 AM, Peter Stephenson
<p.stephenson@samsung.com> wrote:
> On Thu, 12 Oct 2017 08:26:35 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
>> It'd most likely need to be in bufferwords(), which somehow needs to
>> know that the STRING token returned from the lexer is [or not] a
>> comment
>
> The traditional fix for this is a hacky state variable.

After beating my head against this for a while and observing
confusingly different (and sometimes confusingly NOT different)
behavior with various attempts to track whether we were scanning
comment or code, I finally went back to the original commit c0d01a6f,
and realized that it rewrote skipcomm() which is called from down in
the guts of gettok() via gettokstr().  After I got there it became
ridiculously easy.  (As they say, a $1000 job -- $1 for the missing
screw, $999 for knowing where to put it.)

diff --git a/Src/lex.c b/Src/lex.c
index 8493d47..e0190af 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -2128,8 +2128,12 @@ skipcomm(void)
      * function at the history layer --- this is consistent with the
      * intention of maintaining the history and input layers across
      * the recursive parsing.
+     *
+     * Also turn off LEXFLAGS_NEWLINE because this is already skipping
+     * across the entire construct, and parse_event() needs embedded
+     * newlines to be "real" when looking for the OUTPAR token.
      */
-    lexflags &= ~LEXFLAGS_ZLE;
+    lexflags &= ~(LEXFLAGS_ZLE|LEXFLAGS_NEWLINE);
     dbparens = 0;      /* restored by zcontext_restore_partial() */

     if (!parse_event(OUTPAR) || tok != OUTPAR) {


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-13  7:53             ` Sebastian Gniazdowski
@ 2017-10-13  8:04               ` Bart Schaefer
  0 siblings, 0 replies; 23+ messages in thread
From: Bart Schaefer @ 2017-10-13  8:04 UTC (permalink / raw)
  To: zsh-workers

On Fri, Oct 13, 2017 at 12:53 AM, Sebastian Gniazdowski
<psprint@zdharma.org> wrote:
>
> I was hoping that the goal of this patch wasn't exactly equal to what it did with (Z+cn+), i.e. that it accidentally changed the flag's behavior.

That *is* what happened, but that doesn't always mean that it
*directly* changed the behavior.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-13  7:56               ` Bart Schaefer
@ 2017-10-13  9:36                 ` Peter Stephenson
  2017-10-13 20:55                   ` Bart Schaefer
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Stephenson @ 2017-10-13  9:36 UTC (permalink / raw)
  To: zsh-workers

On Fri, 13 Oct 2017 00:56:53 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> diff --git a/Src/lex.c b/Src/lex.c
> index 8493d47..e0190af 100644
> --- a/Src/lex.c
> +++ b/Src/lex.c
> @@ -2128,8 +2128,12 @@ skipcomm(void)
>       * function at the history layer --- this is consistent with the
>       * intention of maintaining the history and input layers across
>       * the recursive parsing.
> +     *
> +     * Also turn off LEXFLAGS_NEWLINE because this is already skipping
> +     * across the entire construct, and parse_event() needs embedded
> +     * newlines to be "real" when looking for the OUTPAR token.
>       */
> -    lexflags &= ~LEXFLAGS_ZLE;
> +    lexflags &= ~(LEXFLAGS_ZLE|LEXFLAGS_NEWLINE);
>      dbparens = 0;      /* restored by zcontext_restore_partial() */
> 
>      if (!parse_event(OUTPAR) || tok != OUTPAR) {

Makes sense, thanks --- presumably we get away with the other bits we've
added to lexflags?

pws


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-13  9:36                 ` Peter Stephenson
@ 2017-10-13 20:55                   ` Bart Schaefer
  2017-10-14 14:44                     ` Sebastian Gniazdowski
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Schaefer @ 2017-10-13 20:55 UTC (permalink / raw)
  To: zsh-workers

On Oct 13, 10:36am, Peter Stephenson wrote:
}
} presumably we get away with the other bits we've added to lexflags?

"make check" doesn't find any that we don't get away with, in any case.

Sebastian, if you have a minimal test case for this perhaps we could
add it to D04parameter?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-13 20:55                   ` Bart Schaefer
@ 2017-10-14 14:44                     ` Sebastian Gniazdowski
  2017-10-15  1:53                       ` Bart Schaefer
  0 siblings, 1 reply; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-14 14:44 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1444 bytes --]

On 13 października 2017 at 22:55:10, Bart Schaefer (schaefer@brasslantern.com) wrote:
> On Oct 13, 10:36am, Peter Stephenson wrote:
> }
> } presumably we get away with the other bits we've added to lexflags?
>  
> "make check" doesn't find any that we don't get away with, in any case.
>  
> Sebastian, if you have a minimal test case for this perhaps we could
> add it to D04parameter?
>  

Maybe this:

  local buf=$'asmcmds+=(${(o)$(ls -1 \'.*\' | perl -alne \'\nEND{ print " "; }\'\n)})'
  local -a arr=( "${(@)${(@Z+cn+)buf}/(#m)?/-> $MATCH}" )
  print -rl -- "$arr[@]"
0:$( being closed at next line
>-> asmcmds+=(
>-> ${(o)$(ls -1 '.*' | perl -alne '
>END{ print " "; }'
>)}
>-> )

it precedes lines with "->" to know what's token and what's in-token line break. Attached as diff.

I've ran my unit tests and screened token dumps and everything looks correct.

But I've spotted other thing. Z+cn+ seems to parse "always", (z) seems to parse superficially when it occurs e.g. anonymous function. This doesn't depend on Zsh version used. Attached is diff of token dumps from the zshrc (the one with cgasm), on left we see normal parsing (Z+cn+ flag), on right a shift to "coarse-grained mode" (z-flag) when enterring anon-fun. This doesn't feel right, although I'm happy with (z) flag as I'm using Z+cn+ (just being anxious for bigger changes here).

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org

[-- Attachment #2: Zcn_test.diff.txt --]
[-- Type: text/plain, Size: 607 bytes --]

diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst
index 5ffaaa1..05f886b 100644
--- a/Test/D04parameter.ztst
+++ b/Test/D04parameter.ztst
@@ -2401,3 +2401,13 @@ F:behavior, see http://austingroupbugs.net/view.php?id=888
 0:Append to element of associative array on creation
 >one
 >howsyourfathertoday
+
+  local buf=$'asmcmds+=(${(o)$(ls -1 \'.*\' | perl -alne \'\nEND{ print " "; }\'\n)})'
+  local -a arr=( "${(@)${(@Z+cn+)buf}/(#m)?/-> $MATCH}" )
+  print -rl -- "$arr[@]"
+0:$( being closed at next line
+>-> asmcmds+=(
+>-> ${(o)$(ls -1 '.*' | perl -alne '
+>END{ print " "; }'
+>)}
+>-> )

[-- Attachment #3: z_vs_zcn.png --]
[-- Type: image/png, Size: 32088 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-14 14:44                     ` Sebastian Gniazdowski
@ 2017-10-15  1:53                       ` Bart Schaefer
  2017-10-15 14:31                         ` Sebastian Gniazdowski
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Schaefer @ 2017-10-15  1:53 UTC (permalink / raw)
  To: zsh-workers

On Oct 14,  4:44pm, Sebastian Gniazdowski wrote:
}
} But I've spotted other thing. Z+cn+ seems to parse "always", (z) seems
} to parse superficially when it occurs e.g. anonymous function.

I'm not able to reproduce this with a simple test case, e.g.:

torch% buf='#foo                
quote> () {           
quote>  echo baz  
quote>  }    
quote> #done'
torch% printf '<%s>\n' ${(z)buf}
<#foo>
<;>
<()>
<{>
<;>
<echo>
<baz>
<;>
<}>
<;>
<#done>

I'm therefore suspicious that something else is involved in your example.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-15  1:53                       ` Bart Schaefer
@ 2017-10-15 14:31                         ` Sebastian Gniazdowski
  2017-10-15 18:09                           ` Bart Schaefer
  0 siblings, 1 reply; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-15 14:31 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

On 15 października 2017 at 03:53:01, Bart Schaefer (schaefer@brasslantern.com) wrote:
> torch% printf '<%s>\n' ${(z)buf}

Btw, interesting use of printf.

> I'm therefore suspicious that something else is involved in your example.

It's rather not a bug after all. The cause is an apostrophe in comment. But (z) isn't comment-aware, so this is fine. Basically this parses "incorrectly":

# '
() {
        au_arr+=(expand-absolute-path up-line-or-beginning-search)
}

Tokens are:

<#>
<'
() {
        au_arr+=(expand-absolute-path up-line-or-beginning-search)
}>

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-15 14:31                         ` Sebastian Gniazdowski
@ 2017-10-15 18:09                           ` Bart Schaefer
  2017-10-15 18:44                             ` Daniel Shahaf
  2017-10-17  6:34                             ` Sebastian Gniazdowski
  0 siblings, 2 replies; 23+ messages in thread
From: Bart Schaefer @ 2017-10-15 18:09 UTC (permalink / raw)
  To: zsh-workers

On Oct 15,  4:31pm, Sebastian Gniazdowski wrote:
}
} It's rather not a bug after all. The cause is an apostrophe in comment.
} 
} # '
} () {
}         au_arr+=(expand-absolute-path up-line-or-beginning-search)
} }

Interesting.

This points out a different issue:

torch% setopt cshjunkiequotes
torch% printf '<%s>\n' ${(z)buf}
<#>
<'>
torch% 

Should the parse abort there?  Other parse errors don't cause (z) to
fail when cshjunkiequotes is not set.  (At least it should not abort
silently, but (z) suppresses parse error messages.)

If the parse should not abort, should it act as if cshjunkiequotes
were not set, or should it treat the line from the quote to the end
as a STRING token and then resume parsing on the next line?

The patch below implements that latter.


diff --git a/Src/lex.c b/Src/lex.c
index e0190af..c2a5966 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -1291,7 +1291,9 @@ gettokstr(int c, int sub)
 		ALLOWHIST
 		if (c != '\'') {
 		    unmatched = '\'';
-		    peek = LEXERR;
+		    /* Not an error when called from bufferwords() */
+		    if (!(lexflags & LEXFLAGS_ACTIVE))
+			peek = LEXERR;
 		    cmdpop();
 		    goto brk;
 		}
@@ -1313,7 +1315,9 @@ gettokstr(int c, int sub)
 	    cmdpop();
 	    if (c) {
 		unmatched = '"';
-		peek = LEXERR;
+		/* Not an error when called from bufferwords() */
+		if (!(lexflags & LEXFLAGS_ACTIVE))
+		    peek = LEXERR;
 		goto brk;
 	    }
 	    c = Dnull;
@@ -1350,7 +1354,9 @@ gettokstr(int c, int sub)
 	    cmdpop();
 	    if (c != '`') {
 		unmatched = '`';
-		peek = LEXERR;
+		/* Not an error when called from bufferwords() */
+		if (!(lexflags & LEXFLAGS_ACTIVE))
+		    peek = LEXERR;
 		goto brk;
 	    }
 	    c = Tick;
@@ -1392,7 +1398,7 @@ gettokstr(int c, int sub)
 	return LEXERR;
     }
     hungetc(c);
-    if (unmatched)
+    if (unmatched && !(lexflags & LEXFLAGS_ACTIVE))
 	zerr("unmatched %c", unmatched);
     if (in_brace_param) {
 	while(bct-- >= in_brace_param)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-15 18:09                           ` Bart Schaefer
@ 2017-10-15 18:44                             ` Daniel Shahaf
  2017-10-15 19:59                               ` Bart Schaefer
  2017-10-17  6:34                             ` Sebastian Gniazdowski
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Shahaf @ 2017-10-15 18:44 UTC (permalink / raw)
  To: zsh-workers

Bart Schaefer wrote on Sun, 15 Oct 2017 11:09 -0700:
> On Oct 15,  4:31pm, Sebastian Gniazdowski wrote:
> }
> } It's rather not a bug after all. The cause is an apostrophe in comment.
> } 
> } # '
> } () {
> }         au_arr+=(expand-absolute-path up-line-or-beginning-search)
> } }
> 
> Interesting.

To be clear, the parse in the grandparent message is correct for an
interactive shell with interactivecomments disabled.  (Well, "correct"
considering that the final, multiline string, token is incomplete.)

(z-sy-h uses ${(z)} to parse $PREBUFFER$BUFFER.)

> This points out a different issue:
> 
> torch% setopt cshjunkiequotes
> torch% printf '<%s>\n' ${(z)buf}


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-15 18:44                             ` Daniel Shahaf
@ 2017-10-15 19:59                               ` Bart Schaefer
  0 siblings, 0 replies; 23+ messages in thread
From: Bart Schaefer @ 2017-10-15 19:59 UTC (permalink / raw)
  To: zsh-workers

On Oct 15,  6:44pm, Daniel Shahaf wrote:
}
} > On Oct 15,  4:31pm, Sebastian Gniazdowski wrote:
} > }
} > } It's rather not a bug after all. The cause is an apostrophe in comment.
} > } 
} > } # '
} > } () {
} > }         au_arr+=(expand-absolute-path up-line-or-beginning-search)
} > } }
} 
} To be clear, the parse in the grandparent message is correct for an
} interactive shell with interactivecomments disabled.

The (z) flag doesn't distinguish interactive or interactivecomments; it
never interprets comments.  That's one reason why (Z:c:) got added.

There probably ought to be (Z) options for rcquotes and rcexpandparam
as well.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG] In reference to patch 39815, about (z) flag and $( parse error
  2017-10-15 18:09                           ` Bart Schaefer
  2017-10-15 18:44                             ` Daniel Shahaf
@ 2017-10-17  6:34                             ` Sebastian Gniazdowski
  1 sibling, 0 replies; 23+ messages in thread
From: Sebastian Gniazdowski @ 2017-10-17  6:34 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

On 15 października 2017 at 20:09:46, Bart Schaefer (schaefer@brasslantern.com) wrote:
> This points out a different issue:
>  
> torch% setopt cshjunkiequotes
> torch% printf '<%s>\n' ${(z)buf}
> <#>
> <'>
> torch%
>  
> Should the parse abort there? Other parse errors don't cause (z) to
> fail when cshjunkiequotes is not set. (At least it should not abort
> silently, but (z) suppresses parse error messages.)
>  
> If the parse should not abort, should it act as if cshjunkiequotes
> were not set, or should it treat the line from the quote to the end
> as a STRING token and then resume parsing on the next line?
>  
> The patch below implements that latter.

I cannot fully understand, but it sounds like if it was a conservative approach. I agree that parse should not abort on error and include text that follows (not sure how it would resume on next "line"). Hmm but now I think I understand. cshjunkiequotes doesn't allow newlines in strings, so indeed it is a valid point that on error it should consume to end of *line*, not end of parsed *buffor*.

I've run the unit tests and the latest patch didn't influence them.

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-10-17  6:42 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-10 15:03 [BUG] In reference to patch 39815, about (z) flag and $( parse error Sebastian Gniazdowski
2017-10-10 16:19 ` Sebastian Gniazdowski
2017-10-11  3:49   ` Bart Schaefer
2017-10-11  6:41     ` Peter Stephenson
     [not found]   ` <etPan.59dd94f1.69ba7f51.98a8@AirmailxGenerated.am>
2017-10-11  8:31     ` Sebastian Gniazdowski
2017-10-11 14:09       ` Sebastian Gniazdowski
2017-10-11 15:07         ` Sebastian Gniazdowski
2017-10-11 17:02       ` Bart Schaefer
2017-10-12 14:54         ` Sebastian Gniazdowski
2017-10-12 15:26           ` Bart Schaefer
2017-10-12 15:50             ` Peter Stephenson
2017-10-13  7:56               ` Bart Schaefer
2017-10-13  9:36                 ` Peter Stephenson
2017-10-13 20:55                   ` Bart Schaefer
2017-10-14 14:44                     ` Sebastian Gniazdowski
2017-10-15  1:53                       ` Bart Schaefer
2017-10-15 14:31                         ` Sebastian Gniazdowski
2017-10-15 18:09                           ` Bart Schaefer
2017-10-15 18:44                             ` Daniel Shahaf
2017-10-15 19:59                               ` Bart Schaefer
2017-10-17  6:34                             ` Sebastian Gniazdowski
2017-10-13  7:53             ` Sebastian Gniazdowski
2017-10-13  8:04               ` Bart Schaefer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).