zsh-workers
 help / color / mirror / code / Atom feed
* (Y) modifier: up to N matches?
@ 2014-06-02 18:23 Daniel Shahaf
  2014-06-03  3:46 ` Bart Schaefer
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Shahaf @ 2014-06-02 18:23 UTC (permalink / raw)
  To: zsh-workers

Right now,   *(NY) expands to either zero or one filenames.

Would it make sense to have (Y) take a numeric argument specifying the maximal
number of files to match?  e.g., *(NY5) would expand to between 0 and 5
filenames (but never more than 5).

I think it will be a small code change (just check matchct when deciding
whether to return) and will work nicely with other qualifiers (eg sorting)
without special effort.  It would be a superset of the current functionality
[(NY1) would be equivalent to the current (NY)].

I imagine it would be useful in cases where 'find | head' or 'grep | head' are
useful.

Thoughts?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: (Y) modifier: up to N matches?
  2014-06-02 18:23 (Y) modifier: up to N matches? Daniel Shahaf
@ 2014-06-03  3:46 ` Bart Schaefer
  2014-06-04  2:08   ` [PATCH] " Daniel Shahaf
  0 siblings, 1 reply; 15+ messages in thread
From: Bart Schaefer @ 2014-06-03  3:46 UTC (permalink / raw)
  To: zsh-workers

On Jun 2,  6:23pm, Daniel Shahaf wrote:
}
} Would it make sense to have (Y) take a numeric argument specifying the
} maximal number of files to match? e.g., *(NY5) would expand to between
} 0 and 5 filenames (but never more than 5).

My concern is that people are going to expect the (o)/(O) qualifiers to
take effect before (Y) does, and will be confused about the "skipped"
files when it takes effect after.  If (Y) can't return more than one
result, there's nothing to sort.

However, I can't come up with any other objection.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-03  3:46 ` Bart Schaefer
@ 2014-06-04  2:08   ` Daniel Shahaf
  2014-06-04  6:01     ` Mikael Magnusson
  2014-06-04  6:42     ` Bart Schaefer
  0 siblings, 2 replies; 15+ messages in thread
From: Daniel Shahaf @ 2014-06-04  2:08 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1448 bytes --]

Bart Schaefer wrote on Mon, Jun 02, 2014 at 20:46:03 -0700:
> On Jun 2,  6:23pm, Daniel Shahaf wrote:
> }
> } Would it make sense to have (Y) take a numeric argument specifying the
> } maximal number of files to match? e.g., *(NY5) would expand to between
> } 0 and 5 filenames (but never more than 5).
> 
> My concern is that people are going to expect the (o)/(O) qualifiers to
> take effect before (Y) does, and will be confused about the "skipped"
> files when it takes effect after.  If (Y) can't return more than one
> result, there's nothing to sort.

The expectations about sorting are just as much of a problem with (Y) as
they would be with (Y42): in both cases there might be "skipped" files.
For example, in the source dir, *(Y/on) [or *(Y1/on)] might result in "Etc"
even though "Doc" exists.

Let's clarify that in the documentation of (Y).

> However, I can't come up with any other objection.

Two patches attached.  The first patch implements (Y42) and adds completion
and the above-mentioned docs clarification.  (The two latter parts should be
useful even if we don't add an argument to (Y).)  The second patch isn't
strictly required, but it cleans up the return type changes that are no
longer needed after the first patch.

FWIW, I'm intentionally making (Y) without argument an error; we can settle
on its semantics later after (Y42) has seen some "in the field" use.  The
spelling (Y1) can be used instead.

Cheers,

Daniel

[-- Attachment #2: 0001-Teach-Y-to-take-an-argument.patch --]
[-- Type: text/x-patch, Size: 5947 bytes --]

>From d4bd736da686b5927a3c655d3e25f9d0b2fc7f99 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@daniel.shahaf.name>
Date: Mon, 2 Jun 2014 19:14:10 +0000
Subject: [PATCH 1/2] Teach (Y) to take an argument

---
 Completion/Zsh/Type/_globquals |    1 +
 Doc/Zsh/expn.yo                |    7 ++++---
 Src/glob.c                     |   34 +++++++++++++++++++++++++++-------
 Test/D02glob.ztst              |   23 ++++++++++++++---------
 4 files changed, 46 insertions(+), 19 deletions(-)

diff --git a/Completion/Zsh/Type/_globquals b/Completion/Zsh/Type/_globquals
index c98bd0c..37db161 100644
--- a/Completion/Zsh/Type/_globquals
+++ b/Completion/Zsh/Type/_globquals
@@ -251,6 +251,7 @@ case $state in
     "o:+ sort order, up"
     "O:+ sort order, down"
     "P:prepend word"
+    "Y:+ at most ARG matches"
     "[:+ range of files"
     "):end of qualifiers"
     "\::modifier"
diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 25247f9..2f91fec 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -2564,9 +2564,10 @@ item(tt(n))(
 sets the tt(NUMERIC_GLOB_SORT) option for the current pattern
 pindex(NUMERIC_GLOB_SORT, setting in pattern)
 )
-item(tt(Y))(
-enables short-circuit mode: the pattern will expand to just the first
-matching filename, if any.
+item(tt(Y)var(n))(
+enables short-circuit mode: the pattern will expand to at most var(n)
+filenames.  If more than var(n) matches exist, only the first var(n)
+matches in directory traversal order will be considered.
 )
 item(tt(o)var(c))(
 specifies how the names of the files should be sorted. If var(c) is
diff --git a/Src/glob.c b/Src/glob.c
index 0ca63fc..33b49f1 100644
--- a/Src/glob.c
+++ b/Src/glob.c
@@ -473,7 +473,8 @@ scanner(Complist q, int shortcircuit)
 	if (q->closure == 2)	/* (foo/)## - match one or more dirs */
 	    q->closure = 1;
 	else
-	    if (scanner(q->next, shortcircuit) == 1)
+	    scanner(q->next, shortcircuit);
+	    if (shortcircuit && shortcircuit == matchct)
 		return 1;
     }
     p = q->pat;
@@ -520,7 +521,8 @@ scanner(Complist q, int shortcircuit)
 		if (add) {
 		    addpath(str, l);
 		    if (!closure || !statfullpath("", NULL, 1))
-			if (scanner((q->closure) ? q : q->next, shortcircuit) == 1)
+			scanner((q->closure) ? q : q->next, shortcircuit);
+			if (shortcircuit && shortcircuit == matchct)
 			    return 1;
 		    pathbuf[pathpos = oppos] = '\0';
 		}
@@ -528,7 +530,8 @@ scanner(Complist q, int shortcircuit)
 	} else {
 	    if (str[l])
 		str = dupstrpfx(str, l);
-	    if (insert(str, 0) == 1 && shortcircuit)
+	    insert(str, 0);
+	    if (shortcircuit && shortcircuit == matchct)
 		return 1;
 	}
     } else {
@@ -619,7 +622,8 @@ scanner(Complist q, int shortcircuit)
 		    subdirlen += sizeof(int);
 		} else
 		    /* if the last filename component, just add it */
-		    if (insert(fn, 1) == 1 && shortcircuit)
+		    insert(fn, 1);
+		    if (shortcircuit && shortcircuit == matchct)
 			return 1;
 	    }
 	}
@@ -633,7 +637,9 @@ scanner(Complist q, int shortcircuit)
 		fn += l + 1;
 		memcpy((char *)&errsfound, fn, sizeof(int));
 		fn += sizeof(int);
-		if (scanner((q->closure) ? q : q->next, shortcircuit) == 1) /* scan next level */
+		/* scan next level */
+		scanner((q->closure) ? q : q->next, shortcircuit); 
+		if (shortcircuit && shortcircuit == matchct)
 		    return 1;
 		pathbuf[pathpos = oppos] = '\0';
 	    }
@@ -1150,7 +1156,8 @@ zglob(LinkList list, LinkNode np, int nountok)
 					/* and index+1 of the last match */
     struct globdata saved;		/* saved glob state              */
     int nobareglob = !isset(BAREGLOBQUAL);
-    int shortcircuit = 0;
+    int shortcircuit = 0;		/* How many files to match;      */
+					/* 0 means no limit              */
 
     if (unset(GLOBOPT) || !haswilds(ostr) || unset(EXECOPT)) {
 	if (!nountok)
@@ -1502,9 +1509,22 @@ zglob(LinkList list, LinkNode np, int nountok)
 		    gf_numsort = !(sense & 1);
 		    break;
 		case 'Y':
-		    /* Short circuit: just check if there are any matches */
+		{
+		    /* Short circuit: limit number of matches */
+		    const char *s_saved = s;
 		    shortcircuit = !(sense & 1);
+		    if (shortcircuit) {
+			/* Parse the argument. */
+			data = qgetnum(&s);
+			if ((shortcircuit = data) != data) {
+			    /* Integer overflow */
+			    zerr("value too big: Y%s", s_saved);
+			    restore_globstate(saved);
+			    return;
+			}
+		    }
 		    break;
+		}
 		case 'a':
 		    /* Access time in given range */
 		    g_amc = 0;
diff --git a/Test/D02glob.ztst b/Test/D02glob.ztst
index 9e29de2..c00bbe3 100644
--- a/Test/D02glob.ztst
+++ b/Test/D02glob.ztst
@@ -543,17 +543,22 @@
 >Multiple files matched
 >Normal string if nullglob not set
 
- (){ print $#@ } glob.tmp/dir*(Y)
- (){ print $#@ } glob.tmp/file*(NY)
- (){ [[ $1 = glob.tmp/dir? ]] && echo "(Y) returns a matching filename" } glob.tmp/dir*(Y)
- # Can be negated
- (){ print $@:t } glob.tmp/dir*(Y^Y)
- (){ [[ $#@ -eq 1 ]] && print Globs before last path component } glob.tmp/dir?/subdir(NY)
- (){ [[ $#@ -eq 0 ]] && print Respects qualifiers } glob.tmp/dir?/subdir(NY.)
+ (){ print $#@ } glob.tmp/dir*(Y1)
+ (){ print $#@ } glob.tmp/file*(NY1)
+ (){ [[ "$*" == */dir?\ */dir? ]] && print Returns matching filenames } glob.tmp/dir*(Y2)
+ (){ print "Limit is upper bound:" $@:t } glob.tmp/dir*(Y5)
+ (){ print "Negated:" $@:t } glob.tmp/dir*(Y1^Y)
+ (){ print "Sorting:" $@:t } glob.tmp/dir*(Y4On)
+ (){ [[ $#@ -eq 1 ]] && print Globs before last path component } glob.tmp/dir?/subdir(NY1)
+ (){ [[ $#@ -eq 0 ]] && print Respects qualifiers } glob.tmp/dir*(NY1.)
+ (print -- *(Y)) 2>/dev/null || print "Argument required"
 0:short-circuit modifier
 >1
 >0
->(Y) returns a matching filename
->dir1 dir2 dir3 dir4
+>Returns matching filenames
+>Limit is upper bound: dir1 dir2 dir3 dir4
+>Negated: dir1 dir2 dir3 dir4
+>Sorting: dir4 dir3 dir2 dir1
 >Globs before last path component
 >Respects qualifiers
+>Argument required
-- 
1.7.10.4


[-- Attachment #3: 0002-glob.c-Undo-now-unneeded-return-type-changes.patch --]
[-- Type: text/x-patch, Size: 4700 bytes --]

>From b3ad6fe33263861671e19329d3a773128b220358 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@daniel.shahaf.name>
Date: Mon, 2 Jun 2014 23:11:37 +0000
Subject: [PATCH 2/2] glob.c: Undo now-unneeded return type changes

---
 Src/glob.c |   40 +++++++++++++++++++---------------------
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/Src/glob.c b/Src/glob.c
index 33b49f1..c74a560 100644
--- a/Src/glob.c
+++ b/Src/glob.c
@@ -297,16 +297,15 @@ statfullpath(const char *s, struct stat *st, int l)
 
 char **inserts;
 
-/* add a match to the list.  Return 1 if it was inserted, 0 otherwise. */
+/* add a match to the list */
 
 /**/
-static int
+static void
 insert(char *s, int checked)
 {
     struct stat buf, buf2, *bp;
     char *news = s;
     int statted = 0;
-    int inserted = 0;
 
     queue_signals();
     inserts = NULL;
@@ -317,7 +316,7 @@ insert(char *s, int checked)
 	checked = statted = 1;
 	if (statfullpath(s, &buf, 1)) {
 	    unqueue_signals();
-	    return inserted;
+	    return;
 	}
 	mode = buf.st_mode;
 	if (gf_follow) {
@@ -341,7 +340,7 @@ insert(char *s, int checked)
 
 	if (!statted && statfullpath(s, &buf, 1)) {
 	    unqueue_signals();
-	    return inserted;
+	    return;
 	}
 	news = dyncat(pathbuf, news);
 
@@ -366,7 +365,7 @@ insert(char *s, int checked)
 		/* Try next alternative, or return if there are no more */
 		if (!(qo = qo->or)) {
 		    unqueue_signals();
-		    return inserted;
+		    return;
 		}
 		qn = qo;
 		continue;
@@ -376,7 +375,7 @@ insert(char *s, int checked)
     } else if (!checked) {
 	if (statfullpath(s, NULL, 1)) {
 	    unqueue_signals();
-	    return inserted;
+	    return;
 	}
 	statted = 1;
 	news = dyncat(pathbuf, news);
@@ -436,7 +435,6 @@ insert(char *s, int checked)
 	}
 	matchptr++;
 
-	inserted = 1;
 	if (++matchct == matchsz) {
 	    matchbuf = (Gmatch )realloc((char *)matchbuf,
 					sizeof(struct gmatch) * (matchsz *= 2));
@@ -447,7 +445,7 @@ insert(char *s, int checked)
 	    break;
     }
     unqueue_signals();
-    return inserted;
+    return;
 }
 
 /* Do the globbing:  scanner is called recursively *
@@ -455,7 +453,7 @@ insert(char *s, int checked)
  * tried all of it.                                */
 
 /**/
-static int
+static void
 scanner(Complist q, int shortcircuit)
 {
     Patprog p;
@@ -466,7 +464,7 @@ scanner(Complist q, int shortcircuit)
 
     init_dirsav(&ds);
     if (!q)
-	return -1;
+	return;
 
     if ((closure = q->closure)) {
 	/* (foo/)# - match zero or more dirs */
@@ -475,7 +473,7 @@ scanner(Complist q, int shortcircuit)
 	else
 	    scanner(q->next, shortcircuit);
 	    if (shortcircuit && shortcircuit == matchct)
-		return 1;
+		return;
     }
     p = q->pat;
     /* Now the actual matching for the current path section. */
@@ -490,13 +488,13 @@ scanner(Complist q, int shortcircuit)
 	    int err;
 
 	    if (l >= PATH_MAX)
-		return -1;
+		return;
 	    err = lchdir(pathbuf + pathbufcwd, &ds, 0);
 	    if (err == -1)
-		return -1;
+		return;
 	    if (err) {
 		zerr("current directory lost during glob");
-		return -1;
+		return;
 	    }
 	    pathbufcwd = pathpos;
 	}
@@ -523,7 +521,7 @@ scanner(Complist q, int shortcircuit)
 		    if (!closure || !statfullpath("", NULL, 1))
 			scanner((q->closure) ? q : q->next, shortcircuit);
 			if (shortcircuit && shortcircuit == matchct)
-			    return 1;
+			    return;
 		    pathbuf[pathpos = oppos] = '\0';
 		}
 	    }
@@ -532,7 +530,7 @@ scanner(Complist q, int shortcircuit)
 		str = dupstrpfx(str, l);
 	    insert(str, 0);
 	    if (shortcircuit && shortcircuit == matchct)
-		return 1;
+		return;
 	}
     } else {
 	/* Do pattern matching on current path section. */
@@ -543,7 +541,7 @@ scanner(Complist q, int shortcircuit)
 	int subdirlen = 0;
 
 	if (lock == NULL)
-	    return -1;
+	    return;
 	while ((fn = zreaddir(lock, 1)) && !errflag) {
 	    /* prefix and suffix are zle trickery */
 	    if (!dirs && !colonmod &&
@@ -624,7 +622,7 @@ scanner(Complist q, int shortcircuit)
 		    /* if the last filename component, just add it */
 		    insert(fn, 1);
 		    if (shortcircuit && shortcircuit == matchct)
-			return 1;
+			return;
 	    }
 	}
 	closedir(lock);
@@ -640,7 +638,7 @@ scanner(Complist q, int shortcircuit)
 		/* scan next level */
 		scanner((q->closure) ? q : q->next, shortcircuit); 
 		if (shortcircuit && shortcircuit == matchct)
-		    return 1;
+		    return;
 		pathbuf[pathpos = oppos] = '\0';
 	    }
 	    hrealloc(subdirs, subdirlen, 0);
@@ -654,7 +652,7 @@ scanner(Complist q, int shortcircuit)
 	    close(ds.dirfd);
 	pathbufcwd = pbcwdsav;
     }
-    return 0;
+    return;
 }
 
 /* This function tokenizes a zsh glob pattern */
-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-04  2:08   ` [PATCH] " Daniel Shahaf
@ 2014-06-04  6:01     ` Mikael Magnusson
  2014-06-04  6:54       ` Bart Schaefer
  2014-06-04  6:42     ` Bart Schaefer
  1 sibling, 1 reply; 15+ messages in thread
From: Mikael Magnusson @ 2014-06-04  6:01 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Bart Schaefer, zsh workers

On 4 June 2014 04:08, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> Bart Schaefer wrote on Mon, Jun 02, 2014 at 20:46:03 -0700:
>> On Jun 2,  6:23pm, Daniel Shahaf wrote:
>> }
>> } Would it make sense to have (Y) take a numeric argument specifying the
>> } maximal number of files to match? e.g., *(NY5) would expand to between
>> } 0 and 5 filenames (but never more than 5).
>>
>> My concern is that people are going to expect the (o)/(O) qualifiers to
>> take effect before (Y) does, and will be confused about the "skipped"
>> files when it takes effect after.  If (Y) can't return more than one
>> result, there's nothing to sort.
>
> The expectations about sorting are just as much of a problem with (Y) as
> they would be with (Y42): in both cases there might be "skipped" files.
> For example, in the source dir, *(Y/on) [or *(Y1/on)] might result in "Etc"
> even though "Doc" exists.
>
> Let's clarify that in the documentation of (Y).
>
>> However, I can't come up with any other objection.
>
> Two patches attached.  The first patch implements (Y42) and adds completion
> and the above-mentioned docs clarification.  (The two latter parts should be
> useful even if we don't add an argument to (Y).)  The second patch isn't
> strictly required, but it cleans up the return type changes that are no
> longer needed after the first patch.
>
> FWIW, I'm intentionally making (Y) without argument an error; we can settle
> on its semantics later after (Y42) has seen some "in the field" use.  The
> spelling (Y1) can be used instead.
>
> Cheers,
>
> Daniel

We already have [1,42] for the first 42 matches after sorting, just
make a reference to that I guess? And as for taking arguments, all
other glob qualifiers that take variable length arguments do it in the
form of u:args: where the : can be any character, so this should be
consistent with that, I think.

-- 
Mikael Magnusson


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-04  2:08   ` [PATCH] " Daniel Shahaf
  2014-06-04  6:01     ` Mikael Magnusson
@ 2014-06-04  6:42     ` Bart Schaefer
  2014-06-04  9:25       ` Peter Stephenson
  2014-06-04 23:08       ` Daniel Shahaf
  1 sibling, 2 replies; 15+ messages in thread
From: Bart Schaefer @ 2014-06-04  6:42 UTC (permalink / raw)
  To: zsh-workers

On Jun 4,  2:08am, Daniel Shahaf wrote:
} Subject: [PATCH] Re: (Y) modifier: up to N matches?
} 
} Bart Schaefer wrote on Mon, Jun 02, 2014 at 20:46:03 -0700:
} > 
} > My concern is that people are going to expect the (o)/(O) qualifiers to
} > take effect before (Y) does, and will be confused about the "skipped"
} > files when it takes effect after.  If (Y) can't return more than one
} > result, there's nothing to sort.
} 
} The expectations about sorting are just as much of a problem with (Y) as
} they would be with (Y42): in both cases there might be "skipped" files.

Yes, but if there's only one it's pretty obvious why.  Consider that in
your doc update you wrote:
  +filenames.  If more than var(n) matches exist, only the first var(n)
  +matches in directory traversal order will be considered.

Yet those n filenames are returned in alphabetical order, which has no
relation to directory traversal order.  In fact, people might expect
(Od) to cause subdirectories to be "traversed" before files in the
current directory, which never happens with (Y).  It might even be
preferable if (Yn) implied (odoN).

I'll repeat that this isn't a major objection.

} FWIW, I'm intentionally making (Y) without argument an error; we can
} settle on its semantics later after (Y42) has seen some "in the field"
} use. The spelling (Y1) can be used instead.

I'm doubtful that any default semantic other than (Y1) will be useful.

As an aside, I'm now pondering whether this can be made usable with **/
to find n matches in every directory rather than the first n matches
in the whole tree.  **/*(e{'reply=($REPLY/*(NY3))'}) almost works but
doesn't include the current diretory.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-04  6:01     ` Mikael Magnusson
@ 2014-06-04  6:54       ` Bart Schaefer
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2014-06-04  6:54 UTC (permalink / raw)
  To: zsh workers

On Jun 4,  8:01am, Mikael Magnusson wrote:
}
} We already have [1,42] for the first 42 matches after sorting, just
} make a reference to that I guess? And as for taking arguments, all
} other glob qualifiers that take variable length arguments do it in the
} form of u:args: where the : can be any character, so this should be
} consistent with that, I think.

Actually qualifiers that take only numeric arguments don't use the :args:
format, see for example u123 for files owned by user 123, or L1024 for
files 1k in size.  I find I've forgotten how one specifies a device for
the (dDEV) qualifier, and the manual doesn't help, but that doesn't use
delimiters either.  (Ah, it's device number.)

So I think Y42 is fine as it is.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-04  6:42     ` Bart Schaefer
@ 2014-06-04  9:25       ` Peter Stephenson
  2014-06-04 23:08       ` Daniel Shahaf
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Stephenson @ 2014-06-04  9:25 UTC (permalink / raw)
  To: zsh-workers

On Tue, 03 Jun 2014 23:42:29 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> } FWIW, I'm intentionally making (Y) without argument an error; we can
> } settle on its semantics later after (Y42) has seen some "in the field"
> } use. The spelling (Y1) can be used instead.
> 
> I'm doubtful that any default semantic other than (Y1) will be useful.

I have similar thoughts, but as long as the documentation makes clear
that it will take the first files it comes across, regardless of any
other qualifiers, I don't think there's a big problem, either.

pws


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-04  6:42     ` Bart Schaefer
  2014-06-04  9:25       ` Peter Stephenson
@ 2014-06-04 23:08       ` Daniel Shahaf
  2014-06-04 23:16         ` Bart Schaefer
  1 sibling, 1 reply; 15+ messages in thread
From: Daniel Shahaf @ 2014-06-04 23:08 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 2684 bytes --]

Bart Schaefer wrote on Tue, Jun 03, 2014 at 23:42:29 -0700:
> On Jun 4,  2:08am, Daniel Shahaf wrote:
> } Subject: [PATCH] Re: (Y) modifier: up to N matches?
> } 
> } Bart Schaefer wrote on Mon, Jun 02, 2014 at 20:46:03 -0700:
> } > 
> } > My concern is that people are going to expect the (o)/(O) qualifiers to
> } > take effect before (Y) does, and will be confused about the "skipped"
> } > files when it takes effect after.  If (Y) can't return more than one
> } > result, there's nothing to sort.
> } 
> } The expectations about sorting are just as much of a problem with (Y) as
> } they would be with (Y42): in both cases there might be "skipped" files.
> 
> Yes, but if there's only one it's pretty obvious why.  Consider that in
> your doc update you wrote:
>   +filenames.  If more than var(n) matches exist, only the first var(n)
>   +matches in directory traversal order will be considered.
> 
> Yet those n filenames are returned in alphabetical order, which has no
> relation to directory traversal order.

By "directory traversal", I meant to refer not just to the order of
visiting directories (e.g., depth-first), but also to the order of
visiting files within those directories (the readdir() order).  Is there
a way to make the docs clearer?

"filesystem tree traversal order" might be more accurate, but I don't
really know whether it would be clearer than the current text to the
average reader.

> In fact, people might expect (Od) to cause subdirectories to be
> "traversed" before files in the current directory, which never happens
> with (Y).  It might even be preferable if (Yn) implied (odoN).
> 
> I'll repeat that this isn't a major objection.
> 

What would (odoN) do?  It produces the same output as (oN) alone, since
presence of (oN) causes any other sort qualifiers to be ignored.

+1 on defaulting to (oN).  Patch attached.  It implements "default to
(oN) if (Yn) has been specified and no sort has been specified in any
qualifiers group".

I'm also attaching a second patch that prevents GS_NONE from being
left-shifted in the code.  I couldn't produce a user-visible problem,
but I confirmed that (1<<19) gets bitwise-or'd into gf_sorts (and 1<<19
has no defined meaning in that bitfield).

> } FWIW, I'm intentionally making (Y) without argument an error; we can
> } settle on its semantics later after (Y42) has seen some "in the field"
> } use. The spelling (Y1) can be used instead.
> 
> I'm doubtful that any default semantic other than (Y1) will be useful.

I don't see any other default right now either, but I wanted to leave
the door open to assign a bare (Y) some other, currently-unanticipated
meaning in the future.

>

Thanks,

Daniel

[-- Attachment #2: 0001-Y-glob-modifier-defaults-to-no-sorting.patch --]
[-- Type: text/x-patch, Size: 2799 bytes --]

>From 03e5a2238baf8c38028e49bfe1ed9b63cfa36f18 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@daniel.shahaf.name>
Date: Wed, 4 Jun 2014 15:48:53 +0000
Subject: [PATCH 1/2] (Y) glob modifier defaults to no sorting

---
 Doc/Zsh/expn.yo   |    7 ++++++-
 Src/glob.c        |    2 +-
 Test/D02glob.ztst |    2 +-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 2f91fec..ff9f2b1 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -2568,10 +2568,12 @@ item(tt(Y)var(n))(
 enables short-circuit mode: the pattern will expand to at most var(n)
 filenames.  If more than var(n) matches exist, only the first var(n)
 matches in directory traversal order will be considered.
+
+Implies tt(oN) when no tt(o)var(c) qualifier is used.
 )
 item(tt(o)var(c))(
 specifies how the names of the files should be sorted. If var(c) is
-tt(n) they are sorted by name (the default); if it is tt(L) they
+tt(n) they are sorted by name; if it is tt(L) they
 are sorted depending on the size (length) of the files; if tt(l)
 they are sorted by the number of links; if tt(a), tt(m), or tt(c)
 they are sorted by the time of the last access, modification, or
@@ -2586,6 +2588,9 @@ so `tt(*(^-oL))' gives a list of all files sorted by file size in descending
 order, following any symbolic links.  Unless tt(oN) is used, multiple order
 specifiers may occur to resolve ties.
 
+The default sorting is tt(n) (by name) unless the tt(Y) glob qualifier is used,
+in which case it is tt(N) (unsorted).
+
 tt(oe) and tt(o+) are special cases; they are each followed by shell code,
 delimited as for the tt(e) glob qualifier and the tt(+) glob qualifier
 respectively (see above).  The code is executed for each matched file with
diff --git a/Src/glob.c b/Src/glob.c
index c74a560..a62cfee 100644
--- a/Src/glob.c
+++ b/Src/glob.c
@@ -1779,7 +1779,7 @@ zglob(LinkList list, LinkNode np, int nountok)
 	return;
     }
     if (!gf_nsorts) {
-	gf_sortlist[0].tp = gf_sorts = GS_NAME;
+	gf_sortlist[0].tp = gf_sorts = (shortcircuit ? GS_NONE : GS_NAME);
 	gf_nsorts = 1;
     }
     /* Initialise receptacle for matched files, *
diff --git a/Test/D02glob.ztst b/Test/D02glob.ztst
index c00bbe3..358c934 100644
--- a/Test/D02glob.ztst
+++ b/Test/D02glob.ztst
@@ -546,7 +546,7 @@
  (){ print $#@ } glob.tmp/dir*(Y1)
  (){ print $#@ } glob.tmp/file*(NY1)
  (){ [[ "$*" == */dir?\ */dir? ]] && print Returns matching filenames } glob.tmp/dir*(Y2)
- (){ print "Limit is upper bound:" $@:t } glob.tmp/dir*(Y5)
+ (){ print "Limit is upper bound:" ${(o)@:t} } glob.tmp/dir*(Y5)
  (){ print "Negated:" $@:t } glob.tmp/dir*(Y1^Y)
  (){ print "Sorting:" $@:t } glob.tmp/dir*(Y4On)
  (){ [[ $#@ -eq 1 ]] && print Globs before last path component } glob.tmp/dir?/subdir(NY1)
-- 
1.7.10.4


[-- Attachment #3: 0002-Fix-GS_NONE-would-be-shifted-into-undefined-value.patch --]
[-- Type: text/x-patch, Size: 885 bytes --]

>From a7abb13613cb7b4cdcd1cb2d62ff8c54b39c6634 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@daniel.shahaf.name>
Date: Wed, 4 Jun 2014 15:42:22 +0000
Subject: [PATCH 2/2] Fix GS_NONE would be shifted into undefined value

---
 Src/glob.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Src/glob.c b/Src/glob.c
index a62cfee..c6cb3d2 100644
--- a/Src/glob.c
+++ b/Src/glob.c
@@ -1619,9 +1619,10 @@ zglob(LinkList list, LinkNode np, int nountok)
 			restore_globstate(saved);
 			return;
 		    }
+		    if ((sense & 2) &&
+			(t & (GS_SIZE|GS_ATIME|GS_MTIME|GS_CTIME|GS_LINKS)))
+			t <<= GS_SHIFT; /* HERE: GS_EXEC? */
 		    if (t != GS_EXEC) {
-			if ((sense & 2) && !(t & (GS_NAME|GS_DEPTH)))
-			    t <<= GS_SHIFT; /* HERE: GS_EXEC? */
 			if (gf_sorts & t) {
 			    zerr("doubled sort specifier");
 			    restore_globstate(saved);
-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-04 23:08       ` Daniel Shahaf
@ 2014-06-04 23:16         ` Bart Schaefer
  2014-06-05  4:45           ` Bart Schaefer
  2014-06-05 23:24           ` Oliver Kiddle
  0 siblings, 2 replies; 15+ messages in thread
From: Bart Schaefer @ 2014-06-04 23:16 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 518 bytes --]

On Jun 4, 2014 4:08 PM, "Daniel Shahaf" <d.s@daniel.shahaf.name> wrote:
>
> Bart Schaefer wrote on Tue, Jun 03, 2014 at 23:42:29 -0700:
> > It might even be preferable if (Yn) implied (odoN).
>
> What would (odoN) do?  It produces the same output as (oN) alone, since
> presence of (oN) causes any other sort qualifiers to be ignored.

I verified by comparing the output of **/*(odoN) vs **/*(oN) that they may
in fact differ.  I didn't dig into exactly how or why.

More later when I have time to peruse the patches.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-04 23:16         ` Bart Schaefer
@ 2014-06-05  4:45           ` Bart Schaefer
  2014-06-05 23:24           ` Oliver Kiddle
  1 sibling, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2014-06-05  4:45 UTC (permalink / raw)
  To: zsh-workers

On Jun 4,  4:16pm, Bart Schaefer wrote:
}
} I verified by comparing the output of **/*(odoN) vs **/*(oN) that they
} may in fact differ. I didn't dig into exactly how or why.

It appears that **/*(od) returns files in subdirectories first regardless
of any other sort flags, including (oN) which applies only to the files
within each subdirectory and to subdirectories of the same parent with
relation to each other.

Similarly (Od) displays local names first, though they may otherwise be
unordered (or at leat not predictably ordered).  Even without **/ as a
prefix, (odoN) is a different order than (oN).  This must be because an
unstable qsort is applied even though it theoretically compares all names
as equivalent.
 
} More later when I have time to peruse the patches.

Both patches look sane and simple enough.  My only quibble with the
second (1<<GF_NONE) patch is that by enumerating the states it becomes
necessary to edit this again if a new sort order is introduced, but
that doesn't happen very often so is probably moot.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-04 23:16         ` Bart Schaefer
  2014-06-05  4:45           ` Bart Schaefer
@ 2014-06-05 23:24           ` Oliver Kiddle
  2014-06-06  2:20             ` Bart Schaefer
  2014-06-06  2:45             ` Daniel Shahaf
  1 sibling, 2 replies; 15+ messages in thread
From: Oliver Kiddle @ 2014-06-05 23:24 UTC (permalink / raw)
  To: zsh-workers

Bart wrote:
> On Jun 4, 2014 4:08 PM, "Daniel Shahaf" <d.s@daniel.shahaf.name> wrote:
> >
> > What would (odoN) do?  It produces the same output as (oN) alone, since
> > presence of (oN) causes any other sort qualifiers to be ignored.
> 
> I verified by comparing the output of **/*(odoN) vs **/*(oN) that they may
> in fact differ.  I didn't dig into exactly how or why.

Are the sort flags actually affecting the order in which directories are
visited or only the order in which matches are output? Considering uses
of the new (Y) modifier as I've understood it: I commonly use (../)# to
find the top-level of something. For example (../)#.git(:h) will find
the top of a git checkout. It'd be useful if this would stop searching
after the first match which should be the one that involves going up as
few directories as possible. I'm also inclined to think that with
downward globs that you'd want it to find the nearest match so checking
files before directories.

Oliver


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-05 23:24           ` Oliver Kiddle
@ 2014-06-06  2:20             ` Bart Schaefer
  2014-06-06  2:40               ` Daniel Shahaf
  2014-06-06  2:45             ` Daniel Shahaf
  1 sibling, 1 reply; 15+ messages in thread
From: Bart Schaefer @ 2014-06-06  2:20 UTC (permalink / raw)
  To: zsh-workers

On Jun 6,  1:24am, Oliver Kiddle wrote:
}
} Are the sort flags actually affecting the order in which directories are
} visited or only the order in which matches are output?

Entirely independent of (Y) ... the sort flags only apply to the order
of outputs.  The actual scan is always in breadth-first order no matter
what flags have been passed in.

Looking at this leads me to believe that the (Y) patch may have been
missing some curly braces.  Either that or the indendation is wrong.

There's one extra hunk at the top here to skip initializing a data
structure if it's never going to be used.

diff --git a/Src/glob.c b/Src/glob.c
index c6cb3d2..15a5f70 100644
--- a/Src/glob.c
+++ b/Src/glob.c
@@ -462,18 +462,19 @@ scanner(Complist q, int shortcircuit)
     int errssofar = errsfound;
     struct dirsav ds;
 
-    init_dirsav(&ds);
     if (!q)
 	return;
+    init_dirsav(&ds);
 
     if ((closure = q->closure)) {
 	/* (foo/)# - match zero or more dirs */
 	if (q->closure == 2)	/* (foo/)## - match one or more dirs */
 	    q->closure = 1;
-	else
+	else {
 	    scanner(q->next, shortcircuit);
 	    if (shortcircuit && shortcircuit == matchct)
 		return;
+	}
     }
     p = q->pat;
     /* Now the actual matching for the current path section. */
@@ -518,10 +519,11 @@ scanner(Complist q, int shortcircuit)
 		}
 		if (add) {
 		    addpath(str, l);
-		    if (!closure || !statfullpath("", NULL, 1))
+		    if (!closure || !statfullpath("", NULL, 1)) {
 			scanner((q->closure) ? q : q->next, shortcircuit);
 			if (shortcircuit && shortcircuit == matchct)
 			    return;
+		    }
 		    pathbuf[pathpos = oppos] = '\0';
 		}
 	    }
@@ -618,11 +620,12 @@ scanner(Complist q, int shortcircuit)
 		    memcpy(subdirs + subdirlen, (char *)&errsfound,
 			   sizeof(int));
 		    subdirlen += sizeof(int);
-		} else
+		} else {
 		    /* if the last filename component, just add it */
 		    insert(fn, 1);
 		    if (shortcircuit && shortcircuit == matchct)
 			return;
+		}
 	    }
 	}
 	closedir(lock);


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-06  2:20             ` Bart Schaefer
@ 2014-06-06  2:40               ` Daniel Shahaf
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Shahaf @ 2014-06-06  2:40 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

Bart Schaefer wrote on Thu, Jun 05, 2014 at 19:20:47 -0700:
> Looking at this leads me to believe that the (Y) patch may have been
> missing some curly braces.  Either that or the indendation is wrong.

It's missing braces.  Their absence didn't affect correctness though
(didn't cause false negatives or false positives).


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-05 23:24           ` Oliver Kiddle
  2014-06-06  2:20             ` Bart Schaefer
@ 2014-06-06  2:45             ` Daniel Shahaf
  2014-06-06  4:24               ` Bart Schaefer
  1 sibling, 1 reply; 15+ messages in thread
From: Daniel Shahaf @ 2014-06-06  2:45 UTC (permalink / raw)
  To: Oliver Kiddle; +Cc: zsh-workers

Oliver Kiddle wrote on Fri, Jun 06, 2014 at 01:24:45 +0200:
> Considering uses of the new (Y) modifier as I've understood it: I
> commonly use (../)# to find the top-level of something. For example
> (../)#.git(:h) will find the top of a git checkout. It'd be useful if
> this would stop searching after the first match which should be the
> one that involves going up as few directories as possible.

It does:

    % setopt extendedglob
    % cd $(mktemp -d)
    % mkdir -p foo A/foo A/B
    % cd A/B
    % echo (../)#foo(:A:h:t)
    A tmp.Ykz8jNBvJo
    % echo (../)#foo(Y1:A:h:t)
    A

(Also: 'git rev-parse --show-toplevel')

> I'm also inclined to think that with downward globs that you'd want it
> to find the nearest match so checking files before directories.

That's the current behaviour:

    % mkdir -p A/B/foo A/foo
    % echo **/foo(Y1)
    A/foo

It may be worthwhile adding a unit test for this to Test/D02glob to
ensure this behaviour doesn't change.

Daniel

>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Re: (Y) modifier: up to N matches?
  2014-06-06  2:45             ` Daniel Shahaf
@ 2014-06-06  4:24               ` Bart Schaefer
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2014-06-06  4:24 UTC (permalink / raw)
  To: zsh-workers

On Jun 6,  2:45am, Daniel Shahaf wrote:
} Subject: Re: [PATCH] Re: (Y) modifier: up to N matches?
}
}     % mkdir -p A/B/foo A/foo
}     % echo **/foo(Y1)
}     A/foo
} 
} It may be worthwhile adding a unit test for this to Test/D02glob to
} ensure this behaviour doesn't change.

I think the existing tests of **/ would fail if this changed, so yet
another test is probably not necessary, but if wanted:

diff --git a/Test/D02glob.ztst b/Test/D02glob.ztst
index 358c934..a40e154 100644
--- a/Test/D02glob.ztst
+++ b/Test/D02glob.ztst
@@ -550,6 +550,7 @@
  (){ print "Negated:" $@:t } glob.tmp/dir*(Y1^Y)
  (){ print "Sorting:" $@:t } glob.tmp/dir*(Y4On)
  (){ [[ $#@ -eq 1 ]] && print Globs before last path component } glob.tmp/dir?/subdir(NY1)
+ (){ [[ $1 == glob.tmp/a ]] } glob.tmp/**/a(Y1) && print Breadth first
  (){ [[ $#@ -eq 0 ]] && print Respects qualifiers } glob.tmp/dir*(NY1.)
  (print -- *(Y)) 2>/dev/null || print "Argument required"
 0:short-circuit modifier
@@ -560,5 +561,6 @@
 >Negated: dir1 dir2 dir3 dir4
 >Sorting: dir4 dir3 dir2 dir1
 >Globs before last path component
+>Breadth first
 >Respects qualifiers
 >Argument required


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-06-06  4:24 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-02 18:23 (Y) modifier: up to N matches? Daniel Shahaf
2014-06-03  3:46 ` Bart Schaefer
2014-06-04  2:08   ` [PATCH] " Daniel Shahaf
2014-06-04  6:01     ` Mikael Magnusson
2014-06-04  6:54       ` Bart Schaefer
2014-06-04  6:42     ` Bart Schaefer
2014-06-04  9:25       ` Peter Stephenson
2014-06-04 23:08       ` Daniel Shahaf
2014-06-04 23:16         ` Bart Schaefer
2014-06-05  4:45           ` Bart Schaefer
2014-06-05 23:24           ` Oliver Kiddle
2014-06-06  2:20             ` Bart Schaefer
2014-06-06  2:40               ` Daniel Shahaf
2014-06-06  2:45             ` Daniel Shahaf
2014-06-06  4:24               ` Bart Schaefer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).