zsh-workers
* PATCH: configurability of pattern characters, part 1
@ 2013-05-31 23:29 Peter Stephenson
  2013-06-01  6:22 ` Bart Schaefer
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Stephenson @ 2013-05-31 23:29 UTC
  To: Zsh Hackers' List

I think it's actually quite easy to make it possible to disable
(possibly also enable) individual pattern characters with no significant
disadvantages; indeed, the code ends up a bit neater.  So you'll be
able to mix and match EXTENDEDGLOB features, turning off anything that
trips you up.

Here's my master plan.

The patch here changes the scheme for recognising special characters in
patterns.  Instead of testing internally for isset(EXTENDEDGLOB) etc.,
there is now an array of pattern characters.  Each entry is either the
token for the character (some KSHGLOBs don't need tokens because the
following open parenthesis is tokenised instead), or the special token
Marker if the pattern character isn't active.  Marker will never match a
character in a tokenised string; it's only put into strings in special
cases as a (surprise!) marker.  So we set up this special array
appropriately for the combination of options: EXTENDEDGLOB, KSHGLOB,
SHGLOB.  (Note that there are parsing implications elsewhere,
particularly for SHGLOB; these aren't affected.)
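
A minimal sketch of the Marker scheme described above, not the patch
itself.  The names pc_chars, pc_special, pc_setup and pc_is_active are
hypothetical, and plain ASCII characters stand in for the real zsh
tokens (Tilde, Hat, Pound), which are defined in zsh.h:

```c
#include <string.h>

/* Stand-in for zsh's Marker token: a byte assumed never to occur
 * in the pattern being compiled. */
#define MARKER ((char) 0xa0)

/* Hypothetical, much shortened index list. */
enum { PC_TILDE, PC_HAT, PC_HASH, PC_COUNT };

/* Fixed table: the character belonging to each index. */
static const char pc_chars[PC_COUNT] = { '~', '^', '#' };

/* Active table: either the character itself or MARKER. */
static char pc_special[PC_COUNT];

/* Rebuild the active table once per compilation from an option flag. */
static void pc_setup(int extended_glob)
{
    memcpy(pc_special, pc_chars, PC_COUNT);
    if (!extended_glob) {
        pc_special[PC_TILDE] = pc_special[PC_HAT] =
            pc_special[PC_HASH] = MARKER;
    }
}

/* One comparison answers both "is c this character?" and
 * "is that character currently special?". */
static int pc_is_active(char c, int idx)
{
    return c == pc_special[idx];
}
```

With this layout a test such as pc_is_active(c, PC_HAT) replaces the
pair "option is set and c is the token", which is the simplification
the patch relies on.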

As a bonus, we get a free array that can be used as a set of active
characters to test for the end of a pattern segment or the end of a pure
string segment, replacing the two existing arrays that did that and
so simplifying the way this works.
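
The "free array" point can be sketched as follows, assuming the
segment terminators are placed first in the table so that a search
over a prefix tests for end of segment and a search over the whole
table tests for end of a plain string.  The names and the shortened
character list are illustrative only, with ASCII in place of tokens:

```c
#include <string.h>

#define MARKER ((char) 0xa0)   /* stand-in for zsh's Marker */

enum {
    PC_SLASH, PC_NUL, PC_BAR, PC_OUTPAR,
    PC_SEG_COUNT,                 /* number of segment terminators */
    PC_INPAR = PC_SEG_COUNT, PC_QUEST, PC_STAR,
    PC_COUNT                      /* number of all special characters */
};

/* Active table; an entry set to MARKER is switched off. */
static char pc_special[PC_COUNT] = { '/', '\0', '|', ')', '(', '?', '*' };

/* Does c end a pattern segment?  Search only the prefix. */
static int ends_segment(char c)
{
    return memchr(pc_special, c, PC_SEG_COUNT) != NULL;
}

/* Does c end a pure string?  Search the whole table. */
static int ends_string(char c)
{
    return memchr(pc_special, c, PC_COUNT) != NULL;
}
```

Switching a character off is then a single store of MARKER into its
slot, with no pointer or length adjustments.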

This patch shouldn't have any major effect.  The one change I hope it
does make is that parsing of KSHGLOB and SHGLOB patterns becomes a
little more consistent, as they were something of an afterthought.

All the tests are still passing after the following patch.


The second step, to follow: now we have the "zpc_special" array, it will
be possible (and fairly straightforward) to introduce a special variable
to indicate pattern characters that should be turned off, and what's
more this won't slow down the body of pattern compilation --- we just
need to initialise "zpc_special" from the new variable once for each
pattern compilation just as we do with the glob options.  For example:

zsh_pattern_disable=('^' '(' '+(')

This says even if EXTENDED_GLOB is on don't use ^ for patterns;
even if SH_GLOB is off don't use '(' for grouping; even if KSH_GLOB is
on don't use '+(' for patterns.  '#' applies to grouping but turning it
off will probably also turn off globbing flags (I could fudge it but I
think it's getting hairy at this point).  Likewise, turning off the
basic * and ? will turn off their KSHGLOB use (but obviously not vice
versa).
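
One way the proposed disable list could be applied to the table is
sketched here.  This is an assumption about a feature that is only
proposed at this point: pc_names, pc_reset and pc_disable are
hypothetical, and ASCII characters stand in for tokens.  It shows the
one-way dependency mentioned above, where turning off '*' also turns
off '*(' but not the other way round:

```c
#include <string.h>

#define MARKER ((char) 0xa0)   /* stand-in for zsh's Marker */

enum { PC_QUEST, PC_STAR, PC_HAT,
       PC_KSH_QUEST, PC_KSH_STAR, PC_KSH_PLUS, PC_COUNT };

/* User-visible spelling of each entry (hypothetical). */
static const char *const pc_names[PC_COUNT] = {
    "?", "*", "^", "?(", "*(", "+("
};
static const char pc_chars[PC_COUNT] = { '?', '*', '^', '?', '*', '+' };
static char pc_special[PC_COUNT];

/* Start from "everything active". */
static void pc_reset(void)
{
    memcpy(pc_special, pc_chars, PC_COUNT);
}

/* Disable one named pattern; returns 1 if the name was recognised. */
static int pc_disable(const char *name)
{
    int i;

    for (i = 0; i < PC_COUNT; i++) {
        if (strcmp(name, pc_names[i]) == 0) {
            pc_special[i] = MARKER;
            /* Basic ? and * take their ksh forms with them. */
            if (i == PC_QUEST)
                pc_special[PC_KSH_QUEST] = MARKER;
            if (i == PC_STAR)
                pc_special[PC_KSH_STAR] = MARKER;
            return 1;
        }
    }
    return 0;
}
```

Because the work happens once per pattern compilation, the per-character
tests in the compiler stay as cheap as before.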

I'm not sure if it should be possible to turn on special characters that
the options say are off; although it should work within the pattern code
the parser might not recognise patterns in some cases, SHGLOB
particularly.  If it seems to work, we could also have:

zsh_pattern_enable=('~' '(')

This says even if EXTENDED_GLOB is off do use '~' for patterns; even if
SH_GLOB is on do use '(' for patterns (however, as I noted, many
expressions with parentheses won't parse as groups anyway because of
the use of SH_GLOB elsewhere; turning on individual EXTENDEDGLOB
characters looks easier).  Of course, you can just turn on EXTENDEDGLOB
and KSHGLOB and turn off SHGLOB and simply disable any characters you
don't want.

We'll need to set the new shell variable(s) locally to empty for
completion.  I think also "emulate" should clear them (locally for
"emulate -L") to present a pristine pattern environment for emulation.

diff --git a/Src/glob.c b/Src/glob.c
index ca2ffaf..db86d24 100644
--- a/Src/glob.c
+++ b/Src/glob.c
@@ -708,8 +708,9 @@ parsecomplist(char *instr)
     }
 
     /* Parse repeated directories such as (dir/)# and (dir/)## */
-    if (*(str = instr) == Inpar && !skipparens(Inpar, Outpar, (char **)&str) &&
-        *str == Pound && isset(EXTENDEDGLOB) && str[-2] == '/') {
+    if (*(str = instr) == zpc_special[ZPC_INPAR] &&
+	!skipparens(Inpar, Outpar, (char **)&str) &&
+        *str == zpc_special[ZPC_HASH] && str[-2] == '/') {
 	instr++;
 	if (!(p1 = patcompile(instr, compflags, &instr)))
 	    return NULL;
@@ -761,9 +762,9 @@ parsepat(char *str)
      * Check for initial globbing flags, so that they don't form
      * a bogus path component.
      */
-    if ((*str == Inpar && str[1] == Pound && isset(EXTENDEDGLOB)) ||
-	(isset(KSHGLOB) && *str == '@' && str[1] == Inpar &&
-	 str[2] == Pound)) {
+    if ((*str == zpc_special[ZPC_INPAR] && str[1] == zpc_special[ZPC_HASH]) ||
+	(*str == zpc_special[ZPC_KSH_AT] && str[1] == Inpar &&
+	 str[2] == zpc_special[ZPC_HASH])) {
 	str += (*str == Inpar) ? 2 : 3;
 	if (!patgetglobflags(&str, &assert, &ignore))
 	    return NULL;
@@ -1146,7 +1147,7 @@ zglob(LinkList list, LinkNode np, int nountok)
     gf_pre_words = NULL;
 
     /* Check for qualifiers */
-    while (!nobareglob || isset(EXTENDEDGLOB)) {
+    while (!nobareglob || zpc_special[ZPC_HASH] != Marker) {
 	struct qual *newquals;
 	char *s;
 	int sense, paren;
@@ -1192,10 +1193,11 @@ zglob(LinkList list, LinkNode np, int nountok)
 	    case Outpar:
 		paren++; /*FALLTHROUGH*/
 	    case Bar:
-		nobareglob = 1;
+		if (zpc_special[ZPC_BAR] != Marker)
+		    nobareglob = 1;
 		break;
 	    case Tilde:
-		if (isset(EXTENDEDGLOB))
+		if (zpc_special[ZPC_TILDE] != Marker)
 		    nobareglob = 1;
 		break;
 	    case Inpar:
@@ -1205,7 +1207,7 @@ zglob(LinkList list, LinkNode np, int nountok)
 	}
 	if (*s != Inpar)
 	    break;
-	if (isset(EXTENDEDGLOB) && s[1] == Pound) {
+	if (s[1] == zpc_special[ZPC_HASH]) {
 	    if (s[2] == 'q') {
 		*s = 0;
 		s += 2;
diff --git a/Src/pattern.c b/Src/pattern.c
index 3b6edb8..54d6e7c 100644
--- a/Src/pattern.c
+++ b/Src/pattern.c
@@ -225,34 +225,27 @@ typedef unsigned long zrange_t;
 #endif
 
 /*
- * Characters which terminate a pattern segment.  We actually use
- * a pointer patendseg which skips the first character if we are not
- * parsing a file pattern.
- * Note that the size of this and the next array are hard-wired
- * via the definitions.
+ * Array of characters corresponding to zpc_chars enum, which it must match.
  */
-
-static char endseg[] = {
-    '/',			/* file only */
-    '\0', Bar, Outpar,		/* all patterns */
-    Tilde			/* extended glob only */
+static const char zpc_chars[ZPC_COUNT] = {
+    '/', '\0', Bar, Outpar, Tilde, Inpar, Quest, Star, Inbrack, Inang,
+    Hat, Pound, Bnullkeep, Quest, Star, '+', '!', '@'
 };
 
-#define PATENDSEGLEN_NORM 4
-#define PATENDSEGLEN_EXT  5
-
-/* Characters which terminate a simple string */
-
-static char endstr[] = {
-    '/',			/* file only */
-    '\0', Bar, Outpar, Quest, Star, Inbrack, Inpar, Inang, Bnullkeep,
-				/* all patterns */
-    Tilde, Hat, Pound		/* extended glob only */
-};
-
-#define PATENDSTRLEN_NORM 10
-#define PATENDSTRLEN_EXT  13
+/*
+ * Characters which terminate a simple string (ZPC_COUNT) or
+ * an entire pattern segment (the first ZPC_SEG_COUNT).
+ * Each entry is either the corresponding character in zpc_chars
+ * or Marker which is guaranteed not to match a character in a
+ * pattern we are compiling.
+ *
+ * The complete list indicates characters that are special, so e.g.
+ * (testchar == special[ZPC_TILDE]) succeeds only if testchar is a Tilde
+ * *and* Tilde is currently special.
+ */
 
+/**/
+char zpc_special[ZPC_COUNT];
 
 /* Default size for pattern buffer */
 #define P_DEF_ALLOC 256
@@ -264,10 +257,6 @@ static char *patcode;		/* point of code emission */
 static long patsize;		/* size of code */
 static char *patout;		/* start of code emission string */
 static long patalloc;		/* size allocated for same */
-static char *patendseg;		/* characters ending segment */
-static int patendseglen;	/* length of same */
-static char *patendstr;		/* characters ending plain string */
-static int patendstrlen;	/* length of sameo */
 
 /* Flags used in both compilation and execution */
 static int patflags;		    /* flags passed down to patcompile */
@@ -417,12 +406,56 @@ static long rn_offs;
 		    (P_OP(p) == P_BACK) ? \
 		    ((p)-rn_offs) : ((p)+rn_offs) : NULL)
 
+/*
+ * Set up zpc_special with characters that end a string segment.
+ * "Marker" cannot occur in the pattern we are compiling so
+ * is used to mark "invalid".
+ */
+static void
+patcompcharsset(void)
+{
+    memcpy(zpc_special, zpc_chars, ZPC_COUNT);
+    if (!isset(EXTENDEDGLOB)) {
+	/* Extended glob characters are not active */
+	zpc_special[ZPC_TILDE] = zpc_special[ZPC_HAT] =
+	    zpc_special[ZPC_HASH] = Marker;
+    }
+    if (!isset(KSHGLOB)) {
+	/*
+	 * Ksh glob characters are not active.
+	 * * and ? are shared with normal globbing, but for their
+	 * use here we are looking for a following Inpar.
+	 */
+	zpc_special[ZPC_KSH_QUEST] = zpc_special[ZPC_KSH_STAR] =
+	    zpc_special[ZPC_KSH_PLUS] = zpc_special[ZPC_KSH_BANG] =
+	    zpc_special[ZPC_KSH_AT] = Marker;
+    }
+    /*
+     * Note that if we are using KSHGLOB, then we test for a following
+     * Inpar, not zpc_special[ZPC_INPAR]:  the latter makes an Inpar on
+     * its own active.  The zpc_special[ZPC_KSH_*] followed by any old Inpar
+     * discriminate ksh globbing.
+     */
+    if (isset(SHGLOB)) {
+	/*
+	 * Grouping and numeric ranges are not valid.
+	 * We do allow alternation, however; it's needed for
+	 * "case".  This may not be entirely consistent.
+	 *
+	 * Don't disable Outpar: we may need to match the end of KSHGLOB
+	 * parentheses and it would be difficult to tell them apart.
+	 */
+	zpc_special[ZPC_INPAR] = zpc_special[ZPC_INANG] = Marker;
+    }
+}
+
 /* Called before parsing a set of file matchs to initialize flags */
 
 /**/
 void
 patcompstart(void)
 {
+    patcompcharsset();
     if (isset(CASEGLOB))
 	patglobflags = 0;
     else
@@ -469,16 +502,9 @@ patcompile(char *exp, int inflags, char **endexp)
     patnpar = 1;
     patflags = inflags & ~(PAT_PURES|PAT_HAS_EXCLUDP);
 
-    patendseg = endseg;
-    patendseglen = isset(EXTENDEDGLOB) ? PATENDSEGLEN_EXT : PATENDSEGLEN_NORM;
-    patendstr = endstr;
-    patendstrlen = isset(EXTENDEDGLOB) ? PATENDSTRLEN_EXT : PATENDSTRLEN_NORM;
-
     if (!(patflags & PAT_FILE)) {
-	patendseg++;
-	patendstr++;
-	patendseglen--;
-	patendstrlen--;
+	patcompcharsset();
+	zpc_special[ZPC_SLASH] = Marker;
 	remnulargs(patparse);
 	if (isset(MULTIBYTE))
 	    patglobflags = GF_MULTIBYTE;
@@ -698,11 +724,11 @@ patcompswitch(int paren, int *flagp)
 
     *flagp |= flags & (P_HSTART|P_PURESTR);
 
-    while (*patparse == Bar ||
-	   (isset(EXTENDEDGLOB) && *patparse == Tilde &&
+    while (*patparse == zpc_chars[ZPC_BAR] ||
+	   (*patparse == zpc_special[ZPC_TILDE] &&
 	    (patparse[1] == '/' ||
-	     !memchr(patendseg, patparse[1], patendseglen)))) {
-	int tilde = *patparse++ == Tilde;
+	     !memchr(zpc_special, patparse[1], ZPC_SEG_COUNT)))) {
+	int tilde = *patparse++ == zpc_special[ZPC_TILDE];
 	long gfnode = 0, newbr;
 
 	*flagp &= ~P_PURESTR;
@@ -739,12 +765,9 @@ patcompswitch(int paren, int *flagp)
 	    up.p = NULL;
 	    patadd((char *)&up, 0, sizeof(up), 0);
 	    /* / is not treated as special if we are at top level */
-	    if (!paren && *patendseg == '/') {
+	    if (!paren && zpc_special[ZPC_SLASH] == '/') {
 		tilde++;
-		patendseg++;
-		patendseglen--;
-		patendstr++;
-		patendstrlen--;
+		zpc_special[ZPC_SLASH] = Marker;
 	    }
 	} else {
 	    excsync = 0;
@@ -784,10 +807,7 @@ patcompswitch(int paren, int *flagp)
 	newbr = patcompbranch(&flags);
 	if (tilde == 2) {
 	    /* restore special treatment of / */
-	    patendseg--;
-	    patendseglen++;
-	    patendstr--;
-	    patendstrlen++;
+	    zpc_special[ZPC_SLASH] = '/';
 	}
 	if (!newbr)
 	    return 0;
@@ -855,14 +875,13 @@ patcompbranch(int *flagp)
     *flagp = P_PURESTR;
 
     starter = chain = 0;
-    while (!memchr(patendseg, *patparse, patendseglen) ||
-	   (*patparse == Tilde && patparse[1] != '/' &&
-	    memchr(patendseg, patparse[1], patendseglen))) {
-	if (isset(EXTENDEDGLOB) &&
-	    ((!isset(SHGLOB) &&
-	      (*patparse == Inpar && patparse[1] == Pound)) ||
-	     (isset(KSHGLOB) && *patparse == '@' && patparse[1] == Inpar &&
-	      patparse[2] == Pound))) {
+    while (!memchr(zpc_special, *patparse, ZPC_SEG_COUNT) ||
+	   (*patparse == zpc_special[ZPC_TILDE] && patparse[1] != '/' &&
+	    memchr(zpc_special, patparse[1], ZPC_SEG_COUNT))) {
+	if ((*patparse == zpc_special[ZPC_INPAR] &&
+	     patparse[1] == zpc_special[ZPC_HASH]) ||
+	    (*patparse == zpc_special[ZPC_KSH_AT] && patparse[1] == Inpar &&
+	     patparse[2] == zpc_special[ZPC_HASH])) {
 	    /* Globbing flags. */
 	    char *pp1 = patparse;
 	    int oldglobflags = patglobflags, ignore;
@@ -910,7 +929,7 @@ patcompbranch(int *flagp)
 		break;
 	    else
 		continue;
-	} else if (isset(EXTENDEDGLOB) && *patparse == Hat) {
+	} else if (*patparse == zpc_special[ZPC_HAT]) {
 	    /*
 	     * ^pat:  anything but pat.  For proper backtracking,
 	     * etc., we turn this into (*~pat), except without the
@@ -1171,7 +1190,7 @@ patcomppiece(int *flagp)
 {
     long starter = 0, next, op, opnd;
     int flags, flags2, kshchar, len, ch, patch, nmeta;
-    int pound, count;
+    int hash, count;
     union upat up;
     char *nptr, *str0, *ptr, *patprev;
     zrange_t from, to;
@@ -1185,11 +1204,17 @@ patcomppiece(int *flagp)
 	 * the string doesn't introduce a ksh-like parenthesized expression.
 	 */
 	kshchar = '\0';
-	if (isset(KSHGLOB) && *patparse && patparse[1] == Inpar) {
-	    if (strchr("?*+!@", *patparse))
-		kshchar = STOUC(*patparse);
-	    else if (*patparse == Star || *patparse == Quest)
-		kshchar = STOUC(ztokens[*patparse - Pound]);
+	if (*patparse && patparse[1] == Inpar) {
+	    if (*patparse == zpc_special[ZPC_KSH_PLUS])
+		kshchar = STOUC('+');
+	    else if (*patparse == zpc_special[ZPC_KSH_BANG])
+		kshchar = STOUC('!');
+	    else if (*patparse == zpc_special[ZPC_KSH_AT])
+		kshchar = STOUC('@');
+	    else if (*patparse == zpc_special[ZPC_KSH_STAR])
+		kshchar = STOUC('*');
+	    else if (*patparse == zpc_special[ZPC_KSH_QUEST])
+		kshchar = STOUC('?');
 	}
 
 	/*
@@ -1199,10 +1224,10 @@ patcomppiece(int *flagp)
 	 * tildes are not special if there is nothing following to
 	 * be excluded.
 	 */
-	if (kshchar || (memchr(patendstr, *patparse, patendstrlen) &&
-			(*patparse != Tilde ||
+	if (kshchar || (memchr(zpc_special, *patparse, ZPC_COUNT) &&
+			(*patparse != zpc_special[ZPC_TILDE] ||
 			 patparse[1] == '/' ||
-			 !memchr(patendseg, patparse[1], patendseglen))))
+			 !memchr(zpc_special, patparse[1], ZPC_SEG_COUNT))))
 	    break;
 
 	/* Remember the previous character for backtracking */
@@ -1227,10 +1252,14 @@ patcomppiece(int *flagp)
 	 * If we have more than one character, a following hash
 	 * or (#c...) only applies to the last, so backtrack one character.
 	 */
-	if (isset(EXTENDEDGLOB) &&
-	    (*patparse == Pound ||
-	     (*patparse == Inpar && patparse[1] == Pound &&
-	      patparse[2] == 'c')) && morelen)
+	if ((*patparse == zpc_special[ZPC_HASH] ||
+	     (*patparse == zpc_special[ZPC_INPAR] &&
+	      patparse[1] == zpc_special[ZPC_HASH] &&
+	      patparse[2] == 'c') ||
+	     (*patparse == zpc_special[ZPC_KSH_AT] &&
+	      patparse[1] == Inpar &&
+	      patparse[2] == zpc_special[ZPC_HASH] &&
+	      patparse[3] == 'c')) && morelen)
 	    patparse = patprev;
 	/*
 	 * If len is 1, we can't have an active # following, so doesn't
@@ -1306,15 +1335,21 @@ patcomppiece(int *flagp)
 	METACHARINC(patparse);
 	switch(patch) {
 	case Quest:
+	    DPUTS(zpc_special[ZPC_QUEST] == Marker,
+		  "Treating '?' as pattern character although disabled");
 	    flags |= P_SIMPLE;
 	    starter = patnode(P_ANY);
 	    break;
 	case Star:
+	    DPUTS(zpc_special[ZPC_STAR] == Marker,
+		  "Treating '*' as pattern character although disabled");
 	    /* kshchar is used as a sign that we can't have #'s. */
 	    kshchar = -1;
 	    starter = patnode(P_STAR);
 	    break;
 	case Inbrack:
+	    DPUTS(zpc_special[ZPC_INBRACK] == Marker,
+		  "Treating '[' as pattern character although disabled");
 	    flags |= P_SIMPLE;
 	    if (*patparse == Hat || *patparse == '^' || *patparse == '!') {
 		patparse++;
@@ -1368,9 +1403,10 @@ patcomppiece(int *flagp)
 	    patadd(NULL, 0, 1, 0);
 	    break;
 	case Inpar:
-	    /* is this how to treat parentheses in SHGLOB? */
-	    if (isset(SHGLOB) && !kshchar)
-		return 0;
+	    DPUTS(zpc_special[ZPC_INPAR] == Marker,
+		  "Treating '(' as pattern character although disabled");
+	    DPUTS(isset(SHGLOB) && !kshchar,
+		  "Treating bare '(' as pattern character with SHGLOB");
 	    if (kshchar == '!') {
 		/* This is nasty, we should really either handle all
 		 * kshglobbing below or here.  But most of the
@@ -1393,6 +1429,9 @@ patcomppiece(int *flagp)
 	    break;
 	case Inang:
 	    /* Numeric glob */
+	    DPUTS(zpc_special[ZPC_INANG] == Marker,
+		  "Treating '<' as pattern character although disabled");
+	    DPUTS(isset(SHGLOB), "Treating <..> as numeric range with SHGLOB");
 	    len = 0;		/* beginning present 1, end present 2 */
 	    if (idigit(*patparse)) {
 		from = (zrange_t) zstrtol((char *)patparse,
@@ -1435,6 +1474,8 @@ patcomppiece(int *flagp)
 	     */
 	    break;
 	case Pound:
+	    DPUTS(zpc_special[ZPC_HASH] == Marker,
+		  "Treating '#' as pattern character although disabled");
 	    DPUTS(!isset(EXTENDEDGLOB), "BUG: # not treated as string");
 	    /*
 	     * A hash here is an error; it should follow something
@@ -1465,16 +1506,21 @@ patcomppiece(int *flagp)
     }
 
     count = 0;
-    if (!(pound = (*patparse == Pound && isset(EXTENDEDGLOB))) &&
-	!(count = (isset(EXTENDEDGLOB) && *patparse == Inpar &&
-		   patparse[1] == Pound && patparse[2] == 'c')) &&
+    if (!(hash = (*patparse == zpc_special[ZPC_HASH])) &&
+	!(count = ((*patparse == zpc_special[ZPC_INPAR] &&
+		    patparse[1] == zpc_special[ZPC_HASH] &&
+		    patparse[2] == 'c') ||
+		   (*patparse == zpc_special[ZPC_KSH_AT] &&
+		    patparse[1] == Inpar &&
+		    patparse[2] == zpc_special[ZPC_HASH] &&
+		    patparse[3] == 'c'))) &&
 	(kshchar <= 0 || kshchar == '@' || kshchar == '!')) {
 	*flagp = flags;
 	return starter;
     }
 
     /* too much at once doesn't currently work */
-    if (kshchar && (pound || count))
+    if (kshchar && (hash || count))
 	return 0;
 
     if (kshchar == '*') {
@@ -1490,7 +1536,7 @@ patcomppiece(int *flagp)
 	op = P_COUNT;
 	patparse += 3;
 	*flagp = P_HSTART;
-    } else if (*++patparse == Pound) {
+    } else if (*++patparse == zpc_special[ZPC_HASH]) {
 	op = P_TWOHASH;
 	patparse++;
 	*flagp = P_HSTART;
@@ -1600,7 +1646,7 @@ patcomppiece(int *flagp)
 	pattail(starter, next);
 	patoptail(starter, next);
     }
-    if (*patparse == Pound)
+    if (*patparse == zpc_special[ZPC_HASH])
 	return 0;
 
     return starter;
diff --git a/Src/zsh.h b/Src/zsh.h
index f247563..639c2b7 100644
--- a/Src/zsh.h
+++ b/Src/zsh.h
@@ -179,7 +179,11 @@ struct mathfunc {
  * Take care to update the use of IMETA appropriately when adding
  * tokens here.
  */
-/* Marker used in paramsubst for rc_expand_param */
+/*
+ * Marker used in paramsubst for rc_expand_param.
+ * Also used in pattern character arrays as guaranteed not to
+ * mark a character in a string.
+ */
 #define Marker		((char) 0xa0)
 
 /* chars that need to be quoted if meant literally */
@@ -1375,6 +1379,40 @@ struct patprog {
 #define PAT_HAS_EXCLUDP	0x0800	/* (internal): top-level path1~path2. */
 #define PAT_LCMATCHUC   0x1000  /* equivalent to setting (#l) */
 
+/**
+ * Indexes into the array of active pattern characters.
+ * This must match the array zpc_chars in pattern.c.
+ */
+enum zpc_chars {
+    /*
+     * These characters both terminate a pattern segment and
+     * a pure string segment.
+     */
+    ZPC_SLASH,			/* / active as file separator */
+    ZPC_NULL,			/* \0 as string terminator */
+    ZPC_BAR,			/* | for "or" */
+    ZPC_OUTPAR,			/* ) for grouping */
+    ZPC_TILDE,			/* ~ for exclusion (extended glob) */
+    ZPC_SEG_COUNT,              /* No. of the above characters */
+    /*
+     * These characters terminate a pure string segment.
+     */
+    ZPC_INPAR = ZPC_SEG_COUNT,  /* ( for grouping */
+    ZPC_QUEST,			/* ? as wildcard */
+    ZPC_STAR,			/* * as wildcard */
+    ZPC_INBRACK,		/* [ for character class */
+    ZPC_INANG,			/* < for numeric glob */
+    ZPC_HAT,			/* ^ for exclusion (extended glob) */
+    ZPC_HASH,			/* # for repetition (extended glob) */
+    ZPC_BNULLKEEP,		/* Special backslashed null not removed */
+    ZPC_KSH_QUEST,              /* ? for ?(...) in KSH_GLOB */
+    ZPC_KSH_STAR,               /* * for *(...) in KSH_GLOB */
+    ZPC_KSH_PLUS,               /* + for +(...) in KSH_GLOB */
+    ZPC_KSH_BANG,               /* ! for !(...) in KSH_GLOB */
+    ZPC_KSH_AT,                 /* @ for @(...) in KSH_GLOB */
+    ZPC_COUNT			/* Number of special characters */
+};
+
 /*
  * Special match types used in character classes.  These
  * are represented as tokens, with Meta added.  The character

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/



* Re: PATCH: configurability of pattern characters, part 1
  2013-05-31 23:29 PATCH: configurability of pattern characters, part 1 Peter Stephenson
@ 2013-06-01  6:22 ` Bart Schaefer
  2013-06-01 20:18   ` Peter Stephenson
  2013-06-01 23:09   ` PATCH: configurability of pattern characters, part 2 Peter Stephenson
  0 siblings, 2 replies; 13+ messages in thread
From: Bart Schaefer @ 2013-06-01  6:22 UTC
  To: Zsh Hackers' List

On Jun 1, 12:29am, Peter Stephenson wrote:
}
} The second step, to follow: now we have the "zpc_special" array, it will
} be possible (and fairly straightforward) to introduce a special variable
} to indicate pattern characters that should be turned off

Possible alternative idea -- use the enable/disable builtins?

    disable -p '^' '(' '+('

(Choose another switch if you don't like -p for pattern.)

I suppose that's harder to set and restore on entering and leaving a local scope.
On the other hand I've frequently wished that some internal tables were
scope-able; e.g., making the $functions special variable a local has
subtle undesirable side-effects on autoloaded functions, but if the
underlying table itself could be localized, those would go away.

} We'll need to set the new shell variable(s) locally to empty for
} completion.

Hmm, that's another problem with the enable/disable idea ... or is it?
"emulate -R zsh -c 'autoload _main_complete'" should do the trick ...?

} I think also "emulate" should clear them (locally for "emulate -L") to
} present a pristine pattern environment for emulation.

I agree ... which for me is an argument *against* using variables for
this.  I know emulation modes already play with the special-ness of
things like HISTCHARS and MANPATH, but it doesn't actually go so far
as creating empty locals for them.

Speaking of HISTCHARS, do we agree that it'd be a bad idea to be able
to swap around which characters have what glob semantics?  E.g., it's
OK if * means only "*", but you can't make % mean "match any number of
any character".

-- 
Barton E. Schaefer



* Re: PATCH: configurability of pattern characters, part 1
  2013-06-01  6:22 ` Bart Schaefer
@ 2013-06-01 20:18   ` Peter Stephenson
  2013-06-02  7:16     ` Bart Schaefer
  2013-06-01 23:09   ` PATCH: configurability of pattern characters, part 2 Peter Stephenson
  1 sibling, 1 reply; 13+ messages in thread
From: Peter Stephenson @ 2013-06-01 20:18 UTC
  To: Zsh Hackers' List

On Fri, 31 May 2013 23:22:23 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Jun 1, 12:29am, Peter Stephenson wrote:
> }
> } The second step, to follow: now we have the "zpc_special" array, it will
> } be possible (and fairly straightforward) to introduce a special variable
> } to indicate pattern characters that should be turned off
> 
> Possible alternative idea -- use the enable/disable builtins?
> 
>     disable -p '^' '(' '+('

That's quite reasonable; it makes it more natural to enforce particular
entries than in an array.

The simple meaning for enable -p is that it reverses a disable, it
doesn't explicitly enable something that's not allowed by the options.
I think I'll stick with that (though it needs clearly documenting).  If
you want to be able to enable or disable every pattern separately, you
"setopt extendedglob kshglob noshglob" first.

> } We'll need to set the new shell variable(s) locally to empty for
> } completion.
> 
> Hmm, that's another problem with the enable/disable idea ... or is it?
> "emulate -R zsh -c 'autoload _main_complete'" should do the trick ...?

Yes, it should, but that's another story I haven't written yet, so we
probably need another way...

"disable -p" should output the current settings, which we could save.
We reenable anything associated with extendedglob, turning off kshglob
and shglob using the options as now, and use disable -p to redisable the
user's stuff at the end in an "always" block.

Or how about readonly zsh/parameter arrays corresponding to enabled and
disabled patterns?  Same idea, just slightly more efficient to save.
(Could even be read/write; since they'd be documented as a front end to
enable/disable rather than the basic syntax the scoping behaviour isn't
so much of an issue.)

> } I think also "emulate" should clear them (locally for "emulate -L") to
> } present a pristine pattern environment for emulation.
> 
> I agree ... which for me is an argument *against* using variables for
> this.  I know emulation modes already play with the special-ness of
> things like HISTCHARS and MANPATH, but it doesn't actually go so far
> as creating empty locals for them.

The trouble is this creates an additional form of scope that's not
options or parameters or traps for saving and restoring.  However,
there's nothing fundamentally difficult about that.

Logically, to get emulate -L to behave as it does for options and traps,
there should be an option "localpatterns" that causes the effects of
disable -p to be restored after the current scope: otherwise we don't
have a natural distinction between the behaviour of "emulate" and
"emulate -L" (they'd have some hidden different effect on scoping, which
I don't like).  We just need to save one bit per pattern, so this is
much more efficient than what we currently do for localoptions.  In
other words, "setopt localpatterns" would mean "restore the disabled
pattern state at the end of the current function scope"; emulate's only
contribution would be to set this if -L was given and in any case turn
off the current disables.  That's exactly parallel to localoptions
except there's only one emulation state in this case, i.e. the one where
the patterns are controlled only by the option settings.
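
The bookkeeping described in this message can be sketched with one bit
per pattern.  Everything here is an assumption for illustration
(pat_disabled, opt_localpatterns, scope_enter, scope_exit and
emulate_reset are hypothetical names, not functions from the patch):
enable only undoes a disable, the options still decide what is
allowed, and the saved mask is restored at scope exit when the
localpatterns flag is set.

```c
/* One bit per pattern element (shortened, hypothetical list). */
#define PAT_HAT      (1u << 0)
#define PAT_TILDE    (1u << 1)
#define PAT_KSH_PLUS (1u << 2)

static unsigned int pat_disabled;  /* bit set => disabled by the user */
static int opt_localpatterns;      /* stand-in for "setopt localpatterns" */

static void pat_disable(unsigned int p) { pat_disabled |= p; }

/* Enabling only reverses an earlier disable. */
static void pat_enable(unsigned int p) { pat_disabled &= ~p; }

/* A pattern is active only if the options allow it AND it has not
 * been disabled; "allowed" is the mask derived from the options. */
static int pat_active(unsigned int p, unsigned int allowed)
{
    return (allowed & p) != 0 && (pat_disabled & p) == 0;
}

/* Function scope entry: remember the current disables. */
static unsigned int scope_enter(void)
{
    return pat_disabled;
}

/* Function scope exit: restore them if localpatterns is in effect. */
static void scope_exit(unsigned int saved)
{
    if (opt_localpatterns)
        pat_disabled = saved;
}

/* emulate: always clear the disables; -L also sets localpatterns. */
static void emulate_reset(int dash_L)
{
    if (dash_L)
        opt_localpatterns = 1;
    pat_disabled = 0;
}
```

Saving a single word per scope is what makes this cheaper than the
per-option copy done for localoptions.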
 
> Speaking of HISTCHARS, do we agree that it'd be a bad idea to be able
> to swap around which characters have what glob semantics?  E.g., it's
> OK if * means only "*", but you can't make % mean "match any number of
> any character".

Yes, I really don't think there's any sense in allowing this.  It causes
confusion far beyond any convenience it produces.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/



* PATCH: configurability of pattern characters, part 2
  2013-06-01  6:22 ` Bart Schaefer
  2013-06-01 20:18   ` Peter Stephenson
@ 2013-06-01 23:09   ` Peter Stephenson
  2013-06-04  6:45     ` Bart Schaefer
  1 sibling, 1 reply; 13+ messages in thread
From: Peter Stephenson @ 2013-06-01 23:09 UTC
  To: zsh-workers

Here's the basic disable -p / enable -p and emulation behaviour to be
exposed to public view before the rest.  Still to come: restore patterns
for completion (possibly with zsh/parameter variable help); tests for
some of the leading combinations.

Up to now, this has only had finger tests of some of the simpler forms,
so there could well be bugs lurking in the disabling of certain
patterns.  These should emerge when I write some ztst code.

It's unlikely to make your shell eat your pet rabbit if you don't
actually use disable -p, however.

If somebody can spot some effect I haven't documented, please shout.

diff --git a/Doc/Zsh/builtins.yo b/Doc/Zsh/builtins.yo
index f3a7f6a..a9a150f 100644
--- a/Doc/Zsh/builtins.yo
+++ b/Doc/Zsh/builtins.yo
@@ -291,8 +291,8 @@ enditem()
 findex(disable)
 cindex(disabling commands)
 cindex(commands, disabling)
-item(tt(disable) [ tt(-afmrs) ] var(name) ...)(
-Temporarily disable the var(name)d hash table elements.  The default
+item(tt(disable) [ tt(-afmprs) ] var(name) ...)(
+Temporarily disable the var(name)d hash table elements or patterns.  The default
 is to disable builtin commands.  This allows you to use an external
 command with the same name as a builtin command.  The tt(-a) option
 causes tt(disable) to act on regular or global aliases.  The tt(-s)
@@ -305,6 +305,80 @@ quoted to prevent them from undergoing filename expansion), and all hash
 table elements from the corresponding hash table matching these patterns
 are disabled.  Disabled objects can be enabled with the tt(enable)
 command.
+
+With the option tt(-p), var(name) ... refer to elements of the
+shell's pattern syntax as described in noderef(Filename Generation).
+Certain elements can be disabled separately, as given below.
+
+Note that patterns
+not allowed by the current settings for the options tt(EXTENDED_GLOB),
+tt(KSH_GLOB) and tt(SH_GLOB) are never enabled, regardless of the
+setting here.  For example, if tt(EXTENDED_GLOB) is not active,
+the pattern tt(^) is ineffective even if `tt(disable -p "^")' has
+not been issued.  The list below indicates any option settings
+that restrict the use of the pattern.  It should be noted that
+setting tt(SH_GLOB) has a wider effect than merely disabling patterns
+as certain expressions, in particular those involving parentheses,
+are parsed differently.
+
+The following patterns may be disabled; all
+the strings need quoting on the command line to prevent them from
+being interpreted immediately as patterns and the patterns are
+shown below in single quotes as a reminder.
+startitem()
+item(tt('?'))(
+The pattern character tt(?) wherever it occurs, including when preceding
+a parenthesis with tt(KSH_GLOB).
+)
+item(tt('*'))(
+The pattern character tt(*) wherever it occurs, including recursive
+globbing and when preceding a parenthesis with tt(KSH_GLOB).
+)
+item('LSQUARE()')(
+Character classes.
+)
+item(tt('<') (tt(NO_SH_GLOB)))(
+Numeric ranges.
+)
+item(tt('|') (tt(NO_SH_GLOB)))(
+Alternation in grouped patterns, case statements, or KSH_GLOB
+parenthesised expressions.
+)
+item(tt('LPAR()') (tt(NO_SH_GLOB)))(
+Grouping using single parentheses.  Disabling this does not disable the
+use of parentheses for tt(KSH_GLOB) where they are introduced by a
+special character, nor for glob qualifiers (use `tt(setopt
+NO_BARE_GLOB_QUAL)' to disable glob qualifiers that use parentheses
+only).
+)
+item(tt('~') (tt(EXTENDED_GLOB)))(
+Exclusion in the form var(A)tt(~)var(B).
+)
+item(tt('^') (tt(EXTENDED_GLOB)))(
+Exclusion in the form var(A)tt(^)var(B).
+)
+item(tt('#') (tt(EXTENDED_GLOB)))(
+The pattern character tt(#) wherever it occurs, both for
+repetition of a previous pattern and for indicating globbing flags.
+)
+item(tt('?LPAR()') (tt(KSH_GLOB)))(
+The grouping form tt(?LPAR())var(...)tt(RPAR()).  Note this is also
+disabled if tt('?') is disabled.
+)
+item(tt('*LPAR()') (tt(KSH_GLOB)))(
+The grouping form tt(*LPAR())var(...)tt(RPAR()).  Note this is also
+disabled if tt('*') is disabled.
+)
+item(tt('PLUS()LPAR()') (tt(KSH_GLOB)))(
+The grouping form tt(PLUS()LPAR())var(...)tt(RPAR()).
+)
+item(tt('!LPAR()') (tt(KSH_GLOB)))(
+The grouping form tt(!LPAR())var(...)tt(RPAR()).
+)
+item(tt('@LPAR()') (tt(KSH_GLOB)))(
+The grouping form tt(@LPAR())var(...)tt(RPAR()).
+)
+enditem()
 )
 findex(disown)
 cindex(jobs, disowning)
@@ -376,7 +450,9 @@ the section COMPATIBILITY in zmanref(zsh)
 ifnzman(\
 noderef(Compatibility)
 )\
-.
+.  In addition to setting shell options, the command also restores
+the pristine state of pattern enables, as if all patterns had been
+enabled using tt(enable -p).
 
 If the tt(emulate) command occurs inside a function that has been
 marked for execution tracing with tt(functions -t) then the tt(xtrace)
@@ -390,9 +466,11 @@ are reset to their default value corresponding to the specified emulation
 mode, except for certain options describing the interactive
 environment; otherwise, only those options likely to cause portability
 problems in scripts and functions are altered.  If the tt(-L) switch is given,
-the options tt(LOCAL_OPTIONS) and tt(LOCAL_TRAPS) will be set as
-well, causing the effects of the tt(emulate) command and any tt(setopt) and
-tt(trap) commands to be local to the immediately surrounding shell
+the options tt(LOCAL_OPTIONS), tt(LOCAL_PATTERNS) and tt(LOCAL_TRAPS)
+will be set as
+well, causing the effects of the tt(emulate) command and any tt(setopt),
+tt(disable -p) or tt(enable -p), and tt(trap) commands to be local to
+the immediately surrounding shell
 function, if any; normally these options are turned off in all emulation
 modes except tt(ksh). The tt(-L) switch is mutually exclusive with the
 use of tt(-c) in var(flags).
@@ -414,7 +492,8 @@ Use of tt(-c) enables `sticky' emulation mode for functions defined
 within the evaluated expression:  the emulation mode is associated
 thereafter with the function so that whenever the function is executed
 the emulation (respecting the tt(-R) switch, if present) and all
-options are set before entry to the function, and restored after exit.
+options are set (and pattern disables cleared)
+before entry to the function, and the state is restored after exit.
 If the function is called when the sticky emulation is already in
 effect, either within an `tt(emulate) var(shell) tt(-c)' expression or
 within another function with the same sticky emulation, entry and exit
@@ -471,7 +550,7 @@ endsitem()
 findex(enable)
 cindex(enabling commands)
 cindex(commands, enabling)
-item(tt(enable) [ tt(-afmrs) ] var(name) ...)(
+item(tt(enable) [ tt(-afmprs) ] var(name) ...)(
 Enable the var(name)d hash table elements, presumably disabled
 earlier with tt(disable).  The default is to enable builtin commands.
 The tt(-a) option causes tt(enable) to act on regular or global aliases.
@@ -483,6 +562,13 @@ printed.  With the tt(-m) flag the arguments are taken as patterns
 (should be quoted) and all hash table elements from the corresponding
 hash table matching these patterns are enabled.  Enabled objects can be
 disabled with the tt(disable) builtin command.
+
+tt(enable -p) reenables patterns disabled with tt(disable -p).  Note
+that it does not override globbing options; for example, `tt(enable -p
+"~")' does not cause the pattern character tt(~) to be active unless
+the tt(EXTENDED_GLOB) option is also set.  To enable all possible
+patterns (so that they may be individually disabled with tt(disable -p)),
+use `tt(setopt EXTENDED_GLOB KSH_GLOB NO_SH_GLOB)'.
 )
 findex(eval)
 cindex(evaluating arguments as commands)
diff --git a/Doc/Zsh/options.yo b/Doc/Zsh/options.yo
index 60892dd..ec86232 100644
--- a/Doc/Zsh/options.yo
+++ b/Doc/Zsh/options.yo
@@ -1618,6 +1618,19 @@ A shell function can also guarantee itself a known shell configuration
 with a formulation like `tt(emulate -L zsh)'; the tt(-L) activates
 tt(LOCAL_OPTIONS).
 )
+pindex(LOCAL_PATTERNS)
+pindex(NO_LOCAL_PATTERNS)
+pindex(LOCALPATTERNS)
+pindex(NOLOCALPATTERNS)
+item(tt(LOCAL_PATTERNS))(
+If this option is set at the point of return from a shell function,
+the state of pattern disables, as set with the builtin command
+`tt(disable -p)', is restored to what it was when the function was
+entered.  The behaviour of this option is similar to the effect
+of tt(LOCAL_OPTIONS) on options; hence `tt(emulate -L sh)' (or
+indeed any other emulation with the tt(-L) option) activates
+tt(LOCAL_PATTERNS).
+)
 pindex(LOCAL_TRAPS)
 pindex(NO_LOCAL_TRAPS)
 pindex(LOCALTRAPS)
diff --git a/Doc/zmacros.yo b/Doc/zmacros.yo
index 19506d2..aed5bd8 100644
--- a/Doc/zmacros.yo
+++ b/Doc/zmacros.yo
@@ -33,6 +33,7 @@ DEFINEMACRO(RQUOTE)(0)(CHAR(39))
 DEFINEMACRO(LPAR)(0)(CHAR(40))
 DEFINEMACRO(RPAR)(0)(CHAR(41))
 DEFINEMACRO(PLUS)(0)(CHAR(43))
+DEFINEMACRO(LSQUARE)(0)(CHAR(91))
 
 DEFINEMACRO(DASH)(0)(ifztexi(--)ifnztexi(-))
 
diff --git a/Src/builtin.c b/Src/builtin.c
index bc91578..8516acd 100644
--- a/Src/builtin.c
+++ b/Src/builtin.c
@@ -55,11 +55,11 @@ static struct builtin builtins[] =
     BUILTIN("continue", BINF_PSPECIAL, bin_break, 0, 1, BIN_CONTINUE, NULL, NULL),
     BUILTIN("declare", BINF_PLUSOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL, bin_typeset, 0, -1, 0, "AE:%F:%HL:%R:%TUZ:%afghi:%klmprtuxz", NULL),
     BUILTIN("dirs", 0, bin_dirs, 0, -1, 0, "clpv", NULL),
-    BUILTIN("disable", 0, bin_enable, 0, -1, BIN_DISABLE, "afmrs", NULL),
+    BUILTIN("disable", 0, bin_enable, 0, -1, BIN_DISABLE, "afmprs", NULL),
     BUILTIN("disown", 0, bin_fg, 0, -1, BIN_DISOWN, NULL, NULL),
     BUILTIN("echo", BINF_SKIPINVALID, bin_print, 0, -1, BIN_ECHO, "neE", "-"),
     BUILTIN("emulate", 0, bin_emulate, 0, -1, 0, "LR", NULL),
-    BUILTIN("enable", 0, bin_enable, 0, -1, BIN_ENABLE, "afmrs", NULL),
+    BUILTIN("enable", 0, bin_enable, 0, -1, BIN_ENABLE, "afmprs", NULL),
     BUILTIN("eval", BINF_PSPECIAL, bin_eval, 0, -1, BIN_EVAL, NULL, NULL),
     BUILTIN("exit", BINF_PSPECIAL, bin_break, 0, 1, BIN_EXIT, NULL, NULL),
     BUILTIN("export", BINF_PLUSOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL, bin_typeset, 0, -1, BIN_EXPORT, "E:%F:%HL:%R:%TUZ:%afhi:%lprtu", "xg"),
@@ -467,7 +467,9 @@ bin_enable(char *name, char **argv, Options ops, int func)
     int match = 0, returnval = 0;
 
     /* Find out which hash table we are working with. */
-    if (OPT_ISSET(ops,'f'))
+    if (OPT_ISSET(ops,'p')) {
+	return pat_enables(name, argv, func == BIN_ENABLE);
+    } else if (OPT_ISSET(ops,'f'))
 	ht = shfunctab;
     else if (OPT_ISSET(ops,'r'))
 	ht = reswdtab;
@@ -5020,6 +5022,7 @@ bin_emulate(UNUSED(char *nam), char **argv, Options ops, UNUSED(int func))
     int opt_R = OPT_ISSET(ops, 'R');
     int saveemulation, savehackchar;
     int ret = 1, new_emulation;
+    unsigned int savepatterns;
     char saveopts[OPT_SIZE], new_opts[OPT_SIZE];
     char *cmd = 0;
     const char *shname = *argv;
@@ -5061,7 +5064,8 @@ bin_emulate(UNUSED(char *nam), char **argv, Options ops, UNUSED(int func))
     if (!argv[1]) {
 	emulate(shname, OPT_ISSET(ops,'R'), &emulation, opts);
 	if (OPT_ISSET(ops,'L'))
-	    opts[LOCALOPTIONS] = opts[LOCALTRAPS] = 1;
+	    opts[LOCALOPTIONS] = opts[LOCALTRAPS] = opts[LOCALPATTERNS] = 1;
+	clearpatterndisables();
 	return 0;
     }
 
@@ -5082,6 +5086,13 @@ bin_emulate(UNUSED(char *nam), char **argv, Options ops, UNUSED(int func))
 	goto restore;
     }
 
+    savepatterns = savepatterndisables();
+    /*
+     * All emulations start with an empty set of pattern disables,
+     * hence no special "sticky" behaviour is required.
+     */
+    clearpatterndisables();
+
     saveemulation = emulation;
     emulation = new_emulation;
     memcpy(opts, new_opts, sizeof(opts));
@@ -5131,6 +5142,7 @@ bin_emulate(UNUSED(char *nam), char **argv, Options ops, UNUSED(int func))
     sticky = save_sticky;
     emulation = saveemulation;
     memcpy(opts, saveopts, sizeof(opts));
+    restorepatterndisables(savepatterns);
 restore:
     keyboardhackchar = savehackchar;
     inittyptab();	/* restore banghist */
diff --git a/Src/exec.c b/Src/exec.c
index 14c2ba0..75805d3 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -4627,6 +4627,7 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
     }
 
     starttrapscope();
+    startpatternscope();
 
     pptab = pparams;
     if (!(flags & PM_UNDEFINED))
@@ -4674,6 +4675,8 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
 		 offptr++)
 		opts[*offptr] = 0;
 	}
+	/* All emulations start with pattern disables clear */
+	clearpatterndisables();
     } else
 	restore_sticky = 0;
 
@@ -4774,6 +4777,8 @@ doshfunc(Shfunc shfunc, LinkList doshargs, int noreturnval)
     scriptname = oldscriptname;
     oflags = ooflags;
 
+    endpatternscope();		/* before restoring old LOCALPATTERNS */
+
     if (restore_sticky) {
 	/*
 	 * If we switched to an emulation environment just for
diff --git a/Src/options.c b/Src/options.c
index 480fccd..ad869b2 100644
--- a/Src/options.c
+++ b/Src/options.c
@@ -179,6 +179,7 @@ static struct optname optns[] = {
 {{NULL, "listrowsfirst",      0},			 LISTROWSFIRST},
 {{NULL, "listtypes",	      OPT_ALL},			 LISTTYPES},
 {{NULL, "localoptions",	      OPT_EMULATE|OPT_KSH},	 LOCALOPTIONS},
+{{NULL, "localpatterns",      OPT_EMULATE},		 LOCALPATTERNS},
 {{NULL, "localtraps",	      OPT_EMULATE|OPT_KSH},	 LOCALTRAPS},
 {{NULL, "login",	      OPT_SPECIAL},		 LOGINSHELL},
 {{NULL, "longlistjobs",	      0},			 LONGLISTJOBS},
diff --git a/Src/pattern.c b/Src/pattern.c
index 54d6e7c..a90d3cd 100644
--- a/Src/pattern.c
+++ b/Src/pattern.c
@@ -233,6 +233,27 @@ static const char zpc_chars[ZPC_COUNT] = {
 };
 
 /*
+ * Corresponding strings used in enable/disable -p.
+ * NULL means no way of turning this on or off.
+ */
+static const char *zpc_strings[ZPC_COUNT] = {
+   NULL, NULL, "|", NULL, "~", "(", "?", "*", "[", "<",
+   "^", "#", NULL, "?(", "*(", "+(", "!(", "@("
+};
+
+/*
+ * Corresponding array of pattern disables as set by the user
+ * using "disable -p".
+ */
+static char zpc_disables[ZPC_COUNT];
+
+/*
+ * Stack of saved (compressed) zpc_disables for function scope.
+ */
+
+static struct zpc_disables_save *zpc_disables_stack;
+
+/*
  * Characters which terminate a simple string (ZPC_COUNT) or
  * an entire pattern segment (the first ZPC_SEG_COUNT).
  * Each entry is either the corresponding character in zpc_chars
@@ -414,7 +435,19 @@ static long rn_offs;
 static void
 patcompcharsset(void)
 {
+    char *spp, *disp;
+    int i;
+
+    /* Initialise enabled special characters */
     memcpy(zpc_special, zpc_chars, ZPC_COUNT);
+    /* Apply user disables from disable -p */
+    for (i = 0, spp = zpc_special, disp = zpc_disables;
+	 i < ZPC_COUNT;
+	 i++, spp++, disp++) {
+	if (*disp)
+	    *spp = Marker;
+    }
+
     if (!isset(EXTENDEDGLOB)) {
 	/* Extended glob characters are not active */
 	zpc_special[ZPC_TILDE] = zpc_special[ZPC_HAT] =
@@ -3799,3 +3832,137 @@ freepatprog(Patprog prog)
     if (prog && prog != dummy_patprog1 && prog != dummy_patprog2)
 	zfree(prog, prog->size);
 }
+
+/* Disable or reenable a pattern character */
+
+/**/
+int
+pat_enables(const char *cmd, char **patp, int enable)
+{
+    int ret = 0;
+    const char **stringp;
+    char *disp;
+
+    if (!*patp) {
+	int done = 0;
+	for (stringp = zpc_strings, disp = zpc_disables;
+	     stringp < zpc_strings + ZPC_COUNT;
+	     stringp++, disp++) {
+	    if (!*stringp)
+		continue;
+	    if (enable ? *disp : !*disp)
+		continue;
+	    if (done)
+		putc(' ', stdout);
+	    printf("'%s'", *stringp);
+	    done = 1;
+	}
+	if (done)
+	    putc('\n', stdout);
+	return 0;
+    }
+
+    for (; *patp; patp++) {
+	for (stringp = zpc_strings, disp = zpc_disables;
+	     stringp < zpc_strings + ZPC_COUNT;
+	     stringp++, disp++) {
+	    if (*stringp && !strcmp(*stringp, *patp)) {
+		*disp = (char)!enable;
+		break;
+	    }
+	}
+	if (stringp == zpc_strings + ZPC_COUNT) {
+	    zerrnam(cmd, "invalid pattern: %s", *patp);
+	    ret = 1;
+	}
+    }
+
+    return ret;
+}
+
+/*
+ * Save the current state of pattern disables, returning the saved value.
+ */
+
+/**/
+unsigned int
+savepatterndisables(void)
+{
+    unsigned int disables, bit;
+    char *disp;
+
+    disables = 0;
+    for (bit = 1, disp = zpc_disables;
+	 disp < zpc_disables + ZPC_COUNT;
+	 bit <<= 1, disp++) {
+	if (*disp)
+	    disables |= bit;
+    }
+    return disables;
+}
+
+/*
+ * Function scope saving pattern enables.
+ */
+
+/**/
+void
+startpatternscope(void)
+{
+    Zpc_disables_save newdis;
+
+    newdis = (Zpc_disables_save)zalloc(sizeof(*newdis));
+    newdis->next = zpc_disables_stack;
+    newdis->disables = savepatterndisables();
+
+    zpc_disables_stack = newdis;
+}
+
+/*
+ * Restore completely the state of pattern disables.
+ */
+
+/**/
+void
+restorepatterndisables(unsigned int disables)
+{
+    char *disp;
+    unsigned int bit;
+
+    for (bit = 1, disp = zpc_disables;
+	 disp < zpc_disables + ZPC_COUNT;
+	 bit <<= 1, disp++) {
+	if (disables & bit)
+	    *disp = 1;
+	else
+	    *disp = 0;
+    }
+}
+
+/*
+ * Function scope to restore pattern enables if localpatterns is turned on.
+ */
+
+/**/
+void
+endpatternscope(void)
+{
+    Zpc_disables_save olddis;
+
+    olddis = zpc_disables_stack;
+    zpc_disables_stack = olddis->next;
+
+    if (isset(LOCALPATTERNS))
+	restorepatterndisables(olddis->disables);
+
+    zfree(olddis, sizeof(*olddis));
+}
+
+/* Reinitialise pattern disables */
+
+/**/
+void
+clearpatterndisables(void)
+{
+    memset(zpc_disables, 0, ZPC_COUNT);
+}
diff --git a/Src/zsh.h b/Src/zsh.h
index 639c2b7..299357d 100644
--- a/Src/zsh.h
+++ b/Src/zsh.h
@@ -1414,6 +1414,21 @@ enum zpc_chars {
 };
 
 /*
+ * Structure to save disabled special characters for function scope.
+ */
+struct zpc_disables_save {
+    struct zpc_disables_save *next;
+    /*
+     * Bit vector of ZPC_COUNT disabled characters.
+     * We'll live dangerously and assume ZPC_COUNT is no greater
+     * than the number of bits in an unsigned int.
+     */
+    unsigned int disables;
+};
+
+typedef struct zpc_disables_save *Zpc_disables_save;
+
+/*
  * Special match types used in character classes.  These
  * are represented as tokens, with Meta added.  The character
  * class is represented as a metafied string, with only these
@@ -2074,6 +2089,7 @@ enum {
     LISTROWSFIRST,
     LISTTYPES,
     LOCALOPTIONS,
+    LOCALPATTERNS,
     LOCALTRAPS,
     LOGINSHELL,
     LONGLISTJOBS,
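
As a reading aid for the save/restore and scope functions in the patch above, here is a minimal standalone sketch of the same idea: per-pattern disable flags packed into an unsigned int and pushed on a stack at function entry.  It is not the zsh code itself; the names (N_PATTERNS, save_disables, start_scope and so on) are invented for illustration, and the value 18 is only assumed from the number of initialisers in zpc_strings.  The compile-time check (C11) guards the size assumption the patch comment mentions.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define N_PATTERNS 18   /* stand-in for ZPC_COUNT; the value is an assumption */

/* Guard the assumption that every flag fits in one unsigned int (C11). */
_Static_assert(N_PATTERNS <= sizeof(unsigned int) * CHAR_BIT,
               "disable flags must fit in an unsigned int");

static char disables[N_PATTERNS];      /* one flag per pattern character */

/* Hypothetical stack node holding the caller's packed flags. */
struct saved_disables {
    struct saved_disables *next;
    unsigned int bits;
};
static struct saved_disables *scope_stack;

/* Pack the flag array into a bit vector, lowest index in bit 0. */
unsigned int save_disables(void)
{
    unsigned int bits = 0;
    for (int i = 0; i < N_PATTERNS; i++)
        if (disables[i])
            bits |= 1u << i;
    return bits;
}

/* Unpack a bit vector back into the flag array. */
void restore_disables(unsigned int bits)
{
    for (int i = 0; i < N_PATTERNS; i++)
        disables[i] = (char)((bits >> i) & 1u);
}

/* On function entry: remember the caller's disables. */
void start_scope(void)
{
    struct saved_disables *s = malloc(sizeof(*s));
    if (!s)
        abort();
    s->next = scope_stack;
    s->bits = save_disables();
    scope_stack = s;
}

/* On function exit: restore only if "local patterns" is in effect. */
void end_scope(int local_patterns)
{
    struct saved_disables *s = scope_stack;
    scope_stack = s->next;
    if (local_patterns)
        restore_disables(s->bits);
    free(s);
}

/* Small self-check of both exit behaviours. */
void demo_scope(void)
{
    memset(disables, 0, sizeof(disables));
    disables[4] = 1;                 /* caller disabled one pattern */

    start_scope();
    disables[7] = 1;                 /* function disables another */
    end_scope(1);                    /* local: caller's state comes back */
    assert(disables[4] == 1 && disables[7] == 0);

    start_scope();
    disables[7] = 1;
    end_scope(0);                    /* not local: the change leaks out */
    assert(disables[7] == 1);
}
```

Calling demo_scope() from a main() exercises both paths; the state is always popped, and only the restore is conditional, which mirrors the placement of the LOCALPATTERNS test in endpatternscope().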

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/



* Re: PATCH: configurability of pattern characters, part 1
  2013-06-01 20:18   ` Peter Stephenson
@ 2013-06-02  7:16     ` Bart Schaefer
  2013-06-03  8:40       ` Peter Stephenson
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Schaefer @ 2013-06-02  7:16 UTC (permalink / raw)
  To: Zsh Hackers' List

On Jun 1,  9:18pm, Peter Stephenson wrote:
}
} The simple meaning for enable -p is that it reverses a disable, it
} doesn't explicitly enable something that's not allowed by the options.

Agreed.  In the other cases of disable, you have to create something a
different way (function, alias) before you can disable/enable it, so
I think that's fine here too.

I'm still not entirely clear what happens in e.g. this case:

% setopt kshglob
% disable -p '+('
% setopt kshglob

Does the setopt re-enable '+(' or does it remain disabled?  What about:

% setopt kshglob
% disable -p '+('
% unsetopt kshglob
% setopt kshglob

} "disable -p" should output the current settings, which we could save.

Explicit save/restore not necessary with the patch in 31444, right?

} Or how about readonly zsh/parameter arrays corresponding to enabled and
} disabled patterns?  Same idea, just slightly more efficient to save.

I think it'd be fine to add these, though calling one of them $patterns
is likely to clash with some existing scripts.  I'd vote for having it
be readonly like $builtins and $reswords.  The writable zsh/parameter
hashes are for objects that can be created/deleted by the user, but we
are not allowing the user to create new pattern tokens.



* Re: PATCH: configurability of pattern characters, part 1
  2013-06-02  7:16     ` Bart Schaefer
@ 2013-06-03  8:40       ` Peter Stephenson
  2013-06-03 15:05         ` Bart Schaefer
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Stephenson @ 2013-06-03  8:40 UTC (permalink / raw)
  To: Zsh Hackers' List

On Sun, 02 Jun 2013 00:16:27 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Jun 1,  9:18pm, Peter Stephenson wrote:
> }
> } The simple meaning for enable -p is that it reverses a disable, it
> } doesn't explicitly enable something that's not allowed by the options.
> 
> Agreed.  In the other cases of disable, you have to create something a
> different way (function, alias) before you can disable/enable it, so
> I think that's fine here too.
> 
> I'm still not entirely clear what happens in e.g. this case:
> 
> % setopt kshglob
> % disable -p '+('
> % setopt kshglob
> 
> Does the setopt re-enable '+(' or does it remain disabled?

It remains disabled: that's a separate set of controls.

>  What about:
> 
> % setopt kshglob
> % disable -p '+('
> % unsetopt kshglob
> % setopt kshglob

It will stay disabled until you "enable -p '+('" or enter an emulation.
But if you enter a non-ksh emulation in which you "setopt kshglob" it
will pop up.
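
The behaviour described above can be modelled in a few lines of C.  This is only a sketch under stated assumptions, not zsh's implementation: the struct and function names are made up, a pattern is taken to be active only when its option is set and it has not been disabled, and entering an emulation is taken to clear all disables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one option bit and one user disable bit for '+('. */
struct glob_state {
    bool kshglob;        /* changed by setopt/unsetopt kshglob */
    bool plus_disabled;  /* set by disable -p '+(', cleared by enable -p '+(' */
};

/* '+(' is usable only if the option allows it AND it is not disabled. */
bool ksh_plus_active(const struct glob_state *s)
{
    return s->kshglob && !s->plus_disabled;
}

/* Entering an emulation starts with an empty set of disables. */
void enter_emulation(struct glob_state *s, bool kshglob_in_emulation)
{
    s->kshglob = kshglob_in_emulation;
    s->plus_disabled = false;
}

/* Walk through the sequences asked about above. */
void demo_separate_controls(void)
{
    struct glob_state s = { .kshglob = true, .plus_disabled = false };

    s.plus_disabled = true;          /* disable -p '+(' */
    s.kshglob = true;                /* setopt kshglob (again) */
    assert(!ksh_plus_active(&s));    /* still disabled */

    s.kshglob = false;               /* unsetopt kshglob */
    s.kshglob = true;                /* setopt kshglob */
    assert(!ksh_plus_active(&s));    /* toggling the option changes nothing */

    s.plus_disabled = false;         /* enable -p '+(' */
    assert(ksh_plus_active(&s));

    s.plus_disabled = true;          /* disable -p '+(' once more */
    enter_emulation(&s, false);      /* a non-ksh emulation */
    s.kshglob = true;                /* setopt kshglob inside it */
    assert(ksh_plus_active(&s));     /* the pattern is active again */
}
```

Because the two flags are stored independently, the order in which the option and the disable are changed does not matter in this model.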

By the way, I suspect I need to do more work on the various expressions
with parentheses to make everything work smoothly.

> } "disable -p" should output the current settings, which we could save.
> 
> Explicit save/restore not necessary with the patch in 31444, right?

Yes, I'll just need to ensure LOCAL_PATTERNS is set with the other
completion options and use enable -p to ensure extendedglob patterns are
enabled.

> } Or how about readonly zsh/parameter arrays corresponding to enabled and
> } disabled patterns?  Same idea, just slightly more efficient to save.
> 
> I think it'd be fine to add these, though calling one of them $patterns
> is likely to clash with some existing scripts.  I'd vote for having it
> be readonly like $builtins and $reswords.  The writable zsh/parameter
> hashes are for objects that can be created/deleted by the user, but we
> are not allowing the user to create new pattern tokens.

It seems like I don't really have a use for these for now.

pws



* Re: PATCH: configurability of pattern characters, part 1
  2013-06-03  8:40       ` Peter Stephenson
@ 2013-06-03 15:05         ` Bart Schaefer
  2013-06-03 15:31           ` Peter Stephenson
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Schaefer @ 2013-06-03 15:05 UTC (permalink / raw)
  To: Zsh Hackers' List

On Mon, Jun 3, 2013 at 1:40 AM, Peter Stephenson
<p.stephenson@samsung.com> wrote:

> On Sun, 02 Jun 2013 00:16:27 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > % setopt kshglob
> > % disable -p '+('
> > % setopt kshglob
> >
> > Does the setopt re-enable '+(' or does it remain disabled?
>
> It remains disabled: that's a separate set of controls.


OK, so one way to think about this is that "disable" controls whether the
syntax is available, and setopt controls whether the syntax is used.
Earlier, though, you said:

> If you want to be able to enable or disable every pattern separately, you
> "setopt extendedglob kshglob noshglob" first.

Is the word "first" significant there?  E.g., does that mean that if I do

% unsetopt kshglob
% disable -p '+('
% setopt kshglob

that '+(' will be enabled?  (I know, I could apply the patch and try it,
but I'm interested in the intentions behind it.)


* Re: PATCH: configurability of pattern characters, part 1
  2013-06-03 15:05         ` Bart Schaefer
@ 2013-06-03 15:31           ` Peter Stephenson
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Stephenson @ 2013-06-03 15:31 UTC (permalink / raw)
  To: Zsh Hackers' List

On Mon, 03 Jun 2013 08:05:22 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> > If you want to be able to enable or disable every pattern separately, you
> > "setopt extendedglob kshglob noshglob" first.
> 
> Is the word "first" significant there?

No, they are completely separate controls.

It was just a bit of pedantry: until you did it, you weren't exerting
the level of control you (by hypothesis) wanted by using enable and
disable.

pws



* Re: PATCH: configurability of pattern characters, part 2
  2013-06-01 23:09   ` PATCH: configurability of pattern characters, part 2 Peter Stephenson
@ 2013-06-04  6:45     ` Bart Schaefer
  2013-06-04  8:44       ` Peter Stephenson
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Schaefer @ 2013-06-04  6:45 UTC (permalink / raw)
  To: Zsh hackers list

On Sat, Jun 1, 2013 at 4:09 PM, Peter Stephenson <
p.w.stephenson@ntlworld.com> wrote:

> Here's the basic disable -p / enable -p and emulation behaviour to be
> exposed to public view before the rest.
>
> Up to now, this has only had finger tests of some of the simpler forms,
> so there could well be bugs lurking in the disabling of certain
> patterns.
>
> If somebody can spot some effect I haven't documented, please shout.
>

Fiddling with this, the very first thing I tried gave me an unexpected
result.

schaefer<502> print *
aclocal.m4 aczsh.m4 autom4te.cache ChangeLog Completion Config config.guess
config.h config.h.in config.log config.modules
config.modules.sh config.status config.sub configure
configure.ac Doc Etc FEATURES Functions INSTALL install-sh LICENCE MACHINES
Makefile Makefile.in META-FAQ Misc mkinstalldirs NEWS README Scripts Src
stamp-h stamp-h.in StartupFiles Test Util
schaefer<503> disable -p \*
schaefer<504> print *
zsh: no match
schaefer<505>

I was rather expecting to get a * as output rather than a no match error.
It seems very odd to have a pattern that can't possibly match anything but
that is still treated as a pattern; this did not really disable *, it just
turned it into the equivalent of [*].


* Re: PATCH: configurability of pattern characters, part 2
  2013-06-04  6:45     ` Bart Schaefer
@ 2013-06-04  8:44       ` Peter Stephenson
  2013-06-04 14:50         ` Bart Schaefer
  2013-06-09 18:06         ` Peter Stephenson
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Stephenson @ 2013-06-04  8:44 UTC (permalink / raw)
  To: Zsh hackers list

On Mon, 03 Jun 2013 23:45:39 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> schaefer<503> disable -p \*
> schaefer<504> print *
> zsh: no match

I think that's because I haven't yet got around to haswilds(), so it
thinks there are patterns in it, but finds they don't turn into
anything.  haswilds() is going to have to expand somewhat.
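
To illustrate the point, here is a rough sketch of a wildcard check that consults the disable table, so that a word containing only disabled pattern characters is treated as a literal string instead of failing to match.  This is an assumption-laden simplification: plain characters stand in for zsh's internal tokens, only three pattern characters are modelled, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical indices; zsh's real table is enum zpc_chars. */
enum { PC_STAR, PC_QUEST, PC_INBRACK, PC_COUNT };

/* Plain characters stand in for zsh's internal tokens. */
static const char pc_chars[PC_COUNT] = { '*', '?', '[' };

/*
 * Return true if str contains at least one pattern character that
 * has not been disabled, i.e. the word should go through globbing.
 */
bool has_active_wilds(const char *str, const bool disabled[PC_COUNT])
{
    for (; *str; str++) {
        for (int i = 0; i < PC_COUNT; i++) {
            if (*str == pc_chars[i] && !disabled[i])
                return true;
        }
    }
    return false;
}

/* Small self-check of the case reported above. */
void demo_has_active_wilds(void)
{
    bool disabled[PC_COUNT] = { false, false, false };

    assert(has_active_wilds("*", disabled));      /* globbed as usual */

    disabled[PC_STAR] = true;                     /* disable -p '*' */
    assert(!has_active_wilds("*", disabled));     /* now a literal word */
    assert(has_active_wilds("a?c", disabled));    /* '?' is still active */
}
```

With a check of this shape, `print *` after `disable -p '*'` would never reach the pattern matcher, so it would print a literal `*` and not report "no match".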

pws



* Re: PATCH: configurability of pattern characters, part 2
  2013-06-04  8:44       ` Peter Stephenson
@ 2013-06-04 14:50         ` Bart Schaefer
  2013-06-04 15:09           ` Peter Stephenson
  2013-06-09 18:06         ` Peter Stephenson
  1 sibling, 1 reply; 13+ messages in thread
From: Bart Schaefer @ 2013-06-04 14:50 UTC (permalink / raw)
  To: Zsh hackers list

On Tue, Jun 4, 2013 at 1:44 AM, Peter Stephenson
<p.stephenson@samsung.com> wrote:

> haswilds() is going to have to expand somewhat.
>

Hrm, this idea has fingers in a lot of pies.  Maybe I should be asking what
motivated you to do this in the first place?


* Re: PATCH: configurability of pattern characters, part 2
  2013-06-04 14:50         ` Bart Schaefer
@ 2013-06-04 15:09           ` Peter Stephenson
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Stephenson @ 2013-06-04 15:09 UTC (permalink / raw)
  To: Zsh hackers list

On Tue, 04 Jun 2013 07:50:23 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Tue, Jun 4, 2013 at 1:44 AM, Peter Stephenson
> <p.stephenson@samsung.com> wrote:
> 
> > haswilds() is going to have to expand somewhat.
> >
> 
> Hrm, this idea has fingers in a lot of pies.

Obviously the glob code depends on patterns... is there something else
I've missed?

> Maybe I should be asking what motivated you to do this in the first
> place?

The feeling that people aren't using extended globbing because it's got
bits they don't want that confuse matters: ^ (with e.g. stty) and ~
(with backup files, particularly those with a ~ in the middle) are
particular gotchas.

pws



* Re: PATCH: configurability of pattern characters, part 2
  2013-06-04  8:44       ` Peter Stephenson
  2013-06-04 14:50         ` Bart Schaefer
@ 2013-06-09 18:06         ` Peter Stephenson
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Stephenson @ 2013-06-09 18:06 UTC (permalink / raw)
  To: Zsh hackers list

On Tue, 04 Jun 2013 09:44:47 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> On Mon, 03 Jun 2013 23:45:39 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > schaefer<503> disable -p \*
> > schaefer<504> print *
> > zsh: no match
> 
> I think that's because I haven't yet got around to haswilds(), so it
> thinks there are patterns in it, but finds they don't turn into
> anything.  haswilds() is going to have to expand somewhat.

This fixes haswilds() and completion.  It should make the original
patch basically functional; however, I think pattern.c needs a little
more work before disabling the features associated with parentheses
works properly.  (To be clear: I don't expect any existing feature to
be broken by anything I've done so far.)  As I noted somewhere in the
documentation, I'm not planning to make the pattern enables and
disables affect parsing, just whether the pattern is used as a pattern.

Once I've fixed parentheses and written some tests that should be it.
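
The parenthesis handling is the fiddly part, so here is a small sketch of the decision the new haswilds() makes for an opening parenthesis in the patch below.  It mirrors only that one case, uses ordinary characters where zsh uses tokens, and the enum and function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical indices standing in for ZPC_INPAR, ZPC_KSH_QUEST, ... */
enum { D_INPAR, D_KSH_QUEST, D_KSH_STAR, D_KSH_PLUS, D_KSH_BANG, D_KSH_AT,
       D_COUNT };

/*
 * Decide whether a '(' makes the word a pattern.
 * prev is the character before the '(' (or '\0' at the start of the word).
 */
bool open_paren_is_pattern(char prev, bool shglob, bool kshglob,
                           const bool disabled[D_COUNT])
{
    /* Plain zsh grouping: needs NO_SH_GLOB and '(' not disabled. */
    if (!shglob && !disabled[D_INPAR])
        return true;

    /* ksh-style groups: need KSH_GLOB and the specific form enabled. */
    if (!kshglob)
        return false;
    switch (prev) {
    case '?': return !disabled[D_KSH_QUEST];
    case '*': return !disabled[D_KSH_STAR];
    case '+': return !disabled[D_KSH_PLUS];
    case '!': return !disabled[D_KSH_BANG];
    case '@': return !disabled[D_KSH_AT];
    default:  return false;
    }
}

/* Small self-check covering both kinds of group. */
void demo_open_paren(void)
{
    bool disabled[D_COUNT] = { false };

    /* SH_GLOB and KSH_GLOB both set: only ksh-style groups count. */
    assert(open_paren_is_pattern('+', true, true, disabled));
    assert(!open_paren_is_pattern('a', true, true, disabled));

    disabled[D_KSH_PLUS] = true;     /* disable -p '+(' */
    assert(!open_paren_is_pattern('+', true, true, disabled));

    /* NO_SH_GLOB: a bare '(' is a group unless '(' itself is disabled. */
    assert(open_paren_is_pattern('a', false, false, disabled));
    disabled[D_INPAR] = true;        /* disable -p '(' */
    assert(!open_paren_is_pattern('a', false, false, disabled));
}
```

Note that, like the patch, this sketch does not consult the plain `?` or `*` disables when deciding about `?(` and `*(`; whether that matches the documented behaviour is left open here.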

diff --git a/Completion/compinit b/Completion/compinit
index 7b8a346..f9d2c57 100644
--- a/Completion/compinit
+++ b/Completion/compinit
@@ -163,8 +163,9 @@ _comp_options=(
 
 typeset -g _comp_setup='local -A _comp_caller_options;
              _comp_caller_options=(${(kv)options[@]});
-             setopt localoptions localtraps ${_comp_options[@]};
+             setopt localoptions localtraps localpatterns ${_comp_options[@]};
              local IFS=$'\'\ \\t\\r\\n\\0\''
+             enable -p \| \~ \( \? \* \[ \< \^ \#
              exec </dev/null;
              trap - ZERR
              local -a reply
diff --git a/Src/glob.c b/Src/glob.c
index db86d24..0defb1a 100644
--- a/Src/glob.c
+++ b/Src/glob.c
@@ -445,41 +445,6 @@ insert(char *s, int checked)
     unqueue_signals();
 }
 
-/* Check to see if str is eligible for filename generation. */
-
-/**/
-mod_export int
-haswilds(char *str)
-{
-    /* `[' and `]' are legal even if bad patterns are usually not. */
-    if ((*str == Inbrack || *str == Outbrack) && !str[1])
-	return 0;
-
-    /* If % is immediately followed by ?, then that ? is     *
-     * not treated as a wildcard.  This is so you don't have *
-     * to escape job references such as %?foo.               */
-    if (str[0] == '%' && str[1] == Quest)
-	str[1] = '?';
-
-    for (; *str; str++) {
-	switch (*str) {
-	    case Inpar:
-	    case Bar:
-	    case Star:
-	    case Inbrack:
-	    case Inang:
-	    case Quest:
-		return 1;
-	    case Pound:
-	    case Hat:
-		if (isset(EXTENDEDGLOB))
-		    return 1;
-		break;
-	}
-    }
-    return 0;
-}
-
 /* Do the globbing:  scanner is called recursively *
  * with successive bits of the path until we've    *
  * tried all of it.                                */
diff --git a/Src/pattern.c b/Src/pattern.c
index a90d3cd..b7897e7 100644
--- a/Src/pattern.c
+++ b/Src/pattern.c
@@ -3966,3 +3966,78 @@ clearpatterndisables(void)
 {
     memset(zpc_disables, 0, ZPC_COUNT);
 }
+
+
+/* Check to see if str is eligible for filename generation. */
+
+/**/
+mod_export int
+haswilds(char *str)
+{
+    char *start;
+
+    /* `[' and `]' are legal even if bad patterns are usually not. */
+    if ((*str == Inbrack || *str == Outbrack) && !str[1])
+	return 0;
+
+    /* If % is immediately followed by ?, then that ? is     *
+     * not treated as a wildcard.  This is so you don't have *
+     * to escape job references such as %?foo.               */
+    if (str[0] == '%' && str[1] == Quest)
+	str[1] = '?';
+
+    /*
+     * Note that at this point zpc_special has not been set up.
+     */
+    start = str;
+    for (; *str; str++) {
+	switch (*str) {
+	    case Inpar:
+		if ((!isset(SHGLOB) && !zpc_disables[ZPC_INPAR]) ||
+		    (str > start && isset(KSHGLOB) &&
+		     ((str[-1] == Quest && !zpc_disables[ZPC_KSH_QUEST]) ||
+		      (str[-1] == Star && !zpc_disables[ZPC_KSH_STAR]) ||
+		      (str[-1] == '+' && !zpc_disables[ZPC_KSH_PLUS]) ||
+		      (str[-1] == '!' && !zpc_disables[ZPC_KSH_BANG]) ||
+		      (str[-1] == '@' && !zpc_disables[ZPC_KSH_AT]))))
+		    return 1;
+		break;
+
+	    case Bar:
+		if (!zpc_disables[ZPC_BAR])
+		    return 1;
+		break;
+
+	    case Star:
+		if (!zpc_disables[ZPC_STAR])
+		    return 1;
+		break;
+
+	    case Inbrack:
+		if (!zpc_disables[ZPC_INBRACK])
+		    return 1;
+		break;
+
+	    case Inang:
+		if (!zpc_disables[ZPC_INANG])
+		    return 1;
+		break;
+
+	    case Quest:
+		if (!zpc_disables[ZPC_QUEST])
+		    return 1;
+		break;
+
+	    case Pound:
+		if (isset(EXTENDEDGLOB) && !zpc_disables[ZPC_HASH])
+		    return 1;
+		break;
+
+	    case Hat:
+		if (isset(EXTENDEDGLOB) && !zpc_disables[ZPC_HAT])
+		    return 1;
+		break;
+	}
+    }
+    return 0;
+}

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/




Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
2013-05-31 23:29 PATCH: configurability of pattern characters, part 1 Peter Stephenson
2013-06-01  6:22 ` Bart Schaefer
2013-06-01 20:18   ` Peter Stephenson
2013-06-02  7:16     ` Bart Schaefer
2013-06-03  8:40       ` Peter Stephenson
2013-06-03 15:05         ` Bart Schaefer
2013-06-03 15:31           ` Peter Stephenson
2013-06-01 23:09   ` PATCH: configurability of pattern characters, part 2 Peter Stephenson
2013-06-04  6:45     ` Bart Schaefer
2013-06-04  8:44       ` Peter Stephenson
2013-06-04 14:50         ` Bart Schaefer
2013-06-04 15:09           ` Peter Stephenson
2013-06-09 18:06         ` Peter Stephenson
