zsh-workers
* Notes on bash(1)
@ 1998-12-09  3:25 Phil Pennock
  1998-12-09  9:01 ` Peter Stephenson
  0 siblings, 1 reply; 7+ messages in thread
From: Phil Pennock @ 1998-12-09  3:25 UTC
  To: Zsh Development Workers

I recently needed to check something in bash(1) and noticed some
interesting points in the manual.  I'm throwing them this way for
discussion/whatever.  Bash is 2.01.1(1)-release.

* bash has arrays.  'declare', 'local' & 'readonly' each accept '-a' to
  declare an array.  Is it reasonable to add '-a' to 'typeset'?  This
  would automatically duplicate the bash-ism.
  Further, would it be an idea to then deprecate 'set -A' which
  overloads parameter setting onto 'set'?

* ${parameter/pattern/string} and ${parameter//pattern/string}
  pattern is expanded as per pathname expansion.  Longest match of
  pattern against parameter is replaced with string.  Once for / and for
  all instances with //.  #pattern anchors to beginning, %pattern
  anchors to end.  string may be null.  Applied to an array, this works
  on each element.
  zsh has some of this with the colon-modifier 's'.
  The anchors in particular are nice.  They would make it easy to fully
  replicate basename(1) without forking.  We can't currently (AFAIK)
  accurately duplicate $(basename file .ext) (think of a file named .ext.ex)
  % base=${var:t:s/%.ext//}

* ${parameter:offset} and ${parameter:offset:length} provide substring
  and array extraction.  Both length and offset are arithmetic
  expressions.  length>=0.  offset may be negative to measure from end.
  zsh notably has ${parameter[start,stop]} already.  How desirable is
  this alternate syntax, given that zsh allows history modifiers in the
  same place?  It doesn't look like there would be a conflict, provided
  zsh requires $[...] for variables there.  Testing, bash allows:
  $ foo=abcde; t=2; echo ${foo:t}
-- 
--> Phil Pennock ; GAT d- s+:+ a22 C++(++++) UL++++/I+++/S+++/H+ P++@ L+++
E-@ W(+) N>++ o !K w--- O>+ M V !PS PE Y+ PGP+ t-- 5++ X+ R !tv b++>+++ DI+ D+
G+ e+ h* r y?



* Re: Notes on bash(1)
  1998-12-09  3:25 Notes on bash(1) Phil Pennock
@ 1998-12-09  9:01 ` Peter Stephenson
  1998-12-09 17:04   ` PATCH: 3.1.5: bash ${.../old/new} Peter Stephenson
  1998-12-09 19:43   ` PATCH: Docs out of sync Phil Pennock
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Stephenson @ 1998-12-09  9:01 UTC
  To: zsh-workers

Phil Pennock wrote:
> * bash has arrays.  'declare', 'local' & 'readonly' each accept '-a' to
>   declare an array.  Is it reasonable to add '-a' to 'typeset'?  This
>   would automatically duplicate the bash-ism.

It's not exactly essential, since bash and zsh have rather different
extensions to sh in any case.  They're only similar in as much as they
both include sh.  It's hard enough keeping ksh emulation working.

>   Further, would it be an idea to then deprecate 'set -A' which
>   overloads parameter setting onto 'set'?

That's needed for ksh.

> * ${parameter/pattern/string} and ${parameter//pattern/string}
>   pattern is expanded as per pathname expansion.  Longest match of
>   pattern against parameter is replaced with string.  Once for / and for
>   all instances with //.  #pattern anchors to beginning, %pattern
>   anchors to end.  string may be null.  Applied to an array, this works
>   on each element.
>   zsh has some of this with the colon-modifier 's'.

This would be quite useful, since :s only does simple string
replacement, and it doesn't clash with anything, yet, though if you
wait long enough...

Maybe it can be done quite simply by upgrading the extra flags Sven
added for # and % to match internal bits of a parameter's value.

>   The anchors in particular are nice.  They would make it easy to fully
>   replicate basename(1) without forking.  We can't currently (AFAIK)
>   accurately duplicate $(basename file .ext) (think of a file named .ext.ex)
>   % base=${var:t:s/%.ext//}

If you mean `remove the extension if and only if it's .ext', then you
can do ${${var:t}%.ext}.
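
For comparison, the same idea can be sketched with only the sh-style
operators, so it runs in bash as well as zsh.  This is an illustration
under assumptions, not a full basename(1) replacement: strip_ext is a
hypothetical helper name, and trailing slashes are not handled.

```shell
# strip_ext PATH EXT: print the last component of PATH, minus EXT,
# but only when EXT is a true suffix of that component.
# strip_ext is a hypothetical name, not a zsh or bash builtin.
strip_ext() {
    name=${1##*/}                # drop the directory part, like zsh's :t
    case $name in
        "$2") ;;                 # a name identical to the suffix is kept whole
        *) name=${name%"$2"} ;;  # quoted, so EXT is literal, not a pattern
    esac
    printf '%s\n' "$name"
}

strip_ext /tmp/report.ext .ext    # report
strip_ext /tmp/.ext.ex .ext       # .ext.ex (the name does not end in .ext)
```

The second call is the awkward case from the original question: an
unanchored substitution would remove the leading .ext instead.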

> * ${parameter:offset} and ${parameter:offset:length} provide substring
>   and array extraction.  Both length and offset are arithmetic
>   expressions.  length>=0.  offset may be negative to measure from end.
>   zsh notably has ${parameter[start,stop]} already.  How desirable is
>   this alternate syntax, given that zsh allows history modifiers in the
>   same place?  It doesn't look like there would be a conflict, provided
>   zsh requires $[...] for variables there.  Testing, bash allows:
>   $ foo=abcde; t=2; echo ${foo:t}

I think it's too close to the history modifiers and we'd better stick
with the subscript notation, unless we're aiming at a bash
compatibility mode which is really going a bit far.
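
To make the clash concrete, here is a short sketch of the bash forms
being discussed.  It assumes it is run under bash; the comment on the
last line describes what bash does, and the zsh reading of the same
text is the history-modifier one.

```shell
foo=abcde
t=2
echo "${foo:1:3}"    # bcd: three characters starting at offset 1
echo "${foo: -2}"    # de: negative offset counts from the end (note the space)
echo "${foo:t}"      # cde under bash, where t is evaluated arithmetically;
                     # zsh reads :t as the tail modifier instead
```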

It worries me slightly that there are people out there who don't know
the difference between bash and sh --- which is their problem, but one
day they may start inflicting it on other people.

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy



* PATCH: 3.1.5: bash ${.../old/new}
  1998-12-09  9:01 ` Peter Stephenson
@ 1998-12-09 17:04   ` Peter Stephenson
  1998-12-10 15:52     ` Strange substring search behaviour Peter Stephenson
  1998-12-09 19:43   ` PATCH: Docs out of sync Phil Pennock
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Stephenson @ 1998-12-09 17:04 UTC
  To: zsh-workers

I wrote:
> Phil Pennock wrote:
> > * ${parameter/pattern/string} and ${parameter//pattern/string}
> >   pattern is expanded as per pathname expansion.  Longest match of
> >   pattern against parameter is replaced with string.  Once for / and for
> >   all instances with //.  #pattern anchors to beginning, %pattern
> >   anchors to end.  string may be null.  Applied to an array, this works
> >   on each element.
> >   zsh has some of this with the colon-modifier 's'.
> 
> Maybe it can be done quite simply by upgrading the extra flags Sven
> added for # and % to match internal bits of a parameter's value.

This turns out to be correct.  It's quite difficult to do pattern
substitution otherwise --- you have to hack off the head and tail and
muck around inside --- and particularly multiple substitution, so I
think this is useful.

Actually, doing a single substitution was really easy, doing a global
one required a little more effort, particularly to avoid accumulated
partially substituted strings.  It seems to work.
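
One way to avoid partially substituted strings, and roughly what the
patch does with its list of match ranges, is to record every match as
offsets into the original string and only then build the result in a
single pass.  The sketch below shows just that assembly step; splice
is a hypothetical helper, and the ranges are supplied by hand rather
than found by pattern matching.

```shell
# splice STRING REPL B:E...: replace each half-open range [B,E) of STRING
# with REPL.  Ranges must be sorted and non-overlapping, and they always
# index the ORIGINAL string, so earlier replacements never shift later ones.
# splice is a hypothetical helper; ${var:offset:length} needs bash or zsh.
splice() {
    s=$1 repl=$2 out= i=0
    shift 2
    for r in "$@"; do
        b=${r%%:*} e=${r##*:}
        out=$out${s:$i:$((b - i))}$repl   # untouched chunk, then replacement
        i=$e                              # resume after the matched chunk
    done
    printf '%s\n' "$out${s:$i}"           # trailing untouched chunk
}

splice wimbaweawimbawe z 0:1 5:6 8:9 13:14    # zimbazeazimbaze
```

Because the replacement text is never rescanned, a replacement that
itself contains the pattern cannot trigger further substitutions.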

Since this is zsh, and since it fits in with the existing code, you
can get special effects (described in the manual) without me needing
to do anything clever (the following don't use patterns but of course
it works for those):

% foo=wimbaweawimbawe
% print ${(I.2.)foo/w/z}             # second occurrence only
wimbazeawimbawe
% print ${(I.2.)foo//w/z}            # all occurrences from the second
wimbazeazimbaze
% print ${foo:/wimba/burble}         # has to match the whole string
wimbaweawimbawe
% print ${foo:/wimbaweawimbawe/burble}
burble
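
For reference, the plain forms shared with bash look like this.  The
sketch sticks to syntax that bash accepts; the (I) flag and the :/
form shown above are zsh extensions and are left out.

```shell
foo=wimbaweawimbawe
echo "${foo/w/z}"        # zimbaweawimbawe   first occurrence only
echo "${foo//w/z}"       # zimbazeazimbaze   every occurrence
echo "${foo/#wimba/X}"   # Xweawimbawe       anchored at the start
echo "${foo/%bawe/X}"    # wimbaweawimX      anchored at the end
echo "${foo//w*a/X}"     # Xwe               the longest match is replaced
```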

Arrays and elements thereof work as expected (thanks probably to
Zoli's last major surgery).

In fact, the internals are pretty much all there to be able to replace
the shortest match instead of the longest match for the pattern.  The
only thing missing is the syntax.  If somebody suggests some, I will
add it.  (This is easy basically because zsh does all this stuff in a
hugely inefficient way, simply by reducing the test string from either
end until a part of it matches.  If anybody wants to take a year off
and fix this...)
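
The brute-force search can be sketched in shell terms.  This is only
an illustration of the idea, not the C code in the patch: it tries
every candidate length from the longest down and, within a length,
every start position from the left, running a full pattern match each
time.  longest_match is a hypothetical helper and uses bash syntax.

```shell
# longest_match STRING PATTERN: print "B E" for the longest substring
# STRING[B,E) that matches the glob PATTERN.  Longer candidates are
# tried first; among equal lengths, the leftmost wins.
# Returns 1 if no non-empty substring matches.
longest_match() {
    s=$1 pat=$2 l=${#1}
    len=$l
    while [ "$len" -ge 1 ]; do
        b=0
        while [ $((b + len)) -le "$l" ]; do
            # unquoted $pat on the right of == is treated as a pattern
            if [[ ${s:$b:$len} == $pat ]]; then
                echo "$b $((b + len))"
                return 0
            fi
            b=$((b + 1))
        done
        len=$((len - 1))
    done
    return 1
}

longest_match wimbaweawimbawe 'w*a'    # 0 13
```

The number of pattern matches attempted grows with the square of the
string length, which is the inefficiency mentioned above; the same
loop order is what makes a shortest-match variant easy to add.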

I hope this applies to a reasonably clean 3.1.5, but there isn't one
around in these parts.  At one stage it got mixed up with some of
Sven's Conddef's in zsh.h, but if you just expect a bit of offset
things should be OK.

(If you really want to know why the old files have the suffix .pmfl,
it's because I started off by altering all the parameter substitution
flags to symbols, a long overdue change.)

Oh, and I fixed some more else-dangleage in subst.c.

*** Doc/Zsh/expn.yo.pmfl	Tue Nov 10 10:10:01 1998
--- Doc/Zsh/expn.yo	Wed Dec  9 17:27:02 1998
***************
*** 398,403 ****
--- 398,421 ----
  the matched array elements are removed (use the tt((M)) flag to
  remove the non-matched elements).
  )
+ xitem(tt(${)var(name)tt(/)var(pattern)tt(/)var(repl)tt(}))
+ item(tt(${)var(name)tt(//)var(pattern)tt(/)var(repl)tt(}))(
+ Substitute the longest possible match of var(pattern) in the value of
+ variable var(name) with the string var(repl).  The first form
+ substitutes just the first occurrence, the second all occurrences.
+ The var(pattern) may begin with a var(#), in which case the
+ var(pattern) must match at the start of the string, or var(%), in
+ which case it must match at the end of the string.  The var(repl) may
+ be an empty string, in which case the final tt(/) may also be omitted.
+ To quote the final tt(/) in other cases it should be preceded by two
+ backslashes (i.e., a quoted backslash).  Substitution of an array is as
+ described for tt(#) and tt(%) above.
+ 
+ The first tt(/) may be preceded by a tt(:), in which case the match
+ will only succeed if it matches the entire word.  Note also the
+ effect of the tt(I) parameter expansion flag below:  the flags tt(S),
+ tt(M), tt(R), tt(B), tt(E) and tt(N) are not useful, however.
+ )
  item(tt(${#)var(spec)tt(}))(
  If var(spec) is one of the above substitutions, substitute
  the length in characters of the result instead of
***************
*** 553,558 ****
--- 571,580 ----
  )
  item(tt(I:)var(expr)tt(:))(
  Search the var(expr)th match (where var(expr) evaluates to a number).
+ This may be used with tt(${)...tt(/)...tt(}) or
+ tt(${)...tt(//)...tt(}) substitution:  in the first case, only the
+ var(expr)th match is substituted, while in the second case,  all
+ matches from the var(expr)th on are substituted.
  )
  item(tt(M))(
  Include the matched portion in the result.
*** Src/glob.c.pmfl	Wed Dec  9 11:25:29 1998
--- Src/glob.c	Wed Dec  9 17:53:36 1998
***************
*** 1806,1846 ****
  /* do the ${foo%%bar}, ${foo#bar} stuff */
  /* please do not laugh at this code. */
  
  /* Having found a match in getmatch, decide what part of string
   * to return.  The matched part starts b characters into string s
   * and finishes e characters in: 0 <= b <= e <= strlen(s)
   * (yes, empty matches should work).
!  * Bits 3 and higher in fl are used: the flags are
!  *   8:		Result is matched portion.
!  *  16:		Result is unmatched portion.
!  *		(N.B. this should be set for standard ${foo#bar} etc. matches.)
!  *  32:		Result is numeric position of start of matched portion.
!  *  64:		Result is numeric position of end of matched portion.
!  * 128:		Result is length of matched portion.
   */
  
  /**/
  static char *
! get_match_ret(char *s, int b, int e, int fl)
  {
      char buf[80], *r, *p, *rr;
      int ll = 0, l = strlen(s), bl = 0, t = 0, i;
  
!     if (fl & 8)			/* matched portion */
  	ll += 1 + (e - b);
!     if (fl & 16)		/* unmatched portion */
  	ll += 1 + (l - (e - b));
!     if (fl & 32) {
  	/* position of start of matched portion */
  	sprintf(buf, "%d ", b + 1);
  	ll += (bl = strlen(buf));
      }
!     if (fl & 64) {
  	/* position of end of matched portion */
  	sprintf(buf + bl, "%d ", e + 1);
  	ll += (bl = strlen(buf));
      }
!     if (fl & 128) {
  	/* length of matched portion */
  	sprintf(buf + bl, "%d ", e - b);
  	ll += (bl = strlen(buf));
--- 1806,1867 ----
  /* do the ${foo%%bar}, ${foo#bar} stuff */
  /* please do not laugh at this code. */
  
+ struct repldata {
+     int b, e;			/* beginning and end of chunk to replace */
+ };
+ typedef struct repldata *Repldata;
+ 
+ /* 
+  * List of bits of matches to concatenate with replacement string.
+  * The data is a struct repldata.  It is not used in cases like
+  * ${...//#foo/bar} even though SUB_GLOBAL is set, since the match
+  * is anchored.  It goes on the heap.
+  */
+ 
+ static LinkList repllist;
+ 
  /* Having found a match in getmatch, decide what part of string
   * to return.  The matched part starts b characters into string s
   * and finishes e characters in: 0 <= b <= e <= strlen(s)
   * (yes, empty matches should work).
!  * fl is a set of the SUB_* matches defined in zsh.h from SUB_MATCH onwards;
!  * the lower parts are ignored.
!  * replstr is the replacement string for a substitution
   */
  
  /**/
  static char *
! get_match_ret(char *s, int b, int e, int fl, char *replstr)
  {
      char buf[80], *r, *p, *rr;
      int ll = 0, l = strlen(s), bl = 0, t = 0, i;
  
!     if (replstr) {
! 	if ((fl & SUB_GLOBAL) && repllist) {
! 	    /* We are replacing the chunk, just add this to the list */
! 	    Repldata rd = (Repldata) halloc(sizeof(*rd));
! 	    rd->b = b;
! 	    rd->e = e;
! 	    addlinknode(repllist, rd);
! 	    return s;
! 	}
! 	ll += strlen(replstr);
!     }
!     if (fl & SUB_MATCH)			/* matched portion */
  	ll += 1 + (e - b);
!     if (fl & SUB_REST)		/* unmatched portion */
  	ll += 1 + (l - (e - b));
!     if (fl & SUB_BIND) {
  	/* position of start of matched portion */
  	sprintf(buf, "%d ", b + 1);
  	ll += (bl = strlen(buf));
      }
!     if (fl & SUB_EIND) {
  	/* position of end of matched portion */
  	sprintf(buf + bl, "%d ", e + 1);
  	ll += (bl = strlen(buf));
      }
!     if (fl & SUB_LEN) {
  	/* length of matched portion */
  	sprintf(buf + bl, "%d ", e - b);
  	ll += (bl = strlen(buf));
***************
*** 1850,1862 ****
  
      rr = r = (char *)ncalloc(ll);
  
!     if (fl & 8) {
  	/* copy matched portion to new buffer */
  	for (i = b, p = s + b; i < e; i++)
  	    *rr++ = *p++;
  	t = 1;
      }
!     if (fl & 16) {
  	/* Copy unmatched portion to buffer.  If both portions *
  	 * requested, put a space in between (why?)            */
  	if (t)
--- 1871,1883 ----
  
      rr = r = (char *)ncalloc(ll);
  
!     if (fl & SUB_MATCH) {
  	/* copy matched portion to new buffer */
  	for (i = b, p = s + b; i < e; i++)
  	    *rr++ = *p++;
  	t = 1;
      }
!     if (fl & SUB_REST) {
  	/* Copy unmatched portion to buffer.  If both portions *
  	 * requested, put a space in between (why?)            */
  	if (t)
***************
*** 1864,1869 ****
--- 1885,1893 ----
  	/* there may be unmatched bits at both beginning and end of string */
  	for (i = 0, p = s; i < b; i++)
  	    *rr++ = *p++;
+ 	if (replstr)
+ 	    for (p = replstr; *p; )
+ 		*rr++ = *p++;
  	for (i = e, p = s + e; i < l; i++)
  	    *rr++ = *p++;
  	t = 1;
***************
*** 1879,1920 ****
      return r;
  }
  
! /* It is called from paramsubst to get the match for ${foo#bar} etc.
!  * Bits of fl determines the required action:
!  *   bit 0: match the end instead of the beginning (% or %%)
!  *   bit 1: % or # was doubled so get the longest match
!  *   bit 2: substring match
!  *   bit 3: include the matched portion
!  *   bit 4: include the unmatched portion
!  *   bit 5: the index of the beginning
!  *   bit 6: the index of the end
!  *   bit 7: the length of the match
!  *   bit 8: match the complete string
   * *sp points to the string we have to modify. The n'th match will be
   * returned in *sp. ncalloc is used to get memory for the result string.
   */
  
  /**/
  int
! getmatch(char **sp, char *pat, int fl, int n)
  {
      Comp c;
!     char *s = *sp, *t, sav;
!     int i, j, l = strlen(*sp);
  
      c = parsereg(pat);
      if (!c) {
  	zerr("bad pattern: %s", pat, 0);
  	return 1;
      }
!     if (fl & 256) {
  	i = domatch(s, c, 0);
! 	*sp = get_match_ret(*sp, 0, domatch(s, c, 0) ? l : 0, fl);
! 	if (! **sp && (((fl & 8) && !i) || ((fl & 16) && i)))
  	    return 0;
  	return 1;
      }
!     switch (fl & 7) {
      case 0:
  	/* Smallest possible match at head of string:    *
  	 * start adding characters until we get a match. */
--- 1903,1940 ----
      return r;
  }
  
! /*
!  * This is called from paramsubst to get the match for ${foo#bar} etc.
!  * fl is a set of the SUB_* flags defined in zsh.h
   * *sp points to the string we have to modify. The n'th match will be
   * returned in *sp. ncalloc is used to get memory for the result string.
+  * replstr is the replacement string from a ${.../orig/repl}, in
+  * which case pat is the original.
   */
  
  /**/
  int
! getmatch(char **sp, char *pat, int fl, int n, char *replstr)
  {
      Comp c;
!     char *s = *sp, *t, *start, sav;
!     int i, j, l = strlen(*sp), lleft, matched;
  
+     MUSTUSEHEAP("getmatch");	/* presumably covered by prefork() test */
+     repllist = NULL;
      c = parsereg(pat);
      if (!c) {
  	zerr("bad pattern: %s", pat, 0);
  	return 1;
      }
!     if (fl & SUB_ALL) {
  	i = domatch(s, c, 0);
! 	*sp = get_match_ret(*sp, 0, i ? l : 0, fl, i ? replstr : 0);
! 	if (! **sp && (((fl & SUB_MATCH) && !i) || ((fl & SUB_REST) && i)))
  	    return 0;
  	return 1;
      }
!     switch (fl & (SUB_END|SUB_LONG|SUB_SUBSTR)) {
      case 0:
  	/* Smallest possible match at head of string:    *
  	 * start adding characters until we get a match. */
***************
*** 1923,1929 ****
  	    *t = '\0';
  	    if (domatch(s, c, 0) && !--n) {
  		*t = sav;
! 		*sp = get_match_ret(*sp, 0, i, fl);
  		return 1;
  	    }
  	    if ((*t = sav) == Meta)
--- 1943,1949 ----
  	    *t = '\0';
  	    if (domatch(s, c, 0) && !--n) {
  		*t = sav;
! 		*sp = get_match_ret(*sp, 0, i, fl, replstr);
  		return 1;
  	    }
  	    if ((*t = sav) == Meta)
***************
*** 1931,1942 ****
  	}
  	break;
  
!     case 1:
  	/* Smallest possible match at tail of string:  *
  	 * move back down string until we get a match. */
  	for (t = s + l; t >= s; t--) {
  	    if (domatch(t, c, 0) && !--n) {
! 		*sp = get_match_ret(*sp, t - s, l, fl);
  		return 1;
  	    }
  	    if (t > s+1 && t[-2] == Meta)
--- 1951,1962 ----
  	}
  	break;
  
!     case SUB_END:
  	/* Smallest possible match at tail of string:  *
  	 * move back down string until we get a match. */
  	for (t = s + l; t >= s; t--) {
  	    if (domatch(t, c, 0) && !--n) {
! 		*sp = get_match_ret(*sp, t - s, l, fl, replstr);
  		return 1;
  	    }
  	    if (t > s+1 && t[-2] == Meta)
***************
*** 1944,1950 ****
  	}
  	break;
  
!     case 2:
  	/* Largest possible match at head of string:        *
  	 * delete characters from end until we get a match. */
  	for (t = s + l; t > s; t--) {
--- 1964,1970 ----
  	}
  	break;
  
!     case SUB_LONG:
  	/* Largest possible match at head of string:        *
  	 * delete characters from end until we get a match. */
  	for (t = s + l; t > s; t--) {
***************
*** 1952,1958 ****
  	    *t = '\0';
  	    if (domatch(s, c, 0) && !--n) {
  		*t = sav;
! 		*sp = get_match_ret(*sp, 0, t - s, fl);
  		return 1;
  	    }
  	    *t = sav;
--- 1972,1978 ----
  	    *t = '\0';
  	    if (domatch(s, c, 0) && !--n) {
  		*t = sav;
! 		*sp = get_match_ret(*sp, 0, t - s, fl, replstr);
  		return 1;
  	    }
  	    *t = sav;
***************
*** 1961,1972 ****
  	}
  	break;
  
!     case 3:
  	/* Largest possible match at tail of string:       *
  	 * move forward along string until we get a match. */
  	for (i = 0, t = s; i < l; i++, t++) {
  	    if (domatch(t, c, 0) && !--n) {
! 		*sp = get_match_ret(*sp, i, l, fl);
  		return 1;
  	    }
  	    if (*t == Meta)
--- 1981,1992 ----
  	}
  	break;
  
!     case (SUB_END|SUB_LONG):
  	/* Largest possible match at tail of string:       *
  	 * move forward along string until we get a match. */
  	for (i = 0, t = s; i < l; i++, t++) {
  	    if (domatch(t, c, 0) && !--n) {
! 		*sp = get_match_ret(*sp, i, l, fl, replstr);
  		return 1;
  	    }
  	    if (*t == Meta)
***************
*** 1974,1983 ****
  	}
  	break;
  
!     case 4:
  	/* Smallest at start, but matching substrings. */
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, 0, 0, fl);
  	    return 1;
  	}
  	for (i = 1; i <= l; i++) {
--- 1994,2003 ----
  	}
  	break;
  
!     case SUB_SUBSTR:
  	/* Smallest at start, but matching substrings. */
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, 0, 0, fl, replstr);
  	    return 1;
  	}
  	for (i = 1; i <= l; i++) {
***************
*** 1986,1992 ****
  		s[j] = '\0';
  		if (domatch(t, c, 0) && !--n) {
  		    s[j] = sav;
! 		    *sp = get_match_ret(*sp, t - s, j, fl);
  		    return 1;
  		}
  		if ((s[j] = sav) == Meta)
--- 2006,2012 ----
  		s[j] = '\0';
  		if (domatch(t, c, 0) && !--n) {
  		    s[j] = sav;
! 		    *sp = get_match_ret(*sp, t - s, j, fl, replstr);
  		    return 1;
  		}
  		if ((s[j] = sav) == Meta)
***************
*** 1999,2008 ****
  	}
  	break;
  
!     case 5:
  	/* Smallest at end, matching substrings */
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, l, l, fl);
  	    return 1;
  	}
  	for (i = l; i--;) {
--- 2019,2028 ----
  	}
  	break;
  
!     case (SUB_END|SUB_SUBSTR):
  	/* Smallest at end, matching substrings */
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, l, l, fl, replstr);
  	    return 1;
  	}
  	for (i = l; i--;) {
***************
*** 2013,2019 ****
  		*t = '\0';
  		if (domatch(s + j, c, 0) && !--n) {
  		    *t = sav;
! 		    *sp = get_match_ret(*sp, j, t - s, fl);
  		    return 1;
  		}
  		*t = sav;
--- 2033,2039 ----
  		*t = '\0';
  		if (domatch(s + j, c, 0) && !--n) {
  		    *t = sav;
! 		    *sp = get_match_ret(*sp, j, t - s, fl, replstr);
  		    return 1;
  		}
  		*t = sav;
***************
*** 2025,2056 ****
  	}
  	break;
  
!     case 6:
  	/* Largest at start, matching substrings. */
! 	for (i = l; i; i--) {
! 	    for (t = s, j = i; j <= l; j++, t++) {
! 		sav = s[j];
! 		s[j] = '\0';
! 		if (domatch(t, c, 0) && !--n) {
! 		    s[j] = sav;
! 		    *sp = get_match_ret(*sp, t - s, j, fl);
! 		    return 1;
  		}
! 		if ((s[j] = sav) == Meta)
! 		    j++;
! 		if (*t == Meta)
! 		    t++;
  	    }
! 	    if (i >= 2 && s[i-2] == Meta)
! 		i--;
! 	}
! 	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, 0, 0, fl);
  	    return 1;
  	}
  	break;
  
!     case 7:
  	/* Largest at end, matching substrings. */
  	for (i = 0; i < l; i++) {
  	    for (t = s + l, j = i; j >= 0; j--, t--) {
--- 2045,2098 ----
  	}
  	break;
  
!     case (SUB_LONG|SUB_SUBSTR):
  	/* Largest at start, matching substrings. */
! 	start = s;
! 	lleft = l;
! 	if (fl & SUB_GLOBAL)
! 	    repllist = newlinklist();
! 	do {
! 	    /* loop over all matches for global substitution */
! 	    matched = 0;
! 	    for (i = lleft; i; i--) {
! 		for (t = start, j = i; j <= lleft; j++, t++) {
! 		    sav = start[j];
! 		    start[j] = '\0';
! 		    if (domatch(t, c, 0) &&
! 			(!--n || ((fl & SUB_GLOBAL) && n <= 0))) {
! 			start[j] = sav;
! 			*sp = get_match_ret(*sp, t - s, j + (start-s), fl,
! 					    replstr);
! 			if (!(fl & SUB_GLOBAL))
! 			    return 1;
! 			matched = j;
! 			start += j;
! 			lleft -= j;
! 			break;
! 		    }
! 		    if ((start[j] = sav) == Meta)
! 			j++;
! 		    if (*t == Meta)
! 			t++;
  		}
! 		if (matched)
! 		    break;
! 		if (i >= 2 && s[i-2] == Meta)
! 		    i--;
  	    }
! 	} while (matched);
! 	/*
! 	 * check if we can match a blank string, if so do it
! 	 * at the start.  Goodness knows if this is a good idea
! 	 * with global substitution, so it doesn't happen.
! 	 */
! 	if (!(fl & SUB_GLOBAL) && domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, 0, 0, fl, replstr);
  	    return 1;
  	}
  	break;
  
!     case (SUB_END|SUB_LONG|SUB_SUBSTR):
  	/* Largest at end, matching substrings. */
  	for (i = 0; i < l; i++) {
  	    for (t = s + l, j = i; j >= 0; j--, t--) {
***************
*** 2058,2064 ****
  		*t = '\0';
  		if (domatch(s + j, c, 0) && !--n) {
  		    *t = sav;
! 		    *sp = get_match_ret(*sp, j, t - s, fl);
  		    return 1;
  		}
  		*t = sav;
--- 2100,2106 ----
  		*t = '\0';
  		if (domatch(s + j, c, 0) && !--n) {
  		    *t = sav;
! 		    *sp = get_match_ret(*sp, j, t - s, fl, replstr);
  		    return 1;
  		}
  		*t = sav;
***************
*** 2071,2083 ****
  		i++;
  	}
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, l, l, fl);
  	    return 1;
  	}
  	break;
      }
!     /* munge the whole string */
!     *sp = get_match_ret(*sp, 0, 0, fl);
      return 1;
  }
  
--- 2113,2158 ----
  		i++;
  	}
  	if (domatch(s + l, c, 0) && !--n) {
! 	    *sp = get_match_ret(*sp, l, l, fl, replstr);
  	    return 1;
  	}
  	break;
      }
! 
!     if (repllist && nonempty(repllist)) {
! 	/* Put all the bits of a global search and replace together. */
! 	LinkNode nd;
! 	Repldata rd;
! 	int rlen;
! 
! 	lleft = 0;		/* size of returned string */
! 	i = 0;			/* start of last chunk we got from *sp */
! 	rlen = strlen(replstr);
! 	for (nd = firstnode(repllist); nd; incnode(nd)) {
! 	    rd = (Repldata) getdata(nd);
! 	    lleft += rd->b - i; /* previous chunk of *sp */
! 	    lleft += rlen;	/* the replaced bit */
! 	    i = rd->e;		/* start of next chunk of *sp */
! 	}
! 	lleft += l - i;	/* final chunk from *sp */
! 	start = t = halloc(lleft+1);
! 	i = 0;
! 	for (nd = firstnode(repllist); nd; incnode(nd)) {
! 	    rd = (Repldata) getdata(nd);
! 	    memcpy(t, s + i, rd->b - i);
! 	    t += rd->b - i;
! 	    memcpy(t, replstr, rlen);
! 	    t += rlen;
! 	    i = rd->e;
! 	}
! 	memcpy(t, s + i, l - i);
! 	start[lleft] = '\0';
! 	*sp = start;
! 	return 1;
!     }
! 
!     /* munge the whole string: no match, so no replstr */
!     *sp = get_match_ret(*sp, 0, 0, fl, 0);
      return 1;
  }
  
*** Src/subst.c.pmfl	Wed Dec  9 11:25:29 1998
--- Src/subst.c	Wed Dec  9 17:18:19 1998
***************
*** 99,105 ****
      char *str  = str3;
  
      while (!errflag && *str) {
! 	if ((qt = *str == Qstring) || *str == String)
  	    if (str[1] == Inpar) {
  		str++;
  		goto comsub;
--- 99,105 ----
      char *str  = str3;
  
      while (!errflag && *str) {
! 	if ((qt = *str == Qstring) || *str == String) {
  	    if (str[1] == Inpar) {
  		str++;
  		goto comsub;
***************
*** 125,131 ****
  		str3 = (char *)getdata(node);
  		continue;
  	    }
! 	else if ((qt = *str == Qtick) || *str == Tick)
  	  comsub: {
  	    LinkList pl;
  	    char *s, *str2 = str;
--- 125,131 ----
  		str3 = (char *)getdata(node);
  		continue;
  	    }
! 	} else if ((qt = *str == Qtick) || *str == Tick)
  	  comsub: {
  	    LinkList pl;
  	    char *s, *str2 = str;
***************
*** 135,142 ****
  	    if (*str == Inpar) {
  		endchar = Outpar;
  		str[-1] = '\0';
  		if (skipparens(Inpar, Outpar, &str))
! 		    DPUTS(1, "BUG: parse error in command substitution");
  		str--;
  	    } else {
  		endchar = *str;
--- 135,146 ----
  	    if (*str == Inpar) {
  		endchar = Outpar;
  		str[-1] = '\0';
+ #ifdef DEBUG
  		if (skipparens(Inpar, Outpar, &str))
! 		    dputs("BUG: parse error in command substitution");
! #else
! 		skipparens(Inpar, Outpar, &str);
! #endif
  		str--;
  	    } else {
  		endchar = *str;
***************
*** 298,304 ****
      if (!assign)
  	return;
  
!     if (assign < 3)
  	if ((*namptr)[1] && (sub = strchr(*namptr + 1, Equals))) {
  	    if (assign == 1)
  		for (ptr = *namptr; ptr != sub; ptr++)
--- 302,308 ----
      if (!assign)
  	return;
  
!     if (assign < 3) {
  	if ((*namptr)[1] && (sub = strchr(*namptr + 1, Equals))) {
  	    if (assign == 1)
  		for (ptr = *namptr; ptr != sub; ptr++)
***************
*** 311,316 ****
--- 315,321 ----
  	    }
  	} else
  	    return;
+     }
  
      ptr = *namptr;
      while ((sub = strchr(ptr, ':'))) {
***************
*** 691,697 ****
      char *aptr = *str;
      char *s = aptr, *fstr, *idbeg, *idend, *ostr = (char *) getdata(n);
      int colf;			/* != 0 means we found a colon after the name */
-     int doub = 0;		/* != 0 means we have %%, not %, or ##, not # */
      int isarr = 0;
      int plan9 = isset(RCEXPANDPARAM);
      int globsubst = isset(GLOBSUBST);
--- 696,701 ----
***************
*** 705,715 ****
      Value v;
      int flags = 0;
      int flnum = 0;
-     int substr = 0;
      int sortit = 0, casind = 0;
      int casmod = 0;
      char *sep = NULL, *spsep = NULL;
      char *premul = NULL, *postmul = NULL, *preone = NULL, *postone = NULL;
      long prenum = 0, postnum = 0;
      int copied = 0;
      int arrasg = 0;
--- 709,719 ----
      Value v;
      int flags = 0;
      int flnum = 0;
      int sortit = 0, casind = 0;
      int casmod = 0;
      char *sep = NULL, *spsep = NULL;
      char *premul = NULL, *postmul = NULL, *preone = NULL, *postone = NULL;
+     char *replstr = NULL;	/* replacement string for /orig/repl */
      long prenum = 0, postnum = 0;
      int copied = 0;
      int arrasg = 0;
***************
*** 764,785 ****
  		    nojoin = 1;
  		    break;
  		case 'M':
! 		    flags |= 8;
  		    break;
  		case 'R':
! 		    flags |= 16;
  		    break;
  		case 'B':
! 		    flags |= 32;
  		    break;
  		case 'E':
! 		    flags |= 64;
  		    break;
  		case 'N':
! 		    flags |= 128;
  		    break;
  		case 'S':
! 		    substr = 1;
  		    break;
  		case 'I':
  		    flnum = get_intarg(&s);
--- 768,789 ----
  		    nojoin = 1;
  		    break;
  		case 'M':
! 		    flags |= SUB_MATCH;
  		    break;
  		case 'R':
! 		    flags |= SUB_REST;
  		    break;
  		case 'B':
! 		    flags |= SUB_BIND;
  		    break;
  		case 'E':
! 		    flags |= SUB_EIND;
  		    break;
  		case 'N':
! 		    flags |= SUB_LEN;
  		    break;
  		case 'S':
! 		    flags |= SUB_SUBSTR;
  		    break;
  		case 'I':
  		    flnum = get_intarg(&s);
***************
*** 940,946 ****
  		s++;
  	    } else
  		globsubst = 1;
! 	} else if (*s == '+')
  	    if (iident(s[1]))
  		chkset = 1, s++;
  	    else if (!inbrace) {
--- 944,950 ----
  		s++;
  	    } else
  		globsubst = 1;
! 	} else if (*s == '+') {
  	    if (iident(s[1]))
  		chkset = 1, s++;
  	    else if (!inbrace) {
***************
*** 951,957 ****
  		zerr("bad substitution", NULL, 0);
  		return NULL;
  	    }
! 	else
  	    break;
      }
      globsubst = globsubst && !qt;
--- 955,961 ----
  		zerr("bad substitution", NULL, 0);
  		return NULL;
  	    }
! 	} else
  	    break;
      }
      globsubst = globsubst && !qt;
***************
*** 1124,1146 ****
  		    *s == '=' || *s == Equals ||
  		    *s == '%' ||
  		    *s == '#' || *s == Pound ||
! 		    *s == '?' || *s == Quest)) {
  
  	if (!flnum)
  	    flnum++;
  	if (*s == '%')
! 	    flags |= 1;
  
  	/* Check for ${..%%..} or ${..##..} */
  	if ((*s == '%' || *s == '#' || *s == Pound) && *s == s[1]) {
  	    s++;
! 	    doub = 1;
  	}
  	s++;
  
! 	flags |= (doub << 1) | (substr << 2) | (colf << 8);
! 	if (!(flags & 0xf8))
! 	    flags |= 16;
  
  	if (colf && !vunset)
  	    vunset = (isarr) ? !*aval : !*val || (*val == Nularg && !val[1]);
--- 1128,1195 ----
  		    *s == '=' || *s == Equals ||
  		    *s == '%' ||
  		    *s == '#' || *s == Pound ||
! 		    *s == '?' || *s == Quest ||
! 		    *s == '/')) {
  
  	if (!flnum)
  	    flnum++;
  	if (*s == '%')
! 	    flags |= SUB_END;
  
  	/* Check for ${..%%..} or ${..##..} */
  	if ((*s == '%' || *s == '#' || *s == Pound) && *s == s[1]) {
  	    s++;
! 	    /* we have %%, not %, or ##, not # */
! 	    flags |= SUB_LONG;
  	}
  	s++;
+ 	if (s[-1] == '/') {
+ 	    char *ptr;
+ 	    /* previous flags are irrelevant: we always want longest match */
+ 	    flags = SUB_LONG;
+ 	    if (*s == '/') {
+ 		/* doubled, so replace all occurrences */
+ 		flags |= SUB_GLOBAL;
+ 		s++;
+ 	    }
+ 	    /* Check for anchored substitution */
+ 	    if (*s == '%') {
+ 		/* anchor at tail */
+ 		flags |= SUB_END;
+ 		s++;
+ 	    } else if (*s == '#' || *s == Pound) {
+ 		/* anchor at head: this is the `normal' case in getmatch */
+ 		s++;
+ 	    } else
+ 		flags |= SUB_SUBSTR;
+ 	    /*
+ 	     * Find the / marking the end of the search pattern.
+ 	     * If there isn't one, we're just going to delete that,
+ 	     * i.e. replace it with an empty string.
+ 	     *
+ 	     * This allows quotation of the slash with '\\/'. Why
+ 	     * two?  Well, for a non-quoted string we can check for
+ 	     * Bnull+/, which is what you get from `\/', but inside
+ 	     * double quotes the Bnull isn't there, so it's not
+ 	     * consistent.
+ 	     */
+ 	    for (ptr = s; *ptr && *ptr != '/'; ptr++)
+ 		if (*ptr == '\\' && ptr[1] == '/')
+ 		    chuck(ptr);
+ 	    replstr = (*ptr && ptr[1]) ? ptr+1 : "";
+ 	    untokenize(replstr);
+ 	    *ptr = '\0';
+ 	}
  
! 	if (colf)
! 	    flags |= SUB_ALL;
! 	/*
! 	 * With no special flags, i.e. just a # or % or whatever,
! 	 * the matched portion is removed and we keep the rest.
! 	 * We also want the rest when we're doing a substitution.
! 	 */
! 	if (!(flags & (SUB_MATCH|SUB_REST|SUB_BIND|SUB_EIND|SUB_LEN)))
! 	    flags |= SUB_REST;
  
  	if (colf && !vunset)
  	    vunset = (isarr) ? !*aval : !*val || (*val == Nularg && !val[1]);
***************
*** 1234,1239 ****
--- 1283,1289 ----
  	case '%':
  	case '#':
  	case Pound:
+ 	case '/':
  	    if (qt)
  		if (parse_subst_string(s)) {
  		    zerr("parse error in ${...%c...} substitution",
***************
*** 1247,1260 ****
  		char **pp = aval = (char **)ncalloc(sizeof(char *) * (arrlen(aval) + 1));
  
  		while ((*pp = *ap++)) {
! 		    if (getmatch(pp, s, flags, flnum))
  			pp++;
  		}
  		copied = 1;
  	    } else {
  		if (vunset)
  		    val = dupstring("");
! 		getmatch(&val, s, flags, flnum);
  		copied = 1;
  	    }
  	    break;
--- 1297,1310 ----
  		char **pp = aval = (char **)ncalloc(sizeof(char *) * (arrlen(aval) + 1));
  
  		while ((*pp = *ap++)) {
! 		    if (getmatch(pp, s, flags, flnum, replstr))
  			pp++;
  		}
  		copied = 1;
  	    } else {
  		if (vunset)
  		    val = dupstring("");
! 		getmatch(&val, s, flags, flnum, replstr);
  		copied = 1;
  	    }
  	    break;
*** Src/zsh.h.pmfl	Wed Dec  9 11:25:29 1998
--- Src/zsh.h	Wed Dec  9 14:57:39 1998
***************
*** 890,895 ****
--- 892,914 ----
  #define PM_DONTIMPORT	(1<<12)	/* do not import this variable                */
  #define PM_RESTRICTED	(1<<13) /* cannot be changed in restricted mode       */
  #define PM_UNSET	(1<<14)	/* has null value                             */
+ 
+ /*
+  * Flags for doing matches inside parameter substitutions, i.e.
+  * ${...#...} and friends.  This could be an enum, but so
+  * could a lot of other things.
+  */
+ 
+ #define SUB_END		0x0001	/* match end instead of beginning, % or %%  */
+ #define SUB_LONG	0x0002	/* % or # doubled, get longest match */
+ #define SUB_SUBSTR	0x0004	/* match a substring */
+ #define SUB_MATCH	0x0008	/* include the matched portion */
+ #define SUB_REST	0x0010	/* include the unmatched portion */
+ #define SUB_BIND	0x0020	/* index of beginning of string */
+ #define SUB_EIND	0x0040	/* index of end of string */
+ #define SUB_LEN		0x0080	/* length of match */
+ #define SUB_ALL		0x0100	/* match complete string */
+ #define SUB_GLOBAL	0x0200	/* global substitution ${..//all/these} */
  
  /* node for named directory hash table (nameddirtab) */
  
  
-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy
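
The flag selection that the Src/subst.c hunk above adds for
${name/pat/repl} can be sketched on its own.  This is only an
illustration under stated assumptions: subst_flags is a hypothetical
helper, not a zsh function; the SUB_* values are copied from the
Src/zsh.h hunk; and the tokenized Pound character and the colon
(SUB_ALL) case are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values copied from the Src/zsh.h hunk in the patch. */
#define SUB_END    0x0001  /* anchor the match at the end */
#define SUB_LONG   0x0002  /* take the longest match */
#define SUB_SUBSTR 0x0004  /* match anywhere in the string */
#define SUB_REST   0x0010  /* keep the unmatched portion */
#define SUB_GLOBAL 0x0200  /* replace every occurrence */

/*
 * Hypothetical helper, not part of zsh: s points just past the first
 * '/' of ${name/...}.  Returns the flags the patch would end up with
 * and stores the start of the pattern proper in *pat.
 */
static int subst_flags(const char *s, const char **pat)
{
    int flags = SUB_LONG;       /* substitution always wants longest match */

    assert(s != NULL && pat != NULL);
    if (*s == '/') {            /* ${name//...}: replace all occurrences */
        flags |= SUB_GLOBAL;
        s++;
    }
    if (*s == '%') {            /* anchored at the tail */
        flags |= SUB_END;
        s++;
    } else if (*s == '#') {     /* anchored at the head: no extra flag */
        s++;
    } else {
        flags |= SUB_SUBSTR;    /* unanchored: search for a substring */
    }
    *pat = s;
    /* None of the M/R/B/E/N flags survive the reset above, so the
     * patch always adds SUB_REST for a substitution. */
    return flags | SUB_REST;
}
```

With this sketch, the text after the first slash in ${foo//#a/b}, that
is "/#a/b", gives SUB_LONG|SUB_GLOBAL|SUB_REST, while plain "a/b" gives
SUB_LONG|SUB_SUBSTR|SUB_REST.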



* PATCH: Docs out of sync
  1998-12-09  9:01 ` Peter Stephenson
  1998-12-09 17:04   ` PATCH: 3.1.5: bash ${.../old/new} Peter Stephenson
@ 1998-12-09 19:43   ` Phil Pennock
  1998-12-12  7:45     ` Bart Schaefer
  1 sibling, 1 reply; 7+ messages in thread
From: Phil Pennock @ 1998-12-09 19:43 UTC (permalink / raw)
  To: zsh-workers

Typing away merrily, Peter Stephenson produced the immortal words:
> Phil Pennock wrote:
> > * bash has arrays.  'declare', 'local' & 'readonly' each accept '-a' to
> >   declare an array.  Is it reasonable to add '-a' to 'typeset'?  This
> >   would automatically duplicate the bash-ism.
> 
> It's not exactly essential, since bash and zsh have rather different
> extensions to sh in any case.  They're only similar in as much as they
> both include sh.  It's hard enough keeping ksh emulation working.

Looking through the source for 3.1.5-patched, bin_typeset et al accept
-a.  This is undocumented.  This functionality is not in 3.1.5.
With no arguments beyond the options, "typeset -a" scans the parameter
table for arrays.  "typeset -A" scans for hashes.  An added bonus!  ;^)
Except that "typeset -a foo" is silently ignored.  An unhandled case
later on.

Scanning back, Bart added this functionality in patch 4608 (Nov 12).
It was documented in the article, but no docs patch.  The options to
'local' were also modified.

Docs patch included at end of this article.

> >   Further, would it be an idea to then deprecate 'set -A' which
> >   overloads parameter setting onto 'set'?
> 
> That's needed for ksh.

Fair enough.  But adding 'setting' to typeset would round it out.

> If you mean `remove the extension if and only if it's .ext', then you
> can do ${${var:t}%.ext}.

Ah.  *DOH!*  Thanks.  My zsh-based javawrapper has now been suitably
fixed.  :^)
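
For comparison, the effect of ${${var:t}%.ext} can be spelled out in C.
This is a minimal sketch only: tail_without_ext is a hypothetical name,
not anything in zsh, and it treats the extension as a literal string
rather than a pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper mimicking ${${var:t}%.ext}: take the part of
 * path after the last '/', then drop ext only if the tail ends with it.
 * The result is written to out (outlen bytes) and out is returned.
 */
static char *tail_without_ext(const char *path, const char *ext,
                              char *out, size_t outlen)
{
    const char *slash, *tail;
    size_t tlen, elen;

    assert(path != NULL && ext != NULL && out != NULL && outlen > 0);
    slash = strrchr(path, '/');
    tail = slash ? slash + 1 : path;    /* like the :t modifier */
    tlen = strlen(tail);
    elen = strlen(ext);
    /* like %ext: strip the suffix only when it really is a suffix */
    if (elen <= tlen && strcmp(tail + tlen - elen, ext) == 0)
        tlen -= elen;
    snprintf(out, outlen, "%.*s", (int)tlen, tail);
    return out;
}
```

Called with "/usr/lib/Foo.ext" and ".ext" this yields "Foo"; the awkward
name ".ext.ex" comes back unchanged, because ".ext" is not its suffix.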

> I think it's too close to the history modifiers and we'd better stick
> with the subscript notation, unless we're aiming at a bash
> compatibility mode which is really going a bit far.

That's about what I thought -- too close for normal use.
Note though that there are already a number of option-aliases to match
bash.  How about a bash-compatibility which includes the sh stuff,
disables a new option, say, 'PARAM_HISTMODS' and perhaps one day sets
'BASH_SUBSTRINGS'?  A simpler idea would be to just note the possibility
for now and see what happens to bash in the future.

> It worries me slightly that there are people out there who don't know
> the difference between bash and sh --- which is their problem, but one
> day they may start inflicting it on other people.

Most of them seem to be writing systems scripts for Linux.  Come zsh 3.2
I am planning on trying a very careful experiment -- how much falls
apart with /bin/sh being zsh.

Patch stuff
-----------
The zsh-development-guide specifies context-diffs.  Is it okay to use
unified context diffs in future?

*** dDoc/Zsh/builtins.yo	Tue Nov 10 09:10:01 1998
--- Doc/Zsh/builtins.yo	Wed Dec  9 19:36:34 1998
***************
*** 512,518 ****
  endsitem()
  )
  findex(local)
! item(tt(local) [ {tt(PLUS())|tt(-)}tt(LRZilrtu) [var(n)]] [ var(name)[tt(=)var(value)] ] ...)(
  Same as tt(typeset), except that the options tt(-x) and
  tt(-f) are not permitted.
  )
--- 512,518 ----
  endsitem()
  )
  findex(local)
! item(tt(local) [ {tt(PLUS())|tt(-)}tt(ALRUZailrtu) [var(n)]] [ var(name)[tt(=)var(value)] ] ...)(
  Same as tt(typeset), except that the options tt(-x) and
  tt(-f) are not permitted.
  )
***************
*** 878,884 ****
  findex(typeset)
  cindex(parameters, setting)
  cindex(parameters, declaring)
! item(tt(typeset) [ {tt(PLUS())|tt(-)}tt(LRUZfilrtuxm) [var(n)]] [ var(name)[tt(=)var(value)] ... ])(
  Set attributes and values for shell parameters.
  When invoked inside a function a new parameter is created which will be
  unset when the function completes.  The new parameter will not be
--- 878,884 ----
  findex(typeset)
  cindex(parameters, setting)
  cindex(parameters, declaring)
! item(tt(typeset) [ {tt(PLUS())|tt(-)}tt(ALRUZafilrtuxm) [var(n)]] [ var(name)[tt(=)var(value)] ... ])(
  Set attributes and values for shell parameters.
  When invoked inside a function a new parameter is created which will be
  unset when the function completes.  The new parameter will not be
***************
*** 887,892 ****
--- 887,895 ----
  The following attributes are valid:
  
  startitem()
+ item(tt(-A))(
+ Declare var(name) to be an em(A)ssociation parameter (also known as a hash).
+ )
  item(tt(-L))(
  Left justify and remove leading blanks from var(value).
  If var(n) is nonzero, it defines the width of the field;
***************
*** 915,920 ****
--- 918,927 ----
  If var(n) is nonzero it defines the width of the field;
  otherwise it is determined by the width of the value of the
  first assignment.
+ )
+ item(tt(-a))(
+ On its own, this option produces a list of all array parameters.
+ If any non-options are provided, the tt(typeset) command is silently ignored.
  )
  item(tt(-f))(
  The names refer to functions rather than parameters.  No assignments

-- 
--> Phil Pennock ; GAT d- s+:+ a22 C++(++++) UL++++/I+++/S+++/H+ P++@ L+++
E-@ W(+) N>++ o !K w--- O>+ M V !PS PE Y+ PGP+ t-- 5++ X+ R !tv b++>+++ DI+ D+
G+ e+ h* r y?



* Strange substring search behaviour
  1998-12-09 17:04   ` PATCH: 3.1.5: bash ${.../old/new} Peter Stephenson
@ 1998-12-10 15:52     ` Peter Stephenson
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Stephenson @ 1998-12-10 15:52 UTC (permalink / raw)
  To: zsh-workers

Peter Stephenson wrote:
> In fact, the internals are pretty much all there to be able to replace
> the shortest match instead of the longest match for the pattern.  The
> only thing missing is the syntax.

I decided on a syntax:  S for shortest substring; the substring flag
is not used for substitutions otherwise.

However, I discovered an ambiguity I wasn't aware of.  The form
${(S)foo#bar} is supposed to find substrings in $foo, using the
shortest match (## would give the longest match).  But look at this
(the M flag means print the portion actually matched rather than the
string with that portion deleted; it doesn't affect what actually matches):

% foo="twinkle twinkle little star"
% print ${(M)foo#t*e}                # shortest match of t*e at head
twinkle                              # so far so good
% print ${(MS)foo#t*e}               # same but look for substrings
tle

This surprised me.  I would have expected it to start from the head,
and look for the shortest string that matches there, and carry on down
the string looking for the shortest match from any position.  Instead
it looks for the shortest *possible* match *anywhere*.  Maybe I should
have guessed?  It makes it difficult for shortest-match substitution,
since that has to start from the beginning and go down the string
(i.e., I wanted ${(S)foo//t*e/spy} to print `spy spy lispy star' and
this posting came about because it didn't).
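
The left-to-right behaviour wanted here can be sketched in C.  This is
an illustration only, not the getmatch code: glob_match handles just
`*' and literal characters, and all three function names are
hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Minimal matcher: '*' matches any run of characters, anything else
 * matches itself.  Tests whether pat matches exactly s[0..n). */
static int glob_match(const char *pat, const char *s, size_t n)
{
    if (*pat == '\0')
        return n == 0;
    if (*pat == '*')
        return glob_match(pat + 1, s, n) ||
               (n > 0 && glob_match(pat, s + 1, n - 1));
    return n > 0 && *pat == *s && glob_match(pat + 1, s + 1, n - 1);
}

/* Length of the shortest non-empty match of pat starting at s, or 0. */
static size_t shortest_at(const char *pat, const char *s)
{
    size_t len, max = strlen(s);

    for (len = 1; len <= max; len++)
        if (glob_match(pat, s, len))
            return len;
    return 0;
}

/* Walk the string once, replacing the shortest match found at each
 * position and resuming the scan just after the replaced text. */
static char *subst_shortest(const char *pat, const char *repl,
                            const char *s, char *out, size_t outlen)
{
    size_t used = 0, rlen = strlen(repl);

    assert(out != NULL && outlen > 0);
    while (*s) {
        size_t len = shortest_at(pat, s);

        if (len) {
            assert(used + rlen < outlen);   /* caller sized the buffer */
            memcpy(out + used, repl, rlen);
            used += rlen;
            s += len;
        } else {
            assert(used + 1 < outlen);
            out[used++] = *s++;
        }
    }
    out[used] = '\0';
    return out;
}
```

Run on "twinkle twinkle little star" with pattern "t*e" and replacement
"spy", this sketch produces "spy spy lispy star".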

Furthermore, this makes it a little strange when used with the I.n. flag,
which tells you to use the n'th match.

% print ${(MSI.1.)foo#t*e} 
tle                                  # first match: shortest
% print ${(MSI.2.)foo#t*e} 
ttle                                 # second match: second shortest
% print ${(MSI.3.)foo#t*e} 
twinkle                              # first occurrence of third shortest
% print ${(MSI.4.)foo#t*e} 
twinkle                              # the other twinkle
% print ${(MSI.5.)foo#t*e} 
twinkle little                       # all rather interesting...
% print ${(MSI.6.)foo#t*e} 
twinkle twinkle                      # ...in its own way...
% print ${(MSI.7.)foo#t*e} 
twinkle twinkle little               # ...but is it right?
                                     # (in fact, that's the *longest* match).

I would have expected `twinkle', `twinkle', `ttle' and `tle' (the last
has already gone by then if you're doing a global substitution so
doesn't get replaced), i.e. the shortest matches from each position in
order of finding.
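
The two orderings can be put side by side in a short C sketch.  It is
an illustration under assumptions, not the zsh implementation: the
matcher only understands `*' and literal characters, and the function
names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Minimal matcher: '*' matches any run of characters, anything else
 * matches itself.  Tests whether pat matches exactly s[0..n). */
static int glob_match(const char *pat, const char *s, size_t n)
{
    if (*pat == '\0')
        return n == 0;
    if (*pat == '*')
        return glob_match(pat + 1, s, n) ||
               (n > 0 && glob_match(pat, s + 1, n - 1));
    return n > 0 && *pat == *s && glob_match(pat + 1, s + 1, n - 1);
}

/* Length of the shortest non-empty match of pat starting at s, or 0. */
static size_t shortest_at(const char *pat, const char *s)
{
    size_t len, max = strlen(s);

    for (len = 1; len <= max; len++)
        if (glob_match(pat, s, len))
            return len;
    return 0;
}

/* n'th match (counting from 1) in order of position: the shortest
 * match at each starting point, scanning from the head. */
static const char *nth_positional(const char *pat, const char *s,
                                  int n, size_t *len)
{
    assert(n >= 1 && len != NULL);
    for (; *s; s++) {
        size_t l = shortest_at(pat, s);

        if (l && --n == 0) {
            *len = l;
            return s;
        }
    }
    return NULL;
}

/* The shortest match found anywhere in the string; on a tie the
 * leftmost one wins. */
static const char *globally_shortest(const char *pat, const char *s,
                                     size_t *len)
{
    const char *best = NULL;

    assert(len != NULL);
    for (; *s; s++) {
        size_t l = shortest_at(pat, s);

        if (l && (best == NULL || l < *len)) {
            best = s;
            *len = l;
        }
    }
    return best;
}
```

For "twinkle twinkle little star" and "t*e", nth_positional gives
`twinkle', `twinkle', `ttle' and `tle' for n = 1 to 4 and NULL after
that, whereas globally_shortest goes straight to `tle'.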

I'd quite like to rewrite the whole thing the way my original
inclinations told me.  Any comments?  In other words, does anyone
think they or anyone else is expecting to find the globally shortest
match first?  Should I ask for a vote on zsh-users?

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy



* Re: PATCH: Docs out of sync
  1998-12-09 19:43   ` PATCH: Docs out of sync Phil Pennock
@ 1998-12-12  7:45     ` Bart Schaefer
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Schaefer @ 1998-12-12  7:45 UTC (permalink / raw)
  To: Phil Pennock, zsh-workers

On Dec 9,  7:43pm, Phil Pennock wrote:
} Subject: PATCH: Docs out of sync
}
} Typing away merrily, Peter Stephenson produced the immortal words:
} > Phil Pennock wrote:
} > > * bash has arrays.  'declare', 'local' & 'readonly' each accept '-a' to
} > >   declare an array.  Is it reasonable to add '-a' to 'typeset'?  This
} > >   would automatically duplicate the bash-ism.
} > 
} > It's not exactly essential, since bash and zsh have rather different
} > extensions to sh in any case.  They're only similar in as much as they
} > both include sh.  It's hard enough keeping ksh emulation working.
} 
} Looking through the source for 3.1.5-patched, bin_typeset et al accept
} -a.  This is undocumented.  This functionality is not in 3.1.5.
} With no arguments beyond the options, "typeset -a" scans the parameter
} table for arrays.  "typeset -A" scans for hashes.  An added bonus!  ;^)

It was actually typeset -A that I was adding, and -a was the bonus.

} Except that "typeset -a foo" is silently ignored.  An unhandled case
} later on.

It's an unhandled case because the code (from prior to my patch) has an
explicit 	on &= ~PM_ARRAY;	in the loop that creates the
parameters when they don't already exist.  I think that's there because
the syntax	typeset array=(x y z)	doesn't make it through the
parser, but I don't know for sure so I didn't mess with it.
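
To make the effect of that statement concrete, here is a tiny sketch.
The flag values and the function name are hypothetical, chosen for
illustration only; the real definitions are in Src/zsh.h and the real
loop is in bin_typeset.

```c
#include <assert.h>

/* Illustrative values only; see Src/zsh.h for the real ones. */
#define PM_ARRAY   (1 << 0)
#define PM_INTEGER (1 << 1)

/* Hypothetical stand-in for the loop that creates parameters which
 * don't already exist: whatever the caller asked for, the array bit
 * is masked off before the parameter is created. */
static int flags_for_new_param(int on)
{
    assert(on >= 0);
    on &= ~PM_ARRAY;    /* the statement quoted above */
    return on;
}
```

So a request carrying only PM_ARRAY comes out as 0, i.e. a plain scalar,
and any other requested attribute passes through untouched.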

Anyway, it's not correct to say that `typeset -a foo` is ignored -- it
DOES create the parameter foo, but it creates it as a scalar, not as an
array, which is exactly what used to happen with `typeset foo=(x y z)`
before (x y z) started being interpreted as a glob modifier.  (Now it
says "unknown file attribute" and fails entirely.)

} Scanning back, Bart added this functionality in patch 4608 (Nov 12).
} It was documented in the article, but no docs patch.  The options to
} 'local' were also modified.

I didn't give a doc patch because none of this stuff is stable yet; the
syntax may change, etc.  (Though `typeset -A` is unlikely to change, as
it's ksh compatible.)

} Docs patch included at end of this article.

Thanks, but ...

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com



* Re:  Strange substring search behaviour
@ 1998-12-11  8:07 Sven Wischnowsky
  0 siblings, 0 replies; 7+ messages in thread
From: Sven Wischnowsky @ 1998-12-11  8:07 UTC (permalink / raw)
  To: zsh-workers


Peter Stephenson wrote:

> I'd quite like to rewrite the whole thing the way my original
> inclinations told me.  Any comments?  In other words, does anyone
> think they or anyone else is expecting to find the globally shortest
> match first?  Should I ask for a vote on zsh-users?

I think changing it would be a good idea (even though I guess it was
me who wrote the code with the strange behavior -- on the other hand,
it just does what it was told ;-).

Bye
 Sven


--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de

