zsh-workers
From: Peter Stephenson <pws@ifh.de>
To: zsh-workers@sunsite.auc.dk (Zsh hackers list)
Subject: Re: strange glob expansion
Date: Mon, 06 Sep 1999 16:08:39 +0200
Message-ID: <199909061408.QAA353048@hydra.ifh.de>
In-Reply-To: "Sven Wischnowsky"'s message of "Mon, 06 Sep 1999 14:43:23 MET DST." <199909061243.OAA02544@beta.informatik.hu-berlin.de>

Sven Wischnowsky wrote:
> Peter Stephenson wrote:
> 
> > By the way, does anyone want a globbing flag to turn on extended glob?
> > e.g. (#x)foo and (#X)foo would compile pattern foo with extended glob
> > on or off.
> 
> That would then be an exception in that it is recognised even if
> extendedglob is not set, right?

Yes, that's ugly.  There's no real reason why you'd use parentheses
without extendedglob for such a pattern, but the special case is not
nice.

> Hm, dunno. May make things hard to read if you are used to using
> extendedglob, I think. Although it sounds like something that may be
> useful, although I can't come up with an example.

I was thinking of the command line for people who don't usually use
extendedglob, but that's probably not a very likely usage.  Setting
extendedglob locally in functions is certainly to be preferred,
although sometimes you want to keep the user's setting while having
your own extendedglob patterns (as in completion).

> We had this discussion about allowing back-references. Personally I
> don't think that just storing the matched portions in some special
> array is the best way -- what if I'm really interested in the indexes
> (beginning/end of matched part)?
> 
> So how about a set of globbing flags that turn on collecting
> back-references, say what information we want and give the name of an
> array where that information is stored? Something like `(#mparts)...'
> to store the matched parts in an array named `parts', `(#pbegs:ends)...'
> to store the begin-positions in an array `begs' and the end-positions
> in an array `ends'. Or something like that.

Hmm, I was hoping to keep the globbing flag syntax compact and easy to
read.  We could have the array names fixed, to $match, $beginning and
$end for example, but use the letters (#mbe) to decide which would be
set, maybe also (#MBE) for setting MATCH etc. to the whole matched
string for use in substrings.  The arrays wouldn't have to be special,
though, so you could make them local.  This sort of arrangement
(although using functions) seems to work OK in emacs.  Can you think
of examples where using the same variable names for different matches
would cause a problem?
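
A rough sketch of that arrangement, written in Python with Python
regular expressions standing in for zsh patterns.  The helper name
match_with_flags, the dictionary it returns and the 0-based offsets
with exclusive ends are all assumptions of the sketch, not a proposed
implementation; only the names match, beginning and end and the flag
letters m, b and e come from the suggestion above.

```python
import re


def match_with_flags(pattern, text, flags="mbe"):
    """Match `pattern` against all of `text`, filling only the requested arrays.

    'm' fills "match" (the text of each group), 'b' fills "beginning"
    (start offsets) and 'e' fills "end" (end offsets).  Hypothetical helper:
    offsets follow Python conventions (0-based, end exclusive).
    """
    m = re.fullmatch(pattern, text)
    if m is None:
        return None
    groups = range(1, m.re.groups + 1)
    result = {}
    if "m" in flags:
        result["match"] = [m.group(i) for i in groups]
    if "b" in flags:
        result["beginning"] = [m.start(i) for i in groups]
    if "e" in flags:
        result["end"] = [m.end(i) for i in groups]
    return result


if __name__ == "__main__":
    # Only the matched parts and their start offsets are collected here.
    print(match_with_flags(r"(foo)(bar)", "foobar", "mb"))
```

Because the results always land under the same fixed names, a second
match overwrites the first, which is the case the question above is
asking about.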

For substitutions in ${.../../..${match[1]}..} this would mean
delaying singsub() of the replacement string, and with global
substitution re-substituting the replacement string on each loop.  I
think that's OK, although it could make quite a difference when
substituting on arrays.  That was one advantage of using some special
syntax like \1 for backreferences: it's easy to detect when they're
being used.
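
To illustrate the cost, here is a minimal sketch in Python, again with
Python regular expressions standing in for zsh patterns.  The names
substitute_all, make_replacement and uses_backrefs are hypothetical.
The first function rebuilds the replacement once per match, as
delayed substitution would require; the second shows how cheaply a
\1-style syntax can be detected in advance.

```python
import re


def substitute_all(pattern, text, make_replacement):
    """Replace every match, rebuilding the replacement for each one.

    `make_replacement` receives a list whose element 0 is the whole match
    and whose elements 1.. are the groups.  Returns the new text and the
    number of times the replacement had to be rebuilt.
    """
    calls = 0

    def repl(m):
        nonlocal calls
        calls += 1
        return make_replacement([m.group(0)] + list(m.groups()))

    return re.sub(pattern, repl, text), calls


def uses_backrefs(replacement):
    """Return True if the replacement contains a \\1 .. \\9 style reference."""
    return re.search(r"\\[1-9]", replacement) is not None


if __name__ == "__main__":
    # Swap the two captured characters in every match.
    print(substitute_all(r"(\w)(\d)", "a1 b2", lambda match: match[2] + match[1]))
    print(uses_backrefs(r"x\1y"), uses_backrefs("plain"))
```

The call counter makes the point about arrays concrete: the
replacement is re-evaluated once per match, so the work grows with
the number of matches rather than being done once up front.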

> I don't know enough about the matching code to know if this `turn-on-
> back-references-only-when-needed' is possible and easy enough to
> implement and if the effect on the normal, non-collecting processing
> is small enough (in terms of execution speed).

There is some overhead with backreferences active (each `(' and `)'
uses another subroutine call), but the speed overhead is essentially
nil when not using backreferencing.  The fact that the size of the
struct changes when using backreferences can be avoided, too: that was
inherited from the Spencer code, but in our case we don't need to
store the extracted matches inside the struct.  It looks like I can
even do it with the normal grouping effect of globbing flags, i.e. you
can turn it on and off in the middle of a pattern, still with no
significant overhead (so if you're not using it, the only difference
is the odd extra test for the flags and zeroing things).

I suppose it's potentially worth having backreferencing even with file
globs, for use with the `e' glob qualifier.  That means carrying the
parenthesis count across the entire filename path.

By the way, I'm not planning on removing the limit of 9
backreferences.  I think that ought to be sufficient.

-- 
Peter Stephenson <pws@ifh.de>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56100 Pisa, Italy


