zsh-workers
From: Peter Stephenson <pws@ibmth.df.unipi.it>
To: zsh-workers@math.gatech.edu
Subject: Strange substring search behaviour
Date: Thu, 10 Dec 1998 16:52:52 +0100
Message-ID: <9812101552.AA30992@ibmth.df.unipi.it>
In-Reply-To: "Peter Stephenson"'s message of "Wed, 09 Dec 1998 18:04:27 NFT." <9812091704.AA42751@ibmth.df.unipi.it>

Peter Stephenson wrote:
> In fact, the internals are pretty much all there to be able to replace
> the shortest match instead of the longest match for the pattern.  The
> only thing missing is the syntax.

I decided on a syntax:  S for shortest substring; the substring flag
is not used for substitutions otherwise.

However, I discovered an ambiguity I wasn't aware of.  The form
${(S)foo#bar} is supposed to find substrings in $foo, using the
shortest match (## would give the longest match).  But look at what
happens below (the M flag means print the portion actually matched
rather than the string with that portion deleted; it doesn't affect
what actually matches):

% foo="twinkle twinkle little star"
% print ${(M)foo#t*e}                # shortest match of t*e at head
twinkle                              # so far so good
% print ${(MS)foo#t*e}               # same but look for substrings
tle
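
For comparison, here is a minimal Python sketch of the head-anchored
case only (no S flag), which is the part that behaves as expected.  It
is a model for illustration, not zsh's implementation: head_match is a
hypothetical helper, and Python's fnmatch stands in for zsh pattern
matching on the assumption that `*' matches any run of characters.

```python
from fnmatch import fnmatchcase

def head_match(s, pattern, longest=False):
    """Return the shortest (or longest) prefix of s matching pattern, or None."""
    lengths = range(len(s), -1, -1) if longest else range(len(s) + 1)
    for n in lengths:
        if fnmatchcase(s[:n], pattern):
            return s[:n]
    return None

foo = "twinkle twinkle little star"

matched = head_match(foo, "t*e")             # models ${(M)foo#t*e}
print(matched)                               # twinkle
print(repr(foo[len(matched):]))              # models ${foo#t*e}: ' twinkle little star'
print(head_match(foo, "t*e", longest=True))  # models ${(M)foo##t*e}: twinkle twinkle little
```

The M flag corresponds to returning the matched prefix itself; without
it you get what is left after removing that prefix.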

This surprised me.  I would have expected it to start from the head,
and look for the shortest string that matches there, and carry on down
the string looking for the shortest match from any position.  Instead
it looks for the shortest *possible* match *anywhere*.  Maybe I should
have guessed?  It makes it difficult for shortest-match substitution,
since that has to start from the beginning and go down the string
(i.e., I wanted ${(S)foo//t*e/spy} to print `spy spy lispy star' and
this posting came about because it didn't).
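
To make the substitution I had in mind concrete, here is a rough
Python sketch of it: scan from the left, and at each position replace
the shortest match that starts there.  The function names are
hypothetical, and fnmatch is only an assumed stand-in for zsh
patterns; this shows the intended semantics, not existing code.

```python
from fnmatch import fnmatchcase

def shortest_at(s, start, pattern):
    """End index of the shortest non-empty match starting at start, or None."""
    for end in range(start + 1, len(s) + 1):
        if fnmatchcase(s[start:end], pattern):
            return end
    return None

def replace_leftmost_shortest(s, pattern, repl):
    """Replace, left to right, the shortest match found at each position."""
    out, i = [], 0
    while i < len(s):
        end = shortest_at(s, i, pattern)
        if end is None:
            out.append(s[i])   # no match starts here: copy one character
            i += 1
        else:
            out.append(repl)   # replace the match and resume after it
            i = end
    return "".join(out)

print(replace_leftmost_shortest("twinkle twinkle little star", "t*e", "spy"))
# spy spy lispy star
```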

Furthermore, this makes it a little strange when used with the I.n. flag,
which tells you to use the n'th match.

% print ${(MSI.1.)foo#t*e} 
tle                                  # first match: shortest
% print ${(MSI.2.)foo#t*e} 
ttle                                 # second match: second shortest
% print ${(MSI.3.)foo#t*e} 
twinkle                              # first occurrence of third shortest
% print ${(MSI.4.)foo#t*e} 
twinkle                              # the other twinkle
% print ${(MSI.5.)foo#t*e} 
twinkle little                       # all rather interesting...
% print ${(MSI.6.)foo#t*e} 
twinkle twinkle                      # ...in its own way...
% print ${(MSI.7.)foo#t*e} 
twinkle twinkle little               # ...but is it right?
                                     # (in fact, that's the *longest* match).
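
The ordering above is what you get if you collect every substring
that matches and sort by length, with ties broken by position.  The
Python sketch below reproduces the seven results on that assumption.
It is only a model that happens to agree with the output shown
(matches_by_length is a hypothetical name, and fnmatch stands in for
zsh patterns); I am not claiming the shell does it this way internally.

```python
from fnmatch import fnmatchcase

def matches_by_length(s, pattern):
    """All matching substrings, ordered by length, then by start position."""
    found = [(end - start, start, s[start:end])
             for start in range(len(s))
             for end in range(start + 1, len(s) + 1)
             if fnmatchcase(s[start:end], pattern)]
    return [text for _, _, text in sorted(found)]

foo = "twinkle twinkle little star"
for n, text in enumerate(matches_by_length(foo, "t*e"), 1):
    print(n, text)
# 1 tle
# 2 ttle
# 3 twinkle
# 4 twinkle
# 5 twinkle little
# 6 twinkle twinkle
# 7 twinkle twinkle little
```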

I would have expected `twinkle', `twinkle', `ttle' and `tle', i.e. the
shortest matches from each position in order of finding.  (In a global
substitution the last of these has already been consumed as part of
`ttle' by then, so it doesn't get replaced.)
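
The order I would have expected can be sketched in Python as follows:
walk the string from the head and, at each position where a match can
start, take the shortest one.  As before this is only an illustration
of the proposed behaviour, with a hypothetical function name and
fnmatch assumed as a stand-in for zsh patterns.

```python
from fnmatch import fnmatchcase

def shortest_from_each_position(s, pattern):
    """Shortest match starting at each position, in left-to-right order."""
    result = []
    for start in range(len(s)):
        for end in range(start + 1, len(s) + 1):
            if fnmatchcase(s[start:end], pattern):
                result.append(s[start:end])
                break  # keep only the shortest match for this start
    return result

print(shortest_from_each_position("twinkle twinkle little star", "t*e"))
# ['twinkle', 'twinkle', 'ttle', 'tle']
```

With the I.n. flag, n would then simply index into this list.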

I'd quite like to rewrite the whole thing the way my original
inclinations told me.  Any comments?  In other words, does anyone
think they or anyone else is expecting to find the globally shortest
match first?  Should I ask for a vote on zsh-users?

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy

