From: Bart Schaefer <schaefer@brasslantern.com>
To: Peter Stephenson <p.w.stephenson@ntlworld.com>
Cc: Zsh hackers list <zsh-workers@zsh.org>
Subject: Re: Extending regexes
Date: Mon, 4 Jul 2022 12:15:47 -0700
Message-ID: <CAH+w=7bey3h8XzyxXcYiOXx3yzYk50cjm5t4yt5kGNWB-6XJdw@mail.gmail.com>
In-Reply-To: <76883431.1281129.1656942459330@mail2.virginmedia.com>

On Mon, Jul 4, 2022 at 6:53 AM Peter Stephenson
<p.w.stephenson@ntlworld.com> wrote:
> > On 04 July 2022 at 13:03 Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> > Zsh has extensions to regular regexes - the ~ and ^ negations.
>
> You're quite right both that they're very useful in zsh and there's nothing
> like this in normal regular expressions, but unfortunately I've got a strong
> feeling this is a big can of worms [hope that image is graphic enough that
> I don't need to explain the phrase for non-native English speakers].

In particular, these no longer fit the formal definition of "regular".

PWS, correct me if I go too far astray, but (^Y) is internally (*~Y)
and (X~Y) is implemented by first matching (X) and then removing
anything that matches (Y) ... which is where the regular-ness goes
astray.  My formal training on this is more than a little rusty, but I
believe this means chaining together two finite-state machines rather
than building a single one.
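
A rough sketch of that two-stage idea, in Python rather than zsh: one
matcher accepts candidates for X, and a second, separate matcher then
rejects anything that also matches Y. This only illustrates the
semantics described above; it is not how zsh implements it. The helper
names match_except and match_not are made up for the example, and
fnmatch-style patterns stand in for zsh globs.

```python
from fnmatch import fnmatchcase


def match_except(name, x, y):
    """Sketch of (X~Y): accept if X matches, then reject if Y also matches.

    Two independent pattern matches are chained; no single combined
    pattern is built.
    """
    return fnmatchcase(name, x) and not fnmatchcase(name, y)


def match_not(name, y):
    """Sketch of (^Y), treated as (*~Y): anything except what Y matches."""
    return match_except(name, "*", y)


names = ["main.c", "test_main.c", "README"]

# Analogue of *.c~test_* : C files that are not test files.
kept = [n for n in names if match_except(n, "*.c", "test_*")]
print(kept)

# Analogue of ^*.c : everything that is not a C file.
others = [n for n in names if match_not(n, "*.c")]
print(others)
```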

On Mon, Jul 4, 2022 at 5:06 AM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> I think that regexes look pretty limited from this point of view and that pcre extensions went wrong path with the look forward and behind semantics.

Note that of course "pcre" stands for "perl-compatible RE" so you can
find the justifications for look-{ahead,behind} in the history of perl
development.  Again, a long time ago, but my recollection is that the
reason "lookaround assertions" are zero-width elements is to preserve
the finite-state semantics.  Please take that with 30 years' worth of
salt grains (a less self-explanatory idiom than Peter's, I fear).
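
To make "zero-width" concrete, here is a small sketch using Python's re
module, which is not PCRE but accepts the same (?=...) and (?!...)
lookahead syntax. The assertion has to succeed at the current position,
but it consumes no input, so the match ends before the text it looked
at. The sample strings are arbitrary.

```python
import re

# Positive lookahead: "foo" only if followed by "bar".
ahead = re.match(r"foo(?=bar)", "foobar")
print(ahead.group(0), ahead.end())

# Without the assertion, "bar" is consumed as part of the match.
plain = re.match(r"foobar", "foobar")
print(plain.group(0), plain.end())

# Negative lookahead: "foo" only if NOT followed by "bar".
rejected = re.match(r"foo(?!bar)", "foobar")
accepted = re.match(r"foo(?!bar)", "foobaz")
print(rejected, accepted.group(0))
```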
