ntg-context - mailing list for ConTeXt users
 help / color / mirror / Atom feed
From: Mojca Miklavec <mojca.miklavec.lists@gmail.com>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: Off-topic: Struggles with LPEG grammar
Date: Mon, 21 Dec 2020 14:08:11 +0100	[thread overview]
Message-ID: <CALBOmsYx45L+SQLbFz3gW6MxMQa4mu-gJWbdL2iMXWbRkpA7mw@mail.gmail.com> (raw)
In-Reply-To: <364BD99F-0BC4-4CAB-84AD-AED668455854@elvenkind.com>

Dear Taco,

On Mon, 21 Dec 2020 at 13:46, Taco Hoekwater wrote:
> > On 21 Dec 2020, at 13:16, Mojca Miklavec wrote:
> >
> > My only explanation would be that perhaps "^1" is so greedy that the
> > rest of the pattern doesn't get found. But I don't want to believe
> > that explanation.
>
> Which (of course) means that that is exactly what happens ;)
>
> The ones that match are
>
> ababbb (a (ba+bb) b) => r4 r1(r3(r5 r4) r2(r5 r5)) r5
> abbbab (a (bb+ba) b) => r4 r1(r2(r5 r5) r3(r5 r4)) r5
>
> With the ^1, in the “bb” cases the first “b” eats all three “b”s:
>
> ababbb fails the r5 at the end
>
> abbbab fails the first r2 already (since the second r5 therein never happens)

Is this a deliberate choice, a limitation of the grammar
expressiveness, some misuse on my side that could/should/needs to be
implemented in a different way, or does it count as a "bug" on the
lpeg side?

For example, I wouldn't expect a regexp "b+b" to fail on "bbb" just
because "b+" would eat all three "b"s at once (the regexp "b+b" in
fact finds "bbb", and I would expect a less-than-totally-greedy hit
with lpeg as well). Or is my reasoning wrong here?

It certainly works if I use
    lpeg.P('b') + lpeg.P('bb') + lpeg.P('bbb') -- and a couple more
(as long as I can predict the maximum length)
but that's not really a viable workaround in general.

Thank you,
    Mojca

PS: sorry, a tiny bug also crippled into my sample code. The line
after matching the 'parser1' should have used 'total1' rather than
'total':
    if lpeg.match(parser1, s) then
        total1 = total1 + 1
    end
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : http://contextgarden.net
___________________________________________________________________________________

  parent reply	other threads:[~2020-12-21 13:08 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-21 12:16 Mojca Miklavec
2020-12-21 12:46 ` Taco Hoekwater
2020-12-21 12:50   ` Taco Hoekwater
2020-12-21 13:08   ` Mojca Miklavec [this message]
2020-12-21 13:36     ` Taco Hoekwater
2020-12-21 13:47       ` luigi scarso
2020-12-21 14:10       ` Mojca Miklavec
2020-12-21 14:18         ` Taco Hoekwater
2020-12-21 14:44         ` Hans Hagen
2020-12-21 13:38     ` Arthur Reutenauer
2020-12-21 13:52 ` Hans Hagen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALBOmsYx45L+SQLbFz3gW6MxMQa4mu-gJWbdL2iMXWbRkpA7mw@mail.gmail.com \
    --to=mojca.miklavec.lists@gmail.com \
    --cc=ntg-context@ntg.nl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).