From: Ralph Corderoy <ralph@inputplus.co.uk>
To: TUHS <tuhs@tuhs.org>
Subject: [TUHS] Testing an RE recogniser exhaustively. (Was: A fuzzy awk)
Date: Mon, 27 May 2024 10:39:09 +0100 [thread overview]
Message-ID: <20240527093909.91CAD21F18@orac.inputplus.co.uk> (raw)
In-Reply-To: <CAKH6PiU8t3HoTXkeY3MFV02Sy6p4M1k2KChcbbga37_SX5pYHg@mail.gmail.com>
Hi,
Doug wrote:
> The trick: From recursive equations (easily derived from the grammar
> of REs), I counted how many REs exist up to various limits on token
> counts, Then I generated all strings that satisfied those limits,
> turned the recognizer loose on them and counted how many it accepted.
Which reminded me of Doug's paper.
Enumerating the strings of regular languages,
J. Functional Programming 14 (2004) 503-518
Haskell code is developed for two ways to list the strings of the
language defined by a regular expression: directly by set operations
and indirectly by converting to and simulating an equivalent
automaton. The exercise illustrates techniques for dealing with
infinite ordered domains and leads to an effective standard form for
nondeterministic finite automata.
PDF preprint: https://www.cs.dartmouth.edu/~doug/nfa.pdf
It's also nice for the NFA construction with one state per symbol plus
one final state, and no epsilon transitions. Doug writes:
The even-a language (ab*a|b)* is defined by automaton h, with three
start states.
h0 = State 0 ’~’ []
h1 = State 1 ’b’ [h4,h1,h0]
h2 = State 2 ’a’ [h4,h1,h0]
h3 = State 3 ’b’ [h2,h3]
h4 = State 4 ’a’ [h2,h3]
h = [h4,h1,h0]
The symbols replaced by their state numbers gives (43*2|1)*; state 0 is
the sole final state.
--
Cheers, Ralph.
next prev parent reply other threads:[~2024-05-27 9:39 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-23 13:49 [TUHS] Re: A fuzzy awk Douglas McIlroy
2024-05-23 20:52 ` Rob Pike
2024-05-24 5:41 ` andrew
2024-05-24 7:17 ` Ralph Corderoy
2024-05-24 7:41 ` Rob Pike
2024-05-24 10:00 ` [TUHS] Is fuzz testing random? (Was: A fuzzy awk) Ralph Corderoy
2024-05-24 11:56 ` [TUHS] Re: A fuzzy awk Dan Halbert
2024-05-25 0:17 ` Bakul Shah via TUHS
2024-05-25 0:57 ` G. Branden Robinson
2024-05-25 13:56 ` David Arnold
2024-05-25 17:18 ` Paul Winalski
2024-05-25 17:36 ` Tom Perrine
2024-05-25 17:53 ` [TUHS] Prof Don Good [was " Charles H Sauer (he/him)
2024-05-27 9:39 ` Ralph Corderoy [this message]
2024-05-27 13:03 ` [TUHS] Re: Testing an RE recogniser exhaustively. (Was: A fuzzy awk) Hellwig Geisse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240527093909.91CAD21F18@orac.inputplus.co.uk \
--to=ralph@inputplus.co.uk \
--cc=tuhs@tuhs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).