From: Rob Pike <robpike@gmail.com>
To: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: TUHS main list <tuhs@tuhs.org>
Subject: [TUHS] Re: A fuzzy awk
Date: Fri, 24 May 2024 06:52:35 +1000 [thread overview]
Message-ID: <CAKzdPgyi5pt6XyJtBAqzmh6KdsFssmvXE75_Ukd_0-eHiw8PCQ@mail.gmail.com> (raw)
In-Reply-To: <CAKH6PiU8t3HoTXkeY3MFV02Sy6p4M1k2KChcbbga37_SX5pYHg@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1551 bytes --]
The semantic distinction is important but the end result is very similar.
"Fuzzing" as it is now called (for no reason I can intuit) tries to get to
the troublesome cases faster by a sort of depth-first search, but
exhaustive will always beat it for value. Our exhaustive tester for bitblt,
first done by John Reiser if I remember right, set the stage for my own
thinking about how you properly test something.
-rob
On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <
douglas.mcilroy@dartmouth.edu> wrote:
> > Doug McIlroy was generating random regular expressions
>
> Actually not. I exhaustively (within limits) tested an RE recognizer
> without knowingly generating any RE either mechanically or by hand.
>
> The trick: From recursive equations (easily derived from the grammar of
> REs), I counted how many REs exist up to various limits on token counts,
> Then I generated all strings that satisfied those limits, turned the
> recognizer loose on them and counted how many it accepted. Any disagreement
> of counts revealed the existence (but not any symptom) of bugs.
>
> Unlike most diagnostic techniques, this scheme produces a certificate of
> (very high odds on) correctness over a representative subdomain. The
> scheme also agnostically checks behavior on bad inputs as well as good. It
> does not, however, provide a stress test of a recognizer's capacity limits. And
> its exponential nature limits its applicability to rather small domains.
> (REs have only 5 distinct kinds of token.)
>
> Doug
>
[-- Attachment #2: Type: text/html, Size: 2792 bytes --]
next prev parent reply other threads:[~2024-05-23 20:52 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-23 13:49 Douglas McIlroy
2024-05-23 20:52 ` Rob Pike [this message]
2024-05-24 5:41 ` andrew
2024-05-24 7:17 ` Ralph Corderoy
2024-05-24 7:41 ` Rob Pike
2024-05-24 10:00 ` [TUHS] Is fuzz testing random? (Was: A fuzzy awk) Ralph Corderoy
2024-05-24 11:56 ` [TUHS] Re: A fuzzy awk Dan Halbert
2024-05-25 0:17 ` Bakul Shah via TUHS
2024-05-25 0:57 ` G. Branden Robinson
2024-05-25 13:56 ` David Arnold
2024-05-25 17:18 ` Paul Winalski
2024-05-25 17:36 ` Tom Perrine
2024-05-25 17:53 ` [TUHS] Prof Don Good [was " Charles H Sauer (he/him)
2024-05-27 9:39 ` [TUHS] Testing an RE recogniser exhaustively. (Was: A fuzzy awk) Ralph Corderoy
2024-05-27 13:03 ` [TUHS] " Hellwig Geisse
-- strict thread matches above, loose matches on Subject: below --
2024-05-20 14:09 [TUHS] Re: A fuzzy awk Serissa
2024-05-21 1:56 ` Rob Pike
2024-05-21 2:47 ` Larry McVoy
2024-05-21 2:54 ` Lawrence Stewart
2024-05-21 3:36 ` Rob Pike
2024-05-21 11:59 ` Peter Weinberger (温博格) via TUHS
2024-05-21 3:53 ` George Michaelson
2024-05-21 16:59 ` Paul Winalski
2024-05-21 17:56 ` segaloco via TUHS
2024-05-21 18:12 ` Luther Johnson
2024-05-22 15:37 ` Paul Winalski
2024-05-22 18:49 ` Larry McVoy
2024-05-22 20:17 ` Larry McVoy
2024-05-22 3:26 ` Dave Horsfall
2024-05-22 5:08 ` Alexis
2024-05-22 13:12 ` Warner Losh
2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
2024-05-20 13:25 ` [TUHS] " Chet Ramey
2024-05-20 13:41 ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 14:26 ` Chet Ramey
2024-05-22 13:44 ` arnold
2024-05-20 13:54 ` Ralph Corderoy
2024-05-19 23:08 [TUHS] The 'usage: ...' message. (Was: On Bloat...) Douglas McIlroy
2024-05-20 0:58 ` [TUHS] " Rob Pike
2024-05-20 3:19 ` arnold
2024-05-20 9:20 ` [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Ralph Corderoy
2024-05-20 13:10 ` [TUHS] " Chet Ramey
2024-05-20 13:30 ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 13:48 ` Chet Ramey
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAKzdPgyi5pt6XyJtBAqzmh6KdsFssmvXE75_Ukd_0-eHiw8PCQ@mail.gmail.com \
--to=robpike@gmail.com \
--cc=douglas.mcilroy@dartmouth.edu \
--cc=tuhs@tuhs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).