[TUHS] Re: A fuzzy awk - Bakul Shah via TUHS

The Unix Heritage Society mailing list
 help / color / mirror / Atom feed

From: Bakul Shah via TUHS <tuhs@tuhs.org>
To: Rob Pike <robpike@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>,
	The Unix Heritage Society mailing list <tuhs@tuhs.org>
Subject: [TUHS] Re: A fuzzy awk
Date: Fri, 24 May 2024 17:17:53 -0700	[thread overview]
Message-ID: <1E156D9B-C841-4693-BEA4-34C4BF42BCD5@iitbombay.org> (raw)
In-Reply-To: <CAKzdPgyi5pt6XyJtBAqzmh6KdsFssmvXE75_Ukd_0-eHiw8PCQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3131 bytes --]

What would be nice if programming languages provided some support for such exhaustive testing[1].

At one point I had suggested turning Go's Interface type to something like Guttag style abstract data types in that relevant axioms are specified right in the interface definition. The idea was that any concrete type that implements that interface must satisfy its axioms. Even if the compiler ignored these axioms, one can write a support program that can generate a set of comprehensive tests based on these axioms. [Right now a type "implementing" an interface only needs to have a set of methods that exactly match the interface methods but nothing more] The underlying idea is that each type is in essence a constraint on what values an instance of that type can take. So adding such axioms simply tightens (& documents) these constraints. Just the process of coming up with such axioms can improve the design (sor of like test driven design but better!).

Now it may be that applying this to anything more complex than stacks won't work well & it won't be perfect but I thought this was worth experimenting with. This would be like functional testing of all the nuts and bolts and components that go in an airplane. The airplane may still fall apart but that would be a "composition" error!

[1] There are "proof assisant" or formal spec languages such as TLA+, Coq, Isabelle etc. but they don't get used much by the average programmer. I want something more retail!

> On May 23, 2024, at 1:52 PM, Rob Pike <robpike@gmail.com> wrote:
> 
> The semantic distinction is important but the end result is very similar. "Fuzzing" as it is now called (for no reason I can intuit) tries to get to the troublesome cases faster by a sort of depth-first search, but exhaustive will always beat it for value. Our exhaustive tester for bitblt, first done by John Reiser if I remember right, set the stage for my own thinking about how you properly test something.
> 
> -rob
> 
> 
> On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu <mailto:douglas.mcilroy@dartmouth.edu>> wrote:
>> > Doug McIlroy was generating random regular expressions
>> 
>> Actually not. I exhaustively (within limits) tested an RE recognizer without knowingly generating any RE either mechanically or by hand.
>> 
>> The trick: From recursive equations (easily derived from the grammar of REs), I counted how many REs exist up to various limits on token counts, Then I generated all strings that satisfied those limits, turned the recognizer loose on them and counted how many it accepted. Any disagreement of counts revealed the existence (but not any symptom) of bugs. 
>> 
>> Unlike most diagnostic techniques, this scheme produces a certificate of (very high odds on) correctness over a representative subdomain. The scheme also agnostically checks behavior on bad inputs as well as good.  It does not, however, provide a stress test of a recognizer's capacity limits. And its exponential nature limits its applicability to rather small domains. (REs have only 5 distinct kinds of token.)
>> 
>> Doug

[-- Attachment #2: Type: text/html, Size: 4916 bytes --]

next prev parent reply	other threads:[~2024-05-25  0:18 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-23 13:49 Douglas McIlroy
2024-05-23 20:52 ` Rob Pike
2024-05-24  5:41   ` andrew
2024-05-24  7:17   ` Ralph Corderoy
2024-05-24  7:41     ` Rob Pike
2024-05-24 10:00       ` [TUHS] Is fuzz testing random? (Was: A fuzzy awk) Ralph Corderoy
2024-05-24 11:56     ` [TUHS] Re: A fuzzy awk Dan Halbert
2024-05-25  0:17   ` Bakul Shah via TUHS [this message]
2024-05-25  0:57     ` G. Branden Robinson
2024-05-25 13:56     ` David Arnold
2024-05-25 17:18     ` Paul Winalski
2024-05-25 17:36       ` Tom Perrine
2024-05-25 17:53         ` [TUHS] Prof Don Good [was " Charles H Sauer (he/him)
2024-05-27  9:39 ` [TUHS] Testing an RE recogniser exhaustively. (Was: A fuzzy awk) Ralph Corderoy
2024-05-27 13:03   ` [TUHS] " Hellwig Geisse
  -- strict thread matches above, loose matches on Subject: below --
2024-05-20 14:09 [TUHS] Re: A fuzzy awk Serissa
2024-05-21  1:56 ` Rob Pike
2024-05-21  2:47   ` Larry McVoy
2024-05-21  2:54     ` Lawrence Stewart
2024-05-21  3:36       ` Rob Pike
2024-05-21 11:59         ` Peter Weinberger (温博格) via TUHS
2024-05-21  3:53   ` George Michaelson
2024-05-21 16:59   ` Paul Winalski
2024-05-21 17:56     ` segaloco via TUHS
2024-05-21 18:12     ` Luther Johnson
2024-05-22 15:37       ` Paul Winalski
2024-05-22 18:49         ` Larry McVoy
2024-05-22 20:17           ` Larry McVoy
2024-05-22  3:26     ` Dave Horsfall
2024-05-22  5:08       ` Alexis
2024-05-22 13:12         ` Warner Losh
2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
2024-05-20 13:25 ` [TUHS] " Chet Ramey
2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 14:26     ` Chet Ramey
2024-05-22 13:44     ` arnold
2024-05-20 13:54 ` Ralph Corderoy
2024-05-19 23:08 [TUHS] The 'usage: ...' message. (Was: On Bloat...) Douglas McIlroy
2024-05-20  0:58 ` [TUHS] " Rob Pike
2024-05-20  3:19   ` arnold
2024-05-20  9:20     ` [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Ralph Corderoy
2024-05-20 13:10       ` [TUHS] " Chet Ramey
2024-05-20 13:30         ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 13:48           ` Chet Ramey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1E156D9B-C841-4693-BEA4-34C4BF42BCD5@iitbombay.org \
    --to=tuhs@tuhs.org \
    --cc=bakul@iitbombay.org \
    --cc=douglas.mcilroy@dartmouth.edu \
    --cc=robpike@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).