[TUHS] Re: A fuzzy awk. - segaloco via TUHS

The Unix Heritage Society mailing list
 help / color / mirror / Atom feed

From: segaloco via TUHS <tuhs@tuhs.org>
To: The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: [TUHS] Re: A fuzzy awk.
Date: Tue, 21 May 2024 17:56:03 +0000	[thread overview]
Message-ID: <KCjmXMPI5ZDuexJE27aZRTMulIK9fIBcXHIE3KYFZCykgH1WVny1RaA2T7gl2POe0b25M0xJT67DxMJTMQTXTth8-CC5UPQMxPta4jlzbYI=@protonmail.com> (raw)
In-Reply-To: <CABH=_VR9TEnPLtjexUKtpkfG-81bg=g1X2+0v7upN=f-sEkA4A@mail.gmail.com>

On Tuesday, May 21st, 2024 at 9:59 AM, Paul Winalski <paul.winalski@gmail.com> wrote:

> On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:
> 
> > > Well this is obviously a hot button topic. AFAIK I was nearby when fuzz-testing for software was invented. I was the main advocate for hiring Andy Payne into the Digital Cambridge Research Lab. One of his little projects was a thing that generated random but correct C programs and fed them to different compilers or compilers with different switches to see if they crashed or generated incorrect results. Overnight, his tester filed 300 or so bug reports against the Digital C compiler. This was met with substantial pushback, but it was a mostly an issue that many of the reports traced to the same underlying bugs.
> > > 
> > > Bill McKeemon expanded the technique and published "Differential Testing of Software" https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
> 
> In the mid-late 1980s Bill Mckeeman worked with DEC's compiler product teams to introduce fuzz testing into our testing process. As with the C compiler work at DEC Cambridge, fuzz testing for other compilers (Fortran, PL/I) also found large numbers of bugs.
> 
> The pushback from the compiler folks was mainly a matter of priorities. Fuzz testing is very adept at finding edge conditions, but most failing fuzz tests have syntax that no human programmer would ever write. As a compiler engineer you have limited time to devote to bug testing. Do you spend that time addressing real customer issues that have been reported or do you spend it fixing problems with code that no human being would ever write? To take an example that really happened, a fuzz test consisting of 100 nested parentheses caused an overflow in a parser table (it could only handle 50 nested parens). Is that worth fixing?
> 
> As you pointed out, fuzz test failures tend to occur in clusters and many of the failures eventually are traced to the same underlying bug. Which leads to the counter-argument to the pushback. The fuzz tests are finding real underlying bugs. Why not fix them before a customer runs into them? That very thing did happen several times. A customer-reported bug was fixed and suddenly several of the fuzz test problems that had been reported went away. Another consideration is that, even back in the 1980s, humans weren't the only ones writing programs. There were programs writing programs and they sometimes produced bizarre (but syntactically correct) code.
> 
> -Paul W.

A happy medium could be including far-out fuzzing to characterize issues, but not necessarily then immediately sink the resources into resolving bizarre discoveries from the fuzzing.  Better to know then not but also have the wisdom to determine "is someone actually going to trip this" vs. "this is something that is possible and good to document".  In my own work we have several of the latter where something is almost guaranteed to never happen with a human interaction, but is also something we want documented somewhere so if unlikely problem <xyz> ever does happen, the discovery is already done and we just start plotting out a solution.  That's also some nice low hanging fruit to pluck when there isn't much else going on, but avoids the phenomenon where we sink critical time into bugfixes with a microscopic ROI.

- Matt G.

next prev parent reply	other threads:[~2024-05-21 17:56 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-20 14:09 Serissa
2024-05-21  1:56 ` Rob Pike
2024-05-21  2:47   ` Larry McVoy
2024-05-21  2:54     ` Lawrence Stewart
2024-05-21  3:36       ` Rob Pike
2024-05-21 11:59         ` Peter Weinberger (温博格) via TUHS
2024-05-21  3:53   ` George Michaelson
2024-05-21 16:59   ` Paul Winalski
2024-05-21 17:56     ` segaloco via TUHS [this message]
2024-05-21 18:12     ` Luther Johnson
2024-05-22 15:37       ` Paul Winalski
2024-05-22 18:49         ` Larry McVoy
2024-05-22 20:17           ` Larry McVoy
2024-05-22  3:26     ` Dave Horsfall
2024-05-22  5:08       ` Alexis
2024-05-22 13:12         ` Warner Losh
  -- strict thread matches above, loose matches on Subject: below --
2024-05-23 13:49 Douglas McIlroy
2024-05-23 20:52 ` Rob Pike
2024-05-24  5:41   ` andrew
2024-05-24  7:17   ` Ralph Corderoy
2024-05-24  7:41     ` Rob Pike
2024-05-24 11:56     ` Dan Halbert
2024-05-25  0:17   ` Bakul Shah via TUHS
2024-05-25  0:57     ` G. Branden Robinson
2024-05-25 13:56     ` David Arnold
2024-05-25 17:18     ` Paul Winalski
2024-05-25 17:36       ` Tom Perrine
2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
2024-05-20 13:25 ` [TUHS] " Chet Ramey
2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 14:26     ` Chet Ramey
2024-05-22 13:44     ` arnold
2024-05-20 13:54 ` Ralph Corderoy
2024-05-19 23:08 [TUHS] The 'usage: ...' message. (Was: On Bloat...) Douglas McIlroy
2024-05-20  0:58 ` [TUHS] " Rob Pike
2024-05-20  3:19   ` arnold
2024-05-20  9:20     ` [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Ralph Corderoy
2024-05-20 13:10       ` [TUHS] " Chet Ramey
2024-05-20 13:30         ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 13:48           ` Chet Ramey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='KCjmXMPI5ZDuexJE27aZRTMulIK9fIBcXHIE3KYFZCykgH1WVny1RaA2T7gl2POe0b25M0xJT67DxMJTMQTXTth8-CC5UPQMxPta4jlzbYI=@protonmail.com' \
    --to=tuhs@tuhs.org \
    --cc=segaloco@protonmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).