The Unix Heritage Society mailing list
* [TUHS] The 'usage: ...' message. (Was: On Bloat...)
From: Douglas McIlroy @ 2024-05-19 23:08 UTC (permalink / raw)
  To: TUHS main list


>> Another non-descriptive style of error message that I admired was that
>> of Berkeley Pascal's syntax diagnostics. When the LR parser could not
>> proceed, it reported where, and automatically provided a sample token
>> that would allow the parsing to progress. I found this uniform
>> convention to be at least as informative as distinct hand-crafted
>> messages, which almost by definition can't foresee every contingency.
>> Alas, this elegant scheme seems not to have inspired imitators.

> The hazard with this approach is that the suggested syntactic correction
> might simply lead the user farther into the weeds

I don't think there's enough experience to justify this claim. Before I
experienced the Berkeley compiler, I would have thought such bad outcomes
were inevitable in any language. Although the compiler's suggestions often
bore little or no relationship to the real correction, I always found them
informative. In particular, the utterly consistent style assured there was
never an issue of ambiguity or of technical jargon.

The compiler taught me Pascal in an evening. I had scanned the Pascal
Report a couple of years before but had never written a Pascal program.
With no manual at hand, I looked at one program to find out what
mumbo-jumbo had to come first and how to print integers, then wrote the
rest by trial and error. Within a couple of hours I had a working program
good enough to pass muster in an ACM journal.

An example arose that one might think would lead "into the weeds". The
parser balked before 'or' in a compound Boolean expression like 'a=b and
c=d or x=y'. (In Pascal, 'and' and 'or' bind tighter than '=', so the
expression must be parenthesized as '(a=b) and (c=d) or (x=y)'.) It
couldn't suggest a right paren because no left paren had been seen.
Whatever suggestion it did make (perhaps 'then') was enough to lead me to
insert a remote left paren and teach me that parens are required around
Boolean-valued subexpressions. (I will agree that this lesson might be
less clear to a programming novice, but so might be many conventional
diagnostics, e.g. "no effect".)

Doug



* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: Rob Pike @ 2024-05-20  0:58 UTC (permalink / raw)
  To: Douglas McIlroy; +Cc: TUHS main list


The Cornell PL/I compiler, PL/C, ran on the IBM 360 so of course used batch
input. It tried automatically to keep things running after a parsing error
by inserting some token - semicolon, parenthesis, whatever seemed best -
and continuing to parse, in order to maximize the amount of input that
could be parsed before giving up. At least, that's what I took the
motivation to be. It rarely succeeded in fixing the actual problem, despite
PL/I being plastered with semicolons, but it did tend to ferret out more
errors per run. I found the tactic helpful.
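
[A toy rendition of the tactic, for flavor -- invented code, not PL/C's
actual repair machinery, which was far more elaborate; the grammar,
input, and names are all made up.  On a syntax error the parser reports
the token it expected, pretends it was present, and parses on,
deleting a token only when insertion alone would make no progress.]

#include <stdio.h>

/*
 * Toy error recovery by token insertion (invented example, not PL/C).
 * Grammar: a program is a sequence of statements  id '=' id ';'
 * where any lower-case letter is an identifier.
 */
static const char *input = "a=b;c=de=f;";    /* one ';' missing */
static int pos, nerrors;

static int peek(void) { return input[pos]; }

static void expect(int tok)
{
    if (peek() == tok) {
        pos++;
        return;
    }
    nerrors++;
    printf("col %d: expected '%c'; inserting it and continuing\n",
        pos, tok);
    /* do not advance: pretend `tok` appeared here and parse on */
}

static void expect_id(void)
{
    if (peek() >= 'a' && peek() <= 'z') {
        pos++;
        return;
    }
    nerrors++;
    printf("col %d: expected identifier; inserting one\n", pos);
}

int main(void)
{
    while (peek()) {
        int start = pos;

        expect_id();
        expect('=');
        expect_id();
        expect(';');
        if (pos == start) {    /* no progress: insertion can't help */
            printf("col %d: deleting '%c'\n", pos, peek());
            pos++;
        }
    }
    printf("%d error(s), but the whole input was parsed\n", nerrors);
    return 0;
}

Run on the sample input it diagnoses the missing semicolon yet still
parses the rest -- the "more errors per run" effect; the progress check
at the bottom is what keeps pure insertion from looping forever.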

-rob



* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: arnold @ 2024-05-20  3:19 UTC (permalink / raw)
  To: robpike, douglas.mcilroy; +Cc: tuhs

Rob Pike <robpike@gmail.com> wrote:

> The Cornell PL/I compiler, PL/C, ran on the IBM 360 so of course used batch
> input. It tried automatically to keep things running after a parsing error
> by inserting some token - semicolon, parenthesis, whatever seemed best -
> and continuing to parse, in order to maximize the amount of input that
> could be parsed before giving up. At least, that's what I took the
> motivation to be. It rarely succeeded in fixing the actual problem, despite
> PL/I being plastered with semicolons, but it did tend to ferret out more
> errors per run. I found the tactic helpful.
>
> -rob

Gawk used to do this, until people started fuzzing it, causing cascading
errors and eventually core dumps. Now the first syntax error is fatal.
It got to the point where I added this text to the manual:

	In recent years, people have been running "fuzzers" to generate
	invalid awk programs in order to find and report (so-called)
	bugs in gawk.

	In general, such reports are not of much practical use. The
	programs they create are not realistic and the bugs found are
	generally from some kind of memory corruption that is fatal
	anyway.

	So, if you want to run a fuzzer against gawk and report the
	results, you may do so, but be aware that such reports don’t
	carry the same weight as reports of real bugs do.

(Yeah, I've just changed the subject, feel free to stay on topic. :-)

Arnold


* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: Warner Losh @ 2024-05-20  3:43 UTC (permalink / raw)
  To: Arnold Robbins; +Cc: Douglas McIlroy, The Eunuchs Hysterical Society


On Sun, May 19, 2024, 9:19 PM <arnold@skeeve.com> wrote:

> Rob Pike <robpike@gmail.com> wrote:
>
> > The Cornell PL/I compiler, PL/C, ran on the IBM 360 so of course used
> > batch input. It tried automatically to keep things running after a
> > parsing error by inserting some token - semicolon, parenthesis,
> > whatever seemed best - and continuing to parse, in order to maximize
> > the amount of input that could be parsed before giving up. At least,
> > that's what I took the motivation to be. It rarely succeeded in fixing
> > the actual problem, despite PL/I being plastered with semicolons, but
> > it did tend to ferret out more errors per run. I found the tactic
> > helpful.
> >
> > -rob
>
> Gawk used to do this, until people started fuzzing it, causing cascading
> errors and eventually core dumps. Now the first syntax error is fatal.
> It got to the point where I added this text to the manual:
>
>         In recent years, people have been running "fuzzers" to generate
>         invalid awk programs in order to find and report (so-called)
>         bugs in gawk.
>
>         In general, such reports are not of much practical use. The
>         programs they create are not realistic and the bugs found are
>         generally from some kind of memory corruption that is fatal
>         anyway.
>
>         So, if you want to run a fuzzer against gawk and report the
>         results, you may do so, but be aware that such reports don’t
>         carry the same weight as reports of real bugs do.
>
> (Yeah, I've just changed the subject, feel free to stay on topic. :-)
>


Awk bailing out near line 1.

Warner




* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: Bakul Shah via TUHS @ 2024-05-20  3:54 UTC (permalink / raw)
  To: The Unix Heritage Society mailing list


I remember helping newbie students at USC who were very confused that even though they made the changes the "PL/C USES ..." messages suggested, their program still didn't work!

> On May 19, 2024, at 5:58 PM, Rob Pike <robpike@gmail.com> wrote:
> 
> The Cornell PL/I compiler, PL/C, ran on the IBM 360 so of course used batch input. It tried automatically to keep things running after a parsing error by inserting some token - semicolon, parenthesis, whatever seemed best - and continuing to parse, in order to maximize the amount of input that could be parsed before giving up. At least, that's what I took the motivation to be. It rarely succeeded in fixing the actual problem, despite PL/I being plastered with semicolons, but it did tend to ferret out more errors per run. I found the tactic helpful.
> 
> -rob
> 




* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: arnold @ 2024-05-20  4:46 UTC (permalink / raw)
  To: imp, arnold; +Cc: tuhs, douglas.mcilroy

Warner Losh <imp@bsdimp.com> wrote:

> > (Yeah, I've just changed the subject, feel free to stay on topic. :-)
>
> Awk bailing out near line 1.

	$ gawk --nostalgia
	awk: bailing out near line 1
	Aborted (core dumped)

A very long-lived Easter egg... :-)

Arnold


* [TUHS] A fuzzy awk.  (Was: The 'usage: ...' message.)
From: Ralph Corderoy @ 2024-05-20  9:20 UTC (permalink / raw)
  To: tuhs

Hi Arnold,

> > in order to maximize the amount of input that could be parsed before
> > giving up.
>
> Gawk used to do this, until people started fuzzing it, causing
> cascading errors and eventually core dumps.  Now the first syntax
> error is fatal.

This is the first time I've heard of making life difficult for fuzzers
so I'm curious...

I'm assuming you agree the eventual core dump was a bug somewhere to be
fixed, and probably was.  Stopping on the first error lessens the
‘attack surface’ for the fuzzer.  Do you think there remains a bug that
would bite a user, one which the fuzzer might have found more easily
before the surface shrank?

-- 
Cheers, Ralph.


* [TUHS] Re: A fuzzy awk.  (Was: The 'usage: ...' message.)
From: arnold @ 2024-05-20 11:58 UTC (permalink / raw)
  To: tuhs, ralph

Ralph Corderoy <ralph@inputplus.co.uk> wrote:

> This is the first time I've heard of making life difficult for fuzzers
> so I'm curious...

I was making life easier for me. :-)

> I'm assuming you agree the eventual core dump was a bug somewhere to be
> fixed, and probably was.

Not really. Hugely syntactically invalid programs can end up causing
memory corruption as necessary data structures don't get built correctly
(or at all); since they're invalid, subsequent bits of gawk that expect
valid data structures end up not working.  These are "bugs" that can't
happen when using the tool correctly.
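
[The failure mode in miniature -- an invented illustration, not gawk's
actual code: a parser that reports a syntax error but presses on hands
a partial tree to a later phase that assumes well-formed input.]

#include <stdio.h>
#include <stdlib.h>

/* Invented illustration of cascading-recovery corruption, not gawk. */
enum { NUM, ADD };
typedef struct node { int op, val; struct node *l, *r; } Node;

static Node *num(int v)
{
    Node *n = calloc(1, sizeof *n);
    n->op = NUM;
    n->val = v;
    return n;
}

static Node *parse(const char *src)     /* parses "1+" -- an error   */
{
    Node *n = calloc(1, sizeof *n);
    n->op = ADD;
    n->l = num(1);
    n->r = NULL;                        /* missing right operand     */
    fprintf(stderr, "syntax error in '%s', continuing anyway\n", src);
    return n;                           /* the partial tree escapes  */
}

static int eval(Node *n)                /* assumes a complete tree   */
{
    return n->op == NUM ? n->val : eval(n->l) + eval(n->r);
}

int main(void)
{
    printf("%d\n", eval(parse("1+")));  /* eval(NULL): crash         */
    return 0;
}

Stopping at the first error means the partial tree never escapes.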

> Stopping on the first error lessens the ‘attack surface’ for the
> fuzzer.  Do you think there remains a bug that would bite a user, one
> which the fuzzer might have found more easily before the surface shrank?

No.

I don't have any examples handy, but you can look back through the
bug-gawk archives for some examples of these reports.  The number
of true bugs that fuzzers have caught (if any!) could be counted
on one hand.

Sometimes they like to claim that the "bugs" they find could cause
denial of service attacks.  That's also specious; gawk isn't used for
long-running server kinds of programs.

The joys of being a Free Software Maintainer.

Arnold

P.S. I don't claim that gawk is bug-free.  But I do think that there
are qualitatively different kinds of bugs, and bug reports.


* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
From: Chet Ramey @ 2024-05-20 13:10 UTC (permalink / raw)
  To: Ralph Corderoy, tuhs



On 5/20/24 5:20 AM, Ralph Corderoy wrote:
> Hi Arnold,
> 
>>> in order to maximize the amount of input that could be parsed before
>>> giving up.
>>
>> Gawk used to do this, until people started fuzzing it, causing
>> cascading errors and eventually core dumps.  Now the first syntax
>> error is fatal.
> 
> This is the first time I've heard of making life difficult for fuzzers
> so I'm curious...

It's not making life difficult for them -- they can still fuzz all they
want. Chances are better they'll find a genuine bug if you stop right away.


> I'm assuming you agree the eventual core dump was a bug somewhere to be
> fixed, and probably was.  Stopping on the first error lessens the
> ‘attack surface’ for the fuzzer.  Do you think there remains a bug that
> would bite a user, one which the fuzzer might have found more easily
> before the surface shrank?

Chances are small. (People fuzz bash all the time, and that is my
experience.)

Look at it this way. Free Software maintainers have limited resources. Is
it better to spend time on bugs that will affect a larger percentage of
the user population, instead of those that require artificial circumstances
that won't be encountered by normal usage? Those get pushed down on the
priority list.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/




* [TUHS] Re: A fuzzy awk.
From: Ralph Corderoy @ 2024-05-20 13:30 UTC (permalink / raw)
  To: tuhs

Hi Chet,

> Is it better to spend time on bugs that will affect a larger
> percentage of the user population, instead of those that require
> artificial circumstances that won't be encountered by normal usage?
> Those get pushed down on the priority list.

You're talking about pushing unlikely, fuzzed bugs down the prioritised
list, but we're discussing those bugs not getting onto the list for
consideration.

Lack of resources also applies to triaging bugs and I agree a fuzzed bug
which hands over 42 KiB of dense, gibberish awk will probably not get
volunteer attention.  But then fuzzers can seek a smaller test case,
similar to Andreas Zeller's delta debugging.

I'm in no way criticising Arnold who, like you, has spent many years
voluntarily enhancing a program many of us use every day.  But it's
interesting to shine some light on this corner to better understand
what's happening.

-- 
Cheers, Ralph.


* [TUHS] Re: A fuzzy awk.
From: Chet Ramey @ 2024-05-20 13:48 UTC (permalink / raw)
  To: Ralph Corderoy, tuhs



On 5/20/24 9:30 AM, Ralph Corderoy wrote:
> Hi Chet,
> 
>> Is it better to spend time on bugs that will affect a larger
>> percentage of the user population, instead of those that require
>> artificial circumstances that won't be encountered by normal usage?
>> Those get pushed down on the priority list.
> 
> You're talking about pushing unlikely, fuzzed bugs down the prioritised
> list, but we're discussing those bugs not getting onto the list for
> consideration.

I think the question is whether they were bugs in gawk at all, or the
result of gawk trying to be helpful by guessing at the script's intent
and trying to go on. Arnold's reaction to that behavior, which had these
negative effects most often as the result of fuzzing attempts, was to
make gawk exit on the first syntax error.

Would those `bugs' have manifested themselves if gawk hadn't tried to do
this? Are they bugs at all? Guessing at intent is bound to be wrong some
of the time, and cause errors of its own.

I'm saying that fuzzing does occasionally find obscure bugs -- bugs that
would never be encountered in normal usage -- and those should be fixed.
Eventually.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/




* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: Clem Cole @ 2024-05-20 14:23 UTC (permalink / raw)
  To: Rob Pike; +Cc: Douglas McIlroy, TUHS main list


I was going to keep silent on this one until I realized I disagree with
both Doug and Rob here (always a little dangerous). But because of personal
experience, I have a pretty strong opinion that this is not really a win.
Note that I cribbed this email response from an answer I wrote on Quora to
the question:

*When you are programming and commit a minor error, such as forgetting a
semicolon, the compiler throws an error and makes you fix it for yourself.
Why doesn’t it just fix it by itself and notify you of the fix instead?*


FWIW: The first version of the idea that I know about was DWIM - the *Do
What I Mean* feature from BBN’s LISP (that eventually made it into
InterLISP). As the Wikipedia page describes, DWIM became known as "Damn
Warren's Infernal Machine" [more details in the DWIM section of the jargon
file].  As Doug
points out, the original Pascal implementation for Unix, pix(1), also
supported this idea of fixing your code for you, and as Rob points out, UCB’s
pix(1) took the idea of trying to keep going and make the compile work from
the earlier Cornell PL/C compiler for the IBM 360[1], which to quote
Wikipedia:

> “The PL/C compiler had the unusual capability of never failing to compile
> any program, through the use of extensive automatic correction of many
> syntax errors and by converting any remaining syntax errors to output
> statements.”


The problem is that people can be lazy, and instead of using "DWIM" as a
tool to speed up their development and fix their own errors, they just
ignore the errors. In fact, when we were teaching the “Intro to CS” course
at UCB in the early 1980s, we actually had students turn in programs that
had syntax errors in them because pix(1) had corrected their code -- instead
of the student fixing his/her code before handing the program in to the TA
(and then they would complain when they got “marked down” on the assignment
— sigh).

IMO: All in all, the experiment failed because many (??most??) people
really don’t work that way. Putting a feature like this in an IDE or even
an editor like emacs might be reasonable since the sources would be
modified, but it means you need to like using an IDE. I also ask --> what
happens when the computer’s (IDE) guess is different from the programmer's
real intent, and since it was ‘fixed’ behind the curtain, you don’t notice
it?

Some other people have suggested that DWIM is a little like spelling
‘auto-correct’ or tools like ‘Grammarly.’ The truth is, I have a love/hate
relationship with auto-correct, particularly on my mobile devices. I'm
dyslexic, so tools like this can be helpful to me sometimes, but I spend a
great deal of my time fighting these types of tools because they are so
often wrong, particularly with a small screen/keyboard, that it is just
“not fun.”

This brings me back to my experience. IMO, auto-correct for programming is
like DWIM all over again, and the cure causes more problems than it solves.

Clem

[1] I should add that after Cornell’s PL/C compiler was introduced, IBM
eventually added a similar feature to its own PL/I, although it was not
nearly as extensive as the Cornell solution. I’m sure you can find people
who liked it, but in both cases, I personally never found it that useful.



* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: Greg A. Woods @ 2024-05-20 17:30 UTC (permalink / raw)
  To: The Unix Heritage Society mailing list


At Mon, 20 May 2024 10:23:56 -0400, Clem Cole <clemc@ccc.com> wrote:
Subject: [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
>
> This brings me back to my experience. IMO, auto-correct for programming is
> like DWIM all over again, and the cure causes more problems than it solves.

We're deep down that rabbit hole this time with LLM/GPT systems
generating large swathes of code that I believe all too often gets into
production without any human programmer fully vetting its fitness for
purpose, or perhaps even understanding it.

--
					Greg A. Woods <gwoods@acm.org>

Kelowna, BC     +1 250 762-7675           RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>     Avoncote Farms <woods@avoncote.ca>



* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: Stuff Received @ 2024-05-20 17:40 UTC (permalink / raw)
  To: tuhs

On 2024-05-19 20:58, Rob Pike wrote:
> The Cornell PL/I compiler, PL/C, ran on the IBM 360 so of course used 
> batch input. It tried automatically to keep things running after a 
> parsing error by inserting some token - semicolon, parenthesis, whatever 
> seemed best - and continuing to parse, in order to maximize the amount 
> of input that could be parsed before giving up. At least, that's what I 
> took the motivation to be. It rarely succeeded in fixing the actual 
> problem, despite PL/I being plastered with semicolons, but it did tend 
> to ferret out more errors per run. I found the tactic helpful.
> 
> -rob
> 

Possibly way off topic but Toronto allowed anyone to run PL/C decks for
free, which I often did.  One day, they decided to allow all of the card
to be read as text, and my card sequence numbers generated all sorts of
errors.  (At least they were easily fixed by a visit to the card punch.)

S.


* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: John Levine @ 2024-05-20 20:10 UTC (permalink / raw)
  To: tuhs

It appears that Clem Cole <clemc@ccc.com> said:
> “The PL/C compiler had the unusual capability of never failing to compile
> any program, through the use of extensive automatic correction of many
> syntax errors and by converting any remaining syntax errors to output
> statements.”
>
> The problem is that people can be lazy, and instead of using "DWIM" as a
> tool to speed up their development and fix their own errors, they just
> ignore the errors. ...

PL/C was a long time ago in the early 1970s. People used it on batch
systems where you handed in your cards at the window, waited a while,
and later got your printout back. Or at advanced places, you could
run the cards through the reader yourself, then wait until the batch
ran.

In that environment, the benefit from possibly guessing an error
correction right meant fewer trips to the card reader. In my youth I
did a fair amount of programming that way in WATFOR/WATFIV and Algol W
where we really tried to get the programs right since we wanted to
finish up and go home.

When I was using interactive systems where you could fix one bug and
try again, over and over, it seemed like cheating.

R's,
John


* [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...)
From: John Cowan @ 2024-05-21  1:14 UTC (permalink / raw)
  To: John Levine; +Cc: tuhs


On Mon, May 20, 2024 at 4:11 PM John Levine <johnl@taugh.com> wrote:

> It appears that Clem Cole <clemc@ccc.com> said:
> > “The PL/C compiler had the unusual capability of never failing to compile
> > any program, through the use of extensive automatic correction of many
> > syntax errors and by converting any remaining syntax errors to output
> > statements.”
>
> PL/C was a long time ago in the early 1970s. People used it on batch
> systems where you handed in your cards at the window, waited a while,
> and later got your printout back. Or at advanced places, you could
> run the cards through the reader yourself, then wait until the batch
> ran.


PL/C was a 3rd-generation autocorrection programming language.  CORC was
the 1962 version and CUPL was the 1966 version (same date as DWIM), neither
of them based on PL/I.  There is an implementation of both at
<http://www.catb.org/~esr/cupl/>.

The Wikipedia DWIM article also points to Magit, the Emacs git client.




* [TUHS] Re: A fuzzy awk
From: Tom Perrine @ 2024-05-25 17:36 UTC (permalink / raw)
  To: Paul Winalski; +Cc: The Unix Heritage Society mailing list


Another Gypsy user here...

For KSOS-11 the kernel was described in SPECIAL - as a set of axioms and
theorems. There was no actual connection between the formal specification
in SPECIAL and the Modula code.

Some of the critical user-space code for a trusted downgrade program, to
bridge data from higher levels of classification to lower, was written in
Gypsy. I visited UT Austin and Dr Good(?)'s team to learn it, IIRC. Gypsy
was considered better in that the specification was tied to the executable
through the pre/post conditions - and through its better support for
semi-automated theorem proving.




On Sat, May 25, 2024 at 10:18 AM Paul Winalski <paul.winalski@gmail.com>
wrote:

> On Fri, May 24, 2024 at 8:18 PM Bakul Shah via TUHS <tuhs@tuhs.org> wrote:
>
> At one point I had suggested turning Go's Interface type to something like
>> Guttag style abstract data types in that relevant axioms are specified
>> right in the interface definition. The idea was that any concrete type that
>> implements that interface must satisfy its axioms. Even if the compiler
>> ignored these axioms, one can write a support program that can generate a
>> set of comprehensive tests based on these axioms. [Right now a type
>> "implementing" an interface only needs to have a set of methods that
>> exactly match the interface methods but nothing more] The underlying idea
>> is that each type is in essence a constraint on what values an instance of
>> that type can take. So adding such axioms simply tightens (& documents)
>> these constraints. Just the process of coming up with such axioms can
>> improve the design (sort of like test driven design but better!).
>>
>
> At one point I worked with a programming language called Gypsy that
> implemented this concept.  Each routine had a prefix that specified axioms
> on the routine's parameters and outputs.  The rest of Gypsy was a
> conventional procedural language but the semantics were carefully chosen to
> allow for automated proof of correctness.  I wrote a formal specification
> for the DECnet session layer protocol (DECnet's equivalent of TCP) in
> Gypsy.  I turned up a subtle bug in the prose version of the protocol
> specification in the process.
>
> -Paul W.
>



* [TUHS] Re: A fuzzy awk
From: Paul Winalski @ 2024-05-25 17:18 UTC (permalink / raw)
  To: The Unix Heritage Society mailing list


On Fri, May 24, 2024 at 8:18 PM Bakul Shah via TUHS <tuhs@tuhs.org> wrote:

> At one point I had suggested turning Go's Interface type to something like
> Guttag style abstract data types in that relevant axioms are specified
> right in the interface definition. The idea was that any concrete type that
> implements that interface must satisfy its axioms. Even if the compiler
> ignored these axioms, one can write a support program that can generate a
> set of comprehensive tests based on these axioms. [Right now a type
> "implementing" an interface only needs to have a set of methods that
> exactly match the interface methods but nothing more] The underlying idea
> is that each type is in essence a constraint on what values an instance of
> that type can take. So adding such axioms simply tightens (& documents)
> these constraints. Just the process of coming up with such axioms can
> improve the design (sort of like test driven design but better!).
>

At one point I worked with a programming language called Gypsy that
implemented this concept.  Each routine had a prefix that specified axioms
on the routine's parameters and outputs.  The rest of Gypsy was a
conventional procedural language but the semantics were carefully chosen to
allow for automated proof of correctness.  I wrote a formal specification
for the DECnet session layer protocol (DECnet's equivalent of TCP) in
Gypsy.  I turned up a subtle bug in the prose version of the protocol
specification in the process.

-Paul W.



* [TUHS] Re: A fuzzy awk
From: David Arnold @ 2024-05-25 13:56 UTC (permalink / raw)
  To: Bakul Shah; +Cc: The Unix Heritage Society mailing list


> On 25 May 2024, at 10:18, Bakul Shah via TUHS <tuhs@tuhs.org> wrote:
> 
> 
> What would be nice if programming languages provided some support for such exhaustive testing[1].
> 
> At one point I had suggested turning Go's Interface type to something like Guttag style abstract data types in that relevant axioms are specified right in the interface definition. The idea was that any concrete type that implements that interface must satisfy its axioms. Even if the compiler ignored these axioms, one can write a support program that can generate a set of comprehensive tests based on these axioms.

Sounds like Eiffel, whose compiler had support for checking pre and post conditions (and maybe invariants?) at runtime, or disabling the checks for “performance” mode. 
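
[A minimal C analogue of that, under stated assumptions: the REQUIRE and
ENSURE macros are invented here; Eiffel's require/ensure are language
constructs, and its compiler switch did the enabling and disabling.]

#include <assert.h>
#include <stdio.h>

#define REQUIRE(c) assert(c)   /* precondition: the caller's obligation */
#define ENSURE(c)  assert(c)   /* postcondition: the routine's promise  */

static int isqrt(int n)        /* integer square root                   */
{
    REQUIRE(n >= 0);
    int r = 0;
    while ((r + 1) * (r + 1) <= n)
        r++;
    ENSURE(r * r <= n && (r + 1) * (r + 1) > n);
    return r;
}

int main(void)
{
    printf("isqrt(10) = %d\n", isqrt(10));   /* prints 3 */
    return 0;
}

Compiling with -DNDEBUG strips every check -- the "performance" mode.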



d


* [TUHS] Re: A fuzzy awk
From: G. Branden Robinson @ 2024-05-25  0:57 UTC (permalink / raw)
  To: The Unix Heritage Society mailing list


[restricting to list; strong opinions here]

At 2024-05-24T17:17:53-0700, Bakul Shah via TUHS wrote:
> What would be nice if programming languages provided some support for
> such exhaustive testing[1].

[rearranging]
> At one point I had suggested turning Go's Interface type to something
> like Guttag style abstract data types in that relevant axioms are
> specified right in the interface definition.

It's an excellent idea.

> The underlying idea is that each type is in essence a constraint on
> what values an instance of that type can take.

In the simple form of a data type plus a range constraint, that's the
Ada definition of a subtype since day one--Ada '80 or Ada 83 if you
insist on the standardized form of the language.
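
[For concreteness, a rough C analogue of such a subtype -- invented
code; Ada enforces the range in the language itself, while C can only
approximate it by funneling stores through a checking constructor.]

#include <assert.h>
#include <stdio.h>

/*
 * Rough C analogue of an Ada range-constrained subtype, e.g.
 *     subtype Percent is Integer range 0 .. 100;
 * Ada checks every assignment in the language; in C all we can do is
 * keep the invariant by convention.
 */
typedef int Percent;                 /* invariant: 0 <= value <= 100 */

static Percent percent(int v)
{
    assert(v >= 0 && v <= 100);      /* the "range check" */
    return v;
}

int main(void)
{
    Percent p = percent(42);         /* fine */
    printf("%d\n", p);
    p = percent(142);                /* aborts, roughly where Ada
                                        would raise Constraint_Error */
    return p;
}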

40 years later we have Linus Torvalds tearing up his achievement
certificate in "kinder, gentler email interactions" just to trash the
notion of range checks on data types.[1][2][3]

Naturally, the brogrammers are quick to take Torvalds's side.[4]

Pascal had range checks too, and Kernighan famously punked on Wirth for
that.  I'm not certain, but I get the feeling the latter got somewhat
over-interpreted.  (To be fair to Kernighan, Pascal _as specced in the
Revised Report of 1973_[5] was in my opinion too weak a language to
leave the lab, for many of the reasons he noted.  The inflexible array
typing was fatal, in my view.)

> The idea was that any concrete type that implements that interface
> must satisfy its axioms.

Yes.  There is of course much more to the universe of potential
constraints than range checks.  Ada 2022 has these in great generality
with "subtype predicates".

http://www.ada-auth.org/standards/22aarm/html/aa-3-2-4.html

> Even if the compiler ignored these axioms,

I don't understand why this idea wasn't seized upon with more force at
the CSRC.  The notion of a compiler flag that turned "extra" (in the
Ritchie compiler circa 1980, this is perhaps expressed better as "any")
correctness checks on or off could not have been a novelty.  NDEBUG and
assert() are similarly extremely old even in Unix.

> one can write a support program that can generate a set of
> comprehensive tests based on these axioms.

Yes.  As I understand it, this is how Spark/Ada got started.  Specially
annotated comments expressing predicates communicated with such a
support program, running much like the sort of automated theorem
prover you characterize below as not "retail".

In the last two revision cycles of the Ada standard (2013, 2022),
Spark/Ada's enhancements have made it into the language--though I am not
certain, and would not claim, that they compose with _every_ language
feature.  Spark/Ada started life as a subset of the language for a
reason.

But C has its own subset, MISRA C, so this is hardly a reason to scoff.

> [Right now a type "implementing" an interface only needs to
> have a set of methods that exactly match the interface methods but
> nothing more] The underlying idea is that each type is in essence a
> constraint on what values an instance of that type can take. So adding
> such axioms simply tightens (& documents) these constraints. Just the
> process of coming up with such axioms can improve the design (sort of
> like test driven design but better!).

Absolutely.  Generally, software engineers like to operationalize things
consistently enough that they can then be scripted/automated.

Evidently software testing is so mind-numbingly tedious that the will
to undertake it, even with automation, evaporates.

> Now it may be that applying this to anything more complex than stacks
> won't work well & it won't be perfect but I thought this was worth
> experimenting with. This would be like functional testing of all the
> nuts and bolts and components that go in an airplane. The airplane may
> still fall apart but that would be a "composition" error!

Yes.  And even if you can prove 100% of the theorems in your system, you
may learn to your dismay that your specification was defective.
Automated provers are as yet no aid to system architects.

> [1] There are "proof assistant" or formal spec languages such as TLA+,
> Coq, Isabelle etc. but they don't get used much by the average
> programmer. I want something more retail!

I've had a little exposure to these.  They are indeed esoteric, but also
extremely resource-hungry.  My _impression_, based on no hard data, is
that increasing the abilities of static analyzers and the expressiveness
with which they are directed with predicates is much cheaper.

But a lot of programmers will not budge at any cost, and will moreover
be celebrated by their peers for their obstinacy.  See footnotes.

There is much work still to be done.

Regards,
Branden

[1] https://lore.kernel.org/all/202404291502.612E0A10@keescook/
    https://lore.kernel.org/all/CAHk-=wi5YPwWA8f5RAf_Hi8iL0NhGJeL6MN6UFWwRMY8L6UDvQ@mail.gmail.com/
[2] https://lore.kernel.org/lkml/CAHk-=whkGHOmpM_1kNgzX1UDAs10+UuALcpeEWN29EE0m-my=w@mail.gmail.com/
[3] https://www.businessinsider.com/linus-torvalds-linux-time-away-empathy-2018-9
[4] https://lwn.net/Articles/973108/
[5] https://archive.org/details/1973-the-programming-language-pascal-revised-report-wirth



* [TUHS] Re: A fuzzy awk
From: Bakul Shah via TUHS @ 2024-05-25  0:17 UTC (permalink / raw)
  To: Rob Pike; +Cc: Douglas McIlroy, The Unix Heritage Society mailing list


What would be nice is if programming languages provided some support for such exhaustive testing[1].

At one point I had suggested turning Go's Interface type to something like Guttag style abstract data types in that relevant axioms are specified right in the interface definition. The idea was that any concrete type that implements that interface must satisfy its axioms. Even if the compiler ignored these axioms, one can write a support program that can generate a set of comprehensive tests based on these axioms. [Right now a type "implementing" an interface only needs to have a set of methods that exactly match the interface methods but nothing more] The underlying idea is that each type is in essence a constraint on what values an instance of that type can take. So adding such axioms simply tightens (& documents) these constraints. Just the process of coming up with such axioms can improve the design (sort of like test driven design but better!).
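
[A sketch of the idea in miniature -- an invented example in C rather
than Go, with a stack as the concrete type: state the axioms once, then
let a test driver check them exhaustively over all small instances
instead of hand-picking cases.]

#include <assert.h>
#include <stdio.h>

/*
 * Guttag-style axioms for a bounded stack (invented example):
 *   top(push(s,x)) == x
 *   pop(push(s,x)) leaves s unchanged
 *   size(push(s,x)) == size(s) + 1
 */
#define CAP 8
typedef struct { int n, v[CAP]; } Stack;

static void push(Stack *s, int x) { assert(s->n < CAP); s->v[s->n++] = x; }
static int  pop(Stack *s)         { assert(s->n > 0);  return s->v[--s->n]; }
static int  top(const Stack *s)   { assert(s->n > 0);  return s->v[s->n-1]; }
static int  size(const Stack *s)  { return s->n; }

/* check the axioms on every stack reachable by short push sequences */
static void check(Stack s)
{
    for (int x = 0; x < 3; x++) {
        Stack t = s;
        int before = size(&t);

        push(&t, x);
        assert(top(&t) == x);             /* axiom 1 */
        assert(size(&t) == before + 1);   /* axiom 3 */
        pop(&t);
        assert(size(&t) == before);       /* axiom 2, size part */
        for (int i = 0; i < before; i++)
            assert(t.v[i] == s.v[i]);     /* axiom 2, contents  */
        if (before + 1 < CAP - 1) {
            push(&t, x);
            check(t);                     /* recurse on deeper stacks */
        }
    }
}

int main(void)
{
    Stack s = { 0 };
    check(s);
    printf("all axioms hold on the tested subdomain\n");
    return 0;
}

The recursion enumerates the whole subdomain of small stacks, in the
spirit of exhaustive rather than sampled testing.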

Now it may be that applying this to anything more complex than stacks won't work well & it won't be perfect but I thought this was worth experimenting with. This would be like functional testing of all the nuts and bolts and components that go in an airplane. The airplane may still fall apart but that would be a "composition" error!

[1] There are "proof assistant" or formal spec languages such as TLA+, Coq, Isabelle etc. but they don't get used much by the average programmer. I want something more retail!

> On May 23, 2024, at 1:52 PM, Rob Pike <robpike@gmail.com> wrote:
> 
> The semantic distinction is important but the end result is very similar. "Fuzzing" as it is now called (for no reason I can intuit) tries to get to the troublesome cases faster by a sort of depth-first search, but exhaustive will always beat it for value. Our exhaustive tester for bitblt, first done by John Reiser if I remember right, set the stage for my own thinking about how you properly test something.
> 
> -rob
> 
> 
> On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu <mailto:douglas.mcilroy@dartmouth.edu>> wrote:
>> > Doug McIlroy was generating random regular expressions
>> 
>> Actually not. I exhaustively (within limits) tested an RE recognizer without knowingly generating any RE either mechanically or by hand.
>> 
>> The trick: From recursive equations (easily derived from the grammar of REs), I counted how many REs exist up to various limits on token counts. Then I generated all strings that satisfied those limits, turned the recognizer loose on them and counted how many it accepted. Any disagreement of counts revealed the existence (but not any symptom) of bugs. 
>> 
>> Unlike most diagnostic techniques, this scheme produces a certificate of (very high odds on) correctness over a representative subdomain. The scheme also agnostically checks behavior on bad inputs as well as good.  It does not, however, provide a stress test of a recognizer's capacity limits. And its exponential nature limits its applicability to rather small domains. (REs have only 5 distinct kinds of token.)
>> 
>> Doug




* [TUHS] Re: A fuzzy awk
From: Dan Halbert @ 2024-05-24 11:56 UTC (permalink / raw)
  To: tuhs


On 5/24/24 03:17, Ralph Corderoy wrote:
> Rob wrote:
>> "Fuzzing" as it is now called (for no reason I can intuit)
> Barton Miller describes coining the term.
>
As to where the inspiration for the choice of word came from, I'll
speculate: Bart Miller was a CS grad student contemporary of mine at
Berkeley.  Prof. Lotfi Zadeh was working on fuzzy logic, fuzzy sets, and
"possibility theory".  (Prof. William Kahan hated this work, and called
it "wrong, and pernicious": cf.
https://www.sciencedirect.com/science/article/abs/pii/S0020025508000716.)
So the term "fuzzy" was almost infamous in the department.

Prof. Richard Lipton was also at Berkeley at that time, and was working 
on program mutation testing, which fuzzes the program to determine the 
adequacy of test coverage, rather than fuzzing the test data.

Dan H.



* [TUHS] Re: A fuzzy awk
From: Rob Pike @ 2024-05-24  7:41 UTC (permalink / raw)
  To: Ralph Corderoy; +Cc: tuhs


I'm sure that's the etymology but fuzzing isn't exactly random. That's
kinda the point of it.

-rob


On Fri, May 24, 2024 at 5:18 PM Ralph Corderoy <ralph@inputplus.co.uk>
wrote:

> Hi,
>
> Rob wrote:
> > "Fuzzing" as it is now called (for no reason I can intuit)
>
> Barton Miller describes coining the term.
>
>    ‘That night, I was logged on to the Unix system in my office via
>     a dial-up phone line over a 1200 baud modem.  ...
>     I wanted a name that would evoke the feeling of random, unstructured
>     data.  After trying out several ideas, I settled on the term “fuzz”.’
>
>         — https://pages.cs.wisc.edu/~bart/fuzz/Foreword1.html
>
> Line noise inspired him, as he describes.
>
> --
> Cheers, Ralph.
>



* [TUHS] Re: A fuzzy awk
From: Ralph Corderoy @ 2024-05-24  7:17 UTC (permalink / raw)
  To: tuhs

Hi,

Rob wrote:
> "Fuzzing" as it is now called (for no reason I can intuit)

Barton Miller describes coining the term.

   ‘That night, I was logged on to the Unix system in my office via
    a dial-up phone line over a 1200 baud modem.  ...
    I wanted a name that would evoke the feeling of random, unstructured
    data.  After trying out several ideas, I settled on the term “fuzz”.’

        — https://pages.cs.wisc.edu/~bart/fuzz/Foreword1.html

Line noise inspired him, as he describes.

-- 
Cheers, Ralph.


* [TUHS] Re: A fuzzy awk
From: andrew @ 2024-05-24  5:41 UTC (permalink / raw)
  To: Rob Pike; +Cc: Douglas McIlroy, TUHS main list


i did some of the later testing of bitblt.
it was a lovely thing, slowly constructing a trustable synthetic bitblt
of ever greater size and range that you could compare the bitblt to be tested against.

and we did find a couple of bugs, much to reiser’s chagrin.

> On May 23, 2024, at 1:52 PM, Rob Pike <robpike@gmail.com> wrote:
> 
> The semantic distinction is important but the end result is very similar. "Fuzzing" as it is now called (for no reason I can intuit) tries to get to the troublesome cases faster by a sort of depth-first search, but exhaustive will always beat it for value. Our exhaustive tester for bitblt, first done by John Reiser if I remember right, set the stage for my own thinking about how you properly test something.
> 
> -rob
> 
> 
> On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu <mailto:douglas.mcilroy@dartmouth.edu>> wrote:
>> > Doug McIlroy was generating random regular expressions
>> 
>> Actually not. I exhaustively (within limits) tested an RE recognizer without knowingly generating any RE either mechanically or by hand.
>> 
> >> The trick: From recursive equations (easily derived from the grammar of REs), I counted how many REs exist up to various limits on token counts. Then I generated all strings that satisfied those limits, turned the recognizer loose on them and counted how many it accepted. Any disagreement of counts revealed the existence (but not any symptom) of bugs. 
>> 
>> Unlike most diagnostic techniques, this scheme produces a certificate of (very high odds on) correctness over a representative subdomain. The scheme also agnostically checks behavior on bad inputs as well as good.  It does not, however, provide a stress test of a recognizer's capacity limits. And its exponential nature limits its applicability to rather small domains. (REs have only 5 distinct kinds of token.)
>> 
>> Doug




* [TUHS] Re: A fuzzy awk
From: Rob Pike @ 2024-05-23 20:52 UTC (permalink / raw)
  To: Douglas McIlroy; +Cc: TUHS main list


The semantic distinction is important but the end result is very similar.
"Fuzzing" as it is now called (for no reason I can intuit) tries to get to
the troublesome cases faster by a sort of depth-first search, but
exhaustive will always beat it for value. Our exhaustive tester for bitblt,
first done by John Reiser if I remember right, set the stage for my own
thinking about how you properly test something.

-rob


On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <
douglas.mcilroy@dartmouth.edu> wrote:

> > Doug McIlroy was generating random regular expressions
>
> Actually not. I exhaustively (within limits) tested an RE recognizer
> without knowingly generating any RE either mechanically or by hand.
>
> The trick: From recursive equations (easily derived from the grammar of
> REs), I counted how many REs exist up to various limits on token counts.
> Then I generated all strings that satisfied those limits, turned the
> recognizer loose on them and counted how many it accepted. Any disagreement
> of counts revealed the existence (but not any symptom) of bugs.
>
> Unlike most diagnostic techniques, this scheme produces a certificate of
> (very high odds on) correctness over a representative subdomain. The
> scheme also agnostically checks behavior on bad inputs as well as good.  It
> does not, however, provide a stress test of a recognizer's capacity limits. And
> its exponential nature limits its applicability to rather small domains.
> (REs have only 5 distinct kinds of token.)
>
> Doug
>



* [TUHS] Re: A fuzzy awk
From: Douglas McIlroy @ 2024-05-23 13:49 UTC (permalink / raw)
  To: TUHS main list


> Doug McIlroy was generating random regular expressions

Actually not. I exhaustively (within limits) tested an RE recognizer
without knowingly generating any RE either mechanically or by hand.

The trick: From recursive equations (easily derived from the grammar of
REs), I counted how many REs exist up to various limits on token counts.
Then I generated all strings that satisfied those limits, turned the
recognizer loose on them and counted how many it accepted. Any disagreement
of counts revealed the existence (but not any symptom) of bugs.

Unlike most diagnostic techniques, this scheme produces a certificate of
(very high odds on) correctness over a representative subdomain. The scheme
also agnostically checks behavior on bad inputs as well as good.  It does
not, however, provide a stress test of a recognizer's capacity limits. And
its exponential nature limits its applicability to rather small domains.
(REs have only 5 distinct kinds of token.)

Doug



* [TUHS] Re: A fuzzy awk.
From: Larry McVoy @ 2024-05-22 20:17 UTC (permalink / raw)
  To: tuhs

Wayne teased this into a standalone library here:

https://github.com/wscott/bksupport

On Wed, May 22, 2024 at 11:49:04AM -0700, Larry McVoy wrote:
> On Wed, May 22, 2024 at 11:37:39AM -0400, Paul Winalski wrote:
> > On Tue, May 21, 2024 at 2:12 PM Luther Johnson <luther.johnson@makerlisp.com>
> > wrote:
> > 
> > > I like this anecdote because it points out the difference between being
> > > able to handle and process bizarre conditions, as if they were something
> > > that should work, which is maybe not that helpful, vs. detecting them and
> > > doing something reasonable, like failing with a "limit exceeded" message
> > >
> > That is in fact precisely how the DEC compiler handled the 100 nested
> > parentheses condition.
> > 
> > > . A silent, insidious failure down the line because a limit was exceeded
> > > is never good.
> > >
> > Amen!  One should always do bounds checking when dealing with fixed-size
> > aggregate data structures.  One compiler that I worked on got a bug report
> > of bad code being generated.  The problem was an illegal optimization that
> > never should have triggered but did due to a corrupted data table.  Finding
> > the culprit of the corruption took hours.  It finally turned out to be due
> > to overflow of an adjacent data table in use elsewhere in the compiler.
> > The routine to add another entry to that table didn't check for table
> > overflow.
> 
> We invented a data structure that gets around this problem nicely.  It's
> an array of pointers that starts at [1] instead of [0].  The [0]
> entry encodes 2 things:
> 
> In the upper bits, the log(2) of the size of the array.  So all arrays
> have at least [0] and [1].  So 2 pointers is the smallest array and
> that was important to us, we wanted it to scale up and scale down.
> 
> In the lower bits, we record the number of used entries in the array.
> We assumed 32 bit pointers and with those we got ~134 million entries
> as our maximum number of entries.
> 
> Usage is like
> 
> char **space = allocLines(4);	// start with space for 4 entries
> 
> space = addLine(space, "I am [1]");
> space = addLine(space, "I am [2]");
> space = addLine(space, "I am [3]");
> space = addLine(space, "I am [4]");	// realloc's to 8 entries
> 
> freeLines(space, 0);	// second arg is typically 0 or free()
> 
> It works GREAT.  We used it all over BitKeeper, for stuff as small as
> commit comments to arrays of data structures.  It scales down, scales
> up.  Helper functions:
> 
> /*
>  * liblines - interfaces for autoexpanding data structures
>  *
>  * s= allocLines(n)
>  *      pre allocate space for slightly less than N entries.
>  * s = addLine(s, line)
>  *      add line to s, allocating as needed.
>  *      line must be a pointer to preallocated space.
>  * freeLines(s, freep)
>  *      free the lines array; if freep is set, call that on each entry.
>  *      if freep is 0, do not free each entry.
>  * buf = popLine(s)
>  *      return the most recently added line (not an alloced copy of it)
>  * reverseLines(s)
>  *      reverse the order of the lines in the array
>  * sortLines(space, compar)
>  *      sort the lines using the compar function if set, else string_sort()
>  * removeLine(s, which, freep)
>  *      look for all lines which match "which" and remove them from the array
>  *      returns number of matches found
>  * removeLineN(s, i, freep)
>  *      remove the 'i'th line.
>  * lines = splitLine(buf, delim, lines)
>  *      split buf on any/all chars in delim and put the tokens in lines.
>  * buf = joinLines(":", s)
>  *      return one string which is all the strings glued together with ":"
>  *      does not free s, caller must free s.
>  * buf = findLine(lines, needle);
>  *      Return the index of the line in lines that matches needle
>  */
> 
> It's all open source, apache licensed, but you'd have to tease it out of
> the bitkeeper source tree.  Wouldn't be that hard and it would be useful.

-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat


* [TUHS] Re: A fuzzy awk.
From: Larry McVoy @ 2024-05-22 18:49 UTC (permalink / raw)
  To: Paul Winalski; +Cc: Luther Johnson, tuhs

On Wed, May 22, 2024 at 11:37:39AM -0400, Paul Winalski wrote:
> On Tue, May 21, 2024 at 2:12 PM Luther Johnson <luther.johnson@makerlisp.com>
> wrote:
> 
> > I like this anecdote because it points out the difference between being
> > able to handle and process bizarre conditions, as if they were something
> > that should work, which is maybe not that helpful, vs. detecting them and
> > doing something reasonable, like failing with a "limit exceeded" message
> >
> That is in fact precisely how the DEC compiler handled the 100 nested
> parentheses condition.
> 
> > . A silent, insidious failure down the line because a limit was exceeded
> > is never good.
> >
> Amen!  One should always do bounds checking when dealing with fixed-size
> aggregate data structures.  One compiler that I worked on got a bug report
> of bad code being generated.  The problem was an illegal optimization that
> never should have triggered but did due to a corrupted data table.  Finding
> the culprit of the corruption took hours.  It finally turned out to be due
> to overflow of an adjacent data table in use elsewhere in the compiler.
> The routine to add another entry to that table didn't check for table
> overflow.

We invented a data structure that gets around this problem nicely.  It's
an array of pointers that starts at [1] instead of [0].  The [0]
entry encodes 2 things:

In the upper bits, the log base 2 of the size of the array.  So all arrays
have at least [0] and [1]; 2 pointers is the smallest array, and that
was important to us, since we wanted it to scale up and scale down.

In the lower bits, we record the number of used entries in the array.
We assumed 32-bit pointers, and with those we got ~134 million entries
as our maximum.

Usage is like

char **space = allocLines(4);	// start with space for 4 entries

space = addLine(space, "I am [1]");
space = addLine(space, "I am [2]");
space = addLine(space, "I am [3]");
space = addLine(space, "I am [4]");	// realloc's to 8 entries

freeLines(space, 0);	// second arg is typically 0 or free()

It works GREAT.  We used it all over BitKeeper, for everything from
commit comments to arrays of data structures.  It scales down, scales
up.  Helper functions:

/*
 * liblines - interfaces for autoexpanding data structures
 *
 * s = allocLines(n)
 *      preallocate space for slightly less than n entries.
 * s = addLine(s, line)
 *      add line to s, allocating as needed.
 *      line must be a pointer to preallocated space.
 * freeLines(s, freep)
 *      free the lines array; if freep is set, call that on each entry.
 *      if freep is 0, do not free each entry.
 * buf = popLine(s)
 *      return the most recently added line (not an alloced copy of it)
 * reverseLines(s)
 *      reverse the order of the lines in the array
 * sortLines(space, compar)
 *      sort the lines using the compar function if set, else string_sort()
 * removeLine(s, which, freep)
 *      look for all lines which match "which" and remove them from the array
 *      returns number of matches found
 * removeLineN(s, i, freep)
 *      remove the 'i'th line.
 * lines = splitLine(buf, delim, lines)
 *      split buf on any/all chars in delim and put the tokens in lines.
 * buf = joinLines(":", s)
 *      return one string which is all the strings glued together with ":"
 *      does not free s, caller must free s.
 * buf = findLine(lines, needle);
 *      Return the index of the line in lines that matches needle
 */

It's all open source, Apache-licensed, but you'd have to tease it out of
the BitKeeper source tree.  Wouldn't be that hard, and it would be useful.
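
A minimal sketch of the three core calls, assuming slot [0] packs
5 bits of log2(size) above 27 bits of count; the exact bit split, the
missing error handling, and the 32-bit assumption are illustrative
guesses, not the actual BitKeeper code:

#include <stdlib.h>
#include <stdint.h>

#define LOGSHIFT 27                          /* top 5 bits: log2(size) */
#define CNTMASK  ((1u << LOGSHIFT) - 1)      /* low 27 bits: used count */

static uint32_t hdr(char **s)   { return (uint32_t)(uintptr_t)s[0]; }
static uint32_t nused(char **s) { return hdr(s) & CNTMASK; }

char **allocLines(uint32_t n)        /* room for slightly less than n */
{
        uint32_t lg = 1;             /* smallest array: slots [0] and [1] */

        while ((1u << lg) < n)
                lg++;
        char **s = calloc(1u << lg, sizeof(char *));
        s[0] = (char *)(uintptr_t)(lg << LOGSHIFT);   /* count = 0 */
        return s;
}

char **addLine(char **s, char *line)
{
        uint32_t lg = hdr(s) >> LOGSHIFT, used = nused(s);

        if (used + 1 >= (1u << lg)) {        /* full: double the array */
                lg++;
                s = realloc(s, (1u << lg) * sizeof(char *));
        }
        s[++used] = line;                    /* entries live in [1]..[used] */
        s[0] = (char *)(uintptr_t)((lg << LOGSHIFT) | used);
        return s;
}

void freeLines(char **s, void (*freep)(void *))
{
        if (freep)
                for (uint32_t i = 1; i <= nused(s); i++)
                        freep(s[i]);
        free(s);
}

On a 64-bit machine the same trick still works, since only the low
32 bits of slot [0] are ever used, though a real implementation would
probably widen the count instead.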

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21 18:12     ` Luther Johnson
@ 2024-05-22 15:37       ` Paul Winalski
  2024-05-22 18:49         ` Larry McVoy
  0 siblings, 1 reply; 47+ messages in thread
From: Paul Winalski @ 2024-05-22 15:37 UTC (permalink / raw)
  To: Luther Johnson; +Cc: tuhs

[-- Attachment #1: Type: text/plain, Size: 1139 bytes --]

On Tue, May 21, 2024 at 2:12 PM Luther Johnson <luther.johnson@makerlisp.com>
wrote:

> I like this anecdote because it points out the difference between being
> able to handle and process bizarre conditions, as if they were something
> that should work, which is maybe not that helpful, vs. detecting them and
> doing something reasonable, like failing with a "limit exceeded" message
>
That is in fact precisely how the DEC compiler handled the 100 nested
parentheses condition.

> . A silent, insidious failure down the line because a limit was exceeded
> is never good.
>
Amen!  One should always do bounds checking when dealing with fixed-size
aggregate data structures.  One compiler that I worked on got a bug report
of bad code being generated.  The problem was an illegal optimization that
never should have triggered but did due to a corrupted data table.  Finding
the culprit of the corruption took hours.  It finally turned out to be due
to overflow of an adjacent data table in use elsewhere in the compiler.
The routine to add another entry to that table didn't check for table
overflow.
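
The missing check is a one-liner.  A generic sketch, with the table
and the names (symtab, MAXSYMS, add_sym) invented for illustration:

#include <stdio.h>
#include <stdlib.h>

#define MAXSYMS 1024

static struct sym { const char *name; } symtab[MAXSYMS];
static int nsyms;

/* Add an entry, refusing to run off the end of the table rather than
 * silently corrupting whatever data happens to live next to it. */
static struct sym *add_sym(const char *name)
{
        if (nsyms >= MAXSYMS) {
                fprintf(stderr, "fatal: symbol table overflow (max %d)\n",
                    MAXSYMS);
                exit(1);
        }
        symtab[nsyms].name = name;
        return &symtab[nsyms++];
}

A "limit exceeded" exit here would have turned an hours-long hunt
into a one-line diagnostic.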

-Paul W.

[-- Attachment #2: Type: text/html, Size: 1742 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
  2024-05-20 14:26     ` Chet Ramey
@ 2024-05-22 13:44     ` arnold
  1 sibling, 0 replies; 47+ messages in thread
From: arnold @ 2024-05-22 13:44 UTC (permalink / raw)
  To: tuhs, ralph

I've been travelling, so I haven't been able to answer
these mails until now.

Ralph Corderoy <ralph@inputplus.co.uk> wrote:

> I can see that an avalanche of errors in an earlier gawk caused problems, but
> each time there would have been a first patch of the input which made
> a mistake causing the pebble to start rolling.  My understanding is that
> there were potentially a lot of these, and rather than fix them it was
> a better use of the limited time to stop patching the input.  Then
> the code which patched could be deleted, getting rid of the buggy bits
> along the way?

That's not the case. Gawk didn't try to patch the input. It
simply set a flag saying "don't try to run" but kept on parsing
anyway, in the hope of finding more errors.

That was a bad idea, because the representation of the program
being built was then not in the correct state to have more
stuff parsed and converted into byte code.

Very early on, the first parse error caused an exit. I changed
it to keep going, to try to be helpful. But when that became a source
of essentially specious bug reports and a time sink for me, it
was time to go back to exiting on the first problem.
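
Schematically, the keep-going middle period looked something like
this toy (an invented illustration, not gawk's source; the input
"language" here is just one integer per line):

#include <stdio.h>
#include <stdlib.h>

static int errcount;                    /* parse errors seen so far */

int main(void)
{
        char line[256], junk;
        int lineno = 0, n;

        while (fgets(line, sizeof line, stdin)) {
                lineno++;
                /* "parse": accept exactly one integer per line */
                if (sscanf(line, "%d %c", &n, &junk) != 1) {
                        fprintf(stderr, "parse error at line %d\n", lineno);
                        errcount++;     /* note it, keep going for more */
                }
        }
        if (errcount) {
                fprintf(stderr, "%d error(s): not running\n", errcount);
                exit(2);                /* parsed for diagnostics only */
        }
        /* ...execution of the parsed program would start here... */
        return 0;
}

The toy is safe only because nothing downstream depends on the bad
line; the trouble in gawk was that parsing kept building byte code
from a representation already known to be inconsistent.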

HTH,

Arnold

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-22  5:08       ` Alexis
@ 2024-05-22 13:12         ` Warner Losh
  0 siblings, 0 replies; 47+ messages in thread
From: Warner Losh @ 2024-05-22 13:12 UTC (permalink / raw)
  To: Alexis; +Cc: The Unix Heritage Society

[-- Attachment #1: Type: text/plain, Size: 1095 bytes --]

On Tue, May 21, 2024, 11:08 PM Alexis <flexibeast@gmail.com> wrote:

> Dave Horsfall <dave@horsfall.org> writes:
>
> > On Tue, 21 May 2024, Paul Winalski wrote:
> >
> >> To take an example that really happened, a fuzz test consisting
> >> of 100
> >> nested parentheses caused an overflow in a parser table (it
> >> could only
> >> handle 50 nested parens).  Is that worth fixing?
> >
> > Well, they could be a rabid LISP programmer...
>
> Just did a quick check of some of the ELisp packages on my system:
>
> * For my own packages, the maximum was 10 closing parentheses.
> * For the packages in my elpa/ directory, the maximum was 26 in
>   ducpel-glyphs.el, where they were part of a glyph, rather than
>   delimiting code. The next highest value was 16, in org.el and
>   magit-sequence.el.
>
> i would suggest that any Lisp with more than a couple of dozen
> closing parentheses is in dire need of refactoring. Although of
> course someone who's rabid is probably not in the appropriate
> mental state for that. :-)
>

That's what ']' is for.

Warner


[-- Attachment #2: Type: text/html, Size: 1822 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-22  3:26     ` Dave Horsfall
@ 2024-05-22  5:08       ` Alexis
  2024-05-22 13:12         ` Warner Losh
  0 siblings, 1 reply; 47+ messages in thread
From: Alexis @ 2024-05-22  5:08 UTC (permalink / raw)
  To: Dave Horsfall; +Cc: The Unix Heritage Society

Dave Horsfall <dave@horsfall.org> writes:

> On Tue, 21 May 2024, Paul Winalski wrote:
>
>> To take an example that really happened, a fuzz test consisting 
>> of 100 
>> nested parentheses caused an overflow in a parser table (it 
>> could only 
>> handle 50 nested parens).  Is that worth fixing?
>
> Well, they could be a rabid LISP programmer...

Just did a quick check of some of the ELisp packages on my system:

* For my own packages, the maximum was 10 closing parentheses.
* For the packages in my elpa/ directory, the maximum was 26 in 
  ducpel-glyphs.el, where they were part of a glyph, rather than 
  delimiting code. The next highest value was 16, in org.el and 
  magit-sequence.el.

i would suggest that any Lisp with more than a couple of dozen 
closing parentheses is in dire need of refactoring. Although of 
course someone who's rabid is probably not in the appropriate 
mental state for that. :-)
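
One quick way to make such a check (a throwaway sketch, not
necessarily what was used here) is to scan each file for its longest
run of ')' characters:

#include <stdio.h>

int main(void)
{
        int c, run = 0, max = 0;

        while ((c = getchar()) != EOF) {
                run = (c == ')') ? run + 1 : 0; /* current run of ')' */
                if (run > max)
                        max = run;
        }
        printf("%d\n", max);                    /* longest run seen */
        return 0;
}

Run as, e.g., maxparens < org.el.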


Alexis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21 16:59   ` Paul Winalski
  2024-05-21 17:56     ` segaloco via TUHS
  2024-05-21 18:12     ` Luther Johnson
@ 2024-05-22  3:26     ` Dave Horsfall
  2024-05-22  5:08       ` Alexis
  2 siblings, 1 reply; 47+ messages in thread
From: Dave Horsfall @ 2024-05-22  3:26 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

[-- Attachment #1: Type: text/plain, Size: 307 bytes --]

On Tue, 21 May 2024, Paul Winalski wrote:

> To take an example that really happened, a fuzz test consisting of 100 
> nested parentheses caused an overflow in a parser table (it could only 
> handle 50 nested parens).  Is that worth fixing?

Well, they could be a rabid LISP programmer...

-- Dave

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21 16:59   ` Paul Winalski
  2024-05-21 17:56     ` segaloco via TUHS
@ 2024-05-21 18:12     ` Luther Johnson
  2024-05-22 15:37       ` Paul Winalski
  2024-05-22  3:26     ` Dave Horsfall
  2 siblings, 1 reply; 47+ messages in thread
From: Luther Johnson @ 2024-05-21 18:12 UTC (permalink / raw)
  To: tuhs

[-- Attachment #1: Type: text/plain, Size: 3463 bytes --]

I like this anecdote because it points out the difference between being
able to handle and process bizarre conditions, as if they were something
that should work, which is maybe not that helpful, vs. detecting them
and doing something reasonable, like failing with a "limit exceeded"
message. A silent, insidious failure down the line because a limit was
exceeded is never good. If "fuzz testing" helps exercise limits and
identifies places where software hasn't realized it has exceeded its
limits, has run off the end of a table, etc., that seems like a good
thing to me.

On 05/21/2024 09:59 AM, Paul Winalski wrote:
> On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:
>
>         Well this is obviously a hot button topic.  AFAIK I was nearby
>         when fuzz-testing for software was invented. I was the main
>         advocate for hiring Andy Payne into the Digital Cambridge
>         Research Lab.  One of his little projects was a thing that
>         generated random but correct C programs and fed them to
>         different compilers or compilers with different switches to
>         see if they crashed or generated incorrect results.
>         Overnight, his tester filed 300 or so bug reports against the
>         Digital C compiler.  This was met with substantial pushback,
>         but it was mostly an issue that many of the reports traced
>         to the same underlying bugs.
>
>         Bill McKeeman expanded the technique and published
>         "Differential Testing for Software"
>         https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
>
> In the mid-to-late 1980s Bill McKeeman worked with DEC's compiler product
> teams to introduce fuzz testing into our testing process.  As with the
> C compiler work at DEC Cambridge, fuzz testing for other compilers
> (Fortran, PL/I) also found large numbers of bugs.
>
> The pushback from the compiler folks was mainly a matter of
> priorities.  Fuzz testing is very adept at finding edge conditions,
> but most failing fuzz tests have syntax that no human programmer would
> ever write.  As a compiler engineer you have limited time to devote to
> bug testing.  Do you spend that time addressing real customer issues
> that have been reported or do you spend it fixing problems with code
> that no human being would ever write?  To take an example that really
> happened, a fuzz test consisting of 100 nested parentheses caused an
> overflow in a parser table (it could only handle 50 nested parens).
> Is that worth fixing?
>
> As you pointed out, fuzz test failures tend to occur in clusters and
> many of the failures eventually are traced to the same underlying
> bug.  Which leads to the counter-argument to the pushback.  The fuzz
> tests are finding real underlying bugs.  Why not fix them before a
> customer runs into them?  That very thing did happen several times.  A
> customer-reported bug was fixed and suddenly several of the fuzz test
> problems that had been reported went away.  Another consideration is
> that, even back in the 1980s, humans weren't the only ones writing
> programs.  There were programs writing programs and they sometimes
> produced bizarre (but syntactically correct) code.
>
> -Paul W.


[-- Attachment #2: Type: text/html, Size: 5407 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21 16:59   ` Paul Winalski
@ 2024-05-21 17:56     ` segaloco via TUHS
  2024-05-21 18:12     ` Luther Johnson
  2024-05-22  3:26     ` Dave Horsfall
  2 siblings, 0 replies; 47+ messages in thread
From: segaloco via TUHS @ 2024-05-21 17:56 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Tuesday, May 21st, 2024 at 9:59 AM, Paul Winalski <paul.winalski@gmail.com> wrote:

> On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:
> 
> > > Well this is obviously a hot button topic. AFAIK I was nearby when fuzz-testing for software was invented. I was the main advocate for hiring Andy Payne into the Digital Cambridge Research Lab. One of his little projects was a thing that generated random but correct C programs and fed them to different compilers or compilers with different switches to see if they crashed or generated incorrect results. Overnight, his tester filed 300 or so bug reports against the Digital C compiler. This was met with substantial pushback, but it was mostly an issue that many of the reports traced to the same underlying bugs.
> > > 
> > > Bill McKeeman expanded the technique and published "Differential Testing for Software" https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
> 
> In the mid-to-late 1980s Bill McKeeman worked with DEC's compiler product teams to introduce fuzz testing into our testing process. As with the C compiler work at DEC Cambridge, fuzz testing for other compilers (Fortran, PL/I) also found large numbers of bugs.
> 
> The pushback from the compiler folks was mainly a matter of priorities. Fuzz testing is very adept at finding edge conditions, but most failing fuzz tests have syntax that no human programmer would ever write. As a compiler engineer you have limited time to devote to bug testing. Do you spend that time addressing real customer issues that have been reported or do you spend it fixing problems with code that no human being would ever write? To take an example that really happened, a fuzz test consisting of 100 nested parentheses caused an overflow in a parser table (it could only handle 50 nested parens). Is that worth fixing?
> 
> As you pointed out, fuzz test failures tend to occur in clusters and many of the failures eventually are traced to the same underlying bug. Which leads to the counter-argument to the pushback. The fuzz tests are finding real underlying bugs. Why not fix them before a customer runs into them? That very thing did happen several times. A customer-reported bug was fixed and suddenly several of the fuzz test problems that had been reported went away. Another consideration is that, even back in the 1980s, humans weren't the only ones writing programs. There were programs writing programs and they sometimes produced bizarre (but syntactically correct) code.
> 
> -Paul W.

A happy medium could be including far-out fuzzing to characterize issues without necessarily sinking resources into resolving the bizarre discoveries right away.  Better to know than not, but also to have the wisdom to determine "is someone actually going to trip this" vs. "this is something that is possible and good to document".  In my own work we have several of the latter, where something is almost guaranteed never to happen with human interaction but is something we want documented somewhere, so that if unlikely problem <xyz> ever does happen, the discovery is already done and we just start plotting out a solution.  That's also some nice low-hanging fruit to pluck when there isn't much else going on, and it avoids the phenomenon where we sink critical time into bugfixes with a microscopic ROI.

- Matt G.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  1:56 ` Rob Pike
  2024-05-21  2:47   ` Larry McVoy
  2024-05-21  3:53   ` George Michaelson
@ 2024-05-21 16:59   ` Paul Winalski
  2024-05-21 17:56     ` segaloco via TUHS
                       ` (2 more replies)
  2 siblings, 3 replies; 47+ messages in thread
From: Paul Winalski @ 2024-05-21 16:59 UTC (permalink / raw)
  To: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 2532 bytes --]

On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:

> Well this is obviously a hot button topic.  AFAIK I was nearby when
>> fuzz-testing for software was invented. I was the main advocate for hiring
>> Andy Payne into the Digital Cambridge Research Lab.  One of his little
>> projects was a thing that generated random but correct C programs and fed
>> them to different compilers or compilers with different switches to see if
>> they crashed or generated incorrect results.  Overnight, his tester filed
>> 300 or so bug reports against the Digital C compiler.  This was met with
>> substantial pushback, but it was mostly an issue that many of the reports
>> traced to the same underlying bugs.
>>
>> Bill McKeeman expanded the technique and published "Differential Testing
>> for Software"
>> https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
>>
>
In the mid-to-late 1980s Bill McKeeman worked with DEC's compiler product
teams to introduce fuzz testing into our testing process.  As with the C
compiler work at DEC Cambridge, fuzz testing for other compilers (Fortran,
PL/I) also found large numbers of bugs.

The pushback from the compiler folks was mainly a matter of priorities.
Fuzz testing is very adept at finding edge conditions, but most failing
fuzz tests have syntax that no human programmer would ever write.  As a
compiler engineer you have limited time to devote to bug testing.  Do you
spend that time addressing real customer issues that have been reported or
do you spend it fixing problems with code that no human being would ever
write?  To take an example that really happened, a fuzz test consisting of
100 nested parentheses caused an overflow in a parser table (it could only
handle 50 nested parens).  Is that worth fixing?

As you pointed out, fuzz test failures tend to occur in clusters and many
of the failures eventually are traced to the same underlying bug.  Which
leads to the counter-argument to the pushback.  The fuzz tests are finding
real underlying bugs.  Why not fix them before a customer runs into them?
That very thing did happen several times.  A customer-reported bug was
fixed and suddenly several of the fuzz test problems that had been reported
went away.  Another consideration is that, even back in the 1980s, humans
weren't the only ones writing programs.  There were programs writing
programs and they sometimes produced bizarre (but syntactically correct)
code.

-Paul W.

[-- Attachment #2: Type: text/html, Size: 3170 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  3:36       ` Rob Pike
@ 2024-05-21 11:59         ` Peter Weinberger (温博格) via TUHS
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Weinberger (温博格) via TUHS @ 2024-05-21 11:59 UTC (permalink / raw)
  To: Rob Pike; +Cc: TUHS main list

On a lesser note, one day I got tired of C compiler crashes (probably
on the Vax, possibly originating in my code generator) and converted
them into 'fatal internal error' messages.

On Mon, May 20, 2024 at 11:36 PM Rob Pike <robpike@gmail.com> wrote:
>
> Eventually Dennis told Ron to stop as he wasn't interested in protecting against insane things like "unsigned register union". Now that computing has become more adversarial, he might feel differently.
>
> -rob
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  1:56 ` Rob Pike
  2024-05-21  2:47   ` Larry McVoy
@ 2024-05-21  3:53   ` George Michaelson
  2024-05-21 16:59   ` Paul Winalski
  2 siblings, 0 replies; 47+ messages in thread
From: George Michaelson @ 2024-05-21  3:53 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Tue, May 21, 2024 at 11:56 AM Rob Pike <robpike@gmail.com> wrote:
> It's probably impossible to decide who invented fuzzing, so the credit will surely go to the person who named it.

That theory probably applies to the Earl of Sandwich, and the Earl of
Cardigan. Hore-Belisha also did OK with giant orange balls at zebra
crossings. I'm less sure the Earl of Zebra feels recognised, or that
Eugène-René Poubelle feels happy with his namesake (he should do, dust
bins are huge).

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  2:54     ` Lawrence Stewart
@ 2024-05-21  3:36       ` Rob Pike
  2024-05-21 11:59         ` Peter Weinberger (温博格) via TUHS
  0 siblings, 1 reply; 47+ messages in thread
From: Rob Pike @ 2024-05-21  3:36 UTC (permalink / raw)
  To: Lawrence Stewart; +Cc: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 208 bytes --]

Eventually Dennis told Ron to stop as he wasn't interested in protecting
against insane things like "unsigned register union". Now that computing
has become more adversarial, he might feel differently.

-rob

[-- Attachment #2: Type: text/html, Size: 530 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  2:47   ` Larry McVoy
@ 2024-05-21  2:54     ` Lawrence Stewart
  2024-05-21  3:36       ` Rob Pike
  0 siblings, 1 reply; 47+ messages in thread
From: Lawrence Stewart @ 2024-05-21  2:54 UTC (permalink / raw)
  To: Larry McVoy; +Cc: TUHS main list

Good to learn more of the history!  I wonder when the technique got started on the hardware side?  
I wouldn’t be surprised if IBM were doing some of this for the S/360 since it was a nearly 
compatible set of systems.
-L

> On May 20, 2024, at 10:47 PM, Larry McVoy <lm@mcvoy.com> wrote:
> 
> I think the title might go to my OS prof, Bart Miller.  He did a paper 
> 
> https://www.paradyn.org/papers/fuzz.pdf
> 
> that named it that in 1990.  
> 
> On Tue, May 21, 2024 at 11:56:30AM +1000, Rob Pike wrote:
>> Ron Hardin was doing this to Dennis's C compiler in the 1980s, well before
>> 1998. And I believe Doug McIlroy was generating random regular expressions
>> to compare different implementations. It's probably impossible to decide
>> who invented fuzzing, so the credit will surely go to the person who named
>> it.
>> 
>> -rob
>> 
>> 
>> On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:
>> 
>>> Well this is obviously a hot button topic.  AFAIK I was nearby when
>>> fuzz-testing for software was invented. I was the main advocate for hiring
>>> Andy Payne into the Digital Cambridge Research Lab.  One of his little
>>> projects was a thing that generated random but correct C programs and fed
>>> them to different compilers or compilers with different switches to see if
>>> they crashed or generated incorrect results.  Overnight, his tester filed
>>> 300 or so bug reports against the Digital C compiler.  This was met with
>>> substantial pushback, but it was mostly an issue that many of the reports
>>> traced to the same underlying bugs.
>>> 
>>> Bill McKeeman expanded the technique and published "Differential Testing
>>> for Software"
>>> https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
>>> 
>>> Andy had encountered the underlying idea while working as an intern on the
>>> Alpha processor development team.  Among many other testers, they used an
>>> architectural tester called REX to generate more or less random sequences
>>> of instructions, which were then run through different simulation chains
>>> (functional, RTL, cycle-accurate) to see if they did the same thing.
>>> Finding user-accessible bugs in hardware seems like a good thing.
>>> 
>>> The point of generating correct programs (mentioned under the term LangSec
>>> here) goes a long way to avoid irritating the maintainers.  Making the test
>>> cases short is also maintainer-friendly.  The test generator is also in a
>>> position to annotate the source with exactly what it is supposed to do,
>>> which is also helpful.
>>> 
>>> -L
>>> 
>>> 
>>> 
> 
> -- 
> ---
> Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  1:56 ` Rob Pike
@ 2024-05-21  2:47   ` Larry McVoy
  2024-05-21  2:54     ` Lawrence Stewart
  2024-05-21  3:53   ` George Michaelson
  2024-05-21 16:59   ` Paul Winalski
  2 siblings, 1 reply; 47+ messages in thread
From: Larry McVoy @ 2024-05-21  2:47 UTC (permalink / raw)
  To: Rob Pike; +Cc: TUHS main list

I think the title might go to my OS prof, Bart Miller.  He did a paper 

https://www.paradyn.org/papers/fuzz.pdf

that named it that in 1990.  

On Tue, May 21, 2024 at 11:56:30AM +1000, Rob Pike wrote:
> Ron Hardin was doing this to Dennis's C compiler in the 1980s, well before
> 1998. And I believe Doug McIlroy was generating random regular expressions
> to compare different implementations. It's probably impossible to decide
> who invented fuzzing, so the credit will surely go to the person who named
> it.
> 
> -rob
> 
> 
> On Tue, May 21, 2024 at 12:09???AM Serissa <stewart@serissa.com> wrote:
> 
> > Well this is obviously a hot button topic.  AFAIK I was nearby when
> > fuzz-testing for software was invented. I was the main advocate for hiring
> > Andy Payne into the Digital Cambridge Research Lab.  One of his little
> > projects was a thing that generated random but correct C programs and fed
> > them to different compilers or compilers with different switches to see if
> > they crashed or generated incorrect results.  Overnight, his tester filed
> > 300 or so bug reports against the Digital C compiler.  This was met with
> > substantial pushback, but it was mostly an issue that many of the reports
> > traced to the same underlying bugs.
> >
> > Bill McKeeman expanded the technique and published "Differential Testing
> > for Software"
> > https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
> >
> > Andy had encountered the underlying idea while working as an intern on the
> > Alpha processor development team.  Among many other testers, they used an
> > architectural tester called REX to generate more or less random sequences
> > of instructions, which were then run through different simulation chains
> > (functional, RTL, cycle-accurate) to see if they did the same thing.
> > Finding user-accessible bugs in hardware seems like a good thing.
> >
> > The point of generating correct programs (mentioned under the term LangSec
> > here) goes a long way to avoid irritating the maintainers.  Making the test
> > cases short is also maintainer-friendly.  The test generator is also in a
> > position to annotate the source with exactly what it is supposed to do,
> > which is also helpful.
> >
> > -L
> >
> >
> >

-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-20 14:09 Serissa
@ 2024-05-21  1:56 ` Rob Pike
  2024-05-21  2:47   ` Larry McVoy
                     ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Rob Pike @ 2024-05-21  1:56 UTC (permalink / raw)
  To: Serissa; +Cc: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 2040 bytes --]

Ron Hardin was doing this to Dennis's C compiler in the 1980s, well before
1998. And I believe Doug McIlroy was generating random regular expressions
to compare different implementations. It's probably impossible to decide
who invented fuzzing, so the credit will surely go to the person who named
it.

-rob


On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:

> Well this is obviously a hot button topic.  AFAIK I was nearby when
> fuzz-testing for software was invented. I was the main advocate for hiring
> Andy Payne into the Digital Cambridge Research Lab.  One of his little
> projects was a thing that generated random but correct C programs and fed
> them to different compilers or compilers with different switches to see if
> they crashed or generated incorrect results.  Overnight, his tester filed
> 300 or so bug reports against the Digital C compiler.  This was met with
> substantial pushback, but it was mostly an issue that many of the reports
> traced to the same underlying bugs.
>
> Bill McKeeman expanded the technique and published "Differential Testing
> for Software"
> https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
>
> Andy had encountered the underlying idea while working as an intern on the
> Alpha processor development team.  Among many other testers, they used an
> architectural tester called REX to generate more or less random sequences
> of instructions, which were then run through different simulation chains
> (functional, RTL, cycle-accurate) to see if they did the same thing.
> Finding user-accessible bugs in hardware seems like a good thing.
>
> The point of generating correct programs (mentioned under the term LangSec
> here) goes a long way to avoid irritating the maintainers.  Making the test
> cases short is also maintainer-friendly.  The test generator is also in a
> position to annotate the source with exactly what it is supposed to do,
> which is also helpful.
>
> -L
>
>
>

[-- Attachment #2: Type: text/html, Size: 2796 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
@ 2024-05-20 14:26     ` Chet Ramey
  2024-05-22 13:44     ` arnold
  1 sibling, 0 replies; 47+ messages in thread
From: Chet Ramey @ 2024-05-20 14:26 UTC (permalink / raw)
  To: Ralph Corderoy, TUHS main list


[-- Attachment #1.1: Type: text/plain, Size: 2043 bytes --]

On 5/20/24 9:41 AM, Ralph Corderoy wrote:
> Hi Chet,
> 
>> Doug wrote:
>>> I'm surprised by nonchalance about bad inputs evoking bad program
>>> behavior.
>>
>> I think the claim is that it's better to stop immediately with an
>> error on invalid input rather than guess at the user's intent and try
>> to go on.
> 
> That aside, having made the decision to patch up the input so more
> punched cards are consumed, the patch should be bug free.
> 
> Say it's inserting a semicolon token for pretence.  It should have
> initialised source-file locations just as if it were real.  Not an
> uninitialised pointer to a source filename so a later dereference
> failed.
> 
> I can see an avalanche of errors in an earlier gawk caused problems, but
> each time there would have been a first patch of the input which made
> a mistake causing the pebble to start rolling.  My understanding is that
> there was potentially a lot of these and rather than fix them it was
> more productive of the limited time to stop patching the input.  Then
> the code which patched could be deleted, getting rid of the buggy bits
> along the way?

Maybe we're talking about the same thing. My impression is that at
each point there was more than one potential token to insert and go on,
and gawk chose one (probably the most common one), in the hopes that it
would be able to report as many errors as possible. There's always the
chance you'll be wrong there.

(I have no insight into the actual nature of these issues, or the actual
corruption that caused the crashes, so take the next with skepticism.)

And then rather than go back and modify other state after inserting
this token -- which gawk did not do -- for the sole purpose of making
this guessing more crash-resistant, Arnold chose a different approach:
exit on invalid input.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 203 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
@ 2024-05-20 14:09 Serissa
  2024-05-21  1:56 ` Rob Pike
  0 siblings, 1 reply; 47+ messages in thread
From: Serissa @ 2024-05-20 14:09 UTC (permalink / raw)
  To: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 1571 bytes --]

Well this is obviously a hot button topic.  AFAIK I was nearby when fuzz-testing for software was invented. I was the main advocate for hiring Andy Payne into the Digital Cambridge Research Lab.  One of his little projects was a thing that generated random but correct C programs and fed them to different compilers or compilers with different switches to see if they crashed or generated incorrect results.  Overnight, his tester filed 300 or so bug reports against the Digital C compiler.  This was met with substantial pushback, but it was mostly an issue that many of the reports traced to the same underlying bugs.

Bill McKeeman expanded the technique and published "Differential Testing for Software" https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
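
A minimal sketch of that compile-run-and-compare loop; the file
names, the single compiler, and the -O0/-O2 switch pair are invented
for the example:

#include <stdio.h>
#include <stdlib.h>

/* Compile one generated test case two ways, run both binaries, and
 * flag any disagreement.  Assumes the random-program generator has
 * already written a deterministic program to gen.c. */
int main(void)
{
        if (system("cc -O0 -o t1 gen.c") || system("cc -O2 -o t2 gen.c")) {
                puts("BUG? a compilation crashed or was rejected");
                return 1;
        }
        if (system("./t1 >out1") || system("./t2 >out2")) {
                puts("BUG? a generated binary failed to run");
                return 1;
        }
        if (system("cmp -s out1 out2")) {
                puts("BUG? outputs differ between -O0 and -O2");
                return 1;
        }
        return 0;
}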

Andy had encountered the underlying idea while working as an intern on the Alpha processor development team.  Among many other testers, they used an architectural tester called REX to generate more or less random sequences of instructions, which were then run through different simulation chains (functional, RTL, cycle-accurate) to see if they did the same thing.  Finding user-accessible bugs in hardware seems like a good thing.

The point of generating correct programs (mentioned under the term LangSec here) goes a long way to avoid irritating the maintainers.  Making the test cases short is also maintainer-friendly.  The test generator is also in a position to annotate the source with exactly what it is supposed to do, which is also helpful.

-L



[-- Attachment #2: Type: text/html, Size: 1946 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
  2024-05-20 13:25 ` [TUHS] " Chet Ramey
@ 2024-05-20 13:54 ` Ralph Corderoy
  1 sibling, 0 replies; 47+ messages in thread
From: Ralph Corderoy @ 2024-05-20 13:54 UTC (permalink / raw)
  To: TUHS main list

Hi,

Doug wrote:
> I commend attention to the LangSec movement, which advocates for
> rigorously enforced separation between legal and illegal inputs.

    https://langsec.org

   ‘The Language-theoretic approach (LangSec) regards the Internet
    insecurity epidemic as a consequence of ‘ad hoc’ programming of
    input handling at all layers of network stacks, and in other kinds
    of software stacks.  LangSec posits that the only path to
    trustworthy software that takes untrusted inputs is treating all
    valid or expected inputs as a formal language, and the respective
    input-handling routines as a ‘recognizer’ for that language.
    The recognition must be feasible, and the recognizer must match the
    language in required computation power.

   ‘When input handling is done in ad hoc way, the ‘de facto’
    recognizer, i.e. the input recognition and validation code ends up
    scattered throughout the program, does not match the programmers'
    assumptions about safety and validity of data, and thus provides
    ample opportunities for exploitation.  Moreover, for complex input
    languages the problem of full recognition of valid or expected
    inputs may be *undecidable*, in which case no amount of
    input-checking code or testing will suffice to secure the program.
    Many popular protocols and formats fell into this trap, the
    empirical fact with which security practitioners are all too
    familiar.

   ‘LangSec helps draw the boundary between protocols and API designs
    that can and cannot be secured and implemented securely, and charts
    a way to building truly trustworthy protocols and systems.  A longer
    summary of LangSec is in this USENIX Security BoF hand-out, and in the
    talks, articles, and papers below.’

That does look interesting; I'd not heard of it.
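
As a toy illustration of the recognizer idea, here is a complete
checker for an invented input language (unsigned decimal integers
separated by single commas) that rejects everything else before any
processing begins:

#include <ctype.h>
#include <stdio.h>

/* Accepts exactly:  list := int (',' int)*   int := [0-9]+  */
static int valid_int_list(const char *s)
{
        for (;;) {
                if (!isdigit((unsigned char)*s))
                        return 0;       /* every field starts with a digit */
                while (isdigit((unsigned char)*s))
                        s++;
                if (*s == '\0')
                        return 1;       /* clean end of input */
                if (*s != ',')
                        return 0;       /* only single commas between fields */
                s++;
        }
}

int main(int argc, char **argv)
{
        const char *in = argc > 1 ? argv[1] : "";

        if (!valid_int_list(in)) {
                fprintf(stderr, "reject: not in the input language\n");
                return 1;
        }
        puts("accept");         /* only now would real processing begin */
        return 0;
}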

-- 
Cheers, Ralph.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:25 ` [TUHS] " Chet Ramey
@ 2024-05-20 13:41   ` Ralph Corderoy
  2024-05-20 14:26     ` Chet Ramey
  2024-05-22 13:44     ` arnold
  0 siblings, 2 replies; 47+ messages in thread
From: Ralph Corderoy @ 2024-05-20 13:41 UTC (permalink / raw)
  To: TUHS main list

Hi Chet,

> Doug wrote:
> > I'm surprised by nonchalance about bad inputs evoking bad program
> > behavior.
>
> I think the claim is that it's better to stop immediately with an
> error on invalid input rather than guess at the user's intent and try
> to go on.

That aside, having made the decision to patch up the input so more
punched cards are consumed, the patch should be bug free.

Say it's inserting a semicolon token for pretence.  It should have
initialised source-file locations just as if it were real.  Not an
uninitialised pointer to a source filename so a later dereference
failed.

I can see that an avalanche of errors in an earlier gawk caused problems, but
each time there would have been a first patch of the input which made
a mistake causing the pebble to start rolling.  My understanding is that
there were potentially a lot of these, and rather than fix them it was
a better use of the limited time to stop patching the input.  Then
the code which patched could be deleted, getting rid of the buggy bits
along the way?

-- 
Cheers, Ralph.

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2024-05-25 17:37 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-19 23:08 [TUHS] The 'usage: ...' message. (Was: On Bloat...) Douglas McIlroy
2024-05-20  0:58 ` [TUHS] " Rob Pike
2024-05-20  3:19   ` arnold
2024-05-20  3:43     ` Warner Losh
2024-05-20  4:46       ` arnold
2024-05-20  9:20     ` [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Ralph Corderoy
2024-05-20 11:58       ` [TUHS] " arnold
2024-05-20 13:10       ` Chet Ramey
2024-05-20 13:30         ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 13:48           ` Chet Ramey
2024-05-20  3:54   ` [TUHS] Re: The 'usage: ...' message. (Was: On Bloat...) Bakul Shah via TUHS
2024-05-20 14:23   ` Clem Cole
2024-05-20 17:30     ` Greg A. Woods
2024-05-20 20:10     ` John Levine
2024-05-21  1:14       ` John Cowan
2024-05-20 17:40   ` Stuff Received
2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
2024-05-20 13:25 ` [TUHS] " Chet Ramey
2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 14:26     ` Chet Ramey
2024-05-22 13:44     ` arnold
2024-05-20 13:54 ` Ralph Corderoy
2024-05-20 14:09 Serissa
2024-05-21  1:56 ` Rob Pike
2024-05-21  2:47   ` Larry McVoy
2024-05-21  2:54     ` Lawrence Stewart
2024-05-21  3:36       ` Rob Pike
2024-05-21 11:59         ` Peter Weinberger (温博格) via TUHS
2024-05-21  3:53   ` George Michaelson
2024-05-21 16:59   ` Paul Winalski
2024-05-21 17:56     ` segaloco via TUHS
2024-05-21 18:12     ` Luther Johnson
2024-05-22 15:37       ` Paul Winalski
2024-05-22 18:49         ` Larry McVoy
2024-05-22 20:17           ` Larry McVoy
2024-05-22  3:26     ` Dave Horsfall
2024-05-22  5:08       ` Alexis
2024-05-22 13:12         ` Warner Losh
2024-05-23 13:49 Douglas McIlroy
2024-05-23 20:52 ` Rob Pike
2024-05-24  5:41   ` andrew
2024-05-24  7:17   ` Ralph Corderoy
2024-05-24  7:41     ` Rob Pike
2024-05-24 11:56     ` Dan Halbert
2024-05-25  0:17   ` Bakul Shah via TUHS
2024-05-25  0:57     ` G. Branden Robinson
2024-05-25 13:56     ` David Arnold
2024-05-25 17:18     ` Paul Winalski
2024-05-25 17:36       ` Tom Perrine

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).