The Unix Heritage Society mailing list
* [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.)
@ 2024-05-20 13:06 Douglas McIlroy
  2024-05-20 13:14 ` [TUHS] " arnold
                   ` (3 more replies)
  0 siblings, 4 replies; 47+ messages in thread
From: Douglas McIlroy @ 2024-05-20 13:06 UTC (permalink / raw)
  To: TUHS main list


I'm surprised by nonchalance about bad inputs evoking bad program behavior.
That attitude may have been excusable 50 years ago. By now, though, we have
seen so much malicious exploitation of open avenues of "undefined behavior"
that we can no longer ignore bugs that "can't happen when using the tool
correctly". Mature software should not brook incorrect usage.

"Bailing out near line 1" is a sign of defensive precautions. Crashes and
unjustified output betray their absence.

I commend attention to the LangSec movement, which advocates for rigorously
enforced separation between legal and illegal inputs.

Doug


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
  2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
@ 2024-05-20 13:14 ` arnold
  2024-05-20 14:00   ` G. Branden Robinson
  2024-05-20 13:25 ` Chet Ramey
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 47+ messages in thread
From: arnold @ 2024-05-20 13:14 UTC (permalink / raw)
  To: tuhs, douglas.mcilroy

Perhaps I should not respond to this immediately. But:

Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

> I'm surprised by nonchalance about bad inputs evoking bad program behavior.
> That attitude may have been excusable 50 years ago. By now, though, we have
> seen so much malicious exploitation of open avenues of "undefined behavior"
> that we can no longer ignore bugs that "can't happen when using the tool
> correctly". Mature software should not brook incorrect usage.

It's not nonchalance, not at all!

The current behavior is to die on the first syntax error, instead of
trying to be "helpful" by continuing to try to parse the program in the
hope of reporting other errors.
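Schematically, that fail-fast policy is a checker that reports the position of the first error and never tries to resynchronize. A minimal C++ illustration over a toy grammar (invented for this sketch; this is not gawk's code):

```cpp
#include <cctype>
#include <string>

// Fail-fast checking in the spirit of gawk's current behavior: find the
// FIRST syntax error and stop, instead of guessing a repair and
// scanning on in the hope of reporting more errors.
// Toy grammar (an illustrative stand-in, not awk): a non-empty,
// comma-separated list of unsigned decimal integers.
//
// Returns -1 when src is well formed, else the byte offset of the
// first error.
long first_syntax_error(const std::string &src)
{
    bool want_digit = true;               // a digit must come next
    for (std::string::size_type i = 0; i < src.size(); i++) {
        unsigned char c = src[i];
        if (std::isdigit(c))
            want_digit = false;
        else if (c == ',' && !want_digit)
            want_digit = true;            // comma must be followed by a digit
        else
            return static_cast<long>(i);  // bail out immediately
    }
    // Empty input or trailing comma: the error is at end of input.
    return want_digit ? static_cast<long>(src.size()) : -1;
}
```

The caller prints one diagnostic at that offset and exits non-zero, so nothing malformed ever reaches the evaluator in a half-built state.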

> "Bailing out near line 1" is a sign of defensive precautions. Crashes and
> unjustified output betray their absence.

The crashes came because errors cascaded.  I don't see a reason to spend
valuable, *personal* time on adding defenses *where they aren't needed*.

A steel door on your bedroom closet does no good if your front door
is made of balsa wood. My change was to stop the badness at the
front door.

> I commend attention to the LangSec movement, which advocates for rigorously
> enforced separation between legal and illegal inputs.

Illegal input, in gawk, as far as I know, should always cause a syntax
error report and an immediate exit.

If it doesn't, that is a bug, and I'll be happy to try to fix it.

I hope that clarifies things.

Arnold


* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
  2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
  2024-05-20 13:14 ` [TUHS] " arnold
@ 2024-05-20 13:25 ` Chet Ramey
  2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
  2024-05-20 13:54 ` Ralph Corderoy
  2024-05-20 16:06 ` [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.) Paul Winalski
  3 siblings, 1 reply; 47+ messages in thread
From: Chet Ramey @ 2024-05-20 13:25 UTC (permalink / raw)
  To: Douglas McIlroy, TUHS main list



On 5/20/24 9:06 AM, Douglas McIlroy wrote:
> I'm surprised by nonchalance about bad inputs evoking bad program behavior.

I think the claim is that it's better to stop immediately with an error
on invalid input rather than guess at the user's intent and try to go on.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/




* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:25 ` Chet Ramey
@ 2024-05-20 13:41   ` Ralph Corderoy
  2024-05-20 14:26     ` Chet Ramey
  2024-05-22 13:44     ` arnold
  0 siblings, 2 replies; 47+ messages in thread
From: Ralph Corderoy @ 2024-05-20 13:41 UTC (permalink / raw)
  To: TUHS main list

Hi Chet,

> Doug wrote:
> > I'm surprised by nonchalance about bad inputs evoking bad program
> > behavior.
>
> I think the claim is that it's better to stop immediately with an
> error on invalid input rather than guess at the user's intent and try
> to go on.

That aside, having made the decision to patch up the input so more
punched cards are consumed, the patch should be bug free.

Say it's inserting a semicolon token for pretence.  It should have
initialised source-file locations just as if it were real.  Not an
uninitialised pointer to a source filename so a later dereference
failed.

I can see that an avalanche of errors in an earlier gawk caused problems,
but each time there would have been a first patch of the input which made
a mistake, causing the pebble to start rolling.  My understanding is that
there were potentially a lot of these, and rather than fix them it was a
more productive use of the limited time to stop patching the input.  Then
the code which patched could be deleted, getting rid of the buggy bits
along the way?

-- 
Cheers, Ralph.


* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
  2024-05-20 13:14 ` [TUHS] " arnold
  2024-05-20 13:25 ` Chet Ramey
@ 2024-05-20 13:54 ` Ralph Corderoy
  2024-05-20 15:39   ` [TUHS] OT: LangSec (Re: A fuzzy awk.) Åke Nordin
  2024-05-20 16:06 ` [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.) Paul Winalski
  3 siblings, 1 reply; 47+ messages in thread
From: Ralph Corderoy @ 2024-05-20 13:54 UTC (permalink / raw)
  To: TUHS main list

Hi,

Doug wrote:
> I commend attention to the LangSec movement, which advocates for
> rigorously enforced separation between legal and illegal inputs.

    https://langsec.org

   ‘The Language-theoretic approach (LangSec) regards the Internet
    insecurity epidemic as a consequence of ‘ad hoc’ programming of
    input handling at all layers of network stacks, and in other kinds
    of software stacks.  LangSec posits that the only path to
    trustworthy software that takes untrusted inputs is treating all
    valid or expected inputs as a formal language, and the respective
    input-handling routines as a ‘recognizer’ for that language.
    The recognition must be feasible, and the recognizer must match the
    language in required computation power.

   ‘When input handling is done in an ad hoc way, the ‘de facto’
    recognizer, i.e. the input recognition and validation code ends up
    scattered throughout the program, does not match the programmers'
    assumptions about safety and validity of data, and thus provides
    ample opportunities for exploitation.  Moreover, for complex input
    languages the problem of full recognition of valid or expected
    inputs may be *undecidable*, in which case no amount of
    input-checking code or testing will suffice to secure the program.
    Many popular protocols and formats fell into this trap, the
    empirical fact with which security practitioners are all too
    familiar.

   ‘LangSec helps draw the boundary between protocols and API designs
    that can and cannot be secured and implemented securely, and charts
    a way to building truly trustworthy protocols and systems.  A longer
    summary of LangSec appears in this USENIX Security BoF hand-out, and
    in the talks, articles, and papers below.’

That does look interesting; I'd not heard of it.
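For a concrete picture of the recognizer idea: the set of valid inputs is defined as a formal language, and the input handler is a recognizer whose power matches that language exactly. A hedged sketch (not from any LangSec code), using a deliberately tiny regular language, C-style identifiers, recognized by a DFA:

```cpp
#include <cctype>
#include <string>

// LangSec-style recognizer sketch: the language [A-Za-z_][A-Za-z0-9_]*
// is regular, so a deterministic finite automaton suffices -- no more
// computational power than the language requires.
bool is_identifier(const std::string &input)
{
    enum State { START, BODY, REJECT } state = START;
    for (unsigned char c : input) {
        switch (state) {
        case START:
            state = (std::isalpha(c) || c == '_') ? BODY : REJECT;
            break;
        case BODY:
            state = (std::isalnum(c) || c == '_') ? BODY : REJECT;
            break;
        case REJECT:
            return false;        // dead state: no patching, no recovery
        }
    }
    return state == BODY;        // only the accepting state passes
}
```

Everything the recognizer rejects is refused before any processing code sees it; the validation lives in one place instead of being scattered through the program.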

-- 
Cheers, Ralph.


* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
  2024-05-20 13:14 ` [TUHS] " arnold
@ 2024-05-20 14:00   ` G. Branden Robinson
  0 siblings, 0 replies; 47+ messages in thread
From: G. Branden Robinson @ 2024-05-20 14:00 UTC (permalink / raw)
  To: tuhs; +Cc: douglas.mcilroy, groff


Hi folks,

At 2024-05-20T07:14:07-0600, arnold@skeeve.com wrote:
> Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
> > I'm surprised by nonchalance about bad inputs evoking bad program
> > behavior.  That attitude may have been excusable 50 years ago. By
> > now, though, we have seen so much malicious exploitation of open
> > avenues of "undefined behavior" that we can no longer ignore bugs
> > that "can't happen when using the tool correctly". Mature software
> > should not brook incorrect usage.
> 
> It's not nonchalance, not at all!
> 
> The current behavior is to die on the first syntax error, instead of
> trying to be "helpful" by continuing to try to parse the program in
> the hope of reporting other errors.
[...]
> The crashes came because errors cascaded.  I don't see a reason to
> spend valuable, *personal* time on adding defenses *where they aren't
> needed*.
> 
> A steel door on your bedroom closet does no good if your front door is
> made of balsa wood. My change was to stop the badness at the front
> door.
> 
> > I commend attention to the LangSec movement, which advocates for
> > rigorously enforced separation between legal and illegal inputs.
> 
> Illegal input, in gawk, as far as I know, should always cause a syntax
> error report and an immediate exit.
> 
> If it doesn't, that is a bug, and I'll be happy to try to fix it.
> 
> I hope that clarifies things.

For grins, and for a data point from elsewhere in GNU-land, GNU troff is
pretty robust to this sort of thing.  Much as I might like to boast of
having improved it in this area, it appears to have already come with
iron long johns courtesy of James Clark and/or Werner Lemberg.  I threw
troff its own ELF executable as a crude fuzz test some years ago, and I
don't recall needing to fix anything except unhelpfully vague diagnostic
messages (a phenomenon I am predisposed to observe anyway).

I did notice today that in one case we were spewing back out unprintable
characters (newlines, character codes > 127) _in_ one (but only one) of
the diagnostic messages, and while that's ugly, it's not an obvious
exploitation vector to me.

Nevertheless I decided to fix it and it will be in my next push.

So here's the mess you get when feeding GNU troff to itself.  No GNU
troff since before 1.22.3 core dumps on this sort of unprepossessing
input.

$ ./build/test-groff -Ww -z /usr/bin/troff 2>&1 | sed 's/:[0-9]\+:/:/' | sort | uniq -c
     17 troff:/usr/bin/troff: error: a backspace character is not allowed in an escape sequence parameter
     10 troff:/usr/bin/troff: error: a space character is not allowed in an escape sequence parameter
      1 troff:/usr/bin/troff: error: a space is not allowed as a starting delimiter
      1 troff:/usr/bin/troff: error: a special character is not allowed in an identifier
      1 troff:/usr/bin/troff: error: character '-' is not allowed as a starting delimiter
      1 troff:/usr/bin/troff: error: invalid argument ')' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'c' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'l' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'm' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid positional argument number ','
      3 troff:/usr/bin/troff: error: invalid positional argument number '<'
      3 troff:/usr/bin/troff: error: invalid positional argument number 'D'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'E'
     10 troff:/usr/bin/troff: error: invalid positional argument number 'H'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'Hi'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'I'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'I9'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'L'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'LD'
      2 troff:/usr/bin/troff: error: invalid positional argument number 'LL'
      5 troff:/usr/bin/troff: error: invalid positional argument number 'LT'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'M'
      4 troff:/usr/bin/troff: error: invalid positional argument number 'P'
      5 troff:/usr/bin/troff: error: invalid positional argument number 'X'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'dH'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'h'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'l'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'p'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'x'
      3 troff:/usr/bin/troff: error: invalid positional argument number '|'
     35 troff:/usr/bin/troff: error: invalid positional argument number (unprintable)
      3 troff:/usr/bin/troff: error: unterminated transparent embedding escape sequence

The second to last (and most frequent) message in the list above is the
"new" one.  Here's the diff.

diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp
index 8d828a01e..596ecf6f9 100644
--- a/src/roff/troff/input.cpp
+++ b/src/roff/troff/input.cpp
@@ -4556,10 +4556,21 @@ static void interpolate_arg(symbol nm)
   }
   else {
     const char *p;
-    for (p = s; *p && csdigit(*p); p++)
-      ;
-    if (*p)
-      copy_mode_error("invalid positional argument number '%1'", s);
+    bool is_valid = true;
+    bool is_printable = true;
+    for (p = s; *p != 0 /* NUL */; p++) {
+      if (!csdigit(*p))
+       is_valid = false;
+      if (!csprint(*p))
+       is_printable = false;
+    }
+    if (!is_valid) {
+      const char msg[] = "invalid positional argument number";
+      if (is_printable)
+       copy_mode_error("%1 '%2'", msg, s);
+      else
+       copy_mode_error("%1 (unprintable)", msg);
+    }
     else
       input_stack::push(input_stack::get_arg(atoi(s)));
   }

GNU troff may have started out with an easier task in this area than an
AWK or a shell had; its syntax is not block-structured in the same way,
so parser state recovery is easier, and it's _inherently_ a filter.

The only fruitful fuzz attack on groff I can recall was upon indexed
bibliographic database files, which are a binary format.  This went
unresolved for several years[1] but I fixed it for groff 1.23.0.

https://bugs.debian.org/716109

Regards,
Branden

[1] I think I understand the low triage priority.  Few groff users use
    the refer(1) preprocessor, and of those who do, even fewer find
    modern systems so poorly performant at text scanning that they
    desire the services of indxbib(1) to speed lookup of bibliographic
    entries.



* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
@ 2024-05-20 14:26     ` Chet Ramey
  2024-05-22 13:44     ` arnold
  1 sibling, 0 replies; 47+ messages in thread
From: Chet Ramey @ 2024-05-20 14:26 UTC (permalink / raw)
  To: Ralph Corderoy, TUHS main list



On 5/20/24 9:41 AM, Ralph Corderoy wrote:
> Hi Chet,
> 
>> Doug wrote:
>>> I'm surprised by nonchalance about bad inputs evoking bad program
>>> behavior.
>>
>> I think the claim is that it's better to stop immediately with an
>> error on invalid input rather than guess at the user's intent and try
>> to go on.
> 
> That aside, having made the decision to patch up the input so more
> punched cards are consumed, the patch should be bug free.
> 
> Say it's inserting a semicolon token for pretence.  It should have
> initialised source-file locations just as if it were real.  Not an
> uninitialised pointer to a source filename so a later dereference
> failed.
> 
> I can see that an avalanche of errors in an earlier gawk caused problems,
> but each time there would have been a first patch of the input which made
> a mistake, causing the pebble to start rolling.  My understanding is that
> there were potentially a lot of these, and rather than fix them it was a
> more productive use of the limited time to stop patching the input.  Then
> the code which patched could be deleted, getting rid of the buggy bits
> along the way?

Maybe we're talking about the same thing. My impression is that at
each point there was more than one potential token to insert and go on,
and gawk chose one (probably the most common one), in the hopes that it
would be able to report as many errors as possible. There's always the
chance you'll be wrong there.

(I have no insight into the actual nature of these issues, or the actual
corruption that caused the crashes, so take the next with skepticism.)

And then rather than go back and modify other state after inserting
this token -- which gawk did not do -- for the sole purpose of making
this guessing more crash-resistant, Arnold chose a different approach:
exit on invalid input.
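The guess-and-continue strategy can be pictured over a hypothetical mini-grammar of semicolon-terminated words (invented purely for illustration; gawk's parser is nothing this simple):

```cpp
#include <string>
#include <vector>

// Insertion-based error recovery, schematically: when the expected ';'
// is missing, pretend one was there, record a diagnostic, and keep
// parsing so later errors can be reported too.
struct Diagnostic {
    std::string::size_type pos;   // where the virtual token was inserted
    std::string msg;
};

std::vector<Diagnostic> parse_stmts(const std::string &src)
{
    std::vector<Diagnostic> errs;
    std::string::size_type i = 0;
    while (i < src.size()) {
        while (i < src.size() && src[i] != ';' && src[i] != ' ')
            i++;                              // consume one "word"
        if (i < src.size() && src[i] == ';') {
            i++;                              // proper terminator
        } else {
            // Insert a virtual ';' carrying a real source position,
            // then resynchronize and carry on.
            errs.push_back({i, "missing ';' (inserted)"});
            if (i < src.size())
                i++;
        }
    }
    return errs;
}
```

Note that the inserted diagnostic carries a fully initialized position. Ralph's earlier point is exactly that a recovery token with uninitialized location state is what turns a helpful guess into a crash.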

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/




* [TUHS] OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 13:54 ` Ralph Corderoy
@ 2024-05-20 15:39   ` Åke Nordin
  2024-05-20 16:09     ` [TUHS] " Ben Kallus
  0 siblings, 1 reply; 47+ messages in thread
From: Åke Nordin @ 2024-05-20 15:39 UTC (permalink / raw)
  To: tuhs

On 2024-05-20 15:54, Ralph Corderoy wrote:

> Doug wrote:
>> I commend attention to the LangSec movement, which advocates for
>> rigorously enforced separation between legal and illegal inputs.
>     https://langsec.org
>
>    ‘The Language-theoretic approach (LangSec) regards the Internet
>     insecurity epidemic as a consequence of ‘ad hoc’ programming of
>     input handling at all layers of network stacks, and in other kinds
>     of software stacks.  LangSec posits that the only path to
>     trustworthy software that takes untrusted inputs is treating all
>     valid or expected inputs as a formal language, and the respective
>     input-handling routines as a ‘recognizer’ for that language.

. . .

>    ‘LangSec helps draw the boundary between protocols and API designs
>     that can and cannot be secured and implemented securely, and charts
>     a way to building truly trustworthy protocols and systems.  A longer
>     summary of LangSec appears in this USENIX Security BoF hand-out, and
>     in the talks, articles, and papers below.’

Yes, it's an interesting concept. Those *n?x tools that have
lex/yacc frontends are probably closer to this than the average
hack.

It may become hard to reconcile this with the robustness principle 
(Be conservative in what you send, be liberal in what you accept)
that Jon Postel popularized. Maybe it becomes necessary, though.

-- 
Åke Nordin <ake.nordin@netia.se>, resident Net/Lunix/telecom geek.
Netia Data AB, Stockholm SWEDEN *46#7O466OI99#



* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
  2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
                   ` (2 preceding siblings ...)
  2024-05-20 13:54 ` Ralph Corderoy
@ 2024-05-20 16:06 ` Paul Winalski
  3 siblings, 0 replies; 47+ messages in thread
From: Paul Winalski @ 2024-05-20 16:06 UTC (permalink / raw)
  To: TUHS main list


On Mon, May 20, 2024 at 9:17 AM Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

> I'm surprised by nonchalance about bad inputs evoking bad program
> behavior. That attitude may have been excusable 50 years ago. By now,
> though, we have seen so much malicious exploitation of open avenues of
> "undefined behavior" that we can no longer ignore bugs that "can't happen
> when using the tool correctly". Mature software should not brook incorrect
> usage.

Accepting bad inputs can also lead to security issues.  The data breaches
from SQL-based attacks are a modern case in point.
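A minimal illustration of that point, with hypothetical table and column names (real code should also use parameterized queries rather than string splicing):

```cpp
#include <cctype>
#include <string>

// Naive string splicing lets the input rewrite the query -- the classic
// SQL injection shape.  (Illustrative only.)
std::string naive_query(const std::string &name)
{
    return "SELECT * FROM users WHERE name = '" + name + "'";
}

// A LangSec-flavored defense: admit only the formal language of valid
// usernames (here assumed to be [A-Za-z0-9_]+) before the input gets
// anywhere near query construction.
bool is_valid_username(const std::string &name)
{
    if (name.empty())
        return false;
    for (unsigned char c : name)
        if (!std::isalnum(c) && c != '_')
            return false;
    return true;
}
```

The recognizer rejects `x' OR '1'='1` outright, so the malformed "name" never reaches the database layer at all.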

IMO, as a programmer you owe it to your users to do your best to detect bad
input and to handle it in a graceful fashion.  Nothing is more frustrating
to a user than to have a program blow up in their face with a seg fault, or
even worse, simply exit silently.

As the DEC compiler team's expert on object files, I was called on to add
object file support to a compiler back end originally targeted to VMS
only.  I inherited support of the object file generator for Unix COFF and
later wrote the support for Microsoft PECOFF and ELF.  When our group was
bought by Intel I did the object file support for Apple OS X Mach-O in the
Intel compiler back end.

I found that the folks who write linkers are particularly lazy about error
checking and error handling.  They assume that the compiler always
generates clean object files.  That's OK I suppose if the compiler and
linker people are in the same organization.  If the linker falls over you
can just go down the hall and have the linker developer debug the issue and
tell you where you went wrong.  But that doesn't work when they work for
different companies and the compiler person doesn't have access to the
linker sources.  I ran into a lot of cases where my buggy object file
caused the linker to seg fault or, even worse, simply exit without an error
message.

I ended up writing a formatted dumper for each object file format that
did very thorough checking for proper syntax and caught as many semantic
errors (e.g., a symbol table index number out of range) as I could.

-Paul W.



* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 15:39   ` [TUHS] OT: LangSec (Re: A fuzzy awk.) Åke Nordin
@ 2024-05-20 16:09     ` Ben Kallus
  2024-05-20 20:02       ` John Levine
  0 siblings, 1 reply; 47+ messages in thread
From: Ben Kallus @ 2024-05-20 16:09 UTC (permalink / raw)
  To: Åke Nordin; +Cc: tuhs

> It may become hard to reconcile this with the robustness principle
> (Be conservative in what you send, be liberal in what you accept)
> that Jon Postel popularized. Maybe it becomes necessary, though.

Yes; the LangSec people essentially reject the robustness principle.

See https://langsec.org/papers/postel-patch.pdf

-Ben


* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 16:09     ` [TUHS] " Ben Kallus
@ 2024-05-20 20:02       ` John Levine
  2024-05-20 20:11         ` Larry McVoy
  0 siblings, 1 reply; 47+ messages in thread
From: John Levine @ 2024-05-20 20:02 UTC (permalink / raw)
  To: tuhs; +Cc: benjamin.p.kallus.gr

It appears that Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu> said:
>> It may become hard to reconcile this with the robustness principle
>> (Be conservative in what you send, be liberal in what you accept)
>> that Jon Postel popularized. Maybe it becomes necessary, though.
>
>Yes; the LangSec people essentially reject the robustness principle.
>
>See https://langsec.org/papers/postel-patch.pdf

On the contrary, they actually understand it.

Postel was widely misunderstood to say that you should try to accept
arbitrary garbage. People who knew him tell me that he meant to be
liberal when the spec is ambiguous, not to allow stuff that is just
wrong. As their quote from RFC 1122 points out, he also said you
should be prepared for arbitrary garbage so you can reject it.

R's,
John


* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 20:02       ` John Levine
@ 2024-05-20 20:11         ` Larry McVoy
  2024-05-20 21:00           ` Ben Kallus
  0 siblings, 1 reply; 47+ messages in thread
From: Larry McVoy @ 2024-05-20 20:11 UTC (permalink / raw)
  To: John Levine; +Cc: tuhs, benjamin.p.kallus.gr

On Mon, May 20, 2024 at 04:02:26PM -0400, John Levine wrote:
> It appears that Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu> said:
> >> It may become hard to reconcile this with the robustness principle
> >> (Be conservative in what you send, be liberal in what you accept)
> >> that Jon Postel popularized. Maybe it becomes necessary, though.
> >
> >Yes; the LangSec people essentially reject the robustness principle.
> >
> >See https://langsec.org/papers/postel-patch.pdf
> 
> On the contrary, they actually understand it.
> 
> Postel was widely misunderstood to say that you should try to accept
> arbitrary garbage. People who knew him tell me that he meant to be
> liberal when the spec is ambiguous, not to allow stuff that is just
> wrong. As their quote from RFC 1122 points out, he also said you
> should be prepared for arbitrary garbage so you can reject it.

Yeah, I read the pdf and I took away the same thing as John.
-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat


* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 20:11         ` Larry McVoy
@ 2024-05-20 21:00           ` Ben Kallus
  2024-05-20 21:03             ` John R Levine
  2024-05-20 21:14             ` Larry McVoy
  0 siblings, 2 replies; 47+ messages in thread
From: Ben Kallus @ 2024-05-20 21:00 UTC (permalink / raw)
  To: Larry McVoy; +Cc: John Levine, tuhs

What I meant was that the LangSec people reject the robustness
principle as it is commonly understood (i.e., make a "reasonable"
guess when receiving garbage), not necessarily that their view is
incompatible with Postel's original vision. This interpretation of the
principle is pretty widespread; take a look at the Nginx mailing list
if you have any doubt. I attribute this to the same phenomenon that
inverted the meaning of REST.

-Ben


* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 21:00           ` Ben Kallus
@ 2024-05-20 21:03             ` John R Levine
  2024-05-20 21:14             ` Larry McVoy
  1 sibling, 0 replies; 47+ messages in thread
From: John R Levine @ 2024-05-20 21:03 UTC (permalink / raw)
  To: Ben Kallus; +Cc: tuhs

> What I meant was that the LangSec people reject the robustness
> principle as it is commonly understood (i.e., make a "reasonable"
> guess when receiving garbage), not necessarily that their view is
> incompatible with Postel's original vision. This interpretation of the
> principle is pretty widespread; take a look at the Nginx mailing list
> if you have any doubt. I attribute this to the same phenomenon that
> inverted the meaning of REST.

Oh, OK, no disagreement there.  I'm as tired as you are of people invoking 
Postel to excuse slovenly code.

Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly


* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 21:00           ` Ben Kallus
  2024-05-20 21:03             ` John R Levine
@ 2024-05-20 21:14             ` Larry McVoy
  2024-05-20 21:46               ` Ben Kallus
  1 sibling, 1 reply; 47+ messages in thread
From: Larry McVoy @ 2024-05-20 21:14 UTC (permalink / raw)
  To: Ben Kallus; +Cc: John Levine, tuhs

On Mon, May 20, 2024 at 05:00:40PM -0400, Ben Kallus wrote:
> What I meant was that the LangSec people reject the robustness
> principle as it is commonly understood (i.e., make a "reasonable"
> guess when receiving garbage)

That most certainly is not what I took from what Postel said.  And I
say that as someone who designed a distributed system that had client
and server sides and had to make that work across versions from last
week to 10-20 years ago.

I took it more as "Be more and more careful what you say, get that more
correct with each release, but tolerate the less correct stuff you might
get from earlier versions".  In no way did I think he meant ``make a
"reasonable" guess when receiving garbage''.  Garbage is garbage, you
error on that.  
-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat


* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 21:14             ` Larry McVoy
@ 2024-05-20 21:46               ` Ben Kallus
  2024-05-20 21:57                 ` Larry McVoy
  0 siblings, 1 reply; 47+ messages in thread
From: Ben Kallus @ 2024-05-20 21:46 UTC (permalink / raw)
  To: Larry McVoy; +Cc: John Levine, tuhs

My point was that, regardless of Postel's original intent, many people
have interpreted his principle to mean that accepting garbage is good.
*This* interpretation is incompatible with LangSec.

See RFC 9413 for an exploration of the many interpretations of
Postel's principle.

-Ben


* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.)
  2024-05-20 21:46               ` Ben Kallus
@ 2024-05-20 21:57                 ` Larry McVoy
  0 siblings, 0 replies; 47+ messages in thread
From: Larry McVoy @ 2024-05-20 21:57 UTC (permalink / raw)
  To: Ben Kallus; +Cc: John Levine, tuhs

Those would be the stupid people and you can't fix stupid.  Seriously,
people can twist anything into anything.  Just because dumb people
didn't understand his principle doesn't mean it was a bad principle.

On Mon, May 20, 2024 at 05:46:48PM -0400, Ben Kallus wrote:
> My point was that, regardless of Postel's original intent, many people
> have interpreted his principle to mean that accepting garbage is good.
> *This* interpretation is incompatible with LangSec.
> 
> See RFC 9413 for an exploration of the many interpretations of
> Postel's principle.
> 
> -Ben

-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat


* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
  2024-05-20 14:26     ` Chet Ramey
@ 2024-05-22 13:44     ` arnold
  1 sibling, 0 replies; 47+ messages in thread
From: arnold @ 2024-05-22 13:44 UTC (permalink / raw)
  To: tuhs, ralph

I've been travelling, so I haven't been able to answer
these mails until now.

Ralph Corderoy <ralph@inputplus.co.uk> wrote:

> I can see that an avalanche of errors in an earlier gawk caused problems,
> but each time there would have been a first patch of the input which made
> a mistake, causing the pebble to start rolling.  My understanding is that
> there were potentially a lot of these, and rather than fix them it was a
> more productive use of the limited time to stop patching the input.  Then
> the code which patched could be deleted, getting rid of the buggy bits
> along the way?

That's not the case. Gawk didn't try to patch the input. It
simply set a flag saying "don't try to run" but kept on parsing
anyway, in the hope of finding more errors.

That was a bad idea, because the representation of the program
being built was then not in the correct state to have more
stuff parsed and converted into byte code.

Very early on, the first parse error caused an exit. I changed
it to keep going to try to be helpful. But when that became a source
for essentially specious bug reports and a time sink for me, it
became time to go back to exiting on the first problem.

HTH,

Arnold

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-25 17:18     ` Paul Winalski
@ 2024-05-25 17:36       ` Tom Perrine
  0 siblings, 0 replies; 47+ messages in thread
From: Tom Perrine @ 2024-05-25 17:36 UTC (permalink / raw)
  To: Paul Winalski; +Cc: The Unix Heritage Society mailing list

[-- Attachment #1: Type: text/plain, Size: 2272 bytes --]

Another Gypsy user here...

For KSOS-11 the kernel was described in SPECIAL - as a set of axioms and
theorems. There was no actual connection between the formal specification
in SPECIAL and the Modula code.

Some of the critical user-space code for a trusted downgrade program, to
bridge data from higher levels of classification to lower, was written in
Gypsy. I visited UT Austin and Dr Good(?)'s team to learn it, IIRC. Gypsy
was considered better in that the specification was tied to the executable
through the pre/post conditions - and the better support for semi-automated
theorem proving.




On Sat, May 25, 2024 at 10:18 AM Paul Winalski <paul.winalski@gmail.com>
wrote:

> On Fri, May 24, 2024 at 8:18 PM Bakul Shah via TUHS <tuhs@tuhs.org> wrote:
>
> At one point I had suggested turning Go's Interface type to something like
>> Guttag style abstract data types in that relevant axioms are specified
>> right in the interface definition. The idea was that any concrete type that
>> implements that interface must satisfy its axioms. Even if the compiler
>> ignored these axioms, one can write a support program that can generate a
>> set of comprehensive tests based on these axioms. [Right now a type
>> "implementing" an interface only needs to have a set of methods that
>> exactly match the interface methods but nothing more] The underlying idea
>> is that each type is in essence a constraint on what values an instance of
>> that type can take. So adding such axioms simply tightens (& documents)
>> these constraints. Just the process of coming up with such axioms can
>> improve the design (sort of like test-driven design but better!).
>>
>
> At one point I worked with a programming language called Gypsy that
> implemented this concept.  Each routine had a prefix that specified axioms
> on the routine's parameters and outputs.  The rest of Gypsy was a
> conventional procedural language but the semantics were carefully chosen to
> allow for automated proof of correctness.  I wrote a formal specification
> for the DECnet session layer protocol (DECnet's equivalent of TCP) in
> Gypsy.  I turned up a subtle bug in the prose version of the protocol
> specification in the process.
>
> -Paul W.
>

[-- Attachment #2: Type: text/html, Size: 2933 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-25  0:17   ` Bakul Shah via TUHS
  2024-05-25  0:57     ` G. Branden Robinson
  2024-05-25 13:56     ` David Arnold
@ 2024-05-25 17:18     ` Paul Winalski
  2024-05-25 17:36       ` Tom Perrine
  2 siblings, 1 reply; 47+ messages in thread
From: Paul Winalski @ 2024-05-25 17:18 UTC (permalink / raw)
  To: The Unix Heritage Society mailing list

[-- Attachment #1: Type: text/plain, Size: 1529 bytes --]

On Fri, May 24, 2024 at 8:18 PM Bakul Shah via TUHS <tuhs@tuhs.org> wrote:

At one point I had suggested turning Go's Interface type to something like
> Guttag style abstract data types in that relevant axioms are specified
> right in the interface definition. The idea was that any concrete type that
> implements that interface must satisfy its axioms. Even if the compiler
> ignored these axioms, one can write a support program that can generate a
> set of comprehensive tests based on these axioms. [Right now a type
> "implementing" an interface only needs to have a set of methods that
> exactly match the interface methods but nothing more] The underlying idea
> is that each type is in essence a constraint on what values an instance of
> that type can take. So adding such axioms simply tightens (& documents)
> these constraints. Just the process of coming up with such axioms can
> improve the design (sort of like test-driven design but better!).
>

At one point I worked with a programming language called Gypsy that
implemented this concept.  Each routine had a prefix that specified axioms
on the routine's parameters and outputs.  The rest of Gypsy was a
conventional procedural language but the semantics were carefully chosen to
allow for automated proof of correctness.  I wrote a formal specification
for the DECnet session layer protocol (DECnet's equivalent of TCP) in
Gypsy.  I turned up a subtle bug in the prose version of the protocol
specification in the process.

-Paul W.

[-- Attachment #2: Type: text/html, Size: 1855 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-25  0:17   ` Bakul Shah via TUHS
  2024-05-25  0:57     ` G. Branden Robinson
@ 2024-05-25 13:56     ` David Arnold
  2024-05-25 17:18     ` Paul Winalski
  2 siblings, 0 replies; 47+ messages in thread
From: David Arnold @ 2024-05-25 13:56 UTC (permalink / raw)
  To: Bakul Shah; +Cc: The Unix Heritage Society mailing list


> On 25 May 2024, at 10:18, Bakul Shah via TUHS <tuhs@tuhs.org> wrote:
> 
> 
> What would be nice is if programming languages provided some support for such exhaustive testing[1].
> 
> At one point I had suggested turning Go's Interface type to something like Guttag style abstract data types in that relevant axioms are specified right in the interface definition. The idea was that any concrete type that implements that interface must satisfy its axioms. Even if the compiler ignored these axioms, one can write a support program that can generate a set of comprehensive tests based on these axioms.

Sounds like Eiffel, whose compiler had support for checking pre and post conditions (and maybe invariants?) at runtime, or disabling the checks for “performance” mode. 



d

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-25  0:17   ` Bakul Shah via TUHS
@ 2024-05-25  0:57     ` G. Branden Robinson
  2024-05-25 13:56     ` David Arnold
  2024-05-25 17:18     ` Paul Winalski
  2 siblings, 0 replies; 47+ messages in thread
From: G. Branden Robinson @ 2024-05-25  0:57 UTC (permalink / raw)
  To: The Unix Heritage Society mailing list

[-- Attachment #1: Type: text/plain, Size: 5170 bytes --]

[restricting to list; strong opinions here]

At 2024-05-24T17:17:53-0700, Bakul Shah via TUHS wrote:
> What would be nice is if programming languages provided some support for
> such exhaustive testing[1].

[rearranging]
> At one point I had suggested turning Go's Interface type to something
> like Guttag style abstract data types in that relevant axioms are
> specified right in the interface definition.

It's an excellent idea.

> The underlying idea is that each type is in essence a constraint on
> what values an instance of that type can take.

In the simple form of a data type plus a range constraint, that's the
Ada definition of a subtype since day one--Ada '80 or Ada 83 if you
insist on the standardized form of the language.

40 years later we have Linus Torvalds tearing up his achievement
certificate in "kinder, gentler email interactions" just to trash the
notion of range checks on data types.[1][2][3]

Naturally, the brogrammers are quick to take Torvalds's side.[4]

Pascal had range checks too, and Kernighan famously punked on Wirth for
that.  I'm not certain, but I get the feeling the latter got somewhat
over-interpreted.  (To be fair to Kernighan, Pascal _as specced in the
Revised Report of 1973_[5] was in my opinion too weak a language to
leave the lab, for many of the reasons he noted.  The inflexible array
typing was fatal, in my view.)

> The idea was that any concrete type that implements that interface
> must satisfy its axioms.

Yes.  There is of course much more to the universe of potential
constraints than range checks.  Ada 2022 has these in great generality
with "subtype predicates".

http://www.ada-auth.org/standards/22aarm/html/aa-3-2-4.html

> Even if the compiler ignored these axioms,

I don't understand why this idea wasn't seized upon with more force at
the CSRC.  The notion of a compiler flag that turned on "extra" (in the
Ritchie compiler circa 1980, this is perhaps expressed better as "any")
correctness checks could not have been a novelty.  NDEBUG and assert()
are similarly extremely old even in Unix.

> one can write a support program that can generate a set of
> comprehensive tests based on these axioms.

Yes.  As I understand it, this is how Spark/Ada got started.  Specially
annotated comments expressing predicates communicated with such a
support program, running much like the sort of automated theorem
prover you characterize below as not "retail".

In the last two revision cycles of the Ada standard (2013, 2022),
Spark/Ada's enhancements have made it into the language--though I am not
certain, and would not claim, that they compose with _every_ language
feature.  Spark/Ada started life as a subset of the language for a
reason.

But C has its own subset, MISRA C, so this is hardly a reason to scoff.

> [Right now a type "implementing" an interface only needs to
> have a set of methods that exactly match the interface methods but
> nothing more] The underlying idea is that each type is in essence a
> constraint on what values an instance of that type can take. So adding
> such axioms simply tightens (& documents) these constraints. Just the
> process of coming up with such axioms can improve the design (sort of
> like test-driven design but better!).

Absolutely.  Generally, software engineers like to operationalize things
consistently enough that they can then be scripted/automated.

Evidently software testing is so mind-numbingly tedious that the will
to undertake it, even with automation, evaporates.

> Now it may be that applying this to anything more complex than stacks
> won't work well & it won't be perfect but I thought this was worth
> experimenting with. This would be like functional testing of all the
> nuts and bolts and components that go in an airplane. The airplane may
> still fall apart but that would be a "composition" error!

Yes.  And even if you can prove 100% of the theorems in your system, you
may learn to your dismay that your specification was defective.
Automated provers are as yet no aid to system architects.

> [1] There are "proof assistant" or formal spec languages such as TLA+,
> Coq, Isabelle etc. but they don't get used much by the average
> programmer. I want something more retail!

I've had a little exposure to these.  They are indeed esoteric, but also
extremely resource-hungry.  My _impression_, based on no hard data, is
that increasing the abilities of static analyzers and the expressiveness
with which they are directed with predicates is much cheaper.

But a lot of programmers will not budge at any cost, and will moreover
be celebrated by their peers for their obstinacy.  See footnotes.

There is much work still to be done.

Regards,
Branden

[1] https://lore.kernel.org/all/202404291502.612E0A10@keescook/
    https://lore.kernel.org/all/CAHk-=wi5YPwWA8f5RAf_Hi8iL0NhGJeL6MN6UFWwRMY8L6UDvQ@mail.gmail.com/
[2] https://lore.kernel.org/lkml/CAHk-=whkGHOmpM_1kNgzX1UDAs10+UuALcpeEWN29EE0m-my=w@mail.gmail.com/
[3] https://www.businessinsider.com/linus-torvalds-linux-time-away-empathy-2018-9
[4] https://lwn.net/Articles/973108/
[5] https://archive.org/details/1973-the-programming-language-pascal-revised-report-wirth

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-23 20:52 ` Rob Pike
  2024-05-24  5:41   ` andrew
  2024-05-24  7:17   ` Ralph Corderoy
@ 2024-05-25  0:17   ` Bakul Shah via TUHS
  2024-05-25  0:57     ` G. Branden Robinson
                       ` (2 more replies)
  2 siblings, 3 replies; 47+ messages in thread
From: Bakul Shah via TUHS @ 2024-05-25  0:17 UTC (permalink / raw)
  To: Rob Pike; +Cc: Douglas McIlroy, The Unix Heritage Society mailing list

[-- Attachment #1: Type: text/plain, Size: 3131 bytes --]

What would be nice is if programming languages provided some support for such exhaustive testing[1].

At one point I had suggested turning Go's Interface type to something like Guttag style abstract data types in that relevant axioms are specified right in the interface definition. The idea was that any concrete type that implements that interface must satisfy its axioms. Even if the compiler ignored these axioms, one can write a support program that can generate a set of comprehensive tests based on these axioms. [Right now a type "implementing" an interface only needs to have a set of methods that exactly match the interface methods but nothing more] The underlying idea is that each type is in essence a constraint on what values an instance of that type can take. So adding such axioms simply tightens (& documents) these constraints. Just the process of coming up with such axioms can improve the design (sort of like test-driven design but better!).

Now it may be that applying this to anything more complex than stacks won't work well & it won't be perfect but I thought this was worth experimenting with. This would be like functional testing of all the nuts and bolts and components that go in an airplane. The airplane may still fall apart but that would be a "composition" error!

[1] There are "proof assistant" or formal spec languages such as TLA+, Coq, Isabelle etc. but they don't get used much by the average programmer. I want something more retail!

> On May 23, 2024, at 1:52 PM, Rob Pike <robpike@gmail.com> wrote:
> 
> The semantic distinction is important but the end result is very similar. "Fuzzing" as it is now called (for no reason I can intuit) tries to get to the troublesome cases faster by a sort of depth-first search, but exhaustive will always beat it for value. Our exhaustive tester for bitblt, first done by John Reiser if I remember right, set the stage for my own thinking about how you properly test something.
> 
> -rob
> 
> 
> On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu <mailto:douglas.mcilroy@dartmouth.edu>> wrote:
>> > Doug McIlroy was generating random regular expressions
>> 
>> Actually not. I exhaustively (within limits) tested an RE recognizer without knowingly generating any RE either mechanically or by hand.
>> 
>> The trick: From recursive equations (easily derived from the grammar of REs), I counted how many REs exist up to various limits on token counts. Then I generated all strings that satisfied those limits, turned the recognizer loose on them and counted how many it accepted. Any disagreement of counts revealed the existence (but not any symptom) of bugs. 
>> 
>> Unlike most diagnostic techniques, this scheme produces a certificate of (very high odds on) correctness over a representative subdomain. The scheme also agnostically checks behavior on bad inputs as well as good.  It does not, however, provide a stress test of a recognizer's capacity limits. And its exponential nature limits its applicability to rather small domains. (REs have only 5 distinct kinds of token.)
>> 
>> Doug


[-- Attachment #2: Type: text/html, Size: 4916 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-24  7:17   ` Ralph Corderoy
  2024-05-24  7:41     ` Rob Pike
@ 2024-05-24 11:56     ` Dan Halbert
  1 sibling, 0 replies; 47+ messages in thread
From: Dan Halbert @ 2024-05-24 11:56 UTC (permalink / raw)
  To: tuhs

[-- Attachment #1: Type: text/plain, Size: 822 bytes --]

On 5/24/24 03:17, Ralph Corderoy wrote:
> Rob wrote:
>> "Fuzzing" as it is now called (for no reason I can intuit)
> Barton Miller describes coining the term.
>
As to where the inspiration for the choice of word came from, I'll
speculate: Bart Miller was a CS grad student contemporary of mine at Berkeley. 
Prof. Lotfi Zadeh was working on fuzzy logic, fuzzy sets, and 
"possibility theory". (Prof. William Kahan hated this work, and called 
it "wrong, and pernicious": cf. 
https://www.sciencedirect.com/science/article/abs/pii/S0020025508000716.) 
So the term "fuzzy" was almost infamous in the department.

Prof. Richard Lipton was also at Berkeley at that time, and was working 
on program mutation testing, which fuzzes the program to determine the 
adequacy of test coverage, rather than fuzzing the test data.

Dan H.

[-- Attachment #2: Type: text/html, Size: 1582 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-24  7:17   ` Ralph Corderoy
@ 2024-05-24  7:41     ` Rob Pike
  2024-05-24 11:56     ` Dan Halbert
  1 sibling, 0 replies; 47+ messages in thread
From: Rob Pike @ 2024-05-24  7:41 UTC (permalink / raw)
  To: Ralph Corderoy; +Cc: tuhs

[-- Attachment #1: Type: text/plain, Size: 768 bytes --]

I'm sure that's the etymology but fuzzing isn't exactly random. That's
kinda the point of it.

-rob


On Fri, May 24, 2024 at 5:18 PM Ralph Corderoy <ralph@inputplus.co.uk>
wrote:

> Hi,
>
> Rob wrote:
> > "Fuzzing" as it is now called (for no reason I can intuit)
>
> Barton Miller describes coining the term.
>
>    ‘That night, I was logged on to the Unix system in my office via
>     a dial-up phone line over a 1200 baud modem.  ...
>     I wanted a name that would evoke the feeling of random, unstructured
>     data.  After trying out several ideas, I settled on the term “fuzz”.’
>
>         — https://pages.cs.wisc.edu/~bart/fuzz/Foreword1.html
>
> Line noise inspired him, as he describes.
>
> --
> Cheers, Ralph.
>

[-- Attachment #2: Type: text/html, Size: 1507 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-23 20:52 ` Rob Pike
  2024-05-24  5:41   ` andrew
@ 2024-05-24  7:17   ` Ralph Corderoy
  2024-05-24  7:41     ` Rob Pike
  2024-05-24 11:56     ` Dan Halbert
  2024-05-25  0:17   ` Bakul Shah via TUHS
  2 siblings, 2 replies; 47+ messages in thread
From: Ralph Corderoy @ 2024-05-24  7:17 UTC (permalink / raw)
  To: tuhs

Hi,

Rob wrote:
> "Fuzzing" as it is now called (for no reason I can intuit)

Barton Miller describes coining the term.

   ‘That night, I was logged on to the Unix system in my office via
    a dial-up phone line over a 1200 baud modem.  ...
    I wanted a name that would evoke the feeling of random, unstructured
    data.  After trying out several ideas, I settled on the term “fuzz”.’

        — https://pages.cs.wisc.edu/~bart/fuzz/Foreword1.html

Line noise inspired him, as he describes.

-- 
Cheers, Ralph.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-23 20:52 ` Rob Pike
@ 2024-05-24  5:41   ` andrew
  2024-05-24  7:17   ` Ralph Corderoy
  2024-05-25  0:17   ` Bakul Shah via TUHS
  2 siblings, 0 replies; 47+ messages in thread
From: andrew @ 2024-05-24  5:41 UTC (permalink / raw)
  To: Rob Pike; +Cc: Douglas McIlroy, TUHS main list

[-- Attachment #1: Type: text/plain, Size: 1920 bytes --]

i did some of the later testing of bitblt.
it was a lovely thing, slowly constructing a trustable synthetic bitblt
of ever greater size and range against which you could compare the bitblt to be tested.

and we did find a couple of bugs, much to reiser’s chagrin.

> On May 23, 2024, at 1:52 PM, Rob Pike <robpike@gmail.com> wrote:
> 
> The semantic distinction is important but the end result is very similar. "Fuzzing" as it is now called (for no reason I can intuit) tries to get to the troublesome cases faster by a sort of depth-first search, but exhaustive will always beat it for value. Our exhaustive tester for bitblt, first done by John Reiser if I remember right, set the stage for my own thinking about how you properly test something.
> 
> -rob
> 
> 
> On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu <mailto:douglas.mcilroy@dartmouth.edu>> wrote:
>> > Doug McIlroy was generating random regular expressions
>> 
>> Actually not. I exhaustively (within limits) tested an RE recognizer without knowingly generating any RE either mechanically or by hand.
>> 
>> The trick: From recursive equations (easily derived from the grammar of REs), I counted how many REs exist up to various limits on token counts. Then I generated all strings that satisfied those limits, turned the recognizer loose on them and counted how many it accepted. Any disagreement of counts revealed the existence (but not any symptom) of bugs. 
>> 
>> Unlike most diagnostic techniques, this scheme produces a certificate of (very high odds on) correctness over a representative subdomain. The scheme also agnostically checks behavior on bad inputs as well as good.  It does not, however, provide a stress test of a recognizer's capacity limits. And its exponential nature limits its applicability to rather small domains. (REs have only 5 distinct kinds of token.)
>> 
>> Doug


[-- Attachment #2: Type: text/html, Size: 3529 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
  2024-05-23 13:49 [TUHS] Re: A fuzzy awk Douglas McIlroy
@ 2024-05-23 20:52 ` Rob Pike
  2024-05-24  5:41   ` andrew
                     ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Rob Pike @ 2024-05-23 20:52 UTC (permalink / raw)
  To: Douglas McIlroy; +Cc: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 1551 bytes --]

The semantic distinction is important but the end result is very similar.
"Fuzzing" as it is now called (for no reason I can intuit) tries to get to
the troublesome cases faster by a sort of depth-first search, but
exhaustive will always beat it for value. Our exhaustive tester for bitblt,
first done by John Reiser if I remember right, set the stage for my own
thinking about how you properly test something.

-rob


On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <
douglas.mcilroy@dartmouth.edu> wrote:

> > Doug McIlroy was generating random regular expressions
>
> Actually not. I exhaustively (within limits) tested an RE recognizer
> without knowingly generating any RE either mechanically or by hand.
>
> The trick: From recursive equations (easily derived from the grammar of
> REs), I counted how many REs exist up to various limits on token counts.
> Then I generated all strings that satisfied those limits, turned the
> recognizer loose on them and counted how many it accepted. Any disagreement
> of counts revealed the existence (but not any symptom) of bugs.
>
> Unlike most diagnostic techniques, this scheme produces a certificate of
> (very high odds on) correctness over a representative subdomain. The
> scheme also agnostically checks behavior on bad inputs as well as good.  It
> does not, however, provide a stress test of a recognizer's capacity limits. And
> its exponential nature limits its applicability to rather small domains.
> (REs have only 5 distinct kinds of token.)
>
> Doug
>

[-- Attachment #2: Type: text/html, Size: 2792 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk
@ 2024-05-23 13:49 Douglas McIlroy
  2024-05-23 20:52 ` Rob Pike
  0 siblings, 1 reply; 47+ messages in thread
From: Douglas McIlroy @ 2024-05-23 13:49 UTC (permalink / raw)
  To: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 971 bytes --]

> Doug McIlroy was generating random regular expressions

Actually not. I exhaustively (within limits) tested an RE recognizer
without knowingly generating any RE either mechanically or by hand.

The trick: From recursive equations (easily derived from the grammar of
REs), I counted how many REs exist up to various limits on token counts.
Then I generated all strings that satisfied those limits, turned the
recognizer loose on them and counted how many it accepted. Any disagreement
of counts revealed the existence (but not any symptom) of bugs.

Unlike most diagnostic techniques, this scheme produces a certificate of
(very high odds on) correctness over a representative subdomain. The scheme
also agnostically checks behavior on bad inputs as well as good.  It does
not, however, provide a stress test of a recognizer's capacity limits. And
its exponential nature limits its applicability to rather small domains.
(REs have only 5 distinct kinds of token.)

Doug

[-- Attachment #2: Type: text/html, Size: 1698 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-22 18:49         ` Larry McVoy
@ 2024-05-22 20:17           ` Larry McVoy
  0 siblings, 0 replies; 47+ messages in thread
From: Larry McVoy @ 2024-05-22 20:17 UTC (permalink / raw)
  To: tuhs

Wayne teased this into a stand-alone library here: 

https://github.com/wscott/bksupport

On Wed, May 22, 2024 at 11:49:04AM -0700, Larry McVoy wrote:
> On Wed, May 22, 2024 at 11:37:39AM -0400, Paul Winalski wrote:
> > On Tue, May 21, 2024 at 2:12 PM Luther Johnson <luther.johnson@makerlisp.com>
> > wrote:
> > 
> > > I like this anecdote because it points out the difference between being
> > > able to handle and process bizarre conditions, as if they were something
> > > that should work, which is maybe not that helpful, vs. detecting them and
> > > doing something reasonable, like failing with a "limit exceeded" message
> > >
> > That is in fact precisely how the DEC compiler handled the 100 nested
> > parentheses condition.
> > 
> > > . A silent, insidious failure down the line because a limit was exceeded
> > > is never good.
> > >
> > Amen!  One should always do bounds checking when dealing with fixed-size
> > aggregate data structures.  One compiler that I worked on got a bug report
> > of bad code being generated.  The problem was an illegal optimization that
> > never should have triggered but did due to a corrupted data table.  Finding
> > the culprit of the corruption took hours.  It finally turned out to be due
> > to overflow of an adjacent data table in use elsewhere in the compiler.
> > The routine to add another entry to that table didn't check for table
> > overflow.
> 
> We invented a data structure that gets around this problem nicely.  It's
> an array of pointers that starts at [1] instead of [0].  The [0]
> entry encodes 2 things:
> 
> In the upper bits, the log2 of the size of the array.  So all arrays
> have at least [0] and [1].  So 2 pointers is the smallest array and
> that was important to us, we wanted it to scale up and scale down.
> 
> In the lower bits, we record the number of used entries in the array.
> We assumed 32 bit pointers and with those we got ~134 million entries
> as our maximum number of entries.
> 
> Usage is like
> 
> char **space = allocLines(4);	// start with space for 4 entries
> 
> space = addLine(space, "I am [1]");
> space = addLine(space, "I am [2]");
> space = addLine(space, "I am [3]");
> space = addLine(space, "I am [4]");	// realloc's to 8 entries
> 
> freeLines(space, 0);	// second arg is typically 0 or free()
> 
> It works GREAT.  We used it all over BitKeeper, for stuff as small as
> commit comments to arrays of data structures.  It scales down, scales
> up.  Helper functions:
> 
> /*
>  * liblines - interfaces for autoexpanding data structures
>  *
>  * s = allocLines(n)
>  *      preallocate space for slightly less than N entries.
>  * s = addLine(s, line)
>  *      add line to s, allocating as needed.
>  *      line must be a pointer to preallocated space.
>  * freeLines(s, freep)
>  *      free the lines array; if freep is set, call that on each entry.
>  *      if freep is 0, do not free each entry.
>  * buf = popLine(s)
>  *      return the most recently added line (not an alloced copy of it)
>  * reverseLines(s)
>  *      reverse the order of the lines in the array
>  * sortLines(space, compar)
>  *      sort the lines using the compar function if set, else string_sort()
>  * removeLine(s, which, freep)
>  *      look for all lines which match "which" and remove them from the array
>  *      returns number of matches found
>  * removeLineN(s, i, freep)
>  *      remove the 'i'th line.
>  * lines = splitLine(buf, delim, lines)
>  *      split buf on any/all chars in delim and put the tokens in lines.
>  * buf = joinLines(":", s)
>  *      return one string which is all the strings glued together with ":"
>  *      does not free s, caller must free s.
>  * buf = findLine(lines, needle);
>  *      Return the index of the line in lines that matches needle
>  */
> 
> It's all open source, apache licensed, but you'd have to tease it out of
> the bitkeeper source tree.  Wouldn't be that hard and it would be useful.

-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-22 15:37       ` Paul Winalski
@ 2024-05-22 18:49         ` Larry McVoy
  2024-05-22 20:17           ` Larry McVoy
  0 siblings, 1 reply; 47+ messages in thread
From: Larry McVoy @ 2024-05-22 18:49 UTC (permalink / raw)
  To: Paul Winalski; +Cc: Luther Johnson, tuhs

On Wed, May 22, 2024 at 11:37:39AM -0400, Paul Winalski wrote:
On Tue, May 21, 2024 at 2:12 PM Luther Johnson <luther.johnson@makerlisp.com>
> wrote:
> 
> > I like this anecdote because it points out the difference between being
> > able to handle and process bizarre conditions, as if they were something
> > that should work, which is maybe not that helpful, vs. detecting them and
> > doing something reasonable, like failing with a "limit exceeded" message
> >
> That is in fact precisely how the DEC compiler handled the 100 nested
> parentheses condition.
> 
> > . A silent, insidious failure down the line because a limit was exceeded
> > is never good.
> >
> Amen!  One should always do bounds checking when dealing with fixed-size
> aggregate data structures.  One compiler that I worked on got a bug report
> of bad code being generated.  The problem was an illegal optimization that
> never should have triggered but did due to a corrupted data table.  Finding
> the culprit of the corruption took hours.  It finally turned out to be due
> to overflow of an adjacent data table in use elsewhere in the compiler.
> The routine to add another entry to that table didn't check for table
> overflow.

We invented a data structure that gets around this problem nicely.  It's
an array of pointers that starts at [1] instead of [0].  The [0]
entry encodes 2 things:

In the upper bits, the log2 of the size of the array.  So all arrays
have at least [0] and [1].  So 2 pointers is the smallest array and
that was important to us, we wanted it to scale up and scale down.

In the lower bits, we record the number of used entries in the array.
We assumed 32 bit pointers and with those we got ~134 million entries
as our maximum number of entries.

Usage is like

char **space = allocLines(4);	// start with space for 4 entries

space = addLine(space, "I am [1]");
space = addLine(space, "I am [2]");
space = addLine(space, "I am [3]");
space = addLine(space, "I am [4]");	// realloc's to 8 entries

freeLines(space, 0);	// second arg is typically 0 or free()

It works GREAT.  We used it all over BitKeeper, for stuff as small as
commit comments to arrays of data structures.  It scales down, scales
up.  Helper functions:

/*
 * liblines - interfaces for autoexpanding data structures
 *
 * s = allocLines(n)
 *      preallocate space for slightly less than N entries.
 * s = addLine(s, line)
 *      add line to s, allocating as needed.
 *      line must be a pointer to preallocated space.
 * freeLines(s, freep)
 *      free the lines array; if freep is set, call that on each entry.
 *      if freep is 0, do not free each entry.
 * buf = popLine(s)
 *      return the most recently added line (not an alloced copy of it)
 * reverseLines(s)
 *      reverse the order of the lines in the array
 * sortLines(space, compar)
 *      sort the lines using the compar function if set, else string_sort()
 * removeLine(s, which, freep)
 *      look for all lines which match "which" and remove them from the array
 *      returns number of matches found
 * removeLineN(s, i, freep)
 *      remove the 'i'th line.
 * lines = splitLine(buf, delim, lines)
 *      split buf on any/all chars in delim and put the tokens in lines.
 * buf = joinLines(":", s)
 *      return one string which is all the strings glued together with ":"
 *      does not free s, caller must free s.
 * i = findLine(lines, needle);
 *      Return the index of the line in lines that matches needle
 */

It's all open source, Apache-licensed, but you'd have to tease it out of
the BitKeeper source tree.  It wouldn't be that hard, and it would be useful.
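[The scheme above can be sketched in a few dozen lines of C.  This is an
illustrative reconstruction from the description, not BitKeeper's actual
code; the bit split and helper names are guesses. --ed.]

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch of the "lines" array described above: entries start at [1];
 * slot [0] packs log2(capacity) into the upper bits and the used count
 * into the lower bits.  Layout and names are illustrative guesses. */

#define CNT_BITS 27                       /* lower bits: used count     */
#define CNT_MASK ((1u << CNT_BITS) - 1)   /* ~134 million entries max   */

static unsigned nLog(char **s)  { return (unsigned)((uintptr_t)s[0] >> CNT_BITS); }
static unsigned nUsed(char **s) { return (unsigned)((uintptr_t)s[0] & CNT_MASK); }
static void setHdr(char **s, unsigned log, unsigned cnt)
{
	s[0] = (char *)(((uintptr_t)log << CNT_BITS) | cnt);
}

char **allocLines(unsigned n)             /* capacity rounds up to 2^k  */
{
	unsigned log = 1;                 /* smallest array: 2 pointers */

	while ((1u << log) < n) log++;
	char **s = calloc((size_t)1 << log, sizeof(char *));
	if (s) setHdr(s, log, 0);
	return s;
}

char **addLine(char **s, char *line)      /* may move s; use the return */
{
	unsigned log = nLog(s), cnt = nUsed(s);

	if (cnt + 1 >= (1u << log)) {     /* full: double the array     */
		s = realloc(s, sizeof(char *) << ++log);
		if (!s) return NULL;
	}
	s[++cnt] = line;
	setHdr(s, log, cnt);
	return s;
}

void freeLines(char **s, void (*freep)(void *))
{
	unsigned i;

	if (freep)
		for (i = 1; i <= nUsed(s); i++) freep(s[i]);
	free(s);
}
```

Because addLine() may move the array, the caller always reassigns, which
is exactly the `space = addLine(space, ...)` idiom in the usage above.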

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21 18:12     ` Luther Johnson
@ 2024-05-22 15:37       ` Paul Winalski
  2024-05-22 18:49         ` Larry McVoy
  0 siblings, 1 reply; 47+ messages in thread
From: Paul Winalski @ 2024-05-22 15:37 UTC (permalink / raw)
  To: Luther Johnson; +Cc: tuhs

[-- Attachment #1: Type: text/plain, Size: 1139 bytes --]

On Tue, May 21, 2024 at 2:12 PM Luther Johnson <luther.johnson@makerlisp.com>
wrote:

> I like this anecdote because it points out the difference between being
> able to handle and process bizarre conditions, as if they were something
> that should work, which is maybe not that helpful, vs. detecting them and
> doing something reasonable, like failing with a "limit exceeded" message
>
That is in fact precisely how the DEC compiler handled the 100 nested
parentheses condition.

> . A silent, insidious failure down the line because a limit was exceeded
> is never good.
>
Amen!  One should always do bounds checking when dealing with fixed-size
aggregate data structures.  One compiler that I worked on got a bug report
of bad code being generated.  The problem was an illegal optimization that
never should have triggered but did due to a corrupted data table.  Finding
the culprit of the corruption took hours.  It finally turned out to be due
to overflow of an adjacent data table in use elsewhere in the compiler.
The routine to add another entry to that table didn't check for table
overflow.

-Paul W.
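[The missing check Paul describes costs one comparison per insertion.
A minimal sketch of a bounds-checked table append; names and sizes are
illustrative, not the actual compiler's code. --ed.]

```c
#include <stddef.h>

#define TABLE_MAX 128	/* fixed-size aggregate, as in the anecdote */

struct table {
	int    entries[TABLE_MAX];
	size_t used;
};

/* Refuse the insert instead of silently corrupting whatever data
 * happens to live next to entries[] in memory. */
int table_add(struct table *t, int value)
{
	if (t->used >= TABLE_MAX)
		return -1;	/* caller reports "limit exceeded" and stops */
	t->entries[t->used++] = value;
	return 0;
}
```

The point is not the check itself but where the failure surfaces: at the
overflowing routine, rather than hours later in an unrelated pass.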

[-- Attachment #2: Type: text/html, Size: 1742 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-22  5:08       ` Alexis
@ 2024-05-22 13:12         ` Warner Losh
  0 siblings, 0 replies; 47+ messages in thread
From: Warner Losh @ 2024-05-22 13:12 UTC (permalink / raw)
  To: Alexis; +Cc: The Unix Heritage Society

[-- Attachment #1: Type: text/plain, Size: 1095 bytes --]

On Tue, May 21, 2024, 11:08 PM Alexis <flexibeast@gmail.com> wrote:

> Dave Horsfall <dave@horsfall.org> writes:
>
> > On Tue, 21 May 2024, Paul Winalski wrote:
> >
> >> To take an example that really happened, a fuzz test consisting
> >> of 100
> >> nested parentheses caused an overflow in a parser table (it
> >> could only
> >> handle 50 nested parens).  Is that worth fixing?
> >
> > Well, they could be a rabid LISP programmer...
>
> Just did a quick check of some of the ELisp packages on my system:
>
> * For my own packages, the maximum was 10 closing parentheses.
> * For the packages in my elpa/ directory, the maximum was 26 in
>   ducpel-glyphs.el, where they were part of a glyph, rather than
>   delimiting code. The next highest value was 16, in org.el and
>   magit-sequence.el.
>
> i would suggest that any Lisp with more than a couple of dozen
> closing parentheses is in dire need of refactoring. Although of
> course someone who's rabid is probably not in the appropriate
> mental state for that. :-)
>

That's what ']' is for.

Warner

>

[-- Attachment #2: Type: text/html, Size: 1822 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-22  3:26     ` Dave Horsfall
@ 2024-05-22  5:08       ` Alexis
  2024-05-22 13:12         ` Warner Losh
  0 siblings, 1 reply; 47+ messages in thread
From: Alexis @ 2024-05-22  5:08 UTC (permalink / raw)
  To: Dave Horsfall; +Cc: The Unix Heritage Society

Dave Horsfall <dave@horsfall.org> writes:

> On Tue, 21 May 2024, Paul Winalski wrote:
>
>> To take an example that really happened, a fuzz test consisting 
>> of 100 
>> nested parentheses caused an overflow in a parser table (it 
>> could only 
>> handle 50 nested parens).  Is that worth fixing?
>
> Well, they could be a rabid LISP programmer...

Just did a quick check of some of the ELisp packages on my system:

* For my own packages, the maximum was 10 closing parentheses.
* For the packages in my elpa/ directory, the maximum was 26 in 
  ducpel-glyphs.el, where they were part of a glyph, rather than 
  delimiting code. The next highest value was 16, in org.el and 
  magit-sequence.el.

i would suggest that any Lisp with more than a couple of dozen 
closing parentheses is in dire need of refactoring. Although of 
course someone who's rabid is probably not in the appropriate 
mental state for that. :-)


Alexis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21 16:59   ` Paul Winalski
  2024-05-21 17:56     ` segaloco via TUHS
  2024-05-21 18:12     ` Luther Johnson
@ 2024-05-22  3:26     ` Dave Horsfall
  2024-05-22  5:08       ` Alexis
  2 siblings, 1 reply; 47+ messages in thread
From: Dave Horsfall @ 2024-05-22  3:26 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

[-- Attachment #1: Type: text/plain, Size: 307 bytes --]

On Tue, 21 May 2024, Paul Winalski wrote:

> To take an example that really happened, a fuzz test consisting of 100 
> nested parentheses caused an overflow in a parser table (it could only 
> handle 50 nested parens).  Is that worth fixing?

Well, they could be a rabid LISP programmer...

-- Dave

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21 16:59   ` Paul Winalski
  2024-05-21 17:56     ` segaloco via TUHS
@ 2024-05-21 18:12     ` Luther Johnson
  2024-05-22 15:37       ` Paul Winalski
  2024-05-22  3:26     ` Dave Horsfall
  2 siblings, 1 reply; 47+ messages in thread
From: Luther Johnson @ 2024-05-21 18:12 UTC (permalink / raw)
  To: tuhs

[-- Attachment #1: Type: text/plain, Size: 3463 bytes --]

I like this anecdote because it points out the difference between being
able to handle and process bizarre conditions, as if they were something
that should work, which is maybe not that helpful, vs. detecting them
and doing something reasonable, like failing with a "limit exceeded"
message. A silent, insidious failure down the line because a limit was
exceeded is never good. If "fuzz testing" helps exercise limits and
identifies places where software hasn't realized it has exceeded its
limits, has run off the end of a table, etc., that seems like a good
thing to me.

On 05/21/2024 09:59 AM, Paul Winalski wrote:
> On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com
> <mailto:stewart@serissa.com>> wrote:
>
>         Well this is obviously a hot button topic.  AFAIK I was nearby
>         when fuzz-testing for software was invented. I was the main
>         advocate for hiring Andy Payne into the Digital Cambridge
>         Research Lab.  One of his little projects was a thing that
>         generated random but correct C programs and fed them to
>         different compilers or compilers with different switches to
>         see if they crashed or generated incorrect results.
>         Overnight, his tester filed 300 or so bug reports against the
>         Digital C compiler.  This was met with substantial pushback,
>         but it was mostly an issue that many of the reports traced
>         to the same underlying bugs.
>
>         Bill McKeeman expanded the technique and published
>         "Differential Testing for Software"
>         https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
>         <https://www.cs.swarthmore.edu/%7Ebylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf>
>
> In the mid-late 1980s Bill McKeeman worked with DEC's compiler product
> teams to introduce fuzz testing into our testing process.  As with the
> C compiler work at DEC Cambridge, fuzz testing for other compilers
> (Fortran, PL/I) also found large numbers of bugs.
>
> The pushback from the compiler folks was mainly a matter of
> priorities.  Fuzz testing is very adept at finding edge conditions,
> but most failing fuzz tests have syntax that no human programmer would
> ever write.  As a compiler engineer you have limited time to devote to
> bug testing.  Do you spend that time addressing real customer issues
> that have been reported or do you spend it fixing problems with code
> that no human being would ever write?  To take an example that really
> happened, a fuzz test consisting of 100 nested parentheses caused an
> overflow in a parser table (it could only handle 50 nested parens).
> Is that worth fixing?
>
> As you pointed out, fuzz test failures tend to occur in clusters and
> many of the failures eventually are traced to the same underlying
> bug.  Which leads to the counter-argument to the pushback.  The fuzz
> tests are finding real underlying bugs.  Why not fix them before a
> customer runs into them?  That very thing did happen several times.  A
> customer-reported bug was fixed and suddenly several of the fuzz test
> problems that had been reported went away.  Another consideration is
> that, even back in the 1980s, humans weren't the only ones writing
> programs.  There were programs writing programs and they sometimes
> produced bizarre (but syntactically correct) code.
>
> -Paul W.


[-- Attachment #2: Type: text/html, Size: 5407 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21 16:59   ` Paul Winalski
@ 2024-05-21 17:56     ` segaloco via TUHS
  2024-05-21 18:12     ` Luther Johnson
  2024-05-22  3:26     ` Dave Horsfall
  2 siblings, 0 replies; 47+ messages in thread
From: segaloco via TUHS @ 2024-05-21 17:56 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Tuesday, May 21st, 2024 at 9:59 AM, Paul Winalski <paul.winalski@gmail.com> wrote:

> On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:
> 
> > > Well this is obviously a hot button topic. AFAIK I was nearby when fuzz-testing for software was invented. I was the main advocate for hiring Andy Payne into the Digital Cambridge Research Lab. One of his little projects was a thing that generated random but correct C programs and fed them to different compilers or compilers with different switches to see if they crashed or generated incorrect results. Overnight, his tester filed 300 or so bug reports against the Digital C compiler. This was met with substantial pushback, but it was mostly an issue that many of the reports traced to the same underlying bugs.
> > > 
> > > Bill McKeeman expanded the technique and published "Differential Testing for Software" https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
> 
> In the mid-late 1980s Bill McKeeman worked with DEC's compiler product teams to introduce fuzz testing into our testing process. As with the C compiler work at DEC Cambridge, fuzz testing for other compilers (Fortran, PL/I) also found large numbers of bugs.
> 
> The pushback from the compiler folks was mainly a matter of priorities. Fuzz testing is very adept at finding edge conditions, but most failing fuzz tests have syntax that no human programmer would ever write. As a compiler engineer you have limited time to devote to bug testing. Do you spend that time addressing real customer issues that have been reported or do you spend it fixing problems with code that no human being would ever write? To take an example that really happened, a fuzz test consisting of 100 nested parentheses caused an overflow in a parser table (it could only handle 50 nested parens). Is that worth fixing?
> 
> As you pointed out, fuzz test failures tend to occur in clusters and many of the failures eventually are traced to the same underlying bug. Which leads to the counter-argument to the pushback. The fuzz tests are finding real underlying bugs. Why not fix them before a customer runs into them? That very thing did happen several times. A customer-reported bug was fixed and suddenly several of the fuzz test problems that had been reported went away. Another consideration is that, even back in the 1980s, humans weren't the only ones writing programs. There were programs writing programs and they sometimes produced bizarre (but syntactically correct) code.
> 
> -Paul W.

A happy medium could be including far-out fuzzing to characterize issues, but not necessarily sinking the resources into immediately resolving bizarre discoveries from the fuzzing.  Better to know than not, but also have the wisdom to determine "is someone actually going to trip this" vs. "this is something that is possible and good to document".  In my own work we have several of the latter, where something is almost guaranteed never to happen with a human interaction, but is also something we want documented somewhere so if unlikely problem <xyz> ever does happen, the discovery is already done and we just start plotting out a solution.  That's also some nice low-hanging fruit to pluck when there isn't much else going on, but it avoids the phenomenon where we sink critical time into bugfixes with a microscopic ROI.

- Matt G.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  1:56 ` Rob Pike
  2024-05-21  2:47   ` Larry McVoy
  2024-05-21  3:53   ` George Michaelson
@ 2024-05-21 16:59   ` Paul Winalski
  2024-05-21 17:56     ` segaloco via TUHS
                       ` (2 more replies)
  2 siblings, 3 replies; 47+ messages in thread
From: Paul Winalski @ 2024-05-21 16:59 UTC (permalink / raw)
  To: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 2532 bytes --]

On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:

> Well this is obviously a hot button topic.  AFAIK I was nearby when
>> fuzz-testing for software was invented. I was the main advocate for hiring
>> Andy Payne into the Digital Cambridge Research Lab.  One of his little
>> projects was a thing that generated random but correct C programs and fed
>> them to different compilers or compilers with different switches to see if
>> they crashed or generated incorrect results.  Overnight, his tester filed
>> 300 or so bug reports against the Digital C compiler.  This was met with
>> substantial pushback, but it was mostly an issue that many of the reports
>> traced to the same underlying bugs.
>>
>> Bill McKeeman expanded the technique and published "Differential Testing
>> for Software"
>> https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
>>
>
In the mid-late 1980s Bill McKeeman worked with DEC's compiler product
teams to introduce fuzz testing into our testing process.  As with the C
compiler work at DEC Cambridge, fuzz testing for other compilers (Fortran,
PL/I) also found large numbers of bugs.

The pushback from the compiler folks was mainly a matter of priorities.
Fuzz testing is very adept at finding edge conditions, but most failing
fuzz tests have syntax that no human programmer would ever write.  As a
compiler engineer you have limited time to devote to bug testing.  Do you
spend that time addressing real customer issues that have been reported or
do you spend it fixing problems with code that no human being would ever
write?  To take an example that really happened, a fuzz test consisting of
100 nested parentheses caused an overflow in a parser table (it could only
handle 50 nested parens).  Is that worth fixing?

As you pointed out, fuzz test failures tend to occur in clusters and many
of the failures eventually are traced to the same underlying bug.  Which
leads to the counter-argument to the pushback.  The fuzz tests are finding
real underlying bugs.  Why not fix them before a customer runs into them?
That very thing did happen several times.  A customer-reported bug was
fixed and suddenly several of the fuzz test problems that had been reported
went away.  Another consideration is that, even back in the 1980s, humans
weren't the only ones writing programs.  There were programs writing
programs and they sometimes produced bizarre (but syntactically correct)
code.

-Paul W.
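[The nested-parenthesis case is trivial to regenerate.  A sketch of the
kind of input such a fuzz test produces; the original DEC fuzzer is not
public, so this is purely illustrative. --ed.]

```c
#include <stdlib.h>
#include <string.h>

/* Build an expression wrapped in `depth` nested parentheses,
 * e.g. depth 3 yields "(((1)))".  Sweeping depth upward probes a
 * parser's fixed nesting limits; depth 100 is the case Paul describes. */
char *nested_parens(int depth)
{
	char *buf = malloc(2 * (size_t)depth + 2);

	if (!buf) return NULL;
	memset(buf, '(', depth);
	buf[depth] = '1';
	memset(buf + depth + 1, ')', depth);
	buf[2 * depth + 1] = '\0';
	return buf;
}
```

A bounds-checked parser fed such strings fails with a clean "nesting
limit exceeded" instead of overflowing its table.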

[-- Attachment #2: Type: text/html, Size: 3170 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  3:36       ` Rob Pike
@ 2024-05-21 11:59         ` Peter Weinberger (温博格) via TUHS
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Weinberger (温博格) via TUHS @ 2024-05-21 11:59 UTC (permalink / raw)
  To: Rob Pike; +Cc: TUHS main list

On a lesser note, one day I got tired of C compiler crashes (probably
on the Vax, possibly originating in my code generator) and converted
them into 'fatal internal error' messages.
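[One way to make that conversion, sketched here as a "can't happen"
check macro; this is illustrative, not the actual code. --ed.]

```c
#include <stdio.h>
#include <stdlib.h>

/* Where the code generator would otherwise dereference garbage,
 * check the assumption and die with a clean diagnostic instead. */
#define INTERNAL(cond)                                              \
	do {                                                        \
		if (!(cond)) {                                      \
			fprintf(stderr,                             \
			    "fatal internal error: %s (%s:%d)\n",   \
			    #cond, __FILE__, __LINE__);             \
			exit(1);                                    \
		}                                                   \
	} while (0)
```

The message names the failed condition and its location, so a crash
report becomes a pointer straight at the broken invariant.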

On Mon, May 20, 2024 at 11:36 PM Rob Pike <robpike@gmail.com> wrote:
>
> Eventually Dennis told Ron to stop as he wasn't interested in protecting against insane things like "unsigned register union". Now that computing has become more adversarial, he might feel differently.
>
> -rob
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  1:56 ` Rob Pike
  2024-05-21  2:47   ` Larry McVoy
@ 2024-05-21  3:53   ` George Michaelson
  2024-05-21 16:59   ` Paul Winalski
  2 siblings, 0 replies; 47+ messages in thread
From: George Michaelson @ 2024-05-21  3:53 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Tue, May 21, 2024 at 11:56 AM Rob Pike <robpike@gmail.com> wrote:
>It's probably impossible to decide who invented fuzzing, so the credit will surely go to the person who named it.

That theory probably applies to the Earl of Sandwich, and the Earl of
Cardigan. Hore-Belisha also did ok for giant orange balls at zebra
crossings. I'm less sure the Earl of Zebra feels recognised, or that
Eugène-René Poubelle feels happy with his namesake (he should do, dust
bins are huge)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  2:54     ` Lawrence Stewart
@ 2024-05-21  3:36       ` Rob Pike
  2024-05-21 11:59         ` Peter Weinberger (温博格) via TUHS
  0 siblings, 1 reply; 47+ messages in thread
From: Rob Pike @ 2024-05-21  3:36 UTC (permalink / raw)
  To: Lawrence Stewart; +Cc: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 208 bytes --]

Eventually Dennis told Ron to stop as he wasn't interested in protecting
against insane things like "unsigned register union". Now that computing
has become more adversarial, he might feel differently.

-rob

[-- Attachment #2: Type: text/html, Size: 530 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  2:47   ` Larry McVoy
@ 2024-05-21  2:54     ` Lawrence Stewart
  2024-05-21  3:36       ` Rob Pike
  0 siblings, 1 reply; 47+ messages in thread
From: Lawrence Stewart @ 2024-05-21  2:54 UTC (permalink / raw)
  To: Larry McVoy; +Cc: TUHS main list

Good to learn more of the history!  I wonder when the technique got started on the hardware side?  
I wouldn’t be surprised if IBM were doing some of this for the S/360 since it was a nearly 
compatible set of systems.
-L

> On May 20, 2024, at 10:47 PM, Larry McVoy <lm@mcvoy.com> wrote:
> 
> I think the title might go to my OS prof, Bart Miller.  He did a paper 
> 
> https://www.paradyn.org/papers/fuzz.pdf
> 
> that named it that in 1990.  
> 
> On Tue, May 21, 2024 at 11:56:30AM +1000, Rob Pike wrote:
>> Ron Hardin was doing this to Dennis's C compiler in the 1980s, well before
>> 1998. And I believe Doug McIlroy was generating random regular expressions
>> to compare different implementations. It's probably impossible to decide
>> who invented fuzzing, so the credit will surely go to the person who named
>> it.
>> 
>> -rob
>> 
>> 
>> On Tue, May 21, 2024 at 12:09???AM Serissa <stewart@serissa.com> wrote:
>> 
>>> Well this is obviously a hot button topic.  AFAIK I was nearby when
>>> fuzz-testing for software was invented. I was the main advocate for hiring
>>> Andy Payne into the Digital Cambridge Research Lab.  One of his little
>>> projects was a thing that generated random but correct C programs and fed
>>> them to different compilers or compilers with different switches to see if
>>> they crashed or generated incorrect results.  Overnight, his tester filed
>>> 300 or so bug reports against the Digital C compiler.  This was met with
>>> substantial pushback, but it was a mostly an issue that many of the reports
>>> traced to the same underlying bugs.
>>> 
>>> Bill McKeeman expanded the technique and published "Differential Testing
>>> for Software"
>>> https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
>>> 
>>> Andy had encountered the underlying idea while working as an intern on the
>>> Alpha processor development team.  Among many other testers, they used an
>>> architectural tester called REX to generate more or less random sequences
>>> of instructions, which were then run through different simulation chains
>>> (functional, RTL, cycle-accurate) to see if they did the same thing.
>>> Finding user-accessible bugs in hardware seems like a good thing.
>>> 
>>> The point of generating correct programs (mentioned under the term LangSec
>>> here) goes a long way to avoid irritating the maintainers.  Making the test
>>> cases short is also maintainer-friendly.  The test generator is also in a
>>> position to annotate the source with exactly what it is supposed to do,
>>> which is also helpful.
>>> 
>>> -L
>>> 
>>> 
>>> 
> 
> -- 
> ---
> Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-21  1:56 ` Rob Pike
@ 2024-05-21  2:47   ` Larry McVoy
  2024-05-21  2:54     ` Lawrence Stewart
  2024-05-21  3:53   ` George Michaelson
  2024-05-21 16:59   ` Paul Winalski
  2 siblings, 1 reply; 47+ messages in thread
From: Larry McVoy @ 2024-05-21  2:47 UTC (permalink / raw)
  To: Rob Pike; +Cc: TUHS main list

I think the title might go to my OS prof, Bart Miller.  He did a paper 

https://www.paradyn.org/papers/fuzz.pdf

that named it that in 1990.  

On Tue, May 21, 2024 at 11:56:30AM +1000, Rob Pike wrote:
> Ron Hardin was doing this to Dennis's C compiler in the 1980s, well before
> 1998. And I believe Doug McIlroy was generating random regular expressions
> to compare different implementations. It's probably impossible to decide
> who invented fuzzing, so the credit will surely go to the person who named
> it.
> 
> -rob
> 
> 
> On Tue, May 21, 2024 at 12:09???AM Serissa <stewart@serissa.com> wrote:
> 
> > Well this is obviously a hot button topic.  AFAIK I was nearby when
> > fuzz-testing for software was invented. I was the main advocate for hiring
> > Andy Payne into the Digital Cambridge Research Lab.  One of his little
> > projects was a thing that generated random but correct C programs and fed
> > them to different compilers or compilers with different switches to see if
> > they crashed or generated incorrect results.  Overnight, his tester filed
> > 300 or so bug reports against the Digital C compiler.  This was met with
> > substantial pushback, but it was mostly an issue that many of the reports
> > traced to the same underlying bugs.
> >
> > Bill McKeeman expanded the technique and published "Differential Testing
> > for Software"
> > https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
> >
> > Andy had encountered the underlying idea while working as an intern on the
> > Alpha processor development team.  Among many other testers, they used an
> > architectural tester called REX to generate more or less random sequences
> > of instructions, which were then run through different simulation chains
> > (functional, RTL, cycle-accurate) to see if they did the same thing.
> > Finding user-accessible bugs in hardware seems like a good thing.
> >
> > The point of generating correct programs (mentioned under the term LangSec
> > here) goes a long way to avoid irritating the maintainers.  Making the test
> > cases short is also maintainer-friendly.  The test generator is also in a
> > position to annotate the source with exactly what it is supposed to do,
> > which is also helpful.
> >
> > -L
> >
> >
> >

-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-20 14:09 Serissa
@ 2024-05-21  1:56 ` Rob Pike
  2024-05-21  2:47   ` Larry McVoy
                     ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Rob Pike @ 2024-05-21  1:56 UTC (permalink / raw)
  To: Serissa; +Cc: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 2040 bytes --]

Ron Hardin was doing this to Dennis's C compiler in the 1980s, well before
1998. And I believe Doug McIlroy was generating random regular expressions
to compare different implementations. It's probably impossible to decide
who invented fuzzing, so the credit will surely go to the person who named
it.

-rob


On Tue, May 21, 2024 at 12:09 AM Serissa <stewart@serissa.com> wrote:

> Well this is obviously a hot button topic.  AFAIK I was nearby when
> fuzz-testing for software was invented. I was the main advocate for hiring
> Andy Payne into the Digital Cambridge Research Lab.  One of his little
> projects was a thing that generated random but correct C programs and fed
> them to different compilers or compilers with different switches to see if
> they crashed or generated incorrect results.  Overnight, his tester filed
> 300 or so bug reports against the Digital C compiler.  This was met with
> substantial pushback, but it was mostly an issue that many of the reports
> traced to the same underlying bugs.
>
> Bill McKeeman expanded the technique and published "Differential Testing
> for Software"
> https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf
>
> Andy had encountered the underlying idea while working as an intern on the
> Alpha processor development team.  Among many other testers, they used an
> architectural tester called REX to generate more or less random sequences
> of instructions, which were then run through different simulation chains
> (functional, RTL, cycle-accurate) to see if they did the same thing.
> Finding user-accessible bugs in hardware seems like a good thing.
>
> The point of generating correct programs (mentioned under the term LangSec
> here) goes a long way to avoid irritating the maintainers.  Making the test
> cases short is also maintainer-friendly.  The test generator is also in a
> position to annotate the source with exactly what it is supposed to do,
> which is also helpful.
>
> -L
>
>
>

[-- Attachment #2: Type: text/html, Size: 2796 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
@ 2024-05-20 14:09 Serissa
  2024-05-21  1:56 ` Rob Pike
  0 siblings, 1 reply; 47+ messages in thread
From: Serissa @ 2024-05-20 14:09 UTC (permalink / raw)
  To: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 1571 bytes --]

Well this is obviously a hot button topic.  AFAIK I was nearby when fuzz-testing for software was invented. I was the main advocate for hiring Andy Payne into the Digital Cambridge Research Lab.  One of his little projects was a thing that generated random but correct C programs and fed them to different compilers or compilers with different switches to see if they crashed or generated incorrect results.  Overnight, his tester filed 300 or so bug reports against the Digital C compiler.  This was met with substantial pushback, but it was mostly an issue that many of the reports traced to the same underlying bugs.

Bill McKeeman expanded the technique and published "Differential Testing for Software" https://www.cs.swarthmore.edu/~bylvisa1/cs97/f13/Papers/DifferentialTestingForSoftware.pdf

Andy had encountered the underlying idea while working as an intern on the Alpha processor development team.  Among many other testers, they used an architectural tester called REX to generate more or less random sequences of instructions, which were then run through different simulation chains (functional, RTL, cycle-accurate) to see if they did the same thing.  Finding user-accessible bugs in hardware seems like a good thing.

The point of generating correct programs (mentioned under the term LangSec here) goes a long way to avoid irritating the maintainers.  Making the test cases short is also maintainer-friendly.  The test generator is also in a position to annotate the source with exactly what it is supposed to do, which is also helpful.

-L



[-- Attachment #2: Type: text/html, Size: 1946 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:30         ` [TUHS] Re: A fuzzy awk Ralph Corderoy
@ 2024-05-20 13:48           ` Chet Ramey
  0 siblings, 0 replies; 47+ messages in thread
From: Chet Ramey @ 2024-05-20 13:48 UTC (permalink / raw)
  To: Ralph Corderoy, tuhs


[-- Attachment #1.1: Type: text/plain, Size: 1341 bytes --]

On 5/20/24 9:30 AM, Ralph Corderoy wrote:
> Hi Chet,
> 
>> Is it better to spend time on bugs that will affect a larger
>> percentage of the user population, instead of those that require
>> artificial circumstances that won't be encountered by normal usage?
>> Those get pushed down on the priority list.
> 
> You're talking about pushing unlikely, fuzzed bugs down the prioritised
> list, but we're discussing those bugs not getting onto the list for
> consideration.

I think the question is whether they were bugs in gawk at all, or the
result of gawk trying to be helpful by guessing at the script's intent
and trying to go on. Arnold's reaction to that, which had these negative
effects most often as the result of fuzzing attempts, was to exit on the
first syntax error.

Would those `bugs' have manifested themselves if gawk hadn't tried to do
this? Are they bugs at all? Guessing at intent is bound to be wrong some
of the time, and cause errors of its own.

I'm saying that fuzzing does occasionally find obscure bugs -- bugs that
would never be encountered in normal usage -- and those should be fixed.
Eventually.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 203 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:10       ` [TUHS] " Chet Ramey
@ 2024-05-20 13:30         ` Ralph Corderoy
  2024-05-20 13:48           ` Chet Ramey
  0 siblings, 1 reply; 47+ messages in thread
From: Ralph Corderoy @ 2024-05-20 13:30 UTC (permalink / raw)
  To: tuhs

Hi Chet,

> Is it better to spend time on bugs that will affect a larger
> percentage of the user population, instead of those that require
> artificial circumstances that won't be encountered by normal usage?
> Those get pushed down on the priority list.

You're talking about pushing unlikely, fuzzed bugs down the prioritised
list, but we're discussing those bugs not getting onto the list for
consideration.

Lack of resources also applies to triaging bugs, and I agree a fuzzed bug
which hands over 42 KiB of dense, gibberish awk will probably not get
volunteer attention.  But then fuzzers can seek a smaller test case,
similar to Andreas Zeller's delta debugging.
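
Zeller's ddmin algorithm is itself small enough to sketch. This is a simplified version (chunk removal only), with a toy predicate standing in for "gawk misbehaves on this input"; the awk fragment and the `"}{"` trigger are purely illustrative:

```python
def ddmin(test, inp):
    """Shrink string `inp` while test(inp) still reports failure (True)."""
    n = 2                                    # number of chunks to try removing
    while len(inp) >= 2:
        chunk = len(inp) // n
        reduced = False
        for i in range(n):
            # Try the input with the i-th chunk deleted.
            candidate = inp[:i * chunk] + inp[(i + 1) * chunk:]
            if candidate and test(candidate):
                inp = candidate              # still fails: keep the smaller input
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(inp):                # already at single characters: done
                break
            n = min(n * 2, len(inp))         # refine granularity and retry
    return inp

# Toy "bug": the tool crashes whenever the input contains "}{".
crash = lambda s: "}{" in s
fuzzed = "BEGIN { x = 1 }{ print }# gibberish trailing bytes"
print(ddmin(crash, fuzzed))                  # shrinks to the minimal trigger "}{"
```

A 42 KiB fuzzer-generated script reduced this way often becomes a few characters that a volunteer maintainer can triage at a glance.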

I'm in no way criticising Arnold who, like you, has spent many years
voluntarily enhancing a program many of us use every day.  But it's
interesting to shine some light on this corner to better understand
what's happening.

-- 
Cheers, Ralph.

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2024-05-25 17:37 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
2024-05-20 13:14 ` [TUHS] " arnold
2024-05-20 14:00   ` G. Branden Robinson
2024-05-20 13:25 ` Chet Ramey
2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 14:26     ` Chet Ramey
2024-05-22 13:44     ` arnold
2024-05-20 13:54 ` Ralph Corderoy
2024-05-20 15:39   ` [TUHS] OT: LangSec (Re: A fuzzy awk.) Åke Nordin
2024-05-20 16:09     ` [TUHS] " Ben Kallus
2024-05-20 20:02       ` John Levine
2024-05-20 20:11         ` Larry McVoy
2024-05-20 21:00           ` Ben Kallus
2024-05-20 21:03             ` John R Levine
2024-05-20 21:14             ` Larry McVoy
2024-05-20 21:46               ` Ben Kallus
2024-05-20 21:57                 ` Larry McVoy
2024-05-20 16:06 ` [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.) Paul Winalski
  -- strict thread matches above, loose matches on Subject: below --
2024-05-23 13:49 [TUHS] Re: A fuzzy awk Douglas McIlroy
2024-05-23 20:52 ` Rob Pike
2024-05-24  5:41   ` andrew
2024-05-24  7:17   ` Ralph Corderoy
2024-05-24  7:41     ` Rob Pike
2024-05-24 11:56     ` Dan Halbert
2024-05-25  0:17   ` Bakul Shah via TUHS
2024-05-25  0:57     ` G. Branden Robinson
2024-05-25 13:56     ` David Arnold
2024-05-25 17:18     ` Paul Winalski
2024-05-25 17:36       ` Tom Perrine
2024-05-20 14:09 Serissa
2024-05-21  1:56 ` Rob Pike
2024-05-21  2:47   ` Larry McVoy
2024-05-21  2:54     ` Lawrence Stewart
2024-05-21  3:36       ` Rob Pike
2024-05-21 11:59         ` Peter Weinberger (温博格) via TUHS
2024-05-21  3:53   ` George Michaelson
2024-05-21 16:59   ` Paul Winalski
2024-05-21 17:56     ` segaloco via TUHS
2024-05-21 18:12     ` Luther Johnson
2024-05-22 15:37       ` Paul Winalski
2024-05-22 18:49         ` Larry McVoy
2024-05-22 20:17           ` Larry McVoy
2024-05-22  3:26     ` Dave Horsfall
2024-05-22  5:08       ` Alexis
2024-05-22 13:12         ` Warner Losh
2024-05-19 23:08 [TUHS] The 'usage: ...' message. (Was: On Bloat...) Douglas McIlroy
2024-05-20  0:58 ` [TUHS] " Rob Pike
2024-05-20  3:19   ` arnold
2024-05-20  9:20     ` [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Ralph Corderoy
2024-05-20 13:10       ` [TUHS] " Chet Ramey
2024-05-20 13:30         ` [TUHS] Re: A fuzzy awk Ralph Corderoy
2024-05-20 13:48           ` Chet Ramey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).