The Unix Heritage Society mailing list
* [TUHS] Re: A fuzzy awk
@ 2024-05-23 13:49 Douglas McIlroy
  2024-05-23 20:52 ` Rob Pike
  2024-05-27  9:39 ` [TUHS] Testing an RE recogniser exhaustively. (Was: A fuzzy awk) Ralph Corderoy
  0 siblings, 2 replies; 15+ messages in thread
From: Douglas McIlroy @ 2024-05-23 13:49 UTC (permalink / raw)
  To: TUHS main list


> Doug McIlroy was generating random regular expressions

Actually not. I exhaustively (within limits) tested an RE recognizer
without knowingly generating any RE either mechanically or by hand.

The trick: From recursive equations (easily derived from the grammar of
REs), I counted how many REs exist up to various limits on token counts.
Then I generated all strings that satisfied those limits, turned the
recognizer loose on them and counted how many it accepted. Any disagreement
of counts revealed the existence (but not any symptom) of bugs.

Unlike most diagnostic techniques, this scheme produces a certificate of
(very high odds on) correctness over a representative subdomain. The scheme
also agnostically checks behavior on bad inputs as well as good.  It does
not, however, provide a stress test of a recognizer's capacity limits. And
its exponential nature limits its applicability to rather small domains.
(REs have only 5 distinct kinds of token.)

Doug
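
A minimal sketch of the count-then-enumerate trick, not from the thread: it
applies the same idea to a deliberately tiny grammar, balanced parentheses,
rather than to REs, so that both the counting recursion and the recognizer
under test fit in a few lines of Go.  All names are illustrative.

	package main

	import "fmt"

	// countBalanced returns, for each length 0..max, how many strings over
	// '(' and ')' the grammar  S -> "" | "(" S ")" S  derives.  This is the
	// step of counting the language from recursive equations derived from
	// its grammar.
	func countBalanced(max int) []int {
		c := make([]int, max+1)
		c[0] = 1
		for n := 2; n <= max; n += 2 {
			for k := 0; k <= n-2; k += 2 {
				c[n] += c[k] * c[n-2-k]
			}
		}
		return c
	}

	// accepts is the recognizer under test: a left-to-right depth scan.
	func accepts(s string) bool {
		depth := 0
		for _, ch := range s {
			if ch == '(' {
				depth++
			} else {
				depth--
			}
			if depth < 0 {
				return false
			}
		}
		return depth == 0
	}

	// tokenString builds the bits'th string of length n over '(' and ')'.
	func tokenString(n, bits int) string {
		s := make([]byte, n)
		for i := range s {
			if bits>>i&1 == 1 {
				s[i] = '('
			} else {
				s[i] = ')'
			}
		}
		return string(s)
	}

	func main() {
		const max = 12
		want := countBalanced(max)
		for n := 0; n <= max; n++ {
			got := 0
			// Every string of length n, good and bad inputs alike.
			for bits := 0; bits < 1<<n; bits++ {
				if accepts(tokenString(n, bits)) {
					got++
				}
			}
			fmt.Printf("len %2d: grammar count %4d, recognizer accepted %4d\n",
				n, want[n], got)
		}
	}

A buggy recognizer shows up as a disagreement between the two columns; as in
Doug's scheme, the mismatch proves a bug exists without pointing at a symptom.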


* [TUHS] Re: A fuzzy awk
  2024-05-23 13:49 [TUHS] Re: A fuzzy awk Douglas McIlroy
@ 2024-05-23 20:52 ` Rob Pike
  2024-05-24  5:41   ` andrew
                     ` (2 more replies)
  2024-05-27  9:39 ` [TUHS] Testing an RE recogniser exhaustively. (Was: A fuzzy awk) Ralph Corderoy
  1 sibling, 3 replies; 15+ messages in thread
From: Rob Pike @ 2024-05-23 20:52 UTC (permalink / raw)
  To: Douglas McIlroy; +Cc: TUHS main list


The semantic distinction is important but the end result is very similar.
"Fuzzing" as it is now called (for no reason I can intuit) tries to get to
the troublesome cases faster by a sort of depth-first search, but
exhaustive will always beat it for value. Our exhaustive tester for bitblt,
first done by John Reiser if I remember right, set the stage for my own
thinking about how you properly test something.

-rob


On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <
douglas.mcilroy@dartmouth.edu> wrote:

> > Doug McIlroy was generating random regular expressions
>
> Actually not. I exhaustively (within limits) tested an RE recognizer
> without knowingly generating any RE either mechanically or by hand.
>
> The trick: From recursive equations (easily derived from the grammar of
> REs), I counted how many REs exist up to various limits on token counts,
> Then I generated all strings that satisfied those limits, turned the
> recognizer loose on them and counted how many it accepted. Any disagreement
> of counts revealed the existence (but not any symptom) of bugs.
>
> Unlike most diagnostic techniques, this scheme produces a certificate of
> (very high odds on) correctness over a representative subdomain. The
> scheme also agnostically checks behavior on bad inputs as well as good.  It
> does not, however, provide a stress test of a recognizer's capacity limits. And
> its exponential nature limits its applicability to rather small domains.
> (REs have only 5 distinct kinds of token.)
>
> Doug
>


* [TUHS] Re: A fuzzy awk
  2024-05-23 20:52 ` Rob Pike
@ 2024-05-24  5:41   ` andrew
  2024-05-24  7:17   ` Ralph Corderoy
  2024-05-25  0:17   ` Bakul Shah via TUHS
  2 siblings, 0 replies; 15+ messages in thread
From: andrew @ 2024-05-24  5:41 UTC (permalink / raw)
  To: Rob Pike; +Cc: Douglas McIlroy, TUHS main list


i did some of the later testing of bitblt.
it was a lovely thing, slowly constructing a trustable synthetic bitblt
of ever greater size and range against which you could compare the bitblt under test.

and we did find a couple of bugs, much to reiser’s chagrin.
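
A toy Go rendering of that idea, nothing like the real Reiser/Pike tester: a
trivially correct reference copy is compared with the routine under test over
every legal rectangle geometry on a tiny raster.  Real bitblt must also cope
with overlapping source and destination, bit depths and scan directions, all
ignored here.

	package main

	import "fmt"

	const W, H = 5, 4 // a raster small enough to enumerate every rectangle

	type raster [H][W]byte

	// blitRef is the trusted synthetic reference: the most naive loops imaginable.
	func blitRef(dst *raster, dx, dy int, src *raster, sx, sy, w, h int) {
		for y := 0; y < h; y++ {
			for x := 0; x < w; x++ {
				dst[dy+y][dx+x] = src[sy+y][sx+x]
			}
		}
	}

	// blitFast is the implementation under test; here it copies whole rows.
	func blitFast(dst *raster, dx, dy int, src *raster, sx, sy, w, h int) {
		for y := 0; y < h; y++ {
			copy(dst[dy+y][dx:dx+w], src[sy+y][sx:sx+w])
		}
	}

	func main() {
		var src raster
		v := byte(0)
		for y := range src {
			for x := range src[y] {
				src[y][x] = v // distinct values so a misplaced pixel shows
				v++
			}
		}
		bugs := 0
		for w := 0; w <= W; w++ {
			for h := 0; h <= H; h++ {
				for sx := 0; sx+w <= W; sx++ {
					for sy := 0; sy+h <= H; sy++ {
						for dx := 0; dx+w <= W; dx++ {
							for dy := 0; dy+h <= H; dy++ {
								var a, b raster
								blitRef(&a, dx, dy, &src, sx, sy, w, h)
								blitFast(&b, dx, dy, &src, sx, sy, w, h)
								if a != b {
									bugs++
								}
							}
						}
					}
				}
			}
		}
		fmt.Println("disagreements:", bugs)
	}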

> On May 23, 2024, at 1:52 PM, Rob Pike <robpike@gmail.com> wrote:
> 
> The semantic distinction is important but the end result is very similar. "Fuzzing" as it is now called (for no reason I can intuit) tries to get to the troublesome cases faster by a sort of depth-first search, but exhaustive will always beat it for value. Our exhaustive tester for bitblt, first done by John Reiser if I remember right, set the stage for my own thinking about how you properly test something.
> 
> -rob
> 
> 
> On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu <mailto:douglas.mcilroy@dartmouth.edu>> wrote:
>> > Doug McIlroy was generating random regular expressions
>> 
>> Actually not. I exhaustively (within limits) tested an RE recognizer without knowingly generating any RE either mechanically or by hand.
>> 
>> The trick: From recursive equations (easily derived from the grammar of REs), I counted how many REs exist up to various limits on token counts, Then I generated all strings that satisfied those limits, turned the recognizer loose on them and counted how many it accepted. Any disagreement of counts revealed the existence (but not any symptom) of bugs. 
>> 
>> Unlike most diagnostic techniques, this scheme produces a certificate of (very high odds on) correctness over a representative subdomain. The scheme also agnostically checks behavior on bad inputs as well as good.  It does not, however, provide a stress test of a recognizer's capacity limits. And its exponential nature limits its applicability to rather small domains. (REs have only 5 distinct kinds of token.)
>> 
>> Doug



* [TUHS] Re: A fuzzy awk
  2024-05-23 20:52 ` Rob Pike
  2024-05-24  5:41   ` andrew
@ 2024-05-24  7:17   ` Ralph Corderoy
  2024-05-24  7:41     ` Rob Pike
  2024-05-24 11:56     ` [TUHS] Re: A fuzzy awk Dan Halbert
  2024-05-25  0:17   ` Bakul Shah via TUHS
  2 siblings, 2 replies; 15+ messages in thread
From: Ralph Corderoy @ 2024-05-24  7:17 UTC (permalink / raw)
  To: tuhs

Hi,

Rob wrote:
> "Fuzzing" as it is now called (for no reason I can intuit)

Barton Miller describes coining the term.

   ‘That night, I was logged on to the Unix system in my office via
    a dial-up phone line over a 1200 baud modem.  ...
    I wanted a name that would evoke the feeling of random, unstructured
    data.  After trying out several ideas, I settled on the term “fuzz”.’

        — https://pages.cs.wisc.edu/~bart/fuzz/Foreword1.html

Line noise inspired him, as he describes.

-- 
Cheers, Ralph.


* [TUHS] Re: A fuzzy awk
  2024-05-24  7:17   ` Ralph Corderoy
@ 2024-05-24  7:41     ` Rob Pike
  2024-05-24 10:00       ` [TUHS] Is fuzz testing random? (Was: A fuzzy awk) Ralph Corderoy
  2024-05-24 11:56     ` [TUHS] Re: A fuzzy awk Dan Halbert
  1 sibling, 1 reply; 15+ messages in thread
From: Rob Pike @ 2024-05-24  7:41 UTC (permalink / raw)
  To: Ralph Corderoy; +Cc: tuhs


I'm sure that's the etymology but fuzzing isn't exactly random. That's
kinda the point of it.

-rob


On Fri, May 24, 2024 at 5:18 PM Ralph Corderoy <ralph@inputplus.co.uk>
wrote:

> Hi,
>
> Rob wrote:
> > "Fuzzing" as it is now called (for no reason I can intuit)
>
> Barton Miller describes coining the term.
>
>    ‘That night, I was logged on to the Unix system in my office via
>     a dial-up phone line over a 1200 baud modem.  ...
>     I wanted a name that would evoke the feeling of random, unstructured
>     data.  After trying out several ideas, I settled on the term “fuzz”.’
>
>         — https://pages.cs.wisc.edu/~bart/fuzz/Foreword1.html
>
> Line noise inspired him, as he describes.
>
> --
> Cheers, Ralph.
>


* [TUHS] Is fuzz testing random?  (Was: A fuzzy awk)
  2024-05-24  7:41     ` Rob Pike
@ 2024-05-24 10:00       ` Ralph Corderoy
  0 siblings, 0 replies; 15+ messages in thread
From: Ralph Corderoy @ 2024-05-24 10:00 UTC (permalink / raw)
  To: tuhs

Hi Rob,

> I'm sure that's the etymology but fuzzing isn't exactly random.
> That's kinda the point of it.

I was just curious about the etymology, but thinking about it...

The path crept along isn't random but is guided by observation, say new
output or increased coverage.  But rather than exhaustively generating
all possible inputs, a random subset is chosen so that deeper progress
can be made more quickly.

-- 
Cheers, Ralph.
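
Go's built-in fuzzing (Go 1.18 and later) is a concrete example of exactly
that: the engine mutates a seed corpus and keeps any input that reaches new
coverage, so the walk is randomised but feedback-guided rather than uniform.
The harness below is only illustrative; it would sit in a _test.go file and
run with `go test -fuzz=FuzzCompile`.

	package fuzzdemo

	import (
		"regexp"
		"testing"
	)

	func FuzzCompile(f *testing.F) {
		// Seed corpus; the fuzzer mutates these, guided by coverage.
		for _, seed := range []string{"", "a", "(ab*a|b)*", "[a-z"} {
			f.Add(seed)
		}
		f.Fuzz(func(t *testing.T, pattern string) {
			re, err := regexp.Compile(pattern)
			if err != nil {
				return // rejecting a bad pattern is fine; crashing is not
			}
			// A trivial invariant: the two match entry points must agree.
			if re.MatchString(pattern) != re.Match([]byte(pattern)) {
				t.Errorf("MatchString and Match disagree on %q", pattern)
			}
		})
	}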


* [TUHS] Re: A fuzzy awk
  2024-05-24  7:17   ` Ralph Corderoy
  2024-05-24  7:41     ` Rob Pike
@ 2024-05-24 11:56     ` Dan Halbert
  1 sibling, 0 replies; 15+ messages in thread
From: Dan Halbert @ 2024-05-24 11:56 UTC (permalink / raw)
  To: tuhs


On 5/24/24 03:17, Ralph Corderoy wrote:
> Rob wrote:
>> "Fuzzing" as it is now called (for no reason I can intuit)
> Barton Miller describes coining the term.
>
As to where the inspiration for the choice of word came from, I'll
speculate: Bart Miller was a CS grad student contemporary of mine at
Berkeley. Prof. Lotfi Zadeh was working on fuzzy logic, fuzzy sets, and
"possibility theory". (Prof. William Kahan hated this work, and called
it "wrong, and pernicious": cf.
https://www.sciencedirect.com/science/article/abs/pii/S0020025508000716.)
So the term "fuzzy" was almost infamous in the department.

Prof. Richard Lipton was also at Berkeley at that time, and was working 
on program mutation testing, which fuzzes the program to determine the 
adequacy of test coverage, rather than fuzzing the test data.

Dan H.
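
A minimal illustration of the distinction, with invented names: mutation
testing plants a deliberate bug in the program and asks whether the existing
test inputs notice.

	package mutation

	// add is the routine under test.
	func add(a, b int) int { return a + b }

	// addMutant is what a mutation-testing tool would generate mechanically:
	// the same routine with a single operator mutated.
	func addMutant(a, b int) int { return a - b }

	// killed reports whether a set of test inputs tells the original and the
	// mutant apart, i.e. whether the test data is adequate to kill this mutant.
	func killed(inputs [][2]int) bool {
		for _, in := range inputs {
			if add(in[0], in[1]) != addMutant(in[0], in[1]) {
				return true
			}
		}
		return false
	}

Here killed([][2]int{{0, 0}}) is false, exposing weak test data; adding the
input {1, 2} kills the mutant.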


* [TUHS] Re: A fuzzy awk
  2024-05-23 20:52 ` Rob Pike
  2024-05-24  5:41   ` andrew
  2024-05-24  7:17   ` Ralph Corderoy
@ 2024-05-25  0:17   ` Bakul Shah via TUHS
  2024-05-25  0:57     ` G. Branden Robinson
                       ` (2 more replies)
  2 siblings, 3 replies; 15+ messages in thread
From: Bakul Shah via TUHS @ 2024-05-25  0:17 UTC (permalink / raw)
  To: Rob Pike; +Cc: Douglas McIlroy, The Unix Heritage Society mailing list


What would be nice is if programming languages provided some support for such exhaustive testing[1].

At one point I had suggested turning Go's Interface type into something like Guttag-style abstract data types in that relevant axioms are specified right in the interface definition. The idea was that any concrete type that implements that interface must satisfy its axioms. Even if the compiler ignored these axioms, one can write a support program that can generate a set of comprehensive tests based on these axioms. [Right now a type "implementing" an interface only needs to have a set of methods that exactly match the interface methods but nothing more] The underlying idea is that each type is in essence a constraint on what values an instance of that type can take. So adding such axioms simply tightens (& documents) these constraints. Just the process of coming up with such axioms can improve the design (sort of like test-driven design but better!).

Now it may be that applying this to anything more complex than stacks won't work well & it won't be perfect but I thought this was worth experimenting with. This would be like functional testing of all the nuts and bolts and components that go in an airplane. The airplane may still fall apart but that would be a "composition" error!

[1] There are "proof assistant" or formal spec languages such as TLA+, Coq, Isabelle, etc., but they don't get used much by the average programmer. I want something more retail!
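
A rough Go sketch of what this could look like, not an existing Go feature:
the Stack interface, its axioms and the checking helper are all hypothetical,
the helper standing in for the support program that would be generated from
the axioms.

	package adt

	import "testing"

	// Stack is an ordinary Go interface; the axioms below are the extra
	// information the proposal would attach to it.
	type Stack interface {
		Push(x int)
		Pop() (int, bool) // false when empty
		Len() int
	}

	// CheckStackAxioms exercises any concrete implementation handed in via mk.
	func CheckStackAxioms(t *testing.T, mk func() Stack) {
		s := mk()
		// Axiom: a new stack is empty and cannot be popped.
		if s.Len() != 0 {
			t.Fatalf("new stack has Len %d, want 0", s.Len())
		}
		if _, ok := s.Pop(); ok {
			t.Fatalf("Pop on empty stack succeeded")
		}
		// Axiom: Pop(Push(s, x)) = x, and Push then Pop leaves Len unchanged.
		for _, x := range []int{0, 1, -7, 42} {
			before := s.Len()
			s.Push(x)
			if s.Len() != before+1 {
				t.Fatalf("Push did not grow Len by 1")
			}
			if got, ok := s.Pop(); !ok || got != x {
				t.Fatalf("Pop after Push(%d) = %d, %v", x, got, ok)
			}
			if s.Len() != before {
				t.Fatalf("Push followed by Pop changed Len")
			}
		}
	}

An exhaustive or property-based driver could then feed CheckStackAxioms every
type claiming to implement Stack, which is roughly the comprehensive-test
generation described above.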

> On May 23, 2024, at 1:52 PM, Rob Pike <robpike@gmail.com> wrote:
> 
> The semantic distinction is important but the end result is very similar. "Fuzzing" as it is now called (for no reason I can intuit) tries to get to the troublesome cases faster by a sort of depth-first search, but exhaustive will always beat it for value. Our exhaustive tester for bitblt, first done by John Reiser if I remember right, set the stage for my own thinking about how you properly test something.
> 
> -rob
> 
> 
> On Thu, May 23, 2024 at 11:49 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu <mailto:douglas.mcilroy@dartmouth.edu>> wrote:
>> > Doug McIlroy was generating random regular expressions
>> 
>> Actually not. I exhaustively (within limits) tested an RE recognizer without knowingly generating any RE either mechanically or by hand.
>> 
>> The trick: From recursive equations (easily derived from the grammar of REs), I counted how many REs exist up to various limits on token counts, Then I generated all strings that satisfied those limits, turned the recognizer loose on them and counted how many it accepted. Any disagreement of counts revealed the existence (but not any symptom) of bugs. 
>> 
>> Unlike most diagnostic techniques, this scheme produces a certificate of (very high odds on) correctness over a representative subdomain. The scheme also agnostically checks behavior on bad inputs as well as good.  It does not, however, provide a stress test of a recognizer's capacity limits. And its exponential nature limits its applicability to rather small domains. (REs have only 5 distinct kinds of token.)
>> 
>> Doug



* [TUHS] Re: A fuzzy awk
  2024-05-25  0:17   ` Bakul Shah via TUHS
@ 2024-05-25  0:57     ` G. Branden Robinson
  2024-05-25 13:56     ` David Arnold
  2024-05-25 17:18     ` Paul Winalski
  2 siblings, 0 replies; 15+ messages in thread
From: G. Branden Robinson @ 2024-05-25  0:57 UTC (permalink / raw)
  To: The Unix Heritage Society mailing list


[restricting to list; strong opinions here]

At 2024-05-24T17:17:53-0700, Bakul Shah via TUHS wrote:
> What would be nice if programming languages provided some support for
> such exhaustive testing[1].

[rearranging]
> At one point I had suggested turning Go's Interface type to something
> like Guttag style abstract data types in that relevant axioms are
> specified right in the interface definition.

It's an excellent idea.

> The underlying idea is that each type is in essence a constraint on
> what values an instance of that type can take.

In the simple form of a data type plus a range constraint, that's the
Ada definition of a subtype since day one--Ada '80 or Ada 83 if you
insist on the standardized form of the language.

40 years later we have Linus Torvalds tearing up his achievement
certificate in "kinder, gentler email interactions" just to trash the
notion of range checks on data types.[1][2][3]

Naturally, the brogrammers are quick to take Torvalds's side.[4]

Pascal had range checks too, and Kernighan famously punked on Wirth for
that.  I'm not certain, but I get the feeling the latter got somewhat
over-interpreted.  (To be fair to Kernighan, Pascal _as specced in the
Revised Report of 1973_[5] was in my opinion too weak a language to
leave the lab, for many of the reasons he noted.  The inflexible array
typing was fatal, in my view.)

> The idea was that any concrete type that implements that interface
> must satisfy its axioms.

Yes.  There is of course much more to the universe of potential
constraints than range checks.  Ada 2022 has these in great generality
with "subtype predicates".

http://www.ada-auth.org/standards/22aarm/html/aa-3-2-4.html
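
For readers who have not met Ada, a loose Go analogue of the two ideas, with
made-up names; Go has no subtype declarations, so both the range constraint
and the predicate become run-time checks in constructors.

	package constrained

	import "fmt"

	// Percent mimics an Ada range-constrained subtype such as
	//   subtype Percent is Integer range 0 .. 100;
	type Percent int

	func NewPercent(n int) (Percent, error) {
		if n < 0 || n > 100 {
			return 0, fmt.Errorf("percent %d out of range 0..100", n)
		}
		return Percent(n), nil
	}

	// EvenPercent plays the role of a subtype predicate: an arbitrary
	// boolean condition layered on top of the range constraint.
	func EvenPercent(n int) (Percent, error) {
		p, err := NewPercent(n)
		if err != nil {
			return 0, err
		}
		if p%2 != 0 {
			return 0, fmt.Errorf("%d is not an even percentage", n)
		}
		return p, nil
	}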

> Even if the compiler ignored these axioms,

I don't understand why this idea wasn't seized upon with more force at
the CSRC.  The notion of a compiler flag that turned on "extra" (in the
Ritchie compiler circa 1980, this is perhaps expressed better as "any")
correctness checks could not have been a novelty.  NDEBUG and assert()
are similarly extremely old even in Unix.

> one can write a support program that can generate a set of
> comprehensive tests based on these axioms.

Yes.  As I understand it, this is how Spark/Ada got started.  Specially
annotated comments expressing predicates communicated with such a
support program, running much like the sort of automated theorem
prover you characterize below as not "retail".

In the last two revision cycles of the Ada standard (2013, 2022),
Spark/Ada's enhancements have made it into the language--though I am not
certain, and would not claim, that they compose with _every_ language
feature.  Spark/Ada started life as a subset of the language for a
reason.

But C has its own subset, MISRA C, so this is hardly a reason to scoff.

> [Right now a type "implementing" an interface only needs to
> have a set of methods that exactly match the interface methods but
> nothing more] The underlying idea is that each type is in essence a
> constraint on what values an instance of that type can take. So adding
> such axioms simply tightens (& documents) these constraints. Just the
> process of coming up with such axioms can improve the design (sor of
> like test driven design but better!).

Absolutely.  Generally, software engineers like to operationalize things
consistently enough that they can then be scripted/automated.

Evidently software testing is so mind-numbingly tedious that the will
to undertake it, even with automation, evaporates.

> Now it may be that applying this to anything more complex than stacks
> won't work well & it won't be perfect but I thought this was worth
> experimenting with. This would be like functional testing of all the
> nuts and bolts and components that go in an airplane. The airplane may
> still fall apart but that would be a "composition" error!

Yes.  And even if you can prove 100% of the theorems in your system, you
may learn to your dismay that your specification was defective.
Automated provers are as yet no aid to system architects.

> [1] There are "proof assisant" or formal spec languages such as TLA+,
> Coq, Isabelle etc. but they don't get used much by the average
> programmer. I want something more retail!

I've had a little exposure to these.  They are indeed esoteric, but also
extremely resource-hungry.  My _impression_, based on no hard data, is
that increasing the abilities of static analyzers and the expressiveness
with which they are directed with predicates is much cheaper.

But a lot of programmers will not budge at any cost, and will moreover
be celebrated by their peers for their obstinacy.  See footnotes.

There is much work still to be done.

Regards,
Branden

[1] https://lore.kernel.org/all/202404291502.612E0A10@keescook/
    https://lore.kernel.org/all/CAHk-=wi5YPwWA8f5RAf_Hi8iL0NhGJeL6MN6UFWwRMY8L6UDvQ@mail.gmail.com/
[2] https://lore.kernel.org/lkml/CAHk-=whkGHOmpM_1kNgzX1UDAs10+UuALcpeEWN29EE0m-my=w@mail.gmail.com/
[3] https://www.businessinsider.com/linus-torvalds-linux-time-away-empathy-2018-9
[4] https://lwn.net/Articles/973108/
[5] https://archive.org/details/1973-the-programming-language-pascal-revised-report-wirth


* [TUHS] Re: A fuzzy awk
  2024-05-25  0:17   ` Bakul Shah via TUHS
  2024-05-25  0:57     ` G. Branden Robinson
@ 2024-05-25 13:56     ` David Arnold
  2024-05-25 17:18     ` Paul Winalski
  2 siblings, 0 replies; 15+ messages in thread
From: David Arnold @ 2024-05-25 13:56 UTC (permalink / raw)
  To: Bakul Shah; +Cc: The Unix Heritage Society mailing list


> On 25 May 2024, at 10:18, Bakul Shah via TUHS <tuhs@tuhs.org> wrote:
> 
> 
> What would be nice if programming languages provided some support for such exhaustive testing[1].
> 
> At one point I had suggested turning Go's Interface type to something like Guttag style abstract data types in that relevant axioms are specified right in the interface definition. The idea was that any concrete type that implements that interface must satisfy its axioms. Even if the compiler ignored these axioms, one can write a support program that can generate a set of comprehensive tests based on these axioms.

Sounds like Eiffel, whose compiler had support for checking pre and post conditions (and maybe invariants?) at runtime, or disabling the checks for “performance” mode. 



d
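
A small Go sketch of the same toggle-the-checks contract style; the names are
invented, and Eiffel does this in the language proper rather than by
convention.

	package main

	import "fmt"

	// checks toggles contract evaluation, much as Eiffel compiles assertions
	// in for development and out for a "performance" build.
	var checks = true

	func require(ok bool, what string) {
		if checks && !ok {
			panic("precondition violated: " + what)
		}
	}

	func ensure(ok bool, what string) {
		if checks && !ok {
			panic("postcondition violated: " + what)
		}
	}

	// DivMod is an ordinary routine with its contract spelled out on entry and exit.
	func DivMod(a, b int) (q, r int) {
		require(a >= 0 && b > 0, "a >= 0 && b > 0")
		q, r = a/b, a%b
		ensure(q*b+r == a && 0 <= r && r < b, "Euclidean division property")
		return
	}

	func main() {
		fmt.Println(DivMod(17, 5)) // 3 2
	}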


* [TUHS] Re: A fuzzy awk
  2024-05-25  0:17   ` Bakul Shah via TUHS
  2024-05-25  0:57     ` G. Branden Robinson
  2024-05-25 13:56     ` David Arnold
@ 2024-05-25 17:18     ` Paul Winalski
  2024-05-25 17:36       ` Tom Perrine
  2 siblings, 1 reply; 15+ messages in thread
From: Paul Winalski @ 2024-05-25 17:18 UTC (permalink / raw)
  To: The Unix Heritage Society mailing list


On Fri, May 24, 2024 at 8:18 PM Bakul Shah via TUHS <tuhs@tuhs.org> wrote:

> At one point I had suggested turning Go's Interface type to something like
> Guttag style abstract data types in that relevant axioms are specified
> right in the interface definition. The idea was that any concrete type that
> implements that interface must satisfy its axioms. Even if the compiler
> ignored these axioms, one can write a support program that can generate a
> set of comprehensive tests based on these axioms. [Right now a type
> "implementing" an interface only needs to have a set of methods that
> exactly match the interface methods but nothing more] The underlying idea
> is that each type is in essence a constraint on what values an instance of
> that type can take. So adding such axioms simply tightens (& documents)
> these constraints. Just the process of coming up with such axioms can
> improve the design (sor of like test driven design but better!).
>

At one point I worked with a programming language called Gypsy that
implemented this concept.  Each routine had a prefix that specified axioms
on the routine's parameters and outputs.  The rest of Gypsy was a
conventional procedural language but the semantics were carefully chosen to
allow for automated proof of correctness.  I wrote a formal specification
for the DECnet session layer protocol (DECnet's equivalent of TCP) in
Gypsy.  I turned up a subtle bug in the prose version of the protocol
specification in the process.

-Paul W.


* [TUHS] Re: A fuzzy awk
  2024-05-25 17:18     ` Paul Winalski
@ 2024-05-25 17:36       ` Tom Perrine
  2024-05-25 17:53         ` [TUHS] Prof Don Good [was " Charles H Sauer (he/him)
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Perrine @ 2024-05-25 17:36 UTC (permalink / raw)
  To: Paul Winalski; +Cc: The Unix Heritage Society mailing list


Another Gypsy user here...

For KSOS-11 the kernel was described in SPECIAL - as a set of axioms and
theorems. There was no actual connection between the formal specification
in SPECIAL and the Modula code.

Some of the critical user-space code for a trusted downgrade program, to
bridge data from higher levels of classification to lower, was written in
Gypsy. I visited UT Austin and Dr Good(?)'s team to learn it, IIRC. Gypsy
was considered better in that the specification was tied to the executable
through the pre/post conditions - and the better support for semi-automated
theorem proving.




On Sat, May 25, 2024 at 10:18 AM Paul Winalski <paul.winalski@gmail.com>
wrote:

> On Fri, May 24, 2024 at 8:18 PM Bakul Shah via TUHS <tuhs@tuhs.org> wrote:
>
>> At one point I had suggested turning Go's Interface type to something like
>> Guttag style abstract data types in that relevant axioms are specified
>> right in the interface definition. The idea was that any concrete type that
>> implements that interface must satisfy its axioms. Even if the compiler
>> ignored these axioms, one can write a support program that can generate a
>> set of comprehensive tests based on these axioms. [Right now a type
>> "implementing" an interface only needs to have a set of methods that
>> exactly match the interface methods but nothing more] The underlying idea
>> is that each type is in essence a constraint on what values an instance of
>> that type can take. So adding such axioms simply tightens (& documents)
>> these constraints. Just the process of coming up with such axioms can
>> improve the design (sor of like test driven design but better!).
>>
>
> At one point I worked with a programming language called Gypsy that
> implemented this concept.  Each routine had a prefix that specified axioms
> on the routine's parameters and outputs.  The rest of Gypsy was a
> conventional procedural language but the semantics were carefully chosen to
> allow for automated proof of correctness.  I wrote a formal specification
> for the DECnet session layer protocol (DECnet's equivalent of TCP) in
> Gypsy.  I turned up a subtle bug in the prose version of the protocol
> specification in the process.
>
> -Paul W.
>


* [TUHS] Prof Don Good [was Re: A fuzzy awk
  2024-05-25 17:36       ` Tom Perrine
@ 2024-05-25 17:53         ` Charles H Sauer (he/him)
  0 siblings, 0 replies; 15+ messages in thread
From: Charles H Sauer (he/him) @ 2024-05-25 17:53 UTC (permalink / raw)
  To: tuhs

On 5/25/2024 12:36 PM, Tom Perrine wrote:
> Another Gypsy user here...
> 
> For KSOS-11 the kernel was described in SPECIAL - as a set of axioms and 
> theorems. There was no actual connection between the formal 
> specification in SPECIAL and the Modula code.
> 
> Some of the critical user-space code for a trusted downgrade program, to 
> bridge data from higher levels of classification to lower, was written 
> in Gypsy. I visited UT Austin and Dr Good(?)'s team to learn it, IIRC. 
> Gypsy was considered better in that the specification was tied to the 
> executable through the pre/post conditions - and the better support for 
> semi-automated theorem proving.

When I was transitioning from being a rock n' roller to a computer science
student, I took my first undergraduate languages course from Don.

https://www.dignitymemorial.com/obituaries/austin-tx/donald-good-8209907

Charlie

-- 
voice: +1.512.784.7526       e-mail: sauer@technologists.com
fax: +1.512.346.5240         Web: https://technologists.com/sauer/
Facebook/Google/LinkedIn/Twitter: CharlesHSauer


* [TUHS] Testing an RE recogniser exhaustively.  (Was: A fuzzy awk)
  2024-05-23 13:49 [TUHS] Re: A fuzzy awk Douglas McIlroy
  2024-05-23 20:52 ` Rob Pike
@ 2024-05-27  9:39 ` Ralph Corderoy
  2024-05-27 13:03   ` [TUHS] " Hellwig Geisse
  1 sibling, 1 reply; 15+ messages in thread
From: Ralph Corderoy @ 2024-05-27  9:39 UTC (permalink / raw)
  To: TUHS

Hi,

Doug wrote:
> The trick: From recursive equations (easily derived from the grammar
> of REs), I counted how many REs exist up to various limits on token
> counts.  Then I generated all strings that satisfied those limits,
> turned the recognizer loose on them and counted how many it accepted.

Which reminded me of Doug's paper.

    Enumerating the strings of regular languages,
    J. Functional Programming 14 (2004) 503-518

    Haskell code is developed for two ways to list the strings of the
    language defined by a regular expression: directly by set operations
    and indirectly by converting to and simulating an equivalent
    automaton.  The exercise illustrates techniques for dealing with
    infinite ordered domains and leads to an effective standard form for
    nondeterministic finite automata.

    PDF preprint: https://www.cs.dartmouth.edu/~doug/nfa.pdf

It's also nice for the NFA construction with one state per symbol plus
one final state, and no epsilon transitions.  Doug writes:

    The even-a language (ab*a|b)* is defined by automaton h, with three
    start states.

	h0 = State 0 '~' []
	h1 = State 1 'b' [h4,h1,h0]
	h2 = State 2 'a' [h4,h1,h0]
	h3 = State 3 'b' [h2,h3]
	h4 = State 4 'a' [h2,h3]
	h = [h4,h1,h0]

Replacing the symbols by their state numbers gives (43*2|1)*; state 0 is
the sole final state.

-- 
Cheers, Ralph.
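
An editorial gloss, not the paper's own code: one consistent reading of the
fragment is that a state's symbol is the character consumed on entering it,
its list gives the states that may come next, and a string is accepted when
the final state is reachable after the last character.  A Go transcription of
the quoted automaton under that reading:

	package main

	import "fmt"

	// state mirrors the fragment: a number, the symbol consumed on entering
	// the state ('~' marks the final state), and the follow states.
	type state struct {
		n      int
		sym    byte
		follow []*state
	}

	func accepts(start []*state, s string) bool {
		cur := start
		for i := 0; i < len(s); i++ {
			var next []*state
			for _, st := range cur {
				if st.sym == s[i] {
					next = append(next, st.follow...)
				}
			}
			cur = next
		}
		for _, st := range cur {
			if st.sym == '~' { // the sole final state is reachable
				return true
			}
		}
		return false
	}

	func main() {
		// The even-a automaton for (ab*a|b)*, transcribed from above.
		h0 := &state{0, '~', nil}
		h1 := &state{1, 'b', nil}
		h2 := &state{2, 'a', nil}
		h3 := &state{3, 'b', nil}
		h4 := &state{4, 'a', nil}
		h1.follow = []*state{h4, h1, h0}
		h2.follow = []*state{h4, h1, h0}
		h3.follow = []*state{h2, h3}
		h4.follow = []*state{h2, h3}
		h := []*state{h4, h1, h0}

		for _, s := range []string{"", "b", "aa", "ab", "aba", "abba"} {
			fmt.Printf("%q -> %v\n", s, accepts(h, s))
		}
	}

Run, it accepts "", "b", "aa", "aba" and "abba" and rejects "ab", i.e. exactly
the probes containing an even number of a's.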


* [TUHS] Re: Testing an RE recogniser exhaustively.  (Was: A fuzzy awk)
  2024-05-27  9:39 ` [TUHS] Testing an RE recogniser exhaustively. (Was: A fuzzy awk) Ralph Corderoy
@ 2024-05-27 13:03   ` Hellwig Geisse
  0 siblings, 0 replies; 15+ messages in thread
From: Hellwig Geisse @ 2024-05-27 13:03 UTC (permalink / raw)
  To: TUHS

Hi,

On Mon, 2024-05-27 at 10:39 +0100, Ralph Corderoy wrote:
> 
> Which reminded me of Doug's paper.
> 
>     Enumerating the strings of regular languages,
>     J. Functional Programming 14 (2004) 503-518
> 

Thanks for the pointer. That's a nice paper,
turned into an equally nice testing method.

Hellwig


end of thread (newest: 2024-05-27 13:03 UTC)

Thread overview: 15+ messages
2024-05-23 13:49 [TUHS] Re: A fuzzy awk Douglas McIlroy
2024-05-23 20:52 ` Rob Pike
2024-05-24  5:41   ` andrew
2024-05-24  7:17   ` Ralph Corderoy
2024-05-24  7:41     ` Rob Pike
2024-05-24 10:00       ` [TUHS] Is fuzz testing random? (Was: A fuzzy awk) Ralph Corderoy
2024-05-24 11:56     ` [TUHS] Re: A fuzzy awk Dan Halbert
2024-05-25  0:17   ` Bakul Shah via TUHS
2024-05-25  0:57     ` G. Branden Robinson
2024-05-25 13:56     ` David Arnold
2024-05-25 17:18     ` Paul Winalski
2024-05-25 17:36       ` Tom Perrine
2024-05-25 17:53         ` [TUHS] Prof Don Good [was " Charles H Sauer (he/him)
2024-05-27  9:39 ` [TUHS] Testing an RE recogniser exhaustively. (Was: A fuzzy awk) Ralph Corderoy
2024-05-27 13:03   ` [TUHS] " Hellwig Geisse
