The Unix Heritage Society mailing list
 help / color / mirror / Atom feed
* [TUHS] Re: Nice video with Brian Kernighan
@ 2022-08-20 15:48 Douglas McIlroy
  2022-08-20 16:17 ` Clem Cole
  2022-08-21  9:36 ` Mohamed Akram
  0 siblings, 2 replies; 10+ messages in thread
From: Douglas McIlroy @ 2022-08-20 15:48 UTC (permalink / raw)
  To: TUHS main list

Brian's tribute to the brilliant regex mechanism that awk borrowed
from egrep  spurred memories.

For more than forty years I claimed credit for stimulating Ken to
liberate grep from ed. Then, thanks to TUHS, I learned that I had
merely caused Ken to spring from the closet a program he had already
made for his own use.

There's a related story for egrep. Al Aho made a deterministic
regular-expression recognizer as a faster replacement for the
non-deterministic recognizer in grep. He also extended the domain of
patterns to full regular expressions, including alternation; thus the
"e" in egrep.

About the same time, I built on Norm Shryer's personal calendar
utility. I wanted to generalize Norm's strict syntax for dates to
cover most any (American) representation of dates, and to warn about
tomorrow's calendar as well as today's--where "tomorrow" could extend
across a weekend or holiday.

Egrep was just the tool I needed for picking the dates out of a
free-form calendar file. I wrote a little program that built an egrep
pattern based on today's date. The following mouthful for Saturday,
August 20 covers Sunday and Monday, too. (Note that, in egrep, newline
is a synonym for |, the alternation operator.)

        (^|[ (,;])(([Aa]ug[^ ]* *|(08|8)/)0*20)([^0123456789]|$)
        (^|[ (,;])(([Aa]ug[^ ]* *|(08|8)/)0*21)([^0123456789]|$)
        (^|[ (,;])(([Aa]ug[^ ]* *|(08|8)/)0*22)([^0123456789]|$)

It worked like a charm, except that it took a good part of a minute to
handle even a tiny calendar file. The reason: the state count of the
deterministic automaton was exponentially larger than the regular
regular expression; and egrep had to build the automaton before it
could run it. Al was mortified that an early serious use of egrep
should be such a turkey.

But Al was undaunted. He replaced the automaton construction with an
equivalent lazy algorithm that constructed a state only when the
recognizer was about to visit it. This made egrep into the brilliant
tool that Brian praised.

What I don't know is whether the calendar program stimulated the idea
of lazy implementation, or whether Al, like Ken before him with grep,
already had the idea up his sleeve.


^ permalink raw reply	[flat|nested] 10+ messages in thread
* [TUHS] Nice video with Brian Kernighan
@ 2022-08-18 10:01 Arnold Robbins
  2022-08-18 21:58 ` [TUHS] " Clem Cole
  0 siblings, 1 reply; 10+ messages in thread
From: Arnold Robbins @ 2022-08-18 10:01 UTC (permalink / raw)
  To: tuhs

Not quite 30 minutes long. Mostly about the history of awk but some
other stuff, including a nice plug for TUHS at the end.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-08-22  2:51 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-20 15:48 [TUHS] Re: Nice video with Brian Kernighan Douglas McIlroy
2022-08-20 16:17 ` Clem Cole
2022-08-20 18:24   ` Rich Morin
2022-08-21 15:09     ` John Cowan
2022-08-21 15:41       ` Clem Cole
2022-08-22  2:49         ` John Cowan
2022-08-21 20:07       ` Rich Morin
2022-08-21  9:36 ` Mohamed Akram
  -- strict thread matches above, loose matches on Subject: below --
2022-08-18 10:01 [TUHS] " Arnold Robbins
2022-08-18 21:58 ` [TUHS] " Clem Cole
2022-08-18 22:48   ` Pete Wright via TUHS

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).