To close the loop a bit...

I really appreciate the anecdotes and background. It's helpful to those of us who didn't live it.

On the best resources front:

The Unix Programmer's Manual for v7 contains:
"A Tutorial Introduction to the UNIX Text Editor" by B. W. Kernighan - excellent coverage of Context Searching using a limited subset of regex.
"Advanced Editing on UNIX" by B. W. Kernighan - lots of examples.
"ed(1)" by authors of the manpages - super concise but thorough coverage of the regex rules (great followup to the tutorial).

Articles:
"Regular Expression Search Algorithm", by K. Thompson - an Algol-60 implementation of regex described in 4 pages... in 1968... I was 2 1/2.
"Regular Expression Matching Can Be Simple and Fast", by Russ Cox - how can an article be both simple and deep? Great concision.

Other Books:
"The AWK Programming Language" by A. V. Aho, B. W. Kernighan, & P. J. Weinberger - the discussion on pp. 28-31, Regular Expressions, is the best I've seen.

"Chapter 9. Regular Expressions" in the XBD section of the SUS (IEEE Std 1003.1-2017) - Comprehensive presentation of the spec (good stuff, even if nobody perfectly implements it).

There are plenty more, but with the tutorial, ed(1), and AWK book in hand, I think a beginner is covered.

BTW, awk is awesome (particularly with the new CSV additions) - I don't "need" the new Unicode support, but it's nice. I didn't get awk at first, but when I figured out you could do this:

awk '/SYS.*\(write\,/, /\)/' */*
SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
               size_t, count)

in the kernel source, I was sold. I've never really wrapped my head around how to efficiently search over multiple lines, but awk's range patterns... just make sense :). Even if it looks crazy, it works.

ranges bounded by regexes... who'd have thunk it?
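
For anyone who wants to try a range pattern without a kernel tree handy, here's a minimal sketch. The sample file and its contents are made up for illustration (they are not real kernel source), and the regexes are the same idea as above, with the comma left unescaped since it needs no backslash:

```shell
# Hypothetical sample input, invented for this example.
sample=$(mktemp)
cat > "$sample" <<'EOF'
int unrelated(void);
SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
                size_t, count)
{
        return ksys_write(fd, buf, count);
}
EOF

# A range pattern is "start, end" with no action: awk prints every line
# from one matching the start regex through the next line matching the
# end regex, inclusive.
out=$(awk '/SYS.*\(write,/, /\)/' "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

On this sample it should print just the two lines of the SYSCALL_DEFINE3 declaration. One thing to watch: if the end regex matches on the same line as the start regex, the range is only that one line.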

Will



On 3/3/24 8:03 PM, Marc Rochkind wrote:
Will, here's my recollection, when I got to UNIX in late 1972 or thereabouts:

First, there was ed. grep and sed were derived from ed, so came along later. awk came along way later.

There were only manual pages. You typed "man ed" and there it was. The man pages were very accurate, very clear, and very authoritative. Many found them too succinct, especially as UNIX got more popular, but all of us back in the day found them perfect. Maybe you had to read the man page a few times to understand it, but at least that's all you had to read. No need to hunt around for more documentation!

(Well, there was more documentation: The source code, which was all online. But reading the ed source to understand regular expressions was impossible. It was in assembler, and Ken was generating code on the fly as the expression was compiled.)

Also, it should be noted that ed produced a single error message: a question mark. No wasting of teletype paper!

The motivation for learning regular expressions was that that's how you edited files. ed was the only game in town.

(sh used a greatly restricted form of regular expressions, which were documented on the sh man page.)

Marc Rochkind

On Sun, Mar 3, 2024 at 6:31 PM Will Senn <will.senn@gmail.com> wrote:
Hi All,

I was wondering, what were the best early sources of information for regexes and why did folks need to know them to use unix? In my recent explorations, I have needed to have a better understanding of them, so I'm digging in... awk's my most recent thing and it's deeply associated with them, so here we are. I went to the bookshelf to find something appropriate and, as usual, I've traced back to primary sources to some extent. I started with Mastering Regular Expressions by Friedl, and I won't knock it (it's one of the bestsellers in our field), but it's much too long for my personal taste and it's not quite as systematic as I would like (the author himself notes that his interests are less technical than authors preceding him on the subject). So, back to the shelves... Bourne's The Unix Environment and Kernighan & Pike's The Unix Programming Environment both talk about them in the context of grep, ed, sed, and awk. Going further back, the Unix Programmer's Manual v7 - ed, grep, sed, awk...

After digging around it seems like folks needed regexes for ed, grep, sed and awk... and any other utility that leveraged the wonderful nature of these handy expressions. Fine. Where did folks go learn them? Was there a particularly good (succinct and accurate) source of information that folks kept handy? I'm imagining (based on what I've seen) that someone might cut out the ed discussion or the grep pages of the manual and tape them to their monitors, but maybe I'm stooopid and they didn't need no stinkin' memory device for regexes - surely they're intuitive enough that even a simpleton could pick them up after seeing a few examples... but if that were really the case, Friedl's book would have been a flop and it wasn't :). So seriously, if you remember that far back - what was the definitive source of your regex knowledge and what were the first motivators for learning them?

Thanks,

Will


--
My new email address is mrochkind@gmail.com