The Unix Heritage Society mailing list
From: Will Senn <will.senn@gmail.com>
To: TUHS <tuhs@tuhs.org>
Subject: [TUHS] Re: regex early discussions
Date: Mon, 4 Mar 2024 11:05:38 -0600
Message-ID: <f9c02a9d-a953-44cc-b685-4a12410beb2f@gmail.com>
In-Reply-To: <CAOkr1zXTDgQeetBUPgmXWTcYCSbFsrpk7_jNKCeYXjd8mK+KbA@mail.gmail.com>

To close the loop a bit...

I really appreciate the anecdotes and background. It's helpful to those 
of us who didn't live it.

On the best resources front:

The Unix Programmer's Manual for v7 contains:
"A Tutorial Introduction to the UNIX Text Editor" by B. W. Kernighan - 
excellent coverage of Context Searching using a limited subset of regex.
"Advanced Editing on UNIX" by B. W. Kernighan - lots of examples.
"ed(1)" by authors of the manpages - super concise but thorough coverage 
of the regex rules (great followup to the tutorial).

Articles:
"Regular Expression Search Algorithm", by K. Thompson - an Algol-60 
implementation of regex described in 4 pages... in 1968... I was 2 1/2.
"Regular Expression Matching Can Be Simple and Fast", by Russ Cox - how 
can an article be both simple and deep? Great concision.

Other Books:
"The AWK Programming Language" by A. V. Aho, B. W. Kernighan, & P. J. 
Weinberger - the discussion on pp. 28-31, Regular Expressions, is the 
best I've seen.

"Chapter 9. Regular Expressions" in the XBD section of the SUS (IEEE 
Std 1003.1-2017) - Comprehensive presentation of the spec (good stuff, 
even if nobody perfectly implements it).

There are plenty more, but with the tutorial, ed(1), and AWK book in 
hand, I think a beginner is covered.

BTW, awk is awesome (particularly with the new csv additions) - I don't 
"need" the new unicode support, but it's nice. I didn't get awk, but 
when I figured out you could do this:

    awk '/SYS.*\(write,/, /\)/' */*
    SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
                    size_t, count)


in the kernel source, I was sold. I've never really wrapped my head 
around how to efficiently search over multiple lines, but awk's range 
patterns... just make sense :). Even if it looks crazy, it works.

ranges bounded by regexes... who'd have thunk it?
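
For anyone who hasn't met range patterns, here is a minimal sketch. It 
assumes a made-up sample file (the temp file and its contents are 
hypothetical, not taken from the kernel tree). A pattern of the form 
/start/, /end/ selects everything from a line matching the first regex 
through the next line matching the second:

```shell
# Hypothetical sample input standing in for a kernel source file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
int unrelated;
SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
                size_t, count)
{
        /* body elided in this sample */
}
EOF

# Range pattern: start printing at a line matching the first regex and
# stop after the next line that contains a literal closing parenthesis.
out=$(awk '/SYS.*\(write,/, /\)/' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

With no action given, awk prints each line in the range, so only the 
two lines of the macro invocation should come out. One thing to keep in 
mind: if the starting line itself also matches the end regex, the range 
is just that single line.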

Will



On 3/3/24 8:03 PM, Marc Rochkind wrote:
> Will, here's my recollection, when I got to UNIX in late 1972 or 
> thereabouts:
>
> First, there was ed. grep and sed were derived from ed, so came along 
> later. awk came along way later.
>
> There were only manual pages. You typed "man ed" and there it was. The 
> man pages were very accurate, very clear, and very authoritative. Many 
> found them too succinct, especially as UNIX got more popular, but all 
> of us back in the day found them perfect. Maybe you had to read the 
> man page a few times to understand it, but at least that's all you had 
> to read. No need to hunt around for more documentation!
>
> (Well, there was more documentation: The source code, which was all 
> online. But reading the ed source to understand regular expressions 
> was impossible. It was in assembler, and Ken was generating code on 
> the fly as the expression was compiled.)
>
> Also, it should be noted that ed produced a single error message: a 
> question mark. No wasting of teletype paper!
>
> The motivation for learning regular expressions was that that's how 
> you edited files. ed was the only game in town.
>
> (sh used a greatly restricted form of regular expressions, which were 
> documented on the sh man page.)
>
> Marc Rochkind
>
> On Sun, Mar 3, 2024 at 6:31 PM Will Senn <will.senn@gmail.com> wrote:
>
>     Hi All,
>
>     I was wondering, what were the best early sources of information
>     for regexes and why did folks need to know them to use unix? In my
>     recent explorations, I have needed to have a better understanding
>     of them, so I'm digging in... awk's my most recent thing and it's
>     deeply associated with them, so here we are. I went to the
>     bookshelf to find something appropriate and as usual, I've traced
>     to primary sources to some extent. I started with Mastering
>     Regular Expressions by Friedl, and I won't knock it (it's one of
>     the bestsellers in our field), but it's much too long for my
>     personal taste and it's not quite as systematic as I would like
>     (the author himself notes that his interests are less technical
>     than authors preceding him on the subject). So, back to the
>     shelves... Bourne's, The Unix Environment, and Kernighan & Pike's,
>     The Unix Programming Environment both talk about them in the
>     context of grep, ed, sed, and awk. Going further back, the Unix
>     Programmer's Manual v7 - ed, grep, sed, awk...
>
>     After digging around it seems like folks needed regexes for ed,
>     grep, sed and awk... and any other utility that leveraged the
>     wonderful nature of these handy expressions. Fine. Where did folks
>     go learn them? Was there a particularly good (succinct and
>     accurate) source of information that folks kept handy? I'm
>     imagining (based on what I've seen) that someone might cut out the
>     ed discussion or the grep pages of the manual and tape them to
>     their monitors, but maybe I'm stooopid and they didn't need no
>     stinkin' memory device for regexes - surely they're intuitive
>     enough that even a simpleton could pick them up after seeing a few
>     examples... but if that were really the case, Friedl's book would
>     have been a flop and it wasn't :). So seriously, if you remember
>     that far back - what was the definitive source of your regex
>     knowledge and what were the first motivators for learning them?
>
>     Thanks,
>
>     Will
>
>
>
> -- 
> /My new email address is mrochkind@gmail.com/


