The Unix Heritage Society mailing list
* [TUHS] regex early discussions
@ 2024-03-04  1:30 Will Senn
  2024-03-04  2:03 ` [TUHS] " Marc Rochkind
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Will Senn @ 2024-03-04  1:30 UTC (permalink / raw)
  To: TUHS

Hi All,

I was wondering, what were the best early sources of information for 
regexes and why did folks need to know them to use unix? In my recent 
explorations, I have needed to have a better understanding of them, so 
I'm digging in... awk's my most recent thing and it's deeply associated 
with them, so here we are. I went to the bookshelf to find something 
appropriate and as usual, I've traced to primary sources to some extent. 
I started with Mastering Regular Expressions by Friedl, and I won't 
knock it (it's one of the bestsellers in our field), but it's much too 
long for my personal taste and it's not quite as systematic as I would 
like (the author himself notes that his interests are less technical 
than authors preceding him on the subject). So, back to the shelves... 
Bourne's The Unix Environment and Kernighan & Pike's The Unix 
Programming Environment both talk about them in the context of grep, ed, 
sed, and awk. Going further back, the Unix Programmer's Manual v7 - ed, 
grep, sed, awk...

After digging around it seems like folks needed regexes for ed, grep, 
sed and awk... and any other utility that leveraged the wonderful nature 
of these handy expressions. Fine. Where did folks go learn them? Was 
there a particularly good (succinct and accurate) source of information 
that folks kept handy? I'm imagining (based on what I've seen) that 
someone might cut out the ed discussion or the grep pages of the manual 
and tape them to their monitors, but maybe I'm stooopid and they didn't 
need no stinkin' memory device for regexes - surely they're intuitive 
enough that even a simpleton could pick them up after seeing a few 
examples... but if that were really the case, Friedl's book would have 
been a flop and it wasn't :). So seriously, if you remember that far 
back - what was the definitive source of your regex knowledge and what 
were the first motivators for learning them?

Thanks,

Will

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  1:30 [TUHS] regex early discussions Will Senn
@ 2024-03-04  2:03 ` Marc Rochkind
  2024-03-04  3:38   ` Larry McVoy
                     ` (3 more replies)
  2024-03-04 13:17 ` Alan D. Salewski
  2024-03-04 16:57 ` Clem Cole
  2 siblings, 4 replies; 24+ messages in thread
From: Marc Rochkind @ 2024-03-04  2:03 UTC (permalink / raw)
  To: Will Senn; +Cc: TUHS

Will, here's my recollection, when I got to UNIX in late 1972 or
thereabouts:

First, there was ed. grep and sed were derived from ed, so came along
later. awk came along way later.

There were only manual pages. You typed "man ed" and there it was. The man
pages were very accurate, very clear, and very authoritative. Many found
them too succinct, especially as UNIX got more popular, but all of us back
in the day found them perfect. Maybe you had to read the man page a few
times to understand it, but at least that's all you had to read. No need to
hunt around for more documentation!

(Well, there was more documentation: The source code, which was all online.
But reading the ed source to understand regular expressions was impossible.
It was in assembler, and Ken was generating code on the fly as the
expression was compiled.)

Also, it should be noted that ed produced a single error message: a
question mark. No wasting of teletype paper!

The motivation for learning regular expressions was that that's how you
edited files. ed was the only game in town.

(sh used a greatly restricted form of regular expressions, which were
documented on the sh man page.)

Marc Rochkind


-- 
*My new email address is mrochkind@gmail.com <mrochkind@gmail.com>*

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  2:03 ` [TUHS] " Marc Rochkind
@ 2024-03-04  3:38   ` Larry McVoy
  2024-03-04  4:18     ` Rich Salz
                       ` (3 more replies)
  2024-03-04  7:10   ` Otto Moerbeek via TUHS
                     ` (2 subsequent siblings)
  3 siblings, 4 replies; 24+ messages in thread
From: Larry McVoy @ 2024-03-04  3:38 UTC (permalink / raw)
  To: Marc Rochkind; +Cc: Will Senn, TUHS

Marc is right.  I'll add that I grew up in terminal rooms, a bunch of
kids connected to a VAX 780, like 40 or more.  I have no idea how the
kids ahead of me learned but I learned by looking at their terminal
and going "what did you just do?".

My real understanding of regex is from Henry Spencer's regex.

-- 
---
Larry McVoy           Retired to fishing          http://www.mcvoy.com/lm/boat

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  3:38   ` Larry McVoy
@ 2024-03-04  4:18     ` Rich Salz
  2024-03-04  7:51     ` Alec Muffett
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Rich Salz @ 2024-03-04  4:18 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Marc Rochkind, Will Senn, TUHS

I remember being given a copy of grep source and seeing a char pointer
written as "p[-1]" and it was like a thunderbolt of understanding about
C pointers.
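
For readers who have not met the idiom, here is a minimal sketch of what "p[-1]" means. It is not taken from the grep source; the helper name char_before is hypothetical and exists only for illustration. In C, p[k] is defined as *(p + k), so a negative index simply reads the element before the one p points at.

```c
#include <stddef.h>

/* Hypothetical helper (not from grep): return the character just before
 * position i in s, or '\0' when i is 0 and nothing precedes it.
 * The caller must ensure i <= strlen(s). */
char char_before(const char *s, size_t i)
{
    const char *p = s + i;   /* p points at s[i] */

    if (i == 0)
        return '\0';         /* no previous character to look at */
    return p[-1];            /* same as *(p - 1), i.e. s[i - 1] */
}
```

A scanner that has already advanced past a character can use this form to look back at what it just consumed without keeping a separate index.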

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  2:03 ` [TUHS] " Marc Rochkind
  2024-03-04  3:38   ` Larry McVoy
@ 2024-03-04  7:10   ` Otto Moerbeek via TUHS
  2024-03-04  7:19     ` Dave Long
  2024-03-04  7:25     ` Otto Moerbeek via TUHS
  2024-03-04 12:00   ` Peter Weinberger (温博格) via TUHS
  2024-03-04 17:05   ` Will Senn
  3 siblings, 2 replies; 24+ messages in thread
From: Otto Moerbeek via TUHS @ 2024-03-04  7:10 UTC (permalink / raw)
  To: Marc Rochkind; +Cc: Will Senn, TUHS

On Sun, Mar 03, 2024 at 07:03:39PM -0700, Marc Rochkind wrote:

> Will, here's my recollection, when I got to UNIX in late 1972 or
> thereabouts:
> 
> First, there was ed. grep and sed were derived from ed, so came along
> later. awk came along way later.
> 
> There were only manual pages. You typed "man ed" and there it was. The man
> pages were very accurate, very clear, and very authoritative. Many found
> them too succinct, especially as UNIX got more popular, but all of us back
> in the day found them perfect. Maybe you had to read the man page a few
> times to understand it, but at least that's all you had to read. No need to
> hunt around for more documentation!
> 
> (Well, there was more documentation: The source code, which was all online.
> But reading the ed source to understand regular expressions was impossible.
> It was in assembler, and Ken was generating code on the fly as the
> expression was compiled.)

I'd like to add that there was also quite a large set of additional
documentation (Volume 2; Volume 1 was the man pages), which
includes "Advanced Editing on UNIX", giving many examples of the use of
regexes in ed(1).

I do remember reading a lot from Volume 2; as CS students in Amsterdam
we received printed and bound copies of both Volumes 1 and 2. So in my
case, "only man pages or source" is not true. Having paper versions
was important, because access to terminals for students was limited
(until I became a teaching assistant, which came with privileges,
including 24h access to terminals).

	-Otto

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  7:10   ` Otto Moerbeek via TUHS
@ 2024-03-04  7:19     ` Dave Long
  2024-03-04  7:25       ` arnold
  2024-03-04  7:25     ` Otto Moerbeek via TUHS
  1 sibling, 1 reply; 24+ messages in thread
From: Dave Long @ 2024-03-04  7:19 UTC (permalink / raw)
  To: Otto Moerbeek; +Cc: Marc Rochkind, Will Senn, TUHS

Did `learn` have a regex module? (my memory* does not suffice, and I didn't even manage to get google to tell me if it were learn(1) or learn(6), so please forgive the imprecision of this response)

-Dave

* although I do recall this was how I learned one of ed(1) or vi(1)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  7:10   ` Otto Moerbeek via TUHS
  2024-03-04  7:19     ` Dave Long
@ 2024-03-04  7:25     ` Otto Moerbeek via TUHS
  1 sibling, 0 replies; 24+ messages in thread
From: Otto Moerbeek via TUHS @ 2024-03-04  7:25 UTC (permalink / raw)
  To: Otto Moerbeek via TUHS; +Cc: Marc Rochkind, Will Senn

On Mon, Mar 04, 2024 at 08:10:26AM +0100, Otto Moerbeek via TUHS wrote:

> On Sun, Mar 03, 2024 at 07:03:39PM -0700, Marc Rochkind wrote:
> 
> > Will, here's my recollection, when I got to UNIX in late 1972 or
> > thereabouts:
> > 
> > First, there was ed. grep and sed were derived from ed, so came along
> > later. awk came along way later.
> > 
> > There were only manual pages. You typed "man ed" and there it was. The man
> > pages were very accurate, very clear, and very authoritative. Many found
> > them too succinct, especially as UNIX got more popular, but all of us back
> > in the day found them perfect. Maybe you had to read the man page a few
> > times to understand it, but at least that's all you had to read. No need to
> > hunt around for more documentation!
> > 
> > (Well, there was more documentation: The source code, which was all online.
> > But reading the ed source to understand regular expressions was impossible.
> > It was in assembler, and Ken was generating code on the fly as the
> > expression was compiled.)
> 
> I'd like to add that there was also quite a large set of additional
> documentation (Volume 2; Volume 1 was the man pages), which
> includes "Advanced Editing on UNIX", giving many examples of the use of
> regexes in ed(1).
> 
> I do remember reading a lot from Volume 2; as CS students in Amsterdam
> we received printed and bound copies of both Volumes 1 and 2. So in my
> case, "only man pages or source" is not true. Having paper versions
> was important, because access to terminals for students was limited
> (until I became a teaching assistant, which came with privileges,
> including 24h access to terminals).

https://wolfram.schneider.org/bsd/7thEdManVol2/ shows the contents of
Volume 2 (the level ranges from introductory tutorials to the internals
of the compiler).

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  7:19     ` Dave Long
@ 2024-03-04  7:25       ` arnold
  2024-03-04 12:05         ` Ralph Corderoy
  0 siblings, 1 reply; 24+ messages in thread
From: arnold @ 2024-03-04  7:25 UTC (permalink / raw)
  To: otto, dave.long; +Cc: will.senn, tuhs, mrochkind

I learned regular expressions from Kernighan & Plauger's book
"Software Tools". I was exposed to that book, Unix (v6 on a PDP-11)
and C programming (via K&R's book) all at the same time. This was in
the fall of 1980.

"Software Tools" changed my life.

Arnold

Dave Long <dave.long@bluewin.ch> wrote:

> Did `learn` have a regex module? (my memory* does not suffice, and
> I didn't even manage to get google to tell me if it were learn(1) or
> learn(6), so please forgive the imprecision of this response)
>
> -Dave
>
> * although I do recall this was how I learned one of ed(1) or vi(1)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  3:38   ` Larry McVoy
  2024-03-04  4:18     ` Rich Salz
@ 2024-03-04  7:51     ` Alec Muffett
  2024-03-04  8:17     ` Rob Pike
  2024-03-04 14:34     ` Larry McVoy
  3 siblings, 0 replies; 24+ messages in thread
From: Alec Muffett @ 2024-03-04  7:51 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Marc Rochkind, Will Senn, TUHS

On Mon, 4 Mar 2024, 03:38 Larry McVoy, <lm@mcvoy.com> wrote:

> Marc is right.  I'll add that I grew up in terminal rooms, a bunch of
> kids connected to a VAX 780, like 40 or more.  I have no idea how the
> kids ahead of me learned but I learned by looking at their terminal
> and going "what did you just do?".
>
> My real understanding of regex is from Henry Spencer's regex.
>

I have a similar story; I landed in Unix circa 1987 because the computer
science students at UCL were all raving about Unix / the Pyramid (Did we
pass unused cspyr accounts around the college nerd undergraduate
underground? Nooooooo, we would never have done that, that would be
"hacking…") and finally the physics department got some Suns to play with.

I bought the Bourne book to navigate the basic shell utilities and of
course there was source code (which we also weren't meant to have access
to, etc etc) - but in my world "regexps" were a fuzzy concept defined by
sed and grep (and various grep reimplementations) until Perl arrived and
sedimented (crowned?) Henry's implementation.

And that's why we call it PCRE.

-a

[-- Attachment #2: Type: text/html, Size: 1875 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  3:38   ` Larry McVoy
  2024-03-04  4:18     ` Rich Salz
  2024-03-04  7:51     ` Alec Muffett
@ 2024-03-04  8:17     ` Rob Pike
  2024-03-04  8:43       ` Alec Muffett
  2024-03-04 10:21       ` Bakul Shah via TUHS
  2024-03-04 14:34     ` Larry McVoy
  3 siblings, 2 replies; 24+ messages in thread
From: Rob Pike @ 2024-03-04  8:17 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Marc Rochkind, Will Senn, TUHS

[-- Attachment #1: Type: text/plain, Size: 4775 bytes --]

If that's really true, that you learned from Spencer's library, then you
didn't learn the most important thing about them, which is the automata
theory that guarantees their performance is always linear. Not to take
anything away from Henry, who admitted at the time that it could be slow
for bad expressions, but we're still paying the price for refusing to
connect "regex" with the theory that created them, ignoring it in fact.

Background: https://swtch.com/~rsc/regexp/regexp1.html
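
The linear-time guarantee comes from simulating the automaton with a set of
states instead of backtracking. A minimal sketch of that idea, assuming a toy
syntax (literal characters, '.', and '?' or '*' applied to the single
preceding character); the function names are made up for illustration, and
this is neither Thompson's nor Spencer's code:

```python
def parse(pattern):
    """Split a pattern into (char, quantifier) items; quantifier is '', '?' or '*'."""
    items = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        quant = ''
        if i + 1 < len(pattern) and pattern[i + 1] in '?*':
            quant = pattern[i + 1]
            i += 1
        items.append((ch, quant))
        i += 1
    return items


def closure(items, states):
    """Add every state reachable without consuming input (skipping optional items)."""
    result = set()
    for s in states:
        result.add(s)
        while s < len(items) and items[s][1] in ('?', '*'):
            s += 1
            result.add(s)
    return result


def full_match(pattern, text):
    """Return True if the whole text matches, tracking a set of NFA states."""
    items = parse(pattern)
    current = closure(items, {0})
    for c in text:
        nxt = set()
        for s in current:
            if s == len(items):
                continue  # accepting state has no outgoing transitions
            ch, quant = items[s]
            if ch == '.' or ch == c:
                # A starred item may repeat, so stay put; otherwise advance.
                nxt.add(s if quant == '*' else s + 1)
        current = closure(items, nxt)
        if not current:
            return False
    return len(items) in current
```

Each input character is examined once, and the work done for it depends only
on the size of the pattern, so a pattern such as 'a?' * n + 'a' * n against
'a' * n (the pathological case in the article linked above) stays cheap as n
grows, where a backtracking matcher may try exponentially many paths.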

-rob


On Mon, Mar 4, 2024 at 2:38 PM Larry McVoy <lm@mcvoy.com> wrote:

> Marc is right.  I'll add that I grew up in terminal rooms, a bunch of
> kids connected to a VAX 780, like 40 or more.  I have no idea how the
> kids ahead of me learned but I learned by looking at their terminal
> and going "what did you just do?".
>
> My real understanding of regex is from Henry Spencer's regex.
>
> On Sun, Mar 03, 2024 at 07:03:39PM -0700, Marc Rochkind wrote:
> > Will, here's my recollection, when I got to UNIX in late 1972 or
> > thereabouts:
> >
> > First, there was ed. grep and sed were derived from ed, so came along
> > later. awk came along way later.
> >
> > There were only manual pages. You typed "man ed" and there it was. The
> > man pages were very accurate, very clear, and very authoritative. Many
> > found them too succinct, especially as UNIX got more popular, but all of
> > us back in the day found them perfect. Maybe you had to read the man page
> > a few times to understand it, but at least that's all you had to read. No
> > need to hunt around for more documentation!
> >
> > (Well, there was more documentation: The source code, which was all
> > online. But reading the ed source to understand regular expressions was
> > impossible. It was in assembler, and Ken was generating code on the fly
> > as the expression was compiled.)
> >
> > Also, it should be noted that ed produced a single error message: a
> > question mark. No wasting of teletype paper!
> >
> > The motivation for learning regular expressions was that that's how you
> > edited files. ed was the only game in town.
> >
> > (sh used a greatly restricted form of regular expressions, which were
> > documented on the sh man page.)
> >
> > Marc Rochkind
> >
> > On Sun, Mar 3, 2024 at 6:31 PM Will Senn <will.senn@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > I was wondering, what were the best early sources of information for
> > > regexes and why did folks need to know them to use unix? In my recent
> > > explorations, I have needed to have a better understanding of them, so
> > > I'm digging in... awk's my most recent thing and it's deeply associated
> > > with them, so here we are. I went to the bookshelf to find something
> > > appropriate and as usual, I've traced to primary sources to some
> > > extent. I started with Mastering Regular Expressions by Friedl, and I
> > > won't knock it (it's one of the bestsellers in our field), but it's
> > > much to long for my personal taste and it's not quite as systematic as
> > > I would like (the author himself notes that his interests are less
> > > technical than authors preceding him on the subject). So, back to the
> > > shelves... Bourne's, The Unix Environment, and Kernighan & Pike's, The
> > > Unix Programming Evironment both talk about them in the context of
> > > grep, ed, sed, and awk. Going further back, the Unix Programmer's
> > > Manual v7 - ed, grep, sed, awk...
> > >
> > > After digging around it seems like folks needed regexes for ed, grep,
> > > sed and awk... and any other utility that leveraged the wonderful
> > > nature of these handy expressions. Fine. Where did folks go learn them?
> > > Was there a particularly good (succinct and accurate) source of
> > > information that folks kept handy? I'm imagining (based on what I've
> > > seen) that someone might cut out the ed discussion or the grep pages of
> > > the manual and tape them to their monitors, but maybe I'm stooopid and
> > > they didn't need no stinkin' memory device for regexes - surely they're
> > > intuitive enough that even a simpleton could pick them up after seeing
> > > a few examples... but if that were really the case, Friedl's book would
> > > have been a flop and it wasn't :). So seriously, if you remember that
> > > far back - what was the definitive source of your regex knowledge and
> > > what were the first motivators for learning them?
> > >
> > > Thanks,
> > >
> > > Will
> > >
> >
> >
> > --
> > *My new email address is mrochkind@gmail.com <mrochkind@gmail.com>*
>
> --
> ---
> Larry McVoy           Retired to fishing
> http://www.mcvoy.com/lm/boat
>

[-- Attachment #2: Type: text/html, Size: 6374 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  8:17     ` Rob Pike
@ 2024-03-04  8:43       ` Alec Muffett
  2024-03-04 14:25         ` Jan Schaumann via TUHS
  2024-03-04 10:21       ` Bakul Shah via TUHS
  1 sibling, 1 reply; 24+ messages in thread
From: Alec Muffett @ 2024-03-04  8:43 UTC (permalink / raw)
  To: Rob Pike; +Cc: Marc Rochkind, Will Senn, TUHS

[-- Attachment #1: Type: text/plain, Size: 1236 bytes --]

On Mon, 4 Mar 2024, 08:27 Rob Pike, <robpike@gmail.com> wrote [to Larry]

Oh happy days. Hi Rob, loved the book.


> If that's really true, that you learned from Spencer's library, then you
> didn't learn the most important thing about them, which is the automata
> theory that guarantees their performance is always linear. Not to take
> anything away from Henry, who admitted at the time that it could be slow
> for bad expressions, but we're still paying the price for refusing to
> connect "regex" with the theory that created them, ignoring it in fact.
>

I once got into a bunfight with a Googler on the topic of coding interview
questions, on a related matter. He was promulgating a regular expression to
correctly match/parse-out legitimate dotted-quad IPv4 addresses, including
bounds-checking the octets to be in the range 0..255, and arguing that
since it was going to be run through a DFA, it was a sunk cost for
efficiency and therefore perfect.

The result looked like line noise, and he was perturbed that I said I would
prefer to take a much simpler (NFA?) RE, parse out the ints and
bounds-check them, just to reduce cognitive load and increase
maintainability of code.
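
The simpler approach can be sketched as follows: a deliberately loose
pattern picks out the four fields, and ordinary integer comparison does the
bounds check. This is a minimal sketch, not the code from that discussion;
the names are hypothetical, and accepting leading zeros (e.g. "010") is an
assumption made here for brevity:

```python
import re

# Deliberately simple pattern: four runs of 1-3 ASCII digits separated by dots.
DOTTED_QUAD = re.compile(r'([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})')


def parse_ipv4(text):
    """Return the four octets as a tuple of ints, or None if text is not valid."""
    m = DOTTED_QUAD.fullmatch(text)
    if m is None:
        return None
    octets = tuple(int(g) for g in m.groups())
    # The range check lives in plain code rather than in the pattern.
    if any(o > 255 for o in octets):
        return None
    return octets
```

The pattern stays readable because it only describes the shape of the
address; the numeric rule is stated once, in a comparison anyone can review.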

We didn't really come to an agreement.

-a

[-- Attachment #2: Type: text/html, Size: 2156 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  8:17     ` Rob Pike
  2024-03-04  8:43       ` Alec Muffett
@ 2024-03-04 10:21       ` Bakul Shah via TUHS
  1 sibling, 0 replies; 24+ messages in thread
From: Bakul Shah via TUHS @ 2024-03-04 10:21 UTC (permalink / raw)
  To: Rob Pike; +Cc: Marc Rochkind, Will Senn, TUHS

[-- Attachment #1: Type: text/html, Size: 7348 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  2:03 ` [TUHS] " Marc Rochkind
  2024-03-04  3:38   ` Larry McVoy
  2024-03-04  7:10   ` Otto Moerbeek via TUHS
@ 2024-03-04 12:00   ` Peter Weinberger (温博格) via TUHS
  2024-03-04 17:05   ` Will Senn
  3 siblings, 0 replies; 24+ messages in thread
From: Peter Weinberger (温博格) via TUHS @ 2024-03-04 12:00 UTC (permalink / raw)
  To: Marc Rochkind; +Cc: Will Senn, TUHS

my recollection is that awk and sed were contemporaneous.

On Sun, Mar 3, 2024 at 9:04 PM Marc Rochkind <mrochkind@gmail.com> wrote:
>
> Will, here's my recollection, when I got to UNIX in late 1972 or thereabouts:
>
> First, there was ed. grep and sed were derived from ed, so came along later. awk came along way later.
>
> There were only manual pages. You typed "man ed" and there it was. The man pages were very accurate, very clear, and very authoritative. Many found them too succinct, especially as UNIX got more popular, but all of us back in the day found them perfect. Maybe you had to read the man page a few times to understand it, but at least that's all you had to read. No need to hunt around for more documentation!
>
> (Well, there was more documentation: The source code, which was all online. But reading the ed source to understand regular expressions was impossible. It was in assembler, and Ken was generating code on the fly as the expression was compiled.)
>
> Also, it should be noted that ed produced a single error message: a question mark. No wasting of teletype paper!
>
> The motivation for learning regular expressions was that that's how you edited files. ed was the only game in town.
>
> (sh used a greatly restricted form of regular expressions, which were documented on the sh man page.)
>
> Marc Rochkind
>
> On Sun, Mar 3, 2024 at 6:31 PM Will Senn <will.senn@gmail.com> wrote:
>>
>> Hi All,
>>
>> I was wondering, what were the best early sources of information for regexes and why did folks need to know them to use unix? In my recent explorations, I have needed to have a better understanding of them, so I'm digging in... awk's my most recent thing and it's deeply associated with them, so here we are. I went to the bookshelf to find something appropriate and as usual, I've traced to primary sources to some extent. I started with Mastering Regular Expressions by Friedl, and I won't knock it (it's one of the bestsellers in our field), but it's much to long for my personal taste and it's not quite as systematic as I would like (the author himself notes that his interests are less technical than authors preceding him on the subject). So, back to the shelves... Bourne's, The Unix Environment, and Kernighan & Pike's, The Unix Programming Evironment both talk about them in the context of grep, ed, sed, and awk. Going further back, the Unix Programmer's Manual v7 - ed, grep, sed, awk...
>>
>> After digging around it seems like folks needed regexes for ed, grep, sed and awk... and any other utility that leveraged the wonderful nature of these handy expressions. Fine. Where did folks go learn them? Was there a particularly good (succinct and accurate) source of information that folks kept handy? I'm imagining (based on what I've seen) that someone might cut out the ed discussion or the grep pages of the manual and tape them to their monitors, but maybe I'm stooopid and they didn't need no stinkin' memory device for regexes - surely they're intuitive enough that even a simpleton could pick them up after seeing a few examples... but if that were really the case, Friedl's book would have been a flop and it wasn't :). So seriously, if you remember that far back - what was the definitive source of your regex knowledge and what were the first motivators for learning them?
>>
>> Thanks,
>>
>> Will
>
>
>
> --
> My new email address is mrochkind@gmail.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  7:25       ` arnold
@ 2024-03-04 12:05         ` Ralph Corderoy
  2024-03-04 13:01           ` arnold
  0 siblings, 1 reply; 24+ messages in thread
From: Ralph Corderoy @ 2024-03-04 12:05 UTC (permalink / raw)
  To: tuhs; +Cc: will.senn, mrochkind

Hi Arnold,

> I learned regular expressions from Kernighan & Plauger's book
> "Software Tools".  I was exposed to that book, Unix (v6 on a PDP-11)
> and C programming (via K&R's book) all at the same time.  This was in
> the fall of 1980.

An excellent book.  What I think you've not mentioned is that it
implements regular expressions.  Being inside the black box can aid
understanding, including the performance of the matcher and the way the
regexp is best written for a particular matcher.

Kernighan and Pike's ‘The practice of programming’ also briefly
implements some regexp functionality when talking about the power of
notation.
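
For readers without the book to hand, here is a rough Python port, written
from memory, of the kind of small matcher that chapter presents. It assumes
the supported syntax is a literal character, '.', '^', '$' and '*'; the
function names are chosen here and are not the book's text:

```python
def match(regexp, text):
    """Search for regexp anywhere in text."""
    if regexp.startswith('^'):
        return match_here(regexp[1:], text)
    # Try every starting position, including the empty string at the end.
    for i in range(len(text) + 1):
        if match_here(regexp, text[i:]):
            return True
    return False


def match_here(regexp, text):
    """Match regexp at the beginning of text."""
    if regexp == '':
        return True
    if len(regexp) >= 2 and regexp[1] == '*':
        return match_star(regexp[0], regexp[2:], text)
    if regexp == '$':
        return text == ''
    if text != '' and (regexp[0] == '.' or regexp[0] == text[0]):
        return match_here(regexp[1:], text[1:])
    return False


def match_star(c, regexp, text):
    """Match zero or more occurrences of c, followed by regexp."""
    i = 0
    while True:
        if match_here(regexp, text[i:]):
            return True
        if i < len(text) and (c == '.' or text[i] == c):
            i += 1
        else:
            return False
```

Note that this style of matcher backtracks, so it shows the notation clearly
but does not carry the linear-time guarantee of an automaton-based matcher.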

-- 
Cheers, Ralph.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04 12:05         ` Ralph Corderoy
@ 2024-03-04 13:01           ` arnold
  0 siblings, 0 replies; 24+ messages in thread
From: arnold @ 2024-03-04 13:01 UTC (permalink / raw)
  To: tuhs, ralph; +Cc: will.senn, mrochkind

Hi Ralph.

Ralph Corderoy <ralph@inputplus.co.uk> wrote:

> Hi Arnold,
>
> > I learned regular expressions from Kernighan & Plauger's book
> > "Software Tools".  I was exposed to that book, Unix (v6 on a PDP-11)
> > and C programming (via K&R's book) all at the same time.  This was in
> > the fall of 1980.
>
> An excellent book.  What I think you've not mentioned is that it
> implements regular expressions.  Being inside the black box can aid
> understanding, including the performance of the matcher and the way the
> regexp is best written for a particular matcher.

Quite true.

> Kernighan and Pike's ‘The practice of programming’ also briefly
> implements some regexp functionality when talking about the power of
> notation.

What I didn't quite remember when I wrote the earlier note was that
at the same time as I was learning C, Unix and software tools, I took
a compiler course, using the first edition of the dragon book, which
covered regular expressions, NFAs and DFAs.

It all came together at the same time.

Arnold

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  1:30 [TUHS] regex early discussions Will Senn
  2024-03-04  2:03 ` [TUHS] " Marc Rochkind
@ 2024-03-04 13:17 ` Alan D. Salewski
  2024-03-04 16:57 ` Clem Cole
  2 siblings, 0 replies; 24+ messages in thread
From: Alan D. Salewski @ 2024-03-04 13:17 UTC (permalink / raw)
  To: TUHS (The Unix Heritage Society)

On Sun, Mar 3, 2024, at 20:30, Will Senn wrote:
> Hi All,
>
> I was wondering, what were the best early sources of information for 
> regexes and why did folks need to know them to use unix?
[...]
> Thanks,
>
> Will

I don't think I've seen in this thread mention of the 1968 CACM
article by Ken Thompson:

    "Regular Expression Search Algorithm"
    Ken Thompson
    Bell Telephone Laboratories, Inc., Murray Hill, New Jersey
    Communications of the ACM, Volume 11, Number 6, 1968-06

The abstract:
<quote>
    A method for locating specific character strings embedded in
    character text is described and an implementation of this method
    in the form of a compiler is discussed. The compiler accepts a
    regular expression as source language and produces an IBM 7094
    program as object language. The object program then accepts the
    text to be searched as input and produces a signal every time an
    embedded string in the text matches the given regular
    expression. Examples, problems, and solutions are also
    presented.
</quote>

-- 
a l a n   d.   s a l e w s k i
ads@salewski.email
salewski@att.net
https://github.com/salewski

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  8:43       ` Alec Muffett
@ 2024-03-04 14:25         ` Jan Schaumann via TUHS
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Schaumann via TUHS @ 2024-03-04 14:25 UTC (permalink / raw)
  To: Alec Muffett; +Cc: Marc Rochkind, Will Senn, TUHS

Alec Muffett <alec.muffett@gmail.com> wrote:

> I once got into a bunfight with a Googler on the topic of coding interview
> questions, on a related matter. He was promulgating a regular expression to
> correctly match/parse-out legitimate dotted-quad IPv4 addresses

That seems an excellent illustration of "now they have
two problems."  (And now do IPv6.)

If you need to pull IP addresses from text, the most
liberal regex will generally be "good enough"; if you
must be certain, feed the string to inet_aton(3).
:-)
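
A minimal sketch of that second step in Python, whose socket.inet_aton is
documented as using the C function of the same name; the wrapper name is
hypothetical:

```python
import socket


def is_ipv4(text):
    """Return True if the C library's inet_aton accepts text as an IPv4 address."""
    try:
        socket.inet_aton(text)
    except OSError:
        return False
    return True
```

Be aware that inet_aton is itself liberal: on many platforms it also accepts
shorthand such as "127.1". If only the strict four-field form should pass,
socket.inet_pton(socket.AF_INET, text) is the stricter call.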

-Jan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  3:38   ` Larry McVoy
                       ` (2 preceding siblings ...)
  2024-03-04  8:17     ` Rob Pike
@ 2024-03-04 14:34     ` Larry McVoy
  3 siblings, 0 replies; 24+ messages in thread
From: Larry McVoy @ 2024-03-04 14:34 UTC (permalink / raw)
  To: Marc Rochkind; +Cc: Will Senn, TUHS

On Sun, Mar 03, 2024 at 07:38:45PM -0800, Larry McVoy wrote:
> Marc is right.  I'll add that I grew up in terminal rooms, a bunch of
> kids connected to a VAX 780, like 40 or more.  I have no idea how the
> kids ahead of me learned but I learned by looking at their terminal
> and going "what did you just do?".
> 
> My real understanding of regex is from Henry Spencer's regex.

And this little implementation, I've used this one a lot.   

http://www.cs.yorku.ca/~oz/regex.bun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  1:30 [TUHS] regex early discussions Will Senn
  2024-03-04  2:03 ` [TUHS] " Marc Rochkind
  2024-03-04 13:17 ` Alan D. Salewski
@ 2024-03-04 16:57 ` Clem Cole
  2024-03-04 18:38   ` Phil Budne
  2 siblings, 1 reply; 24+ messages in thread
From: Clem Cole @ 2024-03-04 16:57 UTC (permalink / raw)
  To: Will Senn; +Cc: TUHS

[-- Attachment #1: Type: text/plain, Size: 4597 bytes --]

I've already had a chat with Will, but I wanted to add some other thoughts
to the group as a whole:

   - As was pointed out by others, computer life (and certainly interactive
   computing) does not begin with UNIX (*i.e.*, interactive text editors
   have been around since the beginning of interactive computing).  I'll use
   Thomas Haigh and Paul Ceruzzi's text, "A New History of Modern Computing",
   which basically pegs that as CTSS.  I don't know what the original editor
   was for CTSS [if someone like Doug or Ken remembers, I'd be curious to
   know].
   - Numerous editors show up on different systems, including STOPGAP on
   the MIT PDP-6, eventually SOS, TECO, EMACS, *etc*., and most have some
   concept of a 'line of text' to distinguish from a 'card image.'
   - Common to all is some way to search or find text and some way to
   replace it - usually on a line of input.
   - One of them is Lampson and Deutsch's "quick editor" or QED for SDS.
   - Language theory was definitely a hot item by the mid-1960s and lots of
   papers discussing automata and the like appear, including Ken's CACM 1968
   article describing his reg-ex search algorithm implementation for the IBM
   7094 [it should be findable with a search -- send me an email offline, I
   have a copy of a crappy scan but it is readable].
   - Most editors like SOS, TECO and the like do not have support for
   reg-ex, but do have some way to do sophisticated searching (and
   replacement).
   - Ken wrote an implementation of QED for CTSS and included his search
   algorithm as an integral part of this new implementation.
   - When Ken writes the original UNIX editor, he bases it on the above.
   - UNIX builds up this idea of a pipeline, so building separate tools
   that connect together make sense and are natural.
   - When Rudd, Doug, Ken, Dennis, *et al* start to develop UNIX - they are
   building a system for *themselves.*
   - One member of the group (Lee McMahon) is using the g/re/p command to
   find things and gets the brilliant idea of a separate tool; grep(1) is
   born.
   - The most important item here is that said team is a group of
   programmers, so it was logical that the system was useful and easy to
   understand by other programmers.
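
For anyone who has not met the ed idiom: g/re/p means "globally, for every
line matching the regular expression re, print it". A minimal sketch of that
behaviour, using Python's re module rather than ed's own regex dialect, with
a hypothetical function name:

```python
import re


def grep(pattern, lines):
    """Return the lines that contain a match for pattern, like ed's g/re/p."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]
```

For example, grep(r'ed$', ["sed", "awk", "ed", "grep"]) keeps only "sed" and
"ed".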


Will asked how people learned about reg-ex.  The answer, of course, is that
it depends.

But if you took college-level CS courses in the late 60s or the 70s, as
Bakul mentioned (I had a similar experience), you were going to be taught
about automata and simple language theory -- likely in your first data
structures and algorithms class, and certainly by the time you took a
compiler course. My memory is that I learned basic automata theory in the
first, but did not see the idea of regular expressions until compilers
[in my case, this is all pre-dragon book].   For those of you who came later
in the 70s, Aho and Ullman's classic text would have exposed you to it.
FWIW: in my daughter's college CS training in the 2000s, she never had to
take a compiler or comparative languages course, but she was taught about
reg-ex in her data structures course.

The key is you were taught a bit about automata theory, but if you really
started to study it, you looked at things like the performance of the
different algorithms.  As Rob says, the key takeaway from learning about
the reg-ex idea is its linear performance.  So, if you were trained in
some of the formal CS ideas, *using reg-ex was not a huge lift*. It was
natural.

That said, if you were coming from other systems using things like SOS or
Teco (like me), they offered search functions too, but not expressions in
the same way.  It was a different way to do things, but people like
me, quickly realized it was a lot more powerful and could do much more. *"Ah
ha .. cool beans, apply something I already knew about in a way I had not
seen before ... next item ..."*

So there are a few things to realize from this.

   1. Adding things like reg-ex to tools like sed(1) and awk(1) was a
   natural follow-on to things like grep(1) and ed(1).
   2. If you were a CS person, it was not a big deal - just the more
   powerful "UNIX-way" as it were. But...
   3. If you came from another world of computing (say DEC or a PC)  where
   such tools were not exposed in a manner that was easy to build upon *and/or
   you had never been taught much of any core CS theory* [which is where
   Will cut his teeth], reg-ex might be astonishing.

So I think it's not a question of why -- it was just how UNIX did things. It
was a natural way for a programmer to express something.

[-- Attachment #2: Type: text/html, Size: 5096 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04  2:03 ` [TUHS] " Marc Rochkind
                     ` (2 preceding siblings ...)
  2024-03-04 12:00   ` Peter Weinberger (温博格) via TUHS
@ 2024-03-04 17:05   ` Will Senn
  2024-03-04 18:43     ` Rich Salz
  3 siblings, 1 reply; 24+ messages in thread
From: Will Senn @ 2024-03-04 17:05 UTC (permalink / raw)
  To: TUHS

[-- Attachment #1: Type: text/plain, Size: 5559 bytes --]

To close the loop a bit...

I really appreciate the anecdotes and background. It's helpful to those 
of us who didn't live it.

On the best resources front:

The Unix Programmer's Manual for v7 contains:
"A Tutorial Introduction to the UNIX Text Editor" by B. W. Kernighan - 
excellent coverage of Context Searching using a limited subset of regex.
"Advanced Editing on UNIX" by B. W. Kernighan - lots of examples.
"ed(1)" by authors of the manpages - super concise but thorough coverage 
of the regex rules (great followup to the tutorial).

Articles:
"Regular Expression Search Algorithm", by K. Thompson - an Algol-60 
implementation of regex described in 4 pages... in 1968... I was 2 1/2.
"Regular Expression Matching Can Be Simple and Fast", by Russ Cox - how 
can an article be both simple and deep? Great concision.

Other Books:
"The AWK Programming Language" by A. V. Aho, B. W. Kernighan, & P. J. 
Weinberger - the discussion on pp. 28-31, Regular Expressions, is the 
best I've seen.

"Chapter 9. Regular Expressions" in the XBD section of the SUS (IEEE 
Std 1003.1-2017) - Comprehensive presentation of the spec (good stuff, 
even if nobody perfectly implements it).

There are plenty more, but with the tutorial, ed(1), and AWK book in 
hand, I think a beginner is covered.

BTW, awk is awesome (particularly with the new csv additions) - I don't 
"need" the new unicode support, but it's nice. I didn't get awk, but 
when I figured out you could do this:

    awk '/SYS.*\(write\,/, /\)/' */*
    SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
                    size_t, count)


in the kernel source, I was sold. I've never really wrapped my head 
around how to efficiently search over multiple lines, awk's range 
patterns... just make sense :). Even if it looks crazy, it works.

ranges bounded by regexes... who'd of thunk it?
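
For comparison, the behaviour of an awk range pattern (/start/,/end/) can be
sketched in a few lines of Python. This is an illustration of the semantics
as I understand them, with a hypothetical function name, using Python's re
module rather than awk's regex dialect:

```python
import re


def range_lines(start, end, lines):
    """Yield lines from one matching start through the next one matching end."""
    start_re, end_re = re.compile(start), re.compile(end)
    inside = False
    for line in lines:
        if not inside and start_re.search(line):
            inside = True
        if inside:
            yield line
            # The end pattern is tested on the same line that opened the range,
            # so a single line can both start and finish a range.
            if end_re.search(line):
                inside = False
```

Fed the kernel source lines from the example above, with the start pattern
SYS.*\(write, and the end pattern \), it yields the two-line SYSCALL_DEFINE3
declaration and then goes back to looking for the next start.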

Will



On 3/3/24 8:03 PM, Marc Rochkind wrote:
> Will, here's my recollection, when I got to UNIX in late 1972 or 
> thereabouts:
>
> First, there was ed. grep and sed were derived from ed, so came along 
> later. awk came along way later.
>
> There were only manual pages. You typed "man ed" and there it was. The 
> man pages were very accurate, very clear, and very authoritative. Many 
> found them too succinct, especially as UNIX got more popular, but all 
> of us back in the day found them perfect. Maybe you had to read the 
> man page a few times to understand it, but at least that's all you had 
> to read. No need to hunt around for more documentation!
>
> (Well, there was more documentation: The source code, which was all 
> online. But reading the ed source to understand regular expressions 
> was impossible. It was in assembler, and Ken was generating code on 
> the fly as the expression was compiled.)
>
> Also, it should be noted that ed produced a single error message: a 
> question mark. No wasting of teletype paper!
>
> The motivation for learning regular expressions was that that's how 
> you edited files. ed was the only game in town.
>
> (sh used a greatly restricted form of regular expressions, which were 
> documented on the sh man page.)
>
> Marc Rochkind
>
> On Sun, Mar 3, 2024 at 6:31 PM Will Senn <will.senn@gmail.com> wrote:
>
>     Hi All,
>
>     I was wondering, what were the best early sources of information
>     for regexes and why did folks need to know them to use unix? In my
>     recent explorations, I have needed to have a better understanding
>     of them, so I'm digging in... awk's my most recent thing and it's
>     deeply associated with them, so here we are. I went to the
>     bookshelf to find something appropriate and as usual, I've traced
>     to primary sources to some extent. I started with Mastering
>     Regular Expressions by Friedl, and I won't knock it (it's one of
>     the bestsellers in our field), but it's much to long for my
>     personal taste and it's not quite as systematic as I would like
>     (the author himself notes that his interests are less technical
>     than authors preceding him on the subject). So, back to the
>     shelves... Bourne's, The Unix Environment, and Kernighan & Pike's,
>     The Unix Programming Evironment both talk about them in the
>     context of grep, ed, sed, and awk. Going further back, the Unix
>     Programmer's Manual v7 - ed, grep, sed, awk...
>
>     After digging around it seems like folks needed regexes for ed,
>     grep, sed and awk... and any other utility that leveraged the
>     wonderful nature of these handy expressions. Fine. Where did folks
>     go learn them? Was there a particularly good (succinct and
>     accurate) source of information that folks kept handy? I'm
>     imagining (based on what I've seen) that someone might cut out the
>     ed discussion or the grep pages of the manual and tape them to
>     their monitors, but maybe I'm stooopid and they didn't need no
>     stinkin' memory device for regexes - surely they're intuitive
>     enough that even a simpleton could pick them up after seeing a few
>     examples... but if that were really the case, Friedl's book would
>     have been a flop and it wasn't :). So seriously, if you remember
>     that far back - what was the definitive source of your regex
>     knowledge and what were the first motivators for learning them?
>
>     Thanks,
>
>     Will
>
>
>
> -- 
> /My new email address is mrochkind@gmail.com/

[-- Attachment #2: Type: text/html, Size: 8513 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04 16:57 ` Clem Cole
@ 2024-03-04 18:38   ` Phil Budne
  0 siblings, 0 replies; 24+ messages in thread
From: Phil Budne @ 2024-03-04 18:38 UTC (permalink / raw)
  To: tuhs

On the subject of learning how to use reg-exs:

For the better part of a decade I worked on a VoIP system whose first
product plan was to replace the POTS network: the CTO had a
candlestick phone (sans dial) in his cube attached to a VoIP ATA
(everyone worked from a cube) to highlight that the telephone UI had
gone from switchboard to dial to number pad, and was past due for
replacement.

The admin UI was a poor stepchild (the UX developer was explicitly
excluded from work on it).  To implement "dial plans" and call
routing, the product had a screen with sed style match and
replacements.
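
To give a flavour of what such a screen asks of its users, here is a minimal
sketch of sed-style rewrite rules applied to a dialed number. The rules,
prefixes and function name are invented for illustration and are not taken
from the product:

```python
import re

# Hypothetical rules: (pattern, replacement) pairs tried in order, in the
# spirit of sed's s/pattern/replacement/.
DIAL_PLAN = [
    (r'^9([0-9]{7})$', r'+1617\1'),  # 9 + seven digits: add a made-up prefix
    (r'^011([0-9]+)$', r'+\1'),      # 011 + digits: international call
]


def route(dialed):
    """Rewrite a dialed string with the first matching rule, or return None."""
    for pattern, replacement in DIAL_PLAN:
        if re.search(pattern, dialed):
            return re.sub(pattern, replacement, dialed)
    return None
```

Under these made-up rules, route("95551234") gives "+16175551234" and
route("411") gives None because no rule matches.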

My involvement with the product started in the second product plan: a
multi-tenant conferencing system in a 1U box, first at a startup, then
after the startup was acquired by Alcatel (which became
Alcatel-Lucent), and finally when ANOTHER startup purchased rights to
maintain the code as critical to their operations.

In that final setting, I ended up in a room with about 30 customer
service representatives, none of whom, I could easily imagine, had ever
taken a C.S. course.  I expressed my amazement at their ability to deal
with the reg-ex interface, and apologized for the fact that they had to
deal with an interface with such sharp edges!

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04 17:05   ` Will Senn
@ 2024-03-04 18:43     ` Rich Salz
  2024-03-04 20:57       ` Bakul Shah via TUHS
  2024-03-04 21:05       ` Steffen Nurpmeso
  0 siblings, 2 replies; 24+ messages in thread
From: Rich Salz @ 2024-03-04 18:43 UTC (permalink / raw)
  To: Will Senn; +Cc: TUHS

[-- Attachment #1: Type: text/plain, Size: 276 bytes --]

On Mon, Mar 4, 2024 at 12:05 PM Will Senn <will.senn@gmail.com> wrote:

> ranges bounded by regexes... who'd of thunk it?
>

Go read about Rob Pike's sam editor (such as
http://doc.cat-v.org/plan_9/4th_edition/papers/sam/) and prepare to have
your mind blown :)

[-- Attachment #2: Type: text/html, Size: 726 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [TUHS] Re: regex early discussions
  2024-03-04 18:43     ` Rich Salz
@ 2024-03-04 20:57       ` Bakul Shah via TUHS
  2024-03-04 21:05       ` Steffen Nurpmeso
  1 sibling, 0 replies; 24+ messages in thread
From: Bakul Shah via TUHS @ 2024-03-04 20:57 UTC (permalink / raw)
  To: Rich Salz; +Cc: Will Senn, TUHS

[-- Attachment #1: Type: text/html, Size: 1357 bytes --]

* [TUHS] Re: regex early discussions
  2024-03-04 18:43     ` Rich Salz
  2024-03-04 20:57       ` Bakul Shah via TUHS
@ 2024-03-04 21:05       ` Steffen Nurpmeso
  1 sibling, 0 replies; 24+ messages in thread
From: Steffen Nurpmeso @ 2024-03-04 21:05 UTC (permalink / raw)
  To: Rich Salz; +Cc: Will Senn, TUHS

Rich Salz wrote in
 <CAFH29to+PPZLhazvcnuMQ7L_mKC5fZTTMGya5yyj=b+HjosyqQ@mail.gmail.com>:
 |On Mon, Mar 4, 2024 at 12:05 PM Will Senn <will.senn@gmail.com> wrote:
 |> ranges bounded by regexes... who'd of thunk it?
 |>
 |
 |Go read about Rob Pike's sam editor (such as
 |http://doc.cat-v.org/plan_9/4th_edition/papers/sam/) and prepare to have
 |your mind blown :)

I wanted to point to that.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

end of thread, other threads:[~2024-03-04 21:33 UTC | newest]

Thread overview: 24+ messages
2024-03-04  1:30 [TUHS] regex early discussions Will Senn
2024-03-04  2:03 ` [TUHS] " Marc Rochkind
2024-03-04  3:38   ` Larry McVoy
2024-03-04  4:18     ` Rich Salz
2024-03-04  7:51     ` Alec Muffett
2024-03-04  8:17     ` Rob Pike
2024-03-04  8:43       ` Alec Muffett
2024-03-04 14:25         ` Jan Schaumann via TUHS
2024-03-04 10:21       ` Bakul Shah via TUHS
2024-03-04 14:34     ` Larry McVoy
2024-03-04  7:10   ` Otto Moerbeek via TUHS
2024-03-04  7:19     ` Dave Long
2024-03-04  7:25       ` arnold
2024-03-04 12:05         ` Ralph Corderoy
2024-03-04 13:01           ` arnold
2024-03-04  7:25     ` Otto Moerbeek via TUHS
2024-03-04 12:00   ` Peter Weinberger (温博格) via TUHS
2024-03-04 17:05   ` Will Senn
2024-03-04 18:43     ` Rich Salz
2024-03-04 20:57       ` Bakul Shah via TUHS
2024-03-04 21:05       ` Steffen Nurpmeso
2024-03-04 13:17 ` Alan D. Salewski
2024-03-04 16:57 ` Clem Cole
2024-03-04 18:38   ` Phil Budne
