* [TUHS] RegExp decision for meta characters: Circumflex
From: Douglas McIlroy @ 2021-09-17 16:40 UTC (permalink / raw)
To: TUHS main list
> Maybe there existed RE notations that were simply copied ...
Ed was derived from Ken's earlier qed. Qed's descendant in Multics was
described in a 1969 GE document:
http://www.bitsavers.org/pdf/honeywell/multics/swenson/6906.multics-condensed-guide.pdf.
Unfortunately it describes regular expressions only sketchily by
example. However, alternation, symbolized by | with grouping by
parentheses, was supported in qed, whereas alternation was omitted
from ed. The GE document does not mention character classes; an
example shows how to use alternation for the same purpose.
Beginning-of-line is specified by a logical-negation symbol. In
apparent contradiction, the v1 manual says the meanings of [ and ^ are
the same in ed and (an unspecified version of) qed. My guess about the
discrepancies is no better than yours.
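As a rough illustration of how alternation can stand in for a character class, here is a minimal sketch in modern Python `re` syntax. This is an assumption for illustration only: it is not the notation of qed or ed, and the sample text is made up.

```python
import re

# Hypothetical sample text; modern Python syntax, not qed's or ed's.
text = "cat cot cut cit"

# Character class: any one of a, o, u between 'c' and 't'.
with_class = re.findall(r"c[aou]t", text)

# Alternation with grouping selects the same set of single characters.
with_alternation = re.findall(r"c(?:a|o|u)t", text)

print(with_class)
print(with_alternation)
```

Both patterns select the same words, which is why a notation with alternation and grouping can get by without character classes, at the cost of more typing.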
(I am amused by the title "condensed guide" for a manual in which each
qed request gets a full page of explanation. It exemplifies how Unix
split from Multics in matters of taste.)
Doug
* Re: [TUHS] RegExp decision for meta characters: Circumflex
From: Chris Torek @ 2021-09-17 20:40 UTC (permalink / raw)
To: tuhs
Also worth noting, though the precise history predates my own
experience: it's common in grammar theory to use `$` as the end
symbol. Was this from REs using `$` as an end symbol as well,
or did REs adopt `$` from here, or ...?
Chris
* Re: [TUHS] RegExp decision for meta characters: Circumflex
From: Greg 'groggy' Lehey @ 2021-09-18 1:03 UTC (permalink / raw)
To: Chris Torek; +Cc: tuhs
On Friday, 17 September 2021 at 13:40:25 -0700, Chris Torek wrote:
> Also worth noting, though the precise history predates my own
> experience: it's common in grammar theory to use `$` as the end
> symbol. Was this from REs using `$` as an end symbol as well,
> or did REs adopt `$` from here, or ...?
Weren't there programming languages that used $ as a statement
terminator instead of ;?
Greg
* Re: [TUHS] RegExp decision for meta characters: Circumflex
From: Bakul Shah @ 2021-09-18 1:23 UTC (permalink / raw)
To: Greg 'groggy' Lehey; +Cc: tuhs
On Sep 17, 2021, at 6:03 PM, Greg 'groggy' Lehey <grog@lemis.com> wrote:
>
> On Friday, 17 September 2021 at 13:40:25 -0700, Chris Torek wrote:
>> Also worth noting, though the precise history predates my own
>> experience: it's common in grammar theory to use `$` as the end
>> symbol. Was this from REs using `$` as an end symbol as well,
>> or did REs adopt `$` from here, or ...?
>
> Weren't there programming languages that used $ as a statement
> terminator instead of ;?
IIRC, Macsyma accepted both ; and $ as statement terminators:
; if you wanted to print the result of an expression,
$ if you wanted to suppress it. But I suspect
that is not related to this.
* Re: [TUHS] RegExp decision for meta characters: Circumflex
From: markus schnalke @ 2021-09-17 10:10 UTC (permalink / raw)
To: Rob Pike; +Cc: TUHS main list
Hoi.
[2021-09-17 11:32] Rob Pike <robpike@gmail.com>
>
> You'd have to ask ken why he chose the characters he did, but I can answer the
> second question. The beginning and end of line are the same. If you make ^ mean
> both beginning and end of line, what does this ed command do:
>
> s/^/x/
>
> Which end gets the x?
Perfect answer! I just never thought about replacing. *oops*
Now that's obvious. ;-)
meillo
* Re: [TUHS] RegExp decision for meta characters: Circumflex
From: Rob Pike @ 2021-09-17 9:32 UTC (permalink / raw)
To: markus schnalke; +Cc: TUHS main list
*NOT* the same. Sorry....
I hope the example explains better than my prose.
-rob
On Fri, Sep 17, 2021 at 7:32 PM Rob Pike <robpike@gmail.com> wrote:
> You'd have to ask ken why he chose the characters he did, but I can answer
> the second question. The beginning and end of line are the same. If you
> make ^ mean both beginning and end of line, what does this ed command do:
>
> s/^/x/
>
> Which end gets the x?
>
> -rob
* Re: [TUHS] RegExp decision for meta characters: Circumflex
From: Rob Pike @ 2021-09-17 9:32 UTC (permalink / raw)
To: markus schnalke; +Cc: TUHS main list
You'd have to ask ken why he chose the characters he did, but I can answer
the second question. The beginning and end of line are the same. If you
make ^ mean both beginning and end of line, what does this ed command do:
s/^/x/
Which end gets the x?
-rob
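To make the substitution example concrete, here is a minimal sketch using Python's `re.sub` as a stand-in for ed's `s` command (an assumption for illustration; ed itself is not involved, and the sample line is made up). With two distinct anchors, each substitution names exactly one end of the line:

```python
import re

# Hypothetical sample line.
line = "hello"

# '^' matches the empty string at the start of the line,
# so the replacement is inserted in front.
at_start = re.sub(r"^", "x", line)

# '$' matches the empty string at the end of the line,
# so the replacement is appended.
at_end = re.sub(r"$", "x", line)

print(at_start)  # xhello
print(at_end)    # hellox
```

If a single symbol denoted both positions, a lone-anchor pattern like the one in `s/^/x/` could not say which of these two results is wanted.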
On Fri, Sep 17, 2021 at 7:00 PM markus schnalke <meillo@marmaro.de> wrote:
> Hoi,
>
> I'm interested in the early design decisions for meta characters
> in REs, mainly regarding Ken's RE implementation in ed.
>
> Two questions:
>
> 1) Circumflex
>
> As far as I can see, the circumflex (^) is the only meta character that
> has two different special meanings in REs: first as the
> beginning-of-line anchor and second as the negation of a character class.
> Why was it chosen for the second one? Why not the exclamation mark
> in that case? (Sure, C didn't exist by then, but the bang was probably
> used to negate in other languages of the time, I think.)
>
> 2) Symbol for the end of line anchor
>
> What is the reason that the beginning-of-line and end-of-line
> anchors are different symbols? Is there a reason why a single
> symbol, say the circumflex, was not chosen to represent both? I
> currently see no disadvantages of such a design. (Circumflexes
> aren't likely to end lines of text, either.)
>
> I would appreciate it if you could help me understand these design
> decisions better. Maybe there existed RE notations that were simply
> copied ...
>
>
> meillo
>
* [TUHS] RegExp decision for meta characters: Circumflex
From: markus schnalke @ 2021-09-17 8:52 UTC (permalink / raw)
To: tuhs
Hoi,
I'm interested in the early design decisions for meta characters
in REs, mainly regarding Ken's RE implementation in ed.
Two questions:
1) Circumflex
As far as I can see, the circumflex (^) is the only meta character that
has two different special meanings in REs: first as the
beginning-of-line anchor and second as the negation of a character class.
Why was it chosen for the second one? Why not the exclamation mark
in that case? (Sure, C didn't exist by then, but the bang was probably
used to negate in other languages of the time, I think.)
2) Symbol for the end of line anchor
What is the reason that the beginning-of-line and end-of-line
anchors are different symbols? Is there a reason why a single
symbol, say the circumflex, was not chosen to represent both? I
currently see no disadvantages of such a design. (Circumflexes
aren't likely to end lines of text, either.)
I would appreciate it if you could help me understand these design
decisions better. Maybe there existed RE notations that were simply
copied ...
meillo
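For readers unfamiliar with the two meanings of the circumflex raised in question 1, here is a minimal sketch in Python's `re` syntax (an assumption for illustration; the word list is made up, and Python's behavior is used only as a modern example of the same convention):

```python
import re

# Hypothetical sample words.
lines = ["apple", "banana", "avocado"]

# Outside brackets, '^' anchors the match at the beginning of the line.
starts_with_a = [s for s in lines if re.search(r"^a", s)]

# As the first character inside brackets, '^' negates the class:
# '^[^a]' means "a first character that is anything but 'a'".
starts_with_non_a = [s for s in lines if re.search(r"^[^a]", s)]

print(starts_with_a)
print(starts_with_non_a)
```

The second pattern uses both meanings at once: the leading `^` is the anchor, and the `^` inside the brackets is the negation.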