From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <1mR9bE-8RG-00@marmaro.de>
From: Rob Pike
Date: Fri, 17 Sep 2021 19:32:28 +1000
To: markus schnalke
Cc: TUHS main list
Subject: Re: [TUHS] RegExp decision for meta characters: Circumflex
List-Id: The Unix Heritage Society mailing list
Sender: "TUHS"
Content-Type: multipart/alternative; boundary="000000000000cb0ce905cc2d9cb9"

--000000000000cb0ce905cc2d9cb9
Content-Type: text/plain; charset="UTF-8"

*NOT* the same. Sorry....

I hope the example explains better than my prose.

-rob


On Fri, Sep 17, 2021 at 7:32 PM Rob Pike wrote:

> You'd have to ask ken why he chose the characters he did, but I can answer
> the second question. The beginning and end of line are the same. If you
> make ^ mean both beginning and end of line, what does this ed command do:
>
> s/^/x/
>
> Which end gets the x?
>
> -rob
>
>
> On Fri, Sep 17, 2021 at 7:00 PM markus schnalke wrote:
>
>> Hoi,
>>
>> I'm interested in the early design decisions for meta characters
>> in REs, mainly regarding Ken's RE implementation in ed.
>>
>> Two questions:
>>
>> 1) Circumflex
>>
>> As far as I see, the circumflex (^) is the only meta character that
>> has two different special meanings in REs: First being the
>> beginning of line anchor and second inverting a character class.
>> Why was it chosen for the second one? Why not the exclamation mark
>> in that case? (Sure, C didn't exist by then, but the bang probably
>> was used to negate in other languages of the time, I think.)
>>
>> 2) Symbol for the end of line anchor
>>
>> What is the reason that the beginning of line and end of line
>> anchors are different symbols? Is there a reason why not only one
>> symbol, say the circumflex, was chosen to represent both? I
>> currently see no disadvantages of such a design. (Circumflexes
>> aren't likely to end lines of text, neither.)
>>
>> I would appreciate if you could help me understand these design
>> decisions better. Maybe there existed RE notations that were simply
>> copied ...
>>
>>
>> meillo
>>
>

--000000000000cb0ce905cc2d9cb9
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

*NOT* the same. Sorry....

I hope the example explains better than my prose.

-rob


On Fri, Sep 17, 2021 at 7:32 PM Rob Pike <robpike@gmail.com> wrote:
You'd have to ask ken why he chose the characters he did, but I can answer the second question. The beginning and end of line are the same. If you make ^ mean both beginning and end of line, what does this ed command do:

s/^/x/

Which end gets the x?

-rob
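
The ambiguity behind `s/^/x/` can be made concrete with a minimal sketch. It is not from the thread: it uses Python's `re` module as a stand-in for ed's substitute command, and `candidate_positions` is a hypothetical helper that only lists the places a single "both ends" anchor could match. With two distinct anchors each substitution has exactly one target; with one shared anchor a non-empty line offers two.

```python
import re

line = "hello"

# Distinct anchors: each substitution has exactly one possible target.
at_start = re.sub(r"^", "x", line)  # insert at the beginning of the line
at_end = re.sub(r"$", "x", line)    # insert at the end of the line

print(at_start)  # xhello
print(at_end)    # hellox


def candidate_positions(text):
    """Hypothetical helper: zero-width positions that a single anchor
    meaning both 'beginning' and 'end' of line could match."""
    return sorted({0, len(text)})


# A shared anchor leaves two candidates on any non-empty line,
# so a substitute command could not tell which end to edit.
print(candidate_positions(line))  # [0, 5]
print(candidate_positions(""))    # [0]
```

Only on an empty line do the two positions coincide, which is the one case where a shared anchor would be harmless.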


On Fri, Sep 17, 2021 at 7:00 PM markus schnalke <meillo@marmaro.de> wrote:
Hoi,

I'm interested in the early design decisions for meta characters
in REs, mainly regarding Ken's RE implementation in ed.

Two questions:

1) Circumflex

As far as I see, the circumflex (^) is the only meta character that
has two different special meanings in REs: First being the
beginning of line anchor and second inverting a character class.
Why was it chosen for the second one? Why not the exclamation mark
in that case? (Sure, C didn't exist by then, but the bang probably
was used to negate in other languages of the time, I think.)

2) Symbol for the end of line anchor

What is the reason that the beginning of line and end of line
anchors are different symbols? Is there a reason why not only one
symbol, say the circumflex, was chosen to represent both? I
currently see no disadvantages of such a design. (Circumflexes
aren't likely to end lines of text, neither.)

I would appreciate if you could help me understand these design
decisions better. Maybe there existed RE notations that were simply
copied ...


meillo
--000000000000cb0ce905cc2d9cb9--