From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 15615 invoked from network); 1 Aug 2020 00:03:03 -0000 Received: from minnie.tuhs.org (45.79.103.53) by inbox.vuxu.org with ESMTPUTF8; 1 Aug 2020 00:03:03 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id 2A43C9CAA7; Sat, 1 Aug 2020 10:03:02 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id 579209CB3E; Sat, 1 Aug 2020 10:02:02 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=pass (2048-bit key; unprotected) header.d=iitbombay-org.20150623.gappssmtp.com header.i=@iitbombay-org.20150623.gappssmtp.com header.b="Ie5NhW2d"; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id E02A19C9E3; Sat, 1 Aug 2020 10:02:00 +1000 (AEST) Received: from mail-pl1-f179.google.com (mail-pl1-f179.google.com [209.85.214.179]) by minnie.tuhs.org (Postfix) with ESMTPS id B1CF59CB84 for ; Sat, 1 Aug 2020 10:01:59 +1000 (AEST) Received: by mail-pl1-f179.google.com with SMTP id o1so18155993plk.1 for ; Fri, 31 Jul 2020 17:01:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=iitbombay-org.20150623.gappssmtp.com; s=20150623; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=tkgACt2smkhiC7QEbr4V3K0+WPTL6BxWbgA6w0mQAcQ=; b=Ie5NhW2d2tkLfgqVYCG7t5yMu4vfH/SE0nZDKgnOox0txGNrlG8sc0jWqeJKWNEgTn rIXkzGDTgIVTB9AbO8dmYr1Da0Fflw31lEYNI6hVzF7XzLTnMAEbf8Ar1sXANcBX0byj ObtZ0fcYwp6sRr514Lou7h5TvX6oT+QaoksL8w7a7C/VF4Z93dJOMxQD2rD7nGLtmAE8 oaD4su8KLWxrA4uyiaIAHpOS/US2tag6yvX2QsobO8VfnER9KM2WQqk42fAyPULfr46U E9sOt2UN4CtkVf9v+Q/GW9o2UNbHWDVrqd+xdjtDeW1ndPm2qwqf/fL2uet3MRVWMR2h vj5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=tkgACt2smkhiC7QEbr4V3K0+WPTL6BxWbgA6w0mQAcQ=; b=OwRrLAQM4sYmqHiql/wACVo/JhCdKq2/keuIbhyEzYdlCMrtrt6bA/I5aIwGyYiw8a 11F7H4uh24dti2d+FCH7PSKW3krR6BvEUrulJQnHKx/HkAFjvkcoX0+rEbe5aToDTz/i cFKNIus1xeTh/7TUqys2bKA3JP0h9T2GthjCSH3a41dNpDHg9ieYUCsvzegLS4bvDWvR Miz2KZADEhdxPwV9qh6xdGR/3wGX2tAK675qzwvwabmoVcqMUibVMC+gPBNGvrqoaTZY l0gAYYd5LCPoCgr6IE6P9R62Js8lshnSNsHBBJsSrbL7in/si6L8TjYrXJoi3B2tksXA 2CZA== X-Gm-Message-State: AOAM533e++JiKJ0e0mV5bBB/EnD11wwbmBXaaYG+kU5fJ0mk7iVNQa8U MDmTNmB9Y407Ubt8yq9/5lCp6qA6LSI= X-Google-Smtp-Source: ABdhPJwPCfVPujR3KgHuBy9NuySbqlkwNU65t1cm81S1gum48krg+T+DWmSIALIdino1GxZ/DBD4Qw== X-Received: by 2002:a17:90b:350f:: with SMTP id ls15mr6331775pjb.84.1596240119116; Fri, 31 Jul 2020 17:01:59 -0700 (PDT) Received: from 107-215-223-226.lightspeed.sntcca.sbcglobal.net (107-215-223-226.lightspeed.sntcca.sbcglobal.net. [107.215.223.226]) by smtp.gmail.com with ESMTPSA id g18sm11611320pfk.40.2020.07.31.17.01.57 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 31 Jul 2020 17:01:57 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) From: Bakul Shah In-Reply-To: <6e9ca056-dfb0-376d-effd-e41c9ed3ef2a@gmail.com> Date: Fri, 31 Jul 2020 17:01:56 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <402F66B1-12C1-4FED-89FD-38AB99D919B4@iitbombay.org> References: <6e9ca056-dfb0-376d-effd-e41c9ed3ef2a@gmail.com> To: Will Senn X-Mailer: Apple Mail (2.3445.104.15) Subject: Re: [TUHS] Regular Expressions X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: TUHS main list Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" On Jul 31, 2020, at 3:57 PM, Will Senn wrote: >=20 > I've always been intrigued with regexes. When I was first exposed to = them, I was mystified and lost in the greediness of matches. Now, I use = them regularly, but still have trouble using them. I think it is because = I don't really understand how they work. > ... > 1. What's the provenance of regex in unix (when did it appear, in what = form, etc)? > 2. What are the 'best' implementations throughout unix (keep it pre = 1980s)? > 3. What are some of the milestones along the way (major changes, = forks, disagreements)? > 4. Where, in the source, or in a paper, would you point someone to = wanting to better understand the mechanics of regex? Start here: https://en.wikipedia.org/wiki/Thompson%27s_construction [I learned about regular expressions in an automata theory class,=20 before I knew anything about Unix. What helped me was learning about finite state machines. You won't need more than paper and pencil to construct one. Reading source code would make more sense once you grasp how to construct a FSM corresponding to a RE.]=