From: Bart Schaefer
Date: Mon, 4 Jul 2022 12:15:47 -0700
Subject: Re: Extending regexes
To: Peter Stephenson
Cc: Zsh hackers list
X-Seq: 50402

On Mon, Jul 4, 2022 at 6:53 AM Peter Stephenson wrote:
>
> > On 04 July 2022 at 13:03 Sebastian Gniazdowski wrote:
> > Zsh has extensions to regular regexes - the ~ and ^ negations.
>
> You're quite right both that they're very useful in zsh and there's nothing
> like this in normal regular expressions, but unfortunately I've got a strong
> feeling this is a big can of worms [hope that image is graphic enough that
> I don't need to explain the phrase for non-native English speakers].

In particular, these no longer fit the formal definition of "regular".
PWS correct me if I go too far astray, but (^Y) is internally (*~Y)
and (X~Y) is implemented by first matching (X) and then removing
anything that matches (Y) ... which is where the regular-ness goes
astray. My formal training on this is more than a little rusty, but I
believe this means chaining together two finite-state machines rather
than building a single one.

On Mon, Jul 4, 2022 at 5:06 AM Sebastian Gniazdowski wrote:
>
> I think that regexes look pretty limited from this point of view and that pcre extensions went wrong path with the look forward and behind semantics.

Note that of course "pcre" stands for "perl-compatible RE" so you can
find the justifications for look-{ahead,behind} in the history of perl
development. Again, a long time ago, but my recollection is that the
reason "lookaround assertions" are zero-width elements is to preserve
the finite-state semantics. Please take that with 30 years worth of
salt grains (a less self-explanatory idiom than Peter's, I fear).
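
For illustration only (this is not taken from the zsh source), the
two-stage reading of (X~Y) described above can be sketched in Python.
The helper names match_except and match_not are hypothetical, and
Python's fnmatch merely stands in for zsh's own pattern matcher, so
details such as "/" handling or dot files will differ from real zsh
globbing.

```python
from fnmatch import fnmatchcase


def match_except(name: str, x: str, y: str) -> bool:
    """Sketch of (X~Y): accept name if it matches X and does not match Y.

    Two separate matchers are run one after the other; nothing here
    builds a single combined automaton.
    """
    return fnmatchcase(name, x) and not fnmatchcase(name, y)


def match_not(name: str, y: str) -> bool:
    """Sketch of (^Y), treated as (*~Y): anything that does not match Y."""
    return match_except(name, "*", y)


names = ["main.c", "main.o", "util.c", "README"]

# Roughly what *.c~main* would select from this list.
print([n for n in names if match_except(n, "*.c", "main*")])  # ['util.c']

# Roughly what ^*.c would select from this list.
print([n for n in names if match_not(n, "*.c")])  # ['main.o', 'README']
```

The point of the sketch is only that the exclusion is a second pass over
the result of the first match, which is the "two machines chained
together" picture rather than one pattern compiled as a whole.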