Date: Wed, 6 Jul 2022 19:07:40 -0400
From: Phil Pennock
To: zsh-workers@zsh.org
Subject: Re: Extending regexes
X-Seq: 50406

On 2022-07-04 at 14:03 +0200, Sebastian Gniazdowski wrote:
> Zsh has extensions to regular regexes - the ~ and ^ negations. They, as it
> can be expected from negations that are required by Turing universal
> machines, introduce a whole new universe of computations over standard
> regular expressions.
> For example matching in an AND fashion:

For clarity: zsh has long had the module zsh/pcre, providing -pcre-match;
when the =~ regexp matching operator was added, we deliberately chose to
add a module zsh/regex to use the system ERE libraries with -regex-match,
and made that the default implementation behind the =~ operator.

If you're getting PCRE semantics, then probably somewhere in your startup
files you have something like `setopt re_match_pcre`.

A while back I wrote some bindings for using the RE2 library, which
matches the efficient regexps found in Go and which is licensed such that
more vendors might enable it by default with zsh.

I stopped while trying to puzzle out how to dig myself out of a hole of my
own making: having made `RE_MATCH_PCRE` a simple boolean.

My _tentative_ thinking, which I'd appreciate feedback on, is to introduce
a new special parameter, `ZSH_EQTILDE_ENGINE` or somesuch; have assignment
to it succeed only when given a parseable value, and make mutations of the
`RE_MATCH_PCRE` option be implicit assignments of `regex` or `pcre` to that
parameter.

Is this sane? Are we happy introducing new special parameters, as long as
the name starts with `ZSH_`? Should the semantics just be "name of a
module" or a static list? If "name of a module" then that would let people
do more than just use our engines (at their own risk), but should we then
update the .mdd files or the exported tables with some new identifier to
mark "use this function to back =~ when the engine points here"?

I would quite like to move towards being able to expect "better, but sane"
REs to be available, even with commercial OS vendor builds of zsh. I think
RE2 is probably the best way forward, but ... I should probably have asked
long ago for advice on the design decisions which need to be made.

-Phil
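
To make the tentative `ZSH_EQTILDE_ENGINE` idea concrete, here is a rough model of the semantics, written in Python only because the parameter does not exist in zsh. Everything in it is an assumption for illustration: the `Shell` class, the method names, the static list of engines, and the choice to read `RE_MATCH_PCRE` back as "engine is pcre" are all hypothetical. Only the option name and the values `regex` and `pcre` come from the proposal.

```python
# Hypothetical model of the proposed semantics. This is not zsh code and
# not an existing API: it only shows how the boolean option and the new
# parameter could stay in sync.

# Assumed "static list" variant of the design; an RE2 engine would be
# added here if such bindings were merged.
VALID_ENGINES = {"regex", "pcre"}


class Shell:
    def __init__(self):
        # zsh/regex (system ERE) is the default implementation behind =~.
        self._engine = "regex"

    @property
    def eqtilde_engine(self):
        return self._engine

    @eqtilde_engine.setter
    def eqtilde_engine(self, value):
        # Assignment succeeds only for a value we can parse;
        # anything else is rejected and the old value is kept.
        if value not in VALID_ENGINES:
            raise ValueError(f"unknown =~ engine: {value!r}")
        self._engine = value

    @property
    def re_match_pcre(self):
        # Assumption: the legacy boolean is derived from the parameter.
        return self._engine == "pcre"

    def setopt_re_match_pcre(self):
        # `setopt re_match_pcre` becomes an implicit assignment of "pcre".
        self.eqtilde_engine = "pcre"

    def unsetopt_re_match_pcre(self):
        # `unsetopt re_match_pcre` becomes an implicit assignment of "regex".
        self.eqtilde_engine = "regex"
```

In this model a third engine leaves `RE_MATCH_PCRE` reading as false, and `unsetopt re_match_pcre` would reset any engine to `regex`. Whether that is the right behaviour is one of the open design questions above. The "name of a module" variant would replace the static set with a lookup of whatever the named module exports.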