From: Grant Taylor via COFF
To: coff@tuhs.org
Date: Tue, 7 Mar 2023 11:31:55 -0700
Subject: [COFF] Re: Requesting thoughts on extended regular expressions in grep.

On 3/7/23 4:39 AM, Ralph Corderoy wrote:
> Readable to you, which is fine because you're the prime future
> reader.
> But it's less readable than the regexp to those that know
> and read them because of the indirection introduced by the variables.
> You've created your own little language of CAPITALS rather than the
> lingua franca of regexps.  :-)

I want to agree, but then I run into things like this:

^\w{3} [ :[:digit:]]{11} [._[:alnum:]-]+ postfix(/smtps)?/smtpd\[[[:digit:]]+\]: disconnect from [._[:alnum:]-]+\[[.:[:xdigit:]]+\]( helo=[[:digit:]]+(/[[:digit:]]+)?)?( ehlo=[[:digit:]]+(/[[:digit:]]+)?)?( starttls=[[:digit:]]+(/[[:digit:]]+)?)?( auth=[[:digit:]]+(/[[:digit:]]+)?)?( mail=[[:digit:]]+(/[[:digit:]]+)?)?( rcpt=[[:digit:]]+(/[[:digit:]]+)?)?( data=[[:digit:]]+(/[[:digit:]]+)?)?( bdat=[[:digit:]]+(/[[:digit:]]+)?)?( rset=[[:digit:]]+(/[[:digit:]]+)?)?( noop=[[:digit:]]+(/[[:digit:]]+)?)?( quit=[[:digit:]]+(/[[:digit:]]+)?)?( unknown=[[:digit:]]+(/[[:digit:]]+)?)?( commands=[[:digit:]]+(/[[:digit:]]+)?)?$

Which is produced by this m4:

define(`DAEMONPID', `$1\[DIGITS\]:')dnl
define(`DATE', `\w{3} [ :[:digit:]]{11}')dnl
define(`DIGIT', `[[:digit:]]')dnl
define(`DIGITS', `DIGIT+')dnl
define(`HOST', `[._[:alnum:]-]+')dnl
define(`HOSTIP', `HOST\[IP\]')dnl
define(`IP', `[.:[:xdigit:]]+')dnl
define(`VERB', `( $1=DIGITS`'(/DIGITS)?)?')dnl
^DATE HOST DAEMONPID(`postfix(/smtps)?/smtpd') disconnect from HOSTIP`'VERB(`helo')VERB(`ehlo')VERB(`starttls')VERB(`auth')VERB(`mail')VERB(`rcpt')VERB(`data')VERB(`bdat')VERB(`rset')VERB(`noop')VERB(`quit')VERB(`unknown')VERB(`commands')$

I only consider myself to be an /adequate/ m4 user.  Though I've done
some things that are arguably creating new languages.

I personally find the generated regular expression to be onerous to read
and understand, much less modify.
I would be highly dependent on my editor's (vim's) parenthesis /
square bracket matching (%) capability and / or would need to explode
the RE into multiple components on multiple lines to have a hope of
accurately understanding or modifying it.

Conversely I think that the m4 is /largely/ find and replace with a
little syntactic sugar around the definitions.

I also think that anyone who understands regular expressions and the
concept of find & replace is likely to be able to recognize the
patterns -- as in "VERB(...)" corresponds to
"( $1=DIGITS`'(/DIGITS)?)?", "DIGITS" corresponds to "DIGIT+", and
"DIGIT" corresponds to "[[:digit:]]".

There seems to be a point between simple REs w/o any supporting
constructors and complex REs with supporting constructors where I think
it is better to have the constructors.  Especially when duplication
comes into play.

If nothing else, the constructors are likely to reduce one-off typo
errors.  The typo will either be everywhere the constructor was used, or
similarly be fixed everywhere at the same time.  Conversely, finding an
unmatched parenthesis or square bracket in the RE above will be annoying
at best and daunting at worst.

> Each time the original language was readable because practitioners
> had to read and write it.  When its replacement came along, the old
> skill was no longer learnt and the language became ‘unreadable’.

I feel like there is an analogy between machine code and assembly
language as well as assembly language and higher level languages.

My understanding is that the computer industry has largely agreed that
higher level languages are easier to understand and maintain.

> ‘{1}’ is redundant.

That may very well be.  But which will be more maintainable / easier to
correct in the future: adding `{2}` when necessary or changing the value
of `1` to `2`?
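
For illustration only, here is a minimal sketch of the same constructor
idea in Python instead of m4.  Several things in it are assumptions and
not part of the m4 above: Python's re module has no POSIX bracket
classes such as [[:digit:]], so rough ASCII ranges stand in for them;
the repeat() helper is hypothetical and only exists to show an explicit
count; and the sample log line is made up.

```python
import re


def repeat(atom, count=1):
    """Hypothetical helper: always spell the count out, so a later
    change is an edit of 1 to 2 rather than the addition of {2}."""
    return atom + "{" + str(count) + "}"


# Named building blocks mirroring the m4 definitions.
# ASCII ranges replace the POSIX classes (an assumption, see above).
DIGIT = "[0-9]"
DIGITS = DIGIT + "+"
DATE = repeat(r"\w", 3) + " " + repeat("[ :0-9]", 11)
HOST = "[._0-9A-Za-z-]+"
IP = "[.:0-9A-Fa-f]+"
HOSTIP = HOST + r"\[" + IP + r"\]"


def daemonpid(name):
    """Like DAEMONPID: a daemon name followed by '[pid]:'."""
    return name + r"\[" + DIGITS + r"\]:"


def verb(name):
    """Like VERB: an optional ' name=N' or ' name=N/M' counter."""
    return "( " + name + "=" + DIGITS + "(/" + DIGITS + ")?)?"


VERBS = ["helo", "ehlo", "starttls", "auth", "mail", "rcpt", "data",
         "bdat", "rset", "noop", "quit", "unknown", "commands"]

PATTERN = ("^" + DATE + " " + HOST + " "
           + daemonpid("postfix(/smtps)?/smtpd")
           + " disconnect from " + HOSTIP
           + "".join(verb(v) for v in VERBS) + "$")

DISCONNECT = re.compile(PATTERN)

# Made-up log line, used only to exercise the pattern.
SAMPLE = ("Mar  7 11:31:55 mail.example.net postfix/smtpd[12345]: "
          "disconnect from client.example.org[192.0.2.10] "
          "ehlo=1 mail=1 rcpt=0/1 quit=1 commands=3/4")

if __name__ == "__main__":
    print(PATTERN)
    print(bool(DISCONNECT.match(SAMPLE)))
```

A typo in verb() would show up in all thirteen counters at once and be
fixed in all of them with one edit, which is the same property the m4
VERB macro has.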

I think this is an example of a tradeoff: doing something that is not
strictly required in order to make things more maintainable down the
road.  Sort of like fleet vehicles vs non-fleet vehicles.

> BTW, ‘{0,1}’ is more readable to those who know regexps as ‘?’.

I think this is another example of the maintainability tradeoff.

> I'm sending this to just the list.

I'm also replying to only the COFF mailing list.

> Perhaps your account on the list is configured to not send you an
> email if it sees your address in the header's fields.

There is a reasonable chance that the COFF mailing list and / or your
account therein is configured to minimize duplicates, meaning the COFF
mailing list won't send you a copy if it sees your subscribed address as
receiving a copy directly.

I personally always prefer the mailing list copy and shun the direct
copies.  I think that the copy from the mailing list keeps the
discussion on the mailing list and avoids accidental replies bypassing
the mailing list.

-- 
Grant. . . .
unix || die