From: Sebastian Gniazdowski
Date: Sat, 28 Dec 2019 20:04:21 +0100
Subject: Re: [Bug] S-flag imposes non-greedy match where it shouldn't
To: Daniel Shahaf
Cc: Zsh hackers list
List-Id: Zsh Workers List
X-Seq: 45150
References: <1a130b2e-5824-4b7a-8510-2b1d0b3fdac5@www.fastmail.com> <20191227052923.yal2nnmxdxfgvfkr@tarpaulin.shahaf.local2>
In-Reply-To: <20191227052923.yal2nnmxdxfgvfkr@tarpaulin.shahaf.local2>

On Fri, 27 Dec 2019 at 06:30, Daniel Shahaf wrote:
>
> Sebastian Gniazdowski wrote on Thu, Dec 26, 2019 at 19:35:05 +0100:
> > +++ b/Doc/Zsh/expn.yo
> > @@ -1399,6 +1399,20 @@ from the beginning and with tt(%) start from the end of the string.
> > With substitution via tt(${)...tt(/)...tt(}) or
> > tt(${)...tt(//)...tt(}), specifies non-greedy matching, i.e. that the
> > shortest instead of the longest match should be replaced.
> > +The substring search means that the pattern is matched skipping the
> > +parts of the input string starting from the direction set by the use
> > +of tt(#) or tt(%).
>
> I don't understand this sentence.  What does "skipping" mean?

It means that the parts of the string that don't match are skipped
while moving towards the other end. Does the sentence need an update?

> > +For example, to match a pattern starting from the
> > +end, one could use:
> > +
> > +example(str="abcXXXdefXXXghi"
> > +out=${(S)str%%(#b)([^X])X##}
> > +out=$out${match[1]}
> > +)
> > +
> > +The result is tt(abcXXXdefghi).
>
> That's not correct.  The output is abcXXXdefXXXghi (in 'zsh -f') or
> abcXXXdeghif (with extendedglob set), but not abcXXXdefghi.

I sent an updated patch half an hour before your email. It contains the
correct example.

> I doubt this example would clarify the meaning of ${(S)} to people who
> encounter it for the first time.  Please use a more minimal example.
> Specific issues:
> - (...) This is documentation, not
>   a homework problem; the answer should be obvious.  Something like
>   «out="${out}+${match[1]}"» would address this — but…

I think that many examples in the man pages are like that: they don't
take the obvious path of just demonstrating the usage; instead, they
cover some edge case that, after (sometimes quite long) thought, reveals
something very peculiar about the feature.
There are better examples of this; however, the best that I've found
so far is the one used for the #b glob flag:

  foo="a string with a message"
  if [[ $foo = (a|an)' '(#b)(*)' '* ]]; then
    print ${foo[$mbegin[1],$mend[1]]}
  fi

The example prints `string with a', and the user has the "homework" of
untangling a few points:
- why it isn't "string with a message" (it's because of the final ' '*
  part, which requires a space after the final word of the (*) part),
- why the answer isn't "message" (the same as above, plus the fact that
  there's no * before (a|an), and the greediness).

If not for the homework attitude of the examples in the man page, the
example would have been

  if [[ "a string with a message" = (#b)a' '(*) ]]; then

and would give the answer "string with a message". This would have
been the obvious-demonstration attitude that I've referred to.

> - … the use of advanced pattern matching features needlessly raises the
>   learning curve.

I can add a mention that the example needs EXTENDED_GLOB. Overall I
think that the example:
- is nice because it shows how to make the (S)...%% substitution
  behave as intuition would suggest,
- is the only place in the documentation that uses the (#b) flag with
  #/% substitution, showing that it's possible to use it in that place,
- isn't that complex for someone who knows the #b flag and the $match
  parameter.

> > It would have been tt(abcXXXdefXXghif)
> > +if not the tt([^X]) part, as despite the tt(%%) specifies a greedy
> > +match, the substring matching works by trying matches from right to
> > +left and stops at a first valid match.
>
> There are some grammatical errors here (e.g., s/(?<=specif)ies/ying/), but
> let's not worry about them until the rest of the patch isn't a moving target.

I think that the grammar is correct here. Did you maybe misread the sentence?
-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC:  https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org