* [[ 'abcde' =~ (#i)Bcd ]]
From: Ray Andrews @ 2022-11-07 21:10 UTC
To: Zsh Users

    [[ 'abcde' =~ 'bcd' ]] && echo match1
    [[ 'abcde' = (#i)ABcde ]] && echo match2
    [[ 'abcde' =~ (#i)Bcd ]] && echo match3
    [[ 'bcd' =~ 'abcde' ]] && echo match4

... I get match 1 and match 2. I understand not getting match 4 because
'=~' is not bi-directional, the latter value must be a subset of the
former. But why don't I get match 3? It seems to break no rules to make
'Bcd' case insensitive and then find it within 'abcde'. Is there a
workaround?

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Roman Perepelitsa @ 2022-11-07 21:26 UTC
To: Ray Andrews; +Cc: Zsh Users

On Mon, Nov 7, 2022 at 10:11 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> [[ 'abcde' =~ 'bcd' ]] && echo match1
> [[ 'abcde' = (#i)ABcde ]] && echo match2
> [[ 'abcde' =~ (#i)Bcd ]] && echo match3
> [[ 'bcd' =~ 'abcde' ]] && echo match4
>
> ... I get match 1 and match 2.  I understand not getting match 4
> because '=~' is not bi-directional, the latter value must be a subset of
> the former.  But why don't I get match 3?

Does it surprise you that this also doesn't match?

    [[ 'a' =~ (#i)A ]]

(#i) only works with pattern matching. For regex the easiest
workaround is to convert left-hand-side to lowercase:

    foo=XaBcX
    [[ ${(L)foo} =~ abc ]] && echo match

Another option is to use zsh/pcre module. See
https://zsh.sourceforge.io/Doc/Release/Zsh-Modules.html#The-zsh_002fpcre-Module.

In this specific case it's better to use pattern matching of course:

    [[ $foo == (#i)*abc* ]] && echo match

I find it extremely rare in practice that I need a regex match in zsh.

Roman.

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Ray Andrews @ 2022-11-07 21:47 UTC
To: zsh-users

On 2022-11-07 13:26, Roman Perepelitsa wrote:
> [[ 'abcde' =~ (#i)Bcd ]] && echo match3

> (#i) only works with pattern matching.

But isn't that a pattern match?

    [[ 'abcde' = (#i)ABcde ]] && echo match2

... that seems happy so it would seem that wildcards aren't required.

> In this specific case it's better to use pattern matching of course:
>
>     [[ $foo == (#i)*abc* ]] && echo match
>
That's what puzzles me I expect:

    [[ $foo == (#i)*abc* ]] && echo match

and:

    [[ $foo =~ (#i)abc ]] && echo match

... to be exactly the same. If not, why not? Actually there are several
workarounds but still I'd expect that to work too.

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Lawrence Velázquez @ 2022-11-07 22:15 UTC
To: Ray Andrews; +Cc: zsh-users

On Mon, Nov 7, 2022, at 4:47 PM, Ray Andrews wrote:
> On 2022-11-07 13:26, Roman Perepelitsa wrote:
>> [[ 'abcde' =~ (#i)Bcd ]] && echo match3
>
>> (#i) only works with pattern matching.
>
> But isn't that a pattern match?

No. It is a regular expression match. When discussing shells and
adjacent tools, "pattern" almost always implies the syntax used for
filename generation, or an extension thereof. (I only say "almost"
as a hedge.)

> [[ 'abcde' = (#i)ABcde ]] && echo match2
>
> ... that seems happy so it would seem that wildcards aren't required.

They are required if you want a partial-length match.

    % [[ abcde = (#i)ABcde ]]; print $?
    0
    % [[ abcde = (#i)Bcd ]]; print $?
    1
    % [[ abcde = (#i)*Bcd* ]]; print $?
    0

>> In this specific case it's better to use pattern matching of course:
>>
>>     [[ $foo == (#i)*abc* ]] && echo match
>>
> That's what puzzles me I expect:
>
> [[ $foo == (#i)*abc* ]] && echo match
>
> and:
>
> [[ $foo =~ (#i)abc ]] && echo match
>
> ... to be exactly the same.  If not, why not?

Globbing flags only work with globs.

Regular expression matching is done using an external library (either
PCRE or the host regex library). These libraries can hardly be
expected to understand zsh globbing flags.

-- 
vq

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Ray Andrews @ 2022-11-08 1:57 UTC
To: zsh-users

On 2022-11-07 14:15, Lawrence Velázquez wrote:
> Globbing flags only work with globs.
> Regular expression matching is done using an external library (either
> PCRE or the host regex library).  These libraries can hardly be
> expected to understand zsh globbing flags.

Ah! So that's not even zsh's native opinion on the subject. There is a
forgivable confusion there since filename globbing and pattern matching
look so similar. I often wonder when and where zsh relies on other
libraries and programs. This is a very good example of that sort of
thing. It short circuits any whining I might be tempted to do since it's
not even zsh code. Thanks, this is the sort of deep answer that parts
many clouds. I need to look to regex syntax for any answers I might want.

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Lawrence Velázquez @ 2022-11-07 21:50 UTC
To: Ray Andrews; +Cc: Roman Perepelitsa, zsh-users

On Mon, Nov 7, 2022, at 4:26 PM, Roman Perepelitsa wrote:
> (#i) only works with pattern matching. For regex the easiest
> workaround is to convert left-hand-side to lowercase:
>
>     foo=XaBcX
>     [[ ${(L)foo} =~ abc ]] && echo match
>
> Another option is to use zsh/pcre module. See
> https://zsh.sourceforge.io/Doc/Release/Zsh-Modules.html#The-zsh_002fpcre-Module.

If you're not using zsh/pcre, yet another option is to disable
CASE_MATCH. It's a bit drastic, though.

You may also be able to use some nonstandard extensions defined by
your host system's regex library, but that would make your script
highly dependent on said library.

> I find it extremely rare in practice that I need a regex match in zsh.

I concur with Roman. I doubt you actually need case-insensitive
regex.

-- 
vq

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Ray Andrews @ 2022-11-08 2:05 UTC
To: zsh-users

On 2022-11-07 13:50, Lawrence Velázquez wrote:
> I concur with Roman.  I doubt you actually need case-insensitive
> regex.
>
I'm happy with what I've got working at the moment tho you guys would
probably improve it. Pardon my personal jargon but:

    local vvar=$( basename $cc[$aa] 2> /dev/null )

    if [[ "$scope_msg" = 'BROAD' && $vvar = (#i)*$filter* ]]; then
    elif [[ "$scope_msg" = 'Case INsensitive TAME' && $vvar:u = $filter:u ]]; then
    elif [[ "$scope_msg" = 'Case Sensitive WILD' && $vvar =~ $filter ]]; then
    elif [[ "$scope_msg" = 'EXACT' && $vvar = $filter ]]; then
    else cc[$aa]=
    fi

... the function lets me search for directories with automatic
wildcards and/or case sensitivity or both or neither. The four
combinations seem well handled above. The construction is still clumsy,
I'll fix it shortly.

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Roman Perepelitsa @ 2022-11-08 8:19 UTC
To: Ray Andrews; +Cc: zsh-users

On Tue, Nov 8, 2022 at 3:06 AM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> On 2022-11-07 13:50, Lawrence Velázquez wrote:
> > I concur with Roman.  I doubt you actually need case-insensitive
> > regex.
> >
> I'm happy with what I've got working at the moment tho you guys would
> probably improve it.  Pardon my personal jargon but:
>
>          local vvar=$( basename $cc[$aa] 2> /dev/null )

There is a zsh way for this:

    local var=${cc[$aa]:t}

"t" is short for tail. There is also "h" for head.

>          if [[ "$scope_msg" = 'BROAD' && $vvar = (#i)*$filter* ]]; then
>          elif [[ "$scope_msg" = 'Case INsensitive TAME' && $vvar:u = $filter:u
> ]]; then
>          elif [[ "$scope_msg" = 'Case Sensitive WILD' && $vvar =~ $filter ]]; then
>          elif [[ "$scope_msg" = 'EXACT' && $vvar = $filter ]]; then
>          else cc[$aa]=
>          fi

Here WILD suggests a wildcard (a.k.a. glob, a.k.a. pattern) match, but
the code is doing a regex match. If your intention is to perform a
wildcard/glob/pattern match, do this:

    [[ $vvar == $~filter ]]

Or, if you want to always perform a partial match:

    [[ $vvar == *$~filter* ]]

Other cases in your if-else chain also look suspiciously
non-orthogonal. The orthogonal bits of matching are:

1. Pattern matching or regex?
2. Case sensitive or not?
3. Partial or full?

There are a total of 8 combinations. If you drop regex (which you
probably want to do), it leaves 4 combinations.

    [[ $data == $~pattern ]]         # case sensitive, full
    [[ $data == (#i)$~pattern ]]     # case insensitive, full
    [[ $data == *$~pattern* ]]       # case sensitive, partial
    [[ $data == (#i)*$~pattern* ]]   # case insensitive, partial

Note that you don't need to quote $data here (although you can, if you
prefer to do it for stylistic reasons).

> ... the function let's me search for directories with automatic
> wildcards and/or case sensitivity or both or neither.

There might be a better way to do this which would take advantage of
**/*. It's hard to say without knowing what you are trying to achieve.

Roman.

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Ray Andrews @ 2022-11-08 13:32 UTC
To: zsh-users

>>           local vvar=$( basename $cc[$aa] 2> /dev/null )
> There is a zsh way for this:
>
>      local var=${cc[$aa]:t}
>
> "t" is short for tail. There is also "h" for head.

Thanks yes, I knew zsh could do it, the use of basename was just a
fill-in. Anyway you did the work for me just there. But I was going to
pattern match, seems as usual zsh has a better way.

> Here WILD suggests a wildcard (a.k.a. glob, a.k.a. pattern) match, but
> the code is doing a regex match. If your intention is to perform a
> wildcard/glob/pattern match, do this:

Thing is that I need both. Sometimes I'm searching for directories in a
saved list, sometimes searching out there in the real world of globbing
the filesystem. What I showed was the search in the saved list. My
directory stack is file based, universal and persistent sorta like the
history list but sometimes I want to go looking out on the FS too. So
yeah, 8 combinations :( You'd think it might be four since in the mind
it feels like a text search in both situations.

>
> There might be a better way to do this which would take advantage of
> **/*. It's hard to say without knowing what you are trying to achieve.

It's a directory 'cd' from my personal stack sent to Sebastian's
n_list() for graphical selection. I can't live without it. But I decided
to add live 'cd' from the entire filesystem filtered via arguments and,
as above, the four combinations and as you anticipate I ran into the mud
expecting the syntax for the four combinations in the latter situation
to be the same as the former but the latter is 'live globbing' whereas
the former is just pattern matching in the lines of a file so they are
chalk and cheese.

It seems to be working but there's always the next gotcha:

    1 /aWorking/Zsh/Source/Wk 0 $ . c; c ,a zsh

    Searching entire system for directories matching "zsh" (BROAD):

... gives this n_list() screen:

----------------------------------------------------------------------
: Most recently visited directories matching "zsh" (BROAD):

/aWorking/Backup/Zsh
/aWorking/Zsh-55555
/aWorking/Zsh
/usr/share/zsh
/aWorking/garbageZSH
/aWorking/Zsh/Zsh-5.8
/usr/share/doc/zsh-common

: System wide directories matching "zsh" (BROAD):

/aMisc/Backup-root-2022-10-11/.thunderbird/i3n1gea2.Default User/Mail/Local Folders/ZSH.sbd
/aWorking/Backup/Zsh
/aWorking/Backup/Zsh/Zsh-5.8
/aWorking/Backup/Zsh/Zsh-5.8/share/zsh
/aWorking/garbageZSH
/aWorking/Zsh
/aWorking/Zsh-55555
/aWorking/Zsh/Zsh-5.8
/aWorking/Zsh/Zsh-5.8/share/zsh
/etc/zzsh
/root/.thunderbird/i3n1gea2.Default User/Mail/Local Folders/ZSH.sbd
/usr/lib/x86_64-linux-gnu/zsh
/usr/lib/x86_64-linux-gnu/zsh/5.8/zsh
/usr/local/share/zsh
/usr/share/doc/zsh
/usr/share/doc/zsh-common
/usr/share/zsh
/usr/share/zsh/functions/Completion/Zsh
----------------------------------------------------------------------

... cursor up, cursor down, pick a directory, press ENTER and you're
there automagically. Or I can demand an exact search (no card sharping,
no advice on how to be insensitive):

    1 /aWorking/Zsh/Source/Wk 0 $ . c; c ,Xa zsh

    Searching entire system for directories matching "zsh" (EXACT):

----------------------------------------------------------------------
: Most recently visited directories matching "zsh" (EXACT):

/usr/share/zsh

: System wide directories matching "zsh" (EXACT):

/aWorking/Backup/Zsh/Zsh-5.8/share/zsh
/aWorking/Zsh/Zsh-5.8/share/zsh
/usr/lib/x86_64-linux-gnu/zsh
/usr/lib/x86_64-linux-gnu/zsh/5.8/zsh
/usr/local/share/zsh
/usr/share/doc/zsh
/usr/share/zsh
----------------------------------------------------------------------

... so far, so good.

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Roman Perepelitsa @ 2022-11-08 14:37 UTC
To: Ray Andrews; +Cc: zsh-users

On Tue, Nov 8, 2022 at 2:32 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> > Here WILD suggests a wildcard (a.k.a. glob, a.k.a. pattern) match, but
> > the code is doing a regex match.
>
> Thing is that I need both.

Can you give an example of a use case where you are using a regex and
cannot use a pattern instead? None of the examples you already listed
would qualify.

Roman.

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Ray Andrews @ 2022-11-08 14:30 UTC
To: zsh-users

On 2022-11-08 06:37, Roman Perepelitsa wrote:
>
> Can you give an example of a use case where you are using a regex and
> cannot use a pattern instead? None of the examples you already listed
> would qualify.

Let me chew over what I've learned just yesterday and today and see how
it shakes out then backatcha. It's huge just conceptualizing the
difference. As I said, I've tended to think of globbing and regex and
pattern matching as more or less the same thing. :(

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Phil Pennock @ 2022-11-08 17:40 UTC
To: zsh-users

On 2022-11-07 at 13:10 -0800, Ray Andrews wrote:
> [[ 'abcde' =~ 'bcd' ]] && echo match1
> [[ 'abcde' = (#i)ABcde ]] && echo match2
> [[ 'abcde' =~ (#i)Bcd ]] && echo match3
> [[ 'bcd' =~ 'abcde' ]] && echo match4
>
> ... I get match 1 and match 2.  I understand not getting match 4 because
> '=~' is not bi-directional, the latter value must be a subset of the
> former.  But why don't I get match 3?  It seems to break no rules to make
> 'Bcd' case insensitive and then find it within 'abcde'.  Is there a
> workaround?

 * =            : equivalent to "==", string comparison with globs supported
 * =~           : regular expression match, syntax from Perl, used in bash
 * -regex-match : operator for very explicit regexp match (zsh/regex, POSIX ERE)
 * -pcre-match  : operator for very explicit regexp match (zsh/pcre, PCRE)

In zsh, = and == came first, then -pcre-match. The =~ operator from
Perl was added to bash and I added support to zsh, and wrote the
zsh/regex module so that _by default_ zsh would be compatible with bash.

Using `setopt rematch_pcre` will switch =~ from bash-compatible to using
PCRE, Perl Compatible Regular Expressions, so much closer to the
original =~.

The downside of PCRE in zsh is that for licensing reasons, not all
distributions include it. Zsh itself is BSD-licensed, PCRE is not.

If PCRE is available, then:

    [[ 'abcde' =~ (?i)Bcd ]] && echo match3

Use `man pcrepattern` and look at "INTERNAL OPTION SETTING" to see how
(?something) turns on options, with 'i' being PCRE_CASELESS. This
syntax matches Perl.

If PCRE is not available, then you are stuck with ERE syntax (see
`man 7 regex`) and you'll have to be a lot more explicit.

So probably better to find ways to use zsh glob pattern matching instead
of regular expressions.

-Phil

* Re: [[ 'abcde' =~ (#i)Bcd ]]
From: Ray Andrews @ 2022-11-08 18:43 UTC
To: zsh-users

On 2022-11-08 09:40, Phil Pennock wrote:
>
> Using `setopt rematch_pcre` will switch =~ from bash-compatible to using
> PCRE, Perl Compatible Regular Expressions, so much closer to the
> original =~.

So complicated! Not just zsh as she is, but all that history and the
variations on the theme. I've always wondered what PCRE is, now I know.
Being even vaguely informed about all this stuff is useful tho it puts
all issues in their cultural context. Things might not be as clean and
clear as one might wish.