zsh-users
* zsh-ify a bash script
@ 2022-01-19 20:56 zzapper
  2022-01-20  4:12 ` Bart Schaefer
  0 siblings, 1 reply; 9+ messages in thread
From: zzapper @ 2022-01-19 20:56 UTC (permalink / raw)
  To: Zsh-Users List


hi

By Tim Chase:

Dumb CLI trick. Wanted to find files containing all of several terms
(dup2, pledge, socketpair, fork), but they could occur anywhere in the
file:


find . -name '*.c' | xargs fgrep -lw dup2 |xargs fgrep -lw pledge | 
xargs grep -l 'socketpair.*SOCK_STREAM' | xargs fgrep -w fork
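
Editorial sketch, not part of Tim Chase's original: the chain above works
because each "grep -l" stage prints only the names of matching files, and
xargs feeds those names to the next stage. As posted, the last stage has no
-l, so it prints the matching lines rather than file names. The version below
adds -l there and uses "grep -F" in place of fgrep. The function name all_four
and the scratch files are hypothetical, and file names are assumed to contain
no whitespace or quotes, since plain xargs splits on those.

```shell
# all_four DIR: list *.c files under DIR that contain all four terms.
# Hypothetical helper name. /dev/null is passed as an extra file so that
# grep never falls back to reading stdin when a stage passes on no names.
all_four() {
  find "$1" -name '*.c' |
  xargs grep -Flw dup2 /dev/null |
  xargs grep -Flw pledge /dev/null |
  xargs grep -l 'socketpair.*SOCK_STREAM' /dev/null |
  xargs grep -Flw fork /dev/null
}

# Scratch data: all.c has every term, some.c only two of them.
tmp=$(mktemp -d)
printf '%s\n' 'dup2(a, b);' 'pledge("stdio", NULL);' 'fork();' \
  'socketpair(AF_UNIX, SOCK_STREAM, 0, sv);' > "$tmp/all.c"
printf '%s\n' 'dup2(a, b);' 'fork();' > "$tmp/some.c"

all_four "$tmp"   # prints the path of all.c only
rm -rf "$tmp"
```

Each stage rereads every surviving file in full, which is the cost the replies
in this thread try to avoid.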


NB1: Busy today & a bit overwhelmed.


Any ideas? The first thing that came to mind was awk, but see NB1 above.


zzapper




* Re: zsh-ify a bash script
  2022-01-19 20:56 zsh-ify a bash script zzapper
@ 2022-01-20  4:12 ` Bart Schaefer
  2022-01-20 10:31   ` Mikael Magnusson
  0 siblings, 1 reply; 9+ messages in thread
From: Bart Schaefer @ 2022-01-20  4:12 UTC (permalink / raw)
  To: zzapper; +Cc: Zsh-Users List

On Wed, Jan 19, 2022 at 12:57 PM zzapper <zsh@rayninfo.co.uk> wrote:
>
> Dumb CLI trick. Wanted to find files containing all of several terms (dup2, pledge, socketpair, fork), but they could occur anywhere in the file:

Not zsh any more than the first example, but instead of "grep -l" on
the entire file contents for each term ...

# start by getting the actual occurrences of all the terms:
find . -name '*.c' |
xargs egrep '\<(dup2|pledge|fork|socketpair.*SOCK_STREAM)\>' /dev/null |
# reduce the results to just file names and search terms:
sed -E 's/(^[^:]*\.c:).*\<(dup2|pledge|fork|socketpair)\>.*/\1\2/' |
# make every search term unique per file:
sort -u |
# discard the search terms, leaving only file names:
cut -d : -f 1 |
# count the number of times each file name appears:
uniq -c |
# print names with a count of 4 (the number of search terms):
sed -nE 's/^ *4 //p'

Adjusting this for edge cases where two search terms appear on the
same line is left as an exercise.  If you have file names with colons
in them, the first set of parens in the first sed will need a more
precise RE.
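
Editorial sketch: the pipeline above can be tried on scratch data as follows.
It assumes GNU grep and GNU sed (for \< and \> in both tools) and uses
"grep -E" in place of egrep. The function name files_with_all_terms and the
two scratch files are hypothetical. Note that the greedy ".*" in the sed
stage keeps only the rightmost term on a line, which is why a line holding
two terms can leave a file undercounted.

```shell
# files_with_all_terms DIR (hypothetical name): the pipeline as a function.
# Assumes GNU grep and sed, whose regex engines understand \< and \>.
files_with_all_terms() {
  # 1. collect matching lines, prefixed with "file:" (forced by /dev/null)
  find "$1" -name '*.c' |
  xargs grep -E '\<(dup2|pledge|fork|socketpair.*SOCK_STREAM)\>' /dev/null |
  # 2. reduce each line to "file:term"
  sed -E 's/(^[^:]*\.c:).*\<(dup2|pledge|fork|socketpair)\>.*/\1\2/' |
  # 3. one line per (file, term) pair
  sort -u |
  # 4. keep the file name, count its distinct terms, print those with 4
  cut -d : -f 1 |
  uniq -c |
  sed -nE 's/^ *4 //p'
}

# Scratch data: all.c has every term, some.c only two of them.
tmp=$(mktemp -d)
printf '%s\n' 'dup2(a, b);' 'pledge("stdio", NULL);' 'fork();' \
  'socketpair(AF_UNIX, SOCK_STREAM, 0, sv);' > "$tmp/all.c"
printf '%s\n' 'dup2(a, b);' 'fork();' > "$tmp/some.c"

files_with_all_terms "$tmp"   # prints the path of all.c only
rm -rf "$tmp"
```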



* Re: zsh-ify a bash script
  2022-01-20  4:12 ` Bart Schaefer
@ 2022-01-20 10:31   ` Mikael Magnusson
  2022-01-20 10:40     ` Mikael Magnusson
  0 siblings, 1 reply; 9+ messages in thread
From: Mikael Magnusson @ 2022-01-20 10:31 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zzapper, Zsh-Users List

On 1/20/22, Bart Schaefer <schaefer@brasslantern.com> wrote:
> Adjusting this for edge cases where two search terms appear on the
> same line is left as an exercise.

This part would probably be fixed by passing -o to grep (didn't test
in the above but):
% echo foobar|grep -E '(foo|bar)'
foobar
% echo foobar|grep -oE '(foo|bar)'
foo
bar

(sorry to spoil the exercise)

> If you have file names with colons
> in them, the first set of parens in the first sed will need a more
> precise RE.


-- 
Mikael Magnusson



* Re: zsh-ify a bash script
  2022-01-20 10:31   ` Mikael Magnusson
@ 2022-01-20 10:40     ` Mikael Magnusson
  2022-01-20 11:26       ` Andreas Kusalananda Kähäri
  0 siblings, 1 reply; 9+ messages in thread
From: Mikael Magnusson @ 2022-01-20 10:40 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zzapper, Zsh-Users List

On 1/20/22, Mikael Magnusson <mikachu@gmail.com> wrote:
> This part would probably be fixed by passing -o to grep (didn't test
> in the above but):
> % echo foobar|grep -E '(foo|bar)'
> foobar
> % echo foobar|grep -oE '(foo|bar)'
> foo
> bar

I guess this is more or less the same solution,
% grep -Eo '(zsfree|zerr|subst)' Src/**/*.c|sort -u|sed 's/:[^:]*'//|uniq -c|grep -E '^\s+3'
      3 Src/Zle/zle_main.c
      3 Src/builtin.c
      3 Src/exec.c
      3 Src/glob.c
      3 Src/hist.c
      3 Src/jobs.c
      3 Src/signals.c
      3 Src/subst.c
      3 Src/utils.c
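
Editorial sketch: a scratch-data run showing why -o handles two terms on one
line. Each match is printed on its own "file:term" line, so both are counted.
The function name three_terms and the file names are hypothetical. It uses a
flat "*.c" glob instead of zsh's recursive "**/*.c" to stay portable, strips
the term with a "$"-anchored sed, and passes /dev/null so grep always prints
the file name prefix.

```shell
# three_terms DIR (hypothetical name): files in DIR containing all of
# zsfree, zerr and subst, counted via one "file:term" line per match.
three_terms() {
  grep -Eo '(zsfree|zerr|subst)' "$1"/*.c /dev/null |
  sort -u |                 # one line per distinct (file, term) pair
  sed 's/:[^:]*$//' |       # drop the trailing ":term"
  uniq -c |                 # count distinct terms per file
  sed -n 's/^ *3 //p'       # keep files that reached 3
}

tmp=$(mktemp -d)
# both.c: two terms share the first line, the third is on its own line.
printf '%s\n' 'zsfree(p); zerr("x");' 'subst(s);' > "$tmp/both.c"
# two.c: zerr twice plus subst, so only two distinct terms.
printf '%s\n' 'zerr("x");' 'zerr("y");' 'subst(s);' > "$tmp/two.c"

three_terms "$tmp"   # prints the path of both.c only
rm -rf "$tmp"
```

Without -w the terms also match inside longer identifiers, so this is only as
strict as the pattern given to grep.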


-- 
Mikael Magnusson



* Re: zsh-ify a bash script
  2022-01-20 10:40     ` Mikael Magnusson
@ 2022-01-20 11:26       ` Andreas Kusalananda Kähäri
  2022-01-20 18:31         ` Bart Schaefer
  2022-01-21  7:48         ` Daniel Shahaf
  0 siblings, 2 replies; 9+ messages in thread
From: Andreas Kusalananda Kähäri @ 2022-01-20 11:26 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: Bart Schaefer, zzapper, Zsh-Users List

On Thu, Jan 20, 2022 at 11:40:56AM +0100, Mikael Magnusson wrote:
> I guess this is more or less the same solution,
> % grep -Eo '(zsfree|zerr|subst)' Src/**/*.c|sort -u|sed 's/:[^:]*'//|uniq -c|grep -E '^\s+3'
>       3 Src/Zle/zle_main.c
>       3 Src/builtin.c
>       3 Src/exec.c
>       3 Src/glob.c
>       3 Src/hist.c
>       3 Src/jobs.c
>       3 Src/signals.c
>       3 Src/subst.c
>       3 Src/utils.c

Awk-ifying the end of that pipeline... and just playing around with the
grep a bit for fun (changed to use BREs, but still non-standard due to
-w and -o).

grep -Fwo -e zsfree -e zerr -e subst -- Src/**/*.c |
awk -F: '!seen[$0]++ && ++count[$1] == 3 { print $1 }'

This obviously assumes that no pathname contains colons or embedded
newlines.
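
Editorial sketch of how the awk condition works, fed with literal "file:term"
lines instead of grep output. "!seen[$0]++" is true only the first time a
given line appears, so duplicates are skipped; "++count[$1]" then counts
distinct terms per file, and the file is printed at the moment its count
reaches the target, so it is printed once. The function name files_reaching
and its numeric argument are hypothetical additions.

```shell
# files_reaching N (hypothetical name): read "file:term" lines on stdin and
# print each file name once, when its Nth distinct term is seen.
files_reaching() {
  awk -F: -v n="$1" '!seen[$0]++ && ++count[$1] == n { print $1 }'
}

# a.c has zerr (twice), subst and zsfree; b.c has only zerr and subst.
printf '%s\n' a.c:zerr a.c:zerr a.c:subst b.c:zerr a.c:zsfree b.c:subst |
files_reaching 3   # prints: a.c
```

Because nothing is sorted, the output order follows the order in which files
complete their set of terms.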

-- 
Andreas (Kusalananda) Kähäri
SciLifeLab, NBIS, ICM
Uppsala University, Sweden

.



* Re: zsh-ify a bash script
  2022-01-20 11:26       ` Andreas Kusalananda Kähäri
@ 2022-01-20 18:31         ` Bart Schaefer
  2022-01-20 22:13           ` Andreas Kusalananda Kähäri
  2022-01-21  7:48         ` Daniel Shahaf
  1 sibling, 1 reply; 9+ messages in thread
From: Bart Schaefer @ 2022-01-20 18:31 UTC (permalink / raw)
  To: Zsh-Users List

On Thu, Jan 20, 2022 at 3:26 AM Andreas Kusalananda Kähäri
<andreas.kahari@abc.se> wrote:
>
> grep -Fwo -e zsfree -e zerr -e subst -- Src/**/*.c |

I think "grep -w" doesn't work for the original problem because of the
".*SOCK_STREAM" in one of the search terms.



* Re: zsh-ify a bash script
  2022-01-20 18:31         ` Bart Schaefer
@ 2022-01-20 22:13           ` Andreas Kusalananda Kähäri
  0 siblings, 0 replies; 9+ messages in thread
From: Andreas Kusalananda Kähäri @ 2022-01-20 22:13 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh-Users List

On Thu, Jan 20, 2022 at 10:31:21AM -0800, Bart Schaefer wrote:
> On Thu, Jan 20, 2022 at 3:26 AM Andreas Kusalananda Kähäri
> <andreas.kahari@abc.se> wrote:
> >
> > grep -Fwo -e zsfree -e zerr -e subst -- Src/**/*.c |
> 
> I think "grep -w" doesn't work for the original problem because of the
> ".*SOCK_STREAM" in one of the search terms.

Sorry for going off-topic... but...

The -w option is just a convenience thing to not have to insert one of
\b, \< \>, or [[:<:]] [[:>:]] at either end of the pattern.

Removing the -F is what needs to be done for the patterns to be treated
as BREs.

Compare

	man zsh | grep -w 'as\>.*\<an'

(2 lines of output, each of which has the word "as" followed by the word
"an"),

	man zsh | grep -w 'as.*an'

(7 lines of output, since "as" may match at the start of a word and "an" at
the end of one, as on a line with "as ... than"), and

	man zsh | grep 'as.*an'

(15 lines of output due to matching things like "basename ... command")
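
Editorial sketch: the same three-way comparison on fixed input, since the line
counts from "man zsh" vary between versions. It assumes GNU grep (for -w and
for \< and \> in BREs). The three sample lines are made up: one has the words
"as" and "an", one has "as ... than", and one only has the letters inside
longer words.

```shell
# sample (hypothetical helper): three made-up lines to grep through.
sample() {
  printf '%s\n' 'treat as an error' 'as fast than' 'basename command'
}

# Word "as" followed by word "an": only the first line qualifies.
sample | grep -cw 'as\>.*\<an'   # prints 1

# -w alone only anchors the two ends of the match, so "as fast than"
# also counts ("as" starts a word, the "an" of "than" ends one).
sample | grep -cw 'as.*an'       # prints 2

# No word anchoring at all: "basename command" matches as well.
sample | grep -c 'as.*an'        # prints 3
```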

What the OP may want to do is to change their expression

	socketpair.*SOCK_STREAM

into

	socketpair\>.*\<SOCK_STREAM

and then use -w with grep, depending on what it is they want to actually
match, and what word boundary pattern(s) their regular expression
libraries support.
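
Editorial sketch combining this suggestion with the -o and awk approach from
earlier in the thread. One wrinkle: with -o, the text matched by the
socketpair pattern differs from call to call, so it has to be normalised
before counting distinct terms per file. This assumes GNU grep, and path names
without colons or newlines. The function name find_all_four and the scratch
files are hypothetical.

```shell
# find_all_four DIR (hypothetical name): *.c files under DIR containing
# dup2, pledge, fork and a socketpair ... SOCK_STREAM line.
find_all_four() {
  find "$1" -name '*.c' -exec grep -wo \
      -e dup2 -e pledge -e fork -e 'socketpair\>.*\<SOCK_STREAM' \
      /dev/null {} + |
  # Collapse "socketpair(AF_UNIX, SOCK_STREAM" etc. to a single key.
  sed 's/:socketpair.*SOCK_STREAM$/:socketpair/' |
  awk -F: '!seen[$0]++ && ++count[$1] == 4 { print $1 }'
}

tmp=$(mktemp -d)
# all.c: all four terms, with two differently worded socketpair calls.
printf '%s\n' 'dup2(a, b);' 'pledge("stdio", NULL);' 'fork();' \
  'socketpair(AF_UNIX, SOCK_STREAM, 0, sv);' \
  'socketpair(AF_LOCAL, SOCK_STREAM, 0, sv);' > "$tmp/all.c"
# some.c: no pledge; it would reach 4 distinct strings without the sed step.
printf '%s\n' 'dup2(a, b);' 'fork();' \
  'socketpair(AF_UNIX, SOCK_STREAM, 0, sv);' \
  'socketpair(AF_LOCAL, SOCK_STREAM, 0, sv);' > "$tmp/some.c"

find_all_four "$tmp"   # prints the path of all.c only
rm -rf "$tmp"
```

The greedy ".*" can still swallow another term that sits between socketpair
and SOCK_STREAM on the same line, so this remains a sketch rather than a
complete answer.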


-- 
Andreas (Kusalananda) Kähäri
SciLifeLab, NBIS, ICM
Uppsala University, Sweden

.



* Re: zsh-ify a bash script
  2022-01-20 11:26       ` Andreas Kusalananda Kähäri
  2022-01-20 18:31         ` Bart Schaefer
@ 2022-01-21  7:48         ` Daniel Shahaf
  2022-01-21  8:50           ` zzapper
  1 sibling, 1 reply; 9+ messages in thread
From: Daniel Shahaf @ 2022-01-21  7:48 UTC (permalink / raw)
  To: Zsh-Users List; +Cc: zzapper

Andreas Kusalananda Kähäri wrote on Thu, Jan 20, 2022 at 12:26:06 +0100:
> Awk-ifying the end of that pipeline... and just playing around with the
> grep a bit for fun (changed to use BREs, but still non-standard due to
> -w and -o).
> 
> grep -Fwo -e zsfree -e zerr -e subst -- Src/**/*.c |
> awk -F: '!seen[$0]++ && ++count[$1] == 3 { print $1 }'
> 
> This obviously assumes that no pathname contains colons or embedded
> newlines.

Using glob qualifiers:

% has() { <$REPLY grep -q -- "$1" }
% print -raC1 -- **/*.c(e*has zerr*e*has zsfree*e*has subst*) 
Src/Zle/zle_main.c
Src/builtin.c
Src/exec.c
Src/glob.c
Src/hist.c
Src/jobs.c
Src/signals.c
Src/subst.c
Src/utils.c
% 
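
Editorial sketch: in the glob above, each e*...* qualifier runs its command
with $REPLY set to the candidate file name and keeps the file only if the
command succeeds, so no "file:term" text is ever parsed and colons in names do
no harm. The same per-file test can be written for shells without glob
qualifiers. The function name has_all is hypothetical, and it uses "grep -F"
(fixed strings), whereas has() above treats its argument as a regular
expression.

```shell
# has_all FILE TERM... (hypothetical name): succeed only if FILE contains
# every TERM as a fixed string.
has_all() {
  file=$1
  shift
  for term in "$@"; do
    grep -qF -- "$term" "$file" || return 1
  done
  return 0
}

tmp=$(mktemp -d)
# A name with a colon, which the "file:term" pipelines cannot handle.
printf '%s\n' 'zerr("x");' 'zsfree(p);' 'subst(s);' > "$tmp/odd:name.c"
printf '%s\n' 'zerr("x");' > "$tmp/partial.c"

for f in "$tmp"/*.c; do
  has_all "$f" zerr zsfree subst && printf '%s\n' "$f"
done                   # prints the path of odd:name.c only
rm -rf "$tmp"
```

Like the glob-qualifier version, this runs up to one grep per term per file,
trading speed for robustness with unusual file names.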



* Re: zsh-ify a bash script
  2022-01-21  7:48         ` Daniel Shahaf
@ 2022-01-21  8:50           ` zzapper
  0 siblings, 0 replies; 9+ messages in thread
From: zzapper @ 2022-01-21  8:50 UTC (permalink / raw)
  To: zsh-users


On 21/01/2022 07:48, Daniel Shahaf wrote:
> Using glob qualifiers:
>
> % has() { <$REPLY grep -q -- "$1" }
> % print -raC1 -- **/*.c(e*has zerr*e*has zsfree*e*has subst*)
> Src/Zle/zle_main.c
> Src/builtin.c
> Src/exec.c
> Src/glob.c
> Src/hist.c
> Src/jobs.c
> Src/signals.c
> Src/subst.c
> Src/utils.c
> %
# Looks good, just trimmed for my system (Mint):

has() { <$REPLY grep -q -- "$1" }
print -raC1 -- /usr/[sS]rc/**/*.c(e*has stdio*e*has fcntl*e*has mman*)


### BTW, is there any library of text files we can all count on having for future test purposes?


