zsh-users
* Efficient way to map a list of values to multiple processes, then accumulate their output
@ 2021-11-06 13:40 Zach Riggle
  2021-11-06 16:18 ` Bart Schaefer
  0 siblings, 1 reply; 6+ messages in thread
From: Zach Riggle @ 2021-11-06 13:40 UTC (permalink / raw)
  To: Zsh Users


Hello all!  Thanks in advance for the assistance.

Ultimately, I'd like to have some quasi-implementation of xargs -P in pure
zsh: something that maps a __function__ over a list of arguments and returns
the results in an array or associative array.  xargs cannot run shell
functions, so xargs itself is out.

I think this is a situation that Zsh is not well-suited for, but let's say
I have a list of files that I want to perform some processing on, then
gather the results when the processing has completed.

Simply backgrounding each process with & and then waiting on all jobs to
complete should work, in theory.  However, the output of the invocations
may be intermixed: even in the simple case of "one line of output per file",
the order of completion is non-deterministic and the output may become
garbled.

This seems like something that would fit nicely with coprocesses, but it
appears each zsh instance can have only one.

Another idea is to send the output of each job to a temporary file, keep
associative arrays of e.g. [input]=$(mktemp) and [input]=job_id, and
then grab the output as each job completes.
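
A minimal sketch of the temporary-file idea, assuming a hypothetical worker
function named process and a fixed list of inputs (neither comes from the
original message).  It skips the [input]=job_id bookkeeping and simply waits
for all jobs at once, and it does not limit how many jobs run concurrently:

```zsh
# Hypothetical worker standing in for the real per-file processing.
process() { printf 'processed:%s\n' "$1"; }

typeset -A outfile results   # input -> temp file, input -> captured output
inputs=(alpha beta gamma)    # hypothetical inputs

for input in "${inputs[@]}"; do
  outfile[$input]=$(mktemp)
  # Each job writes only to its own file, so outputs cannot interleave.
  process "$input" > "${outfile[$input]}" &
done

wait   # block until every background job has exited

for input in "${inputs[@]}"; do
  results[$input]=$(<"${outfile[$input]}")
  rm -f -- "${outfile[$input]}"
done

printf '%s\n' "${results[beta]}"
```

Because each result is keyed by its input, the order in which the jobs
finish no longer matters.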

I expect I am not the first person to try this (equivalent to the Python
multiprocessing.map) but I figured it was worth asking!

*Zach Riggle*


* Re: Efficient way to map a list of values to multiple processes, then accumulate their output
  2021-11-06 13:40 Efficient way to map a list of values to multiple processes, then accumulate their output Zach Riggle
@ 2021-11-06 16:18 ` Bart Schaefer
  2021-11-06 21:58   ` Zach Riggle
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Schaefer @ 2021-11-06 16:18 UTC (permalink / raw)
  To: Zach Riggle; +Cc: Zsh Users

On Sat, Nov 6, 2021 at 6:41 AM Zach Riggle <zachriggle@gmail.com> wrote:
>
> Ultimately, I'd like to have some quasi-implementation of xargs -P in pure zsh -- which maps a __function__ to a list of arguments

Look at "zargs" in "man zshcontrib"



* Re: Efficient way to map a list of values to multiple processes, then accumulate their output
  2021-11-06 16:18 ` Bart Schaefer
@ 2021-11-06 21:58   ` Zach Riggle
  2021-11-06 22:55     ` Lawrence Velázquez
  0 siblings, 1 reply; 6+ messages in thread
From: Zach Riggle @ 2021-11-06 21:58 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users


Wow.  I'm genuinely impressed, it even supports all of the flags that I
need!  And has all of the fun non-determinism of parallel execution (with
-P)!  This satisfies about 60% of what I need, and is a great tool to have
handy!

The limitation appears to be that the command is run in a sub-shell /
separate process (as one might expect), so there's no way to correlate
input74 → output74.

I suppose with "zargs -n1" I can redirect the output of each invocation to
its own file (e.g. a temporary file), and then just read the files back
and stick everything into an associative array.

A very simple test works correctly, but a slight variation gives me
"zargs: argument list too long":

Uploaded the source here for ease of viewing and syntax highlighting:
https://gist.github.com/zachriggle/b53b35faa5a60b674575e1dc6cae1d2e

*Zach Riggle*



* Re: Efficient way to map a list of values to multiple processes, then accumulate their output
  2021-11-06 21:58   ` Zach Riggle
@ 2021-11-06 22:55     ` Lawrence Velázquez
  2021-11-08 20:27       ` Zach Riggle
  0 siblings, 1 reply; 6+ messages in thread
From: Lawrence Velázquez @ 2021-11-06 22:55 UTC (permalink / raw)
  To: Zach Riggle, Bart Schaefer; +Cc: zsh-users

On Sat, Nov 6, 2021, at 5:58 PM, Zach Riggle wrote:
> A very simple test works exactly correct, but a slight variation gives 
> me "zargs: argument list too long":

From your gist:

> # Works
> zargs -P12 -n1 -- 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 -- wrapper1

> # Does not work --> zargs: argument list too long
> zargs -P12 -n1 -- 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 -- wrapper2 double

From zshcontrib(1):

    zargs [ option ... -- ] [ input ... ] [ -- command [ arg ... ] ]

        [...]

        The options -i, -I, -l, -L, and -n differ slightly from their
        usage in `xargs`.  There are no input lines for `zargs` to
        count, so -l and -L count through the "input" list, and -n
        counts the number of arguments passed to each execution of
        "command", *including* any "arg" list.

So you actually want -n2.  (Adjust to taste.)

-- 
vq



* Re: Efficient way to map a list of values to multiple processes, then accumulate their output
  2021-11-06 22:55     ` Lawrence Velázquez
@ 2021-11-08 20:27       ` Zach Riggle
  2021-11-08 20:45         ` Zach Riggle
  0 siblings, 1 reply; 6+ messages in thread
From: Zach Riggle @ 2021-11-08 20:27 UTC (permalink / raw)
  To: Lawrence Velázquez; +Cc: Bart Schaefer, Zsh Users


Ah-ha! Okay!  I took "works exactly like xargs" a little too literally and
didn't read the docs closely enough.

As always, thanks for your help guys!

*Zach Riggle*



* Re: Efficient way to map a list of values to multiple processes, then accumulate their output
  2021-11-08 20:27       ` Zach Riggle
@ 2021-11-08 20:45         ` Zach Riggle
  0 siblings, 0 replies; 6+ messages in thread
From: Zach Riggle @ 2021-11-08 20:45 UTC (permalink / raw)
  To: Lawrence Velázquez; +Cc: Bart Schaefer, Zsh Users


And after going to change my test script, I actually grokked what you were
suggesting :)

*Zach Riggle*

