zsh-users
* Re: Re: Parallel processing
@ 2022-03-27 18:37 jdh
  2022-03-27 19:06 ` Bart Schaefer
  2022-03-28 13:47 ` Perry Smith
  0 siblings, 2 replies; 14+ messages in thread
From: jdh @ 2022-03-27 18:37 UTC (permalink / raw)
  To: zsh-users



Re: the phrase "parallel processing":

So that beginners do not get the wrong idea, I wish to make clear that Unix-based systems are not designed to, and cannot, do parallel processing.  Strictly speaking, they are also not designed to do real-time processing. Unix/Linux does do multitasking by using task switching.  Unix-based systems do allow users with the proper privilege to set task priority, which helps in deciding which task gets more run time.

dh



* Re: Re: Parallel processing
  2022-03-27 18:37 Re: Parallel processing jdh
@ 2022-03-27 19:06 ` Bart Schaefer
  2022-03-27 19:24   ` Ray Andrews
  2022-03-28 13:47 ` Perry Smith
  1 sibling, 1 reply; 14+ messages in thread
From: Bart Schaefer @ 2022-03-27 19:06 UTC (permalink / raw)
  To: jdh; +Cc: Zsh Users

On Sun, Mar 27, 2022 at 11:37 AM jdh <dhenman@gmail.com> wrote:
>
> So that beginners do not get the wrong idea, I wish to make clear that Unix-based systems are not designed to, and cannot, do parallel processing.

If you're going to bring this up, would you please clarify further?
The Linux kernel has supported symmetric multiprocessing since 1996.



* Re: Parallel processing
  2022-03-27 19:06 ` Bart Schaefer
@ 2022-03-27 19:24   ` Ray Andrews
  0 siblings, 0 replies; 14+ messages in thread
From: Ray Andrews @ 2022-03-27 19:24 UTC (permalink / raw)
  To: zsh-users

On 2022-03-27 12:06, Bart Schaefer wrote:
> On Sun, Mar 27, 2022 at 11:37 AM jdh <dhenman@gmail.com> wrote:
>> So that beginners do not get the wrong idea, I wish to make clear that Unix-based systems are not designed to, and cannot, do parallel processing.
> If you're going to bring this up, would you please clarify further?
> The Linux kernel has supported symmetric multiprocessing since 1996.
>
Yes.  My understanding is only at the casual level, but I've taken it
that multi-core CPUs are specifically designed to facilitate
multiprocessing and that Linux/Unix were and are way ahead of Windows in
taking advantage of that, and have been for a long time.  Which is
why M$'s own servers run Linux.



* Re: Parallel processing
  2022-03-27 18:37 Re: Parallel processing jdh
  2022-03-27 19:06 ` Bart Schaefer
@ 2022-03-28 13:47 ` Perry Smith
  1 sibling, 0 replies; 14+ messages in thread
From: Perry Smith @ 2022-03-28 13:47 UTC (permalink / raw)
  To: zsh-users



> On Mar 27, 2022, at 13:37, jdh <dhenman@gmail.com> wrote:
> 
> Re: the phrase "parallel processing":
> 
> So that beginners do not get the wrong idea, I wish to make clear that Unix-based systems are not designed to, and cannot, do parallel processing.  Strictly speaking, they are also not designed to do real-time processing. Unix/Linux does do multitasking by using task switching.  Unix-based systems do allow users with the proper privilege to set task priority, which helps in deciding which task gets more run time.

Well… drifting off the original topic: parallel processing, real time processing, and SMP are all somewhat orthogonal to each other.

The original "parallel processing" systems ran a version of AT&T Unix (e.g. Sequent).  It predated SMP.  Or I suppose you could say that IBM was doing parallel processing even before that.

“Real time” processing is the requirement that responses to events fall within the manufacturer’s specifications, e.g. “all context switches are guaranteed to occur within 50ms”.  Then the designers of the end applications can work with those constraints and develop systems that meet the requirements of their end users.

SMP is about the very tight coupling between the CPUs.  Apple’s latest glossy keynote on the M1 Ultra gives a whiff of the problems that parallel processing introduces which SMP solves.




* Re: Parallel processing
  2022-03-25  4:34 Perry Smith
  2022-03-25 18:27 ` Bart Schaefer
  2022-03-27 19:23 ` Dominik Vogt
@ 2022-03-28 13:31 ` Perry Smith
  2 siblings, 0 replies; 14+ messages in thread
From: Perry Smith @ 2022-03-28 13:31 UTC (permalink / raw)
  To: zsh-users


Perhaps I need to think about this another way and that is via “locking”.

From a 10,000 foot view, I could map each target to a file (path) and use the old-fashioned practice of putting the PID in the file, doing so atomically and with the ability to detect that the owner of the lock might have died, and so on.

Now just spin off N tasks, each with a complete list of targets.  The basic flow would be: for each item in the list of targets, see if it is done.  If it isn’t done, see if it is locked.  If it isn’t locked, set the lock (atomically) and then do the processing to create the target.

Now, if I want 4 processes running, I can just start four jobs in the background.
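
A minimal sketch of the flow described above, written in portable shell so that it should also run under zsh. It swaps the PID-in-a-file lock for `mkdir`, which is atomic, and it makes no attempt to detect a lock whose owner has died. The names `work_dir`, `process_target`, and `worker` are hypothetical, `process_target` is only a stand-in for the real long-running command, and target names are assumed not to contain slashes.

```shell
# Scratch area holding one lock directory and one "done" marker per target.
work_dir=$(mktemp -d "${TMPDIR:-/tmp}/pool.XXXXXX")
mkdir "$work_dir/locks" "$work_dir/done"

# Stand-in for the real multi-hour command (assumption for illustration).
process_target() {
  sleep 1
  : > "$work_dir/done/$1"
}

# Each worker walks the complete target list and claims whatever is unclaimed.
worker() {
  for target in "$@"; do
    # mkdir either creates the lock or fails, so only one worker wins.
    if mkdir "$work_dir/locks/$target" 2>/dev/null; then
      process_target "$target"
    fi
  done
}

# Four workers, each handed the full list; names may contain spaces.
for i in 1 2 3 4; do
  worker "target one" "target two" "target three" "b" "c" "d" &
done
wait

# Count the markers left behind, then clean up.
set -- "$work_dir"/done/*
done_count=$#
echo "targets completed: $done_count"
rm -rf "$work_dir"
```

Changing the number of iterations in the last loop changes how many jobs run at once. Because the lock directories are never removed while the workers run, a claimed target is also treated as done, which is a simplification of the "done, then locked" check in the description.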

> On Mar 24, 2022, at 23:34, Perry Smith <pedz@easesoftware.com> wrote:
> 
> Has something like prll (parallel) https://github.com/exzombie/prll been added to zsh?
> 
> I need to do about 20 commands.  Each will take several hours to perhaps days.  I’d like to start some fixed number like 4 jobs that are running.  The others are waiting for one of the others to get finished.
> 
> I tried creating a Makefile and use the -j option in make but my targets have spaces in them and make doesn’t like that.  (This is on a BSD system.)
> 
> Thank you,
> Perry




* Re: Parallel processing
  2022-03-27 19:23 ` Dominik Vogt
@ 2022-03-28 13:26   ` Perry Smith
  0 siblings, 0 replies; 14+ messages in thread
From: Perry Smith @ 2022-03-28 13:26 UTC (permalink / raw)
  To: zsh-users



> On Mar 27, 2022, at 14:23, Dominik Vogt <dominik.vogt@gmx.de> wrote:
> 
> On Thu, Mar 24, 2022 at 11:34:07PM -0500, Perry Smith wrote:
>> I need to do about 20 commands.  Each will take several hours to perhaps days.  I’d like to start some fixed number like 4 jobs that are running.  The others are waiting for one of the others to get finished.
>> 
> 
>> I tried creating a Makefile and use the -j option in make but my
>> targets have spaces in them and make doesn’t like that.  (This
>> is on a BSD system.)
> 
> Are you talking about GNU Make?  What is the problem?
> 
> $ cat Makefile
> TARGETS = a\ b foo\ bar
> all: $(TARGETS)
> foo\ bar:
>        touch "$@"
> a\ b:
>        touch "$@"
> $ ls .
> Makefile
> $ make -j 2
> $ ls .
> a b  foo bar  Makefile
> 

I’m inside a BSD “jail”.  They have their own make (I believe).  Let me try the syntax and see if I can get it to work.  The next challenge will be to get this syntax to work from within a list so I can say:

${LIST_OF_TARGETS} :

Where an item in the list can have spaces.






* Re: Parallel processing
  2022-03-25  4:34 Perry Smith
  2022-03-25 18:27 ` Bart Schaefer
@ 2022-03-27 19:23 ` Dominik Vogt
  2022-03-28 13:26   ` Perry Smith
  2022-03-28 13:31 ` Perry Smith
  2 siblings, 1 reply; 14+ messages in thread
From: Dominik Vogt @ 2022-03-27 19:23 UTC (permalink / raw)
  To: zsh-users

On Thu, Mar 24, 2022 at 11:34:07PM -0500, Perry Smith wrote:
> I need to do about 20 commands.  Each will take several hours to perhaps days.  I’d like to start some fixed number like 4 jobs that are running.  The others are waiting for one of the others to get finished.
>

> I tried creating a Makefile and use the -j option in make but my
> targets have spaces in them and make doesn’t like that.  (This
> is on a BSD system.)

Are you talking about GNU Make?  What is the problem?

 $ cat Makefile
 TARGETS = a\ b foo\ bar
 all: $(TARGETS)
 foo\ bar:
        touch "$@"
 a\ b:
        touch "$@"
 $ ls .
 Makefile
 $ make -j 2
 $ ls .
 a b  foo bar  Makefile

Ciao

Dominik ^_^  ^_^

--

Dominik Vogt



* Re: Parallel processing
  2022-03-27 17:32       ` Philippe Troin
@ 2022-03-27 17:42         ` Bart Schaefer
  0 siblings, 0 replies; 14+ messages in thread
From: Bart Schaefer @ 2022-03-27 17:42 UTC (permalink / raw)
  To: Philippe Troin; +Cc: Perry Smith, Zsh Users

On Sun, Mar 27, 2022 at 10:32 AM Philippe Troin <phil@fifi.org> wrote:
>
> I was referring to the fact that zargs misses the exit status of
> subcommands:

Yes, I addressed that when I said
>> However, that "wait" returns the exit status of only one of those jobs.

> In the first invocation, zargs misses that one of the subshells returns
> non-zero exit status.

That's also mentioned in the comments in the source file:
# With the --max-procs option, zargs may not correctly capture the exit
# status of the backgrounded jobs, because of limitations of the "wait"
# builtin.  If the zsh/parameter module is not available, the status is
# NEVER correctly returned, otherwise the status of the longest-running
# job in each batch is captured.

However, see patch submitted a few minutes ago to zsh-workers.



* Re: Parallel processing
  2022-03-26 22:19     ` Bart Schaefer
@ 2022-03-27 17:32       ` Philippe Troin
  2022-03-27 17:42         ` Bart Schaefer
  0 siblings, 1 reply; 14+ messages in thread
From: Philippe Troin @ 2022-03-27 17:32 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Perry Smith, Zsh Users

On Sat, 2022-03-26 at 15:19 -0700, Bart Schaefer wrote:
> On Sat, Mar 26, 2022 at 11:10 AM Philippe Troin <phil@fifi.org> wrote:
> 
> 
> > Anyways, zargs is not doing a stellar job currently with collecting
> > exit statuses from commands run in parallel:
> 
> # Everything has to be in a subshell just in case of backgrounding jobs,
> # so that we don't unintentionally "wait" for jobs of the parent shell.
> 
> Hmm ... zargs uses
>   wait ${${jobstates[(R)running:*]/#*:/}/%=*/}
> to wait for all the backgrounded jobs that it started.  (This causes a
> segfault in the most recent git checkout if zargs itself is a subshell
> job.)  However, that "wait" returns the exit status of only one of
> those jobs.  There might be something more that could be done now, to
> pick up the status of the rest ... but I'm reluctant to mess with that
> while the segfault is unfixed.
> 
> >    % zargs -n 4 -P 2 -- 0 1 -- zsh -c 'sleep $1 ; exit $1 ' -; echo $?
> >    123
> 
> This is explained in the comments in zargs:
> 
> # Like xargs, zargs exits with the following status:
> #   0 if it succeeds
> #   123 if any invocation of the command exited with status 1-125
> #   124 if the command exited with status 255
> #   125 if the command is killed by a signal
> #   126 if the command cannot be run
> #   127 if the command is not found
> #   1 if some other error occurred.

I was referring to the fact that zargs misses the exit status of
subcommands:

   % zargs -n 4 -P 2 -- 1 0 -- zsh -c 'sleep $1 ; exit $1 ' -; echo $?
   0
   % zargs -n 4 -P 2 -- 0 1 -- zsh -c 'sleep $1 ; exit $1 ' -; echo $?
   123

Both zargs invocations will spawn two subcommands.  In both cases one
subcommand will exit with status 0 and the other with status 1.

In the first invocation, zargs misses that one of the subshells returns
non-zero exit status.

Phil.



* Re: Parallel processing
  2022-03-26 18:10   ` Philippe Troin
@ 2022-03-26 22:19     ` Bart Schaefer
  2022-03-27 17:32       ` Philippe Troin
  0 siblings, 1 reply; 14+ messages in thread
From: Bart Schaefer @ 2022-03-26 22:19 UTC (permalink / raw)
  To: Philippe Troin; +Cc: Perry Smith, Zsh Users

On Sat, Mar 26, 2022 at 11:10 AM Philippe Troin <phil@fifi.org> wrote:
>
> Collecting background jobs' exit status is discussed in the manual,
> under the POSIX_JOBS option:
>
>    In  previous  versions  of the shell, it was necessary to enable
>    POSIX_JOBS in order for the builtin command wait to  return  the
>    status of  background jobs that had already exited.  This is no
>    longer the case.
>
> Setting/unsetting POSIX_JOBS does not make any difference.

% (sleep 1; exit 13) & PID=$!; wait; wait $PID; echo "wait: $?"
[1] 64029
[1]  + exit 13    ( sleep 1; exit 13; )
wait: 13

The "wait" command with no arguments does not return a status for any
job, it just hangs until there are no jobs left, and then returns 0.
This is consistent with e.g. bash.  So when the doc says "The exit
status from this command is that of the job waited for" it should be
more precise about what "the job" means.  You have to identify in the
argument list the job(s) whose status you want.
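
As a sketch of that last point, the following keeps each background PID and waits for them one at a time, so no exit status is lost. It is written in portable shell and should behave the same under zsh; the exit codes 0, 13, and 2 and the variable names are made up for illustration.

```shell
# Launch background jobs and record each PID in the positional parameters,
# which behave the same way in zsh, bash, and POSIX sh.
set --
for code in 0 13 2; do
  (sleep 1; exit "$code") &
  set -- "$@" "$!"
done

# Wait for each PID by name; a bare "wait" would only ever report 0.
statuses=""
failed=0
for pid in "$@"; do
  rc=0
  wait "$pid" || rc=$?
  statuses="$statuses $rc"
  [ "$rc" -eq 0 ] || failed=$((failed + 1))
done
echo "statuses:$statuses failed: $failed"
# prints "statuses: 0 13 2 failed: 2"
```

The statuses come back in launch order, not completion order, because each `wait` names a specific PID.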

> There does not seem to be a way to retrieve the exit
> status of a command as soon as SIGCHLD is trapped:

Indeed ... the trap handlers are called when the signal arrives (so
the child process has exited) but before the job has actually been [C
library level] waited-for, so the internal tables that are used to
track the exit status of those children have not been updated.  The
job search code invoked by the "wait" builtin therefore can't find the
job.

> Anyways, zargs is not doing a stellar job currently with collecting
> exit statuses from commands run in parallel:

# Everything has to be in a subshell just in case of backgrounding jobs,
# so that we don't unintentionally "wait" for jobs of the parent shell.

Hmm ... zargs uses
  wait ${${jobstates[(R)running:*]/#*:/}/%=*/}
to wait for all the backgrounded jobs that it started.  (This causes a
segfault in the most recent git checkout if zargs itself is a subshell
job.)  However, that "wait" returns the exit status of only one of
those jobs.  There might be something more that could be done now, to
pick up the status of the rest ... but I'm reluctant to mess with that
while the segfault is unfixed.

>    % zargs -n 4 -P 2 -- 0 1 -- zsh -c 'sleep $1 ; exit $1 ' -; echo $?
>    123

This is explained in the comments in zargs:

# Like xargs, zargs exits with the following status:
#   0 if it succeeds
#   123 if any invocation of the command exited with status 1-125
#   124 if the command exited with status 255
#   125 if the command is killed by a signal
#   126 if the command cannot be run
#   127 if the command is not found
#   1 if some other error occurred.



* Re: Parallel processing
  2022-03-25 18:27 ` Bart Schaefer
@ 2022-03-26 18:10   ` Philippe Troin
  2022-03-26 22:19     ` Bart Schaefer
  0 siblings, 1 reply; 14+ messages in thread
From: Philippe Troin @ 2022-03-26 18:10 UTC (permalink / raw)
  To: Bart Schaefer, Perry Smith; +Cc: Zsh Users

On Fri, 2022-03-25 at 11:27 -0700, Bart Schaefer wrote:
> On Thu, Mar 24, 2022 at 9:34 PM Perry Smith <pedz@easesoftware.com> wrote:
> 
> 
> This isn't exactly what you want because it waits for all four jobs
> before starting the next batch, but keeping a specific number of
> children running is not straightforward with the job-management
> operations available to a shell.

There may be a way to achieve keeping a set number of children around,
by trapping SIGCHLD, but we would completely lose the exit status of
the command.  There does not seem to be a way to retrieve the exit
status of a command as soon as SIGCHLD is trapped:

   % zsh -f
   % echo $ZSH_VERSION
   5.8.1
   
   % setopt monitor
   % trap 'x=$?; echo "CHLD args=$* exit=$x"; wait $PID ' CHLD; (sleep 1;  exit 1) & PID=$!; wait; echo "wait: $?"
   [1] 1192215
   [1]  + exit 1     ( sleep 1; exit 1; )
   CHLD args= exit=0
   wait: pid 1192215 is not a child of this shell
   wait: 0
   
   % setopt nomonitor
   % trap 'x=$?; echo "CHLD args=$* exit=$x"; wait $PID ' CHLD; (sleep 1; exit 1) & PID=$!; wait; echo "wait: $?"
   CHLD args= exit=0
   wait: pid 1192528 is not a child of this shell
   wait: 0
   
   % trap - SIGCHLD
   % TRAPCHLD()  { echo "CHLD args=$*"; wait $PID }; (sleep 1; exit 1) & PID=$!; wait; echo "wait: $?"
   CHLD args=17
   TRAPCHLD:wait: pid 1192701 is not a child of this shell
   wait: 0

Collecting background jobs' exit status is discussed in the manual,
under the POSIX_JOBS option:

   In  previous  versions  of the shell, it was necessary to enable
   POSIX_JOBS in order for the builtin command wait to  return  the
   status of  background jobs that had already exited.  This is no
   longer the case.

Setting/unsetting POSIX_JOBS does not make any difference.

Anyways, zargs is not doing a stellar job currently with collecting
exit statuses from commands run in parallel:

   % zsh -f
   % autoload zargs
   % zargs -n 4 -P 2 -- 1 0 -- zsh -c 'sleep $1 ; exit $1 ' -; echo $?
   0
   % zargs -n 4 -P 2 -- 0 1 -- zsh -c 'sleep $1 ; exit $1 ' -; echo $?
   123
   % zargs -n 2 -P 2 -- 1 0 -- eval '(){ sleep $1 ; return $1 }' ; echo $?
   0
   % zargs -n 2 -P 2 -- 0 1 -- eval '(){ sleep $1 ; return $1 }' ; echo $? 
   123

Phil.



* Parallel processing
@ 2022-03-26  4:34 jdh
  0 siblings, 0 replies; 14+ messages in thread
From: jdh @ 2022-03-26  4:34 UTC (permalink / raw)
  To: zsh-users


"Not every program reacts well to parallel processing, for example conventional (non-SSD) disk IO becomes magnitudes slower if two processes read/write disk at the same time because the disk head needs to keep jumping back and forth between different areas of the HDD to serve multiple processes."

This is not in the zsh topic realm, but consideration should be given to avoiding hard disk drive thrashing.

dh



* Re: Parallel processing
  2022-03-25  4:34 Perry Smith
@ 2022-03-25 18:27 ` Bart Schaefer
  2022-03-26 18:10   ` Philippe Troin
  2022-03-27 19:23 ` Dominik Vogt
  2022-03-28 13:31 ` Perry Smith
  2 siblings, 1 reply; 14+ messages in thread
From: Bart Schaefer @ 2022-03-25 18:27 UTC (permalink / raw)
  To: Perry Smith; +Cc: Zsh Users

On Thu, Mar 24, 2022 at 9:34 PM Perry Smith <pedz@easesoftware.com> wrote:
>
> Has something like prll (parallel) https://github.com/exzombie/prll been added to zsh?

Look at the "zargs" function, -P option (which works like "xargs -P"
for the most part).

Silly example:

autoload zargs
zmodload zsh/system
zargs -n 2 -P 4 -- {1..20} -- eval '() { print $1 $sysparams[pid]; sleep 2 }'

Note you need -n 2 there because the first "word" is the argument to
eval and the second "word" is the successive integer from the list 1
through 20.

This isn't exactly what you want because it waits for all four jobs
before starting the next batch, but keeping a specific number of
children running is not straightforward with the job-management
operations available to a shell.
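
For comparison, the batch behaviour described above can be sketched without zargs in portable shell. This is only an illustration: the `sleep 1` job and the names `batch_size`, `in_flight`, and `batches` are placeholders, not anything zargs itself uses.

```shell
batch_size=4
in_flight=0
batches=0

for n in 1 2 3 4 5 6 7 8 9 10; do
  # Placeholder job; a real command would go here.
  ( sleep 1; echo "job $n done" ) &
  in_flight=$((in_flight + 1))

  # Once a full batch is running, block until every job in it has exited.
  if [ "$in_flight" -eq "$batch_size" ]; then
    wait
    batches=$((batches + 1))
    in_flight=0
  fi
done

# Collect the final, partial batch if there is one.
if [ "$in_flight" -gt 0 ]; then
  wait
  batches=$((batches + 1))
fi
echo "ran $batches batches"
```

With ten jobs and a batch size of four, the jobs run as groups of 4, 4, and 2. A slow job in one group delays the start of the whole next group, which is the limitation noted above.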



* Parallel processing
@ 2022-03-25  4:34 Perry Smith
  2022-03-25 18:27 ` Bart Schaefer
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Perry Smith @ 2022-03-25  4:34 UTC (permalink / raw)
  To: zsh-users


Has something like prll (parallel) https://github.com/exzombie/prll been added to zsh?

I need to do about 20 commands.  Each will take several hours to perhaps days.  I’d like to start some fixed number like 4 jobs that are running.  The others are waiting for one of the others to get finished.

I tried creating a Makefile and use the -j option in make but my targets have spaces in them and make doesn’t like that.  (This is on a BSD system.)

Thank you,
Perry



