zsh-users
* Re: zsh poor performances while reading and testing ?
       [not found] <20190703135824.GA19289__20170.6622539618$1562162400$gmane$org@prometheus.u-strasbg.fr>
@ 2019-07-03 14:28 ` Stephane Chazelas
  2019-07-05 13:28   ` Marc Chantreux
       [not found]   ` <20190705132842.GA21074__38577.2207098611$1562333415$gmane$org@prometheus.u-strasbg.fr>
  0 siblings, 2 replies; 9+ messages in thread
From: Stephane Chazelas @ 2019-07-03 14:28 UTC (permalink / raw)
  To: Marc Chantreux; +Cc: Zsh Users

2019-07-03 15:58:24 +0200, Marc Chantreux:
[...]
> i recently made a benchmark to emphasize the gain of speed
> people can have using filters instead of pure shell loops.
> however i was surprised to see how slow zsh is compared to
> other shells when it comes to read data.
> 
> the interesting part of the benchmark is:
> 
>   for it (bash zsh ksh) {
>     TIMEFMT=": $it %U %S %E"
>     time $it  -c 'while read it; do : ; done < x > /dev/null'
>   }
> 
> : bash 4,95s 1,21s 6,18s
> : zsh 12,65s 28,12s 40,82s
> : ksh 9,87s 26,52s 36,42s
> 
> is there any obvious reason for that? is there a way to make
> it faster without diving in the C code?
[...]

read reads one byte at a time so as not to read past the newline
character. On seekable files bash and ksh implement an
optimisation whereby they read more than one byte (128 bytes in
bash IIRC) but seek back to the last newline.
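
A minimal sketch of why read must not consume anything past the newline, assuming a POSIX-like shell with mktemp available (the temporary file and the variable names are made up for illustration). read takes the first line, and cat, which inherits the same file descriptor, has to find the offset sitting right after that newline, whether the shell got there one byte at a time or by reading a block and seeking back:

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# read takes the first line; cat inherits the same file descriptor and
# starts at whatever offset the shell left behind.
{ IFS= read -r first; rest=$(cat); } < "$tmp"

printf 'first=[%s]\n' "$first"   # first=[one]
printf 'rest=[%s]\n' "$rest"     # rest=[two and three, on separate lines]
rm -f "$tmp"
```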

ksh93 goes even further in that it remembers the excess bytes it
has read and reuses them for the next builtin commands, causing
this kind of bug: https://github.com/att/ast/issues/15

You'll probably find that they are all as inefficient for
non-seekable non-peekable input like pipes.

IIRC, ksh93 implements "|" with socketpair() instead of pipe()
on Linux so it can "peek" data to do this kind of optimisation:

$ strace  ksh -c 'seq 20 | read'
[...]
socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
shutdown(4, SHUT_RD)                    = 0
fchmod(4, 0200)                         = 0
shutdown(3, SHUT_WR)                    = 0
fchmod(3, 0400)                         = 0
[...]
fcntl(3, F_DUPFD, 0)                    = 0
[...]
recvfrom(0, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14"..., 65536, MSG_PEEK, NULL, NULL) = 51
read(0, "1\n", 2)                       = 2
[...]

Again, that kind of optimisation can backfire and be invalid if
there's more than one reader on the pipe (not to mention the
problems on Linux, where /dev/fd/x doesn't work with socketpairs).
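
To make the "more than one reader" case concrete, here is a small sketch, assuming a POSIX-like shell (the variable names are only illustrative). The shell's read and an external cat share one pipe; if read had buffered more than its own line, cat would lose data:

```shell
# printf writes three lines into a pipe; the shell's read takes the first
# and cat, a second reader on the same pipe, must still see the other two.
out=$(printf 'one\ntwo\nthree\n' | {
  IFS= read -r first
  printf 'read got: %s\n' "$first"
  cat
}
)
printf '%s\n' "$out"
```

On a pipe there is no way to seek back, which is why a shell that wants this to stay correct has to read byte by byte or peek.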

By the way, the syntax to read a line is

IFS= read -r line

not

read line

even in zsh.
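
A quick sketch of the difference, assuming a POSIX-like shell (the sample string and variable names are made up). Without IFS= the leading and trailing blanks are stripped, and without -r the backslash is treated as an escape character and removed:

```shell
# Literal input: two spaces, a, backslash, t, b, two spaces.
input='  a\tb  '

# Plain read: IFS whitespace is trimmed and the backslash is consumed.
plain=$(printf '%s\n' "$input" | { read line; printf '%s' "$line"; } )

# IFS= read -r: the line comes through byte for byte.
exact=$(printf '%s\n' "$input" | { IFS= read -r line; printf '%s' "$line"; } )

printf 'read line         -> [%s]\n' "$plain"   # [atb]
printf 'IFS= read -r line -> [%s]\n' "$exact"   # [  a\tb  ]
```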

-- 
Stephane


* Re: zsh poor performances while reading and testing ?
  2019-07-03 14:28 ` zsh poor performances while reading and testing ? Stephane Chazelas
@ 2019-07-05 13:28   ` Marc Chantreux
  2019-07-05 13:44     ` Peter Stephenson
  2019-07-05 13:44     ` Mikael Magnusson
       [not found]   ` <20190705132842.GA21074__38577.2207098611$1562333415$gmane$org@prometheus.u-strasbg.fr>
  1 sibling, 2 replies; 9+ messages in thread
From: Marc Chantreux @ 2019-07-05 13:28 UTC (permalink / raw)
  To: Marc Chantreux, Zsh Users

hello,

> You'll probably find that they are all as inefficient for
> non-seekable non-peekable input like pipes.

actually my point making this bench was: don't use shell to write
serious filters. however i really appreciate knowing why this difference
exists. thanks a lot.
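
For illustration only, a sketch of the loop-versus-filter comparison, assuming seq, awk and mktemp are available (this is not the benchmark from the original post, and the file and variable names are invented). Both halves compute the same sum; the loop pays for one read builtin call per line, while awk reads the file in large blocks:

```shell
tmp=$(mktemp)
seq 1 1000 > "$tmp"

# Pure shell loop: one read call per line.
loop_sum=0
while IFS= read -r n; do
  loop_sum=$((loop_sum + n))
done < "$tmp"

# Filter: a single awk process does the same work.
filter_sum=$(awk '{ s += $1 } END { print s }' "$tmp")

printf 'loop=%s filter=%s\n' "$loop_sum" "$filter_sum"
rm -f "$tmp"
```

Prefixing each half with time on a much larger input is one way to see the gap on a given machine.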

> IFS= read -r line

ok for -r but as long as i use only one variable, why is it important to
use IFS= ?

regards
marc


* Re: zsh poor performances while reading and testing ?
  2019-07-05 13:28   ` Marc Chantreux
@ 2019-07-05 13:44     ` Peter Stephenson
  2019-07-05 14:21       ` Marc Chantreux
  2019-07-05 13:44     ` Mikael Magnusson
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2019-07-05 13:44 UTC (permalink / raw)
  To: zsh-users

On Fri, 2019-07-05 at 15:28 +0200, Marc Chantreux wrote: 
> > You'll probably find that they are all as inefficient for
> > non-seekable non-peekable input like pipes.
> actually my point making this bench was: don't use shell to write
> serious filters. however i really appreciate knowing why this difference
> exists. thanks a lot.
> 
> > 
> > IFS= read -r line
> ok for -r but as long as i use only one variable, why is it important to
> use IFS= ?

As Look and Learn magazine used to say, Find Out By Doing.


% IFS= read -r line
  full line  
           ^^ spaces here
% print -r -- "'$line'"
'  full line  '
% read -r line
  full line  
           ^^ spaces here
% print -r -- "'$line'"
'full line'


pws



* Re: zsh poor performances while reading and testing ?
  2019-07-05 13:28   ` Marc Chantreux
  2019-07-05 13:44     ` Peter Stephenson
@ 2019-07-05 13:44     ` Mikael Magnusson
  1 sibling, 0 replies; 9+ messages in thread
From: Mikael Magnusson @ 2019-07-05 13:44 UTC (permalink / raw)
  To: Marc Chantreux; +Cc: Zsh Users

On 7/5/19, Marc Chantreux <eiro@phear.org> wrote:
> hello,
>
>> You'll probably find that they are all as inefficient for
>> non-seekable non-peekable input like pipes.
>
> actually my point making this bench was: don't use shell to write
> serious filters. however i really appreciate knowing why this difference
> exists. thanks a lot.
>
>> IFS= read -r line
>
> ok for -r but as long as i use only one variable, why is it important to
> use IFS= ?

% echo '  hello ' | IFS= read -r one; echo -E - _${one}_
_  hello _
% echo '  hello ' | read -r one; echo -E - _${one}_
_hello_


-- 
Mikael Magnusson


* Re: zsh poor performances while reading and testing ?
  2019-07-05 13:44     ` Peter Stephenson
@ 2019-07-05 14:21       ` Marc Chantreux
  2019-07-05 14:29         ` Peter Stephenson
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Chantreux @ 2019-07-05 14:21 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-users

hello,

thanks to you and Mikael, it's obvious now but frankly this is tricky
because it was very unexpected for me (i played around with read for
years without noticing this).

thanks and regards,
marc


* Re: zsh poor performances while reading and testing ?
  2019-07-05 14:21       ` Marc Chantreux
@ 2019-07-05 14:29         ` Peter Stephenson
  2019-07-05 15:53           ` Marc Chantreux
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2019-07-05 14:29 UTC (permalink / raw)
  To: zsh-users

> On 05 July 2019 at 15:21 Marc Chantreux <eiro@phear.org> wrote:
> thanks to you and Mikael, it's obvious now but frankly this is tricky
> because it was very unexpected for me (i played around with read for
> years without noticing this).

There are certainly good reasons apart from inefficiency for finding
scripting languages other than the shell for implementing utilities
involving I/O :-/.

Personally, I don't use shell scripting these days for anything other
than helping with interactive use, and I don't consider myself
particularly unfamiliar with the shell.

pws


* Re: zsh poor performances while reading and testing ?
  2019-07-05 14:29         ` Peter Stephenson
@ 2019-07-05 15:53           ` Marc Chantreux
  0 siblings, 0 replies; 9+ messages in thread
From: Marc Chantreux @ 2019-07-05 15:53 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-users

hello Peter,

> Personally, I don't use shell scripting these days for anything other
> than helping with interactive use, and I don't consider myself
> particularly unfamiliar with the shell.

doesn't seem so ;)

as i said: my first aim when i wrote this benchmark was to point out
how fast filters are compared to while loops. so i don't use zsh for
that kind of thing either.

i have been using zsh+(many filters) for decades now because in some
cases, it was the fastest way to get things done (perl5 isn't even close
to the simplicity of some shell scripts). the only recent game changer
for me is perl6: for the first time in my life, i feel it's time to kiss
shell scripting goodbye.

regards,
marc


* Re: zsh poor performances while reading and testing ?
       [not found]   ` <20190705132842.GA21074__38577.2207098611$1562333415$gmane$org@prometheus.u-strasbg.fr>
@ 2019-07-05 18:05     ` Stephane Chazelas
  0 siblings, 0 replies; 9+ messages in thread
From: Stephane Chazelas @ 2019-07-05 18:05 UTC (permalink / raw)
  To: Marc Chantreux; +Cc: Zsh Users

2019-07-05 15:28:42 +0200, Marc Chantreux:
[...]
> > IFS= read -r line
> 
> ok for -r but as long as i use only one variable, why is it important to
> use IFS= ?
[...]

See also:

https://unix.stackexchange.com/questions/209123/understanding-ifs-read-r-line

-- 
Stephane


* zsh poor performances while reading and testing ?
@ 2019-07-03 13:58 Marc Chantreux
  0 siblings, 0 replies; 9+ messages in thread
From: Marc Chantreux @ 2019-07-03 13:58 UTC (permalink / raw)
  To: Zsh Users

hello people,

i recently made a benchmark to emphasize the gain of speed
people can have using filters instead of pure shell loops.
however i was surprised to see how slow zsh is compared to
other shells when it comes to read data.

the interesting part of the benchmark is:

  for it (bash zsh ksh) {
    TIMEFMT=": $it %U %S %E"
    time $it  -c 'while read it; do : ; done < x > /dev/null'
  }

: bash 4,95s 1,21s 6,18s
: zsh 12,65s 28,12s 40,82s
: ksh 9,87s 26,52s 36,42s

is there any obvious reason for that? is there a way to make
it faster without diving in the C code?

regards,
marc

