From: Stephane Chazelas <stephane.chazelas@gmail.com>
To: Marc Chantreux <eiro@phear.org>
Cc: Zsh Users <zsh-users@zsh.org>
Subject: Re: zsh poor performances while reading and testing ?
Date: Wed, 3 Jul 2019 15:28:31 +0100
Message-ID: <20190703142831.dc762dlvgaxxvbmr@chaz.gmail.com>
In-Reply-To: <20190703135824.GA19289__20170.6622539618$1562162400$gmane$org@prometheus.u-strasbg.fr>

2019-07-03 15:58:24 +0200, Marc Chantreux:
[...]
> i recently made a benchmark to emphasize the gain of speed
> people can have using filters instead of pure shell loops.
> however i was surprised to see how slow zsh is compared to
> other shells when it comes to read data.
> 
> the interesting part of the benchmark is:
> 
>   for it (bash zsh ksh) {
>     TIMEFMT=": $it %U %S %E"
>     time $it  -c 'while read it; do : ; done < x > /dev/null'
>   }
> 
> : bash 4,95s 1,21s 6,18s
> : zsh 12,65s 28,12s 40,82s
> : ksh 9,87s 26,52s 36,42s
> 
> is there any obvious reason for that? is there a way to make
> it faster without diving in the C code?
[...]

read has to read its input one byte at a time so as not to read
past the newline character. On seekable files, bash and ksh
implement an optimisation whereby they read more than one byte at
a time (128 bytes in bash IIRC) and then seek back to just after
the newline that ended the line.

ksh93 goes even further in that it remembers the excess bytes it
has read and reuses them for the next builtin commands, causing
this kind of bug: https://github.com/att/ast/issues/15

You'll probably find that they are all as inefficient for
non-seekable non-peekable input like pipes.
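
To illustrate why read cannot simply slurp a large buffer, here is
a minimal sketch (the temporary file and the variable names are
made up for the example). After read has consumed the first line, a
separate process sharing the same file descriptor must find the
rest of the input untouched, whether the input is a seekable file
or a pipe:

```shell
# Illustrative sketch; tmpfile and the variable names are hypothetical.
tmpfile=$(mktemp)
printf '1\n2\n3\n' > "$tmpfile"

# Seekable input: the shell may read ahead, but must then seek back so
# that cat (a separate process sharing the fd) starts right after line 1.
from_file=$(
  { IFS= read -r first; printf 'first=%s\n' "$first"; cat; } < "$tmpfile"
)

# Pipe: seeking back is impossible, so the shell must not read past the
# newline in the first place.
from_pipe=$(
  printf '1\n2\n3\n' | { IFS= read -r first; printf 'first=%s\n' "$first"; cat; }
)

rm -f "$tmpfile"
printf '%s\n' "$from_file" "$from_pipe"
```

In both cases cat should still output lines 2 and 3; a shell that
read ahead on the pipe without being able to give the data back
would lose them.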

IIRC, ksh93 implements "|" with socketpair() instead of pipe()
on Linux so it can "peek" at the data to do this kind of
optimisation:

$ strace  ksh -c 'seq 20 | read'
[...]
socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
shutdown(4, SHUT_RD)                    = 0
fchmod(4, 0200)                         = 0
shutdown(3, SHUT_WR)                    = 0
fchmod(3, 0400)                         = 0
[...]
fcntl(3, F_DUPFD, 0)                    = 0
[...]
recvfrom(0, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14"..., 65536, MSG_PEEK, NULL, NULL) = 51
read(0, "1\n", 2)                       = 2
[...]

Again, that kind of optimisation can backfire and be invalid if
there's more than one reader on the pipe (not to mention the
problems on Linux, where /dev/fd/x doesn't work with socketpairs).

By the way, the syntax to read a line is

IFS= read -r line

not

read line

even in zsh.
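
A small sketch of the difference, assuming a POSIX-like read (the
sample input and the variable names are only illustrative): without
IFS= the leading and trailing blanks are stripped, and without -r
the backslash is treated as an escape character and removed.

```shell
# Illustrative input: leading/trailing blanks and a backslash.
input='  a\tb  '

# Plain read: IFS whitespace is stripped from both ends and the
# backslash is consumed as an escape character.
plain=$(printf '%s\n' "$input" | { read line; printf '%s' "[$line]"; })

# IFS= read -r: the line is stored verbatim.
exact=$(printf '%s\n' "$input" | { IFS= read -r line; printf '%s' "[$line]"; })

printf 'plain: %s\nexact: %s\n' "$plain" "$exact"
```

Here plain ends up as [atb] while exact keeps the original
[  a\tb  ].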

-- 
Stephane

