Date: Wed, 3 Jul 2019 15:28:31 +0100
From: Stephane Chazelas
To: Marc Chantreux
Cc: Zsh Users
Subject: Re: zsh poor performances while reading and
 testing ?

2019-07-03 15:58:24 +0200, Marc Chantreux:
[...]
> i recently made a benchmark to emphasize the gain of speed
> people can have using filters instead of pure shell loops.
> however i was surprised to see how slow zsh is compared to
> other shells when it comes to read data.
>
> the interesting part of the benchmark is:
>
>     for it (bash zsh ksh) {
>         TIMEFMT=": $it %U %S %E"
>         time $it -c 'while read it; do : ; done < x > /dev/null'
>     }
>
>     : bash 4,95s 1,21s 6,18s
>     : zsh 12,65s 28,12s 40,82s
>     : ksh 9,87s 26,52s 36,42s
>
> is there any obvious reason for that? is there a way to make
> it faster without diving in the C code?
[...]

read reads one byte at a time so as not to read past the newline
character.

On seekable files, bash and ksh implement an optimisation whereby
they read more than one byte at a time (128 bytes in bash, IIRC) and
then seek back to just after the newline that ends the line.

ksh93 goes even further in that it remembers the excess bytes it
has read and reuses them for the next builtin commands, causing
this kind of bug: https://github.com/att/ast/issues/15

You'll probably find that they are all equally inefficient for
non-seekable, non-peekable input like pipes.

IIRC, ksh93 implements "|" with socketpair() instead of pipe() on
Linux so it can "peek" at the data to do this kind of optimisation:

$ strace ksh -c 'seq 20 | read'
[...]
socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
shutdown(4, SHUT_RD) = 0
fchmod(4, 0200) = 0
shutdown(3, SHUT_WR) = 0
fchmod(3, 0400) = 0
[...]
fcntl(3, F_DUPFD, 0) = 0
[...]
recvfrom(0, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14"..., 65536, MSG_PEEK, NULL, NULL) = 51
read(0, "1\n", 2) = 2
[...]

Again, that kind of optimisation can backfire and be invalid if
there's more than one reader on the pipe (not to mention the
problems on Linux, where /dev/fd/x doesn't work for socketpairs).

By the way, the syntax to read a line is

  IFS= read -r line

not

  read line

even in zsh.

-- 
Stephane
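
To make the two points above concrete, here is a minimal sketch in
plain sh. The function names (rest_after_read, show_plain, show_raw)
are made up for illustration and are not part of any shell. The first
function shows why read must not consume input past the newline: the
next command on the same stdin is expected to see the rest. The other
two compare "read line" with "IFS= read -r line" on a line that has
surrounding blanks and a backslash.

```shell
#!/bin/sh
# Hypothetical helper: consume one line with read, then let cat
# print whatever is still left on stdin.
rest_after_read() {
    IFS= read -r first
    cat
}

# Hypothetical helpers: read one line in each style, print it in brackets.
show_plain() { read line; printf '[%s]\n' "$line"; }
show_raw()   { IFS= read -r line; printf '[%s]\n' "$line"; }

tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# Seekable input (regular file) and non-seekable input (pipe):
# in both cases cat should still see the lines after the first one.
rest_after_read < "$tmp"
printf 'one\ntwo\nthree\n' | rest_after_read

# A line with leading/trailing blanks and a backslash.
printf '  a\\b  \n' | show_plain   # blanks stripped, backslash processed
printf '  a\\b  \n' | show_raw     # line kept as is

rm -f "$tmp"
```

Whether a given shell gets the first result by reading one byte at a
time or by reading a block and seeking back is an implementation
detail; strace will show which.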