zsh-users
* while read; problems
@ 2005-04-22 12:22 Mariusz Gniazdowski
  2005-04-22 14:36 ` DervishD
  2005-04-25 10:14 ` Peter Stephenson
  0 siblings, 2 replies; 3+ messages in thread
From: Mariusz Gniazdowski @ 2005-04-22 12:22 UTC (permalink / raw)
  To: zsh-users

Hello.
I wrote a function to find the longest line:

function count {
        local line max=0 idx=0 maxidx=0
        while read line; do
                line=${#line}
                (( idx++ ))
                [[ $max -lt $line ]] && { max=$line; maxidx=$idx; }
        done < "$1"
        echo "$max in line nr: $maxidx" 
}                         

First problem: this runs much faster on bash.

zsh# du -sh somefile.bin 
6,9M    somefile.bin
zsh# time ( count somefile.bin )
2,63s user 8,51s system 96% cpu 11,585 total


bash# time { count somefile.bin ;}
2501 in line nr: 9106

real    0m4.061s
user    0m3.682s
sys     0m0.222s


So 4 seconds compared to 11.5. There were tests where bash finished in 6 seconds and zsh in 21.


Second problem: zsh results differ from run to run. I had results like:
6097 in line nr: 169
6553 in line nr: 300
etc.

Bash always returns one result: 2501 in line nr: 9106

This file was binary, but I didn't output it or pass it to any function (in a variable). It was
only read line by line and counted.

-- 
Mariusz Gniazdowski


* Re: while read; problems
  2005-04-22 12:22 while read; problems Mariusz Gniazdowski
@ 2005-04-22 14:36 ` DervishD
  2005-04-25 10:14 ` Peter Stephenson
  1 sibling, 0 replies; 3+ messages in thread
From: DervishD @ 2005-04-22 14:36 UTC (permalink / raw)
  To: zsh-users

    Hi Mariusz :)

 * Mariusz Gniazdowski <cellsan@interia.pl> dixit:
> function count {
>         local line max=0 idx=0 maxidx=0
>         while read line; do
>                 line=${#line}
>                 (( idx++ ))
>                 [[ $max -lt $line ]] && { max=$line; maxidx=$idx; }
>         done < "$1"
>         echo "$max in line nr: $maxidx" 
> }                         

> First problem: this runs much faster on bash.

    I don't know why :? Here, your function applied to a 31MB binary
file takes 26 seconds in zsh and 14 in bash, so it's true, the
difference is noticeable. Unfortunately, the two shells do not
necessarily agree on what a 'line' is in a binary file (more on this
below), and bash may be reading longer lines and doing much less I/O :???
 
> Second problem: zsh results differ from run to run. I had results like:
> 6097 in line nr: 169
> 6553 in line nr: 300
> etc.

    Not reproducible here. I always get the same, consistent
result.
 
> This file was binary, but I didn't output it or pass it to any
> function (in a variable). It was only read line by line and
> counted.

    The problem with binary files is: what's a line? Some arbitrary
stream of characters until a '\n' is found? Some arbitrary stream of
characters that ends in a '\0'? Your measurement is unreliable because
you don't control what's in $IFS, and 'read' uses $IFS. You cannot
reliably do such things on binary files from a shell script.
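
    To illustrate the point about $IFS, here is a minimal sketch. The
helper names len_default and len_raw and the sample input are made up
for the illustration; they are not from the original function. With the
default $IFS, a plain 'read' drops leading and trailing blanks and
treats backslashes as escapes, so ${#line} is not the real line length:

```shell
# Hypothetical helpers, for illustration only.

# Length of the first input line as a plain `read` sees it:
# leading/trailing IFS whitespace is stripped, backslashes are consumed.
len_default() { local line; read line; echo "${#line}"; }

# Length with IFS emptied for this one call and backslash
# processing disabled (-r), so the line is kept as it was read.
len_raw() { local line; IFS= read -r line; echo "${#line}"; }

# Sample line: two spaces, "a", a literal backslash, "tb", a space, a tab.
printf '  a\\tb \t\n' | len_default   # prints 3
printf '  a\\tb \t\n' | len_raw       # prints 8
```

    The same input gives two different lengths, which is why the
maximum found by the original function depends on how 'read' is called.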

    Just out of curiosity: why are you messing with lines in a binary
file? Maybe you can do what you want in a faster and more portable
way. If you tell us, maybe we can help :)

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.dervishd.net & http://www.pleyades.net/
It's my PC and I'll cry if I want to...



* Re: while read; problems
  2005-04-22 12:22 while read; problems Mariusz Gniazdowski
  2005-04-22 14:36 ` DervishD
@ 2005-04-25 10:14 ` Peter Stephenson
  1 sibling, 0 replies; 3+ messages in thread
From: Peter Stephenson @ 2005-04-25 10:14 UTC (permalink / raw)
  To: Mariusz Gniazdowski, zsh-users

Mariusz Gniazdowski wrote:
> function count {
>         local line max=0 idx=0 maxidx=0
>         while read line; do
>                 line=${#line}
>                 (( idx++ ))
>                 [[ $max -lt $line ]] && { max=$line; maxidx=$idx; }
>         done < "$1"
>         echo "$max in line nr: $maxidx" 
> }                         
> 
> Second problem: zsh results differ from run to run. I had results like:
> 6097 in line nr: 169
> 6553 in line nr: 300
> etc.

Luckily, with debugging turned on, this produced an error message,
so I could find it without too much trouble.  It's yet another problem
with Meta characters.  (If this doesn't fix it, report it again.)

Note that the bug turned up in the code which strips trailing whitespace
from the line.  This is the correct behaviour, but it's not clear it's
what you want in a binary file.  You may need to set IFS to empty, as
DervishD mentioned.
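
As a rough sketch of what setting IFS to empty might look like in the
function under discussion (count_raw is a hypothetical name, and the -r
flag is an extra assumption so that backslashes are not consumed):

```shell
# Hypothetical variant of the original count function.
# IFS is emptied only for the read call, so no whitespace is stripped,
# and -r keeps backslashes as ordinary characters.
count_raw() {
    local line len max=0 idx=0 maxidx=0
    while IFS= read -r line; do
        len=${#line}
        idx=$(( idx + 1 ))
        # Remember the longest line seen so far and its line number.
        if [ "$len" -gt "$max" ]; then
            max=$len
            maxidx=$idx
        fi
    done < "$1"
    echo "$max in line nr: $maxidx"
}
```

It would be called the same way as before, for example
"count_raw somefile.bin".  This only removes the whitespace stripping;
shells can still differ on other bytes in a binary file (bash, for
instance, cannot keep NUL bytes in a variable), so results may still
not match between shells.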

Further discussion on the bug should go to zsh-workers.

Index: Src/builtin.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/builtin.c,v
retrieving revision 1.136
diff -u -r1.136 builtin.c
--- Src/builtin.c	9 Mar 2005 17:14:06 -0000	1.136
+++ Src/builtin.c	25 Apr 2005 10:09:21 -0000
@@ -4747,8 +4747,17 @@
 	}
 	signal_setmask(s);
     }
-    while (bptr > buf && iwsep(bptr[-1]))
-	bptr--;
+    while (bptr > buf) {
+	if (bptr > buf + 1 && bptr[-2] == Meta) {
+	    if (iwsep(bptr[-1] ^ 32))
+		bptr -= 2;
+	    else
+		break;
+	} else if (iwsep(bptr[-1]))
+	    bptr--;
+	else
+	    break;
+    }
     *bptr = '\0';
     if (resettty && SHTTY != -1)
 	settyinfo(&saveti);

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070

