From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 25 Jul 2009 10:30:02 +0100
From: Ethan Grammatikidis
To: 9fans@9fans.net
Message-Id: <20090725103002.ee62f133.eekee57@fastmail.fm>
In-Reply-To:
References: <80c99e790907240136p46bcf8ebn6232bbd85f25d21c@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] plan9port tools speed
Topicbox-Message-UUID: 2cbfc944-ead5-11e9-9d60-3106f5b1d025

On Fri, 24 Jul 2009 09:41:41 -0400 erik quanstrom wrote:

> > I've just installed the plan9port as described here (
> > http://swtch.com/plan9port/man/man1/install.html) on a debian box.
> > I was comparing the speed of some commands between the plan9 and the GNU
> > version, and I get consistently poorer results for the plan9 ones.
> > 'grep' for example, is at least twice as slow as its GNU counterpart.
>
> on my 64-bit system grepping through linux
> source, i do see the same performance difference
> you see.
>
> ; pwd ; which grep
> /usr/src/linux-2.6.29-gentoo-r5
> /home/quanstro/plan9/bin/grep
> ; for(f in grep /bin/grep)find .|grep '\.[ch]'|
>     xargs time $f -i 'plan[ ]*9'>/dev/null|[2]
>     awk '{a+=$1; b+=$2; c+=$3} END {print a "\t" b "\t"c}'
> 1.08 0.24 1.36
> 0.46 0.31 0.79
>
> but this is not a fair comparison. gnu
> grep should be using ascii since none of
> the local env variables have been set while
> p9p grep is using utf-8. let's level the playing
> field:
>
> ; ; LANG=en_US.UTF-8 for(f in grep /bin/grep)find .|
>     grep '\.[ch]'|xargs time $f -i 'plan[ ]*9'>/dev/null|[2]
>     awk '{a+=$1; b+=$2; c+=$3} END {print a "\t" b "\t"c}'
> 1.07 0.25 1.37
> 17.13 0.28 17.43
>
> this is actually a great improvement. gnu grep used to
> be 80x slower for utf-8 locales, now it's only 40x slower.
>
> - erik
>

Try LC_ALL=en_GB.UTF-8 for some weird, weird fun with gnu grep:

$ wc -l deep-file-list
470485 deep-file-list
$ 9 grep ethan deep-file-list |wc -l
428065
$ time grep ethan deep-file-list > /dev/null

real    4m29.491s
user    4m29.366s
sys     0m0.080s
$ time grep -F ethan deep-file-list > /dev/null

real    4m27.740s
user    4m27.576s
sys     0m0.070s
$ time awk '/ethan/ {print}' deep-file-list > /dev/null

real    0m2.597s
user    0m2.570s
sys     0m0.017s
$ time sed -n /ethan/p deep-file-list > /dev/null

real    0m0.294s
user    0m0.273s
sys     0m0.020s
$ time 9 grep ethan deep-file-list > /dev/null

real    0m0.155s
user    0m0.140s
sys     0m0.017s

Note the fixed pattern and the discarded output. Those are fairly
average timings. They put gnu grep at about 1700 times slower than
unstripped p9p grep. :)

Note that the awk and sed there are gnu awk and gnu sed, and both are
running under the same LC_ALL=en_GB.UTF-8 environment. Gnu sed comes in
at about 900 times faster than gnu grep, and gnu awk at about 100 times.

I took some more timings after some correspondence with
bug-grep@gnu.org. I do recall gnu grep was twice as fast as p9p grep
when given a plain ascii environment, but I haven't kept other results.

I don't know if the gnu grep maintainers are looking for a fix, or even
if they consider this extreme slowness a problem at all. It didn't sound
like it when I corresponded with them, but I guess that could simply
mean they didn't want to discuss it.

-- 
Ethan Grammatikidis

Those who are slower at parsing information must necessarily be
faster at problem-solving.
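
For anyone who wants to check the rounded ratios quoted in the message (roughly 1700x, 900x and 100x), here is a minimal sketch of the arithmetic. It only divides the "real" times reported above. The dictionary keys and the `slowdown` helper are names made up for this illustration and are not part of any tool mentioned in the thread.

```python
# Wall-clock ("real") times in seconds, copied from the timings in the message.
# The labels are illustrative names, not command names.
real_seconds = {
    "gnu grep": 4 * 60 + 29.491,     # 4m29.491s
    "gnu grep -F": 4 * 60 + 27.740,  # 4m27.740s
    "gnu awk": 2.597,
    "gnu sed": 0.294,
    "p9p grep": 0.155,
}


def slowdown(slow, fast):
    """Return how many times longer `slow` took than `fast` (hypothetical helper)."""
    return real_seconds[slow] / real_seconds[fast]


if __name__ == "__main__":
    # Compare gnu grep against each of the faster tools from the same run.
    for other in ("p9p grep", "gnu sed", "gnu awk"):
        print(f"gnu grep took {slowdown('gnu grep', other):.0f}x as long as {other}")
```

These are single-run wall-clock figures, so the ratios are only as good as the timings themselves; the sketch is meant to show where the rounded numbers come from, not to serve as a benchmark harness.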