From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (from weis@localhost)
	by pauillac.inria.fr (8.7.6/8.7.3) id TAA23652
	for caml-red; Thu, 14 Dec 2000 19:23:37 +0100 (MET)
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id OAA26645
	for ; Wed, 13 Dec 2000 14:29:12 +0100 (MET)
Received: from dynabook (h12-001.tokyu-net.catv.ne.jp [202.221.12.1])
	by concorde.inria.fr (8.11.1/8.10.0) with ESMTP id eBDDTAH09355
	for ; Wed, 13 Dec 2000 14:29:10 +0100 (MET)
Received: from dynabook ([127.0.0.1] helo=localhost ident=sumii)
	by dynabook with esmtp (Exim 3.12 #1 (Debian))
	id 146BxQ-00008C-00; Wed, 13 Dec 2000 22:28:32 +0900
To: proff@iq.org
Cc: caml-list@inria.fr, sumii@venus.is.s.u-tokyo.ac.jp
Subject: Re: substring match like "strstr"
In-Reply-To:
References: <14902.6640.301542.507040@pc803>
	<20001213190207A.sumii@yl.is.s.u-tokyo.ac.jp>
X-Mailer: Mew version 1.94.2 on Emacs 20.7 / Mule 4.1 (AOI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id: <20001213222831B.sumii@yl.is.s.u-tokyo.ac.jp>
Date: Wed, 13 Dec 2000 22:28:31 +0900
From: Eijiro Sumii
X-Dispatcher: imput version 991025(IM133)
Sender: weis@pauillac.inria.fr

Julian Assange wrote:
> All of these more sophisticated methods are only going to be a win
> when you are searching for larger strings. strstr is brutally simple,
> has little setup/cleanup time, and hence is hard to beat for short
> strings.

That is completely correct, and the substrings in my application are
indeed short (3 characters), but:

eijiro_sumii@anet.ne.jp wrote:
| Since my program calls the function more than 3000 times for _each_
| pattern, I did partial application, of course. Even so, kmp in OCaml
| still performed much worse than strstr in libc (both Sun's and GNU's),
| at least in this program.

I expected that searching 3000 strings per substring, each of which is
about 20 characters long as I mentioned before, would be enough to pay
off the "setup/cleanup" overhead. (On the other hand, strstr in libc
doesn't benefit from the partial application, of course.)

Eijiro
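
For readers unfamiliar with the idea, here is a minimal sketch of what
"partial application" of KMP could look like in OCaml. It is not the
code from the program discussed above. The names build_failure and
kmp_search are hypothetical, and returning -1 for "no match" is an
assumption made for brevity. The point is only that the failure table
is built once when the pattern is supplied and then reused by the
returned closure for every text.

```ocaml
(* Hypothetical sketch, not the original program's code. *)

(* Build the KMP failure table: fail.(i) is the length of the longest
   proper prefix of pat.[0..i] that is also a suffix of pat.[0..i]. *)
let build_failure (pat : string) : int array =
  let m = String.length pat in
  let fail = Array.make m 0 in
  let k = ref 0 in
  for i = 1 to m - 1 do
    while !k > 0 && pat.[i] <> pat.[!k] do
      k := fail.(!k - 1)
    done;
    if pat.[i] = pat.[!k] then incr k;
    fail.(i) <- !k
  done;
  fail

(* kmp_search pat performs the setup and returns a closure. Applying
   the closure to a text yields the index of the first match, or -1
   if there is none (an assumed convention). *)
let kmp_search (pat : string) : string -> int =
  let m = String.length pat in
  let fail = build_failure pat in   (* paid once per pattern *)
  fun text ->
    if m = 0 then 0
    else begin
      let n = String.length text in
      let k = ref 0 in              (* pattern characters matched so far *)
      let result = ref (-1) in
      let i = ref 0 in
      while !result < 0 && !i < n do
        while !k > 0 && text.[!i] <> pat.[!k] do
          k := fail.(!k - 1)
        done;
        if text.[!i] = pat.[!k] then incr k;
        if !k = m then result := !i - m + 1;
        incr i
      done;
      !result
    end

(* Usage: the table for "abc" is built once, then reused per text. *)
let find_abc = kmp_search "abc"
let hits = List.map find_abc ["xxabcxx"; "ababab"; "abc"]
```

With a 3-character pattern the table is tiny, so the setup saved per
call is small compared with the closure call and the bounds-checked
string accesses in the inner loop.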