From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id SAA06236 for caml-red; Mon, 11 Dec 2000 18:41:24 +0100 (MET) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id EAA25937 for ; Mon, 11 Dec 2000 04:57:42 +0100 (MET) Received: from dynabook (h12-001.tokyu-net.catv.ne.jp [202.221.12.1]) by concorde.inria.fr (8.11.1/8.10.0) with ESMTP id eBB3vev01701 for ; Mon, 11 Dec 2000 04:57:41 +0100 (MET) Received: from dynabook ([127.0.0.1] helo=localhost ident=sumii) by dynabook with esmtp (Exim 3.12 #1 (Debian)) id 145K5g-0004U2-00; Mon, 11 Dec 2000 12:57:28 +0900 To: caml-list@inria.fr From: eijiro_sumii@anet.ne.jp Cc: gerd@gerd-stolpmann.de, sumii@venus.is.s.u-tokyo.ac.jp Subject: Re: substring match like "strstr" In-Reply-To: <0012101646090G.00625@ice> References: <00120814135508.00625@ice> <20001210221657W.sumii@yl.is.s.u-tokyo.ac.jp> <0012101646090G.00625@ice> X-Mailer: Mew version 1.94.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20001211125727S.sumii@yl.is.s.u-tokyo.ac.jp> Date: Mon, 11 Dec 2000 12:57:27 +0900 X-Dispatcher: imput version 991025(IM133) Sender: weis@pauillac.inria.fr > I did not expect that OCamlB performs so well; so I suggested OcamlA and > regexp (which are both fast for the bytecode interpreter, too). I think it > depends also on the problem size (especially on the length of the substring). Right, I forgot to mention those -- in the benchmark, the "needles" (matched substrings) were 3 characters long and the "haystacks" (searched strings) were about 20 characters long; I used the native code compiler. For much larger input, regexp could possibly do better. Anyway, thank you for the helpful suggestions. // Eijiro Sumii (http://www.yl.is.s.u-tokyo.ac.jp/~sumii/) // // Ph.D. Student at Department of IS, University of Tokyo // Visiting Scholar at Department of CIS, University of Pennsylvania