From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA06730 for caml-red; Fri, 15 Dec 2000 13:52:50 +0100 (MET)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id WAA17084 for <caml-list@pauillac.inria.fr>; Thu, 14 Dec 2000 22:12:07 +0100 (MET)
Received: from math.berkeley.edu (gold.Math.Berkeley.EDU [169.229.58.61])
	by nez-perce.inria.fr (8.11.1/8.10.0) with ESMTP id eBELC5L04229
	for <caml-list@inria.fr>; Thu, 14 Dec 2000 22:12:06 +0100 (MET)
Received: from yuban-c.math.Berkeley.EDU (blue1.Math.Berkeley.EDU [169.229.58.58])
	by math.berkeley.edu (8.9.3/8.9.3) with ESMTP id NAA02999;
	Thu, 14 Dec 2000 13:12:04 -0800 (PST)
Received: (from datta@localhost)
	by yuban-c.math.Berkeley.EDU (8.9.3/8.9.3) id NAA00172;
	Thu, 14 Dec 2000 13:12:04 -0800 (PST)
Date: Thu, 14 Dec 2000 13:12:04 -0800
From: Ruchira Datta <datta@math.berkeley.edu>
To: caml-list@inria.fr
Subject: Re: substring match like "strstr"
Message-ID: <20001214131204.A29664@blue1.berkeley.edu>
Reply-To: datta@math.berkeley.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Sender: weis@pauillac.inria.fr

John Prevost wrote:
>I just grabbed the glibc version, and found the following comment at
>the head of it:

>/*
> * My personal strstr() implementation that beats most other algorithms.
> * Until someone tells me otherwise, I assume that this is the
> * fastest implementation of strstr() in C.
> * I deliberately chose not to comment it.  You should have at least
> * as much fun trying to understand it, as I had to write it :-).
> *
> * Stephen R. van den Berg, berg@pool.informatik.rwth-aachen.de */

>Since the text of the function involves not only nasty gotos
>(particularly frightening is "goto shloop;" but also pointer tricks
>and register declarations, I'd say it's to be expected that it'll
>outperform O'Caml.  I am, however, seeing if I can figure out the
>essential algorithm used (if there's really anything smart to it) and
>will let you know if I come up with anything extra special.

The frightening tricks are shocking to us functional programmers :-) but
once you get over your shock and read through it, you won't have any
problem understanding it.  At each point, don't ask yourself ``why is
this statement written in this obfuscated way?''  Just assume as an
article of faith that this incantation is invoked to propitiate the
Almighty Assembler :-).  Instead ask yourself ``what does this statement
do?''  That is fairly simple...

There is absolutely nothing smart about the algorithm; it is exactly
the brute-force one.  For example, the first character of the needle
is b, and the second character is c.  Suppose we have matched b and
c and then find that we don't match the whole needle.  Then we back
up our haystack pointer all the way back to the character after the
last potential match, i.e., to the character that matched c, and start 
from the beginning again by checking if it matches b.  We do this every 
time regardless of whether b and c are different.  I guess this is to 
save one register variable.

The speed comes from the low-level assembly language optimization; C is
being used as portable assembler.  I am sure it took a lot of ingenuity
and very intimate knowledge of the gcc compiler to do such a low-level
optimization while remaining in portable C.  But it is not something for
OCaml users to emulate.

Ruchira Datta
datta@math.berkeley.edu