caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Chris Hecker <checker@d6.com>
To: gerd@gerd-stolpmann.de, eijiro_sumii@anet.ne.jp, caml-list@inria.fr
Cc: gerd@gerd-stolpmann.de, sumii@venus.is.s.u-tokyo.ac.jp
Subject: Re: substring match like "strstr"
Date: Mon, 11 Dec 2000 21:06:16 -0800	[thread overview]
Message-ID: <4.3.2.7.2.20001211202018.00b3f510@shell16.ba.best.com> (raw)
In-Reply-To: <0012112341150J.00625@ice>


>  let i_limit = strlen - patlen in

Ah, duh.  Good idea.

Here are some new timings, including the _imp2 (I fixed a minor fencepost bug in yours in an admittedly cheesy way; the code's below):

Under Win2k cmd.exe:
strstr total = 0.610000 (5,0,19,13)
strstr_imp total = 0.593000 (5,0,19,13)
strstr_imp2 total = 0.563000 (5,0,19,13)
strstr_c total = 0.469000 (5,0,19,13)

Under Win98 command.com:
strstr total = 1.460000 (5,0,19,13)
strstr_imp total = 1.020000 (5,0,19,13)
strstr_imp2 total = 0.955000 (5,0,19,13)
strstr_c total = 0.800000 (5,0,19,13)

Under Win98 emacs shell-command:
strstr total = 1.125000 (5,0,19,13)
strstr_imp total = 1.025000 (5,0,19,13)
strstr_imp2 total = 0.960000 (5,0,19,13)
strstr_c total = 0.800000 (5,0,19,13)

I doubled the number of iterations to try to figure out the emacs/command.com difference, but to no avail.

However, then I decided to check to see if this was actually the strstr in the library, so instead of calling that c code, I called string.h's strstr.  I discovered that there's actually an intel-asm version of strstr that's compiled into the crt:

strstr total = 1.130000 (5,0,19,13)
strstr_imp total = 1.020000 (5,0,19,13)
strstr_imp2 total = 0.960000 (5,0,19,13)
strstr_c total = 0.365000 (5,0,19,13)

Ouch!  We're back to 3x slower.  But hey, no one said ocaml could compete with asm.  :)

Anyway, as it stands, the ocaml code is 17% slower than the C code at relatively comparable code-simplicity.  I'd say the ocaml code is a bit harder to understand and wordier, but that's mostly because ocaml doesn't have modifyable pointers, and the C code just walks the pointers through the array.  Also, the C has an early-out return that ocaml can't do (without an exception), and that makes things a little simpler as well.

Of course, I'm not saying you can extrapolate anything about real programs from this simple example, but it's interesting to me nonetheless.

Chris

Here's the updated strstr_imp2.  If you really wanted to try to max out the ocaml, we could read the pattern into an int and try to match it a word at a time, etc.

(* "optimized" imperative strstr *)
let strstr_imp2 pat str = 
  let strlen = String.length str in
  let patlen = String.length pat and
      i = ref 0 and
      f = ref strlen in
  let limit = strlen - patlen + 1 in
  while !i < limit do
    let pati = ref 0 and
        j = ref !i in
    while !pati < patlen && 
      String.unsafe_get str !j == String.unsafe_get pat !pati do
      pati := succ !pati;
      j := succ !j;
    done;
    if !pati = patlen then
      begin f := !i; i := strlen end
    else
      i := succ !i
  done;
  !f



  reply	other threads:[~2000-12-12  9:32 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2000-12-08  6:04 eijiro_sumii
2000-12-08 12:57 ` Gerd Stolpmann
2000-12-10 13:16   ` eijiro_sumii
2000-12-10 15:39     ` Gerd Stolpmann
2000-12-11  3:57       ` eijiro_sumii
2000-12-12 13:58       ` Julian Assange
2000-12-11 21:07     ` Chris Hecker
2000-12-11 22:22       ` Gerd Stolpmann
2000-12-12  5:06         ` Chris Hecker [this message]
2000-12-12 12:28           ` Jean-Christophe Filliatre
2000-12-13 10:02             ` eijiro_sumii
2000-12-13 10:17               ` Eijiro Sumii
2000-12-13 10:53               ` Julian Assange
2000-12-13 13:28                 ` Eijiro Sumii
2000-12-12  3:28       ` eijiro_sumii
2000-12-13  1:12         ` John Prevost
2000-12-13  2:35           ` Chris Hecker
2000-12-12 10:07       ` Sven LUTHER
2000-12-14  3:36       ` eijiro_sumii
2000-12-14  6:48         ` Chris Hecker
2000-12-14  8:02           ` eijiro_sumii
2000-12-14 21:53             ` Stephan Tolksdorf
2000-12-14 21:12 Ruchira Datta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4.3.2.7.2.20001211202018.00b3f510@shell16.ba.best.com \
    --to=checker@d6.com \
    --cc=caml-list@inria.fr \
    --cc=eijiro_sumii@anet.ne.jp \
    --cc=gerd@gerd-stolpmann.de \
    --cc=sumii@venus.is.s.u-tokyo.ac.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).