caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Hal Daume III <hdaume@ISI.EDU>
To: Caml Mailing List <caml-list@inria.fr>
Subject: [Caml-list] speed, loops vs. hofs
Date: Mon, 28 Apr 2003 10:13:17 -0700 (PDT)	[thread overview]
Message-ID: <Pine.GSO.4.21.0304280958590.20131-100000@moussor.isi.edu> (raw)

One of the primary reasons I use OCaml is because it offers speed,
together with nice functional stuff, like HOFs.  However, it seems that
using these together might be impossible.

I have a module which allows iteration over a particular datatype.  I have
a method of doing this iteration very quickly using for loops.  In
particular, I have a function of type:

  loopPre : hmm -> node -> (node -> int -> bool -> unit) -> unit

which loops over the predecessors of the given node in the HMM.  The
implementation basically looks like:

  let loopPre hmm node action = 
    let ... some definitions ...
      begin
        action {...make the node...} 0 (...make the bool...);
        for i = 0 to num_nodes do
          action {...make the node...} i (...make the bool...);
          for j = 0 to {something} do
            action {...make the node...} j (...make the bool...);
          done;
        done;

basically, there are a few calls to 'action' in a pair of nested
loops.  Then, at the usage point, I do something like:

  ...1...
  loopPre hmm node_j
    (fun nd len null ->
      ...2...
    )
  ...3...

I had a sneaking feeling that if I actually stuck the for loops into the
original code I would get speedups.  However, "...2..." is pretty long, so
there were be (a) a bunch of code duplication and (b) it would be hard to
maintain.

In order to test this, I wrote the following program:

% cat loops.ml
let loop low high f =
  begin
    f 0 true;

    for i=low to high do
      f i false;
    done;

    f (-1) true;
  end

% cat useLoops.ml
open Loops

let top = 500000000

let cnt = ref 0
let pre1 = Sys.time ()
let _ = loop 1 top (fun i b -> if not b then cnt := !cnt + 1)
let pst1 = Sys.time ()

let cnt = ref 0
let pre2 = Sys.time ()
let _ = 
  let f i b = if not b then cnt := !cnt + 1 in
    loop 1 top f
let pst2 = Sys.time ()


let cnt = ref 0
let pre3 = Sys.time ()
let _ = 
  begin
    let i = 0 in let b = true in
      if not b then cnt := !cnt + 1;
      
      for i=1 to top do
        let b = false in
          if not b then cnt := !cnt + 1;
      done;

      let i = -1 in let b = true in
        if not b then cnt := !cnt + 1;
  end
let pst3 = Sys.time ()

let cnt = ref 0
let pre4 = Sys.time ()
let _ = 
  let f i b = if not b then cnt := !cnt + 1 in
    begin
      let i = 0 in let b = true in
        f i b;

        for i=1 to top do
          let b = false in
            f i b;
        done;

        let i = -1 in let b = true in
          f i b;
    end
let pst4 = Sys.time ()

let _ = Printf.printf "Time 1 = %f sec\n" (pst1-.pre1)
let _ = Printf.printf "Time 2 = %f sec\n" (pst2-.pre2)
let _ = Printf.printf "Time 3 = %f sec\n" (pst3-.pre3)
let _ = Printf.printf "Time 4 = %f sec\n" (pst4-.pre4)

%


Here, there are four loops done.  The first two use the HO loop function
from the Loops module (which is very much like my iterPre function); the
last two use loops.  One of them inserts the funciton definition at every
usage; the second does a let-bound definition.

For loops of 500 million elements, the timings come out:

Time 1 = 10.230000 sec
Time 2 = 10.230000 sec
Time 3 = 2.610000 sec
Time 4 = 7.020000 sec

on my x86 linux box.

I find these numbers very disturbing.  First, just looking at the
discrepency between 3 and 4.  I'd imagine this happens because in (3), the
"let b = false in if not b..." gets folded down to just "cnt := !cnt + 1".  
To test, this I also ran:

let _ = 
  let f i b = cnt := !cnt + 1 in
  let g i b = () in
    begin
      let i = 0 in let b = true in
	g i b;
	
	for i=1 to top do
	  let b = false in
	    f i b;
	done;
	
	let i = -1 in let b = true in
	  g i b;
    end

where the unfolding is done manually.  This implementation came in at:

Time 5 = 3.570000 sec

This seems to imply that I will triple the speed of my program by making
it (a) non-functional and (b) unreadable.  Is this true?  Is there another
way to get around this problem?

 - Hal


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


             reply	other threads:[~2003-04-28 17:13 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-04-28 17:13 Hal Daume III [this message]
2003-04-29 14:39 ` [Caml-list] " Andrzej M. Ostruszka
2003-04-30  0:45 ` [Caml-list] " Manos Renieris

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.GSO.4.21.0304280958590.20131-100000@moussor.isi.edu \
    --to=hdaume@isi.edu \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).