caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Jon Harrop <jon@ffconsultancy.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] What is a match statement translated into?
Date: Sat, 18 Aug 2007 11:22:06 +0100	[thread overview]
Message-ID: <200708181122.06670.jon@ffconsultancy.com> (raw)
In-Reply-To: <1A6B8F53-FDA5-4F37-BA08-1FBD9CF09E7B@gmail.com>

On Saturday 18 August 2007 07:55:24 Joel Reymont wrote:
> Folks,
>
> Is pattern-matching code in OCaml expanded into threaded code (pre-
> computed branch table) or something analogous to the C switch
> statement (lots of branching)?
>
> How do I find out?

Use:

  ocamlopt -S test.ml

and look at the generated assembler.

> I suspect this should be quite optimized but I haven't tried dumping
> disassembling native-compiled OCaml yet and I wonder if there's a
> simpler approach.

Depends what type you are matching over. Pattern matches over variant types 
(including nesting) are compiled extremely efficiently (dispatch table). 
Polymorphic variants, ints and floats are efficient (balanced trees of 
comparisons). Strings are very inefficient (sequence of string compares).

For example, finding a Java keyword:

let kwd = function
  |"abstract" |"continue" |"for" |"new" |"switch"
  |"assert" |"default" |"goto" |"package" |"synchronized"
  |"boolean" |"do" |"if" |"private" |"this"
  |"break" |"double" |"implements" |"protected" |"throw"
  |"byte" |"else" |"import" |"public" |"throws"
  |"case" |"enum" |"instanceof" |"return" |"transient" 
  |"catch" |"extends" |"int" |"short" |"try"
  |"char" |"final" |"interface" |"static" |"void"
  |"class" |"finally" |"long" |"strictfp" |"volatile"
  |"const" |"float" |"native" |"super" |"while" -> true
  | _ -> false

let () = match Sys.argv with
  | [|_|] ->
      for i=1 to 1000000 do
	ignore(kwd "while")
      done
  | _ ->
      let kwds = Hashtbl.create 1 in
      List.iter (fun kwd -> Hashtbl.add kwds kwd ())
	["abstract" ;"continue" ;"for" ;"new" ;"switch"
	;"assert" ;"default" ;"goto" ;"package" ;"synchronized"
	;"boolean" ;"do" ;"if" ;"private" ;"this"
	;"break" ;"double" ;"implements" ;"protected" ;"throw"
	;"byte" ;"else" ;"import" ;"public" ;"throws"
	;"case" ;"enum" ;"instanceof" ;"return" ;"transient" 
	;"catch" ;"extends" ;"int" ;"short" ;"try"
	;"char" ;"final" ;"interface" ;"static" ;"void"
	;"class" ;"finally" ;"long" ;"strictfp" ;"volatile"
	;"const" ;"float" ;"native" ;"super" ;"while"];
      for i=1 to 1000000 do
	try Hashtbl.find kwds "while" with Not_found -> ()
      done

The pattern match is 4.1x slower that the hash table lookup on my machine.

Contrast this with the worst case for balanced trees, which is probably floats 
because the literals are loaded from memory into registers and the difference 
between fastest and slowest is only 35% and the average improvement over 
string matching is 33x faster.

So pattern matching over strings in performance critical code is a very bad 
idea with the current OCaml implementation. There are ways to work around it, 
of course, but they are non-trivial.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e


      parent reply	other threads:[~2007-08-18 10:33 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-08-18  6:55 Joel Reymont
2007-08-18  7:42 ` [Caml-list] " Jacques GARRIGUE
2007-08-18 10:22 ` Jon Harrop [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200708181122.06670.jon@ffconsultancy.com \
    --to=jon@ffconsultancy.com \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).