caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Ian Zimmerman <itz@buug.org>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Hashtbl or Set?
Date: Tue, 20 Sep 2011 12:13:07 +0200	[thread overview]
Message-ID: <1316513587.16477.11.camel@thinkpad> (raw)
In-Reply-To: <87sjnsl2f0.fsf@foolinux.dyndns.org>

Am Montag, den 19.09.2011, 13:09 -0700 schrieb Ian Zimmerman:
> I need a somewhat large (thousands) set of strings, created once during
> startup and never modified after.  What is a better choice, a (string,
> unit) Hashtbl.t or the Set module?  If the Set module still uses trees
> as it did when I was young :-), access will be logarithmic versus
> constant for Hashtbl.  But on the other hand a hash function must
> examine all of every string while the comparison of 2 strings stops at
> the first nonmatching character.
> 
> I am thinking about the time to build the set as well as the probing
> time.

Nothing has changed so far. Besides naive Hashtbl and Set you have at
least two more options: Hashtbl with self-defined hash function (i.e.
one that stops after the n-th char), and Set with a hash-based
comparison function (i.e. compare first the hashes of the strings, and
only if the hashes are equal, compare the strings). 

In my experience, for a large constant set a Hashtbl usually works best.
Set is usually only better when you can profit from functional updates.

Gerd

> 
> -- 
> Ian Zimmerman
> gpg public key: 1024D/C6FF61AD
> fingerprint: 66DC D68F 5C1B 4D71 2EE5  BD03 8A00 786C C6FF 61AD
> Rule 420: All persons more than eight miles high to leave the court.
> 

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
*** Searching for new projects! Need consulting for system
*** programming in Ocaml? Gerd Stolpmann can help you.
------------------------------------------------------------


      parent reply	other threads:[~2011-09-20 10:13 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-19 20:09 Ian Zimmerman
2011-09-19 20:17 ` Basile Starynkevitch
2011-09-20 10:13 ` Gerd Stolpmann [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1316513587.16477.11.camel@thinkpad \
    --to=info@gerd-stolpmann.de \
    --cc=caml-list@inria.fr \
    --cc=itz@buug.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).