caml-list - the Caml user's mailing list
From: Pierre Weis <pierre.weis@inria.fr>
To: guttman@mitre.org
Cc: pierre.weis@inria.fr, caml-list@inria.fr, guttman@mitre.org
Subject: Re: [Caml-list] Cross-platform DBM equivalent?
Date: Fri, 27 Dec 2002 14:07:29 +0100 (MET)
Message-ID: <200212271307.OAA25656@pauillac.inria.fr>
In-Reply-To: <nhaisxgg1ai.fsf@banjara.mitre.org> from "Joshua D. Guttman" at "Dec 26, 102 12:03:49 pm"

> Pierre Weis <pierre.weis@inria.fr> writes:
> 
> >   As far as I know, the best (and simplest) way to do this for a
> >   reasonable number of URL bindings (say thousands, but not millions)
> >   is to create a Hashtbl.t or Map.t and dump it to a file using
> >   output_value (then read it back with input_value).
> 
> Is there a recommended data structure in case one needs tables for
> reasonably fast access to millions or tens of millions of values?
> Probably hash tables are no longer providing nearly-constant access
> time at those sizes.  Is there something better in the standard
> library?   
> 
> Thanks --
> 
>         Joshua 

You need to try :) I think that hash tables and maps can handle
tens of millions of values if you have enough memory available. If
your hash tables are big enough and your keys are reasonably
distinct strings, a hash table will still give you nearly constant
access time.
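
As a starting point for such an experiment, here is a minimal sketch
that builds a string-keyed table, dumps it with output_value and reads
it back with input_value. The function names (build, save, load), the
example.org keys and the integer payload are assumptions made for
illustration only, and Fun.protect needs OCaml 4.08 or later.

```ocaml
(* Build a table of n URL-like string keys mapped to integers.
   Creating the table with its final size avoids resizing while it fills. *)
let build n : (string, int) Hashtbl.t =
  let tbl = Hashtbl.create n in
  for i = 0 to n - 1 do
    Hashtbl.replace tbl (Printf.sprintf "http://example.org/page/%d" i) i
  done;
  tbl

(* Dump the whole table to a file in one call. *)
let save path (tbl : (string, int) Hashtbl.t) =
  let oc = open_out_bin path in
  Fun.protect ~finally:(fun () -> close_out oc)
    (fun () -> output_value oc tbl)

(* Read the table back. input_value is not type-safe, so the annotation
   must match the type of the value that was written. *)
let load path : (string, int) Hashtbl.t =
  let ic = open_in_bin path in
  Fun.protect ~finally:(fun () -> close_in ic)
    (fun () -> input_value ic)
```

Timing Hashtbl.find on a table built with a few million entries is a
quick way to check whether access time stays flat on your machine.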

If you do not have enough memory, you should consider the mmap
facilities of the Bigarray module (memory mapping of files: a page is
read from disk on demand, when the application actually accesses it).
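
A minimal sketch of the memory-mapping approach follows. It assumes a
recent OCaml, where the mapping function is Unix.map_file (so the unix
library must be linked) rather than a function of the Bigarray module
itself; the name map_ints and the flat array of 64-bit integers are
illustrative choices, not a recommended on-disk layout.

```ocaml
(* Map a file as a one-dimensional array of n 64-bit integers.
   With shared = true, writes go back to the file, and a file that is
   too small is extended to the requested size. *)
let map_ints path n =
  let fd = Unix.openfile path [Unix.O_RDWR; Unix.O_CREAT] 0o644 in
  (* On POSIX systems the mapping stays valid after the descriptor
     is closed. *)
  Fun.protect ~finally:(fun () -> Unix.close fd) (fun () ->
    let ga =
      Unix.map_file fd Bigarray.int64 Bigarray.c_layout true [| n |] in
    Bigarray.array1_of_genarray ga)
```

Elements are then read and written with the usual bigarray syntax, for
example a.{10} <- 42L, and only the pages actually touched need to be
resident in memory.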

All the best for the next year!

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


