From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=AWL,MAILTO_TO_SPAM_ADDR autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 144BABBC1 for ; Wed, 20 Feb 2008 16:54:16 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAGrcu0fAbSoIh2dsb2JhbACQUQEBAQgKKYEUnkg X-IronPort-AV: E=Sophos;i="4.25,381,1199660400"; d="scan'208";a="9394056" Received: from einhorn.in-berlin.de ([192.109.42.8]) by mail3-smtp-sop.national.inria.fr with ESMTP; 20 Feb 2008 16:54:12 +0100 X-Envelope-From: oliver@first.in-berlin.de X-Envelope-To: Received: from einhorn.in-berlin.de (localhost [127.0.0.1]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id m1KFsB4q028684 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Wed, 20 Feb 2008 16:54:11 +0100 Received: (from www-data@localhost) by einhorn.in-berlin.de (8.13.6/8.13.6/Submit) id m1KFsB94028682 for caml-list@yquem.inria.fr; Wed, 20 Feb 2008 16:54:11 +0100 X-Authentication-Warning: einhorn.in-berlin.de: www-data set sender to oliver@first.in-berlin.de using -f Received: from dslb-088-073-125-137.pools.arcor-ip.net (dslb-088-073-125-137.pools.arcor-ip.net [88.73.125.137]) by webmail.in-berlin.de (IMP) with HTTP for ; Wed, 20 Feb 2008 16:54:11 +0100 Message-ID: <1203522851.47bc4d2310d0c@webmail.in-berlin.de> Date: Wed, 20 Feb 2008 16:54:11 +0100 From: Oliver Bandel To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] large hash tables References: <33d2b3f70802191501v39346a56x4b1852b84d4067b4@mail.gmail.com> In-Reply-To: <33d2b3f70802191501v39346a56x4b1852b84d4067b4@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.2.6 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. on 192.109.42.8 X-Spam: no; 0.00; bandel:01 in-berlin:01 hash:01 hash:01 ocaml:01 oliver:01 oliver:01 integer:01 integer:01 caml-list:01 caml:02 btw:03 float:03 element:03 naive:03 Zitat von John Caml : > I need to load roughly 100 million items into a memory based hash > table, > where each key is an integer and each value is an integer and a > float. Under > C++ stdlib's map library, this requires 800 MB of memory. Under my > naive [...] Did you tried Map-Module in OCaml? Or would it need more mem than Hash-module per element? At least there would be no resizing operations... For my applications I have habitted myself to use Map instead of Hash, and it worked quite good and performant even on more or less large files. BTW: how big are the files in use to need 800 MB memeory with the C++ implementation? Ciao, Oliver