From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id 3B255BBC1 for ; Wed, 20 Feb 2008 00:34:26 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ao8CAPv1ukfVujhf/2dsb2JhbACvGA X-IronPort-AV: E=Sophos;i="4.25,378,1199660400"; d="asc'?scan'208";a="8295664" Received: from witko.kerneis.info ([213.186.56.95]) by mail1-smtp-roc.national.inria.fr with ESMTP; 20 Feb 2008 00:34:25 +0100 Received: from [2002:52e0:d712:5678:20c:f1ff:fe40:edcf] (helo=tatanka.kerneis.info) by witko.kerneis.info with esmtpsa (TLS-1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.63) (envelope-from ) id 1JRbyj-0007sr-9X for caml-list@yquem.inria.fr; Wed, 20 Feb 2008 00:34:25 +0100 Received: from localhost.kerneis.info ([127.0.0.1] helo=tatanka.kerneis.info) by tatanka.kerneis.info with esmtp (Exim 4.69) (envelope-from ) id 1JRbyh-00022E-CB for caml-list@yquem.inria.fr; Wed, 20 Feb 2008 00:34:23 +0100 Date: Wed, 20 Feb 2008 00:34:14 +0100 From: Gabriel Kerneis To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] large hash tables Message-ID: <20080220003414.7dd5e353@tatanka.kerneis.info> In-Reply-To: <33d2b3f70802191501v39346a56x4b1852b84d4067b4@mail.gmail.com> References: <33d2b3f70802191501v39346a56x4b1852b84d4067b4@mail.gmail.com> Organization: ENST X-Mailer: Claws Mail 3.3.0 (GTK+ 2.12.8; i486-pc-linux-gnu) Mime-Version: 1.0 Content-Type: multipart/signed; boundary="Sig_/mZxT0C5orEnK.fzYdHoK+UN"; protocol="application/pgp-signature"; micalg=PGP-SHA1 X-SA-Exim-Connect-IP: 2002:52e0:d712:5678:20c:f1ff:fe40:edcf X-SA-Exim-Mail-From: kerneis@enst.fr X-SA-Exim-Scanned: No (on witko.kerneis.info); SAEximRunCond expanded to false X-Spam: no; 0.00; hash:01 hashtbl:01 pcre:01 hashtbl:01 argv:01 777777:98 stack:01 rec:01 exception:01 caml-list:01 match:02 string:02 sys:03 float:03 let:03 X-Attachments: type="application/pgp-signature" name="signature.asc" name="signature.asc" --Sig_/mZxT0C5orEnK.fzYdHoK+UN Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Hello, it won't solve any of you problems but I have a few remarks on your code: > exception SplitError >=20 > let read_whole_chan chan =3D > let movieMajor =3D Hashtbl.create 777777 in Here, movieMajor in declared inside read_whole_chan and never returned, so you won't be able to use it (but it might as well be a test program I guess). > let rec loadLines count =3D > let line =3D input_line chan in > let murList =3D Pcre.split line in > match murList with > | m::u::r::[] -> > let rFloat =3D float_of_string r in > Hashtbl.add movieMajor m (u, rFloat); > loadLines (count + 1) You use tail-recursion properly so your Stack overflow doesn't come from there. No idea why it occurs, though... > | _ -> raise SplitError > in >=20 > try > loadLines 0 > with > End_of_file -> () > ;; >=20 > let read_whole_file filename =3D > let chan =3D open_in filename in > read_whole_chan chan > ;; You should close the chan when you're done. > let filename =3D Sys.argv.(1);; >=20 > let str =3D read_whole_file filename;; Regards, --=20 Gabriel Kerneis --Sig_/mZxT0C5orEnK.fzYdHoK+UN Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHu2d96a2JmXQu5bYRAngUAJ9+eSv0X4ZXtAhWZ8LmpcqpsZjyFgCfcI7J 8OJjtIkL+XHFby/7EuaCZ5E= =ByGs -----END PGP SIGNATURE----- --Sig_/mZxT0C5orEnK.fzYdHoK+UN--