From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id C9CB9BBAF for ; Fri, 19 May 2006 07:57:11 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id k4J5vBPP025265 for ; Fri, 19 May 2006 07:57:11 +0200 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id HAA15126 for ; Fri, 19 May 2006 07:57:10 +0200 (MET DST) Received: from eigen.apisnetworks.com (eigen.apisnetworks.com [67.15.52.22]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id k4J5v8Mf025255 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 19 May 2006 07:57:09 +0200 Received: from [10.1.0.2] (dsl231-050-014.sea1.dsl.speakeasy.net [216.231.50.14]) (authenticated bits=0) by eigen.apisnetworks.com (8.12.11.20060308/8.12.10) with ESMTP id k4J5v6LH029382 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 19 May 2006 01:57:07 -0400 Message-ID: <446D5E4A.8060005@akalin.cx> Date: Thu, 18 May 2006 22:57:30 -0700 From: Frederick Akalin User-Agent: Thunderbird 1.5 (Windows/20051201) MIME-Version: 1.0 To: Xavier Leroy Cc: caml-list@inria.fr Subject: Re: [Caml-list] Array 4 MB size limit References: <20060515141230.ajyupn2z28k0484s@horde.akalin.cx> <446986DF.1070308@inria.fr> In-Reply-To: <446986DF.1070308@inria.fr> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Miltered: at nez-perce with ID 446D5E37.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 446D5E34.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; arrays:01 ocaml:01 bigarrays:01 vectors:01 arrays:01 assertion:01 surprising:01 vectors:01 triangles:01 scalable:01 surprising:01 bug:01 stack:01 stack:01 segfault:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 Xavier Leroy wrote: >> I was greatly surprised when I found out there was such a low limit on >> arrays. Is there a reason for this? Will this limit ever be increased? >> > > As Brian Hurt explained, this limit comes from the fact that heap > object sizes are stored in N-10 bits, where N is the bit width of the > processor (32 or 64). > > Historical digression: this representation decision was initially > taken when designing Caml Light in 1989-1990. At that time, even > professional workstations ... Little did I know that the 32-bitters would > survive so long. Now, it's 2006, and 64-bit processors are becoming > universally available, in desktop machines at least. (I've been > running an AMD64 PC at home since january 2005 with absolutely zero > problems.) So, no the data representations of OCaml are not going to > change to lift the array size limit on 32-bit machines. > > I see. This is a perfectly reasonable explanation. I ended up just using Bigarrays of floats for now and converting from those to 3-d vectors on demand (what I need), but it would be nice to have a type that wraps around Array that can get around the 4M limit (using an array of arrays like someone suggested earlier). It's sad, but I think 32-bit is going to be around for a while, and surely I can't be the only person to run up against this. :) Not that I mind writing such a library and releasing it. I wonder if the Extlib guys would be interested... > A better idea would be to determine exactly what data structure you need: > which abstract operations, what performance requirements, etc. C++ > and Lisp programmers tend to encode everything as arrays or lists, > respectively, but quite often these are not the best data structure > for the application of interest. > I find your assertion surprising. C++ and Common LISP are by no means lacking in standard data structures (and using bare arrays is discouraged in C++, as far as I know) and in my experience I haven't much seen C++ code that used arrays/vectors when not appropriate. In any case, in my application (a raytracer) I am reading in lists of numbers from a file representing the points of a mesh and the triangles that make up the mesh (represented by a list of indices into the mesh list). A dynamically resizable, reasonable scalable array seems like the best data structure to use. >> Also, the fact that using lists crashes for the same data set is >> surprising. Is there a similar hard limit for lists, or would this be a >> bug? Should I post a test case? >> > > Depends on the platform you use. In principle, Caml should report > stack overflows cleanly, by throwing a Stack_overflow exception. > However, this is hard to do in native code as it depends a lot on the > processor and OS used. So, some combinations (e.g. x86/Linux) will > report stack overflows via an exception, and others will let the > kernel generate a segfault. > > If you're getting the segfault under x86/Linux for instance, please > post a test case on the bug tracking system. It's high time that > Damien shaves :-) > > Due to some confusion on my part (I didn't realize stderr was buffered in OCaml and needed to be flushed), I erroneously thought my program was crashing on building the list. It was, in fact, a List.map lurking further along in my code. I'm running PPC/OS X, which I'm guessing is one of those platforms that lets the kernel generate a segfault? I tried it on Linux and it threw an exception as expected. Damien's facial hair is safe. :) > - Xavier Leroy > > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > Frederick Akalin