caml-list - the Caml user's mailing list
From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: akalin@akalin.cx
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Array 4 MB size limit
Date: Tue, 16 May 2006 10:01:35 +0200
Message-ID: <446986DF.1070308@inria.fr>
In-Reply-To: <20060515141230.ajyupn2z28k0484s@horde.akalin.cx>

> I was greatly surprised when I found out there was such a low limit on
> arrays.  Is there a reason for this?  Will this limit ever be increased?

As Brian Hurt explained, this limit comes from the fact that heap
object sizes are stored in N-10 bits, where N is the bit width of the
processor (32 or 64).
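
To make the arithmetic concrete, here is a minimal sketch of how the limit follows from the header layout. The names max_block_words and max_float_elems are illustrative and not part of the standard library; the sketch assumes the usual layout in which 10 header bits are reserved and the rest hold the block size in words, and it is meant to be run on a 64-bit host.

```ocaml
(* Assumption: 10 bits of the one-word header are reserved,
   leaving word_bits - 10 bits for the block size in words. *)
let max_block_words word_bits = (1 lsl (word_bits - 10)) - 1

(* A float is 64 bits wide, so it occupies 64 / word_bits words.
   On a 32-bit machine that halves the number of floats that fit. *)
let max_float_elems word_bits = max_block_words word_bits / (64 / word_bits)

let () =
  Printf.printf "32-bit: %d elements, %d floats\n"
    (max_block_words 32) (max_float_elems 32);
  (* Compare with what the running system actually reports. *)
  Printf.printf "this machine (%d-bit): Sys.max_array_size = %d\n"
    Sys.word_size Sys.max_array_size
```

For N = 32 this gives 2^22 - 1 elements, which is the "4M" limit under discussion. The value reported by Sys.max_array_size may differ if the runtime was configured to reserve additional header bits.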

Historical digression: this representation decision was initially
taken when designing Caml Light in 1989-1990.  At that time, even
professional workstations had 16 MB of RAM at best, so limiting arrays to
4M elements was reasonable.  The decision was then reconsidered in
1995 during the redesign that led to OCaml.  At that time,
64-bit architectures were all the rage: OCaml was actually implemented
on a 64-bit Alpha, and only then backported to 32-bit machines.
So, the original header format was kept, since it makes complete sense
on a 64-bit architecture.  Little did I know that the 32-bitters would
survive so long.  Now, it's 2006, and 64-bit processors are becoming
universally available, in desktop machines at least.  (I've been
running an AMD64 PC at home since January 2005 with absolutely zero
problems.)  So, no, the data representations of OCaml are not going to
change to lift the array size limit on 32-bit machines.

> Is the limit a limit on the number of elements or the total size?  The
> language in Sys.max_array_size implies the former, but the fact the
> limit is halved for floats implies the latter.  If I had a record type
> with 5 floats, will the limit then by Sys.max_array_size / 10?

No.  In general, Caml arrays are not unboxed, meaning that your array
of 5-float records is actually an array of pointers to individual
blocks holding 5 floats.  The only exception is for arrays of floats,
which are unboxed.
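
A small sketch can make the difference visible. It uses the unsafe Obj module purely for inspection, which is not something to do in production code; the record type p and the helper block_words are hypothetical names, and the float-array check assumes the default runtime configuration with flat float arrays.

```ocaml
(* A hypothetical record of 5 floats, as in the question. *)
type p = { a : float; b : float; c : float; d : float; e : float }

let mk x = { a = x; b = x; c = x; d = x; e = x }

(* Size, in words, of the heap block backing a value.
   Obj is used here for inspection only. *)
let block_words v = Obj.size (Obj.repr v)

let recs = Array.init 3 (fun i -> mk (float_of_int i))
let flts = [| 1.0; 2.0; 3.0 |]

let () =
  (* The record array holds one pointer per element, so its block is
     as many words long as it has elements, whatever the record size. *)
  Printf.printf "record array: %d words for %d elements\n"
    (block_words recs) (Array.length recs);
  (* The float array stores the floats themselves in one flat block. *)
  Printf.printf "float array is flat: %b\n"
    (Obj.tag (Obj.repr flts) = Obj.double_array_tag)
```

The array block for recs is the same size whether each record has 5 fields or 50, which is why the element limit is not divided by the record size.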

> Is there
> some sort of existing ArrayList module that works around this problem?
> Ideally, I'd like to have something like C++'s std::vector<> type, which
> can be dynamically resized.  Do I have to write my own? :(

A better idea would be to determine exactly what data structure you need:
which abstract operations, what performance requirements, etc.  C++
and Lisp programmers tend to encode everything as arrays or lists,
respectively, but quite often these are not the best data structure
for the application of interest.

> Also, the fact that using lists crashes for the same data set is
> surprising.  Is there a similar hard limit for lists, or would this be a
> bug?  Should I post a test case?

Depends on the platform you use.  In principle, Caml should report
stack overflows cleanly, by throwing a Stack_overflow exception.
However, this is hard to do in native code as it depends a lot on the
processor and OS used.  So, some combinations (e.g. x86/Linux) will
report stack overflows via an exception, and others will let the
kernel generate a segfault.
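
For illustration, here is a minimal sketch of the kind of list code that consumes stack in proportion to the list length, next to a tail-recursive version that does not. The function names are made up for this example, and whether a deep recursion ends in Stack_overflow or a segfault depends on the platform, as described above.

```ocaml
(* Non-tail-recursive: the addition happens after the recursive call
   returns, so each list element costs one stack frame. *)
let rec length_naive = function
  | [] -> 0
  | _ :: tl -> 1 + length_naive tl

(* Tail-recursive: the accumulator carries the result, so the stack
   does not grow with the length of the list. *)
let length_tail l =
  let rec go acc = function
    | [] -> acc
    | _ :: tl -> go (acc + 1) tl
  in
  go 0 l

(* Returns None when the runtime reports the overflow as an exception.
   On platforms without that support the process may segfault instead. *)
let try_deep f x =
  try Some (f x) with Stack_overflow -> None

let () =
  (* A modest size, chosen so that this sketch does not overflow. *)
  let l = List.init 10_000 (fun i -> i) in
  assert (try_deep length_naive l = Some 10_000);
  assert (length_tail l = 10_000)
```

With a much larger list, length_naive is the one that would hit the stack limit, while length_tail keeps working.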

If you're getting the segfault under x86/Linux for instance, please
post a test case on the bug tracking system.  It's high time that
Damien shaves :-)

- Xavier Leroy

