caml-list - the Caml user's mailing list
* OCaml runtime using too much memory in 64-bit Linux
@ 2007-11-07 17:28 Adam Chlipala
  2007-11-07 18:20 ` [Caml-list] " Gerd Stolpmann
  2007-11-08 20:51 ` [Caml-list] " Romain Beauxis
  0 siblings, 2 replies; 17+ messages in thread
From: Adam Chlipala @ 2007-11-07 17:28 UTC (permalink / raw)
  To: caml-list

I've encountered a problem where certain OCaml programs use orders of 
magnitude more RAM when compiled/run in 64-bit Linux instead of 32-bit 
Linux.  Some investigation led to the conclusion that the difference has 
to do with the size of OCaml page tables.  (Here I mean the page tables 
maintained by the OCaml runtime system, not any OS stuff.)

A program that should be using just a few megabytes of RAM ends up using 
200+ MB to store a page table.  It seems that a C macro is defined by 
default on 64-bit Linux to use mmap() instead of malloc().  Ironically, 
a comment says that this was done to avoid being given blocks of memory 
that are very far apart from each other, forcing the creation of overly 
large page tables.  It's ironic because that is exactly the problem that 
is showing up now with mmap().  It ends up called twice for the program 
I'm looking at, and the two addresses it returns are far enough apart to 
lead to creation of a 200 MB page table.

Has anyone else experienced this problem?  Would the runtime system need 
to be changed to avoid it?



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-07 17:28 OCaml runtime using too much memory in 64-bit Linux Adam Chlipala
@ 2007-11-07 18:20 ` Gerd Stolpmann
  2007-11-07 19:12   ` Adam Chlipala
  2007-11-08 20:51 ` [Caml-list] " Romain Beauxis
  1 sibling, 1 reply; 17+ messages in thread
From: Gerd Stolpmann @ 2007-11-07 18:20 UTC (permalink / raw)
  To: Adam Chlipala; +Cc: caml-list

On Wednesday, 07.11.2007, at 12:28 -0500, Adam Chlipala wrote:
> I've encountered a problem where certain OCaml programs use orders of 
> magnitude more RAM when compiled/run in 64-bit Linux instead of 32-bit 
> Linux.  Some investigation led to the conclusion that the difference has 
> to do with the size of OCaml page tables.  (Here I mean the page tables 
> maintained by the OCaml runtime system, not any OS stuff.)
> 
> A program that should be using just a few megabytes of RAM ends up using 
> 200+ MB to store a page table.  It seems that a C macro is defined by 
> default on 64-bit Linux to use mmap() instead of malloc().  Ironically, 
> a comment says that this was done to avoid being given blocks of memory 
> that are very far apart from each other, forcing the creation of overly 
> large page tables.  It's ironic because that is exactly the problem that 
> is showing up now with mmap().  It ends up called twice for the program 
> I'm looking at, and the two addresses it returns are far enough apart to 
> lead to creation of a 200 MB page table.
> 
> Has anyone else experienced this problem?  Would the runtime system need 
> to be changed to avoid it?

We are using O'Caml on 64 bit Linux, and aren't aware of such problems.

Did you observe a debug GC message that proves it? 200 MB means that an
address space of 200M * 4K = 800G is covered.

Also think of Linux modifications that do address randomization, i.e.
that prevent contiguous addresses from being allocated.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-07 18:20 ` [Caml-list] " Gerd Stolpmann
@ 2007-11-07 19:12   ` Adam Chlipala
  2007-11-08 12:56     ` Samuel Mimram
  2007-11-14  4:20     ` Romain Beauxis
  0 siblings, 2 replies; 17+ messages in thread
From: Adam Chlipala @ 2007-11-07 19:12 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: caml-list

Gerd Stolpmann wrote:
> On Wednesday, 07.11.2007, at 12:28 -0500, Adam Chlipala wrote:
>   
>> I've encountered a problem where certain OCaml programs use orders of 
>> magnitude more RAM when compiled/run in 64-bit Linux instead of 32-bit 
>> Linux.  Some investigation led to the conclusion that the difference has 
>> to do with the size of OCaml page tables.  (Here I mean the page tables 
>> maintained by the OCaml runtime system, not any OS stuff.)
>>
>> ...
>>     
>
> We are using O'Caml on 64 bit Linux, and aren't aware of such problems.
>
> Did you observe a debug GC message that proves it? 200 MB means that an
> address space of 200M * 4K = 800G is covered.
>   

Here's one run, cut off after allocation seems to settle down:

OCAMLRUNPARAM="v=12" ./program_name.exe
Growing heap to 960k bytes
Growing page table to 204151332 entries
Growing heap to 1440k bytes
Growing heap to 1920k bytes

> Also think of Linux modifications that do address randomization, i.e.
> that prevent contiguous addresses from being allocated.

That would definitely cause trouble.  Thanks for the suggestion; I'll 
look into it.
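
To put the logged figure in perspective, here is a minimal sketch of the arithmetic. It assumes 4 KiB pages and one byte per page-table entry, which is how the table is described later in this thread; the function name is made up for illustration. Under those assumptions the 204151332 entries come to about 195 MiB of table describing roughly 780 GiB of address span.

```python
# Assumptions: 4 KiB pages and one byte per page-table entry,
# as described later in this thread.
PAGE_SIZE = 4096
BYTES_PER_ENTRY = 1


def page_table_footprint(entries, page_size=PAGE_SIZE,
                         bytes_per_entry=BYTES_PER_ENTRY):
    """Return (table size in bytes, address span covered in bytes)."""
    return entries * bytes_per_entry, entries * page_size


# Entry count taken from the "Growing page table to ..." line above.
table_bytes, covered_bytes = page_table_footprint(204151332)
print(f"page table: {table_bytes / 2**20:.1f} MiB")
print(f"address span covered: {covered_bytes / 2**30:.1f} GiB")
```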



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-07 19:12   ` Adam Chlipala
@ 2007-11-08 12:56     ` Samuel Mimram
  2007-11-14  4:20     ` Romain Beauxis
  1 sibling, 0 replies; 17+ messages in thread
From: Samuel Mimram @ 2007-11-08 12:56 UTC (permalink / raw)
  To: caml-list; +Cc: Adam Chlipala, Gerd Stolpmann

Hi,

Adam Chlipala wrote:
> Gerd Stolpmann wrote:
>> On Wednesday, 07.11.2007, at 12:28 -0500, Adam Chlipala wrote:
>>  
>>> I've encountered a problem where certain OCaml programs use orders of 
>>> magnitude more RAM when compiled/run in 64-bit Linux instead of 
>>> 32-bit Linux.  Some investigation led to the conclusion that the 
>>> difference has to do with the size of OCaml page tables.  (Here I 
>>> mean the page tables maintained by the OCaml runtime system, not any 
>>> OS stuff.)
>>>
>>> ...
>>>     
>>
>> We are using O'Caml on 64 bit Linux, and aren't aware of such problems.
>>
>> Did you observe a debug GC message that proves it? 200 MB means that an
>> address space of 200M * 4K = 800G is covered.

We are observing the same memory consumption problems with 
liquidsoap[1]. On 32-bit machines, ps reports around 50M / 10M of 
VSZ / RSS, whereas on 64-bit machines it reports 200M / 100M, which 
is much bigger!

Here is the initial stack and heap allocation on 32-bit archs:

% OCAMLRUNPARAM="v=12" ./liquidsoap 'output.dummy(blank())'
Growing heap to 480k bytes
Growing page table to 3040 entries
Growing heap to 720k bytes
Growing page table to 3354 entries
Growing heap to 960k bytes
Growing page table to 3532 entries
Growing heap to 1200k bytes
Growing page table to 4251 entries
Growing heap to 1440k bytes
Growing page table to 4416 entries
Growing heap to 1680k bytes
Growing page table to 4478 entries
Growing heap to 1920k bytes
Growing page table to 4540 entries

And on 64-bit archs:

$ OCAMLRUNPARAM="v=12" ./liquidsoap 'output.dummy(blank())'
Growing heap to 960k bytes
Growing page table to 118256149 entries
Growing heap to 1440k bytes

I'm not sure I fully understand what these figures are. Is it expected 
for the page table to be bigger by that many orders of magnitude on 
64-bit archs?

Thanks!

Samuel.

[1] http://savonet.sf.net/
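
For anyone comparing their own runs, the two logs above can be compared mechanically. The following is only a sketch: the helper name and the regular expression are made up here, and it assumes the v=12 output uses exactly the "Growing page table to N entries" wording shown in this message.

```python
import re

# Matches the page-table lines printed with OCAMLRUNPARAM="v=12" above.
PAGE_TABLE_RE = re.compile(r"^Growing page table to (\d+) entries$")


def page_table_sizes(log_text):
    """Extract the successive page-table entry counts from a GC log."""
    sizes = []
    for line in log_text.splitlines():
        match = PAGE_TABLE_RE.match(line.strip())
        if match:
            sizes.append(int(match.group(1)))
    return sizes


# Excerpts copied from the 32-bit and 64-bit runs quoted above.
log_32 = """\
Growing heap to 1680k bytes
Growing page table to 4478 entries
Growing heap to 1920k bytes
Growing page table to 4540 entries
"""
log_64 = """\
Growing heap to 960k bytes
Growing page table to 118256149 entries
Growing heap to 1440k bytes
"""

ratio = max(page_table_sizes(log_64)) / max(page_table_sizes(log_32))
print(f"64-bit table is about {ratio:.0f} times larger")
```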



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-07 17:28 OCaml runtime using too much memory in 64-bit Linux Adam Chlipala
  2007-11-07 18:20 ` [Caml-list] " Gerd Stolpmann
@ 2007-11-08 20:51 ` Romain Beauxis
  1 sibling, 0 replies; 17+ messages in thread
From: Romain Beauxis @ 2007-11-08 20:51 UTC (permalink / raw)
  To: caml-list

On Wednesday 07 November 2007 18:28:49, Adam Chlipala wrote:
> A program that should be using just a few megabytes of RAM ends up using
> 200+ MB to store a page table.  It seems that a C macro is defined by
> default on 64-bit Linux to use mmap() instead of malloc().  Ironically,
> a comment says that this was done to avoid being given blocks of memory
> that are very far apart from each other, forcing the creation of overly
> large page tables.  It's ironic because that is exactly the problem that
> is showing up now with mmap().  It ends up called twice for the program
> I'm looking at, and the two addresses it returns are far enough apart to
> lead to creation of a 200 MB page table.


Unfortunately, you can't compile without that option on amd64 archs; 
you'll get this error:
 
> boot/ocamlrun boot/ocamlc -nostdlib -I boot  -linkall -o ocaml.tmp toplevel/toplevellib.cma toplevel/topstart.cmo
>Fatal error: exception Out_of_memory


Romain



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-07 19:12   ` Adam Chlipala
  2007-11-08 12:56     ` Samuel Mimram
@ 2007-11-14  4:20     ` Romain Beauxis
  2007-11-14 12:03       ` Vladimir Shabanov
  1 sibling, 1 reply; 17+ messages in thread
From: Romain Beauxis @ 2007-11-14  4:20 UTC (permalink / raw)
  To: caml-list

Hi all!

On Wednesday 07 November 2007 20:12:16, Adam Chlipala wrote:
> Gerd Stolpmann wrote:
> > On Wednesday, 07.11.2007, at 12:28 -0500, Adam Chlipala wrote:
> >> I've encountered a problem where certain OCaml programs use orders of
> >> magnitude more RAM when compiled/run in 64-bit Linux instead of 32-bit
> >> Linux.  Some investigation led to the conclusion that the difference has
> >> to do with the size of OCaml page tables.  (Here I mean the page tables
> >> maintained by the OCaml runtime system, not any OS stuff.)
> >>
> >> ...
> >
> > We are using O'Caml on 64 bit Linux, and aren't aware of such problems.
> >
> > Did you observe a debug GC message that proves it? 200 MB means that an
> > address space of 200M * 4K = 800G is covered.
>
> Here's one run, cut off after allocation seems to settle down:
>
> OCAMLRUNPARAM="v=12" ./program_name.exe
> Growing heap to 960k bytes
> Growing page table to 204151332 entries
> Growing heap to 1440k bytes
> Growing heap to 1920k bytes

Following Sam's answer about a similar issue with our application, here are 
two outputs comparing the same information:

-- On i386:
5:13 toots@selassie ~% OCAMLRUNPARAM="v=12" liquidsoap 'output.dummy(blank())'
Growing heap to 480k bytes
Growing page table to 2648 entries
Growing heap to 720k bytes
Growing page table to 2710 entries
Growing heap to 960k bytes
Growing page table to 2815 entries

-- On amd64:
5:12 toots@ras-macintosh ~/sources/svn/savonet/trunk/liquidsoap/src% 
OCAMLRUNPARAM="v=12" ./liquidsoap 'output.dummy(blank())'
Growing heap to 960k bytes
Growing page table to 104640820 entries
Growing heap to 1440k bytes
Growing heap to 1920k bytes

It seems that the "Growing page table to 104640820 entries" in amd64's log is 
quite enormous, compared to similar values for i386.

Sorry, I can't debug more, I'm not an expert at all on this topic.
However, I'll be glad to dig more if told what to do.


Romain



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-14  4:20     ` Romain Beauxis
@ 2007-11-14 12:03       ` Vladimir Shabanov
  2007-11-14 12:55         ` Xavier Leroy
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Shabanov @ 2007-11-14 12:03 UTC (permalink / raw)
  To: caml-list

2007/11/14, Romain Beauxis <romain.beauxis@gmail.com>:
> Following Sam's answer about a similar issue with our application, here are
> two outputs comparing the same information:
>
> -- On i386:
> 5:13 toots@selassie ~% OCAMLRUNPARAM="v=12" liquidsoap 'output.dummy(blank())'
> Growing heap to 480k bytes
> Growing page table to 2648 entries
> Growing heap to 720k bytes
> Growing page table to 2710 entries
> Growing heap to 960k bytes
> Growing page table to 2815 entries
>
> -- On amd64:
> 5:12 toots@ras-macintosh ~/sources/svn/savonet/trunk/liquidsoap/src%
> OCAMLRUNPARAM="v=12" ./liquidsoap 'output.dummy(blank())'
> Growing heap to 960k bytes
> Growing page table to 104640820 entries
> Growing heap to 1440k bytes
> Growing heap to 1920k bytes
>
> It seems that the "Growing page table to 104640820 entries" in amd64's log is
> quite enormous, compared to similar values for i386.

I also have problems with my application on amd64. The difference is
that the additional memory is allocated only in the bytecode executable.

native amd64:
$ OCAMLRUNPARAM="v=12" ./_build/game.opt
Growing heap to 960k bytes
Growing page table to 72391 entries
... (program output stripped)
Growing heap to 1440k bytes
Growing page table to 90522 entries
...

bytecode amd64:
$ OCAMLRUNPARAM="v=12" ./_build/game
Initial stack limit: 8192k bytes
Growing gray_vals to 32k bytes
Growing heap to 960k bytes
Growing page table to 141518746 entries
...
Growing heap to 1440k bytes
...

It gives me 80--300MB of additional memory allocated (virt & res).
Interestingly enough, the number of page table entries differs
from run to run (hence the non-constant additional memory size). In
the native executable, the page table entry count is constant.



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 12:03       ` Vladimir Shabanov
@ 2007-11-14 12:55         ` Xavier Leroy
  2007-11-14 13:45           ` Brian Hurt
                             ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Xavier Leroy @ 2007-11-14 12:55 UTC (permalink / raw)
  To: Vladimir Shabanov; +Cc: caml-list

Hello,

Concerning this issue with large page tables on 64-bit architectures,
I opened a problem report on the bug-tracking system to help gather
more information.  I'd like to ask all members of this list that
reported the problem to kindly visit

         http://caml.inria.fr/mantis/view.php?id=4448

and add the required information as a note.  That will help
pinpoint the problem.

Some more explanation on what's going on.  The Caml run-time system
needs to keep track of which memory areas belong to the major heap,
and uses a page table for this purpose, with a dense representation
(an array of bytes).  If the major heap areas are closely spaced, this
table is very small compared with the size of the heap itself.
However, if these areas are widely spaced in the addressing space, the
table can get big.
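
To make the effect of spacing concrete, here is a small model of such a dense table. It is a sketch under stated assumptions, not the runtime's actual C code: the class and method names are invented, pages are assumed to be 4 KiB, and the addresses are arbitrary. Two adjacent 480k chunks need 240 entries, while the same two chunks placed 256 MiB apart need over 65000.

```python
PAGE_LOG = 12  # assumed 4 KiB pages


class DensePageTable:
    """One byte per page between the lowest and highest heap page seen."""

    def __init__(self):
        self.base = None          # page number of the first table entry
        self.table = bytearray()  # 1 = page belongs to the major heap

    def add_chunk(self, start, size):
        first = start >> PAGE_LOG
        last = (start + size - 1) >> PAGE_LOG
        if self.base is None:
            self.base = first
            self.table = bytearray(last - first + 1)
        else:
            if first < self.base:
                # Grow downwards: prepend zeroed entries.
                self.table = bytearray(self.base - first) + self.table
                self.base = first
            end = self.base + len(self.table)
            if last >= end:
                # Grow upwards: append zeroed entries, gap included.
                self.table.extend(bytes(last - end + 1))
        for page in range(first, last + 1):
            self.table[page - self.base] = 1

    def in_heap(self, addr):
        if self.base is None:
            return False
        index = (addr >> PAGE_LOG) - self.base
        return 0 <= index < len(self.table) and self.table[index] == 1


CHUNK = 480 * 1024  # heap increment seen in the logs of this thread

near = DensePageTable()
near.add_chunk(0x10000000, CHUNK)
near.add_chunk(0x10000000 + CHUNK, CHUNK)

far = DensePageTable()
far.add_chunk(0x10000000, CHUNK)
far.add_chunk(0x20000000, CHUNK)  # hypothetical distant mapping

print(len(near.table), len(far.table))
```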

For 32-bit platforms, this isn't much of a problem since the maximum
size of the page table is 1 megabyte.  For 64-bit platforms, the sky
is the limit, however.  So far, the only 64-bit platform where this
has been a problem in the past is Linux with glibc, where blocks
allocated by malloc() can come either from sbrk() or mmap(), two areas
that are spaced several *exa*bytes apart.  We worked around the
problem by allocating all major heap areas directly with mmap(),
obtaining closely spaced addresses.

Apparently, this trick is no longer working on some systems, but I
need to understand better which ones exactly.  (I suspect some Linux
distros that applied address randomization patches to the stock Linux
kernel.)  So, please provide feedback in the BTS.

If the problem is confirmed, there are several ways to go about it.
One is to implement the page table with a sparse data structure,
e.g. a hash table.  However, the major GC and some primitives like
polymorphic equality perform *lots* of page table lookups, so a
performance hit is to be expected.  The other is to revise OCaml's
data representations so that the GC and polymorphic primitives no
longer need to know which pointers fall in the major heap.  This seems
possible in principle, but will take quite a bit of work and break a
lot of badly written C/OCaml interface code.  You've been warned :-)
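
The first option can be illustrated with a toy model. This sketch uses a Python set (itself a hash table) keyed by page number; the names are invented and 4 KiB pages are assumed. It only shows that the memory cost then depends on the number of heap pages rather than on how far apart they are. It says nothing about the lookup cost, which is the real concern raised above.

```python
PAGE_LOG = 12  # assumed 4 KiB pages


class SparsePageTable:
    """Hash-based membership test: one set element per heap page."""

    def __init__(self):
        self.pages = set()

    def add_chunk(self, start, size):
        first = start >> PAGE_LOG
        last = (start + size - 1) >> PAGE_LOG
        self.pages.update(range(first, last + 1))

    def in_heap(self, addr):
        return (addr >> PAGE_LOG) in self.pages


CHUNK = 480 * 1024

sparse = SparsePageTable()
sparse.add_chunk(0x10000000, CHUNK)
sparse.add_chunk(0x10000000 + 2**46, CHUNK)  # hypothetical far-away chunk

# Only the pages actually in use are stored, whatever the gap.
print(len(sparse.pages))
```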

- Xavier Leroy



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 12:55         ` Xavier Leroy
@ 2007-11-14 13:45           ` Brian Hurt
  2007-11-14 14:16           ` Romain Beauxis
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Brian Hurt @ 2007-11-14 13:45 UTC (permalink / raw)
  Cc: caml-list

Xavier Leroy wrote:

>
>If the problem is confirmed, there are several ways to go about it.
>One is to implement the page table with a sparse data structure,
>e.g. a hash table.  However, the major GC and some primitives like
>polymorphic equality perform *lots* of page table lookups, so a
>performance hit is to be expected.  
>

I've been contemplating doing this on my own (not at work, I should add), 
just to see how much of a hit it is.  If no one else steps up to the 
plate, I will.

One important comment: in moving to a hash table, the size of 
pages has to increase.  The current implementation uses 1 byte per (4K) 
page for the map; a hash table would use, I project, about 4 words (16 
or 32 bytes) per page.  To keep the memory utilization equal, we'd need 
to go for a larger page size: 64K for 32-bit systems.  I'd be inclined 
to go larger than that.  The 4K page size was standardized back when the 
average system (using memory protection) had 1MB, and 16MB was a huge 
amount of memory.  Memory sizes have increased 1024-fold since 
then, meaning I could make an argument for 4M pages.  I'm not sure I 
would (I think you'd run into a fragmentation problem on 32-bit), but it 
makes the page sizes I'd probably settle on (256K for 32-bit, 1M for 
64-bit) a lot more palatable. :-)

An advantage of large pages is that they'd make the table a lot 
smaller, and thus a lot more cache friendly.  Think about it: 
1GB of 4K pages at 1 byte per page is 256K, while 1GB of 256K pages at 
16 bytes per page is 64K, 1/4 the size.  1GB of 1M pages at 32 bytes per 
page is 32K, smaller yet.  The smaller table is more likely to fit into 
cache, and cheaper to load into cache.  While I wouldn't want to 
guarantee anything, I could easily see the smaller table that fits 
into cache actually giving a performance boost.  I've certainly seen 
weirder things happen.
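
The table sizes in the paragraph above can be checked with a few lines. This is just the arithmetic, using the page sizes and per-page costs projected in this message; the function and variable names are made up.

```python
KIB, MIB, GIB = 2**10, 2**20, 2**30


def table_bytes(heap_bytes, page_size, bytes_per_page):
    """Bytes of bookkeeping needed to describe heap_bytes of heap."""
    return heap_bytes // page_size * bytes_per_page


# (label, page size, bytes of table per page) as projected above.
configs = [
    ("4K pages, 1 byte each", 4 * KIB, 1),
    ("256K pages, 16 bytes each", 256 * KIB, 16),
    ("1M pages, 32 bytes each", 1 * MIB, 32),
]

sizes = {label: table_bytes(GIB, page, cost) for label, page, cost in configs}
for label, size in sizes.items():
    print(f"{label}: {size // KIB}K")
```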

>The other is to revise OCaml's
>data representations so that the GC and polymorphic primitives no
>longer need to know which pointers fall in the major heap.  This seems
>possible in principle, but will take quite a bit of work and break a
>lot of badly written C/OCaml interface code.  You've been warned :-)
>
>  
>
Of the two, I think I like the hashtable idea better.

If it isn't clear, this whole email is just speaking for myself.  I'm 
not allowed to speak for Jane Street (heck, if I'm lucky, I'm allowed to 
speak *to* Jane Street :-).

Brian



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 12:55         ` Xavier Leroy
  2007-11-14 13:45           ` Brian Hurt
@ 2007-11-14 14:16           ` Romain Beauxis
  2007-11-14 15:56           ` Markus Mottl
  2007-11-14 16:22           ` Stefan Monnier
  3 siblings, 0 replies; 17+ messages in thread
From: Romain Beauxis @ 2007-11-14 14:16 UTC (permalink / raw)
  To: caml-list

Hi Xavier!

On Wednesday 14 November 2007 13:55:40, Xavier Leroy wrote:
> Apparently, this trick is no longer working on some systems, but I
> need to understand better which ones exactly.  (I suspect some Linux
> distros that applied address randomization patches to the stock Linux
> kernel.)  So, please provide feedback in the BTS.

While printing the required data for your bug report, we found out that:
  /proc/sys/kernel/randomize_va_space
can be set to 0

Afterward, the issue does not appear anymore.


Quite a good workaround until a real fix appears. :)


Romain
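
For those adding data to the bug report, the current setting can be read without root. The sketch below assumes a Linux system exposing the file named above and returns None anywhere else; the helper name is made up. On mainline Linux the file holds 0, 1 or 2, where 0 means randomization is off.

```python
from pathlib import Path

# Path mentioned above; only present on Linux.
ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")


def aslr_setting(path=ASLR_PATH):
    """Return the randomize_va_space value as an int, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


setting = aslr_setting()
if setting is None:
    print("randomize_va_space not available on this system")
elif setting == 0:
    print("address randomization is disabled")
else:
    print(f"address randomization is enabled (value {setting})")
```

Changing the value means writing to the same file, which requires root.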



* Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 12:55         ` Xavier Leroy
  2007-11-14 13:45           ` Brian Hurt
  2007-11-14 14:16           ` Romain Beauxis
@ 2007-11-14 15:56           ` Markus Mottl
  2007-11-14 16:22           ` Stefan Monnier
  3 siblings, 0 replies; 17+ messages in thread
From: Markus Mottl @ 2007-11-14 15:56 UTC (permalink / raw)
  To: caml-list

On 11/14/07, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:
> Concerning this issue with large page tables on 64-bit architectures,
> I opened a problem report on the bug-tracking system to help gather
> more information.  I'd like to ask all members of this list that
> reported the problem to kindly visit
>
>          http://caml.inria.fr/mantis/view.php?id=4448

I have just added a small note there describing a very simple fix,
which, though it may not be totally general, might be good enough and
requires very little effort to implement.  If anybody wants to give it
a try...

Best regards,
Markus

-- 
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com



* Re: OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 12:55         ` Xavier Leroy
                             ` (2 preceding siblings ...)
  2007-11-14 15:56           ` Markus Mottl
@ 2007-11-14 16:22           ` Stefan Monnier
  2007-11-14 16:36             ` [Caml-list] " Brian Hurt
  2007-11-14 16:45             ` Lionel Elie Mamane
  3 siblings, 2 replies; 17+ messages in thread
From: Stefan Monnier @ 2007-11-14 16:22 UTC (permalink / raw)
  To: caml-list

> and uses a page table for this purpose, with a dense representation
> (an array of bytes).  If the major heap areas are closely spaced, this
[...]
> For 32-bit platforms, this isn't much of a problem since the maximum
> size of the page table is 1 megabyte.  For 64-bit platforms, the sky

How about allocating this array of bytes via mmap and then leave it
uninitialized (relying on POSIX's guarantee that it's already
initialized to zeros)?
This way you can easily have a 4GB "dense" table which doesn't use much
RAM since most of the 4GB will be mapped (via copy-on-write) to the same
"zero page".


        Stefan


PS: Obviously this is orthogonal to the potential change in page-size
recommended by Brian.
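
The zero-fill behaviour behind this suggestion can be observed from a small script. It is a sketch only: it assumes a Unix system where Python's mmap module exposes MAP_PRIVATE and MAP_ANONYMOUS, it uses a 64 MiB mapping rather than 4GB, and it does not measure resident memory, so it shows the zero-initialization contract and not the RAM savings.

```python
import mmap

SIZE = 64 * 2**20  # 64 MiB stand-in for the much larger table discussed

# Private anonymous mapping: no file behind it, contents start as zeros.
table = mmap.mmap(-1, SIZE,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)

# Reads from untouched pages see zeros without any explicit initialization.
untouched = [table[0], table[SIZE // 2], table[SIZE - 1]]

# Marking one page as "in heap" writes to a single entry of the table.
table[12345] = 1

print(untouched, table[12345])
```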



* Re: [Caml-list] Re: OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 16:22           ` Stefan Monnier
@ 2007-11-14 16:36             ` Brian Hurt
  2007-11-14 17:08               ` Lionel Elie Mamane
  2007-11-14 17:26               ` Stefan Monnier
  2007-11-14 16:45             ` Lionel Elie Mamane
  1 sibling, 2 replies; 17+ messages in thread
From: Brian Hurt @ 2007-11-14 16:36 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: caml-list

Stefan Monnier wrote:

>How about allocating this array of bytes via mmap and then leave it
>uninitialized (relying on POSIX's guarantee that it's already
>initialized to zeros)?
>This way you can easily have a 4GB "dense" table which doesn't use much
>RAM since most of the 4GB will be mapped (via copy-on-write) to the same
>"zero page".
>
>  
>
Even on a system like Linux, which optimistically allocates memory (i.e. 
the underlying memory isn't actually allocated until you touch 
it), once you read the page, it has to actually exist in memory.  Even 
using 1 byte per 4M page, mapping a whole 64-bit memory space requires 4 
TB of RAM.  And many systems do not optimistically allocate memory, as 
it causes a lot of problems (for example, allocations can appear to 
succeed and then segfault when you first touch the memory because 
it can't really be allocated).

Brian



* Re: [Caml-list] Re: OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 16:22           ` Stefan Monnier
  2007-11-14 16:36             ` [Caml-list] " Brian Hurt
@ 2007-11-14 16:45             ` Lionel Elie Mamane
  2007-11-14 17:08               ` Lionel Elie Mamane
  1 sibling, 1 reply; 17+ messages in thread
From: Lionel Elie Mamane @ 2007-11-14 16:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: caml-list

On Wed, Nov 14, 2007 at 11:22:46AM -0500, Stefan Monnier wrote:
>> and uses a page table for this purpose, with a dense representation
>> (an array of bytes).  If the major heap areas are closely spaced, this
> [...]
>> For 32-bit platforms, this isn't much of a problem since the maximum
>> size of the page table is 1 megabyte.  For 64-bit platforms, the sky

> How about allocating this array of bytes via mmap and then leave it
> uninitialized (relying on POSIX's guarantee that it's already
> initialized to zeros)?
> This way you can easily have a 4GB "dense" table which doesn't use much
> RAM since most of the 4GB will be mapped (via copy-on-write) to the same
> "zero page".

I think this will fail on a GNU/Linux 2.6 system with
/proc/sys/vm/overcommit_memory set to 2, or any other system that
behaves as Linux with overcommit to 2. (Meaning, it actually reserves
place from the swap+ram pool, so that any mmapped/sbrk'd memory can
actually be used. It permits the kernel to guarantee that even if all
programs actually use all the memory they allocate, it will be able to
serve them - albeit slowly (swap use)).

In particular, the addressing space of a 64 bit machine is, well... 64
bits, by definition. For 4kiB = 2^12 B pages, one thus needs a table
of size 2^(64-12) = 2^52 bytes, that is 4 EB. That is, on any machine
with less than that of memory (and overcommit to 2), the program will
not run. Even at one bit (and not byte) per page, that is still
16PB...

Big pages don't get you out of the problem. 4MB pages only buy you a
factor 1024, that is 4PB and 16GB.


-- 
Lionel



* Re: [Caml-list] Re: OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 16:45             ` Lionel Elie Mamane
@ 2007-11-14 17:08               ` Lionel Elie Mamane
  0 siblings, 0 replies; 17+ messages in thread
From: Lionel Elie Mamane @ 2007-11-14 17:08 UTC (permalink / raw)
  To: Stefan Monnier, caml-list

On Wed, Nov 14, 2007 at 05:45:44PM +0100, Lionel Elie Mamane wrote:

> In particular, the addressing space of a 64 bit machine is, well... 64
> bits, by definition. For 4kiB = 2^12 B pages, one thus needs a table
> of size 2^(64-12) = 2^52 bytes, that is 4 EB. That is, on any machine
> with less than that of memory (and overcommit to 2), the program will
> not run. Even at one bit (and not byte) per page, that is still
> 16PB...

> Big pages don't get you out of the problem. 4MB pages only buy you a
> factor 1024, that is 4PB and 16GB.

I got my prefixes all wrong... It is 4PiB, 512TiB, 4TiB and 512GiB...

-- 
Lionel
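
The figures for a table covering the whole 64-bit address space can be recomputed directly. This is only the arithmetic from the quoted paragraphs, with made-up helper names; each one-bit-per-page result is one eighth of the corresponding one-byte-per-page result.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def full_table_bytes(address_bits, page_bytes, bits_per_page):
    """Size of a dense table covering the whole address space."""
    pages = 2**address_bits // page_bytes
    return pages * bits_per_page // 8


def human(n):
    """Format an exact power-of-two byte count with a binary prefix."""
    i = 0
    while n >= 1024 and i < len(UNITS) - 1:
        n //= 1024
        i += 1
    return f"{n} {UNITS[i]}"


cases = [
    ("4 KiB pages, 1 byte per page", 4 * 2**10, 8),
    ("4 KiB pages, 1 bit per page", 4 * 2**10, 1),
    ("4 MiB pages, 1 byte per page", 4 * 2**20, 8),
    ("4 MiB pages, 1 bit per page", 4 * 2**20, 1),
]

results = {label: human(full_table_bytes(64, page, bits))
           for label, page, bits in cases}
for label, size in results.items():
    print(f"{label}: {size}")
```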



* Re: [Caml-list] Re: OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 16:36             ` [Caml-list] " Brian Hurt
@ 2007-11-14 17:08               ` Lionel Elie Mamane
  2007-11-14 17:26               ` Stefan Monnier
  1 sibling, 0 replies; 17+ messages in thread
From: Lionel Elie Mamane @ 2007-11-14 17:08 UTC (permalink / raw)
  To: Brian Hurt; +Cc: Stefan Monnier, caml-list

On Wed, Nov 14, 2007 at 11:36:16AM -0500, Brian Hurt wrote:
> Stefan Monnier wrote:

>> How about allocating this array of bytes via mmap and then leave it
>> uninitialized (relying on POSIX's guarantee that it's already
>> initialized to zeros)?
>> This way you can easily have a 4GB "dense" table which doesn't use much
>> RAM since most of the 4GB will be mapped (via copy-on-write) to the same
>> "zero page".

> Even on a system like linux, which optimistically allocates memory
> (i.e.  the actually underlying memory isn't allocated until you
> actually touch it), once you read the page, it has to actually exist
> in memory.

This may not be a problem (but only people that know the system
intimately will *know*), it is plausible that the ocaml runtime system
would not read any entry in the table corresponding to a page that is
not allocated. Or maybe not.

-- 
Lionel



* Re: [Caml-list] Re: OCaml runtime using too much memory in 64-bit Linux
  2007-11-14 16:36             ` [Caml-list] " Brian Hurt
  2007-11-14 17:08               ` Lionel Elie Mamane
@ 2007-11-14 17:26               ` Stefan Monnier
  1 sibling, 0 replies; 17+ messages in thread
From: Stefan Monnier @ 2007-11-14 17:26 UTC (permalink / raw)
  To: Brian Hurt; +Cc: caml-list

> Even on a system like linux, which optimistically allocates memory (i.e. the
> actually underlying memory isn't allocated until you actually touch it),
> once you read the page, it has to actually exist in memory.

It exists in memory: it's the zero page (a page that contains all zero
bytes).  And it's the same physical (RAM) page used for all pages that
have been allocated but not yet written.  So as long as you don't write
to it, it shouldn't use any RAM space.

Of course, it may cost in swap use (depending on optimistic allocation
and the use of MAP_NORESERVE), and it will cost in kernel memory because
the kernel has to maintain the process's page table.

But it seems like a good quick fix, which preserves the advantages of
a dense array of bytes (i.e. fast and simple lookup, compact
representation using less cache space, ...).


        Stefan


