caml-list - the Caml user's mailing list
* [Caml-list] Netmcore problems.
@ 2012-01-13 16:30 Anders Fugmann
  2012-01-13 17:06 ` Gerd Stolpmann
From: Anders Fugmann @ 2012-01-13 16:30 UTC
  To: caml-list

Hi,

I'm having a problem using netmcore from the ocamlnet 3.4 library.

When I try to place a large data structure into Netmcore_array, the 
controlling process dies with the following error:

[Fri Jan 13 16:21:38 2012] [netplex.controller] [alert] Process 28067 
for service netmcore_0 terminated with signal 7

The error occurs while trying to initialise a Netmcore_array from an 
array of 4*10^6 strings all of length 150. This is done in the context 
of the "first" process.

The controlling process initializes a memory pool by using

   let pool_id = Netmcore_mempool.create_mempool (1024 * 1024 * 1024 * 10) in

Running the code on a smaller dataset seems to work.

I have increased the kernel variables shmall and shmmax to very high
values (100 GB), but that does not solve the problem.

All ideas are welcome.

One other question while I'm at it: am I allowed to create multiple 
shared arrays from the same memory pool, or do I need to create one pool 
for each shared array?

Regards
Anders Fugmann




* Re: [Caml-list] Netmcore problems.
  2012-01-13 16:30 [Caml-list] Netmcore problems Anders Fugmann
@ 2012-01-13 17:06 ` Gerd Stolpmann
  2012-01-16 17:46   ` Anders Fugmann
From: Gerd Stolpmann @ 2012-01-13 17:06 UTC
  To: Anders Fugmann; +Cc: caml-list

On Friday, 13.01.2012, at 17:30 +0100, Anders Fugmann wrote:
> Hi,
> 
> I'm having a problem using netmcore from the ocamlnet 3.4 library.
> 
> When I try to place a large data structure into Netmcore_array, the 
> controlling process dies with the following error:
> 
> [Fri Jan 13 16:21:38 2012] [netplex.controller] [alert] Process 28067 
> for service netmcore_0 terminated with signal 7

That's SIGBUS. You get it when the OS has no memory left, not at the
moment the memory is allocated but only later, when it is first filled
with data. You can change this by disallowing memory overcommit (set the
kernel parameter vm.overcommit_memory to 2; a value of 1 means "always
overcommit" and would not help).
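
To check which overcommit mode a machine is currently in, something
like the following OCaml sketch could be used. It assumes a Linux
system exposing /proc/sys/vm/overcommit_memory; the function names
describe_overcommit and read_overcommit_mode are made up for this
illustration and are not part of ocamlnet. The mode meanings are those
given in the kernel documentation (0 = heuristic, 1 = always
overcommit, 2 = strict accounting).

```ocaml
(* Sketch only: interpret the Linux vm.overcommit_memory setting.
   The helper names are hypothetical, not part of any library. *)

(* Map the numeric mode to a short description. *)
let describe_overcommit mode =
  match mode with
  | 0 -> "heuristic overcommit (kernel default)"
  | 1 -> "always overcommit"
  | 2 -> "strict accounting, no overcommit"
  | n -> Printf.sprintf "unknown mode %d" n

(* Read the current mode from /proc. Returns None if the file is
   missing or unreadable, for example on a non-Linux system. *)
let read_overcommit_mode () =
  match open_in "/proc/sys/vm/overcommit_memory" with
  | exception Sys_error _ -> None
  | ic ->
    let mode =
      match input_line ic with
      | line -> int_of_string_opt (String.trim line)
      | exception End_of_file -> None
    in
    close_in ic;
    mode

let () =
  match read_overcommit_mode () with
  | Some m ->
    Printf.printf "vm.overcommit_memory = %d: %s\n" m (describe_overcommit m)
  | None ->
    print_endline "vm.overcommit_memory not available on this system"
```

Only mode 2 makes the kernel refuse allocations it cannot back, so that
is the value to look for if the failure should show up at allocation
time instead of later.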

> The error occurs while trying to initialise a Netmcore_array from an 
> array of 4*10^6 strings all of length 150. This is done in the context 
> of the "first" process.
> 
> The controlling process initializes a memory pool by using
> 
>    let pool_id = Netmcore_mempool.create_mempool (1024 * 1024 * 1024 * 10) in
> 
> Running the code on a smaller dataset seems to work.
> 
> I have increased the kernel variables shmall and shmmax to very high
> values (100 GB), but that does not solve the problem.

No, this cannot solve it. These variables only control System V shared
memory, but Netmcore uses POSIX shared memory. On Linux, you can change
the maximum size of this memory by remounting /dev/shm, e.g.

mount -o remount,size=80% /dev/shm

The tmpfs default is 50% of physical RAM.
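
As a rough cross-check, the data from the question (4*10^6 strings of
length 150) can be compared against the /dev/shm limit with a small
calculation. The OCaml sketch below is an estimate under stated
assumptions: a 64-bit system, values stored in the pool in the ordinary
OCaml heap representation, and no allowance for the pool's own
bookkeeping. The 16 GiB of RAM is a hypothetical figure, and all
function names are invented for this example.

```ocaml
(* Sketch only: estimate the shared-memory footprint of the data and
   the tmpfs limit for a given size=<percent>% mount option. *)

(* Assumption: 64-bit system, so one heap word is 8 bytes. *)
let word_bytes = 8

(* Heap size of one OCaml string of length [len]: one header word plus
   enough data words to hold the bytes and at least one padding byte. *)
let string_bytes len = (1 + (len / word_bytes + 1)) * word_bytes

(* Heap size of an array of [n] strings of length [len]: the array
   block (header plus one pointer per element) plus the strings. *)
let array_of_strings_bytes n len =
  (1 + n) * word_bytes + n * string_bytes len

(* Size limit of a tmpfs mounted with size=<percent>% on a machine
   with [ram] bytes of physical memory. *)
let tmpfs_limit ~ram ~percent = ram * percent / 100

let gib = 1024 * 1024 * 1024

let () =
  let data = array_of_strings_bytes 4_000_000 150 in
  let ram = 16 * gib (* hypothetical machine size *) in
  Printf.printf "estimated data size: %d bytes\n" data;
  List.iter
    (fun percent ->
       Printf.printf "size=%d%% -> limit %d bytes\n" percent
         (tmpfs_limit ~ram ~percent))
    [20; 50; 80]
```

Under these assumptions the array needs roughly 640 MiB once it has
been copied into the pool. Everything else living in /dev/shm counts
against the same limit.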

> All ideas are welcome.
> 
> One other question while I'm at it: am I allowed to create multiple 
> shared arrays from the same memory pool, or do I need to create one pool 
> for each shared array?

A pool can be shared: you can create several shared arrays in the same
pool and do not need one pool per array.

Gerd

> 
> Regards
> Anders Fugmann
> 
> 
> 

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
*** Searching for new projects! Need consulting for system
*** programming in Ocaml? Gerd Stolpmann can help you.
------------------------------------------------------------



* Re: [Caml-list] Netmcore problems.
  2012-01-13 17:06 ` Gerd Stolpmann
@ 2012-01-16 17:46   ` Anders Fugmann
From: Anders Fugmann @ 2012-01-16 17:46 UTC
  To: Gerd Stolpmann; +Cc: caml-list


Hi,

Thank you very much. Remounting with size=80% solved the problem.

It seems that on Debian, /run/shm (which /dev/shm is symlinked to) is
mounted with size=20%, which is quite conservative.

Again, thanks.
Anders

-- 
Anders Fugmann


