From mboxrd@z Thu Jan 1 00:00:00 1970 X-Sympa-To: caml-list@inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q0GHkqHJ006854 for ; Mon, 16 Jan 2012 18:46:52 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap8EAIBhFE9auLfF/2dsb2JhbABDhRGnIIIQgXIBAQUjVhALDgoCAgUhAgIPAkYGDQEHAod4pFWRIIEvh3wBAQgEDAEBCwgBAQQBBQgFBBEFAQYBAQYBBSwBAgEBCAEBAQECFhUDAQYMBwICAx0DAQYJAgENAQEDCwILAgsDAQEJgTkGA4MCgRYEp1M X-IronPort-AV: E=Sophos;i="4.71,519,1320620400"; d="scan'208";a="127272100" Received: from 0405ds1-vaer.0.fullrate.dk (HELO fw.fugmann.net) ([90.184.183.197]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/ADH-AES256-SHA; 16 Jan 2012 18:46:46 +0100 Received: from [IPv6:2001:470:dc46:1:225:22ff:fe81:9ef3] (marvin.fugmann.net [IPv6:2001:470:dc46:1:225:22ff:fe81:9ef3]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by fw.fugmann.net (Postfix) with ESMTPSA id 334244005A; Mon, 16 Jan 2012 18:46:44 +0100 (CET) Message-ID: <4F14628D.60809@fugmann.net> Date: Mon, 16 Jan 2012 18:46:53 +0100 From: Anders Fugmann User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111114 Icedove/3.1.16 MIME-Version: 1.0 To: Gerd Stolpmann CC: caml-list References: <4F105C34.3010207@fugmann.net> <1326474410.14288.116.camel@thinkpad> In-Reply-To: <1326474410.14288.116.camel@thinkpad> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Validation-by: anders@fugmann.net Subject: Re: [Caml-list] Netmcore problems. -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, Thank you very much. Remounting with size=80% solved the problem. It would seem that on debian, /run/shm (which is symlinked from /dev/shm) is mounted with size=20%.. Quite conservative it would seem. Again, thanks. Anders - -- Anders Fugmann On 2012-01-13 18:06, Gerd Stolpmann wrote: > Am Freitag, den 13.01.2012, 17:30 +0100 schrieb Anders Fugmann: >> Hi, >> >> I'm having a problem using netmcore from ocamlnet 3.4 library. >> >> When I try to place a large data structure into Netmcore_array, the >> controlling process dies with the following error: >> >> [Fri Jan 13 16:21:38 2012] [netplex.controller] [alert] Process 28067 >> for service netmcore_0 terminated with signal 7 > > That's sigbus. You get it when the OS does not have memory anymore - but > not when the memory is allocated, but first when it is filled with data. > You can change this by disallowing that memory is overcommitted (set the > kernel param vm.overcommit_memory to 1 or even 2). > >> The error occurs while trying to initialise a Netmcore_array from an >> array of 4*10^6 strings all of length 150. This is done in the context >> of the "first" process. >> >> The controlling process initializes a memory pool by using >> >> let pool_id = Netmcore_mempool.create_mempool (1024 * 1024 * 1024 * >> 10) in >> >> Running the code on a smaller dataset seems to work. >> >> I have increased kernel variables shmall and shmmax to very high numbers >> (100Gb), but it does not solve the problem. > > No, this cannot solve it. These variables only control System V shared > memory, but Netmcore uses POSIX shared memory. On Linux, you can change > the max of this memory by re-mounting /dev/shm, e.g. > > mount -o remount,size=80% /dev/shm > > The default is 50% of available RAM. > >> All ideas are welcome. >> >> On other question while I'm at it; am I allowed to create multiple >> shared arrays from the same memory pool, or do I need to create one pool >> for each shared array? > > The pools can be shared. > > Gerd > >> >> Regards >> Anders Fugmann >> >> >> > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk8UYo0ACgkQ9GexfkaZ7hSfLwCgiu+grFYsZXyQlvvnUBZgna+4 twkAoJKyNJ0hXlI0u9IqmuapqQ48YGDz =RCbL -----END PGP SIGNATURE-----