From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 65933BBCA for ; Thu, 28 Feb 2008 21:35:57 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAE6qxkfU4367nmdsb2JhbACQcQEBAQEBBgQGERidIw X-IronPort-AV: E=Sophos;i="4.25,422,1199660400"; d="scan'208";a="9721574" Received: from moutng.kundenserver.de ([212.227.126.187]) by mail3-smtp-sop.national.inria.fr with ESMTP; 28 Feb 2008 21:35:55 +0100 Received: from [152.78.96.56] (blomberg.cip.physik.uni-muenchen.de [141.84.136.37]) by mrelayeu.kundenserver.de (node=mrelayeu8) with ESMTP (Nemesis) id 0ML31I-1JUpTw0i9e-0004ZC; Thu, 28 Feb 2008 21:35:56 +0100 Message-ID: <47C71D90.9050308@functionality.de> Date: Thu, 28 Feb 2008 20:46:08 +0000 From: Thomas Fischbacher User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060607 Debian/1.7.12-1.2 X-Accept-Language: en MIME-Version: 1.0 To: Caml mailing list Subject: Marshalling problem? Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Provags-ID: V01U2FsdGVkX19hQ/XQrILMyEVSm049jkrxc1/bb83EoghOUrk msFBt4i4EqlRmdX6QvnKg+0JFa7oguJApqGSRA9lOO94rJWZRa Uyh1H3KpJJtywb8WeD6Bw== X-Spam: no; 0.00; marshalling:01 marshalling:01 recv:01 serialize:01 bigarray:01 deserialize:01 segfault:01 serializer:01 slave:98 exception:01 omitted:01 data:02 data:02 chunk:02 seems:03 I encounter a rather weird problem when marshalling a complicated highly networked data structure to send it over the net (yes, it's brute, but that allows me to get an issue solved very quickly which I will refine later on in the future). Basically, the actual sending of a large chunk of data over the net (using MPI_Send/MPI_Recv and serialization) works fine. But the deserialisation seems to get in trouble (on x86-32) when the data is more than 16 MB. Strange enough, If I do not rely on CamlMPI's internal serialization of ML values, but do this of my own and serialize to a bigarray beforehand, the process that just has serialized the data can deserialize it again. If I load that data into some other process - either via the net or by intermediately writing to a file, I get either a segfault or an internal failure exception from the deserializer. Seems as if some initialization may have been omitted somewhere which gets done by first using the serializer... I put an image file (large) as well as some code to demonstrate the deserialization problem on the web at: http://nmag.soton.ac.uk/tf/ocaml-marshal/ The ba.master data array just contains that data that were sent across the network by my master MPI process, and received in the same form by my slave MPI process. Now I am not 100% sure yet I identified the problem correctly -- but I am about 90% confident by now. Oh, by the way, this is using 3.09.2. -- best regards, Thomas Fischbacher t.fischbacher@soton.ac.uk