caml-list - the Caml user's mailing list
From: "Étienne André" <etienne.andre@univ-paris13.fr>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Segmentation fault when using OcamlMPI
Date: Sat, 26 Apr 2014 14:21:03 +0200
Message-ID: <535BA4AF.5090508@univ-paris13.fr>
In-Reply-To: <535ABDC2.4080501@univ-paris13.fr>

Dear all,

In the end, it seems we found a bug (or, at least, a very strange issue)
in OcamlMPI.
And we (well, my colleague) found a way to work around the issue.

In short, each worker node needs to explicitly send its node number in
its first communication to the master.
(Details available on demand.)
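
For illustration, the workaround might look like the minimal sketch
below. It is not the actual tool's code. Only Mpi.receive,
Mpi.comm_world, int_of_slave_tag and Slave_result_tag come from the
original post; the tag Slave_rank_tag, the tag values, the choice of
rank 0 as master, and the compute placeholder are assumptions made for
the example. It assumes the ocamlmpi library is installed and that the
program is launched through an MPI launcher such as mpirun.

```ocaml
(* Hypothetical sketch; needs the ocamlmpi library (module Mpi). *)

(* Slave_result_tag is named in the original post.
   Slave_rank_tag and the integer values are assumptions. *)
type slave_tag = Slave_rank_tag | Slave_result_tag

let int_of_slave_tag = function
  | Slave_rank_tag -> 1
  | Slave_result_tag -> 2

(* Assumption: the master is the process with rank 0. *)
let master_rank = 0

(* Placeholder for the real computation done by a worker. *)
let compute (rank : int) : string =
  Printf.sprintf "result from worker %d" rank

let worker rank =
  (* First communication: explicitly tell the master who we are. *)
  Mpi.send rank master_rank
    (int_of_slave_tag Slave_rank_tag) Mpi.comm_world;
  (* Later communications carry the actual results. *)
  Mpi.send (compute rank) master_rank
    (int_of_slave_tag Slave_result_tag) Mpi.comm_world

let master nb_nodes =
  for _i = 1 to nb_nodes - 1 do
    (* Blocks until some worker has announced its rank. *)
    let (source_rank : int) =
      Mpi.receive Mpi.any_source
        (int_of_slave_tag Slave_rank_tag) Mpi.comm_world in
    (* Blocks until the result of that worker arrives. *)
    let (res : string) =
      Mpi.receive source_rank
        (int_of_slave_tag Slave_result_tag) Mpi.comm_world in
    Printf.printf "master: got %S from rank %d\n%!" res source_rank
  done

let () =
  let rank = Mpi.comm_rank Mpi.comm_world in
  let nb_nodes = Mpi.comm_size Mpi.comm_world in
  if rank = master_rank then master nb_nodes else worker rank;
  (* Wait for every process before exiting. *)
  Mpi.barrier Mpi.comm_world
```

The type annotations on the received values matter: Mpi.receive returns
a value of any type, much like unmarshaling, so the receiver has to
expect exactly the type that was sent. A mismatch there is not caught
by the compiler and may crash at run time.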

We informed the developers.
So the case is closed (for us!).

Best,

-- 
Étienne André
Université Paris 13, Sorbonne Paris Cité
http://lipn.univ-paris13.fr/~andre 

On 25/04/2014 21:55, Étienne André wrote:
> Dear all,
>
> I'm trying with a colleague to distribute a verification tool using
> OcamlMPI.
> Unfortunately, we encounter segmentation faults "sometimes".
> "Sometimes" still means often enough that the tool almost always crashes
> at some point.
>
> We don't understand at all what is happening.
> We thought that the MPI read function ("Mpi.receive source_rank") would
> wait until there is something to read, but maybe we misunderstood that.
> The precise command we use to receive info is as follows:
>
> let res = Mpi.receive source_rank (int_of_slave_tag Slave_result_tag)
> Mpi.comm_world
>
> where int_of_slave_tag Slave_result_tag is our own function returning
> some predefined integer.
>
> Are there any risks of conflicts when several nodes (with different
> source_rank, though) send something to one node?
> For information, we use a master-workers scheme, with one master
> centralizing results computed by workers.
> Also for information, we first send (and receive) the size of the data,
> and then the actual data (although, for some strange reason, we do not
> use the size when receiving the data; maybe we should?!).
>
> I put a zip on my Web page with a simplified source (a single .ml file,
> with _oasis and the command to launch) that is enough to show the bug.
> http://lipn.univ-paris13.fr/~andre/PaTATOR.zip
>
> Thank you for your feedback!
>

