caml-list - the Caml user's mailing list
* ocamlmpi reduce_int_array, showstopper?
@ 2010-09-04 18:25 Mike Lin
  2010-09-06 10:51 ` Sylvain Le Gall
  0 siblings, 1 reply; 2+ messages in thread
From: Mike Lin @ 2010-09-04 18:25 UTC
  To: sylvain, Xavier.Leroy; +Cc: caml-list

Hi Sylvain, Xavier,
I have encountered what seems to be a serious problem with
reduce_int_array in ocamlmpi, namely, in a trivial test case it
successfully returns incorrect results. This is my test code:

--
(* ocamlopt -o test -I +ocamlmpi mpi.cmxa test.ml *)
let me = Mpi.comm_rank Mpi.comm_world;;
let n = Mpi.comm_size Mpi.comm_world;;
let k = 1

let src = Array.make 1 k;;
let dest = Array.make 1 0;;
Mpi.reduce_int_array src dest Mpi.Int_sum 0 Mpi.comm_world;;
if me = 0 then
  Printf.printf "using reduce_int_array, expected: %d got: %d\n" (n*k) dest.(0);;

let srcf = Array.make 1 (float k);;
let destf = Array.make 1 0.;;
Mpi.reduce_float_array srcf destf Mpi.Float_sum 0 Mpi.comm_world;;
if me = 0 then
  Printf.printf "using reduce_float_array, expected: %.1f got: %.1f\n" (float (n*k)) destf.(0);;
--

I ran this on n=8 processors on NCSA Abe and the output is:

using reduce_int_array, expected: 8 got: 0
using reduce_float_array, expected: 8.0 got: 8.0

If I change k to 1,000,000 I get:

using reduce_int_array, expected: 8000000 got: 4000000
using reduce_float_array, expected: 8000000.0 got: 8000000.0

[mikelin@honest3 ~/]$ uname -a
Linux honest3.ncsa.uiuc.edu 2.6.18-92.1.10.el5_lustre.1.6.6smp-perfctr
#7 SMP Tue Nov 10 10:41:00 CST 2009 x86_64 x86_64 x86_64 GNU/Linux
[mikelin@honest3 ~/]$ which mpiexec
/usr/local/mvapich2-1.2-intel-ofed-1.2.5.5/bin/mpiexec

see also: http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/Intel64Cluster/

I know the ocamlmpi code has been stable for some time, but I wonder
if it's been tested on x86_64? Let me know if you're able to reproduce
this.

Thanks,
Mike



* Re: ocamlmpi reduce_int_array, showstopper?
  2010-09-04 18:25 ocamlmpi reduce_int_array, showstopper? Mike Lin
@ 2010-09-06 10:51 ` Sylvain Le Gall
  0 siblings, 0 replies; 2+ messages in thread
From: Sylvain Le Gall @ 2010-09-06 10:51 UTC
  To: caml-list

On 04-09-2010, Mike Lin <mikelin@MIT.EDU> wrote:
> Hi Sylvain, Xavier,
> I have encountered what seems to be a serious problem with
> reduce_int_array in ocamlmpi, namely, in a trivial test case it
> successfully returns incorrect results. This is my test code
>

[...]

>
> [mikelin@honest3 ~/]$ uname -a
> Linux honest3.ncsa.uiuc.edu 2.6.18-92.1.10.el5_lustre.1.6.6smp-perfctr
> #7 SMP Tue Nov 10 10:41:00 CST 2009 x86_64 x86_64 x86_64 GNU/Linux
> [mikelin@honest3 ~/]$ which mpiexec
> /usr/local/mvapich2-1.2-intel-ofed-1.2.5.5/bin/mpiexec
>
> see also: http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/Intel64Cluster/
>
> I know the ocamlmpi code has been stable for some time, but I wonder
> if it's been tested on x86_64? Let me know if you're able to reproduce
> this.
>

I can confirm the bug. I have committed a fix to the svn repository,
though I am not an MPI expert.

I think the problem was related to a double in-place re-encoding of
the integers in the array. I'll wait for Xavier's review before doing
a release.
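
To illustrate how a double conversion could produce exactly the numbers
in the report, here is a minimal pure-OCaml sketch. It is only a model
and not the actual ocamlmpi stub code: the names encode, decode and
simulated_reduce are hypothetical, and it assumes that each source
value gets untagged twice before the sum is taken.

```ocaml
(* Hypothetical model of the suspected bug; NOT the real ocamlmpi C stubs. *)

(* OCaml stores an int k as the tagged machine word 2k+1. *)
let encode k = (k lsl 1) lor 1

(* Decoding drops the tag bit with an arithmetic right shift. *)
let decode v = v asr 1

(* Assumption: the value contributed by each process has been decoded
   twice, which halves it (rounding down). *)
let contribution_after_double_decode k = decode (decode (encode k))

(* Simulated sum over n processes that all hold the same value k. *)
let simulated_reduce n k = n * contribution_after_double_decode k

let () =
  List.iter
    (fun k ->
      Printf.printf "k = %d: expected %d, simulated %d\n"
        k (8 * k) (simulated_reduce 8 k))
    [1; 1_000_000]
```

With n=8 this model gives 0 for k=1 and 4000000 for k=1,000,000, the
same values as in the report, so the hypothesis is at least consistent
with the observed output.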

Regards,
Sylvain Le Gall

