From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id 51635BC57; Sat, 4 Sep 2010 20:26:00 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Aj0BABYsgkwSCRkOnGdsb2JhbAChFQgVAQEBAQEICwgJESKgMZFbiEiFPQSKFw X-IronPort-AV: E=Sophos;i="4.56,318,1280700000"; d="scan'208";a="68861618" Received: from dmz-mailsec-scanner-3.mit.edu ([18.9.25.14]) by mail4-smtp-sop.national.inria.fr with ESMTP; 04 Sep 2010 20:25:59 +0200 X-AuditID: 1209190e-b7b91ae0000068d0-8f-4c828f386d43 Received: from mailhub-auth-1.mit.edu ( [18.9.21.35]) by dmz-mailsec-scanner-3.mit.edu (Symantec Brightmail Gateway) with SMTP id D9.AE.26832.83F828C4; Sat, 4 Sep 2010 14:26:00 -0400 (EDT) Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id o84IPv5n026267; Sat, 4 Sep 2010 14:25:57 -0400 Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) (authenticated bits=0) (User authenticated as mflin@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id o84IPt4V009312 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NOT); Sat, 4 Sep 2010 14:25:57 -0400 (EDT) Received: by gyg4 with SMTP id 4so1596699gyg.27 for ; Sat, 04 Sep 2010 11:25:55 -0700 (PDT) Received: by 10.150.157.8 with SMTP id f8mr942640ybe.367.1283624755204; Sat, 04 Sep 2010 11:25:55 -0700 (PDT) MIME-Version: 1.0 Received: by 10.150.92.1 with HTTP; Sat, 4 Sep 2010 11:25:35 -0700 (PDT) X-Originating-IP: [74.61.199.33] From: Mike Lin Date: Sat, 4 Sep 2010 14:25:35 -0400 Message-ID: Subject: ocamlmpi reduce_int_array, showstopper? To: sylvain@le-gall.net, Xavier.Leroy@inria.fr Cc: caml-list@inria.fr Content-Type: text/plain; charset=ISO-8859-1 X-Brightmail-Tracker: AAAAAhXbp9UV3F1Y X-Spam: no; 0.00; mikelin:01 ocamlmpi:01 showstopper:01 ocamlmpi:01 trivial:01 ocamlopt:01 cmxa:01 printf:01 printf:01 mikelin:01 uname:01 usr:01 2009:98 1.2:98 int:01 Hi Sylvain, Xavier, I have encountered what seems to be a serious problem with reduce_int_array in ocamlmpi, namely, in a trivial test case it successfully returns incorrect results. This is my test code -- (* ocamlopt -o test -I +ocamlmpi mpi.cmxa test.ml *) let me = Mpi.comm_rank Mpi.comm_world;; let n = Mpi.comm_size Mpi.comm_world;; let k = 1 let src = Array.make 1 k;; let dest = Array.make 1 0;; Mpi.reduce_int_array src dest Mpi.Int_sum 0 Mpi.comm_world;; if me = 0 then Printf.printf "using reduce_int_array, expected: %d got: %d\n" (n*k) dest.(0);; let srcf = Array.make 1 (float k);; let destf = Array.make 1 0.;; Mpi.reduce_float_array srcf destf Mpi.Float_sum 0 Mpi.comm_world;; if me = 0 then Printf.printf "using reduce_float_array, expected: %.1f got: %.1f\n" (float (n*k)) destf.(0);; -- I ran this on n=8 processors on NCSA Abe and the output is using reduce_int_array, expected: 8 got: 0 using reduce_float_array, expected: 8.0 got: 8.0 If I change k to 1,000,000 I get: using reduce_int_array, expected: 8000000 got: 4000000 using reduce_float_array, expected: 8000000.0 got: 8000000.0 [mikelin@honest3 ~/]$ uname -a Linux honest3.ncsa.uiuc.edu 2.6.18-92.1.10.el5_lustre.1.6.6smp-perfctr #7 SMP Tue Nov 10 10:41:00 CST 2009 x86_64 x86_64 x86_64 GNU/Linux [mikelin@honest3 ~/]$ which mpiexec /usr/local/mvapich2-1.2-intel-ofed-1.2.5.5/bin/mpiexec see also: http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/Intel64Cluster/ I know the ocamlmpi code has been stable for some time, but I wonder if it's been tested on x86_64? Let me know if you're able to reproduce this. Thanks, Mike