I don't really care what others say, but to show that this has any performance value, you should do the following:


Compare the performance of your most "parallel" algorithm with that of a corresponding well-written MPI application running over Open MPI's shared-memory transport on the same machine. If there is a measurable difference in your favor, then your system has some value.

Of course, Open MPI's shared-memory transport is terribly buggy, but it should still give an acceptable performance baseline.

Without such a comparison, we have no idea whether there is any performance benefit.
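
As a rough sketch of the kind of comparison meant here, assuming you have already collected wall-clock times (in seconds) from several runs of both implementations on the same machine and the same input. The function name and the numbers are hypothetical placeholders, not measurements of any real system:

```python
import statistics


def compare_runs(system_times, mpi_times):
    """Compare wall-clock timings (seconds) of two implementations.

    Returns (median_system, median_mpi, speedup). A speedup above 1
    means the system under test beat the MPI baseline.
    """
    if not system_times or not mpi_times:
        raise ValueError("need at least one timing per implementation")
    median_system = statistics.median(system_times)
    median_mpi = statistics.median(mpi_times)
    return median_system, median_mpi, median_mpi / median_system


if __name__ == "__main__":
    # Hypothetical timings; replace with your own measurements.
    system_times = [1.9, 2.0, 2.3]
    mpi_times = [3.8, 4.0, 4.5]
    m_sys, m_mpi, speedup = compare_runs(system_times, mpi_times)
    print(f"system: {m_sys:.2f}s  MPI baseline: {m_mpi:.2f}s  speedup: {speedup:.2f}x")
```

Medians are used instead of means so that a single noisy run does not skew the result. A speedup close to 1 would suggest no real difference from the MPI baseline.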

Best,


--
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct