From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 5BDAFBB81 for ; Thu, 9 Mar 2006 05:32:43 +0100 (CET) Received: from ash25e.internode.on.net (ash25e.internode.on.net [203.16.214.182]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id k294Wfoh009982 for ; Thu, 9 Mar 2006 05:32:42 +0100 Received: from [192.168.1.200] (ppp9-113.lns1.syd7.internode.on.net [59.167.9.113]) by ash25e.internode.on.net (8.13.5/8.13.5) with ESMTP id k294WXuM085149; Thu, 9 Mar 2006 15:02:33 +1030 (CST) (envelope-from skaller@users.sourceforge.net) Subject: Re: [Caml-list] STM support in OCaml From: skaller To: Brian Hurt Cc: caml-list@yquem.inria.fr, William Lovas In-Reply-To: References: <440DB255.1030701@asfandyar.cjb.net> <1141751708.20944.355.camel@budgie.wigram> <440DD982.8080800@asfandyar.cjb.net> <1141779125.20944.405.camel@budgie.wigram> <20060308193633.GA5460@coruscant.stwing.upenn.edu> <1141855594.23909.63.camel@budgie.wigram> <1141863807.23909.151.camel@budgie.wigram> Content-Type: text/plain Organization: Async P/L Date: Thu, 09 Mar 2006 15:32:31 +1100 Message-Id: <1141878751.8450.37.camel@budgie.wigram> Mime-Version: 1.0 X-Mailer: Evolution 2.4.1 Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 440FAFE9.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; ocaml:01 trivial:01 reloaded:01 reloaded:01 async:01 2006:98 reloads:98 consultants:98 wrote:01 sourceforge:01 sourceforge:01 caml-list:01 caches:01 realtime:01 increment:02 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 On Wed, 2006-03-08 at 21:19 -0600, Brian Hurt wrote: > > I couldn't believe it! That's why we tried the experiment: > > did the OS really invalidate the 'whole' caches, or did our > > trivial increment program fail, because the memory being > > incremented wasn't coherent? > I'm simplifying > enormously here, I know, so was I -- I've read the AMD64 specs. Suppose I have a thread that creates an array which fits roughly in the cache. Then I hand it over to another thread to fold. The first thread has to load stuff from RAM to populate the array, which slows it down. Then it has to write the whole array back to RAM, and the second thread reloads the whole array from RAM into the cache. In that case the whole cache has to be not only flushed to RAM, it has to be reloaded into another cache. Pretty nasty .. on a MP machine. On a uniprocessor, the same CPU will do the fold, and the cache doesn't have to be either saved or reloaded. Yes, that's an extreme case! -- John Skaller Async PL, Realtime software consultants Checkout Felix: http://felix.sourceforge.net