Oops, didn't intend this message to be off-list.

---------- Forwarded message ----------
Date: Tue, 21 Mar 2006 16:32:51 -0600 (CST)
From: Brian Hurt <bhurt@spnz.org>
To: Robert Roessler <roessler@rftp.com>
Subject: Re: [Caml-list] Severe loss of performance due to new signal handling


On Tue, 21 Mar 2006, Robert Roessler wrote:

> Well, I *thought* there was a marked absence of "bit-level parallelism" in 
> the signal-handling... ;)
> 
> So the "expense" of individual atomic operations is not really what is at the 
> heart of this performance problem...

Hmm.  Maybe not.  I'm measuring a 4 clock cycle cost for an xchgl, both with and 
without a lock prefix, on my Athlon XP 1.8GHz.  See attached code.  Naturally, 
this is a uniprocessor machine, the memory location is in L1 cache (or will be 
soon), and there is no contention, so this is definitely the best case.  4 clocks 
is about right for a read and a write to L1 cache (each L1 cache access taking 
2 clocks).
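
The attachment is not reproduced here.  For anyone who wants to repeat the 
measurement, a minimal sketch of this kind of test might look like the 
following.  It is not the attached code: it assumes a C11 compiler and POSIX 
clock_gettime, uses the portable atomic_exchange rather than hand-written 
xchgl, and reports nanoseconds rather than clocks.  The names now_ns and 
time_exchange are made up for illustration.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic wall-clock time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical benchmark: perform `iters` atomic exchanges on a single,
 * uncontended location (which stays hot in cache) and return the average
 * cost in nanoseconds per exchange.  The value left in the location is
 * stored in *final_value so the caller can check that the loop really ran.
 * `iters` is assumed to fit in an int.
 */
double time_exchange(unsigned long iters, int *final_value)
{
    atomic_int cell = 0;
    int prev = 0;

    uint64_t start = now_ns();
    for (unsigned long i = 0; i < iters; i++) {
        /* Store i + 1 and fetch the previous value in one atomic step. */
        prev = atomic_exchange(&cell, (int)(i + 1));
    }
    uint64_t stop = now_ns();

    (void)prev; /* only the exchange itself is of interest here */
    *final_value = atomic_load(&cell);
    return iters ? (double)(stop - start) / (double)iters : 0.0;
}
```

Calling time_exchange(100000000UL, &v) and multiplying the result by the clock 
rate in GHz gives a rough clocks-per-exchange figure; 4 clocks at 1.8GHz would 
show up as roughly 2.2 ns.  The figure includes loop overhead, so treat it as 
an upper bound.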

Brian