Oops, didn't intend this message to be off-list.

---------- Forwarded message ----------
Date: Tue, 21 Mar 2006 16:32:51 -0600 (CST)
From: Brian Hurt
To: Robert Roessler
Subject: Re: [Caml-list] Severe loss of performance due to new signal handling

On Tue, 21 Mar 2006, Robert Roessler wrote:

> Well, I *thought* there was a marked absence of "bit-level parallelism" in
> the signal-handling... ;)
>
> So the "expense" of individual atomic operations is not really what is at the
> heart of this performance problem...

Hmm. Maybe not. I'm measuring a 4 clock cycle cost for an xchgl, both with
and without a lock prefix, on my Athlon XP 1.8GHz. See attached code.

Naturally, this is a uniprocessor machine, the memory location is in L1 cache
(or will be soon), and there is no contention, so this is definitely the best
case. 4 clocks is about right for a read and a write to L1 cache (each L1
cache access taking 2 clocks).

Brian
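
For anyone who wants to repeat this kind of measurement, here is a rough sketch. It is not the attachment from the message above (that is not reproduced here), and it makes a few assumptions: it uses C11 `atomic_exchange` rather than hand-written `xchgl`, it times with POSIX `clock_gettime` rather than a cycle counter, and the names `target` and `xchg_ns_per_op` are made up for illustration. On x86, an `xchg` with a memory operand is implicitly locked, which would be consistent with seeing the same cost with and without an explicit lock prefix.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime under strict -std=c11 */
#include <assert.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical shared word; uncontended, so it should stay in L1 cache. */
static atomic_int target = 0;

/* Average wall-clock cost of one atomic exchange, in nanoseconds.
 * The figure includes loop overhead, so treat it as an upper bound. */
static double xchg_ns_per_op(long iters)
{
    struct timespec t0, t1;
    int sink = 0;

    assert(iters > 0);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++) {
        /* Swap a new value in and fold the old value into sink. */
        sink ^= atomic_exchange(&target, (int)i);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;  /* only used to consume the exchanged values */

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}
```

Call `xchg_ns_per_op` from a `main` with a large iteration count and multiply the result by the clock rate in GHz to get an approximate cycle count. For scale, 4 clocks at 1.8GHz is about 2.2 ns. The result will vary with the CPU, compiler flags, and whatever else the machine is doing.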