Re: [9fans] waitfree - Devon H. O'Dell

From: "Devon H. O'Dell" <devon.odell@gmail.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] waitfree
Date: Mon, 19 May 2014 21:10:21 -0400
Message-ID: <CAFgOgC-=2z7JknR74-4geMwm2haV52FgtVA+U9wmOU11oTwrxQ@mail.gmail.com>
In-Reply-To: <c50dff498502b93c8e9513c0cbbb1ffd@ivey>

2014-05-19 18:05 GMT-04:00 erik quanstrom <quanstro@quanstro.net>:
> On Mon May 19 17:02:57 EDT 2014, devon.odell@gmail.com wrote:
>> So you seem to be worried that N processors in a tight loop of LOCK
>> XADD could have a single processor. This isn't a problem because
>> locked instructions have total order. Section 8.2.3.8:
>>
>> "The memory-ordering model ensures that all processors agree on a
>> single execution order of all locked instructions, including those
>> that are larger than 8 bytes or are not naturally aligned."
>
> i don't think this solves any problems.  given thread 0-n all executing
> LOCK instructions, here's a valid ordering:
>
> 0       1       2               n
> lock    stall   stall   ...     stall
> lock    stall   stall   ...     stall
> ...                     ...
> lock    stall   stall   ...     stall
>
> i'm not sure if the LOCK really changes the situation.  any old exclusive
> cacheline access should do?

It is an ordering, but I don't think it's a valid one: your ellipses
suggest an unbounded execution time (given the context of the
discussion). I don't think that can happen, because the protocol can't
negotiate execution for more instructions than the pipeline has space
for. The pipeline also can't be filled entirely with LOCK-prefixed
instructions: it has to schedule instruction loading too, and it
pipelines μops, not whole instructions. Part of the execution cycle is
decomposing each instruction into its μops. So at some point that
processor is not executing a LOCK instruction; it is doing some other
work (like decoding the next LOCK-prefixed instruction it wants to
execute), and that work is not synchronized with the other processors.
When this happens, the other processors get to execute their
LOCK-prefixed instructions.

The only way I could think of to try to force this execution history was
to unroll a loop of LOCK-prefixed instructions. In a tight loop, a
program I wrote to do LOCK XADD 10 billion times per thread (across 4
threads on my 4-core system) finished with a standard deviation in
cycle count of around 1%. When I unrolled the loop enough to fill the
pipeline, the stddev actually decreased (to about 0.5%), which leads
me to believe that the processor actively mitigates that sort of
instruction "attack" for highly concurrent workloads.
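
For illustration, here is a minimal sketch of this kind of measurement.
It is not the original program, which is not included in this message.
It assumes C11 atomics and POSIX threads, and it times each thread with
wall-clock time from clock_gettime rather than with cycle counts. The
names xadd_trial, cv_squared, and MAXTHREADS are made up for this
sketch. On x86-64, compilers typically emit a LOCK-prefixed add or XADD
for atomic_fetch_add, but that is worth confirming in the generated
assembly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <time.h>

enum { MAXTHREADS = 64 };          /* arbitrary cap for this sketch */

static atomic_ulong counter;       /* shared target of the atomic adds */
static atomic_int go;              /* start flag so workers begin together */

struct worker {
	pthread_t tid;
	unsigned long iters;
	double secs;               /* wall time spent in the add loop */
};

static double
now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

static void *
work(void *arg)
{
	struct worker *w = arg;
	double t0;
	unsigned long i;

	while (!atomic_load(&go))
		;                  /* spin until the start flag is set */
	t0 = now();
	for (i = 0; i < w->iters; i++)
		atomic_fetch_add(&counter, 1);
	w->secs = now() - t0;
	return NULL;
}

/*
 * Run nthreads workers, each doing iters atomic adds on one counter.
 * Stores per-thread elapsed seconds in secs[] and returns the final
 * counter value, or 0 if a thread could not be created.
 */
unsigned long
xadd_trial(int nthreads, unsigned long iters, double *secs)
{
	struct worker w[MAXTHREADS];
	int i, started;

	assert(nthreads > 0 && nthreads <= MAXTHREADS);
	atomic_store(&counter, 0);
	atomic_store(&go, 0);
	for (started = 0; started < nthreads; started++) {
		w[started].iters = iters;
		w[started].secs = 0.0;
		if (pthread_create(&w[started].tid, NULL, work, &w[started]) != 0)
			break;
	}
	atomic_store(&go, 1);
	for (i = 0; i < started; i++) {
		pthread_join(w[i].tid, NULL);
		secs[i] = w[i].secs;
	}
	return started == nthreads ? atomic_load(&counter) : 0;
}

/* Variance divided by mean squared; its square root is the relative stddev. */
double
cv_squared(const double *x, int n)
{
	double mean = 0.0, var = 0.0;
	int i;

	for (i = 0; i < n; i++)
		mean += x[i];
	mean /= n;
	if (mean == 0.0)
		return 0.0;
	for (i = 0; i < n; i++)
		var += (x[i] - mean) * (x[i] - mean);
	return var / n / (mean * mean);
}
```

Calling xadd_trial(4, n, secs) and then cv_squared(secs, 4) gives a
rough measure of how evenly the threads finished. If one thread could
starve the others indefinitely, the per-thread times would diverge
rather than cluster. The unrolled variant would repeat the
atomic_fetch_add call several times per loop iteration.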

So either way, you're still bounded. Eventually p0 has to go do
something that isn't a LOCK-prefixed instruction, like decode the next
one. I don't know how to get the execution order you suggest. You'd
have to manage to fill the pipeline on the processor while starving
the pipeline on the others and preventing them from executing any
further instructions. Instruction load and decode stages are shared,
so I really don't see how you'd manage this without using PAUSE
strategically. You'd have to con the processor into executing that
order. At that point, just use a mutex :)

--dho

> the documentation appears not to cover this completely.
>
> - erik
>
