9fans - fans of the OS Plan 9 from Bell Labs
From: Russ Cox <rsc@swtch.com>
To: erik quanstrom <quanstro@labs.coraid.com>
Cc: 9fans <9fans@9fans.net>
Subject: Re: [9fans] sleep/wakeup bug?
Date: Fri, 25 Feb 2011 01:51:10 -0500
Message-ID: <AANLkTi=FabYqOd3ozUEXi9_Ua8S5DujfUjhzCYxPF2TA@mail.gmail.com>
In-Reply-To: <40925e8f64489665bd5bd6ca743400ea@coraid.com>

> your layout in your first email (i think) assumes that wakeup
> can be called twice.

it doesn't.  the scenario in my first email has exactly one
sleep and one wakeup.

the canonical problem you have to avoid when implementing
sleep and wakeup is that the wakeup might happen before
the sleep has gotten around to sleeping.  to be concrete,
you might do something like:

cpu1:
    kick off disk i/o operation
    sleep(r)

cpu2:
    interrupt happens
    mark operation completed
    wakeup(r)

the problem is what happens if the interrupt is so fast
that cpu2 runs all of that before sleep(r) starts.
a wakeup without a sleep is defined to be a no-op, so
if the wakeup runs first the sleep never wakes up:

cpu1:
    kick off disk i/o operation

cpu2:
    interrupt happens
    mark operation completed
    wakeup(r)

cpu1:
    sleep(r)  // never returns
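
a minimal sketch of that interleaving, as a single-threaded model
rather than kernel code.  the Rendez struct, naive_sleep,
model_wakeup and lost_wakeup are hypothetical names made up for
illustration; the model only tracks whether someone is parked on r.

```c
#include <assert.h>
#include <stdbool.h>

/* hypothetical model, not the kernel's Rendez:
 * one flag for "someone is parked here" */
typedef struct Rendez {
    bool sleeper;
} Rendez;

/* naive sleep with no condition check: always parks the caller */
static void naive_sleep(Rendez *r)
{
    r->sleeper = true;
}

/* wakeup releases a parked sleeper; with nobody parked it is a no-op.
 * returns true if it actually woke someone. */
static bool model_wakeup(Rendez *r)
{
    if (!r->sleeper)
        return false;
    r->sleeper = false;
    return true;
}

/* replays the interleaving above: cpu2's wakeup runs before cpu1's sleep.
 * returns true if cpu1 is left parked with no wakeup still to come. */
static bool lost_wakeup(void)
{
    Rendez r = { false };
    bool woke = model_wakeup(&r);   /* cpu2: nothing sleeping yet, no-op */

    naive_sleep(&r);                /* cpu1: parks after the wakeup is gone */
    assert(!woke);
    return r.sleeper;
}
```

calling lost_wakeup() in this model reports that cpu1 is still
parked after the only wakeup has come and gone.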

to avoid that problem sleep takes an extra f and arg, and
checks f(arg) before it goes to sleep; there are also some
locks to make sure sleep and wakeup are not running their
coordination code simultaneously.  with f(arg), the last
scenario becomes:

cpu1:
    kick off disk i/o operation

cpu2:
    interrupt happens
    mark operation completed
    wakeup(r)

cpu1:
    sleep(r)
        calls f(arg), which sees op marked completed, returns 1
        sleep returns immediately

avoiding the missed wakeup.
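
the same scenario can be sketched with the condition check added.
this is a single-threaded model under stated assumptions: Op,
op_completed, model_sleep, model_wakeup and early_wakeup_handled
are hypothetical names, and the locking that the real sleep and
wakeup share is left out because nothing here runs concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Rendez { bool sleeper; } Rendez;   /* hypothetical model */
typedef struct Op { bool completed; } Op;         /* hypothetical i/o record */

/* the f passed to sleep: returns 1 once the operation is marked completed */
static int op_completed(void *arg)
{
    return ((Op *)arg)->completed;
}

/* sleep with a condition: checks f(arg) first, parks only if it is false.
 * returns true if the caller was parked, false if it returned immediately. */
static bool model_sleep(Rendez *r, int (*f)(void *), void *arg)
{
    assert(f != NULL);
    if (f(arg))
        return false;
    r->sleeper = true;
    return true;
}

/* wakeup: no-op when nobody is parked; returns true if it woke someone */
static bool model_wakeup(Rendez *r)
{
    if (!r->sleeper)
        return false;
    r->sleeper = false;
    return true;
}

/* replays the scenario above; returns true if sleep came back
 * immediately even though the wakeup had already run. */
static bool early_wakeup_handled(void)
{
    Rendez r = { false };
    Op op = { false };

    op.completed = true;                          /* cpu2: mark completed */
    model_wakeup(&r);                             /* cpu2: no sleeper, no-op */
    return !model_sleep(&r, op_completed, &op);   /* cpu1: f(arg) is 1 */
}
```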

unfortunately the f(arg) check means that now sleep can
sometimes return before wakeup (kind of a missed sleep):

cpu1:
    kick off disk i/o operation

cpu2:
    interrupt happens
    mark operation completed

cpu1:
    sleep(r)
        calls f(arg), which checks completed, returns 1
        sleep returns immediately

cpu2:
    wakeup(r)
        finds nothing sleeping on r, no-op.

there's no second wakeup involved here.  this is just sleep
figuring out that there's nothing to sleep for, before wakeup
comes along.  f(arg) == true means that wakeup is either
on its way or already passed by, and sleep doesn't know which,
so it has to be conservative and not sleep.

if r is allocated memory and cpu1 calls free(r) when sleep
returns, that's not okay, because cpu2 has already decided
to call wakeup(r), which will now be scribbling on or at least
looking at freed memory.
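
a sketch of that hazard, again as a single-threaded model.  to
keep the example free of real undefined behavior, a hypothetical
"freed" flag stands in for free(r), and the model wakeup reports
whether it touched a Rendez that was already marked freed.  all
names here are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* hypothetical model: "freed" stands in for free(r) so the late
 * access can be detected safely */
typedef struct Rendez {
    bool sleeper;
    bool freed;
} Rendez;

typedef struct Op { bool completed; } Op;

static int op_completed(void *arg)
{
    return ((Op *)arg)->completed;
}

/* returns true if the caller was parked, false if it returned at once */
static bool model_sleep(Rendez *r, int (*f)(void *), void *arg)
{
    if (f(arg))
        return false;
    r->sleeper = true;
    return true;
}

/* wakeup that records whether r was already marked freed when touched */
static bool model_wakeup(Rendez *r, bool *touched_freed)
{
    *touched_freed = r->freed;
    if (!r->sleeper)
        return false;
    r->sleeper = false;
    return true;
}

/* replays: mark completed, sleep returns, "free", then the late wakeup.
 * returns true if wakeup looked at r after it was marked freed. */
static bool wakeup_after_free(void)
{
    Rendez r = { false, false };
    Op op = { false };
    bool touched = false;

    op.completed = true;                               /* cpu2 */
    bool parked = model_sleep(&r, op_completed, &op);  /* cpu1: returns at once */
    assert(!parked);
    r.freed = true;                                    /* cpu1: free(r) goes here */
    model_wakeup(&r, &touched);                        /* cpu2: late wakeup(r) */
    return touched;
}
```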

as i said originally, it's simply not 1:1.
if you need 1:1, you need a semaphore.
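
for contrast, a minimal single-threaded model of a counting
semaphore, assuming the usual post/wait semantics.  Sema,
sema_post, sema_trywait and one_to_one are hypothetical names; a
real wait would block where sema_trywait returns false.  the point
of the sketch is only the pairing: a wait gets through only after
a matching post has been made, and a post that comes first is
remembered instead of being dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* hypothetical single-threaded model of a counting semaphore */
typedef struct Sema {
    int count;    /* posts not yet consumed by a wait */
} Sema;

/* post: always recorded, even if nobody is waiting yet */
static void sema_post(Sema *s)
{
    s->count++;
}

/* consumes exactly one post.  returns false where a real wait would block. */
static bool sema_trywait(Sema *s)
{
    assert(s->count >= 0);
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/* returns true if waits and posts pair up 1:1 */
static bool one_to_one(void)
{
    Sema s = { 0 };

    /* with no post yet, the waiter does not get through */
    if (sema_trywait(&s))
        return false;
    /* the post is what lets exactly one wait through */
    sema_post(&s);
    if (!sema_trywait(&s))
        return false;
    /* one post satisfies one wait, not two */
    return !sema_trywait(&s);
}
```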

russ


p.s. not relevant to your "only one sleep and one wakeup"
constraint, but that last scenario also means that if you
are doing repeated sleep + wakeup on a single r, that pending
wakeup call left over on cpu2 might not happen until cpu1 has
gone back to sleep (a second time).  that is, the first wakeup,
meant for the first sleep, can end up waking the second sleep.
so in general you have to handle the case where sleep
wakes up for no good reason.  it doesn't happen all the time,
but it does happen.
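
a sketch of two rounds on a single r, as a single-threaded model
with hypothetical names (Op, op_completed, model_sleep,
model_wakeup, stale_wakeups).  it shows one way to cope: after
being woken, the sleeper re-checks its condition and goes back to
sleep if the condition is still false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Rendez { bool sleeper; } Rendez;   /* hypothetical model */
typedef struct Op { bool completed; } Op;

static int op_completed(void *arg)
{
    return ((Op *)arg)->completed;
}

/* returns true if the caller was parked, false if it returned at once */
static bool model_sleep(Rendez *r, int (*f)(void *), void *arg)
{
    if (f(arg))
        return false;
    r->sleeper = true;
    return true;
}

/* no-op when nobody is parked; returns true if it woke someone */
static bool model_wakeup(Rendez *r)
{
    if (!r->sleeper)
        return false;
    r->sleeper = false;
    return true;
}

/* replays two rounds on one Rendez; returns how many times cpu1 was
 * woken while its condition was still false. */
static int stale_wakeups(void)
{
    Rendez r = { false };
    Op op = { false };
    int spurious = 0;

    /* round 1: sleep returns before cpu2 gets to its wakeup */
    op.completed = true;
    model_sleep(&r, op_completed, &op);

    /* round 2: new operation, cpu1 really parks this time */
    op.completed = false;
    model_sleep(&r, op_completed, &op);

    /* cpu2's wakeup left over from round 1 wakes round 2's sleep */
    if (model_wakeup(&r) && !op_completed(&op)) {
        spurious++;
        model_sleep(&r, op_completed, &op);   /* re-check failed: sleep again */
    }

    /* round 2 really completes and its own wakeup arrives */
    op.completed = true;
    model_wakeup(&r);
    assert(!r.sleeper);
    return spurious;
}
```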


