* [9fans] sleep/wakeup bug? @ 2011-02-24 22:01 erik quanstrom 2011-02-25 4:46 ` Russ Cox 2011-02-25 9:46 ` Richard Miller 0 siblings, 2 replies; 15+ messages in thread From: erik quanstrom @ 2011-02-24 22:01 UTC (permalink / raw) To: 9fans /sys/doc/sleep.ps says that sleep/wakeup are atomic. in concrete terms, i take this to mean that if sleep has returned, wakeup will no longer be in its critical section. unfortunately, this does not seem to be the case. the woken process can continue before the rendezvous lock is dropped. this means that any dynamic allocation of structures containing rendezvous is not possible because structure can be free'd before the rendezvous lock is dropped by the waking process. this was biting me on a high-end mp system with improved lapic arbitration in the aoe driver, faulting in the pool library. the memory in question was an Srb. after i zeroed the memory before free'ing, i observed an unlock: not locked: pc x, where x was the splx in wakeup(). this led directly to the observation that the ready'd process could never know when it would be safe to free the rendezvous-containing structure. here's my suggested correction Proc* wakeup(Rendez *r) { Proc *p; int s; s = splhi(); lock(r); p = r->p; if(p != nil){ lock(&p->rlock); if(p->state != Wakeme || p->r != r){ iprint("%p %p %d\n", p->r, r, p->state); panic("wakeup: state"); } r->p = nil; p->r = nil; } unlock(r); if(p != nil){ ready(p); unlock(&p->rlock); } splx(s); return p; } the handling of p->rlock looks wierd, but i haven't investigated. - erik ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-24 22:01 [9fans] sleep/wakeup bug? erik quanstrom @ 2011-02-25 4:46 ` Russ Cox 2011-02-25 9:46 ` Richard Miller 1 sibling, 0 replies; 15+ messages in thread From: Russ Cox @ 2011-02-25 4:46 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: erik quanstrom On Thu, Feb 24, 2011 at 5:01 PM, erik quanstrom <quanstro@quanstro.net> wrote: > /sys/doc/sleep.ps says that sleep/wakeup are atomic. > in concrete terms, i take this to mean that if sleep > has returned, wakeup will no longer be in its critical > section. it means only that if sleep finds f(arg) to be false, then sleep is guaranteed not to miss a wakeup called after f(arg) has been established to be true. in short it means no missed wakeups. sleep may in various conditions wake up spuriously, and it won't go to sleep at all if it finds f(arg) to be true. assuming a tight 1:1 coupling between sleep and wakeup is a recipe for trouble. even if your change fixes one possible race (i didn't bother to see what changed), you still have to deal with cpu1: decide to call sleep call sleep cpu2: decide to call wakeup cpu1: sleep checks f(arg), finds true sleep returns frees whatever cpu2: call wakeup wakeup runs on freed memory these races are inherent to the definition of sleep and wakeup. it doesn't mean what you need it to mean to free memory immediately after sleeping on it. russ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-24 22:01 [9fans] sleep/wakeup bug? erik quanstrom 2011-02-25 4:46 ` Russ Cox @ 2011-02-25 9:46 ` Richard Miller 1 sibling, 0 replies; 15+ messages in thread From: Richard Miller @ 2011-02-25 9:46 UTC (permalink / raw) To: 9fans > this means that any dynamic allocation > of structures containing rendezvous is not possible because > structure can be free'd before the rendezvous lock is > dropped by the waking process. Déjà vu all over again ... This and other subtleties of sleep/wakeup were extensively explored in 9fans back in Jul-Aug 2000. Look for the subject line "i386 test-and-set problem" if you've got a strong stomach. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? @ 2011-02-25 5:26 erik quanstrom 2011-02-25 5:47 ` Russ Cox 0 siblings, 1 reply; 15+ messages in thread From: erik quanstrom @ 2011-02-25 5:26 UTC (permalink / raw) To: 9fans > assuming a tight 1:1 coupling between sleep and > wakeup is a recipe for trouble. even if your change > fixes one possible race (i didn't bother to see what changed), > you still have to deal with the point of sleep/rendezvous is tight coupling, no? the change was to move the ready() to after the rendezvous lock was dropped. therefore the sleeper knows the rendezvous is not locked by the event that woke him. if one can assert that each sleep has exactly one wakeup (as is often the case for rpc-style programming), then that is enough to know the rendezvous can be retired. > these races are inherent to the definition of sleep and > wakeup. it doesn't mean what you need it to mean > to free memory immediately after sleeping on it. if not a tight coupling, what kind of coupling would you think is appropriate? when would you think it would be fair to recycle the rendezvous? 10s? :-) what idiom do you think would be appropriate for such a case? - erik ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 5:26 erik quanstrom @ 2011-02-25 5:47 ` Russ Cox 2011-02-25 5:53 ` erik quanstrom 0 siblings, 1 reply; 15+ messages in thread From: Russ Cox @ 2011-02-25 5:47 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: erik quanstrom >> assuming a tight 1:1 coupling between sleep and >> wakeup is a recipe for trouble. even if your change >> fixes one possible race (i didn't bother to see what changed), >> you still have to deal with > > the point of sleep/rendezvous is tight coupling, no? no, it's not 1:1. > the change was to move the ready() to after the rendezvous > lock was dropped. therefore the sleeper knows the rendezvous > is not locked by the event that woke him. if one can assert > that each sleep has exactly one wakeup (as is often the case > for rpc-style programming), then that is enough to know > the rendezvous can be retired. that's only true if the sleep sleeps. if sleep checks f(arg) and finds it to be true, then it will not sleep, and the subsequent wakeup will happen after sleep returns (and be a no-op, unless the memory has been freed). > if not a tight coupling, what kind of coupling would you > think is appropriate? when would you think it would be > fair to recycle the rendezvous? 10s? :-) what idiom do > you think would be appropriate for such a case? it is appropriate to reuse the memory when you know that no cpu is still referring to it. you can deal with the main procs. interrupt handlers are the wildcard. put an ilock around the table where the pointer to it is kept, and only use the memory (from an interrupt handler) while inside the ilock. russ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 5:47 ` Russ Cox @ 2011-02-25 5:53 ` erik quanstrom 2011-02-25 6:01 ` Russ Cox 0 siblings, 1 reply; 15+ messages in thread From: erik quanstrom @ 2011-02-25 5:53 UTC (permalink / raw) To: 9fans > >> assuming a tight 1:1 coupling between sleep and > >> wakeup is a recipe for trouble. even if your change > >> fixes one possible race (i didn't bother to see what changed), > >> you still have to deal with > > > > the point of sleep/rendezvous is tight coupling, no? > > no, it's not 1:1. i don't yet see why this assumption isn't a good one to have. what is the argument against tightening it up? i don't see it yet. > > the change was to move the ready() to after the rendezvous > > lock was dropped. therefore the sleeper knows the rendezvous > > is not locked by the event that woke him. if one can assert > > that each sleep has exactly one wakeup (as is often the case > > for rpc-style programming), then that is enough to know > > the rendezvous can be retired. > > that's only true if the sleep sleeps. > if sleep checks f(arg) and finds it to be true, then it will not > sleep, and the subsequent wakeup will happen after sleep > returns (and be a no-op, unless the memory has been freed). and that case is fine, too. either way, we're "done". > > if not a tight coupling, what kind of coupling would you > > think is appropriate? when would you think it would be > > fair to recycle the rendezvous? 10s? :-) what idiom do > > you think would be appropriate for such a case? > > it is appropriate to reuse the memory when you know that > no cpu is still referring to it. you can deal with the main procs. > interrupt handlers are the wildcard. put an ilock around the table > where the pointer to it is kept, and only use the memory > (from an interrupt handler) while inside the ilock. since i know i'm done when the wakeup happens, the wakeup itself is causing the trouble. it starts the sleeper before releasing the rendezvous lock. so your solution would have me have an ilock (with no interrupt in sight!) to protect rendezvous from itself? why can't rendezvous carry its own water? - erik ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 5:53 ` erik quanstrom @ 2011-02-25 6:01 ` Russ Cox 2011-02-25 6:12 ` erik quanstrom [not found] ` <2808a9fa079bea86380a8d52be67b980@coraid.com> 0 siblings, 2 replies; 15+ messages in thread From: Russ Cox @ 2011-02-25 6:01 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: erik quanstrom you are fixating on the case you fixed and not the case i am describing. in the case i am describing sleep returns before wakeup even starts. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 6:01 ` Russ Cox @ 2011-02-25 6:12 ` erik quanstrom [not found] ` <2808a9fa079bea86380a8d52be67b980@coraid.com> 1 sibling, 0 replies; 15+ messages in thread From: erik quanstrom @ 2011-02-25 6:12 UTC (permalink / raw) To: 9fans On Fri Feb 25 01:02:16 EST 2011, rsc@swtch.com wrote: > you are fixating on the case you fixed and not > the case i am describing. in the case i am describing > sleep returns before wakeup even starts. what i'm saying is that with the code i posted a) if sleep does not sleep, wakeup is not holding the rendez lock when sleep returns b) if sleep does sleep, the same assertion holds - erik ^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <2808a9fa079bea86380a8d52be67b980@coraid.com>]
[parent not found: <AANLkTi=4_=++Tm2a9Jq9jSzqUSexkW-ZjM-38oD_bS1y@mail.gmail.com>]
[parent not found: <40925e8f64489665bd5bd6ca743400ea@coraid.com>]
* Re: [9fans] sleep/wakeup bug? [not found] ` <40925e8f64489665bd5bd6ca743400ea@coraid.com> @ 2011-02-25 6:51 ` Russ Cox 2011-02-25 7:13 ` erik quanstrom ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Russ Cox @ 2011-02-25 6:51 UTC (permalink / raw) To: erik quanstrom; +Cc: 9fans > your layout in your first email (i think) assumes that wakeup > can be called twice. it doesn't. the scenario in my first email has exactly one sleep and one wakeup. the canonical problem you have to avoid when implementing sleep and wakeup is that the wakeup might happen before the sleep has gotten around to sleeping. to be concrete, you might do something like: cpu1: kick off disk i/o operation sleep(r) cpu2: interrupt happens mark operation completed wakeup(r) the problem is what happens if the interrupt is so fast that cpu2 runs all that before sleep(&r) starts. a wakeup without a sleep is defined to be a no-op, so if the wakeup runs first the sleep never wakes up: cpu1: kick off disk i/o operation cpu2: interrupt happens mark operation completed wakeup(r) cpu1: sleep(r) // never returns to avoid that problem there is this extra f, arg passed to sleep along with some locks to make sure sleep and wakeup are not running their coordination code simultaneously. with f(arg), the last scenario becomes: cpu1: kick off disk i/o operation cpu2: interrupt happens mark operation completed wakeup(r) cpu1: sleep(r) calls f(arg), which sees op marked completed, returns 1 sleep returns immediately avoiding the missed wakeup. unfortunately the f(arg) check means that now sleep can sometimes return before wakeup (kind of a missed sleep): cpu1: kick off disk i/o operation cpu2: interrupt happens mark operation completed cpu1: sleep(r) calls f(arg), which checks completed, returns 1 sleep returns immediately cpu2: wakeup(r) finds nothing sleeping on r, no-op. there's no second wakeup involved here. this is just sleep figuring out that there's nothing to sleep for, before wakeup comes along. f(arg) == true means that wakeup is either on its way or already passed by, and sleep doesn't know which, so it has to be conservative and not sleep. if r is allocated memory and cpu1 calls free(r) when sleep returns, that's not okay, because cpu2 has already decided to call wakeup(r), which will now be scribbling on or at least looking at freed memory. as i said originally, it's simply not 1:1. if you need 1:1, you need a semaphore. russ p.s. not relevant to your "only one sleep and one wakeup" constraint, but that last scenario also means that if you are doing repeated sleep + wakeup on a single r, that pending wakeup call left over on cpu2 might not happen until cpu1 has gone back to sleep (a second time). that is, the first wakeup can wake the second sleep, intending to wake the first sleep. so in general you have to handle the case where sleep wakes up for no good reason. it doesn't happen all the time, but it does happen. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 6:51 ` Russ Cox @ 2011-02-25 7:13 ` erik quanstrom 2011-02-25 14:44 ` Russ Cox 2011-02-25 8:37 ` Sape Mullender ` (2 subsequent siblings) 3 siblings, 1 reply; 15+ messages in thread From: erik quanstrom @ 2011-02-25 7:13 UTC (permalink / raw) To: 9fans > it doesn't. the scenario in my first email has exactly one > sleep and one wakeup. alright. i do see it now, and this does look like a source of many gnarly, and hard to find bugs. pardon me for being so thick. why doesn't wakeup take a function and a pointer analogous to sleep? that would seem to lead one in proper usage. or is there a more subtle hole that i also don't see in doing it that way? perhaps it's silly to put a thumb in the dyke, but it might be better than relying on proper fixes for everything appearing overnight. in the case of this particular bug, i have at least 40µs grace and the change has held up for 12 hrs, where i could crash the machine in <5s before. > as i said originally, it's simply not 1:1. > if you need 1:1, you need a semaphore. unfortunately, we don't have those in the kernel. > p.s. not relevant to your "only one sleep and one wakeup" > constraint, but that last scenario also means that if you > are doing repeated sleep + wakeup on a single r, that pending > wakeup call left over on cpu2 might not happen until cpu1 has > gone back to sleep (a second time). that is, the first wakeup > can wake the second sleep, intending to wake the first sleep. > so in general you have to handle the case where sleep > wakes up for no good reason. it doesn't happen all the time, > but it does happen. this is probablly a bug lurking in many things. - erik ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 7:13 ` erik quanstrom @ 2011-02-25 14:44 ` Russ Cox 0 siblings, 0 replies; 15+ messages in thread From: Russ Cox @ 2011-02-25 14:44 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: erik quanstrom > in the case of this particular bug, i have at least 40µs > grace and the change has held up for 12 hrs, where i > could crash the machine in <5s before. i don't know about you but if i were shipping code that depended critically on not losing a race with a fast device, it would keep me up at night. what sape said. don't try to paper over this. also what the sleep and wakeup paper says: The code looks trivially correct in retrospect: all access to data structures is done under lock, and there is no place that things may get out of order. Nonetheless, it took us several iterations to arrive at the above implementation, because the things that can go wrong are often hard to see. We had four earlier implementations that were examined at great length and only found faulty when a new, different style of device or activity was added to the system. you can spend a lot of time chasing down bugs because you are using sleep and wakeup incorrectly, or you can use it correctly. "fixing" it is not something i would suggest taking on. sape also said that you can't dynamically allocate structures with rendezvous in them, but that's not strictly true. you just have to be careful, i.e. use a lock to make sure no cpu is about to find or about to call wakeup on the structure when you're about to free it. i believe the locking discipline i described in my original mail works (use a lock instead of an ilock if there's no interrupt involved). i was going to point you at devmnt but devmnt looks buggy to me. mountmux unlocks m before calling wakeup, which on first glance looks like a mistake mitigated only by the fact that mntfree only rarely frees r. or i could be missing something. it's certainly much easier to use static rendez structures. as presotto says in the thread richard mentioned, the problem is not sleep vs wakeup. the responsibility necessarily has to lie with the calling code, which is finding, looking at, and possibly modifying the structure containing r before it calls wakeup. http://9fans.net/archive/?q=%27test-and-set+problem%27 >> p.s. not relevant to your "only one sleep and one wakeup" >> constraint, but that last scenario also means that if you >> are doing repeated sleep + wakeup on a single r, that pending >> wakeup call left over on cpu2 might not happen until cpu1 has >> gone back to sleep (a second time). that is, the first wakeup >> can wake the second sleep, intending to wake the first sleep. >> so in general you have to handle the case where sleep >> wakes up for no good reason. it doesn't happen all the time, >> but it does happen. > > this is probablly a bug lurking in many things. it's not that hard to get this part right. the simple rule is that you should call sleep in a loop until the condition you are waiting for has happened. (keep sleeping until your dreams come true?) devmnt is a good example in this case. seeing the loop when you read the code is also a good reminder that sometimes sleep never runs at all. russ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 6:51 ` Russ Cox 2011-02-25 7:13 ` erik quanstrom @ 2011-02-25 8:37 ` Sape Mullender 2011-02-25 9:18 ` Bakul Shah 2011-02-25 14:57 ` Charles Forsyth 2011-02-25 16:09 ` Venkatesh Srinivas 3 siblings, 1 reply; 15+ messages in thread From: Sape Mullender @ 2011-02-25 8:37 UTC (permalink / raw) To: 9fans; +Cc: sape I suppose the use of counting semaphores in sleep/wakeup could help in cases like this (but I'm sure there are still plenty of other scenarios where they might not help). The value of the semaphore would represent something like "number of things to do", so acquire(sema) would (atomically) wait until the value of sema is greater than zero, then (using compare&swap, or doing the whole thing inside an ilock) decrement the semaphore and continue. Release(sema) will (atomically) increment the semaphore and, if the old value was zero, wake up any waiters. Now, at first glance that looks like a vast improvement over sleep/ wakeup, but *inside* acquire and release, you'd still have sleep/wakeup and you'd still run the risk of waking up just when something else managed to grab the semaphore, or waking up something that hasn't actually gone to sleep yet. So, I think you can think of semaphores as a wrapper for sleep/wakeup that can be used in some case to make sure that you can indeed safely do a free() of some memory (this was, I think what started the whole discussion). It's taken a long time to get sleep/wakeup bugfree in Plan 9 and some of the greatest minds in code verification (formerly at Bell Labs) have been called upon to help get it right. Russ is perfectly correct in the explanations below and it's a good exercise to read through it. This stuff is really tricky. Many optimization, all of them seemingly correct, failed because of subtle race conditions, some of them involving three or more processes. Sape >> your layout in your first email (i think) assumes that wakeup >> can be called twice. > > it doesn't. the scenario in my first email has exactly one > sleep and one wakeup. > > the canonical problem you have to avoid when implementing > sleep and wakeup is that the wakeup might happen before > the sleep has gotten around to sleeping. to be concrete, > you might do something like: > > cpu1: > kick off disk i/o operation > sleep(r) > > cpu2: > interrupt happens > mark operation completed > wakeup(r) > > the problem is what happens if the interrupt is so fast > that cpu2 runs all that before sleep(&r) starts. > a wakeup without a sleep is defined to be a no-op, so > if the wakeup runs first the sleep never wakes up: > > cpu1: > kick off disk i/o operation > > cpu2: > interrupt happens > mark operation completed > wakeup(r) > > cpu1: > sleep(r) // never returns > > to avoid that problem there is this extra f, arg passed > to sleep along with some locks to make sure sleep > and wakeup are not running their coordination code > simultaneously. with f(arg), the last scenario becomes: > > cpu1: > kick off disk i/o operation > > cpu2: > interrupt happens > mark operation completed > wakeup(r) > > cpu1: > sleep(r) > calls f(arg), which sees op marked completed, returns 1 > sleep returns immediately > > avoiding the missed wakeup. > > unfortunately the f(arg) check means that now sleep can > sometimes return before wakeup (kind of a missed sleep): > > cpu1: > kick off disk i/o operation > > cpu2: > interrupt happens > mark operation completed > > cpu1: > sleep(r) > calls f(arg), which checks completed, returns 1 > sleep returns immediately > > cpu2: > wakeup(r) > finds nothing sleeping on r, no-op. > > there's no second wakeup involved here. this is just sleep > figuring out that there's nothing to sleep for, before wakeup > comes along. f(arg) == true means that wakeup is either > on its way or already passed by, and sleep doesn't know which, > so it has to be conservative and not sleep. > > if r is allocated memory and cpu1 calls free(r) when sleep > returns, that's not okay, because cpu2 has already decided > to call wakeup(r), which will now be scribbling on or at least > looking at freed memory. > > as i said originally, it's simply not 1:1. > if you need 1:1, you need a semaphore. > > russ > > > p.s. not relevant to your "only one sleep and one wakeup" > constraint, but that last scenario also means that if you > are doing repeated sleep + wakeup on a single r, that pending > wakeup call left over on cpu2 might not happen until cpu1 has > gone back to sleep (a second time). that is, the first wakeup > can wake the second sleep, intending to wake the first sleep. > so in general you have to handle the case where sleep > wakes up for no good reason. it doesn't happen all the time, > but it does happen. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 8:37 ` Sape Mullender @ 2011-02-25 9:18 ` Bakul Shah 0 siblings, 0 replies; 15+ messages in thread From: Bakul Shah @ 2011-02-25 9:18 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Fri, 25 Feb 2011 09:37:39 +0100 Sape Mullender <sape@plan9.bell-labs.com> wrote: > I suppose the use of counting semaphores in sleep/wakeup could > help in cases like this (but I'm sure there are still plenty of > other scenarios where they might not help). The value of the > semaphore would represent something like "number of things to > do", so acquire(sema) would (atomically) wait until the value > of sema is greater than zero, then (using compare&swap, or > doing the whole thing inside an ilock) decrement the semaphore > and continue. > Release(sema) will (atomically) increment the semaphore and, if the > old value was zero, wake up any waiters. > > Now, at first glance that looks like a vast improvement over sleep/ > wakeup, but *inside* acquire and release, you'd still have sleep/wakeup > and you'd still run the risk of waking up just when something else > managed to grab the semaphore, or waking up something that hasn't > actually gone to sleep yet. > > So, I think you can think of semaphores as a wrapper for sleep/wakeup > that can be used in some case to make sure that you can indeed safely > do a free() of some memory (this was, I think what started the whole > discussion). wait(sema) & signal(sema) in either order would do proper synchronization. Not the case with sleep/wakeup -- they are cheaper though. > It's taken a long time to get sleep/wakeup bugfree in Plan 9 and > some of the greatest minds in code verification (formerly at Bell Labs) > have been called upon to help get it right. > > Russ is perfectly correct in the explanations below and it's a good > exercise to read through it. This stuff is really tricky. Many > optimization, all of them seemingly correct, failed because of subtle > race conditions, some of them involving three or more processes. Is it inherently tricky? Aren't semaphores easier to reason about and get right? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 6:51 ` Russ Cox 2011-02-25 7:13 ` erik quanstrom 2011-02-25 8:37 ` Sape Mullender @ 2011-02-25 14:57 ` Charles Forsyth 2011-02-25 16:09 ` Venkatesh Srinivas 3 siblings, 0 replies; 15+ messages in thread From: Charles Forsyth @ 2011-02-25 14:57 UTC (permalink / raw) To: 9fans the bug is calling wakeup(r) on deallocated r. sleep and wakeup communicate about the event represented by r, not the structure containing r. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] sleep/wakeup bug? 2011-02-25 6:51 ` Russ Cox ` (2 preceding siblings ...) 2011-02-25 14:57 ` Charles Forsyth @ 2011-02-25 16:09 ` Venkatesh Srinivas 3 siblings, 0 replies; 15+ messages in thread From: Venkatesh Srinivas @ 2011-02-25 16:09 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 1136 bytes --] On Fri, Feb 25, 2011 at 1:51 AM, Russ Cox <rsc@swtch.com> wrote: > > your layout in your first email (i think) assumes that wakeup > > can be called twice. > > it doesn't. the scenario in my first email has exactly one > sleep and one wakeup. > > the canonical problem you have to avoid when implementing > sleep and wakeup is that the wakeup might happen before > the sleep has gotten around to sleeping. > There is a pattern in DragonFly BSD for use with tsleep (tsleep is evolutionarily related to p9's sleep, though the descent is tortuous), tsleep_interlock. tsleep_interlock(WAITCHAN); /* Kick off some operation or release some higher level lock or w/e */ tsleep(WAITCHAN, ...); tsleep_interlock queues a process but does not actually go to sleep. If a wakeup happens after the tsleep_interlock but before tsleep has fully slept, the process does not sleep in tsleep(). I do not know if a similar design would be feasible given Plan 9's sleep/wakeup; at first glance, it looks to be so - sleep_interlock would look similar to sleep's front-end, but it would not 'gotolabel(&m->sched)'. -- vs [-- Attachment #2: Type: text/html, Size: 1480 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2011-02-25 16:09 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-02-24 22:01 [9fans] sleep/wakeup bug? erik quanstrom 2011-02-25 4:46 ` Russ Cox 2011-02-25 9:46 ` Richard Miller 2011-02-25 5:26 erik quanstrom 2011-02-25 5:47 ` Russ Cox 2011-02-25 5:53 ` erik quanstrom 2011-02-25 6:01 ` Russ Cox 2011-02-25 6:12 ` erik quanstrom [not found] ` <2808a9fa079bea86380a8d52be67b980@coraid.com> [not found] ` <AANLkTi=4_=++Tm2a9Jq9jSzqUSexkW-ZjM-38oD_bS1y@mail.gmail.com> [not found] ` <40925e8f64489665bd5bd6ca743400ea@coraid.com> 2011-02-25 6:51 ` Russ Cox 2011-02-25 7:13 ` erik quanstrom 2011-02-25 14:44 ` Russ Cox 2011-02-25 8:37 ` Sape Mullender 2011-02-25 9:18 ` Bakul Shah 2011-02-25 14:57 ` Charles Forsyth 2011-02-25 16:09 ` Venkatesh Srinivas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).