From mboxrd@z Thu Jan 1 00:00:00 1970
From: miller@hamnavoe.demon.co.uk
To: 9fans@cse.psu.edu
Date: Thu, 20 Jul 2000 14:54:57 +0100
Subject: Re: [9fans] Kernel question: i386 test-and-set problem
Message-Id:
Topicbox-Message-UUID: e7a8b278-eac8-11e9-9e20-41e7f4b1d025

jmk@plan9.bell-labs.com writes:

> The sleep/wakeup/postnote Rendez structure still has a lock which
> protects it, it just moved somewhere else.

Sorry, I didn't explain in enough detail. In
/sys/src/9/port/proc.c:588 wakeup() looks at r->p (pointer from
Rendez to sleeping process) without first acquiring any lock.
That's the unprotected access I was referring to: it's dangerous
because r->p is shared asynchronously by sleep() and postnote().

The original 2nd edition kernel (CD version) had a lock in the
Rendez structure, and all accesses to r->p were protected by
acquiring the lock first. However, p->r (pointer from sleeping
process to Rendez) was shared between sleep() and postnote()
without locking.

A later kernel update (845586056.rc) introduced a new lock in the
Proc structure (p->rlock) to protect the shared access to p->r,
but eliminated the lock in the Rendez structure. This left r->p
exposed again. I believe that's why you need coherence() calls.

> The 2nd Edition code would
> have needed coherence() calls too, but in different places, had it not
> been rewritten before we tried running on a multiprocessor Pentium Pro.

When I added mp support to the 2nd edition for my dual ppro system,
I reinstated the Rendez lock, and kept p->rlock as well, so in the
three-way conversation between sleep(), wakeup() and postnote()
both r->p and p->r are protected. I didn't add any explicit
coherence() calls anywhere, and the system has been running stably
for over two years. If I remove the lock around the r->p access in
wakeup(), a few simultaneous 'du -a /' commands will quickly cause
a crash.

-- Richard Miller
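
To illustrate the two-lock arrangement described in the message, here is a minimal user-space sketch. It is not the Plan 9 kernel source: it uses POSIX mutexes in place of kernel locks, and the function names (rendez_init, proc_init, sleep_register, wakeup_locked, postnote_locked), the ready and notepending fields, and the lock ordering (Rendez lock first, then p->rlock) are all assumptions made for the example. The only points taken from the message are that r->p is read with the Rendez lock held and that p->r is protected by p->rlock.

```c
/* Sketch only: a hypothetical user-space analogue, not Plan 9 kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct Proc Proc;
typedef struct Rendez Rendez;

struct Rendez {
	pthread_mutex_t lock;	/* protects p (the reinstated Rendez lock) */
	Proc *p;		/* process sleeping here, or NULL */
};

struct Proc {
	pthread_mutex_t rlock;	/* protects r */
	Rendez *r;		/* Rendez this process sleeps on, or NULL */
	int ready;		/* hypothetical stand-in for "made runnable" */
	int notepending;	/* hypothetical stand-in for a posted note */
};

void
rendez_init(Rendez *r)
{
	pthread_mutex_init(&r->lock, NULL);
	r->p = NULL;
}

void
proc_init(Proc *p)
{
	pthread_mutex_init(&p->rlock, NULL);
	p->r = NULL;
	p->ready = 0;
	p->notepending = 0;
}

/* sleep() side: establish both links with both locks held. */
void
sleep_register(Rendez *r, Proc *p)
{
	pthread_mutex_lock(&r->lock);
	pthread_mutex_lock(&p->rlock);
	r->p = p;
	p->r = r;
	p->ready = 0;
	pthread_mutex_unlock(&p->rlock);
	pthread_mutex_unlock(&r->lock);
}

/* wakeup() side: r->p is only examined after taking the Rendez lock. */
int
wakeup_locked(Rendez *r)
{
	Proc *p;

	pthread_mutex_lock(&r->lock);
	p = r->p;
	if(p != NULL){
		pthread_mutex_lock(&p->rlock);
		r->p = NULL;
		p->r = NULL;
		p->ready = 1;
		pthread_mutex_unlock(&p->rlock);
	}
	pthread_mutex_unlock(&r->lock);
	return p != NULL;
}

/* postnote() side: read p->r under p->rlock, then retake the locks in
 * the same order as wakeup_locked() and recheck before unlinking. */
int
postnote_locked(Proc *p)
{
	Rendez *r;
	int woke = 0;

	pthread_mutex_lock(&p->rlock);
	p->notepending = 1;
	r = p->r;
	pthread_mutex_unlock(&p->rlock);
	if(r == NULL)
		return 0;

	pthread_mutex_lock(&r->lock);
	pthread_mutex_lock(&p->rlock);
	if(r->p == p && p->r == r){	/* still asleep on the same Rendez */
		r->p = NULL;
		p->r = NULL;
		p->ready = 1;
		woke = 1;
	}
	pthread_mutex_unlock(&p->rlock);
	pthread_mutex_unlock(&r->lock);
	return woke;
}
```

In this sketch a wakeup and a postnote racing for the same sleeper cannot both unlink it: whichever takes the Rendez lock second finds r->p already cleared and does nothing. Taking the locks in one fixed order is a design choice of the example to avoid deadlock, not a claim about how either kernel version orders them.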