From mboxrd@z Thu Jan  1 00:00:00 1970
From: miller@hamnavoe.demon.co.uk
To: 9fans@cse.psu.edu
Date: Thu, 20 Jul 2000 14:54:57 +0100
Subject: Re: [9fans] Kernel question: i386 test-and-set problem
Message-Id: <E13FGjC-0007j3-0Y@anchor-post-34.mail.demon.net>
Topicbox-Message-UUID: e7a8b278-eac8-11e9-9e20-41e7f4b1d025

jmk@plan9.bell-labs.com writes:

> The sleep/wakeup/postnote Rendez structure still has a lock which
> protects it, it just moved somewhere else.

Sorry, I didn't explain in enough detail.  At /sys/src/9/port/proc.c:588,
wakeup() looks at r->p (pointer from Rendez to sleeping process)
without first acquiring any lock.  That's the unprotected access I was
referring to: it's dangerous because r->p is shared asynchronously
by sleep() and postnote().

The original 2nd edition kernel (CD version) had a lock in the Rendez
structure, and all accesses to r->p were protected by acquiring
the lock first.  However, p->r (pointer from sleeping process to Rendez)
was shared between sleep() and postnote() without locking.

A later kernel update (845586056.rc) introduced a new lock in the Proc
structure (p->rlock) to protect the shared access to p->r, but eliminated
the lock in the Rendez structure.  This left r->p exposed again.  I believe
that's why you need coherence() calls.

> The 2nd Edition code would
> have needed coherence() calls too, but in different places, had it not
> been rewritten before we tried running on a multiprocessor Pentium Pro.

When I added MP support to the 2nd edition for my dual Pentium Pro system,
I reinstated the Rendez lock, and kept p->rlock as well, so in the
three-way conversation between sleep(), wakeup() and postnote() both
r->p and p->r are protected.  I didn't add any explicit coherence()
calls anywhere, and the system has been running stably for over two years.
If I remove the lock around the r->p access in wakeup(), a few simultaneous
'du -a /' commands will quickly cause a crash.
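
To make the two-lock arrangement concrete, here is a minimal user-space
sketch.  It is not the kernel source: pthread mutexes stand in for the
kernel's locks, the struct layouts are cut down to the two pointers, and
the sketch_* names and the retry loop in sketch_postnote() are assumptions
made for illustration.  The point is only that r->p is touched solely
under the Rendez lock, and p->r solely under p->rlock.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

typedef struct Proc Proc;
typedef struct Rendez Rendez;

enum { Running, Wakeme, Ready };

struct Rendez {
	pthread_mutex_t lock;	/* protects p (stand-in for the Rendez lock) */
	Proc *p;		/* sleeping process, or NULL */
};

struct Proc {
	pthread_mutex_t rlock;	/* protects r (stand-in for p->rlock) */
	Rendez *r;		/* Rendez being slept on, or NULL */
	int state;
};

/* Record p as the sleeper on r.  Lock order: Rendez first, then Proc. */
void
sketch_sleep(Rendez *r, Proc *p)
{
	pthread_mutex_lock(&r->lock);
	pthread_mutex_lock(&p->rlock);
	r->p = p;
	p->r = r;
	p->state = Wakeme;
	pthread_mutex_unlock(&p->rlock);
	pthread_mutex_unlock(&r->lock);
}

/* Wake the sleeper on r, if any.  r->p is read only with r->lock held. */
Proc*
sketch_wakeup(Rendez *r)
{
	Proc *p;

	pthread_mutex_lock(&r->lock);
	p = r->p;
	if(p != NULL){
		pthread_mutex_lock(&p->rlock);
		assert(p->r == r);	/* both pointers must agree under both locks */
		r->p = NULL;
		p->r = NULL;
		p->state = Ready;
		pthread_mutex_unlock(&p->rlock);
	}
	pthread_mutex_unlock(&r->lock);
	return p;
}

/*
 * Detach p from whatever it sleeps on; returns 1 if it was sleeping.
 * This path starts from the Proc, so it takes the locks in the opposite
 * order.  To avoid deadlock it only tries the Rendez lock and backs off
 * completely if that fails.
 */
int
sketch_postnote(Proc *p)
{
	Rendez *r;

	for(;;){
		pthread_mutex_lock(&p->rlock);
		r = p->r;
		if(r == NULL){
			pthread_mutex_unlock(&p->rlock);
			return 0;
		}
		if(pthread_mutex_trylock(&r->lock) == 0){
			if(r->p == p)
				r->p = NULL;
			p->r = NULL;
			p->state = Ready;
			pthread_mutex_unlock(&r->lock);
			pthread_mutex_unlock(&p->rlock);
			return 1;
		}
		pthread_mutex_unlock(&p->rlock);
		sched_yield();	/* let the other side leave its critical section */
	}
}
```

Dropping the pthread_mutex_lock(&r->lock) in sketch_wakeup() gives the
analogue of the unprotected r->p read described above: the pointer can
then change between the test and the use.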

-- Richard Miller