From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <001901c022bc$58778dc0$356214d4@cz99.cz>
From: "Jakub Jermar" <jj@comberg.cz>
To: <9fans@cse.psu.edu>
References: <20001109211530.9CCDB199E4@mail.cse.psu.edu>
Subject: Re: [9fans] sleep(), sched() and ilock()
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-2"
Content-Transfer-Encoding: 7bit
Date: Wed, 20 Sep 2000 06:35:33 +0200
Topicbox-Message-UUID: 26d2252e-eac9-11e9-9e20-41e7f4b1d025

> First. Am I right when supposing that the reason ilock() doesn't call
> scheduler is that it has to be sure
> that iunlock() executes on the same cpu?
>
> In general, if you are doing an ilock, it should be for something that
> can be done quickly.  ilock is used when you're locking something that
> an interrupt routine also uses and scheding in the middle of it would be
> deadly.

I understand that there can be a deadlock if a process and an interrupt
routine race for the same lock - hence the concept of ilock(). I also
understand why ilock() dies when it finds the lock already held on a
uniprocessor: no one can unlock it and, seen from the other side, ilock()
must have been called twice without an intervening iunlock().
But I still don't know whether iunlock() executing on the same cpu is a
goal (maybe in order to restore the processor state as it was before
ilock()?) or a mere consequence of the fact that there was no rescheduling
in between.
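
To make the question concrete, here is a toy model in C. It is not the
kernel source: the Cpu and ILock structures, the toy_ names and the owner
field are all hypothetical. It only assumes that ilock() records the
interrupt state it saw and that iunlock() puts that state back, which
only makes sense if both run on the same cpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel source: a CPU is just an interrupt-enable flag. */
typedef struct Cpu {
    bool intr_enabled;
} Cpu;

typedef struct ILock {
    bool held;
    bool saved_intr;  /* interrupt state seen when the lock was taken */
    Cpu *owner;       /* hypothetical field: which CPU took the lock */
} ILock;

/* Disable interrupts on c and return the previous state. */
static bool toy_splhi(Cpu *c)
{
    bool old = c->intr_enabled;
    c->intr_enabled = false;
    return old;
}

/* Restore a state previously returned by toy_splhi(). */
static void toy_splx(Cpu *c, bool state)
{
    c->intr_enabled = state;
}

/* Returns 0 on success, -1 where a uniprocessor kernel would die. */
static int toy_ilock(Cpu *c, ILock *l)
{
    bool old = toy_splhi(c);
    if (l->held) {
        toy_splx(c, old);
        return -1;
    }
    l->held = true;
    l->saved_intr = old;
    l->owner = c;
    return 0;
}

/* Release the lock and restore the state saved by toy_ilock(). */
static void toy_iunlock(Cpu *c, ILock *l)
{
    bool old = l->saved_intr;
    assert(l->held && l->owner == c);  /* the assumption under discussion */
    l->held = false;
    l->owner = NULL;
    toy_splx(c, old);
}
```

In this model a reschedule between toy_ilock() and toy_iunlock() would
leave the first cpu with interrupts off and hand its saved state to
another cpu.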

> Second. Why doesn't sleep() simply call sched() right after it releases
> both locks? Am I unaware of some possible race condition?
>
> Off hand, I can't see any particular reason other than tunnel vision
> on our part.  Looks like all we're doing is duplicating code with the
> net effect of saving 2 subroutine calls (splhi and sched).  I may try
> it the other way and have Gerard Holtzman verify it again and see if
> I'm missing something myself.

That's interesting. Sleep is an example of a routine that enters the
scheduler splhi()'ed (as you mention in your comment on my third point).
Furthermore, it emulates the behaviour of the normal sched() after the
process resumes from its p->sched: it spllo()'s, but immediately afterwards
there is a splx(x), which cancels the effect of the spllo() no matter what
x was.
Actually, this function makes me think there must be a reason for it to
duplicate a piece of sched(), but I can't figure out what it is. I thought
it was because of the two locks being held, but releasing them before
gotolabel(&m->sched) is just as necessary as it would have been if sleep()
had called the normal sched().
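
The spllo()/splx(x) point can be shown with a small C sketch. This is a
toy model, not the kernel code: the Spl enum, the global level and the
toy_ functions are hypothetical stand-ins for the real priority-level
routines.

```c
/* Toy model: the interrupt priority level of one simulated CPU. */
typedef enum { SPL_LO, SPL_HI } Spl;

static Spl level = SPL_LO;

/* Lower the level and return the previous one. */
static Spl toy_spllo(void)
{
    Spl old = level;
    level = SPL_LO;
    return old;
}

/* Set the level to a previously saved value. */
static void toy_splx(Spl x)
{
    level = x;
}

/*
 * Tail of the sequence described above: the process comes back from
 * the scheduler at high level, does spllo(), then splx(x).
 * Returns the level the caller ends up with.
 */
static Spl toy_resume_tail(Spl x)
{
    level = SPL_HI;   /* assumed state on return from the scheduler */
    toy_spllo();
    toy_splx(x);
    return level;
}
```

Whatever x is, toy_resume_tail(x) returns x, so in this model the spllo()
only matters for the short window before splx(x) runs.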

> Third. It seems to me that sleep() might disable interrupts on a different
> cpu than is the one that enables them again before sleep() finishes. Is it
> ok when a process brings the old processor state to the new current
> processor and leaves the old one in (say) the splhi()'d state?
>
> Once you go frolicking through 'gotolabel(&m->sched)', the spl level
> is lost, i.e., schedinit always starts hi and returns low: any
> process coming out of a rescheduling comes out spllo().  That
> means that calling sched() when you are splhi() is wrong.  This
> refers back to your first point.  The processor that you did the sleep
> on will go spllo while it's looking for a new process to run and will
> start that process up spllo.

But when sched() does gotolabel(&up->sched) after runproc() has picked a
new process, it restores that process's PC and SP. The SP points to the
process's stack, which contains the process's x (the variable from sleep())
holding the previous processor state, and sleep() then restores that state
on the current cpu.
I believe this is the way that sleeping and waking processes carry processor
state with them. Never mind if the two processors are not the same - I
realized the functionality must be the same - it works just as though there
had been no processor switch (the emphasis must be on the code and not the
processor). If the process just gets rescheduled, it leaves sched()
spllo()'ed - just as you said.
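
A rough C sketch of the state travelling with the process follows. It is
a toy model under stated assumptions, not the kernel source: the Cpu and
Proc structures and the toy_ functions are hypothetical, and the field x
stands in for the local variable x that lives on the process's stack.

```c
/* Toy model: each CPU has its own interrupt level (1 = hi, 0 = lo). */
typedef struct Cpu {
    int hi;
} Cpu;

/* The field x stands in for sleep()'s local x on the process stack. */
typedef struct Proc {
    int x;
} Proc;

/* Raise the level on c and return the previous one. */
static int toy_splhi(Cpu *c)
{
    int old = c->hi;
    c->hi = 1;
    return old;
}

/* First half of sleep(): save the level, then enter the scheduler. */
static void toy_sleep_enter(Cpu *c, Proc *p)
{
    p->x = toy_splhi(c);
    /* the scheduler loop on this cpu then runs lo, per the reply above */
    c->hi = 0;
}

/* Second half of sleep(): the process resumes, possibly on another cpu. */
static void toy_sleep_resume(Cpu *c, Proc *p)
{
    c->hi = 0;      /* the spllo() after the process is resumed */
    c->hi = p->x;   /* splx(x): the saved level lands on this cpu */
}
```

If a process goes to sleep on one Cpu while hi and resumes on another, the
first Cpu ends up lo and the second ends up hi, so the level follows the
code path rather than the processor.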

Jakub Jermar