9fans - fans of the OS Plan 9 from Bell Labs
From: David Hogan <dhog@lore.plan9.cs.su.oz.au>
Subject: calling sleep() while holding lock()
Date: Fri,  9 May 1997 02:53:25 +1000
Message-ID: <19970508165325.Nq-6UWsyjVLbhs8Z8BK3WMm4fTX3faSlfbiApFof9kw@z>

David Butler wrote:
> It would seem to me that a process should not call sleep() holding
> a spinlock, even though that seems to be happening.

This seems reasonable to me.

> I changed taslock to increment and decrement the hasspin flag instead of just
> setting it and clearing it.  It is reasonable to have many locks.
> (There is also a problem with the lock being dropped before the hasspin
> was modified that I fixed.  I also temporarily removed the hasspin clear
> from clock.c)

> I then added a print in sleep to print the pid and hasspin counter if
> hasspin > 0.  It happens a lot and pretty early in the boot phase.
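
For illustration only, the counting scheme described in the quoted text
might look roughly like the sketch below.  It is a single-threaded model,
not the actual taslock code: the Proc and Lock layouts, the `up` pointer
and the sleepcheck() helper are all assumptions made so that the example
compiles on its own.  The points it tries to show are that hasspin is a
count rather than a flag, and that it is adjusted before the lock is
dropped.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's process and lock structures. */
typedef struct Proc Proc;
struct Proc {
	int pid;
	int hasspin;	/* number of spinlocks held, not a boolean */
};

typedef struct Lock Lock;
struct Lock {
	int key;	/* 0 = free, 1 = held */
};

static Proc proc0 = { 1, 0 };
static Proc *up = &proc0;	/* assumed pointer to the current process */

void
lock(Lock *l)
{
	/* A real kernel would spin on an atomic test-and-set here. */
	assert(l->key == 0);
	l->key = 1;
	up->hasspin++;
}

void
unlock(Lock *l)
{
	/* Adjust the counter first, then drop the lock. */
	assert(up->hasspin > 0);
	up->hasspin--;
	l->key = 0;
}

/*
 * The check that would sit at the top of sleep(): report the pid and
 * the counter if any spinlock is held.  Returns the counter.
 */
int
sleepcheck(void)
{
	if(up->hasspin > 0)
		fprintf(stderr, "sleep: pid %d hasspin %d\n", up->pid, up->hasspin);
	return up->hasspin;
}
```

With two locks held, sleepcheck() reports a count of 2; a plain flag
would have been cleared by the first unlock().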

Good coding.  So now the question remains: why is this behaviour
occurring?  One possibility is that we take a fault while holding
the lock, and we then have to sleep until the memory gets paged in.
Alberto Nava found a place in the kernel where this is happening, and
I'm sure there must be others.

It's not that good to take a fault while holding a spinlock; at
minimum, there will be a loss of efficiency.  In the worst case,
the kernel may deadlock or panic.  Code which allows this to happen
should be tracked down, and changed to either use a local buffer
in the critical section, or else verify that the memory is writable
first...
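
As a rough sketch of the local-buffer approach (not code from the
kernel), the routine below copies data out of a locked structure into a
buffer on the stack, releases the lock, and only then touches the
destination, which is the memory that might fault.  The Queue type, the
spinlock()/spinunlock() stand-ins and qread() are hypothetical names, and
the stack buffer is assumed to be resident.

```c
#include <assert.h>
#include <string.h>

enum { Qsize = 64 };

/* Hypothetical shared structure protected by a spinlock. */
typedef struct Queue Queue;
struct Queue {
	int locked;		/* stand-in for a real spinlock */
	char data[Qsize];
	int n;			/* bytes currently queued */
};

static void
spinlock(Queue *q)
{
	assert(!q->locked);	/* a real lock would spin here */
	q->locked = 1;
}

static void
spinunlock(Queue *q)
{
	q->locked = 0;
}

/*
 * Copy up to len bytes from q to dst.  dst may be pageable memory, so
 * it is written only after the lock has been released.
 */
int
qread(Queue *q, char *dst, int len)
{
	char tmp[Qsize];	/* on the stack: assumed not to fault */
	int n;

	if(len < 0)
		len = 0;

	spinlock(q);
	n = q->n < len ? q->n : len;
	memmove(tmp, q->data, n);
	memmove(q->data, q->data + n, q->n - n);	/* shift the remainder down */
	q->n -= n;
	spinunlock(q);

	/* A fault here can sleep safely: no spinlock is held. */
	memmove(dst, tmp, n);
	return n;
}
```

The cost is one extra copy; the gain is that the critical section only
touches memory that is known to be present.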

You should add another print to the fault handler, so that you can
see which of the sleeps are caused by faults, and which aren't.
You might want to record the caller PC of the most recent spinlock,
and print that as well.  This will enable locating which parts of
the kernel are behaving this way.
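
A minimal sketch of that instrumentation follows, again as a standalone
model rather than kernel code.  It assumes a lockpc field added to the
per-process structure and a hypothetical faultcheck() called from the
fault path.  In the Plan 9 kernel the caller PC would normally come from
getcallerpc(); the sketch uses the GCC/Clang builtin
__builtin_return_address(0) instead so that it compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-process state; lockpc is the added field. */
typedef struct Proc Proc;
struct Proc {
	int pid;
	int hasspin;		/* spinlocks currently held */
	uintptr_t lockpc;	/* caller PC of the most recent lock() */
};

typedef struct Lock Lock;
struct Lock {
	int key;
};

static Proc proc0 = { 1, 0, 0 };
static Proc *up = &proc0;	/* assumed pointer to the current process */

void
lock(Lock *l)
{
	assert(l->key == 0);	/* a real lock would spin here */
	l->key = 1;
	up->hasspin++;
	/* GCC/Clang builtin, standing in for getcallerpc() */
	up->lockpc = (uintptr_t)__builtin_return_address(0);
}

void
unlock(Lock *l)
{
	assert(up->hasspin > 0);
	up->hasspin--;
	l->key = 0;
}

/*
 * To be called from the fault handler.  Returns 1, after printing the
 * details, if the fault was taken with a spinlock held; 0 otherwise.
 */
int
faultcheck(uintptr_t addr)
{
	if(up->hasspin == 0)
		return 0;
	fprintf(stderr, "fault: pid %d addr %#lx hasspin %d lockpc %#lx\n",
		up->pid, (unsigned long)addr, up->hasspin,
		(unsigned long)up->lockpc);
	return 1;
}
```

Comparing these messages with the ones printed from sleep() separates
the sleeps caused by faults from the rest.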

When you've got a list of PC values, use acid to find the file &
line number for each, and post them!  I for one would be interested
in this data.

> I'm doing this trying to find the cause of my earlier message about
> checksum errors on the ethernet.  I am looking for places where spinlocks
> are being held for long times and next where interrupts are masked
> too long.

I've noticed that the Plan 9 kernel does go through some quite long
periods at high IPL.  During these, it is possible to lose serial
characters at a mere 9600 baud :-(  Any insight into why this
happens would be appreciated.  I was going to add some code to the
kernel to keep a journaling buffer of (PC, microsecond) pairs recorded
at strategic points in the kernel (such as every call to splhi &co),
but I never got around to it.  I may yet do this, now that I have
a decent machine at home and spare time on weekends...
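
Such a journal could be as simple as a fixed-size ring buffer.  The
sketch below is only one possible shape for it: the names logpc() and
longestgap(), the buffer size and the idea of passing the timestamp in
as an argument are all assumptions.  A real version would read a
microsecond clock itself and would have to be safe to call at high IPL.

```c
#include <assert.h>
#include <stdint.h>

enum { Nlog = 256 };	/* power of two, so the index can be masked */

typedef struct Logent Logent;
struct Logent {
	uintptr_t pc;	/* where the event was recorded */
	uint64_t us;	/* timestamp in microseconds */
};

static Logent journal[Nlog];
static unsigned long nlogged;	/* total entries ever recorded */

/* Record one (PC, microsecond) pair, overwriting the oldest when full. */
void
logpc(uintptr_t pc, uint64_t us)
{
	Logent *e;

	e = &journal[nlogged & (Nlog-1)];
	e->pc = pc;
	e->us = us;
	nlogged++;
}

/*
 * Return the longest interval between consecutive retained entries.
 * If pc is non-nil, store the PC of the entry that starts that interval.
 */
uint64_t
longestgap(uintptr_t *pc)
{
	unsigned long i, n;
	uint64_t gap, max;
	Logent *a, *b;

	max = 0;
	n = nlogged < Nlog ? nlogged : Nlog;
	for(i = nlogged - n + 1; i < nlogged; i++){
		a = &journal[(i-1) & (Nlog-1)];
		b = &journal[i & (Nlog-1)];
		gap = b->us - a->us;
		if(gap > max){
			max = gap;
			if(pc)
				*pc = a->pc;
		}
	}
	return max;
}
```

Calling logpc() on entry to and exit from splhi regions would let
longestgap() point at the place where the longest stretch began.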

> Before I go much further, I wanted to check on this behavior.

The less time spent holding spinlocks, the better.  Your mission,
if you choose to accept it, is to obtain the release of the lost
CPU cycles.  If you are caught, Dennis will disavow all knowledge
of your actions.

> Thanks for any info.

You're welcome.




