From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Sun, 22 Jun 2014 08:10:51 -0400
To: 9fans@9fans.net
Message-ID: <65984946d00c8ebe35122418a4f83cb1@mikro.quanstro.net>
In-Reply-To:
References:
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Subject: Re: [9fans] [GSOC] plan9 kernel scheduler
Topicbox-Message-UUID: fc72d8cc-ead8-11e9-9d60-3106f5b1d025

On Sat Jun 21 21:47:37 EDT 2014, jyu@cowsay.org wrote:
> +1 what Yoann said. :-) On SMP systems, all maches share a global run
> queue, and maches tend to try grabbing procs that have run on it
> before (affinity). Take a look at port/proc.c in particular, where a
> lot of the scheduling logic is implemented.

the (base) plan 9 scheduler is a global, priority-based round-robin
scheduler.  priority is absolute: a process of priority n+1 will always
run before a process of priority n.  processes within a priority are run
in round-robin fashion.  recent cpu usage lowers a process's priority.
there is soft affinity that tends to rerun a process on the same cpu
when practical, though this mechanism is too strong in the standard
distribution (see point 3 below).  there is also hard affinity; one can
wire a proc to a mach.

a key detail of the scheduler is that the set of priorities with
runnable processes is kept in a bitvector named "runvec".  significantly
fewer than 32 priorities are used.

there is also an edf scheduler.  i can't find the iwp9 paper atm
(other than /n/atom/plan9/sys/doc/real-time.ms), but i think that's
beyond the scope of the question.

the scheduler is inefficient for many cpus, and there are some bugs
that cause expectations to diverge strongly from reality.  so to make
life easier for a gsoc student who is working on proper changes to the
scheduler, i spent two days on some cheap improvements that make the
scheduler predictable and reasonable, if not terribly efficient.  they
depend on having a scalable lock.

1. given a scalable lock, putting a monitor around the run-queue scan
results in one cacheline contention event rather than many.  this also
has the effect of building a queue of processors waiting in line for
their chance at the run queues.

2. replacing idlehands with monmwait on the runvec.

3. reducing the soft affinity to n microseconds, rather than letting
all local-affinity processes come before non-local processes regardless
of time spent on the run queues.  (i'm currently working with n=500µs.)

these improvements have the net effect on my 8-thread machine of
lowering the nix kernel compile time by about 40%.  i'm not sure what
to expect on larger machines.  i did not expect such large improvements,
so unfortunately this has raised the bar quite a bit for the gsoc
project.  nonetheless, i'm happy to report that the new scheduler is
already within 7% of the old!

- erik
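
p.s. to make the priority scan concrete, here is a standalone sketch of
the runvec idea.  the names (Schedq, queueproc, pickproc) and the queue
code are illustrative, not the actual port/proc.c code; the real kernel
keeps more state per queue and locks around all of this.

	#include <stdio.h>

	enum { Nrq = 20 };		/* number of priorities; fewer than 32, so one word of bits suffices */

	typedef struct Proc Proc;
	struct Proc {
		int	pid;
		int	pri;
		Proc	*rnext;		/* next proc on its run queue */
	};

	typedef struct {
		Proc	*head;
		Proc	*tail;
	} Schedq;

	static Schedq runq[Nrq];	/* one round-robin queue per priority */
	static unsigned long runvec;	/* bit p set => runq[p] is non-empty */

	/* append p to the tail of its priority's queue (round robin within a priority) */
	static void
	queueproc(Proc *p)
	{
		Schedq *rq;

		rq = &runq[p->pri];
		p->rnext = NULL;
		if(rq->tail != NULL)
			rq->tail->rnext = p;
		else
			rq->head = p;
		rq->tail = p;
		runvec |= 1UL << p->pri;
	}

	/* pick the head of the highest-priority non-empty queue; NULL if nothing is runnable */
	static Proc*
	pickproc(void)
	{
		int pri;
		Schedq *rq;
		Proc *p;

		for(pri = Nrq-1; pri >= 0; pri--){
			if((runvec & (1UL << pri)) == 0)
				continue;	/* empty queue skipped with one bit test */
			rq = &runq[pri];
			p = rq->head;
			rq->head = p->rnext;
			if(rq->head == NULL){
				rq->tail = NULL;
				runvec &= ~(1UL << pri);	/* queue drained; clear its bit */
			}
			return p;
		}
		return NULL;
	}

	int
	main(void)
	{
		Proc a = {1, 7}, b = {2, 13}, c = {3, 7};
		Proc *p;

		queueproc(&a);
		queueproc(&b);
		queueproc(&c);
		/* priority 13 runs before priority 7; equal priorities run round robin */
		while((p = pickproc()) != NULL)
			printf("run pid %d (pri %d)\n", p->pid, p->pri);
		return 0;
	}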
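
p.p.s. and a sketch of the time-limited soft affinity in point 3.  the
names (Affinityus, oktorun, readyus) are made up for illustration; the
real change lives in the run-queue scan in port/proc.c.

	#include <stdio.h>

	enum { Affinityus = 500 };	/* the 500 microsecond cutoff mentioned above */

	typedef unsigned long long uvlong;

	/* may this mach run the proc now?  a proc that last ran on another
	   mach is left alone until it has waited Affinityus on the run queue;
	   after that, any mach may take it. */
	static int
	oktorun(int lastmach, int mymach, uvlong nowus, uvlong readyus)
	{
		if(lastmach == mymach)
			return 1;			/* warm cache: take it immediately */
		return nowus - readyus >= Affinityus;	/* cold: only after the cutoff */
	}

	int
	main(void)
	{
		/* proc last ran on mach 2; mach 0 asks 200us and then 700us after it became ready */
		printf("%d\n", oktorun(2, 0, 1200, 1000));	/* 0: still reserved for mach 2 */
		printf("%d\n", oktorun(2, 0, 1700, 1000));	/* 1: waited 700us >= 500us */
		return 0;
	}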