9fans - fans of the OS Plan 9 from Bell Labs
From: leimy2k@speakeasy.net
To: rminnich@lanl.gov, 9fans@cse.psu.edu
Subject: Re: [9fans] xcpu note
Date: Tue, 18 Oct 2005 03:25:38 -0700
Message-ID: <51724b5012b97dec3b866bb4bf143184@psychobunny.homework.net>
In-Reply-To: <4354602C.7060102@lanl.gov>

> David Leimbach wrote:
> 
>> Clustermatic is pretty cool; I think it's what was installed on one of
>> the other clusters I used at LANL as a contractor at the time.  I
>> recall a companion tool for bproc to request nodes, sort of an ad-hoc
>> scheduler.  I had to integrate support for it into the startup of the
>> MPI I was testing on that machine.
> 
> the simple scheduler, bjs, was written by erik hendriks (now at Google, 
> sigh) and was rock-solid. It ran on one cluster, unattended, scheduling 
> 128 2-cpu nodes with a very diverse job mix, for one year. It was a 
> great piece of software. It was far faster, and far more reliable, than 
> any scheduler we have ever seen, then or now. In one test, we ran about 
> 20,000 jobs through it in about an hour, on a 1024-node cluster, just to 
> test. Note that it could probably have scheduled a lot more jobs, but 
> the run-time of the jobs was non-zero. No other scheduler we have used 
> comes close to this kind of performance. Scheduler overhead was 
> basically insignificant.
> 

Yeah, when I was last at the lab it was a "surprise" to find out that I 
had to support not only bproc but bjs as well.  Luckily it took about 10
minutes to figure out and add support for it to our "mpirun" startup script.

It was pretty neat.
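
In case it's useful to anyone else, the glue boiled down to roughly
this; a minimal sketch, assuming bjs hands the allocation to the job
through a NODES-style environment variable and that bpsh is the
per-node spawner [both from memory, so the names may well differ]:

#!/usr/bin/env python
# Hypothetical glue between a scheduler allocation and MPI startup.
# Assumes the scheduler exports the granted nodes to the job as a
# comma-separated NODES environment variable, and that bpsh runs a
# command on a bproc slave node -- details recalled from memory.
import os
import subprocess
import sys

def allocated_nodes():
    raw = os.environ.get("NODES")
    if raw is None:
        sys.exit("no node allocation found; not running under the scheduler?")
    return raw.split(",")

def launch(prog, args):
    procs = []
    for rank, node in enumerate(allocated_nodes()):
        # MPI_RANK is a stand-in for whatever our MPI actually read.
        env = dict(os.environ, MPI_RANK=str(rank))
        procs.append(subprocess.Popen(["bpsh", node, prog] + args, env=env))
    # Wait for every rank; report the worst exit status.
    return max(p.wait() for p in procs)

if __name__ == "__main__":
    sys.exit(launch(sys.argv[1], sys.argv[2:]))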

>> 
>> I'm curious to see how this all fits together with xcpu, and whether
>> a similar resource-allocation setup is needed there too.
> 
> we're going to take bjs and have it schedule nodes to give to users.
> 
> Note one thing we are going to do with xcpu: attach nodes to a user's 
> desktop machine, rather than make users log in to the cluster. So users 
> will get interactive clusters that look like they own them. This will, 
> we hope, kill batch mode. Plan 9 ideas make this possible. It's going to 
> be a big change, one we hope users will like.

Hmm, are you planning to create a multi-host xcpu resource all bound into
the user's namespace?  Or one host per set of files?  And is there an easy
way to launch multiple jobs in one shot, a la MPI startup?
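
If it works the way I'm imagining, "one shot" startup could be little
more than a loop over the mounted nodes.  A rough sketch; the mount
point and the clone/ctl/exec file names are my guesses from the usual
Plan 9 idiom, not xcpu's actual interface:

# Hypothetical multi-node startup over xcpu, assuming each allocated
# node is mounted in my namespace under /mnt/xcpu/<node> and exposes
# a Plan 9-style clone/ctl/exec session interface.  All names here
# are illustrative guesses, not xcpu's real layout.
import os

def start(node, binary, args):
    base = os.path.join("/mnt/xcpu", node)
    # Reading clone is assumed to allocate a fresh session and
    # return its id, as the Plan 9 clone idiom usually does.
    with open(os.path.join(base, "clone")) as f:
        session = f.read().strip()
    sdir = os.path.join(base, session)
    # Push the binary to the node, then ask the node to run it.
    with open(os.path.join(sdir, "exec"), "wb") as dst:
        with open(binary, "rb") as src:
            dst.write(src.read())
    with open(os.path.join(sdir, "ctl"), "w") as ctl:
        ctl.write("exec " + " ".join(args) + "\n")
    return sdir

# e.g. with whatever node list the scheduler handed us:
sessions = [start(n, "./a.out", ["a.out"]) for n in ("node0", "node1")]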

> 
> If you look at how most clusters are used today, they closely resemble 
> the batch world of the 1960s. It is actually kind of shocking. I 
> downloaded a JCL manual a year or two ago, and compared what JCL did to 
> what people wanted batch schedulers for clusters to do, and the 
> correspondence was a little depressing. The Data General ad said it 
> best: "Batch is a bitch".

Yeah, I've been comparing them to punch-card systems for a while now. 
Some are even almost the same size as those old machines, now that we've 
stacked them up.

MPI jobs have turned modern machines into huge monoliths that basically 
throw away the advantages of a multi-user system.  Having worked on CPlant 
for a while with Ron Brightwell over at SNL, I can say they had a design 
optimized for one process per machine: one CPU [no SMP hardware contention], 
Myrinet with Portals for RDMA and OS bypass [low overheads], no threads 
[though I was somewhat taunted with them at one point], and the yod and 
yod2 launchers for job startup.

It was unique, very interesting to work on, and not a lot of fun to
debug running code on. :)

The closest thing I've seen to this kind of design in production has to be 
Blue Gene [a very different architecture, of course, but similar in that 
it is custom-designed for a few specific purposes].


> 
> Oh yeah, if anyone has a copy of that ad (Google does not), I'd like it 
> in .pdf :-) It appeared in the late 70s IIRC.
> 
> ron
> p.s. go ahead, google JCL, and you can find very recent manuals on how 
> to use it. I will be happy to post the JCL for "sort + copy" if anyone 
> wants to see it.

Please god no!!! :)

Dave



