From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <13426df10801220925r719f8373y7519b64a7235be4a@mail.gmail.com>
Date: Tue, 22 Jan 2008 09:25:48 -0800
From: "ron minnich" <rminnich@gmail.com>
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@cse.psu.edu>
Subject: Re: [9fans] Building GCC
In-Reply-To: <b48b8006e59a15049e928ea131da9217@quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <a4e6962a0801220900t668f30bckd52ffcde019804d@mail.gmail.com>
	<b48b8006e59a15049e928ea131da9217@quanstro.net>
Topicbox-Message-UUID: 33c4550e-ead3-11e9-9d60-3106f5b1d025

On Jan 22, 2008 9:07 AM, erik quanstrom <quanstro@quanstro.net> wrote:
> > Also - some (HPC) apps that we want to run on Plan 9 have silly
> > dependencies on things like X11.  However, that gets into a different
> > topic than I think the original poster was talking about.
>
> they're running X on blue gene?  that's mad.

So here's a true story. My team at LANL built an incredibly
lightweight Linux environment for clustering. We could boot 1024 nodes in
2.5 minutes from power off -- less time than it takes most BIOSes to
exit POST. About 2 minutes of that time was Linux saying "look what
hardware I just found" and sleeping on device polling. That scaled
well to large systems -- 2048 nodes took about the same time, since we
used tree-spawn and other nice tricks, such as storing the Myrinet
routes in CMOS so you didn't have to reconfigure the net each and
every time you booted. The compute nodes had one daemon.

You could start a 16 MB MPI image in 2-3 seconds on 1024 nodes, about
the same on 2048, since tree spawn is your friend. The scheduler would
schedule arbitrary groups of nodes in seconds.
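
For anyone wondering why 2048 nodes cost about the same as 1024, here
is a rough sketch of the arithmetic behind tree spawn. It is only an
illustration under an assumed model, not our actual code: the function
name spawn_rounds and the fanout parameter are made up for this example,
and the real fanout we used is not stated here.

```python
def spawn_rounds(nodes: int, fanout: int = 1) -> int:
    """Rounds needed for a tree spawn to reach `nodes` nodes.

    Assumed model (hypothetical, for illustration only): one node starts
    with the image, and in every round each node that already has it
    forwards it to `fanout` nodes that do not.
    """
    if nodes < 1 or fanout < 1:
        raise ValueError("nodes and fanout must both be >= 1")
    reached = 1   # nodes holding the image so far
    rounds = 0
    while reached < nodes:
        reached += reached * fanout   # every holder starts `fanout` more
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (1024, 2048):
        print(n, spawn_rounds(n, fanout=1), spawn_rounds(n, fanout=2))
```

Under that model, going from 1024 to 2048 nodes adds one round with a
fanout of 1 (10 rounds versus 11) and none with a fanout of 2 (7 rounds
either way), which is why doubling the machine barely moves the
startup time.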

This all worked. It's used around the world today, even though our
last release was in 2004. It is being turned off, at LANL, in part
because a number of users wish to run xterms and xemacs and similar
apps on a *compute* node. Oh, and because people need Python now, of
course.

So, yes, I expect to see people demanding X11 apps on cluster nodes.
The problem is that all the development nowadays is on the Linux
desktop, and people just expect that complete desktop to be there on
each and every cluster node. It's hard to get them to understand that
there is a performance cost to this idea -- or, they just don't care.

ron