From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <13426df10801220925r719f8373y7519b64a7235be4a@mail.gmail.com>
Date: Tue, 22 Jan 2008 09:25:48 -0800
From: "ron minnich"
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@cse.psu.edu>
Subject: Re: [9fans] Building GCC
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References:
Topicbox-Message-UUID: 33c4550e-ead3-11e9-9d60-3106f5b1d025

On Jan 22, 2008 9:07 AM, erik quanstrom wrote:
> > Also - some (HPC) apps that we want to run on Plan 9 have silly
> > dependencies on things like X11. However, that gets into a different
> > topic than I think the original poster was talking about.
>
> they're running X on blue gene? that's mad.

So here's a true story.

My team at LANL built an incredibly light weight linux environment for
clustering. We could boot 1024 nodes in 2.5 minutes from power off --
less time than it takes most BIOSes to exit POST. About 2 minutes of
that time was Linux saying "look what hardware I just found" and
sleeping on device polling. That scaled well to large systems -- 2048
nodes took about the same time, since we used tree-spawn and other nice
tricks, such as storing the Myrinet routes in CMOS so you didn't have
to reconfigure the net each and every time you booted.

The compute nodes had one daemon. You could start a 16 MB MPI image in
2-3 seconds on 1024 nodes, about the same on 2048, since tree spawn is
your friend. The scheduler would schedule arbitrary groups of nodes in
seconds.

This all worked. It's used around the world today, even though our last
release was 2004.

It is being turned off, at LANL, in part because a number of users wish
to run xterms and xemacs and similar apps on a *compute* node. Oh, and
because people need Python now, of course.

So, yes, I expect to see people demanding x11 apps on cluster nodes.
The problem is that all the development nowadays is on the linux
desktop, and people just expect that complete desktop to be there on
each and every cluster node. It's hard to get them to understand that
there is a performance cost to this idea -- or, they just don't care.

ron