From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 9 Jun 2006 12:56:58 -0700
From: Roman Shaposhnick
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] gcc on plan9
Message-ID: <20060609195658.GF1693@submarine>
References: <44879B05.6050706@lanl.gov> <44879EC5.3050105@lanl.gov> <4487A269.40906@sun.com> <4487B351.8000808@lanl.gov>
Mime-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
In-Reply-To: <4487B351.8000808@lanl.gov>
User-Agent: Mutt/1.4.2.1i
Topicbox-Message-UUID: 64a04cac-ead1-11e9-9d60-3106f5b1d025

Thanks for a thorough explanation! It all makes sense now. I wish you
luck moving forward with this project.

Thanks,
Roman.

P.S. Keep us posted ;-)

On Wed, Jun 07, 2006 at 11:19:13PM -0600, Ronald G Minnich wrote:
> Roman Shaposhnik wrote:
> > One question that I still have, though, is what makes you think
> > that once you're done with porting gcc (big task) and porting HPC
> > apps to gcc/Plan9 (even bigger one!) they will *execute* faster
> > than they do on Linux?
>
> Excellent question.
>
> It's all about parallel performance: making sure your 1000 nodes run
> 1000 times as fast as 1 node, or, if they don't, that it's Somebody
> Else's Problem. The reason the OS can impact parallel performance
> boils down to this: OSes run housekeeping tasks at awkward times,
> which in turn interfere with parallel applications and result in
> degraded performance. (For another approach, see Cray's synchronised
> scheduler work: make all nodes schedule the app at the same time.)
>
> Imagine you have one of these lovely apps on a 1000-node cluster with
> a 5-microsecond-latency network. Let us further imagine (this stuff
> exists; see Quadrics) that you can do a broadcast/global-sum op in 5
> microseconds. After 1 millisecond, all the nodes need to talk to each
> other, and cannot proceed until they have all agreed on (say) the
> value of a computed number -- e.g.
> some sort of global sum of a variable held by each of 1000 procs.
> The generic term for this type of thing is 'global reduction' -- you
> reduce a vector to a scalar of some sort.
>
> The math is pretty easy to do, but it boils down to this: OS
> activities can interfere with, say, just one task and kill the
> parallel performance of the app, making your 1000-node app run like
> a 750-node app -- or worse.
>
> Pretend you're delayed one microsecond; do the math; it's depressing.
> A one-millisecond compute interval is a really extreme case, chosen
> for ease of illustration, but ...
>
> In the clustering world, what a lot of people do is run really heavy
> nodes in clusters -- they have stuff like cron running, if you can
> believe it! They pretty much do a full desktop install, then turn off
> a few daemons, and away they go. Some really famous companies
> actually run clusters this way -- you'd be surprised at who. So do
> some famous gov't labs.
>
> If they're lucky, interference never hits them. If they're not, they
> get less-than-ideal app performance. Then they draw a conjecture from
> the OS interference that comes with such a bad configuration: you
> can't run a cluster node with anything but a custom OS which has no
> clock interrupts and, for that matter, no ability to run more than
> one process at a time. See the compute node kernel on BG/L for one
> example, or the Catamount kernel on Red Storm. Those kernels are
> really constrained; just running one proc at a time is only part of
> the story.
>
> Here at LANL, we run pretty light cluster nodes.
>
> Here is a cluster node running xcpu (under busybox, as you can see):
>
>     1 ?  S   0:00 /bin/ash /linuxrc
>     2 ?  S   0:00 [migration/0]
>     3 ?  SN  0:00 [ksoftirqd/0]
>     4 ?  S   0:00 [watchdog/0]
>     5 ?  S   0:00 [migration/1]
>     6 ?  SN  0:00 [ksoftirqd/1]
>     7 ?  S   0:00 [watchdog/1]
>     8 ?  S   0:00 [migration/2]
>     9 ?  SN  0:00 [ksoftirqd/2]
>    10 ?  S   0:00 [watchdog/2]
>    11 ?  S   0:00 [migration/3]
>    12 ?  SN  0:00 [ksoftirqd/3]
>    13 ?  S   0:00 [watchdog/3]
>    14 ?  S<  0:00 [events/0]
>    15 ?  S<  0:00 [events/1]
>    16 ?  S<  0:00 [events/2]
>    17 ?  S<  0:00 [events/3]
>    18 ?  S<  0:00 [khelper]
>    19 ?  S<  0:00 [kthread]
>    26 ?  S<  0:00 [kblockd/0]
>    27 ?  S<  0:00 [kblockd/1]
>    28 ?  S<  0:00 [kblockd/2]
>    29 ?  S<  0:00 [kblockd/3]
>   105 ?  S   0:00 [pdflush]
>   106 ?  S   0:00 [pdflush]
>   107 ?  S   0:00 [kswapd1]
>   109 ?  S<  0:00 [aio/0]
>   108 ?  S   0:00 [kswapd0]
>   110 ?  S<  0:00 [aio/1]
>   111 ?  S<  0:00 [aio/2]
>   112 ?  S<  0:00 [aio/3]
>   697 ?  S<  0:00 [kseriod]
>   855 ?  S   0:00 xsrv -D 0 tcp!*!20001
>   857 ?  S   0:00 9pserve -u tcp!*!20001
>   864 ?  S   0:00 u9fs -a none -u root -m 65560 -p 564
>   865 ?  S   0:00 /bin/ash
>
> See how little we have running? Oh, but wait -- what's all that stuff
> in []? It's the stuff we can't turn off. Note that there is per-CPU
> stuff, and other junk. Note also that this node has been up for five
> hours, and this stuff is pretty quiet (0 run time); our nodes are the
> quietest (in the OS-interference sense) Linux nodes I have yet seen.
> But, that said, all this can hit you.
>
> And in Linux there's a lot of stuff people are finding you can't turn
> off. Lots of timers down there, lots of magic that goes on, and you
> just can't turn it off or adjust it, try as you might.
>
> Plan 9, our conjecture goes, is a small, tight kernel with lots of
> stuff moved to user mode (file systems); and we believe that the
> Plan 9 architecture is a good match for future HPC (High Performance
> Computing) systems, as typified by Red Storm and BG/L: small,
> fixed-configuration nodes with memory, network, CPU, and nothing
> else. The ability to not even have a file system on the node is a big
> plus. The ability to transparently have the file system remote or
> local puts the application in the driver's seat as to how the node is
> configured and what tradeoffs are made; the system as a whole is
> incredibly flexible.
>
> Our measurements, so far, do show that Plan 9 is "quieter" than Linux.
> A full Plan 9 desktop has less OS noise than a Linux box at the
> login prompt. This matters.
>
> But it only matters if people can run their apps. Hence our concern
> about getting gcc-based cra-- er, applications code, running.
>
> I'm not really trying to make Plan 9 look like Linux. I just want to
> run MPQC for a friend of mine :-)
>
> thanks
>
> ron
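[Editor's note: the delay arithmetic Ron alludes to ("do the math; it's
depressing") can be sketched with a toy bulk-synchronous model. This is
an illustration, not LANL data: the noise parameters below (a 0.1%
chance per node per step of a 500-microsecond daemon wakeup) are
invented, and all function names are hypothetical. The point it
demonstrates is Ron's: every step ends in a global reduction, so one
delayed node stalls all 1000.]

```python
import random

def bsp_runtime_us(nodes, steps, compute_us, reduce_us,
                   noise_prob, noise_us, rng):
    """Total runtime of a bulk-synchronous app, in microseconds.

    Each step, every node computes for compute_us, then all nodes join
    a global reduction: the step takes as long as the *slowest* node,
    so a single delayed node stalls every other node.
    """
    total = 0.0
    for _ in range(steps):
        slowest = max(
            compute_us + (noise_us if rng.random() < noise_prob else 0.0)
            for _ in range(nodes)
        )
        total += slowest + reduce_us
    return total

rng = random.Random(42)
# Ron's scenario: 1000 nodes, 1 ms compute phases, 5 us global reduction.
quiet = bsp_runtime_us(1000, 1000, 1000.0, 5.0, 0.0, 0.0, rng)
# Invented noise model: each node has a 0.1% chance per step of a 500 us
# daemon wakeup. With 1000 nodes, *some* node is hit on most steps, so
# nearly every step is stretched even though each node is rarely hit.
noisy = bsp_runtime_us(1000, 1000, 1000.0, 5.0, 0.001, 500.0, rng)

print(f"quiet: {quiet / 1e6:.3f} s, noisy: {noisy / 1e6:.3f} s")
print(f"your 1000-node app runs like a {1000 * quiet / noisy:.0f}-node app")
```

With these made-up parameters the effective node count lands in the
rough neighborhood of Ron's "750-node app" figure, which is the whole
argument for quiet nodes: the noise cost scales with the probability
that *any* node is interrupted, not with per-node overhead.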