Date: Mon, 31 Aug 2009 11:12:51 -0500
From: Eric Van Hensbergen
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] Interested in improving networking in Plan 9

On Mon, Aug 31, 2009 at 10:52 AM, erik quanstrom wrote:
>
> so plunkers like us with a few hundred machines are just "casual users"?
> i'd hate for plan 9 to become harder to use outside an hpc environment.
> it would be good to be flexible enough to support fairly degenerate
> cases like just flat files.
>

I don't disagree.  Worst case, this is a complementary platform for
large-scale deployments; best case, it's an alternative interface that
also improves the experience for the "casual" user -- I think the main
benefit here will be in establishing better mechanisms for
collaboration amongst "casual" users.  If any aspect of this makes
things more complicated, we are doing something wrong.  The whole
point of going zeroconf is to make configuration simple.

>> > i also don't know what you mean by "transient, task specific services".
>> > i can only think of things like ramfs or cdfs.  but they live in my
>> > namespace so ndb doesn't enter into the picture.
>> >
>>
>> There are the relatively mundane configuration examples of publishing
>> multiple file servers, authentication servers, and cpu servers.
>
> how many file servers and authentication servers are you running?
>

From a file server perspective on Blue Gene, a full-scale rack will
have thousands.  We're currently operating without auth (in part due
to configuration issues), so I don't know how well it will scale.

The other aspect here is that in current configurations, every "run"
has a different machine configuration based on what you request from
the job scheduler and what you actually get.  We pretty much get
different IP addresses every time, with different front ends,
different file servers, and so on.

Again, though, the idea is to use file systems more pervasively within
the applications as well -- so there may be multiple file servers per
node providing different services depending on workload needs at a
particular point in the computation.  Read our MTAGS paper from last
year's Supercomputing conference to get a bigger-picture view of how
we see services coming, going, migrating, and adapting to changing
application usage and failures.

         -eric
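
P.S. To make "transient, task specific services" a bit more concrete,
here's a rough, untested sketch of the kind of tiny per-task synthetic
file server I have in mind, using lib9p.  The service name, mount
point, and the "status" file are all made up for illustration; it just
posts a one-file read-only tree to /srv:

	#include <u.h>
	#include <libc.h>
	#include <fcall.h>
	#include <thread.h>
	#include <9p.h>

	/* answer reads on our status file with a canned string;
	 * a real service would report per-task state here */
	static void
	fsread(Req *r)
	{
		readstr(r, "hello from a transient task service\n");
		respond(r, nil);
	}

	static Srv fs = {
		.read = fsread,
	};

	void
	threadmain(int argc, char *argv[])
	{
		Tree *t;

		USED(argc); USED(argv);

		/* build a one-file synthetic tree: /status, read-only */
		t = alloctree(nil, nil, DMDIR|0555, nil);
		createfile(t->root, "status", nil, 0444, nil);
		fs.tree = t;

		/* post to /srv/taskfs and mount at /n/task (names arbitrary) */
		threadpostmountsrv(&fs, "taskfs", "/n/task", MREPL);
		threadexits(nil);
	}

Because it's just another 9P server, how we advertise it -- ndb,
zeroconf, or something else -- is orthogonal to writing it.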
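And on the "different IP addresses every run" point: the client side
stays equally small, since the attach step is the same whether the
address came from ndb or from the job scheduler.  A minimal sketch
(again untested), taking the server's network address as an argument
and skipping auth because we run without it today:

	#include <u.h>
	#include <libc.h>

	void
	main(int argc, char *argv[])
	{
		int fd;
		char *addr;

		if(argc != 2)
			sysfatal("usage: attach netaddr");

		/* default to 9P's well-known tcp service (564) */
		addr = netmkaddr(argv[1], "tcp", "9fs");
		fd = dial(addr, nil, nil, nil);
		if(fd < 0)
			sysfatal("dial %s: %r", addr);

		/* splice the remote file server into our namespace;
		 * an authenticated setup would use amount() instead */
		if(mount(fd, -1, "/n/remote", MREPL|MCREATE, "") < 0)
			sysfatal("mount: %r");
		exits(nil);
	}

In practice you'd do this from the process group that actually needs
the files (or let srv(4) hold the connection), since the mount only
lives in the caller's namespace.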