* [9fans] Barrelfish @ 2009-10-14 19:09 Tim Newsham 2009-10-14 19:54 ` Roman Shaposhnik 2009-10-15 18:28 ` Christopher Nielsen 0 siblings, 2 replies; 92+ messages in thread From: Tim Newsham @ 2009-10-14 19:09 UTC (permalink / raw) To: 9fans Rethinking multi-core systems as distributed heterogeneous systems. Thoughts? http://www.sigops.org/sosp/sosp09/papers/baumann-sosp09.pdf Tim Newsham http://www.thenewsh.com/~newsham/ ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 19:09 [9fans] Barrelfish Tim Newsham @ 2009-10-14 19:54 ` Roman Shaposhnik 2009-10-14 21:21 ` Tim Newsham 2009-10-14 21:36 ` [9fans] Barrelfish Eric Van Hensbergen 2009-10-15 18:28 ` Christopher Nielsen 1 sibling, 2 replies; 92+ messages in thread From: Roman Shaposhnik @ 2009-10-14 19:54 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Wed, Oct 14, 2009 at 12:09 PM, Tim Newsham <newsham@lava.net> wrote: > Rethinking multi-core systems as distributed heterogeneous > systems. Thoughts? Somehow this feels related to the work that came out of Berkeley a year or so ago. I'm still not convinced of the benefits of multiple kernels. If you are managing a couple of hundred cores, a single kernel would do just fine; once the industry is ready for a couple dozen thousand PUs, the kernel is most likely to be dispensed with anyway. Did you find any ideas there particularly engaging? Thanks, Roman. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 19:54 ` Roman Shaposhnik @ 2009-10-14 21:21 ` Tim Newsham 2009-10-14 21:33 ` Lyndon Nerenberg (VE6BBM/VE7TFX) ` (2 more replies) 2009-10-14 21:36 ` [9fans] Barrelfish Eric Van Hensbergen 1 sibling, 3 replies; 92+ messages in thread From: Tim Newsham @ 2009-10-14 21:21 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > Somehow this feels related to the work that came out of Berkeley a year > or so ago. I'm still not convinced of the benefits of multiple > kernels. If you are managing a couple of hundred cores, a single kernel > would do just fine; once the industry is ready for a couple dozen > thousand PUs, the kernel is most likely to be dispensed with anyway. I'm not familiar with the berkeley work. > Did you find any ideas there particularly engaging? I'm still digesting it. My first thoughts were that if my pc is a distributed heterogeneous computer, what lessons can it borrow from earlier work on distributed heterogeneous computing (i.e. Plan 9). I found the discussion on cache coherency, message passing and optimization to be enlightening. The fact that you may want to organize your core OS quite a bit differently depending on which model of CPU in the same family you use is kind of scary. The mention that "... the overhead of cache coherence restricts the ability to scale up to even 80 cores" is also eye-opening. If we're at approx. 8 cores today, that's only 5 yrs away (if we double cores every 1.5 yrs). > Roman. Tim Newsham http://www.thenewsh.com/~newsham/ ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 21:21 ` Tim Newsham @ 2009-10-14 21:33 ` Lyndon Nerenberg (VE6BBM/VE7TFX) 2009-10-14 21:42 ` Noah Evans 2009-10-15 1:03 ` David Leimbach 2009-10-15 1:50 ` Roman Shaposhnik 2 siblings, 1 reply; 92+ messages in thread From: Lyndon Nerenberg (VE6BBM/VE7TFX) @ 2009-10-14 21:33 UTC (permalink / raw) To: 9fans > I'm not familiar with the berkeley work. Me either. Any chance of some references to this? ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 21:33 ` Lyndon Nerenberg (VE6BBM/VE7TFX) @ 2009-10-14 21:42 ` Noah Evans 2009-10-14 21:45 ` erik quanstrom 2009-10-14 22:10 ` Eric Van Hensbergen 0 siblings, 2 replies; 92+ messages in thread From: Noah Evans @ 2009-10-14 21:42 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs http://ramp.eecs.berkeley.edu/ Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a bit different. They are consciously avoiding the networking issue as well (they've been asked to extend their messaging model to the network and have actively said they're not interested). On Wed, Oct 14, 2009 at 11:33 PM, Lyndon Nerenberg (VE6BBM/VE7TFX) <lyndon@orthanc.ca> wrote: >> I'm not familiar with the berkeley work. > > Me either. Any chance of some references to this? > > > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 21:42 ` Noah Evans @ 2009-10-14 21:45 ` erik quanstrom 2009-10-14 21:57 ` Noah Evans 2009-10-14 22:10 ` Eric Van Hensbergen 1 sibling, 1 reply; 92+ messages in thread From: erik quanstrom @ 2009-10-14 21:45 UTC (permalink / raw) To: 9fans > http://ramp.eecs.berkeley.edu/ > > Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a > bit different. They are consciously avoiding the networking issue as > well(they've been asked to extend their messaging model to the network > and have actively said they're not interested). every interconnect is a network. sometimes we don't admit it. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 21:45 ` erik quanstrom @ 2009-10-14 21:57 ` Noah Evans 0 siblings, 0 replies; 92+ messages in thread From: Noah Evans @ 2009-10-14 21:57 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Have you read the paper? I don't think you understand the difference in scope or goals here. On Wed, Oct 14, 2009 at 11:45 PM, erik quanstrom <quanstro@coraid.com> wrote: >> http://ramp.eecs.berkeley.edu/ >> >> Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a >> bit different. They are consciously avoiding the networking issue as >> well(they've been asked to extend their messaging model to the network >> and have actively said they're not interested). > > every interconnect is a network. sometimes we don't admit it. > > - erik > > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 21:42 ` Noah Evans 2009-10-14 21:45 ` erik quanstrom @ 2009-10-14 22:10 ` Eric Van Hensbergen 2009-10-14 22:21 ` Noah Evans 1 sibling, 1 reply; 92+ messages in thread From: Eric Van Hensbergen @ 2009-10-14 22:10 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Oct 14, 2009, at 3:42 PM, Noah Evans wrote: > http://ramp.eecs.berkeley.edu/ > > Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a > bit different. They are consciously avoiding the networking issue as > well(they've been asked to extend their messaging model to the network > and have actively said they're not interested). > While they may not be interested in implementing a network messaging model, they don't oppose it. Jonathan and I talked with Andrew about porting Barrelfish to Blue Gene yesterday to test some of their scalability claims at large scale. -eric ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 22:10 ` Eric Van Hensbergen @ 2009-10-14 22:21 ` Noah Evans 0 siblings, 0 replies; 92+ messages in thread From: Noah Evans @ 2009-10-14 22:21 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Do want. On Thu, Oct 15, 2009 at 12:10 AM, Eric Van Hensbergen <ericvh@gmail.com> wrote: > On Oct 14, 2009, at 3:42 PM, Noah Evans wrote: > >> http://ramp.eecs.berkeley.edu/ >> >> Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a >> bit different. They are consciously avoiding the networking issue as >> well(they've been asked to extend their messaging model to the network >> and have actively said they're not interested). >> > > While they may not be interested in implementing a network messaging model, > they don't oppose it. Jonathan and I talked with Andrew about porting > Barrelfish to Blue Gene yesterday to test some of their scalability claims > at large scale. > > -eric > > > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 21:21 ` Tim Newsham 2009-10-14 21:33 ` Lyndon Nerenberg (VE6BBM/VE7TFX) @ 2009-10-15 1:03 ` David Leimbach 2009-10-15 1:50 ` Roman Shaposhnik 2 siblings, 0 replies; 92+ messages in thread From: David Leimbach @ 2009-10-15 1:03 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs >> Did you find any ideas there particularly engaging? > I'm still digesting it. My first thoughts were that if my pc is a distributed heterogeneous computer, what lessons can it borrow from earlier work on distributed heterogeneous computing (i.e. Plan 9). > I found the discussion on cache coherency, message passing and optimization to be enlightening. The fact that you may want to organize your core OS quite a bit differently depending on which model of CPU in the same family you use is kind of scary. > The mention that "... the overhead of cache coherence restricts the ability to scale up to even 80 cores" is also eye-opening. If we're at approx. 8 cores today, that's only 5 yrs away (if we double cores every 1.5 yrs). I personally thought the use of DSLs built on Haskell was rather clever, but the other discoveries are the sort of feedback I suspect our CPU vendors aren't going to think about on their own somehow :-) >> Roman. > Tim Newsham > http://www.thenewsh.com/~newsham/ ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 21:21 ` Tim Newsham 2009-10-14 21:33 ` Lyndon Nerenberg (VE6BBM/VE7TFX) 2009-10-15 1:03 ` David Leimbach @ 2009-10-15 1:50 ` Roman Shaposhnik 2009-10-15 2:12 ` Eric Van Hensbergen 2009-10-15 10:53 ` Sam Watkins 2 siblings, 2 replies; 92+ messages in thread From: Roman Shaposhnik @ 2009-10-15 1:50 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Wed, Oct 14, 2009 at 2:21 PM, Tim Newsham <newsham@lava.net> wrote: > I'm not familiar with the berkeley work. Sorry I can't readily find the paper (the URL is somewhere on IMAP @Sun :-() But it got presented at the Berkeley ParLab overview given to us by Dave Patterson. They were talking thin hypervisors and that sort of stuff. More details here: http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-23.pdf (page 10) but still no original paper in sight... > I'm still digesting it. My first thoughts were that if my pc is a > distributed heterogeneous computer It may very well be that, but why would you want to manage that complexity? Your GPU is already heavily "multicore", yet you don't see it (and you really don't want to see it!) > The mention that "... the overhead of cache coherence restricts the ability > to scale up to even 80 cores" is also eye-opening. If we're at approx. 8 > cores today, that's only 5 yrs away (if we double cores every > 1.5 yrs). A couple of years ago we had a Plan9 summit @Google campus and Ken was there. I still remember the question he asked me: what exactly would you make all those cores do on your desktop? Frankly, I still don't have a good answer. Thanks, Roman. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 1:50 ` Roman Shaposhnik @ 2009-10-15 2:12 ` Eric Van Hensbergen 2009-10-15 10:53 ` Sam Watkins 1 sibling, 0 replies; 92+ messages in thread From: Eric Van Hensbergen @ 2009-10-15 2:12 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Oct 14, 2009, at 7:50 PM, Roman Shaposhnik wrote: > On Wed, Oct 14, 2009 at 2:21 PM, Tim Newsham <newsham@lava.net> wrote: >> I'm not familiar with the berkeley work. > > Sorry I can't readily find the paper (the URL is somewhere on IMAP > @Sun :-() > But it got presented at the Berkeley ParLab overview given to us by > Dave Patterson. > They were talking thin hypervisors and that sort of stuff. More > details here: > http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-23.pdf > (page 10) but still no original paper in sight... You may be thinking about the Tessellation work from Berkeley ParLab (http://parlab.eecs.berkeley.edu/publication/221) -eric ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 1:50 ` Roman Shaposhnik 2009-10-15 2:12 ` Eric Van Hensbergen @ 2009-10-15 10:53 ` Sam Watkins 2009-10-15 11:50 ` Richard Miller ` (3 more replies) 1 sibling, 4 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-15 10:53 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Wed, Oct 14, 2009 at 06:50:28PM -0700, Roman Shaposhnik wrote: > > The mention that "... the overhead of cache coherence restricts the ability > > to scale up to even 80 cores" is also eye-opening. If we're at approx. 8 > > cores today, that's only 5 yrs away (if we double cores every > > 1.5 yrs). Sharing the memory between processes is a stupid approach to multi-processing / multi-threading. Modern popular computer architecture and software design is fairly much uniformly stupid. > A couple of years ago we had a Plan9 summit @Google campus and Ken was > there. I still remember the question he asked me: what exactly would you make > all those cores do on your desktop? It's easy to write good code that will take advantage of arbitrarily many processors to run faster / smoother, if you have a proper language for the task. With respect to Ken, Bill Gates said something along the lines of "who would need more than 640K?". There is a vast range of applications that cannot be managed in real time using existing single-core technology. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 10:53 ` Sam Watkins @ 2009-10-15 11:50 ` Richard Miller 2009-10-15 12:00 ` W B Hacker 2009-10-16 17:03 ` Sam Watkins 2009-10-15 11:56 ` Josh Wood ` (2 subsequent siblings) 3 siblings, 2 replies; 92+ messages in thread From: Richard Miller @ 2009-10-15 11:50 UTC (permalink / raw) To: 9fans > It's easy to write good code that will take advantage of arbitrarily many > processors to run faster / smoother, if you have a proper language for the > task. ... and if you can find a way around Amdahl's law (qv). ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 11:50 ` Richard Miller @ 2009-10-15 12:00 ` W B Hacker 2009-10-16 17:03 ` Sam Watkins 1 sibling, 0 replies; 92+ messages in thread From: W B Hacker @ 2009-10-15 12:00 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Richard Miller wrote: >> It's easy to write good code that will take advantage of arbitrarily many >> processors to run faster / smoother, if you have a proper language for the >> task. > > ... and if you can find a way around Amdahl's law (qv). > > > http://www.cis.temple.edu/~shi/docs/amdahl/amdahl.html ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 11:50 ` Richard Miller 2009-10-15 12:00 ` W B Hacker @ 2009-10-16 17:03 ` Sam Watkins 2009-10-16 18:17 ` ron minnich 2009-10-17 12:42 ` Roman Shaposhnik 1 sibling, 2 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-16 17:03 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Thu, Oct 15, 2009 at 12:50:48PM +0100, Richard Miller wrote: > > It's easy to write good code that will take advantage of arbitrarily many > > processors to run faster / smoother, if you have a proper language for the > > task. > > ... and if you can find a way around Amdahl's law (qv). "The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program." So it would only be a problem supposing that a significant part of the program is unparallelizable. I can think of many many tasks where "Amdahl's law" is not going to be a problem at all, for a properly designed system. For example if I had a thousand processors I might raytrace complex scenes for an animated game at 100 fps, or do complex dsp over a 2 hour audio track in one millisecond. I suppose most difficult/interesting tasks can be parallelized effectively. Seems that Amdahl's law is a minor issue. Of course if you are trying to run old-fashioned sequential programs on a parallel machine you will not benefit. You would need to rewrite them. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
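Sam's claim is easiest to picture for embarrassingly parallel work such as per-pixel rendering, where every output element is independent. A minimal sketch in Go follows; the shade function is a made-up stand-in for real per-pixel work, not anything from the thread or the paper:

    // Embarrassingly parallel map over "pixels": each element is
    // independent, so the loop splits cleanly across cores.
    package main

    import (
        "fmt"
        "runtime"
        "sync"
    )

    // shade is a hypothetical stand-in for per-pixel work (e.g. tracing a ray).
    func shade(i int) float64 {
        v := float64(i)
        for n := 0; n < 1000; n++ { // burn some CPU per pixel
            v = v*0.999 + 0.001
        }
        return v
    }

    func main() {
        const pixels = 1 << 20
        out := make([]float64, pixels)
        workers := runtime.NumCPU()
        chunk := (pixels + workers - 1) / workers
        var wg sync.WaitGroup
        for w := 0; w < workers; w++ {
            lo, hi := w*chunk, (w+1)*chunk
            if hi > pixels {
                hi = pixels
            }
            wg.Add(1)
            go func(lo, hi int) { // one worker per core
                defer wg.Done()
                for i := lo; i < hi; i++ {
                    out[i] = shade(i)
                }
            }(lo, hi)
        }
        wg.Wait()
        fmt.Println("rendered", pixels, "pixels; sample:", out[0])
    }

The per-pixel loop scales with worker count; the serial parts (allocation, assembling and emitting the frame) are exactly where Richard's Amdahl's-law objection re-enters.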
* Re: [9fans] Barrelfish 2009-10-16 17:03 ` Sam Watkins @ 2009-10-16 18:17 ` ron minnich 2009-10-16 18:39 ` Wes Kussmaul 2009-10-17 12:42 ` Roman Shaposhnik 1 sibling, 1 reply; 92+ messages in thread From: ron minnich @ 2009-10-16 18:17 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Fri, Oct 16, 2009 at 10:03 AM, Sam Watkins <sam@nipl.net> wrote: > So it would only be a problem supposing that a significant part of the program > is unparallelizable. I can think of many many tasks where "Amdahl's law" is > not going to be a problem at all, for a properly designed system. > > For example if I had a thousand processors I might raytrace complex scenes for > an animated game at 100 fps, or do complex dsp over a 2 hour audio track in one > millisecond. Yes, if you had that few processors, it might seem easy. It gets somewhat harder when you have, say, 128,000. Insignificant bits of code that were not even visible suddenly dominate the time. ron ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-16 18:17 ` ron minnich @ 2009-10-16 18:39 ` Wes Kussmaul 0 siblings, 0 replies; 92+ messages in thread From: Wes Kussmaul @ 2009-10-16 18:39 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs ron minnich wrote: > Insignificant > bits of code that were not even visible suddenly dominate the time. Reminds me of some project development teams. Maybe Marvin Minsky was on to something. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-16 17:03 ` Sam Watkins 2009-10-16 18:17 ` ron minnich @ 2009-10-17 12:42 ` Roman Shaposhnik 1 sibling, 0 replies; 92+ messages in thread From: Roman Shaposhnik @ 2009-10-17 12:42 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Fri, Oct 16, 2009 at 10:03 AM, Sam Watkins <sam@nipl.net> wrote: > On Thu, Oct 15, 2009 at 12:50:48PM +0100, Richard Miller wrote: >> > It's easy to write good code that will take advantage of arbitrarily many >> > processors to run faster / smoother, if you have a proper language for the >> > task. >> >> ... and if you can find a way around Amdahl's law (qv). > > "The speedup of a program using multiple processors in parallel computing is > limited by the time needed for the sequential fraction of the program." > > So it would only be a problem supposing that a significant part of the program > is unparallelizable. I can think of many many tasks where "Amdahl's law" is > not going to be a problem at all, for a properly designed system. Let's do a little math, shall we? Better yet, let's graph it: http://en.wikipedia.org/wiki/File:AmdahlsLaw.svg Now, do you see what's on the right side of the X axis? That's right: 65536 cores. Pause and appreciate the measliness of the speedup... Thanks, Roman. ^ permalink raw reply [flat|nested] 92+ messages in thread
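Roman's point is easy to reproduce numerically. A minimal sketch in Go of Amdahl's formula, speedup(N) = 1/((1 - p) + p/N) for parallel fraction p; the program is illustrative, not from the thread:

    // Amdahl's law: the serial fraction (1-p) caps the speedup at
    // 1/(1-p) no matter how many processors N you add.
    package main

    import "fmt"

    func speedup(p float64, n int) float64 {
        return 1 / ((1 - p) + p/float64(n))
    }

    func main() {
        for _, p := range []float64{0.50, 0.90, 0.95, 0.99} {
            fmt.Printf("p=%.2f: N=80 -> %6.1fx, N=65536 -> %7.1fx\n",
                p, speedup(p, 80), speedup(p, 65536))
        }
    }

Even with 95% of the program parallelizable, 65536 cores deliver just under a 20x speedup, which is the measliness the graph shows.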
* Re: [9fans] Barrelfish 2009-10-15 10:53 ` Sam Watkins 2009-10-15 11:50 ` Richard Miller @ 2009-10-15 11:56 ` Josh Wood 2009-10-15 13:11 ` hiro 2009-10-18 1:15 ` Roman Shaposhnik 3 siblings, 0 replies; 92+ messages in thread From: Josh Wood @ 2009-10-15 11:56 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Oct 15, 2009, at 3:53 AM, Sam Watkins wrote: > With respect to Ken, Bill Gates said something along the lines of "who > would need more than 640K?" With respect to Ken, from Roman's report, you only know that he asked a question. Roman was the one without an answer, and no one echoed Gates's arbitrary limit. -Josh ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 10:53 ` Sam Watkins 2009-10-15 11:50 ` Richard Miller 2009-10-15 11:56 ` Josh Wood @ 2009-10-15 13:11 ` hiro 2009-10-15 15:05 ` David Leimbach 2009-10-18 1:15 ` Roman Shaposhnik 3 siblings, 1 reply; 92+ messages in thread From: hiro @ 2009-10-15 13:11 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > There is a vast range of applications that cannot > be managed in real time using existing single-core technology. I'm sorry to interrupt your discussion, but what is real time? ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 13:11 ` hiro @ 2009-10-15 15:05 ` David Leimbach 0 siblings, 0 replies; 92+ messages in thread From: David Leimbach @ 2009-10-15 15:05 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Thu, Oct 15, 2009 at 6:11 AM, hiro <23hiro@googlemail.com> wrote: > > There is a vast range of applications that cannot > > be managed in real time using existing single-core technology. > > I'm sorry to interrupt your discussion, but what is real time? > > Real time just means "fast enough to work properly". You can throw all kinds of other crap on top of that and say things about scheduling requirements and timeslices within which a process must complete, and duty cycles, but those are just things to look at to figure out if your system is "fast enough to work properly". Dave ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 10:53 ` Sam Watkins ` (2 preceding siblings ...) 2009-10-15 13:11 ` hiro @ 2009-10-18 1:15 ` Roman Shaposhnik 2009-10-18 3:15 ` Bakul Shah 3 siblings, 1 reply; 92+ messages in thread From: Roman Shaposhnik @ 2009-10-18 1:15 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Thu, Oct 15, 2009 at 10:53 AM, Sam Watkins <sam@nipl.net> wrote: > On Wed, Oct 14, 2009 at 06:50:28PM -0700, Roman Shaposhnik wrote: >> > The mention that "... the overhead of cache coherence restricts the ability >> > to scale up to even 80 cores" is also eye-opening. If we're at approx. 8 >> > cores today, that's only 5 yrs away (if we double cores every >> > 1.5 yrs). > > Sharing the memory between processes is a stupid approach to multi-processing / > multi-threading. Modern popular computer architecture and software design is > fairly much uniformly stupid. It is. But what's your proposal on code sharing? All those PC registers belonging to different cores have to point somewhere. If that somewhere is not shared memory, the code has to be put there for every single core, right? Thanks, Roman. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-18 1:15 ` Roman Shaposhnik @ 2009-10-18 3:15 ` Bakul Shah [not found] ` <e763acc10910180606q1312ff7cw9a465d6af39c0fbe@mail.gmail.com> 0 siblings, 1 reply; 92+ messages in thread From: Bakul Shah @ 2009-10-18 3:15 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Sun, 18 Oct 2009 01:15:45 -0000 Roman Shaposhnik <roman@shaposhnik.org> wrote: > On Thu, Oct 15, 2009 at 10:53 AM, Sam Watkins <sam@nipl.net> wrote: > > On Wed, Oct 14, 2009 at 06:50:28PM -0700, Roman Shaposhnik wrote: > >> > The mention that "... the overhead of cache coherence restricts the ability > >> > to scale up to even 80 cores" is also eye-opening. If we're at approx. 8 > >> > cores today, that's only 5 yrs away (if we double cores every > >> > 1.5 yrs). > > > > Sharing the memory between processes is a stupid approach to multi-processing / > > multi-threading. Modern popular computer architecture and software design is > > fairly much uniformly stupid. > It is. But what's your proposal on code sharing? All those PC > registers belonging to > different cores have to point somewhere. If that somewhere is not shared memory, > the code has to be put there for every single core, right? Different technologies/techniques make sense at different levels of scaling and at different points in time, so sharing memory is not necessarily stupid -- unless one thinks that any compromise (to produce usable solutions in a realistic time frame) is stupid. At the hardware level we do have message passing between a processor and the memory controller -- this is exactly the same as talking to a shared server and has the same issues of scaling etc. If you have very few clients, a single shared server is indeed a cost effective solution. When you absolutely have to share state, somebody has to mediate access to the shared state and you can't get around the fact that it's going to cost you. But if you know something about the patterns of sharing, you can get away from a single shared memory & increase concurrency. A simple example is a h/w fifo (to connect a producer and a consumer, though you also give up some flexibility). As the number of processors increases on a device, sharing state between neighbors will be increasingly cheaper compared to any global sharing. Even if you use message passing, messages between near neighbors will be far cheaper than between processors in different neighborhoods. So switching to message passing is not going to fix things; you have to worry about placement as well (just like in h/w design). ^ permalink raw reply [flat|nested] 92+ messages in thread
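Bakul's h/w FIFO example has a direct software analogue: a bounded queue between one producer and one consumer, where the two sides share nothing but the queue and synchronize only when it is full or empty. A minimal sketch in Go, with the buffer depth chosen arbitrarily:

    // A bounded FIFO between one producer and one consumer, in the
    // spirit of a hardware FIFO connecting two neighboring units.
    package main

    import "fmt"

    func main() {
        fifo := make(chan int, 64) // depth 64, like a h/w FIFO
        done := make(chan struct{})

        go func() { // consumer
            sum := 0
            for v := range fifo {
                sum += v
            }
            fmt.Println("sum:", sum)
            close(done)
        }()

        for i := 0; i < 1000; i++ { // producer
            fifo <- i // blocks only when the FIFO is full
        }
        close(fifo)
        <-done
    }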
[parent not found: <e763acc10910180606q1312ff7cw9a465d6af39c0fbe@mail.gmail.com>]
* Re: [9fans] Barrelfish [not found] ` <e763acc10910180606q1312ff7cw9a465d6af39c0fbe@mail.gmail.com> @ 2009-10-18 13:22 ` Roman Shaposhnik 2009-10-18 19:18 ` Bakul Shah 0 siblings, 1 reply; 92+ messages in thread From: Roman Shaposhnik @ 2009-10-18 13:22 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Sun, Oct 18, 2009 at 6:06 AM, Roman Shaposhnik <shaposhnik@gmail.com> wrote: >> It is. But what's your proposal on code sharing? All those PC >> registers belonging to >> different cores have to point somewhere. If that somewhere is not shared memory, >> the code has to be put there for every single core, right? > > At the hardware level we do have message passing between a > processor and the memory controller -- this is exactly the > same as talking to a shared server and has the same issues of > scaling etc. If you have very few clients, a single shared > server is indeed a cost effective solution. I guess I'm not following. My question to the OP was strictly about code sharing. Basically, where do the cores get instructions from if not from shared memory? Thanks, Roman. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-18 13:22 ` Roman Shaposhnik @ 2009-10-18 19:18 ` Bakul Shah 2009-10-18 20:12 ` ron minnich 0 siblings, 1 reply; 92+ messages in thread From: Bakul Shah @ 2009-10-18 19:18 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Sun, 18 Oct 2009 06:22:33 PDT Roman Shaposhnik <roman@shaposhnik.org> wrote: > On Sun, Oct 18, 2009 at 6:06 AM, Roman Shaposhnik <shaposhnik@gmail.com> wrote: > >> It is. But what's your proposal on code sharing? All those PC > >> registers belonging to > >> different cores have to point somewhere. If that somewhere is not shared memory, > >> the code has to be put there for every single core, right? > > > > At the hardware level we do have message passing between a > > processor and the memory controller -- this is exactly the > > same as talking to a shared server and has the same issues of > > scaling etc. If you have very few clients, a single shared > > server is indeed a cost effective solution. > > I guess I'm not following. My question to the OP was strictly about > code sharing. Basically, where do the cores get instructions from > if not from shared memory? Sorry, I should've done a better job of editing. I was really responding to the OP's point that sharing memory between processes is a stupid approach. My point was that "sharing memory" is just a low-level programming interface (implemented by message passing in h/w) and it makes sense at some scale. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-18 19:18 ` Bakul Shah @ 2009-10-18 20:12 ` ron minnich 2009-10-20 0:04 ` [9fans] Parallelism is over a barrel(fish)? Lyndon Nerenberg (VE6BBM/VE7TFX) 0 siblings, 1 reply; 92+ messages in thread From: ron minnich @ 2009-10-18 20:12 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Since we seem to be having the parallel programming discussion again please look at this: https://asc.llnl.gov/sequoia/benchmarks/ The summaries are interesting. ron ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Parallelism is over a barrel(fish)? 2009-10-18 20:12 ` ron minnich @ 2009-10-20 0:04 ` Lyndon Nerenberg (VE6BBM/VE7TFX) 2009-10-20 1:11 ` W B Hacker 0 siblings, 1 reply; 92+ messages in thread From: Lyndon Nerenberg (VE6BBM/VE7TFX) @ 2009-10-20 0:04 UTC (permalink / raw) To: 9fans From last week's ACM Technews ... Why Desktop Multiprocessing Has Speed Limits Computerworld (10/05/09) Vol. 43, No. 30, P. 24; Wood, Lamont Despite the mainstreaming of multicore processors for desktops, not every desktop application can be rewritten for multicore frameworks, which means some bottlenecks will persist. "If you have a task that cannot be parallelized and you are currently on a plateau of performance in a single-processor environment, you will not see that task getting significantly faster in the future," says analyst Tom Halfhill. Adobe Systems' Russell Williams points out that performance does not scale linearly even with parallelization on account of memory bandwidth issues and delays dictated by interprocessor communications. Analyst Jim Turley says that, overall, consumer operating systems "don't do anything smart" with multicore architecture. "We have to reinvent computing, and get away from the fundamental premises we inherited from von Neumann," says Microsoft technical fellow Burton Smith. "He assumed one instruction would be executed at a time, and we are no longer even maintaining the appearance of one instruction at a time." Analyst Rob Enderle notes that most applications will operate on only a single core, which means that the benefits of a multicore architecture only come when multiple applications are run. "What we'd all like is a magic compiler that takes yesterday's source code and spreads it across multiple cores, and that is just not happening," says Turley. Despite the performance issues, vendors prefer multicore processors because they can facilitate a higher level of power efficiency. "Using multiple cores will let us get more performance while staying within the power envelope," says Acer's Glenn Jystad. http://www.computerworld.com/s/article/342870/The_Desktop_Traffic_Jam?intsrc=print_latest ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Parallelism is over a barrel(fish)? 2009-10-20 0:04 ` [9fans] Parallelism is over a barrel(fish)? Lyndon Nerenberg (VE6BBM/VE7TFX) @ 2009-10-20 1:11 ` W B Hacker 0 siblings, 0 replies; 92+ messages in thread From: W B Hacker @ 2009-10-20 1:11 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Lyndon Nerenberg (VE6BBM/VE7TFX) wrote: > From last week's ACM Technews ... > > Why Desktop Multiprocessing Has Speed Limits > Computerworld (10/05/09) Vol. 43, No. 30, P. 24; Wood, Lamont > > Despite the mainstreaming of multicore processors for desktops, not > every desktop application can be rewritten for multicore frameworks, > which means some bottlenecks will persist. "If you have a task that > cannot be parallelized and you are currently on a plateau of > performance in a single-processor environment, you will not see that > task getting significantly faster in the future," says analyst Tom > Halfhill. Adobe Systems' Russell Williams points out that performance > does not scale linearly even with parallelization on account of memory > bandwidth issues and delays dictated by interprocessor communications. > Analyst Jim Turley says that, overall, consumer operating systems > "don't do anything smart" with multicore architecture. "We have to > reinvent computing, and get away from the fundamental premises we > inherited from von Neumann," says Microsoft technical fellow Burton > Smith. "He assumed one instruction would be executed at a time, and > we are no longer even maintaining the appearance of one instruction at > a time." Analyst Rob Enderle notes that most applications will operate > on only a single core, which means that the benefits of a multicore > architecture only come when multiple applications are run. "What we'd > all like is a magic compiler that takes yesterday's source code and > spreads it across multiple cores, and that is just not happening," > says Turley. Despite the performance issues, vendors prefer multicore > processors because they can facilitate a higher level of power > efficiency. "Using multiple cores will let us get more performance > while staying within the power envelope," says Acer's Glenn Jystad. > > http://www.computerworld.com/s/article/342870/The_Desktop_Traffic_Jam?intsrc=print_latest > > > 'The usual <talking through their anatomy> suspects'.... But they miss the point. Work will always be found for 'faster' devices, but the majority of the 'needed' benefit has been accomplished until entirely new challenges surface. Computer games and digital video special-effects are just candy. Even a dual-core allows moving the overhead, housekeeping, I/O, interrupt servicing, et al out of the way of a single-core-bound application. OS/2 Hybrid Multi-Processing - even with unmodified Win 3X era apps. Beyond that it matters little. Given a 'decent' (not magical)[1] OS, and environment, the apps that actually *matter* to 99% of the population are more than fast enough on the likes of a VIA C6 --> nano/Geode/Atom [2], embedded Ppc [3], or even an ARMish single-core [4] - with or without DSP etc. on-substrate. Faster storage and networks now matter far more than faster local CPU. The ratio of these 'goodies', and their benefits to the population in general to the count of supercomputers [5] and near-real-time video-stream processors [6] is - and will remain - extremely lopsided in favor of the small 'appliance'. Those hyping multi-multi core for the consumer 'PC' market are locked into an obsolete behaviour pattern. 
Lower power consumption, smaller form-factor, better display and input interfaces, and faster networking are where the need lies. Nothing yet shipped can match the effectiveness of an experienced Wife or Personal Assistant (human) at the other end of an ordinary phone line when (s)he has *anticipated* your needs and called you *before* you recognized the need yourself. Code THAT into silicon, teach it to cook, and you still have a lousy bed-partner... Bill Hacker [1] Anything not horribly wasteful (eg - not Windows), such as Plan9, any *BSD, the leaner Linux (Vector/Slackware), Haiku - all make a more than fast enough desktop on any single-core of 700 MHz or better, even if dragging X-Windows and the like around as a boat-anchor. [2] Laptops and Netbooks [3] Embedded high-end. Game boxen, Ford and other motor cars [4] A large percentage of PDA's and telecoms handhelds [5] Devilishly hard to substitute for, SETI et al notwithstanding, but needed in relatively small numbers vs, for example, a mobile phone or automobile fuel/pollution reduction system. [6] Given the preponderance of dreck spewed from television and cinema, civilization could well be better-off if all such devices on the planet went on a long holiday and humans returned to actually paying attention to one another. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 19:54 ` Roman Shaposhnik 2009-10-14 21:21 ` Tim Newsham @ 2009-10-14 21:36 ` Eric Van Hensbergen 2009-10-15 2:05 ` Roman Shaposhnik 1 sibling, 1 reply; 92+ messages in thread From: Eric Van Hensbergen @ 2009-10-14 21:36 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs And how does one deal with heterogeneous cores and complex on-chip interconnect topologies? Barrelfish also has a nice benefit in that it could span coherence domains. There's no real evidence that single kernels do well with hundreds of real cores (as opposed to hw threads) - in fact most of the data I've seen is to the contrary. Sent from my iPhone On Oct 14, 2009, at 1:54 PM, Roman Shaposhnik <roman@shaposhnik.org> wrote: > On Wed, Oct 14, 2009 at 12:09 PM, Tim Newsham <newsham@lava.net> > wrote: >> Rethinking multi-core systems as distributed heterogeneous >> systems. Thoughts? > > Somehow this feels related to the work that came out of Berkeley a > year > or so ago. I'm still not convinced of the benefits of multiple > kernels. If you are managing a couple of hundred cores, a single kernel > would do just fine; once the industry is ready for a couple dozen > thousand PUs, the kernel is most likely to be dispensed with > anyway. > > Did you find any ideas there particularly engaging? > > Thanks, > Roman. > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 21:36 ` [9fans] Barrelfish Eric Van Hensbergen @ 2009-10-15 2:05 ` Roman Shaposhnik 2009-10-15 2:17 ` Eric Van Hensbergen 0 siblings, 1 reply; 92+ messages in thread From: Roman Shaposhnik @ 2009-10-15 2:05 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > And how does one deal with heterogeneous cores and complex on-chip > interconnect topologies? Good question. Do they have to be heterogeneous? My opinion is that the future of big multicore will be more Cell-like. > There's no real evidence that single kernels do well with hundreds of real > cores (as opposed to hw threads) - in fact most of the data I've seen is to > the contrary. Agreed. But then, again, you don't really want a kernel for anything but message passing in such an architecture (the other function of the kernel -- multiplexing I/O -- is only needed on a select few cores) at which point it really becomes a misnomer to even call it a kernel -- a thin hypervisor perhaps... Thanks, Roman. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 2:05 ` Roman Shaposhnik @ 2009-10-15 2:17 ` Eric Van Hensbergen 2009-10-15 3:32 ` Tim Newsham 0 siblings, 1 reply; 92+ messages in thread From: Eric Van Hensbergen @ 2009-10-15 2:17 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Oct 14, 2009, at 8:05 PM, Roman Shaposhnik wrote: >> And how does one deal with heterogeneous cores and complex on-chip >> interconnect topologies? > > Good question. Do they have to be heterogeneous? My opinion is that > the > future of big multicore will be more Cell-like. > They don't have to be, but that is part of both the multikernel and satellite kernel vision. >> There's no real evidence that single kernels do well with hundreds >> of real >> cores (as opposed to hw threads) - in fact most of the data I've >> seen is to >> the contrary. > > Agreed. But then, again, you don't really want a kernel for anything > but message > passing in such an architecture (the other function of the kernel -- > multiplexing > I/O -- is only needed on a select few cores) at which point it really > becomes a > misnomer to even call it a kernel -- a thin hypervisor perhaps... > If you look at the core of Barrelfish, you'll see that this is essentially what they are doing -- essentially using an extremely small microkernel (like L4) that's very efficient at various forms of message passing. That's the only thing that is duplicated on the various cores. The services themselves can be distributed and/or replicated as appropriate (although their approach favors replication) -- it all depends on the characteristics of the workload. -eric ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 2:17 ` Eric Van Hensbergen @ 2009-10-15 3:32 ` Tim Newsham 2009-10-15 3:59 ` Eric Van Hensbergen 0 siblings, 1 reply; 92+ messages in thread From: Tim Newsham @ 2009-10-15 3:32 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > If you look at the core of Barrelfish, you'll see that this is essentially > what they are doing -- essentially using an extremely small microkernel (like > L4) that's very > efficient at various forms of message passing. That's the only thing that is > duplicated on the various cores. The services themselves can be distributed > and/or replicated as appropriate (although their approach favors replication) > -- it all depends on the characteristics of the workload. it sounds like the kernel (L4-like, supposedly tuned to the specific hardware) and the "monitor" (userland, portable) are shared, from the paper. Btw, they have the source code up for free (http://www.barrelfish.org/release_20090914.html) which I suppose could be used to more definitively answer these questions with some effort... > -eric Tim Newsham http://www.thenewsh.com/~newsham/ ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 3:32 ` Tim Newsham @ 2009-10-15 3:59 ` Eric Van Hensbergen 2009-10-15 17:39 ` Tim Newsham 0 siblings, 1 reply; 92+ messages in thread From: Eric Van Hensbergen @ 2009-10-15 3:59 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Oct 14, 2009, at 9:32 PM, Tim Newsham wrote: >> If you look at the core of Barrelfish, you'll see that this is >> essentially what they are doing -- essentially using an extremely >> small microkernel (like L4) that's very >> efficient at various forms of message passing. That's the only >> thing that is duplicated on the various cores. The services >> themselves can be distributed >> and/or replicated as appropriate (although their approach favors >> replication) -- it all depends on the characteristics of the >> workload. > > it sounds like the kernel (L4-like, supposedly tuned to the specific > hardware) and the "monitor" (userland, portable) are shared, from > the paper. I'm confused what you mean by "shared". The monitor is replicated on every core as it is responsible for coordination amongst the cores - some things are replicated while others are coordinated. They do choose to replicate most things as part of their core scalability argument, in an effort to reduce lock contention to centralized resources. (from section 4.4): On each core, replicated data structures, such as memory allocation tables and address space mappings, are kept globally consistent by means of an agreement protocol run by the monitors. Application requests that access global state are handled by the monitors, which mediate access to remote copies of state. -eric ^ permalink raw reply [flat|nested] 92+ messages in thread
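The replicate-and-agree scheme in that quote can be caricatured in a few lines: give each core its own replica of a table and push every write through one sequencer, so all replicas apply the same updates in the same order. A toy sketch in Go; this is a deliberate simplification for illustration, not Barrelfish's actual agreement protocol:

    // Toy replicate-and-agree: per-core replicas stay consistent
    // because a single sequencer imposes one global update order.
    package main

    import (
        "fmt"
        "sync"
    )

    type update struct{ key, val int }

    func main() {
        const cores = 4
        seq := make(chan update)            // write requests to the sequencer
        bcast := make([]chan update, cores) // ordered stream to each replica
        for i := range bcast {
            bcast[i] = make(chan update, 16)
        }

        go func() { // sequencer: one total order for all updates
            for u := range seq {
                for _, c := range bcast {
                    c <- u
                }
            }
            for _, c := range bcast {
                close(c)
            }
        }()

        var wg sync.WaitGroup
        for i := 0; i < cores; i++ {
            wg.Add(1)
            go func(id int, in <-chan update) { // per-core "monitor"
                defer wg.Done()
                replica := map[int]int{} // this core's private copy
                for u := range in {
                    replica[u.key] = u.val
                }
                fmt.Printf("core %d: replica=%v\n", id, replica)
            }(i, bcast[i])
        }

        for k := 0; k < 3; k++ {
            seq <- update{key: k, val: k * 10}
        }
        close(seq)
        wg.Wait() // every core ends up printing the same table
    }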
* Re: [9fans] Barrelfish 2009-10-15 3:59 ` Eric Van Hensbergen @ 2009-10-15 17:39 ` Tim Newsham 0 siblings, 0 replies; 92+ messages in thread From: Tim Newsham @ 2009-10-15 17:39 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs >> it sounds like the kernel (L4-like, supposedly tuned to the specific >> hardware) and the "monitor" (userland, portable) are shared, from >> the paper. > > I'm confused what you mean by "shared". ugh, I completely botched that.. I meant "replicated" not "shared". > -eric Tim Newsham http://www.thenewsh.com/~newsham/ ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-14 19:09 [9fans] Barrelfish Tim Newsham 2009-10-14 19:54 ` Roman Shaposhnik @ 2009-10-15 18:28 ` Christopher Nielsen 2009-10-15 18:55 ` W B Hacker 1 sibling, 1 reply; 92+ messages in thread From: Christopher Nielsen @ 2009-10-15 18:28 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs I think this is an interesting approach. There are several interesting ideas being pursued here. The focus of the discussion has been on the multikernel approach, which I think has merit. Something that has not been discussed here is the wide use of DSLs for systems programming, and using Haskell to write a framework for rapidly developing and proving correctness of DSLs. This is just as significant as the multikernel ideas. I downloaded the source, built the system, and will be playing with it. Thoughts? On Wed, Oct 14, 2009 at 12:09, Tim Newsham <newsham@lava.net> wrote: > Rethinking multi-core systems as distributed heterogeneous > systems. Thoughts? > > http://www.sigops.org/sosp/sosp09/papers/baumann-sosp09.pdf > > Tim Newsham > http://www.thenewsh.com/~newsham/ > > -- Christopher Nielsen "They who can give up essential liberty for temporary safety, deserve neither liberty nor safety." --Benjamin Franklin ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 18:28 ` Christopher Nielsen @ 2009-10-15 18:55 ` W B Hacker 0 siblings, 0 replies; 92+ messages in thread From: W B Hacker @ 2009-10-15 18:55 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Christopher Nielsen wrote: > I think this is an interesting approach. > > There are several interesting ideas being pursued here. The focus of > the discussion has been on the multikernel approach, which I think has > merit. > > Something that has not been discussed here is the wide use of DSLs for > systems programming, and using Haskell to write a framework for > rapidly developing and proving correctness of DSLs. This is just as > significant as the multikernel ideas. > > I downloaded the source, built the system, and will be playing with it. > > Thoughts? Their 'plan' for security needs a recce as well. Message-passing-based 'creatures' - kernel-level or otherwise - have their own challenges in this regard (Windows, to name one bad example). Likewise, though I've only just started looking at it, if 'Haiku' even *has* a security model, I am (still, yet) blissfully unaware of it... Bill Hacker > > On Wed, Oct 14, 2009 at 12:09, Tim Newsham <newsham@lava.net> wrote: >> Rethinking multi-core systems as distributed heterogeneous >> systems. Thoughts? >> >> http://www.sigops.org/sosp/sosp09/papers/baumann-sosp09.pdf >> >> Tim Newsham >> http://www.thenewsh.com/~newsham/ >> >> ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<20091015105328.GA18947@nipl.net>]
* Re: [9fans] Barrelfish [not found] <<20091015105328.GA18947@nipl.net> @ 2009-10-15 13:27 ` erik quanstrom 2009-10-15 13:40 ` Richard Miller ` (3 more replies) 0 siblings, 4 replies; 92+ messages in thread From: erik quanstrom @ 2009-10-15 13:27 UTC (permalink / raw) To: 9fans On Thu Oct 15 06:55:24 EDT 2009, sam@nipl.net wrote: > task. With respect to Ken, Bill Gates said something along the lines of "who > would need more than 640K?". on the other hand, there were lots of people using computers with 4mb of memory when bill gates said this. it was quite easy to see how to use more than 1mb at the time. in fact, i believe i used an apple ][ around that time that had ~744k. it was a weird amount of memory. > There is a vast range of applications that cannot > be managed in real time using existing single-core technology. please name one. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 13:27 ` erik quanstrom @ 2009-10-15 13:40 ` Richard Miller 2009-10-16 17:20 ` Sam Watkins ` (2 subsequent siblings) 3 siblings, 0 replies; 92+ messages in thread From: Richard Miller @ 2009-10-15 13:40 UTC (permalink / raw) To: 9fans > in fact, i believe i used an apple ][ around > that time that had ~744k. Are you sure that was an apple II? When I bought mine I remember wrestling with the decision over whether to get the standard 48k of RAM or upgrade to the full 64k. This was long before the IBM PC. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 13:27 ` erik quanstrom 2009-10-15 13:40 ` Richard Miller @ 2009-10-16 17:20 ` Sam Watkins 2009-10-16 18:18 ` Latchesar Ionkov 2009-10-16 21:17 ` Jason Catena 1 sibling, 2 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-16 17:20 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > > There is a vast range of applications that cannot > > be managed in real time using existing single-core technology. > > please name one. Your apparent lack of imagination surprises me. Surely you can see that a whole range of applications becomes possible when using a massively parallel system, when compared to a single-CPU system. You could perhaps also achieve these applications using a large network of 1000 normal computers, but that would be expensive and use a lot of space. I named two in another post: real-time animated raytracing, and instantaneous complex DSP over a long audio track. I'll also mention instantaneous video encoding. Instantaneous building of a complex project from source. (I'm defining instantaneous as less than 1 second for this.) There are also qualitatively different applications, such as effective computer vision, which can be achieved with parallel systems. The operation of animal eyes and brains is obviously massively parallel. A 6502 CPU could achieve a lot in its day with 4000 transistors at 2 MHz. A Pentium 4 has 125 million transistors. So, with modern IC tech and excluding the necessary networking and RAM etc on the chip, one could put 31000 6502 processors on a single chip using Pentium 4 integration technology, and I suppose you could also clock it up to perhaps 1 GHz. I shouldn't have to explain how powerful something like this could be. 31000 8-bit 6502 processors running at 1 GHz, fully utilized, could achieve over 7 trillion 32-bit integer operations per second. That is over 7000 times more powerful than a Pentium 4 having the same number of transistors. We have 31000 times denser ICs today, and at least 500 times higher clock speeds, but I do not see a 15.5 million times improvement in computer performance when comparing a 6502 to a Pentium 4! That is because the Pentium 4 is a great hulking piece of crap and a waste of transistors. I could easily think of another hundred applications for parallel systems, but I'm sure that if you're intelligent enough to understand what I am saying you can think of your own examples. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
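Sam's "over 7 trillion" figure is a back-of-envelope product. A small check in Go, under generous assumptions (one 8-bit operation per cycle and four 8-bit operations per 32-bit operation; a real 6502 needs several cycles per instruction):

    // Back-of-envelope check of the 6502 throughput claim above.
    package main

    import "fmt"

    func main() {
        const (
            cores       = 31000 // ~125e6 / 4000 transistors
            hz          = 1e9   // hypothetical 1 GHz clock
            opsPer32bit = 4     // four 8-bit ops per 32-bit op (idealized)
        )
        fmt.Printf("%.2e 32-bit ops/s\n", float64(cores)*hz/opsPer32bit)
        // prints 7.75e+12, the "over 7 trillion" figure, valid only
        // under these idealized assumptions
    }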
* Re: [9fans] Barrelfish 2009-10-16 17:20 ` Sam Watkins @ 2009-10-16 18:18 ` Latchesar Ionkov 2009-10-19 15:26 ` Sam Watkins 0 siblings, 1 reply; 92+ messages in thread From: Latchesar Ionkov @ 2009-10-16 18:18 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs How do you plan to feed data to these 31 thousand processors so they can be fully utilized? Have you done the calculations and checked what memory bandwidth you would need for that? There are reasons the Pentium 4 has the performance you mention, but these reasons don't necessarily include the "great hulking piece of crap" statement. Thanks, Lucho On Fri, Oct 16, 2009 at 11:20 AM, Sam Watkins <sam@nipl.net> wrote: > I shouldn't have to explain how powerful something like this could be. 31000 > 8-bit 6502 processors running at 1 GHz, fully utilized, could achieve over 7 > trillion 32-bit integer operations per second. That is over 7000 times more > powerful than a Pentium 4 having the same number of transistors. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-16 18:18 ` Latchesar Ionkov @ 2009-10-19 15:26 ` Sam Watkins 2009-10-19 15:33 ` andrey mirtchovski 2009-10-19 15:50 ` ron minnich 0 siblings, 2 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-19 15:26 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Fri, Oct 16, 2009 at 12:18:47PM -0600, Latchesar Ionkov wrote: > How do you plan to feed data to these 31 thousand processors so they > can be fully utilized? Have you done the calculations and checked what > memory bandwidth would you need for that? I would use a pipelining + divide-and-conquer approach, with some RAM on chip. Units would be smaller than a 6502, more like an adder. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 15:26 ` Sam Watkins @ 2009-10-19 15:33 ` andrey mirtchovski 2009-10-19 15:50 ` ron minnich 1 sibling, 0 replies; 92+ messages in thread From: andrey mirtchovski @ 2009-10-19 15:33 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > I would use a pipelining + divide-and-conquer approach, with some RAM on chip. > Units would be smaller than a 6502, more like an adder. you mean like the Thinking Machines CM-1 and CM-2? it's not like it hasn't been done before :) ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 15:26 ` Sam Watkins 2009-10-19 15:33 ` andrey mirtchovski @ 2009-10-19 15:50 ` ron minnich 1 sibling, 0 replies; 92+ messages in thread From: ron minnich @ 2009-10-19 15:50 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Mon, Oct 19, 2009 at 8:26 AM, Sam Watkins <sam@nipl.net> wrote: > On Fri, Oct 16, 2009 at 12:18:47PM -0600, Latchesar Ionkov wrote: >> How do you plan to feed data to these 31 thousand processors so they >> can be fully utilized? Have you done the calculations and checked what >> memory bandwidth would you need for that? > > I would use a pipelining + divide-and-conquer approach, with some RAM on chip. > Units would be smaller than a 6502, more like an adder. I'm not convinced. Lucho just dropped a well known hard problem in your lap (one he deals with every day) but your reply sounds like handwaving. This stuff is harder than it sounds. Unless you're ready to come up with a simulation of your claim -- and it had better be a pretty good one -- I don't think anybody is going to be buying. If you're going to just have adders, for example, you're going to have to explain where the instruction sequencing happens. If there's only one sequencer, then you're going to have to explain why you have not just reinvented the CM-2 or similar MPP. Again, this stuff is quantifiable. A pipeline implies a clock rate. Divide and conquer implies fanout. Where are the numbers? ron ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-16 17:20 ` Sam Watkins 2009-10-16 18:18 ` Latchesar Ionkov @ 2009-10-16 21:17 ` Jason Catena 2009-10-17 20:58 ` Dave Eckhardt 1 sibling, 1 reply; 92+ messages in thread From: Jason Catena @ 2009-10-16 21:17 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > Instantaneous building of a complex project from source. > (I'm defining instantaneous as less than 1 second for this.) Depends on how complex. I spent two years retrofitting a commercial parallel make (which only promises a 20x speedup, even with dedicated hardware) into the build system of a telecommunications product. In retrospect, it would have taken less time to write a new build system with parallelism designed into it, but it seemed less risky to be incremental. There are a lot of dependencies in a complex project. Bundles wrap up a set of files which include executable tasks composed of libraries (linked from their own objects derived from source code) and their own source code: some hand-coded, and some derived from object-oriented models, lexical analyzers and compiler-compilers, and message-passing code generators (it can take a surprisingly long time to generate optimized C code with a functional language). Compile some of this for an ordinary unixy platform, some for any platform which supports java, some for systems without a filesystem where all code runs in the same space as the kernel. Each unit of code wants its own options; all code is expected to honor any global options; build system should not restrict porting code between platforms with different build processes (or produce any delay in the schedule at all;). All of these factors influence the build time of a project, in a complex web of dependencies, even after you write or modify all the build tools to be reentrant so you can run them all at once. The most effective build strategy I've found is avoidance: just don't build what you don't have to, and make sure you only build something once. One thing complicating this is that make and its common variants aren't smart enough to handle the case where version control systems regress a file and present an earlier date not newer than the derived object. In a nutshell, my experience is that unless developers abandon all the fancy tools that supposedly make it easier for them to write mountains of brittle, special-purpose, especially model-generated code, the tool chain created by these dependencies will defeat efforts to make it run faster in parallel. So all your extra processors will only be useful for running many of these heavy builds in parallel, as you try to have each developer build and test before integration. > Sam Jason Catena ^ permalink raw reply [flat|nested] 92+ messages in thread
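The regressed-file case Jason describes disappears if staleness is decided by content instead of by date. A minimal sketch in Go that flags sources whose hash no longer matches a recorded manifest; the manifest name and line format are invented for illustration:

    // Staleness by content hash: a source is stale if its hash
    // differs from the one recorded at the last build, regardless
    // of whether its timestamp moved forward or backward.
    package main

    import (
        "bufio"
        "crypto/sha256"
        "encoding/hex"
        "fmt"
        "os"
        "strings"
    )

    func hashFile(path string) (string, error) {
        data, err := os.ReadFile(path)
        if err != nil {
            return "", err
        }
        sum := sha256.Sum256(data)
        return hex.EncodeToString(sum[:]), nil
    }

    func main() {
        // hypothetical manifest: one "<hex-sha256> <path>" per line,
        // written out by the previous build
        f, err := os.Open("build.manifest")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        defer f.Close()

        stale := false
        sc := bufio.NewScanner(f)
        for sc.Scan() {
            parts := strings.SplitN(sc.Text(), " ", 2)
            if len(parts) != 2 {
                continue
            }
            if h, err := hashFile(parts[1]); err != nil || h != parts[0] {
                fmt.Println("stale:", parts[1])
                stale = true
            }
        }
        if !stale {
            fmt.Println("everything up to date")
        }
    }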
* Re: [9fans] Barrelfish 2009-10-16 21:17 ` Jason Catena @ 2009-10-17 20:58 ` Dave Eckhardt 2009-10-18 2:09 ` Jason Catena 0 siblings, 1 reply; 92+ messages in thread From: Dave Eckhardt @ 2009-10-17 20:58 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > One thing complicating this is that make and its common > variants aren't smart enough to handle the case where > version control systems regress a file and present an > earlier date not newer than the derived object. See cons/scons. Dave Eckhardt ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-17 20:58 ` Dave Eckhardt @ 2009-10-18 2:09 ` Jason Catena 2009-10-18 16:02 ` Dave Eckhardt 0 siblings, 1 reply; 92+ messages in thread From: Jason Catena @ 2009-10-18 2:09 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs >> One thing complicating this is that make and its common >> variants aren't smart enough to handle the case where >> version control systems regress a file and present an >> earlier date not newer than the derived object. > > See cons/scons. Thanks for the suggestion. In this project someone actually made that same suggestion, but rudely—basically insulting the very thought that someone would be stupid enough to base a build system for commercial software on make. (Right in line with gnu bias, I thought at the time: forceful and disrespectful is no way to make change happen, even if your target was previously inclined your way.) In any event it's not compatible with the speedup tool we selected. Which brings up the unnecessary additional complexity of embedding a dependency analysis and shell-command tool in a general language. Am I expected to complicate my project management tool with python, just to get it to rebuild if a file dependency's date changes at all, rather than only if the file dependency has a newer date? What's wrong with a little language these days? I don't think python needs a file system dependency analysis engine any more than make needs a full language around it. I'd rather store the date of every leaf file on the dependency tree, and in the next build delete any objects derived from a file with a different date. At least that's a consistent programming metaphor. Even the academic project managers out there don't try to mind-merge a general language. For example, Odin complicates make's syntax and execution, almost introducing a type system for targets. This makes it very tricky to generate and edit makefiles dynamically in an existing system, since (IIRC) you have to reload the whole ruleset if you change something. > Dave Eckhardt Jason Catena ^ permalink raw reply [flat|nested] 92+ messages in thread
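For concreteness, here is a minimal sketch (in Go) of the rebuild rule Jason describes: record the date of every leaf source file, and on the next build delete any derived object whose source date has changed in either direction, not merely moved forward as make assumes. The file names, the one-leaf-per-object dependency map, and the manifest format are all invented for illustration.

package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// deps maps each derived object to its leaf source file; one leaf
// per object keeps the sketch small, a real tree would fan out.
var deps = map[string]string{
	"obj/parser.o": "src/parser.c",
	"obj/lexer.o":  "src/lexer.c",
}

// recorded reads back the dates stored by the previous build.
func recorded(manifest string) map[string]int64 {
	m := make(map[string]int64)
	f, err := os.Open(manifest)
	if err != nil {
		return m // first build: nothing recorded yet
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	for s.Scan() {
		fields := strings.Fields(s.Text())
		if len(fields) != 2 {
			continue
		}
		t, _ := strconv.ParseInt(fields[1], 10, 64)
		m[fields[0]] = t
	}
	return m
}

func main() {
	old := recorded("manifest")
	out, _ := os.Create("manifest")
	defer out.Close()
	for obj, src := range deps {
		fi, err := os.Stat(src)
		if err != nil {
			continue
		}
		now := fi.ModTime().Unix()
		// A date that differs in *either* direction invalidates the
		// object: a regressed (older) source still forces a rebuild,
		// unlike make's newer-than test.
		if was, ok := old[src]; ok && was != now {
			fmt.Println("source changed, removing", obj)
			os.Remove(obj)
		}
		fmt.Fprintf(out, "%s %d\n", src, now)
	}
}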
* Re: [9fans] Barrelfish 2009-10-18 2:09 ` Jason Catena @ 2009-10-18 16:02 ` Dave Eckhardt 0 siblings, 0 replies; 92+ messages in thread From: Dave Eckhardt @ 2009-10-18 16:02 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs >> See cons/scons. > > Thanks for the suggestion. In this project someone actually > made that same suggestion, but rudely—basically insulting the > very thought that someone would be stupid enough to base a > build system for commercial software on make. The non-Plan 9 world suffers from several structural problems which have undermined make's original model. A big one is file systems with routine large clock skew, and another is the N-layers-deep (for large N) nature (build libraries to build tools to build libraries to build applications) which is considered reasonable, or at least unavoidable. Combining that last one with the absence of namespaces makes the problem truly painful in ways which (I think) stretch it outside of the make model. It's possible to "make it work" with enough thrust, but I think people who have done that once and then tried cons/scons think the change is worthwhile. Cons was written by somebody who was in charge of "strap enough thrust onto make" twice, and he wrote it to address exactly the problems he saw, so he could skip past that part at startup #3. > Am I expected to complicate my project management tool with > python, just to get it to rebuild if a file dependency's date > changes at all, rather than only if the file dependency has > a newer date? Cons and scons get you more than that. Few make systems notice when your compiler has changed out from under you. With gcc's drift rate that could be particularly valuable :-) > What's wrong with a little language these days? Personally I don't find make as typically "augmented" by m4 plus 3,000-line shell scripts to qualify as a "little language". But YMMV and this isn't a make-vs-cons list. Dave Eckhardt ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 13:27 ` erik quanstrom 2009-10-15 13:40 ` Richard Miller 2009-10-16 17:20 ` Sam Watkins @ 2009-10-17 18:45 ` Eris Discordia 2009-10-17 21:07 ` Steve Simon 2009-10-19 15:57 ` Sam Watkins [not found] ` <A90043D02D52B2CBF2804FA4@192.168.1.2> 3 siblings, 2 replies; 92+ messages in thread From: Eris Discordia @ 2009-10-17 18:45 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs >> There is a vast range of applications that cannot >> be managed in real time using existing single-core technology. > > please name one. I'm a tiny fish, this is the ocean. Nevertheless, I venture: there are already Cell-based expansion cards out there for "real-time" H.264/VC-1/MPEG-4 AVC encoding. Meaning, 1080p video in, H.264 stream out, "real-time." I can imagine a large market for this in broadcasting, netcasting, simulcasting industry. Simulcasting in particular is a prime application. Station X in Japan broadcasts a popular animated series in 1080i, while US licensor of the same content simulcasts for subscribers through its web interface. This applies all the more to live feeds. What seems to go ignored here is the class of embarrassingly parallel problems which--while they may or may not be important to CS people, I don't know--appear in many areas of applied computing. I know one person working at an institute of the Max Planck Society who regularly runs a few hundred instances of the same program (doing some sort of matrix calculation for a problem in physics) with different input. He certainly could benefit from a hundred cores inside his desktop computing platform _if_ fitting that many cores in there wouldn't cause latencies larger than the network latencies he currently experiences (at the moment he uses a job manager that controls a cluster). "INB4" criticism, his input matrices are small and his work is compute-intensive rather than memory-intensive. Another embarrassingly parallel problem, as Sam Watkins pointed out, arises in digital audio processing. I might add to his example of applying a filter to sections of one track the example of applying the same or different filters to multiple tracks at once. Multitrack editing was/is a killer application of digital audio. Multitrack video editing, too. I believe video/audio processing software were among the first applications for "workstation"-class desktops that were parallelized. By the way, I learnt about embarrassingly parallel problems from that same Max Planck research fellow who runs embarrassingly parallel matrix calculations. --On Thursday, October 15, 2009 09:27 -0400 erik quanstrom <quanstro@quanstro.net> wrote: > On Thu Oct 15 06:55:24 EDT 2009, sam@nipl.net wrote: >> task. With respect to Ken, Bill Gates said something along the lines of >> "who would need more than 640K?". > > on the other hand, there were lots of people using computers with 4mb > of memory when bill gates said this. it was quite easy to see how to use > more than 1mb at the time. in fact, i believe i used an apple ][ around > that time that had ~744k. it was a wierd amount of memory. > >> There is a vast range of applications that cannot >> be managed in real time using existing single-core technology. > > please name one. > > - erik > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-17 18:45 ` Eris Discordia @ 2009-10-17 21:07 ` Steve Simon 2009-10-17 21:18 ` Eric Van Hensbergen 2009-10-18 8:44 ` Eris Discordia 2009-10-19 15:57 ` Sam Watkins 1 sibling, 2 replies; 92+ messages in thread From: Steve Simon @ 2009-10-17 21:07 UTC (permalink / raw) To: 9fans > I'm a tiny fish, this is the ocean. Nevertheless, I venture: there are > already Cell-based expansion cards out there for "real-time" > H.264/VC-1/MPEG-4 AVC encoding. Meaning, 1080p video in, H.264 stream out, > "real-time." Interesting, 1080p? you have a link? -Steve ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-17 21:07 ` Steve Simon @ 2009-10-17 21:18 ` Eric Van Hensbergen 2009-10-18 8:48 ` Eris Discordia 2009-10-18 8:44 ` Eris Discordia 1 sibling, 1 reply; 92+ messages in thread From: Eric Van Hensbergen @ 2009-10-17 21:18 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Could be wrong, but I think he's referring to the SPURS Engine: http://en.wikipedia.org/wiki/SpursEngine -eric On Oct 17, 2009, at 4:07 PM, Steve Simon wrote: >> I'm a tiny fish, this is the ocean. Nevertheless, I venture: there >> are >> already Cell-based expansion cards out there for "real-time" >> H.264/VC-1/MPEG-4 AVC encoding. Meaning, 1080p video in, H.264 >> stream out, >> "real-time." > > Interesting, 1080p? you have a link? > > -Steve > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-17 21:18 ` Eric Van Hensbergen @ 2009-10-18 8:48 ` Eris Discordia 0 siblings, 0 replies; 92+ messages in thread From: Eris Discordia @ 2009-10-18 8:48 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > Could be wrong, but I think he's referring to the SPURS Engine: > http://en.wikipedia.org/wiki/SpursEngine I had never seen that but I had encountered news on the Leadtek card based on it. --On Saturday, October 17, 2009 16:18 -0500 Eric Van Hensbergen <ericvh@gmail.com> wrote: > Could be wrong, but I think he's referring to the SPURS Engine: > http://en.wikipedia.org/wiki/SpursEngine > > -eric > > On Oct 17, 2009, at 4:07 PM, Steve Simon wrote: > >>> I'm a tiny fish, this is the ocean. Nevertheless, I venture: there >>> are >>> already Cell-based expansion cards out there for "real-time" >>> H.264/VC-1/MPEG-4 AVC encoding. Meaning, 1080p video in, H.264 >>> stream out, >>> "real-time." >> >> Interesting, 1080p? you have a link? >> >> -Steve >> > > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-17 21:07 ` Steve Simon 2009-10-17 21:18 ` Eric Van Hensbergen @ 2009-10-18 8:44 ` Eris Discordia 1 sibling, 0 replies; 92+ messages in thread From: Eris Discordia @ 2009-10-18 8:44 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > Interesting, 1080p? you have a link? The one I read long ago: <http://www.anandtech.com/video/showdoc.aspx?i=3339> First Google "sponsored link:" <http://www.haivision.com/products/mako-hd> (This one's an industrial rackmounted machine. No expansion card.) BadaBoom is just software that uses CUDA: <http://www.badaboomit.com/node/4> "Real-time" performance with CUDA can be achieved on (not-so-)recent CUDA-capable GPUs. BadaBoom did make a boom in the fansubbing community. Every group wants an "encoding officer" with either an i7 or a highly performing GPU. Custom builds of x264 (the most widely used software codec at the moment) can already take advantage of multiple cores in encoding. --On Saturday, October 17, 2009 22:07 +0100 Steve Simon <steve@quintile.net> wrote: >> I'm a tiny fish, this is the ocean. Nevertheless, I venture: there are >> already Cell-based expansion cards out there for "real-time" >> H.264/VC-1/MPEG-4 AVC encoding. Meaning, 1080p video in, H.264 stream >> out, "real-time." > > Interesting, 1080p? you have a link? > > -Steve > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-17 18:45 ` Eris Discordia 2009-10-17 21:07 ` Steve Simon @ 2009-10-19 15:57 ` Sam Watkins 2009-10-19 16:03 ` ron minnich ` (2 more replies) 1 sibling, 3 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-19 15:57 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Sat, Oct 17, 2009 at 07:45:40PM +0100, Eris Discordia wrote: > Another embarrassingly parallel problem, as Sam Watkins pointed out, arises > in digital audio processing. The pipelining + divide-and-conquer method which I would use for parallel systems is much like a series of production lines in a large factory. I calculated roughly that encoding a 2-hour video could be parallelized by a factor of perhaps 20 trillion, using pipelining and divide-and-conquer, with a longest path length of 10000 operations in series. Such a system running at 1Ghz could encode a single 2-hour video in 1/100000 second (latency), or 2 billion hours of video per second (throughput). Details of the calculation: 7200 seconds * 30fps * 12*16 (50*50 pixel chunks) * 500000 elementary arithmetic/logical operations in a pipeline (unrolled). 7200*30*12*16*500000 = 20 trillion (20,000,000,000,000) processing units. This is only a very rough estimate and does not consider all the issues. The "slow" latency of 1/100000 second to encode a video is due to Amdahl's Law, assuming a longest path of 10000 operations. The throughput of 2 billion hours of video per second would be achieved by pipelining. The throughput is not limited by Amdahl's Law, as a longer pipeline/network holds more data. Amdahl's Law gives us a lower limit for the time taken to perform a task with some serial components; but it does not limit the throughput of a pipelining system, where the throughput is simply one data unit per clock cycle. In reality, it would be hard to build such a system, and one would prefer a system with much less parallelization. However, the human brain does contain 100 billion neurons, and electronic units can be smaller than neurons. My point is, one can design systems to solve practical problems that use almost arbitrarily large numbers of processing units running in parallel. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
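To make the latency-versus-throughput distinction concrete, here is a toy model of the claim (a sketch in Go; the stage count and the per-stage operation are stand-ins, and goroutines model hardware units): a chain of depth single-operation stages imposes a latency of depth steps on the first result, but once the pipeline is full it delivers one result per step no matter how deep it is.

package main

import "fmt"

// stage is one pipeline unit: a single fixed operation, applied to
// everything that streams through it.
func stage(in <-chan int, out chan<- int) {
	for v := range in {
		out <- v + 1 // stand-in for one elementary operation
	}
	close(out)
}

func main() {
	const depth = 10000 // Sam's "longest path of 10000 operations"
	const items = 5

	in := make(chan int)
	ch := in
	for i := 0; i < depth; i++ {
		next := make(chan int)
		go stage(ch, next)
		ch = next
	}

	// Feed the pipeline; the first result pays the full depth in
	// latency, but after that one result emerges per step.
	go func() {
		for i := 0; i < items; i++ {
			in <- 0
		}
		close(in)
	}()

	for v := range ch {
		fmt.Println("result:", v) // each item prints depth (10000)
	}
}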
* Re: [9fans] Barrelfish 2009-10-19 15:57 ` Sam Watkins @ 2009-10-19 16:03 ` ron minnich 2009-10-19 16:46 ` Russ Cox 2009-10-20 2:16 ` matt 2 siblings, 0 replies; 92+ messages in thread From: ron minnich @ 2009-10-19 16:03 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Mon, Oct 19, 2009 at 8:57 AM, Sam Watkins <sam@nipl.net> wrote: > This is only a very rough estimate and does not consider all the issues. well that part is right anyway. ron ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 15:57 ` Sam Watkins 2009-10-19 16:03 ` ron minnich @ 2009-10-19 16:46 ` Russ Cox 2009-10-20 2:16 ` matt 2 siblings, 0 replies; 92+ messages in thread From: Russ Cox @ 2009-10-19 16:46 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > My point is, one can design systems to solve practical problems that use almost > arbitrarily large numbers of processing units running in parallel. design != build russ ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 15:57 ` Sam Watkins 2009-10-19 16:03 ` ron minnich 2009-10-19 16:46 ` Russ Cox @ 2009-10-20 2:16 ` matt 2009-10-20 9:15 ` Steve Simon 2009-10-21 15:43 ` Sam Watkins 2 siblings, 2 replies; 92+ messages in thread From: matt @ 2009-10-20 2:16 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Sam Watkins wrote: >I calculated roughly that encoding a 2-hour video could be parallelized by a >factor of perhaps 20 trillion, using pipelining and divide-and-conquer, with a >longest path length of 10000 operations in series. Such a system running at >1Ghz could encode a single 2-hour video in 1/100000 second (latency), or 2 >billion hours of video per second (throughput). > > > > I know you are using video / audio encoding as an example and there are probably datasets that make sense but in this case, what use is it? You can't watch 2 hours of video per second and you can't write it to disk fast enough to empty the pipeline. So you'll process all the video and then sit there keeping it powered while you wait to do something with it. I suppose you could keep filtering it. Add into that the datarate of full 10-bit uncompressed 1920x1080/60i HD is 932Mbit/s so your 1Ghz clockspeed might not be fast enough to play it :) You've got to feed in 2 hours of source material - 820Gb per stream, how? Once you have your uncompressed stream, MPEG-2 encoding requires seeking through the time dimension with keyframes every n frames and out-of-order macroblocks, so we have to wait for n frames to be composited. For the best quality the datarate is unconstrained on the first processing run and then macroblocks best-fitted and re-ordered on the second to match the desired output datarate, but again, this is n frames at a time. Amdahl is punching you in the face every time you say "see, it's easy". ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-20 2:16 ` matt @ 2009-10-20 9:15 ` Steve Simon 0 siblings, 0 replies; 92+ messages in thread From: Steve Simon @ 2009-10-20 9:15 UTC (permalink / raw) To: 9fans > Add into that the datarate of full 10-bit uncompressed 1920x1080/60i HD > is 932Mbit/s so your 1Ghz clockspeed might not be fast enough to play it :) Not sure I agree; I think it's worse than that: 1920pixels * 1080lines * 30 frames/sec * 20bits/sample in YUV => 1.244Gbps Also, if you want to encode live material you have bigger problems. Encoders have pipeline delay but this must be limited, usually to a few hundred milliseconds. This means you can only decompose the stream into a few frames which you can run on separate cpus. Spatial decomposition of the frames helps too but this is much more difficult to do well - i.e. to ensure you cannot see the joins. -Steve ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-20 2:16 ` matt 2009-10-20 9:15 ` Steve Simon @ 2009-10-21 15:43 ` Sam Watkins 2009-10-21 16:11 ` Russ Cox ` (2 more replies) 1 sibling, 3 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-21 15:43 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs I wrote: >I calculated roughly that encoding a 2-hour video could be parallelized by a >factor of perhaps 20 trillion, using pipelining and divide-and-conquer On Tue, Oct 20, 2009 at 03:16:22AM +0100, matt wrote: > I know you are using video / audio encoding as an example and there are > probably datasets that make sense but in this case, what use is it? I was using it to work out the *maximum* extent to which a common task can be parallelized. 20-trillion-fold is the answer I came up with. Someone was talking about Amdahl's Law and saying that having large numbers of processors is not much use because Amdahl's Law limits their utilization. I disagree. In reality 10,000 processing units might be a more sensible number to have than 20 trillion. If you have ever done H264 video encoding on a PC you would know that it is very slow; even normal mpeg encoding is barely faster than real time on a 1Ghz PC. Few people like having to wait 2 hours for a task to complete. This whole argument / discussion has come out of nowhere since it appears Ken's original comment was criticising the normal sort of multi-core systems, and he is more in favor of other approaches like FPGA. I fully agree with that. > You can't watch 2 hours of video per second and you can't write it to disk > fast enough to empty the pipeline. If I had a computer with 20 trillion processing units capable of recoding 2 billion hours of video per second, I would have superior storage media and IO systems to go with it. The system I described could encode 2 BILLION hours of video per second, not 2 hours per second. > You've got to feed in 2 hours of source material - 820Gb per stream, how? I suppose some sort of parallel bus of wires or optic fibres. If I have massively parallel processing I would want massively parallel IO to go with it. I.e. something like "read data starting from here" -> "here it is streaming one megabit in parallel down the bus at 1Ghz over 1 million channels" > Once you have your uncompressed stream, MPEG-2 encoding requires seeking > through the time dimension with keyframes every n frames and out-of-order > macroblocks, so we have to wait for n frames to be composited. For the best > quality the datarate is unconstrained on the first processing run and then > macroblocks best-fitted and re-ordered on the second to match the desired > output datarate, but again, this is n frames at a time. > > Amdahl is punching you in the face every time you say "see, it's easy". I'm no expert on video encoding but it seems to me you are assuming I would approach it the conventional stupid serial way. With massively parallel processing one could "seek" through the time dimension simply by comparing data from all time offsets at once in parallel. Can you give one example of a slow task that you think cannot benefit much from parallel processing? Video is an extremely obvious example of one that certainly does benefit from just about as much parallel processing as you can throw at it, so I'm surprised you would argue about it. Probably my "20 trillion" upset you or something; it seems you didn't get my point.
It might have been better to consider a simpler example, such as frequency analysis of audio data to perform pitch correction (for out-of-tune singers). I can write a simple shell script using ffmpeg to do h264 video encoding which would take advantage of perhaps 720 "cores" to encode a two-hour video in 10-second chunks with barely any Amdahl effects, running the encoding over a LAN. A server should be able to pipe the whole 800Mb input - I am assuming it is already encoded in xvid or something - over the network in about 10 seconds on a gigabit (or faster) network. Each participating computer will receive the chunk of data it needs. The encoding would take perhaps 30 seconds for the 10 seconds of video on each of 720 1Ghz computers. And another 10 seconds to pipe the data back to the server. Concatenating the video should take very little time, although perhaps the mp4 format is not the best for that, I'm not sure. The entire operation takes 50 seconds as opposed to 6 hours (21600 seconds). With my 721 computers I achieve a 432-times speedup. Amdahl is not sucking up much there, only a little for transferring data around. And each computer could be doing something else while waiting for its chunk of data to arrive; the total actual utilization can be over 99%. People do this stuff every day. Have you heard of a render-farm? This applies for all Amdahl arguments - if part of the system is idle due to serial constraints in the algorithm, it could likely be working on something else. Perhaps you have a couple of videos to recode? Then you can achieve close to 100% utilization. The time taken for a single task may be limited by the method or the hardware, but a batch of several tasks can be achieved close to N times faster if you have N processors/computers. I'm not sure why I'm wasting time writing about this; it's obvious anyway. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
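The fan-out Sam describes is easy to sketch in software. Below is a rough Go version in which goroutines stand in for the 720 machines; the "encode" command line is a hypothetical placeholder rather than a real encoder invocation, and shipping chunks over the LAN and concatenating the results are glossed over.

package main

import (
	"fmt"
	"os/exec"
	"sync"
)

func main() {
	const chunkSec = 10   // 10-second chunks, as above
	const totalSec = 7200 // a two-hour video
	const workers = 720   // goroutines stand in for the 720 machines

	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for start := range jobs {
				out := fmt.Sprintf("chunk%03d.mp4", start/chunkSec)
				// Hypothetical encoder command: seek to start,
				// encode chunkSec seconds, write one chunk.
				cmd := exec.Command("encode",
					fmt.Sprintf("-start=%d", start),
					fmt.Sprintf("-dur=%d", chunkSec),
					"-o", out, "input.avi")
				if err := cmd.Run(); err != nil {
					fmt.Println(out, "failed:", err)
				}
			}
		}()
	}

	for start := 0; start < totalSec; start += chunkSec {
		jobs <- start
	}
	close(jobs)
	wg.Wait()
	// Concatenating the 720 encoded chunks would follow here.
}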
* Re: [9fans] Barrelfish 2009-10-21 15:43 ` Sam Watkins @ 2009-10-21 16:11 ` Russ Cox 2009-10-21 16:37 ` Sam Watkins 2009-10-21 18:01 ` ron minnich 2009-10-28 15:37 ` matt 2 siblings, 1 reply; 92+ messages in thread From: Russ Cox @ 2009-10-21 16:11 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > Can you give one example of a slow task that you think cannot benefit much from > parallel processing? Rebuilding a venti index is almost entirely I/O bound. You can have as many cores as you want and they will all be sitting idle waiting for the disks. Parallel processing helps only to the extent that you can run the disks in parallel, and they're not multiplying quite as fast as processor cores. > Perhaps you have a couple of videos to recode? Then you can achieve > close to 100% utilization. http://www.dilbert.com/strips/comic/2008-12-13/ Russ ^ permalink raw reply [flat|nested] 92+ messages in thread
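Russ's point drops straight out of Amdahl's law. A small sketch (in Go; the 10% CPU-parallelizable fraction is an invented illustration, not a measurement of venti): if 90% of the job is waiting on the disks, no number of cores pushes the speedup past about 1.11x.

package main

import "fmt"

// speedup is Amdahl's bound for a job whose fraction p can be spread
// across n processors; the remaining 1-p (here, disk time) is serial.
func speedup(p, n float64) float64 {
	return 1 / ((1 - p) + p/n)
}

func main() {
	const p = 0.1 // invented: 10% CPU work, 90% waiting on disks
	for _, n := range []float64{1, 8, 64, 1 << 20} {
		fmt.Printf("cores=%8.0f speedup=%.3f\n", n, speedup(p, n))
	}
}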
* Re: [9fans] Barrelfish 2009-10-21 16:11 ` Russ Cox @ 2009-10-21 16:37 ` Sam Watkins 0 siblings, 0 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-21 16:37 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Wed, Oct 21, 2009 at 09:11:10AM -0700, Russ Cox wrote: > > Can you give one example of a slow task that you think cannot benefit much > > from parallel processing? > > Rebuilding a venti index is almost entirely I/O bound. Perhaps I should have specified a processor-bound task. I don't know much about venti or its indexes, but "rebuilding" an index sounds like a bad idea anyway. I suppose you could make an index that updates progressively? or does this happen in the event of a crash or something? If someone wants to use a massively parallel computer for IO-bound tasks, they should have massively parallel IO and media to go with it. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-21 15:43 ` Sam Watkins 2009-10-21 16:11 ` Russ Cox @ 2009-10-21 18:01 ` ron minnich 2009-10-28 15:37 ` matt 2 siblings, 0 replies; 92+ messages in thread From: ron minnich @ 2009-10-21 18:01 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Wed, Oct 21, 2009 at 8:43 AM, Sam Watkins <sam@nipl.net> wrote: > People do this stuff every day. > Have you heard of a render-farm? Yes, and some of them are on this list, and have actually done this sort of work, as you clearly have not. Else you would understand where the limits on parallelism are in parallel encoding of MPEG-2, and why, in fact, one useful good thing to know about a two hour movie is that the limit on parallelism might be, oh, say around 240. Had you done it, or come close to doing it, or given some indication that you have some approximation of a clue as to what is involved in doing it, you might be getting a little less argument. > I'm not sure why I'm wasting time writing about this, it's obvious anyway. It is obvious. It's obviously wrong. It's obviously not informed by experience. It's obvious you are enthusiastic about this type of thing but need to learn more about it. The enthusiasm is admirable anyway ... ron ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-21 15:43 ` Sam Watkins 2009-10-21 16:11 ` Russ Cox 2009-10-21 18:01 ` ron minnich @ 2009-10-28 15:37 ` matt 2 siblings, 0 replies; 92+ messages in thread From: matt @ 2009-10-28 15:37 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Sorry to kick this rotting horse, but I just got back. > > >>You've got to feed in 2 hours of source material - 820Gb per stream, how? >> >> > >I suppose some sort of parallel bus of wires or optic fibres. > we call that "hand waving" >If I have >massively parallel processing I would want massively parallel IO to go with it. >I.e. something like "read data starting from here" -> "here it is streaming one >megabit in parallel down the bus at 1Ghz over 1 million channels" > > While riding a unicorn >would take advantage of perhaps 720 "cores" to encode a two-hour video in 10-second >chunks with barely any Amdahl effects, > > This 720Gbit storage device sounds pretty good. >People do this stuff every day. >Have you heard of a render-farm? > > Your sarcasm is cute. Have you used a render farm? You're right that rendering on a few cores is CPU bound. But you've moved the goalposts by 100,000,000 orders of magnitude. We have this comic on the wall with "programmer|compiling" replaced with "animator|rendering" http://xkcd.com/303/ And there's a standing order that you can't have sex unless you're rendering. >I'm not sure why I'm wasting time writing about this; it's obvious anyway. > > Yeah, that must be why everyone is rendering Imax movies in a few seconds. We can all imagine a place where computation is instant and we just say "computer! run Sherlock Holmes on the Holodeck from where we left off, but this time give it a Wild West theme". ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <A90043D02D52B2CBF2804FA4@192.168.1.2>]
* Re: [9fans] Barrelfish [not found] ` <A90043D02D52B2CBF2804FA4@192.168.1.2> @ 2009-10-18 0:06 ` ron minnich 2009-10-18 0:54 ` Roman Shaposhnik 0 siblings, 1 reply; 92+ messages in thread From: ron minnich @ 2009-10-18 0:06 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs the use of qualitative terms such as "embarrassingly parallel" often leads to confusion. Scaling can be measured. It can be quantified. Nothing scales forever, because at some point you want to get an answer back to a person, and/or the components of the app need to talk to each other. It's these basic timing elements that can tell you a lot about scaling. Actually running the app tells you a bit more, of course. Even the really easy apps hit a wall sooner or later. I still remember the struggle I had to scale a simple app to a 16-node cluster in the early days (1992). That was a long time ago and we've gone a lot further than that, but you'd be surprised just how hard it can be, even with "easy" applications. ron ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-18 0:06 ` ron minnich @ 2009-10-18 0:54 ` Roman Shaposhnik 0 siblings, 0 replies; 92+ messages in thread From: Roman Shaposhnik @ 2009-10-18 0:54 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Sun, Oct 18, 2009 at 12:06 AM, ron minnich <rminnich@gmail.com> wrote: > the use of qualitative terms such as "embarrassingly parallel" often > leads to confusion. > > Scaling can be measured. It can be quantified. Nothing scales forever, > because at some point you want to get an answer back to a person, > and/or the components of the app need to talk to each other. It's > these basic timing elements that can tell you a lot about scaling. > Actually running the app tells you a bit more, of course. > > Even the really easy apps hit a wall sooner or later. I still remember > the struggle I had to scale a simple app to a 16-node cluster in the > early days (1992). That was a long time ago and we've gone a lot > further than that, but you'd be surprised just how hard it can be, > even with "easy" applications. Can't agree more. I'd say the biggest problem I have with "embarrassingly parallel" is the fact that it conjures up images of a linear increase in speedup. Nobody ever does the math or even experiments to see how quickly we reach the point of diminishing returns. Thanks, Roman. ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<207092dc429fe476c2046d537aeaa400@hamnavoe.com>]
* Re: [9fans] Barrelfish [not found] <<207092dc429fe476c2046d537aeaa400@hamnavoe.com> @ 2009-10-15 13:52 ` erik quanstrom 2009-10-15 15:07 ` David Leimbach 0 siblings, 1 reply; 92+ messages in thread From: erik quanstrom @ 2009-10-15 13:52 UTC (permalink / raw) To: 9fans On Thu Oct 15 09:41:29 EDT 2009, 9fans@hamnavoe.com wrote: > > in fact, i believe i used an apple ][ around > > that time that had ~744k. > > Are you sure that was an apple II? When I bought mine I remember > wrestling with the decision over whether to get the standard 48k of > RAM or upgrade to the full 64k. This was long before the IBM PC. iirc, it had an odd add-in card that accounted for almost all the memory in the system. it wasn't enabled by default. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 13:52 ` erik quanstrom @ 2009-10-15 15:07 ` David Leimbach 2009-10-15 15:21 ` roger peppe 0 siblings, 1 reply; 92+ messages in thread From: David Leimbach @ 2009-10-15 15:07 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Thu, Oct 15, 2009 at 6:52 AM, erik quanstrom <quanstro@quanstro.net> wrote: > On Thu Oct 15 09:41:29 EDT 2009, 9fans@hamnavoe.com wrote: > > > in fact, i believe i used an apple ][ around > > > that time that had ~744k. > > > > Are you sure that was an apple II? When I bought mine I remember > > wrestling with the decision over whether to get the standard 48k of > > RAM or upgrade to the full 64k. This was long before the IBM PC. > > iirc, it had an odd add-in card that accounted for almost all the > memory in the system. it wasn't enabled by default. > > - erik > > Was this an Apple ][ GS? I had one with an add-on board with I think 1 MB of RAM. People still have those things on the internet... there's ethernet adapters for em and a TCP/IP stack. http://www.apple2.org/marinetti/index.html Dave ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 15:07 ` David Leimbach @ 2009-10-15 15:21 ` roger peppe 2009-10-16 17:21 ` Sam Watkins 0 siblings, 1 reply; 92+ messages in thread From: roger peppe @ 2009-10-15 15:21 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs BTW it seems the gates quote is false: http://en.wikiquote.org/wiki/Bill_Gates ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-15 15:21 ` roger peppe @ 2009-10-16 17:21 ` Sam Watkins 2009-10-16 23:39 ` Nick LaForge 2009-10-18 1:12 ` Roman Shaposhnik 0 siblings, 2 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-16 17:21 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Thu, Oct 15, 2009 at 04:21:16PM +0100, roger peppe wrote: > BTW it seems the gates quote is false: > > http://en.wikiquote.org/wiki/Bill_Gates maybe the Ken quote is false too - hard to believe he's that out of touch ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-16 17:21 ` Sam Watkins @ 2009-10-16 23:39 ` Nick LaForge 2009-10-18 1:12 ` Roman Shaposhnik 1 sibling, 0 replies; 92+ messages in thread From: Nick LaForge @ 2009-10-16 23:39 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > maybe the Ken quote is false too - hard to believe he's that out of touch The whole table was ganging up on Roman and his crazy idea, I believe ;). The objection mostly was to Intel dumping the complexity of another core on the programmer after it ran out of steam in containing parallelism within the pipeline. Even though Inferno / CSP / Erlang / etc. type people were clearly anxious to make use of parallelism at the level of multiple processor cores, I don't think the average Java programmer was. (That's not to say that Java programmers hadn't been asking for a rude awakening. Perhaps someday, they will also learn what 'Object-Oriented' programming is. ☺) Nick ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-16 17:21 ` Sam Watkins 2009-10-16 23:39 ` Nick LaForge @ 2009-10-18 1:12 ` Roman Shaposhnik 2009-10-19 14:14 ` matt 2009-10-19 16:00 ` Sam Watkins 1 sibling, 2 replies; 92+ messages in thread From: Roman Shaposhnik @ 2009-10-18 1:12 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Fri, Oct 16, 2009 at 5:21 PM, Sam Watkins <sam@nipl.net> wrote: > On Thu, Oct 15, 2009 at 04:21:16PM +0100, roger peppe wrote: >> BTW it seems the gates quote is false: >> >> http://en.wikiquote.org/wiki/Bill_Gates > > maybe the Ken quote is false too - hard to believe he's that out of touch I think the reverse is true -- the fact that he was asking these questions (and again -- he was asking them wrt the garden-variety way of doing multicore with a special emphasis on *desktops*) makes him very much in touch with reality, unlike most folks who think that once they get 65535 cores they would run 65535 times faster. The misinterpretation of Moore's Law is to blame here, of course: Moore is a smart guy and he was talking about transistor density, but pop culture made it sound like he was talking about speedup. For some time the two were in lock-step. Not anymore. I would appreciate if the folks who were in the room correct me, but if I'm not mistaken Ken was alluding to some FPGA work/ideas that he had done and my interpretation of his comments was that if we *really* want to make things parallel we have to bite the bullet, ditch multicore and rethink our strategy. Thanks, Roman. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-18 1:12 ` Roman Shaposhnik @ 2009-10-19 14:14 ` matt 0 siblings, 0 replies; 92+ messages in thread From: matt @ 2009-10-19 14:14 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > >The misinterpretation of Moore's Law is to blame here, of course: Moore >is a smart guy and he was talking about transistor density, but pop culture >made it sound like he was talking about speedup. For some time the two were >in lock-step. Not anymore. > > I ran the numbers the other day based on speed doubling every 2 years: a 60Mhz Pentium would be running at 16Ghz by now. I think it was the 1Ghz that should be 35Ghz. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-18 1:12 ` Roman Shaposhnik 2009-10-19 14:14 ` matt @ 2009-10-19 16:00 ` Sam Watkins 1 sibling, 0 replies; 92+ messages in thread From: Sam Watkins @ 2009-10-19 16:00 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Sun, Oct 18, 2009 at 01:12:58AM +0000, Roman Shaposhnik wrote: > I would appreciate if the folks who were in the room correct me, but if I'm > not mistaken Ken was alluding to some FPGA work/ideas that he had done > and my interpretation of his comments was that if we *really* want to > make things parallel we have to bite the bullet, ditch multicore and rethink > our strategy. Certainly, I agree that normal multi-core is not the best approach, FPGA systems or similar could run a lot faster. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<4AD70EE9.1010208@conducive.org>]
* Re: [9fans] Barrelfish [not found] <<4AD70EE9.1010208@conducive.org> @ 2009-10-15 13:52 ` erik quanstrom 0 siblings, 0 replies; 92+ messages in thread From: erik quanstrom @ 2009-10-15 13:52 UTC (permalink / raw) To: 9fans On Thu Oct 15 08:01:29 EDT 2009, wbh@conducive.org wrote: > Richard Miller wrote: > >> It's easy to write good code that will take advantage of arbitrarily many > >> processors to run faster / smoother, if you have a proper language for the > >> task. > > > > ... and if you can find a way around Amdahl's law (qv). > > > > > > > > http://www.cis.temple.edu/~shi/docs/amdahl/amdahl.html the author is hard to search for. http://en.wikipedia.org/wiki/Yuanshi_Era perhaps i misread the paper, but i think it boils down to chopping up an O(n^2) algorithm can give you super-linear speedups — no big surprise. and there might be better algorithms. (no examples given.) but you're still going to fall off a cliff when you run out of processors. and you don't get rid of the sequential part. the problem i see with this approach is (a) you need a lot of processors to do this. if p is the number of processors then the speedup for a quadratic algorithm would be n^2 / (sum_{1 .. p} (n/p)^2) = n^2 / (p(n/p)^2) = p so if you want an order of magnitude speedup *for a given n* you need 10 processors. but of course the number of processors needed goes as O(n^2). bummer. oh, and we haven't considered communication overhead. and (b) the paper claims that all network communication is worst case O(n) §3¶8. we know this to be false. consider n+1 computers with an ethernet connection connected by a switch. suppose that computers 0 to n-1 are ready to send a result back to computer n and each blast away at the network's limit. it's going to take more time to get the data back in an n -> 1 configuration than it would to get the same data back in a 1:1 configuration because the switch will drop packets. even assuming pause frames, the switching time can't be zero. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<3e1162e60910150805q2ea3f682w688299a39274051c@mail.gmail.com>]
* Re: [9fans] Barrelfish [not found] <<3e1162e60910150805q2ea3f682w688299a39274051c@mail.gmail.com> @ 2009-10-15 15:28 ` erik quanstrom 0 siblings, 0 replies; 92+ messages in thread From: erik quanstrom @ 2009-10-15 15:28 UTC (permalink / raw) To: 9fans On Thu Oct 15 11:06:41 EDT 2009, leimy2k@gmail.com wrote: > On Thu, Oct 15, 2009 at 6:11 AM, hiro <23hiro@googlemail.com> wrote: > > > > There is a vast range of applications that cannot > > > be managed in real time using existing single-core technology. > > > > I'm sorry to interrupt your discussion, but what is real time? > > that's a sly one for the fortune file. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<20091016172030.GB3135@nipl.net>]
* Re: [9fans] Barrelfish [not found] <<20091016172030.GB3135@nipl.net> @ 2009-10-16 18:34 ` erik quanstrom 0 siblings, 0 replies; 92+ messages in thread From: erik quanstrom @ 2009-10-16 18:34 UTC (permalink / raw) To: 9fans > > > There is a vast range of applications that cannot > > > be managed in real time using existing single-core technology. > > > > please name one. > > Your apparent lack of imagination surprises me. > > Surely you can see that a whole range of applications becomes possible when > using a massively parallel system, when compared to a single-CPU system. You > could perhaps also achieve these applications using a large network of 1000 > normal computers, but that would be expensive and use a lot of space. > > I named two in another post: real-time animated raytracing, and instantaneous > complex dsp over a long audio track. I'll also mention instantaneous video > encoding. Instantaneous building of a complex project from source. > (I'm defining instantaneous as less than 1 second for this.) two points. 1. by real time i mean this http://en.wikipedia.org/wiki/Real-time_computing i'm not sure what your definition is. i'm guessing you're using the "can keep up most of the time" definition? 2. i still can't think of any a priori reasons why one can't do any particular task in real time with 1 processor that one can do with more than one processor. perhaps the hidden assumption is that the total processing power of an mp setup will be greater? if the processing power is equal, certainly one would go for the uniprocessor. but as long as i'm being lumped in with ken, i'll take it as a grand compliment. ☺ - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<d50d7d460910161417w45b5c675p8740315aaf6861f@mail.gmail.com>]
* Re: [9fans] Barrelfish [not found] <<d50d7d460910161417w45b5c675p8740315aaf6861f@mail.gmail.com> @ 2009-10-16 22:25 ` erik quanstrom 0 siblings, 0 replies; 92+ messages in thread From: erik quanstrom @ 2009-10-16 22:25 UTC (permalink / raw) To: 9fans i missed this the first time On Fri Oct 16 17:19:36 EDT 2009, jason.catena@gmail.com wrote: > > Instantaneous building of a complex project from source. > > (I'm defining instantaneous as less than 1 second for this.) > > Depends on how complex. good story. it's hard to know when to rewrite. gcc itself has several files that take ~20s to compile on my machine. what is the plan for getting them to compile in < 1s? also, suppose you have n source files. and suppose you also just happen to have n+1 processors. what's the plan for coordinating them in sub O(n) time? what's the plan for a fs to keep up? heck, linux boot time is bottlenecked not by processor speed but by lowly random disk i/o. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<20091018031508.717CE5B30@mail.bitblocks.com>]
* Re: [9fans] Barrelfish [not found] <<20091018031508.717CE5B30@mail.bitblocks.com> @ 2009-10-19 13:44 ` erik quanstrom 2009-10-19 14:36 ` David Leimbach 0 siblings, 1 reply; 92+ messages in thread From: erik quanstrom @ 2009-10-19 13:44 UTC (permalink / raw) To: 9fans > At the hardware level we do have message passing between a > processor and the memory controller -- this is exactly the > same as talking to a shared server and has the same issues of > scaling etc. If you have very few clients, a single shared > server is indeed a cost effective solution. just to repeat myself in a context that hopefully makes things clearer: sometimes we don't admit it's a network. and that's not always a bad thing. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 13:44 ` erik quanstrom @ 2009-10-19 14:36 ` David Leimbach 0 siblings, 0 replies; 92+ messages in thread From: David Leimbach @ 2009-10-19 14:36 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Mon, Oct 19, 2009 at 6:44 AM, erik quanstrom <quanstro@quanstro.net> wrote: > > At the hardware level we do have message passing between a > > processor and the memory controller -- this is exactly the > > same as talking to a shared server and has the same issues of > > scaling etc. If you have very few clients, a single shared > > server is indeed a cost effective solution. > > just to repeat myself in a context that hopefully makes things > clearer: sometimes we don't admit it's a network. and that's > not always a bad thing. > > - erik > > Yes, we abstract things so it doesn't look like it is... so we can have a programming model where we don't have to care about keeping all the distributed bits in sync. However, I get the feeling that those abstractions, at any level, suffer from the same weaknesses. Well I think that's why certain RISC instruction sets have instructions like eieio anyway :-) Dave ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<20091019155738.GB13857@nipl.net>]
* Re: [9fans] Barrelfish [not found] <<20091019155738.GB13857@nipl.net> @ 2009-10-19 16:05 ` erik quanstrom 2009-10-19 16:34 ` Sam Watkins 0 siblings, 1 reply; 92+ messages in thread From: erik quanstrom @ 2009-10-19 16:05 UTC (permalink / raw) To: 9fans > Details of the calculation: 7200 seconds * 30fps * 12*16 (50*50 pixel chunks) * > 500000 elementary arithmetic/logical operations in a pipeline (unrolled). > 7200*30*12*16*500000 = 20 trillion (20,000,000,000,000) processing units. > This is only a very rough estimate and does not consider all the issues. could you do a similar calculation for the memory bandwidth required to deliver said instructions to the processors? if you add that to the memory bandwidth required to move the data around, what kind of memory architecture do you propose to move this much data around? - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 16:05 ` erik quanstrom @ 2009-10-19 16:34 ` Sam Watkins 2009-10-19 17:30 ` ron minnich 0 siblings, 1 reply; 92+ messages in thread From: Sam Watkins @ 2009-10-19 16:34 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Mon, Oct 19, 2009 at 12:05:19PM -0400, erik quanstrom wrote: > > Details of the calculation: 7200 seconds * 30fps * 12*16 (50*50 pixel > > chunks) * 500000 elementary arithmetic/logical operations in a pipeline > > (unrolled). 7200*30*12*16*500000 = 20 trillion (20,000,000,000,000) > > processing units. This is only a very rough estimate and does not consider > > all the issues. > > could you do a similar calculation for the memory bandwidth required to > deliver said instructions to the processors? The "processors" (actually smaller processing units) would mostly be configured at load time, much like an FPGA. Most units would execute a single simple operation repeatedly on streams of data; they would not read instructions and execute them sequentially like a normal CPU. The data would travel through the system step by step; it would mostly not need to be stored in RAM. If some RAM was needed, it would be small amounts on chip, at appropriate places in the pipeline. Some programs (not so much video encoding I think) do need a lot of RAM for intermediate calculations, or IO for example to fetch stuff from a database. Such systems can also be designed as networks of simple processing units connected by data streams / pipelines. Sam ^ permalink raw reply [flat|nested] 92+ messages in thread
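As a software analogy for the load-time configuration Sam describes (only an analogy: a Go sketch with invented operations, channels standing in for wires, and goroutines for units), the idea is to wire a fixed graph of single-operation units once and then stream data through it, with no instruction fetch in the steady state.

package main

import "fmt"

type op func(int) int

// wire builds the unit graph once, at "load time"; afterwards data
// streams through and each unit only ever applies its one operation.
func wire(ops []op, in <-chan int) <-chan int {
	ch := in
	for _, f := range ops {
		next := make(chan int)
		go func(f op, in <-chan int, out chan<- int) {
			for v := range in {
				out <- f(v)
			}
			close(out)
		}(f, ch, next)
		ch = next
	}
	return ch
}

func main() {
	// The "configuration": three invented single-operation units.
	cfg := []op{
		func(v int) int { return v + 3 },
		func(v int) int { return v * 2 },
		func(v int) int { return v &^ 1 }, // clear the low bit
	}
	in := make(chan int)
	out := wire(cfg, in)
	go func() {
		for i := 0; i < 4; i++ {
			in <- i
		}
		close(in)
	}()
	for v := range out {
		fmt.Println(v) // prints 6, 8, 10, 12
	}
}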
* Re: [9fans] Barrelfish 2009-10-19 16:34 ` Sam Watkins @ 2009-10-19 17:30 ` ron minnich 2009-10-19 17:57 ` W B Hacker 2009-10-19 18:14 ` David Leimbach 0 siblings, 2 replies; 92+ messages in thread From: ron minnich @ 2009-10-19 17:30 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Mon, Oct 19, 2009 at 9:34 AM, Sam Watkins <sam@nipl.net> wrote: > The "processors" (actually smaller processing units) would mostly be configured > at load time, much like an FPGA. Most units would execute a single simple > operation repeatedly on streams of data, they would not read instructions and > execute them sequentially like a normal CPU. > > The data would travel through the system step by step, it would mostly not need > to be stored in RAM. If some RAM was needed, it would be small amounts on > chip, at appropriate places in the pipeline. > > Some programs (not so much video encoding I think) do need a lot of RAM for > intermediate calculations, or IO for example to fetch stuff from a database. > Such systems can also be designed as networks of simple processing units > connected by data streams / pipelines. I think we could connect them with hyperbarrier technology. Basically we would use the Jeffreys tube, and exploit Bell's theorem and quantum entanglement. Then we could blitz the snarf with the babble, tie it all together with a blotz, and we're done. ron ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 17:30 ` ron minnich @ 2009-10-19 17:57 ` W B Hacker 2009-10-19 18:14 ` David Leimbach 1 sibling, 0 replies; 92+ messages in thread From: W B Hacker @ 2009-10-19 17:57 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs ron minnich wrote: > On Mon, Oct 19, 2009 at 9:34 AM, Sam Watkins <sam@nipl.net> wrote: > >> The "processors" (actually smaller processing units) would mostly be configured >> at load time, much like an FPGA. Most units would execute a single simple >> operation repeatedly on streams of data, they would not read instructions and >> execute them sequentially like a normal CPU. >> >> The data would travel through the system step by step, it would mostly not need >> to be stored in RAM. If some RAM was needed, it would be small amounts on >> chip, at appropriate places in the pipeline. >> >> Some programs (not so much video encoding I think) do need a lot of RAM for >> intermediate calculations, or IO for example to fetch stuff from a database. >> Such systems can also be designed as networks of simple processing units >> connected by data streams / pipelines. > > I think we could connect them with hyperbarrier technology. Basically > we would use the Jeffreys tube, and exploit Bell's theorem and quantum > entanglement. Then we could blitz the snarf with the babble, tie it > all together with a blotz, and we're done. > > ron > > Sounds magical. Can any of that approach be used to address Plan9's shortage of drivers and such? Bill (Ducks and waddles away....) ;-) ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 17:30 ` ron minnich 2009-10-19 17:57 ` W B Hacker @ 2009-10-19 18:14 ` David Leimbach 1 sibling, 0 replies; 92+ messages in thread From: David Leimbach @ 2009-10-19 18:14 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Mon, Oct 19, 2009 at 10:30 AM, ron minnich <rminnich@gmail.com> wrote: > On Mon, Oct 19, 2009 at 9:34 AM, Sam Watkins <sam@nipl.net> wrote: > > > The "processors" (actually smaller processing units) would mostly be configured > > at load time, much like an FPGA. Most units would execute a single simple > > operation repeatedly on streams of data, they would not read instructions and > > execute them sequentially like a normal CPU. > > > > The data would travel through the system step by step, it would mostly not need > > to be stored in RAM. If some RAM was needed, it would be small amounts on > > chip, at appropriate places in the pipeline. > > > > Some programs (not so much video encoding I think) do need a lot of RAM for > > intermediate calculations, or IO for example to fetch stuff from a database. > > Such systems can also be designed as networks of simple processing units > > connected by data streams / pipelines. > > I think we could connect them with hyperbarrier technology. Basically > we would use the Jeffreys tube, and exploit Bell's theorem and quantum > entanglement. Then we could blitz the snarf with the babble, tie it > all together with a blotz, and we're done. > > ron > > As Sir Robin said in the Holy Grail just before getting tossed off The Bridge of Death. "that's EASY!!" ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<4ADC7439.3060502@maht0x0r.net>]
* Re: [9fans] Barrelfish [not found] <<4ADC7439.3060502@maht0x0r.net> @ 2009-10-19 16:13 ` erik quanstrom 2009-10-19 18:23 ` tlaronde 2009-10-20 1:38 ` matt 0 siblings, 2 replies; 92+ messages in thread From: erik quanstrom @ 2009-10-19 16:13 UTC (permalink / raw) To: 9fans > I ran the numbers the other day based on sped doubles every 2 years, a > 60Mhz Pentium would be running 16Ghz by now > I think it was the 1ghz that should be 35ghz you motivated me to find my copy of _high speed semiconductor devices_, s.m. sze, ed., 1990. there might be one or two little problems with chips at that speed that have nothing to do with power — make that cooling. 0. frequency ∝ electron mobility ∝ 1/(effective bandgap). unfortunately there's a lower limit on the band gap — kT, thermal energy. 1. p. 8. "the most promising devices are quantum effect devices." (none are currently in use in processors.) 2. p. 192, "...device size will continue to be limited by hot-electron damage." oops. that fills one with confidence, doesn't it? - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 16:13 ` erik quanstrom @ 2009-10-19 18:23 ` tlaronde 2009-10-20 1:38 ` matt 1 sibling, 0 replies; 92+ messages in thread From: tlaronde @ 2009-10-19 18:23 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Mon, Oct 19, 2009 at 12:13:34PM -0400, erik quanstrom wrote: > > 1. p. 8. "the most promising devices are quantum effect > devices." (none are currently in use in processors.) Since quantics means unpredictable, I think that we see more and more quantum effects in hardware and software. So, I beg to disagree ;) -- Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com> http://www.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-19 16:13 ` erik quanstrom 2009-10-19 18:23 ` tlaronde @ 2009-10-20 1:38 ` matt 2009-10-20 1:58 ` Eris Discordia 1 sibling, 1 reply; 92+ messages in thread From: matt @ 2009-10-20 1:38 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs erik quanstrom wrote: > >you motivated me to find my copy of _high speed >semiconductor devices_, s.m. sze, ed., 1990. > > > which motivated me to dig out the post I made elsewhere : "Moore's law doesn't say anything about speed or power. It says manufacturing costs will fall with technological improvements such that the reasonably priced transistor count in an IC will double every 2 years. And here's a pretty graph http://en.wikipedia.org/wiki/File:Transistor_Count_and_Moore%27s_Law_-_2008.svg The misunderstanding makes people say such twaddle as "Moore's Law, the founding axiom behind Intel, that chips get exponentially faster". If we pretend that 2 years = double speed then roughly: The 1993 66Mhz P1 would now be running at 16.9Ghz The 1995 200Mhz Pentium now would be 25.6Ghz The 1997 300Mhz Pentium now would be 19.2Ghz The 1999 500Mhz Pentium now would be 16Ghz The 2000 1.3Ghz Pentium now would be 20Ghz The 2002 2.2Ghz Pentium would now be 35Ghz The 2002 3.06Ghz Pentium would be going on 48Ghz by Xmas If you plot speed vs year for Pentiums you get two straight lines with a change in gradient in 1999 with the introduction of the P4" ^ permalink raw reply [flat|nested] 92+ messages in thread
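For reference, the arithmetic behind that table is just base * 2^(years/2). A two-function sketch in Go (the example chip and years are taken from the first row above; the later rows of the table evidently round to whole doublings):

package main

import (
	"fmt"
	"math"
)

// misquoted returns the clock a chip "should" have reached by year
// `to` if the pop-culture reading of Moore's law (speed doubling
// every two years) were true.
func misquoted(baseMhz float64, from, to int) float64 {
	return baseMhz * math.Pow(2, float64(to-from)/2)
}

func main() {
	// The 1993 66Mhz Pentium extrapolated to 2009: prints 16.9Ghz,
	// matching the first row of the table.
	fmt.Printf("%.1fGhz\n", misquoted(66, 1993, 2009)/1000)
}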
* Re: [9fans] Barrelfish 2009-10-20 1:38 ` matt @ 2009-10-20 1:58 ` Eris Discordia 2009-10-20 2:17 ` matt 0 siblings, 1 reply; 92+ messages in thread From: Eris Discordia @ 2009-10-20 1:58 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > "Moore's law doesn't say anything about speed or power. But why'd you assume "people in the wrong" (w.r.t. their understanding of Moore's law) would measure "speed" in gigahertz rather than MIPS or FLOPS? --On Tuesday, October 20, 2009 02:38 +0100 matt <maht-9fans@maht0x0r.net> wrote: > erik quanstrom wrote: > >> >> you motivated me to find my copy of _high speed >> semiconductor devices_, s.m. sze, ed., 1990. >> >> >> > which motivated me to dig out the post I made elsewhere : > > "Moore's law doesn't say anything about speed or power. It says > manufacturing costs will fall with technological improvements such that > the reasonably priced transistor count in an IC will double every 2 years. > > And here's a pretty graph > http://en.wikipedia.org/wiki/File:Transistor_Count_and_Moore%27s_Law_-_2008.svg > > The misunderstanding makes people say such twaddle as "Moore's Law, > the founding axiom behind Intel, that chips get exponentially faster". > > If we pretend that 2 years = double speed then roughly: > The 1993 66Mhz P1 would now be running at 16.9Ghz > The 1995 200Mhz Pentium now would be 25.6Ghz > The 1997 300Mhz Pentium now would be 19.2Ghz > The 1999 500Mhz Pentium now would be 16Ghz > The 2000 1.3Ghz Pentium now would be 20Ghz > The 2002 2.2Ghz Pentium would now be 35Ghz > The 2002 3.06Ghz Pentium would be going on 48Ghz by Xmas > > If you plot speed vs year for Pentiums you get two straight lines with a > change in gradient in 1999 with the introduction of the P4" > > > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-20 1:58 ` Eris Discordia @ 2009-10-20 2:17 ` matt 0 siblings, 0 replies; 92+ messages in thread From: matt @ 2009-10-20 2:17 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Eris Discordia wrote: >> "Moore's law doesn't say anything about speed or power. > > > But why'd you assume "people in the wrong" (w.r.t. their understanding > of Moore's law) would measure "speed" in gigahertz rather than MIPS or > FLOPS? > because that's what the discussion I was having was about ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<20091019182352.GA1688@polynum.com>]
* Re: [9fans] Barrelfish [not found] <<20091019182352.GA1688@polynum.com> @ 2009-10-19 18:48 ` erik quanstrom 0 siblings, 0 replies; 92+ messages in thread From: erik quanstrom @ 2009-10-19 18:48 UTC (permalink / raw) To: 9fans totally ot. sorry. > > 1. p. 8. "the most promising devices are quantum effect > > devices." (none are currently in use in processors.) > > Since quantum behaviour amounts to unpredictability, I'd say we already see > more and more quantum effects in hardware and software. So, I beg to disagree ;) you may not fully appreciate what is meant by quantum effect. example devices are: resonant-tunneling transistors, quantum wires and dots. they are definitely not unpredictable. they are probabilistic, and one can build very useful devices with them. in my misguided youth i worked on building an 808nm laser out of quaternary semis and such quantum structures. weird stuff. there is no fundamental reason one couldn't build a computer with rt transistors. here's an rt xor structure: http://www.hindawi.com/journals/vlsi/2009/803974.html this stuff is insanely hard, and probably the mythical "twenty years out"; it's just not Si-friendly. and nobody wants (can afford) to deal with GaAs, let alone the funky quaternaries. but if we make any real breakthroughs in computing, it'll likely be based on quantum effect. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <<4ADD147A.4090801@maht0x0r.net>]
* Re: [9fans] Barrelfish [not found] <<4ADD147A.4090801@maht0x0r.net> @ 2009-10-20 2:11 ` erik quanstrom 2009-10-20 2:33 ` matt 0 siblings, 1 reply; 92+ messages in thread From: erik quanstrom @ 2009-10-20 2:11 UTC (permalink / raw) To: 9fans > > you motivated me to find my copy of _high speed > > semiconductor devices_, s.m. sze, ed., 1990. > which motivated me to dig out the post I made elsewhere: > "Moore's law doesn't say anything about speed or power. It says > manufacturing costs will fall through technological improvements, such that > the number of transistors that can economically be put on an IC will double every 2 years. this is quite an astounding thread. you brought up clock speed doubling and now refute yourself. i just noted that 48ghz is not possible with silicon non-quantum-effect tech. - erik ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [9fans] Barrelfish 2009-10-20 2:11 ` erik quanstrom @ 2009-10-20 2:33 ` matt 0 siblings, 0 replies; 92+ messages in thread From: matt @ 2009-10-20 2:33 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs >this is quite an astounding thread. you brought >up clock speed doubling and now refute yourself. > >i just noted that 48ghz is not possible with silicon >non-quantum-effect tech. > >- erik I think I've been misunderstood: I wasn't asserting the clock speed increase in the first place. I was hoping to demonstrate what would have happened if Moore's law were the often-misquoted "speed doubles every 2 years", measured in GHz (not FLOPS, as Eris noted). ^ permalink raw reply [flat|nested] 92+ messages in thread
end of thread, other threads:[~2009-10-28 15:37 UTC | newest] Thread overview: 92+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2009-10-14 19:09 [9fans] Barrelfish Tim Newsham 2009-10-14 19:54 ` Roman Shaposhnik 2009-10-14 21:21 ` Tim Newsham 2009-10-14 21:33 ` Lyndon Nerenberg (VE6BBM/VE7TFX) 2009-10-14 21:42 ` Noah Evans 2009-10-14 21:45 ` erik quanstrom 2009-10-14 21:57 ` Noah Evans 2009-10-14 22:10 ` Eric Van Hensbergen 2009-10-14 22:21 ` Noah Evans 2009-10-15 1:03 ` David Leimbach 2009-10-15 1:50 ` Roman Shaposhnik 2009-10-15 2:12 ` Eric Van Hensbergen 2009-10-15 10:53 ` Sam Watkins 2009-10-15 11:50 ` Richard Miller 2009-10-15 12:00 ` W B Hacker 2009-10-16 17:03 ` Sam Watkins 2009-10-16 18:17 ` ron minnich 2009-10-16 18:39 ` Wes Kussmaul 2009-10-17 12:42 ` Roman Shaposhnik 2009-10-15 11:56 ` Josh Wood 2009-10-15 13:11 ` hiro 2009-10-15 15:05 ` David Leimbach 2009-10-18 1:15 ` Roman Shaposhnik 2009-10-18 3:15 ` Bakul Shah [not found] ` <e763acc10910180606q1312ff7cw9a465d6af39c0fbe@mail.gmail.com> 2009-10-18 13:22 ` Roman Shaposhnik 2009-10-18 19:18 ` Bakul Shah 2009-10-18 20:12 ` ron minnich 2009-10-20 0:04 ` [9fans] Parallelism is over a barrel(fish)? Lyndon Nerenberg (VE6BBM/VE7TFX) 2009-10-20 1:11 ` W B Hacker 2009-10-14 21:36 ` [9fans] Barrelfish Eric Van Hensbergen 2009-10-15 2:05 ` Roman Shaposhnik 2009-10-15 2:17 ` Eric Van Hensbergen 2009-10-15 3:32 ` Tim Newsham 2009-10-15 3:59 ` Eric Van Hensbergen 2009-10-15 17:39 ` Tim Newsham 2009-10-15 18:28 ` Christopher Nielsen 2009-10-15 18:55 ` W B Hacker [not found] <<20091015105328.GA18947@nipl.net> 2009-10-15 13:27 ` erik quanstrom 2009-10-15 13:40 ` Richard Miller 2009-10-16 17:20 ` Sam Watkins 2009-10-16 18:18 ` Latchesar Ionkov 2009-10-19 15:26 ` Sam Watkins 2009-10-19 15:33 ` andrey mirtchovski 2009-10-19 15:50 ` ron minnich 2009-10-16 21:17 ` Jason Catena 2009-10-17 20:58 ` Dave Eckhardt 2009-10-18 2:09 ` Jason Catena 2009-10-18 16:02 ` Dave Eckhardt 2009-10-17 18:45 ` Eris Discordia 2009-10-17 21:07 ` Steve Simon 2009-10-17 21:18 ` Eric Van Hensbergen 2009-10-18 8:48 ` Eris Discordia 2009-10-18 8:44 ` Eris Discordia 2009-10-19 15:57 ` Sam Watkins 2009-10-19 16:03 ` ron minnich 2009-10-19 16:46 ` Russ Cox 2009-10-20 2:16 ` matt 2009-10-20 9:15 ` Steve Simon 2009-10-21 15:43 ` Sam Watkins 2009-10-21 16:11 ` Russ Cox 2009-10-21 16:37 ` Sam Watkins 2009-10-21 18:01 ` ron minnich 2009-10-28 15:37 ` matt [not found] ` <A90043D02D52B2CBF2804FA4@192.168.1.2> 2009-10-18 0:06 ` ron minnich 2009-10-18 0:54 ` Roman Shaposhnik [not found] <<207092dc429fe476c2046d537aeaa400@hamnavoe.com> 2009-10-15 13:52 ` erik quanstrom 2009-10-15 15:07 ` David Leimbach 2009-10-15 15:21 ` roger peppe 2009-10-16 17:21 ` Sam Watkins 2009-10-16 23:39 ` Nick LaForge 2009-10-18 1:12 ` Roman Shaposhnik 2009-10-19 14:14 ` matt 2009-10-19 16:00 ` Sam Watkins [not found] <<4AD70EE9.1010208@conducive.org> 2009-10-15 13:52 ` erik quanstrom [not found] <<3e1162e60910150805q2ea3f682w688299a39274051c@mail.gmail.com> 2009-10-15 15:28 ` erik quanstrom [not found] <<20091016172030.GB3135@nipl.net> 2009-10-16 18:34 ` erik quanstrom [not found] <<d50d7d460910161417w45b5c675p8740315aaf6861f@mail.gmail.com> 2009-10-16 22:25 ` erik quanstrom [not found] <<20091018031508.717CE5B30@mail.bitblocks.com> 2009-10-19 13:44 ` erik quanstrom 2009-10-19 14:36 ` David Leimbach [not found] <<20091019155738.GB13857@nipl.net> 2009-10-19 16:05 ` erik quanstrom 2009-10-19 16:34 ` Sam Watkins 2009-10-19 17:30 ` ron minnich 2009-10-19 17:57 
` W B Hacker 2009-10-19 18:14 ` David Leimbach [not found] <<4ADC7439.3060502@maht0x0r.net> 2009-10-19 16:13 ` erik quanstrom 2009-10-19 18:23 ` tlaronde 2009-10-20 1:38 ` matt 2009-10-20 1:58 ` Eris Discordia 2009-10-20 2:17 ` matt [not found] <<20091019182352.GA1688@polynum.com> 2009-10-19 18:48 ` erik quanstrom [not found] <<4ADD147A.4090801@maht0x0r.net> 2009-10-20 2:11 ` erik quanstrom 2009-10-20 2:33 ` matt