9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] threads vs forks
@ 2009-03-03 11:52 hugo rivera
  2009-03-03 15:19 ` David Leimbach
                   ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: hugo rivera @ 2009-03-03 11:52 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hi,
this is not really a Plan 9 question, but since you are the wisest
guys I know, I am hoping you can help me.
You see, I have to launch many tasks running in parallel (~5000) on a
cluster running Linux. Each of the tasks performs some astronomical
calculations, and I am not sure that fork is the best answer here.
All the programming is done in Python and C, and since we are using
Python's os.fork() facility, I think it is somehow related to the
underlying C fork (I really do not know much about fork on Linux; the
little I do know about forks and threads I got from Francisco
Ballesteros' "Introduction to Operating System Abstractions").
The question is whether I should use forks or threads for the job at
hand. I have heard that there are problems if you fork too many
processes (I am not sure how many is too many), so I am thinking of
using threads. I know some basic differences between threads and
forks, but I am not aware of the implementation details (and probably
never will be).
Finally, if this question does not belong on the Plan 9 mailing list,
please let me know and I'll shut up.
Saludos

--
Hugo
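Hugo's setup can be sketched concretely: a parent that forks one child per job but caps how many run at once, reaping a finished child before starting the next. This is only a sketch; run_task() is a hypothetical stand-in for the real astronomical calculation, and the cap of 8 is arbitrary.

```python
import os

def run_task(n):
    # hypothetical stand-in for one astronomical calculation
    return n * n

def launch(njobs, maxprocs):
    # Fork one child per job, but never more than maxprocs at a time.
    active = 0
    reaped = 0
    for n in range(njobs):
        if active >= maxprocs:
            os.wait()          # block until one child exits
            active -= 1
            reaped += 1
        pid = os.fork()
        if pid == 0:           # child: do the work, then exit
            run_task(n)
            os._exit(0)
        active += 1            # parent: bookkeeping only
    while active:              # reap the stragglers
        os.wait()
        active -= 1
        reaped += 1
    return reaped

if __name__ == "__main__":
    print(launch(50, 8))       # 50: every child is reaped exactly once
```

On a six-node cluster the same parent loop would run on each node over a slice of the job list; the per-fork cost is tiny next to a five-minute job.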



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-03 11:52 [9fans] threads vs forks hugo rivera
@ 2009-03-03 15:19 ` David Leimbach
  2009-03-03 15:32   ` Uriel
                     ` (2 more replies)
  2009-03-03 16:00 ` ron minnich
  2009-03-03 16:47 ` John Barham
  2 siblings, 3 replies; 71+ messages in thread
From: David Leimbach @ 2009-03-03 15:19 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera <uair00@gmail.com> wrote:

> Hi,
> this is not really a plan 9 question, but since you are the wisest
> guys I know I am hoping that you can help me.
> You see, I have to launch many tasks running in parallel (~5000) in a
> cluster running linux. Each of the task performs some astronomical
> calculations and I am not pretty sure if using fork is the best answer
> here.
> First of all, all the programming is done in python and c, and since
> we are using os.fork() python facility I think that it is somehow
> related to the underlying c fork (well, I really do not know much of
> forks in linux, the few things I do know about forks and threads I got
> them from Francisco Ballesteros' "Introduction to operating system
> abstractions").


My knowledge on this subject is about 8 or 9 years old, so check with
your local Python guru....

The last I'd heard, Python's threading was cooperative only, and you
couldn't get real parallelism out of it. It serves as a means of
organizing your program in a concurrent manner.

In other words, no two threads run at the same time in Python, even on
a multi-core system, because of something they call the "Global
Interpreter Lock".
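A minimal illustration of this point, assuming plain CPython: the two CPU-bound threads below interleave under the GIL rather than run in parallel, although the program is still organized concurrently and the answers come out right.

```python
import threading

def count(n, out, i):
    # pure-Python, CPU-bound loop: the GIL serializes it across threads
    total = 0
    for _ in range(n):
        total += 1
    out[i] = total

results = [0, 0]
threads = [threading.Thread(target=count, args=(100000, results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Correct answers, but on a multi-core box the wall time is roughly
# the same as running the two loops back to back.
print(results)   # [100000, 100000]
```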


>
> The point here is if I should use forks or threads to deal with the job at
> hand?
> I heard that there are some problems if you fork too many processes (I
> am not sure how many are too many) so I am thinking to use threads.
> I know some basic differences between threads and forks, but I am not
> aware of the details of the implementation (probably I will never be).
> Finally, if this is a question that does not belong to the plan 9
> mailing list, please let me know and I'll shut up.
> Saludos
>

I think you need to understand the system limits, which is something
you can look up for yourself. You should also understand what kind of
runtime model the threads in your language actually implement.

Those rules apply to basically any system.


>
> --
> Hugo
>
>



* Re: [9fans] threads vs forks
  2009-03-03 15:19 ` David Leimbach
@ 2009-03-03 15:32   ` Uriel
  2009-03-03 16:15     ` hugo rivera
  2009-03-03 15:33   ` hugo rivera
  2009-03-03 18:11   ` Roman V. Shaposhnik
  2 siblings, 1 reply; 71+ messages in thread
From: Uriel @ 2009-03-03 15:32 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Python 'threads' are the same pthreads turds that all the other lunix
junk uses. The only difference is that the interpreter itself is not
threadsafe, so they have a global lock, which means threads suck even
more than usual.

Forking a Python interpreter is a *bad* idea, because Python's startup
takes billions of years. This has nothing to do with the merits of
fork, and everything to do with how much Python sucks.

There is Stackless Python, which has proper CSP-style threads/procs
and channels, very similar to Limbo.

http://www.stackless.com/
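To give a feel for the channel style Uriel is pointing at, here is a rough analogy in plain CPython using a thread and a small queue. This is not the Stackless API (which uses tasklets and stackless.channel()); it just shows the same communicating-processes shape.

```python
import threading
import queue

def producer(ch):
    # send five values, then a sentinel marking end of stream
    for i in range(5):
        ch.put(i)            # like channel.send(i)
    ch.put(None)

ch = queue.Queue(maxsize=1)  # tiny buffer approximates a rendezvous
t = threading.Thread(target=producer, args=(ch,))
t.start()

received = []
while True:
    v = ch.get()             # like channel.receive()
    if v is None:
        break
    received.append(v)
t.join()
print(received)              # [0, 1, 2, 3, 4]
```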

But that is obviously too sane for the mainline Python folks, so they
stick to the pthreads turds...

My advice: unless you can use Stackless, stay as far away as you can
from any concurrent Python stuff. (And don't get me started on Twisted
and its event-based hacks.)

Oh, and as I mentioned in another thread: in my experience, if you are
going to fork, make sure you compile statically; dynamic linking is
almost as evil as pthreads. But this is lunix, so what do you expect?

uriel

On Tue, Mar 3, 2009 at 4:19 PM, David Leimbach <leimy2k@gmail.com> wrote:
>
>
> On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera <uair00@gmail.com> wrote:
>>
>> Hi,
>> this is not really a plan 9 question, but since you are the wisest
>> guys I know I am hoping that you can help me.
>> You see, I have to launch many tasks running in parallel (~5000) in a
>> cluster running linux. Each of the task performs some astronomical
>> calculations and I am not pretty sure if using fork is the best answer
>> here.
>> First of all, all the programming is done in python and c, and since
>> we are using os.fork() python facility I think that it is somehow
>> related to the underlying c fork (well, I really do not know much of
>> forks in linux, the few things I do know about forks and threads I got
>> them from Francisco Ballesteros' "Introduction to operating system
>> abstractions").
>
> My knowledge on this subject is about 8 or 9 years old, so check with your local Python guru....
> The last I'd heard about Python's threading is that it was cooperative only,
> and that you couldn't get real parallelism out of it.  It serves as a means
> to organize your program in a concurrent manner.
> In other words no two threads run at the same time in Python, even if you're
> on a multi-core system, due to something they call a "Global Interpreter
> Lock".
>
>>
>> The point here is if I should use forks or threads to deal with the job at
>> hand?
>> I heard that there are some problems if you fork too many processes (I
>> am not sure how many are too many) so I am thinking to use threads.
>> I know some basic differences between threads and forks, but I am not
>> aware of the details of the implementation (probably I will never be).
>> Finally, if this is a question that does not belong to the plan 9
>> mailing list, please let me know and I'll shut up.
>> Saludos
>
> I think you need to understand the system limits, which is something you can
> look up for yourself.  Also you should understand what kind of runtime model
> threads in the language you're using actually implements.
> Those rules basically apply to any system.
>
>>
>> --
>> Hugo
>>
>
>




* Re: [9fans] threads vs forks
  2009-03-03 15:19 ` David Leimbach
  2009-03-03 15:32   ` Uriel
@ 2009-03-03 15:33   ` hugo rivera
  2009-03-03 18:11   ` Roman V. Shaposhnik
  2 siblings, 0 replies; 71+ messages in thread
From: hugo rivera @ 2009-03-03 15:33 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Thanks a lot, guys.
I think I should study this issue in greater detail. It is not as easy
as I thought it would be.

2009/3/3, David Leimbach <leimy2k@gmail.com>:
>
>
> On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera <uair00@gmail.com> wrote:
> > Hi,
> > this is not really a plan 9 question, but since you are the wisest
> > guys I know I am hoping that you can help me.
> > You see, I have to launch many tasks running in parallel (~5000) in a
> > cluster running linux. Each of the task performs some astronomical
> > calculations and I am not pretty sure if using fork is the best answer
> > here.
> > First of all, all the programming is done in python and c, and since
> > we are using os.fork() python facility I think that it is somehow
> > related to the underlying c fork (well, I really do not know much of
> > forks in linux, the few things I do know about forks and threads I got
> > them from Francisco Ballesteros' "Introduction to operating system
> > abstractions").
>
> My knowledge on this subject is about 8 or 9 years old, so
> check with your local Python guru....
>
> The last I'd heard about Python's threading is that it was cooperative only,
> and that you couldn't get real parallelism out of it.  It serves as a means
> to organize your program in a concurrent manner.
>
> In other words no two threads run at the same time in Python, even if you're
> on a multi-core system, due to something they call a "Global Interpreter
> Lock".
>
> >
> > The point here is if I should use forks or threads to deal with the job at
> hand?
> > I heard that there are some problems if you fork too many processes (I
> > am not sure how many are too many) so I am thinking to use threads.
> > I know some basic differences between threads and forks, but I am not
> > aware of the details of the implementation (probably I will never be).
> > Finally, if this is a question that does not belong to the plan 9
> > mailing list, please let me know and I'll shut up.
> > Saludos
> >
>
> I think you need to understand the system limits, which is something you can
> look up for yourself.  Also you should understand what kind of runtime model
> threads in the language you're using actually implements.
>
> Those rules basically apply to any system.
>
> >
> > --
> > Hugo
> >
> >
>
>


--
Hugo




* Re: [9fans] threads vs forks
  2009-03-03 11:52 [9fans] threads vs forks hugo rivera
  2009-03-03 15:19 ` David Leimbach
@ 2009-03-03 16:00 ` ron minnich
  2009-03-03 16:28   ` hugo rivera
  2009-03-03 16:47 ` John Barham
  2 siblings, 1 reply; 71+ messages in thread
From: ron minnich @ 2009-03-03 16:00 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera <uair00@gmail.com> wrote:

> You see, I have to launch many tasks running in parallel (~5000) in a
> cluster running linux. Each of the task performs some astronomical
> calculations and I am not pretty sure if using fork is the best answer
> here.


Lots of questions first.

How many cluster nodes? How long do the jobs run? Input files or args?
Output files? How big? Not much can be said with the information you
gave.

ron




* Re: [9fans] threads vs forks
  2009-03-03 15:32   ` Uriel
@ 2009-03-03 16:15     ` hugo rivera
  0 siblings, 0 replies; 71+ messages in thread
From: hugo rivera @ 2009-03-03 16:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2009/3/3, Uriel <uriel99@gmail.com>:

>  Oh, and as I mentioned in another thread, in my experience if you are
>  going to fork, make sure you compile statically, dynamic linking is
>  almost as evil as pthreads. But this is lunix, so what do you expect?
>

Not much. I wish I could get this done with Plan 9.

--
Hugo




* Re: [9fans] threads vs forks
  2009-03-03 16:00 ` ron minnich
@ 2009-03-03 16:28   ` hugo rivera
  2009-03-03 17:31     ` ron minnich
  0 siblings, 1 reply; 71+ messages in thread
From: hugo rivera @ 2009-03-03 16:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2009/3/3, ron minnich <rminnich@gmail.com>:
>
> lots of questions first .
>
>  how  many cluster nodes. how long do the jobs run. input files or
>  args? output files? how big? You can't say much with the information
>  you gave.

It is a small cluster of 6 machines. I think each job runs for a few
minutes (~5), takes some input files, and generates a couple of output
files (I am not really sure how many output files each process
generates). The size of the output files is ~1 MB.

--
Hugo




* Re: [9fans] threads vs forks
  2009-03-03 11:52 [9fans] threads vs forks hugo rivera
  2009-03-03 15:19 ` David Leimbach
  2009-03-03 16:00 ` ron minnich
@ 2009-03-03 16:47 ` John Barham
  2009-03-04  9:37   ` Vincent Schut
  2 siblings, 1 reply; 71+ messages in thread
From: John Barham @ 2009-03-03 16:47 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera <uair00@gmail.com> wrote:

> I have to launch many tasks running in parallel (~5000) in a
> cluster running linux. Each of the task performs some astronomical
> calculations and I am not pretty sure if using fork is the best answer
> here.
> First of all, all the programming is done in python and c...

Take a look at the multiprocessing package
(http://docs.python.org/library/multiprocessing.html), newly
introduced with Python 2.6 and 3.0:

"multiprocessing is a package that supports spawning processes using
an API similar to the threading module. The multiprocessing package
offers both local and remote concurrency, effectively side-stepping
the Global Interpreter Lock by using subprocesses instead of threads."

It should be a quick and easy way to set up a cluster-wide job
processing system (provided all your jobs are driven by Python).

It also looks like it has been (partially?) back-ported to Python 2.4
and 2.5: http://pypi.python.org/pypi/processing.

  John




* Re: [9fans] threads vs forks
  2009-03-03 16:28   ` hugo rivera
@ 2009-03-03 17:31     ` ron minnich
  0 siblings, 0 replies; 71+ messages in thread
From: ron minnich @ 2009-03-03 17:31 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 8:28 AM, hugo rivera <uair00@gmail.com> wrote:

> It is a small cluster, of 6 machines. I think each job runs for a few
> minutes (~5), take some input files and generate a couple of files (I
> am not really sure about how many output files each proccess
> generates). The size of the output files is ~1Mb.

For a cluster that size, with jobs running a few minutes, fork ought
to be fine.

ron
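A back-of-envelope check supports this. The numbers come from the thread (~5000 jobs, ~5 minutes each, 6 nodes); the jobs-per-node figure is an assumption, since the thread doesn't say how many jobs each node runs at once.

```python
# Numbers from the thread: ~5000 jobs, ~5 minutes each, 6 nodes.
# concurrent_per_node is an assumed figure, not from the thread.
jobs = 5000
minutes_per_job = 5
nodes = 6
concurrent_per_node = 4

total_machine_minutes = jobs * minutes_per_job   # 25000
slots = nodes * concurrent_per_node              # 24 jobs in flight
wall_clock_hours = total_machine_minutes / slots / 60.0
print(round(wall_clock_hours, 1))                # 17.4
```

Against roughly 17 hours of wall clock, the cost of a fork() (well under a millisecond) is noise, which is why fork is fine here.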




* Re: [9fans] threads vs forks
  2009-03-03 15:19 ` David Leimbach
  2009-03-03 15:32   ` Uriel
  2009-03-03 15:33   ` hugo rivera
@ 2009-03-03 18:11   ` Roman V. Shaposhnik
  2009-03-03 18:38     ` Bakul Shah
                       ` (3 more replies)
  2 siblings, 4 replies; 71+ messages in thread
From: Roman V. Shaposhnik @ 2009-03-03 18:11 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:

> My knowledge on this subject is about 8 or 9 years old, so check with your local Python guru....
>
>
> The last I'd heard about Python's threading is that it was cooperative
> only, and that you couldn't get real parallelism out of it.  It serves
> as a means to organize your program in a concurrent manner.
>
>
> In other words no two threads run at the same time in Python, even if
> you're on a multi-core system, due to something they call a "Global
> Interpreter Lock".

I believe the GIL is as present in Python nowadays as ever. On a
related note: does anybody know a sane interpreted language with a
decent threading model to go along with it? Stackless Python is the
only thing I'm familiar with in that department.

Thanks,
Roman.





* Re: [9fans] threads vs forks
  2009-03-03 18:11   ` Roman V. Shaposhnik
@ 2009-03-03 18:38     ` Bakul Shah
  2009-03-06 18:47       ` Roman V Shaposhnik
  2009-03-03 23:08     ` J.R. Mauro
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 71+ messages in thread
From: Bakul Shah @ 2009-03-03 18:38 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 03 Mar 2009 10:11:10 PST "Roman V. Shaposhnik" <rvs@sun.com>  wrote:
> On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
>
> > My knowledge on this subject is about 8 or 9 years old, so check with your
> local Python guru....
> >
> >
> > The last I'd heard about Python's threading is that it was cooperative
> > only, and that you couldn't get real parallelism out of it.  It serves
> > as a means to organize your program in a concurrent manner.
> >
> >
> > In other words no two threads run at the same time in Python, even if
> > you're on a multi-core system, due to something they call a "Global
> > Interpreter Lock".
>
> I believe GIL is as present in Python nowadays as ever. On a related
> note: does anybody know any sane interpreted languages with a decent
> threading model to go along? Stackless python is the only thing that
> I'm familiar with in that department.

It depends on what you mean by "sane interpreted language with a
decent threading model" and what you want to do with it, but check out
www.clojure.org. Then there is Erlang; its Wikipedia entry has this to
say:
    Although Erlang was designed to fill a niche and has
    remained an obscure language for most of its existence,
    it is experiencing a rapid increase in popularity due to
    increased demand for concurrent services, inferior models
    of concurrency in most mainstream programming languages,
    and its substantial libraries and documentation.[7][8]
    Well-known applications include Amazon SimpleDB,[9]
    Yahoo! Delicious,[10] and the Facebook Chat system.[11]




* Re: [9fans] threads vs forks
  2009-03-03 18:11   ` Roman V. Shaposhnik
  2009-03-03 18:38     ` Bakul Shah
@ 2009-03-03 23:08     ` J.R. Mauro
  2009-03-03 23:15       ` Uriel
  2009-03-04  5:07     ` David Leimbach
  2009-03-04  5:35     ` John Barham
  3 siblings, 1 reply; 71+ messages in thread
From: J.R. Mauro @ 2009-03-03 23:08 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 1:11 PM, Roman V. Shaposhnik <rvs@sun.com> wrote:
> On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
>
>> My knowledge on this subject is about 8 or 9 years old, so check with your local Python guru....
>>
>>
>> The last I'd heard about Python's threading is that it was cooperative
>> only, and that you couldn't get real parallelism out of it.  It serves
>> as a means to organize your program in a concurrent manner.
>>
>>
>> In other words no two threads run at the same time in Python, even if
>> you're on a multi-core system, due to something they call a "Global
>> Interpreter Lock".
>
> I believe GIL is as present in Python nowadays as ever. On a related
> note: does anybody know any sane interpreted languages with a decent
> threading model to go along? Stackless python is the only thing that
> I'm familiar with in that department.

I thought part of the reason for the "big break" with Python 3000 was
to get rid of the GIL and clean up that threading mess. Or am I way
off?

>
> Thanks,
> Roman.
>
>
>




* Re: [9fans] threads vs forks
  2009-03-03 23:08     ` J.R. Mauro
@ 2009-03-03 23:15       ` Uriel
  2009-03-03 23:23         ` J.R. Mauro
  0 siblings, 1 reply; 71+ messages in thread
From: Uriel @ 2009-03-03 23:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

You are off. It is doubtful that the GIL will ever be removed.

But that really isn't the issue; the issue is the lack of a decent
concurrency model, like the one provided by Stackless.

But apparently one of the things Stackless allows is evil recursive
programming, which Guido considers 'confusing' and won't allow in
mainline Python (I think another reason is that porting it to Jython
and .not would be hard, but I'm not familiar with the details).

uriel


On Wed, Mar 4, 2009 at 12:08 AM, J.R. Mauro <jrm8005@gmail.com> wrote:
> On Tue, Mar 3, 2009 at 1:11 PM, Roman V. Shaposhnik <rvs@sun.com> wrote:
>> On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
>>
>>> My knowledge on this subject is about 8 or 9 years old, so check with your local Python guru....
>>>
>>>
>>> The last I'd heard about Python's threading is that it was cooperative
>>> only, and that you couldn't get real parallelism out of it.  It serves
>>> as a means to organize your program in a concurrent manner.
>>>
>>>
>>> In other words no two threads run at the same time in Python, even if
>>> you're on a multi-core system, due to something they call a "Global
>>> Interpreter Lock".
>>
>> I believe GIL is as present in Python nowadays as ever. On a related
>> note: does anybody know any sane interpreted languages with a decent
>> threading model to go along? Stackless python is the only thing that
>> I'm familiar with in that department.
>
> I thought part of the reason for the "big break" with Python 3000 was
> to get rid of the GIL and clean that threading mess up. Or am I way
> off?
>
>>
>> Thanks,
>> Roman.
>>
>>
>>
>
>




* Re: [9fans] threads vs forks
  2009-03-03 23:15       ` Uriel
@ 2009-03-03 23:23         ` J.R. Mauro
  2009-03-03 23:54           ` Devon H. O'Dell
  0 siblings, 1 reply; 71+ messages in thread
From: J.R. Mauro @ 2009-03-03 23:23 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 6:15 PM, Uriel <uriel99@gmail.com> wrote:
> You are off. It is doubtful that the GIL will ever be removed.

That's too bad. Things like that just reinforce my view that Python is a hack :(

Oh well, back to C...

>
> But that really isn't the issue, the issue is the lack of a decent
> concurrency model, like the one provided by Stackless.
>
> But apparently one of the things stackless allows is evil recursive
> programming, which Guido considers 'confusing' and wont allow in
> mainline python (I think another reason is that porting it to jython
> and .not would be hard, but I'm not familiar with the details).

Concurrency seems to be one of those things that's "too hard" for
everyone, and I don't buy it. There's no reason it needs to be as hard
as it is.

And never mind the fact that it's not really usable for every (or even
most) job out there. But Intel is pushing it, so that's where we have
to go, I suppose.

>
> uriel
>
> On Wed, Mar 4, 2009 at 12:08 AM, J.R. Mauro <jrm8005@gmail.com> wrote:
>> On Tue, Mar 3, 2009 at 1:11 PM, Roman V. Shaposhnik <rvs@sun.com> wrote:
>>> On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
>>>
>>>> My knowledge on this subject is about 8 or 9 years old, so check with your local Python guru....
>>>>
>>>>
>>>> The last I'd heard about Python's threading is that it was cooperative
>>>> only, and that you couldn't get real parallelism out of it.  It serves
>>>> as a means to organize your program in a concurrent manner.
>>>>
>>>>
>>>> In other words no two threads run at the same time in Python, even if
>>>> you're on a multi-core system, due to something they call a "Global
>>>> Interpreter Lock".
>>>
>>> I believe GIL is as present in Python nowadays as ever. On a related
>>> note: does anybody know any sane interpreted languages with a decent
>>> threading model to go along? Stackless python is the only thing that
>>> I'm familiar with in that department.
>>
>> I thought part of the reason for the "big break" with Python 3000 was
>> to get rid of the GIL and clean that threading mess up. Or am I way
>> off?
>>
>>>
>>> Thanks,
>>> Roman.
>>>
>>>
>>>
>>
>>
>
>




* Re: [9fans] threads vs forks
  2009-03-03 23:23         ` J.R. Mauro
@ 2009-03-03 23:54           ` Devon H. O'Dell
  2009-03-04  0:33             ` J.R. Mauro
  2009-03-06  9:39             ` maht
  0 siblings, 2 replies; 71+ messages in thread
From: Devon H. O'Dell @ 2009-03-03 23:54 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2009/3/3 J.R. Mauro <jrm8005@gmail.com>:
> Concurrency seems to be one of those things that's "too hard" for
> everyone, and I don't buy it. There's no reason it needs to be as hard
> as it is.

That's a fact. If you have access to ACM Queue, check out
p16-cantrill-concurrency.pdf (Cantrill and Bonwick on concurrency).

> And nevermind the fact that it's not really usable for every (or even
> most) jobs out there. But Intel is pushing it, so that's where we have
> to go, I suppose.

That's simply not true. In my world (server software and networking),
most tasks can be improved by utilizing concurrent programming
paradigms. Even in user interfaces, these are useful. For mathematics,
there's simply no question that making use of concurrent algorithms is
a win. In fact, I can't think of a single case in which doing two
lines of work at once isn't better than doing one at a time, assuming
that accuracy is maintained in the result.

--dho




* Re: [9fans] threads vs forks
  2009-03-03 23:54           ` Devon H. O'Dell
@ 2009-03-04  0:33             ` J.R. Mauro
  2009-03-04  0:54               ` erik quanstrom
  2009-03-06  9:39             ` maht
  1 sibling, 1 reply; 71+ messages in thread
From: J.R. Mauro @ 2009-03-04  0:33 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 6:54 PM, Devon H. O'Dell <devon.odell@gmail.com> wrote:
> 2009/3/3 J.R. Mauro <jrm8005@gmail.com>:
>> Concurrency seems to be one of those things that's "too hard" for
>> everyone, and I don't buy it. There's no reason it needs to be as hard
>> as it is.
>
> That's a fact. If you have access to The ACM Queue, check out
> p16-cantrill-concurrency.pdf (Cantrill and Bonwich on concurrency).

Things like TBB and other libraries that automagically scale repeated
operations up into parallelized ones help alleviate the problems of
getting parallelization to work. They're ugly and they only address
narrow problem sets, but they're attempts at solutions. And if you
look at languages like Lisp and Erlang, you're definitely left with
the feeling that parallelization is being treated as harder than it
is.

I'm not saying it isn't hard, just that there are a lot of people who
seem to be throwing up their hands over it. I suppose I should stop
reading their material.

>
>> And nevermind the fact that it's not really usable for every (or even
>> most) jobs out there. But Intel is pushing it, so that's where we have
>> to go, I suppose.
>
> That's simply not true. In my world (server software and networking),
> most tasks can be improved by utilizing concurrent programming
> paradigms. Even in user interfaces, these are useful. For mathematics,
> there's simply no question that making use of concurrent algorithms is
> a win. In fact, I can't think of a single case in which doing two
> lines of work at once isn't better than doing one at a time, assuming
> that accuracy is maintained in the result.

I should have qualified: I mean *massive* parallelization applied to
"average" use cases. I don't think it's totally unusable (I complain
about synchronous I/O on my phone every day), but it's being pushed as
a panacea, and that is what I think is wrong. Don Knuth holds this
opinion too, though I think he's mostly alone on it, unfortunately.

Of course for mathematically intensive and large-scale operations, the
more parallel you can make things the better.

>
> --dho
>
>




* Re: [9fans] threads vs forks
  2009-03-04  0:33             ` J.R. Mauro
@ 2009-03-04  0:54               ` erik quanstrom
  2009-03-04  1:54                 ` J.R. Mauro
                                   ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: erik quanstrom @ 2009-03-04  0:54 UTC (permalink / raw)
  To: 9fans

> I should have qualified. I mean *massive* parallelization when applied
> to "average" use cases. I don't think it's totally unusable (I
> complain about synchronous I/O on my phone every day), but it's being
> pushed as a panacea, and that is what I think is wrong. Don Knuth
> holds this opinion, but I think he's mostly alone on that,
> unfortunately.

it's interesting that parallel wasn't cool when chips were getting
noticeably faster rapidly.  perhaps the focus on parallelization
is a sign that there aren't any other ideas.

- erik




* Re: [9fans] threads vs forks
  2009-03-04  0:54               ` erik quanstrom
@ 2009-03-04  1:54                 ` J.R. Mauro
  2009-03-04  3:18                   ` James Tomaschke
  2009-03-04  5:19                   ` David Leimbach
  2009-03-04  2:47                 ` John Barham
  2009-03-04  5:24                 ` blstuart
  2 siblings, 2 replies; 71+ messages in thread
From: J.R. Mauro @ 2009-03-04  1:54 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 7:54 PM, erik quanstrom <quanstro@quanstro.net> wrote:
>> I should have qualified. I mean *massive* parallelization when applied
>> to "average" use cases. I don't think it's totally unusable (I
>> complain about synchronous I/O on my phone every day), but it's being
>> pushed as a panacea, and that is what I think is wrong. Don Knuth
>> holds this opinion, but I think he's mostly alone on that,
>> unfortunately.
>
> it's interesting that parallel wasn't cool when chips were getting
> noticably faster rapidly.  perhaps the focus on parallelization
> is a sign there aren't any other ideas.

Indeed, I think it is. The big manufacturers seem to have hit a wall
with clock speed, done a full reverse, and are now just trying to pack
more transistors and cores onto the chip. Not that this is evil, but I
think it is just as bad as the obsession with upping clock speeds:
they're too focused on one path instead of incorporating other cool
ideas (e.g., the things Transmeta was working on with virtualization
and hosting foreign ISAs).

>
> - erik
>
>




* Re: [9fans] threads vs forks
  2009-03-04  0:54               ` erik quanstrom
  2009-03-04  1:54                 ` J.R. Mauro
@ 2009-03-04  2:47                 ` John Barham
  2009-03-04  5:24                 ` blstuart
  2 siblings, 0 replies; 71+ messages in thread
From: John Barham @ 2009-03-04  2:47 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 4:54 PM, erik quanstrom <quanstro@quanstro.net> wrote:
>> I should have qualified. I mean *massive* parallelization when applied
>> to "average" use cases. I don't think it's totally unusable (I
>> complain about synchronous I/O on my phone every day), but it's being
>> pushed as a panacea, and that is what I think is wrong. Don Knuth
>> holds this opinion, but I think he's mostly alone on that,
>> unfortunately.
>
> it's interesting that parallel wasn't cool when chips were getting
> noticably faster rapidly.  perhaps the focus on parallelization
> is a sign there aren't any other ideas.

That seems to be what Knuth thinks. An excerpt from a 2008 interview
with InformIT:

"InformIT: Vendors of multicore processors have expressed frustration
at the difficulty of moving developers to this model. As a former
professor, what thoughts do you have on this transition and how to
make it happen? Is it a question of proper tools, such as better
native support for concurrency in languages, or of execution
frameworks? Or are there other solutions?

Knuth: I don’t want to duck your question entirely. I might as well
flame a bit about my personal unhappiness with the current trend
toward multicore architecture. To me, it looks more or less like the
hardware designers have run out of ideas, and that they’re trying to
pass the blame for the future demise of Moore’s Law to the software
writers by giving us machines that work faster only on a few key
benchmarks! I won’t be surprised at all if the whole multithreading
idea turns out to be a flop, worse than the "Itanium" approach that
was supposed to be so terrific—until it turned out that the wished-for
compilers were basically impossible to write."

Full interview is at http://www.informit.com/articles/article.aspx?p=1193856.




* Re: [9fans] threads vs forks
  2009-03-04  1:54                 ` J.R. Mauro
@ 2009-03-04  3:18                   ` James Tomaschke
  2009-03-04  3:30                     ` erik quanstrom
  2009-03-04  5:19                   ` David Leimbach
  1 sibling, 1 reply; 71+ messages in thread
From: James Tomaschke @ 2009-03-04  3:18 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

J.R. Mauro wrote:
> On Tue, Mar 3, 2009 at 7:54 PM, erik quanstrom <quanstro@quanstro.net> wrote:
>>> I should have qualified. I mean *massive* parallelization when applied
>>> to "average" use cases. I don't think it's totally unusable (I
>>> complain about synchronous I/O on my phone every day), but it's being
>>> pushed as a panacea, and that is what I think is wrong. Don Knuth
>>> holds this opinion, but I think he's mostly alone on that,
>>> unfortunately.
>> it's interesting that parallel wasn't cool when chips were getting
>> noticably faster rapidly.  perhaps the focus on parallelization
>> is a sign there aren't any other ideas.
>
> Indeed, I think it is. The big manufacturers seem to have hit a wall
> with clock speed, done a full reverse, and are now just trying to pack
> more transistors and cores on the chip. Not that this is evil, but I
> think this is just as bad as the obsession with upping the clock
> speeds in that they're too focused on one path instead of
> incorporating other cool ideas (i.e., things Transmeta was working on
> with virtualization and hosting foreign ISAs)

Die size has been the main focus for the foundries; reduced transistor
switch time is just a side benefit of that.  Digital components scale
well here, but analog suffers, and generating a stable clock at high
frequency is an analog-domain problem.

It is much easier to double the transistor count than it is to double
the clock frequency.  You also have to consider the power, heat, and
noise costs of increasing the clock.

I think the reason you didn't see parallelism come out earlier in
the PC market is that they needed to create new mechanisms for I/O.
AMD did this with HyperTransport, and I've seen 32-core (8-socket)
systems with this.  Now Intel has its own I/O rethink out there.

I've been trying to get my industry to look at parallel computing for
many years, and only now are they starting to sell parallel circuit
simulators, and even those are not that efficient.  A traditionally
week-long sim now takes a single day when run on 12 cores.  I'll take
that 7x over 1x anytime, though.

/james




* Re: [9fans] threads vs forks
  2009-03-04  3:18                   ` James Tomaschke
@ 2009-03-04  3:30                     ` erik quanstrom
  2009-03-04  4:44                       ` James Tomaschke
  0 siblings, 1 reply; 71+ messages in thread
From: erik quanstrom @ 2009-03-04  3:30 UTC (permalink / raw)
  To: 9fans

> I think the reason why you didn't see parallelism come out earlier in
> the PC market was because they needed to create new mechanisms for I/O.
>   AMD did this with Hypertransport, and I've seen 32-core (8-socket)
> systems with this.  Now Intel has their own I/O rethink out there.

i think what you're saying is equivalent to saying
(in terms i understand) that memory bandwidth was
so bad that a second processor couldn't do much work.

but i haven't found this to be the case.  even the
highly constrained pentium 4 gets some mileage out of
hyperthreading for the tests i've run.

the intel 5000-series still use a fsb.  and they seem to
scale well from 1 to 4 cores.

are there benchmarks that show otherwise similar
hypertransport systems trouncing intel in multithreaded
performance?  i don't recall seeing anything more than
a moderate (15-20%) advantage.

- erik




* Re: [9fans] threads vs forks
  2009-03-04  3:30                     ` erik quanstrom
@ 2009-03-04  4:44                       ` James Tomaschke
  2009-03-04  5:05                         ` J.R. Mauro
  0 siblings, 1 reply; 71+ messages in thread
From: James Tomaschke @ 2009-03-04  4:44 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

erik quanstrom wrote:
>> I think the reason why you didn't see parallelism come out earlier in
>> the PC market was because they needed to create new mechanisms for I/O.
>>   AMD did this with Hypertransport, and I've seen 32-core (8-socket)
>> systems with this.  Now Intel has their own I/O rethink out there.
>
> i think what you're saying is equivalent to saying
> (in terms i understand) that memory bandwidth was
> so bad that a second processor couldn't do much work.
Yes, bandwidth and latency.
>
> but i haven't found this to be the case.  even the
> highly constrained pentium 4 gets some milage out of
> hyperthreading for the tests i've run.
>
> the intel 5000-series still use a fsb.  and they seem to
> scale well from 1 to 4 cores.

Many of the circuit simulators I use fall flat on their faces after
about 4 cores.  However, I blame this on their algorithms, not the hardware.

I wasn't making an AMD vs Intel comment, just that AMD had created HTX
along with their K8 platform to address scalability concerns with I/O.

> are there benchmarks that show otherwise similar
> hypertransport systems trouncing intel in multithreaded
> performance?  i don't recall seeing anything more than
> a moderate (15-20%) advantage.

I don't have a 16-core Intel system to compare with, but:
http://en.wikipedia.org/wiki/List_of_device_bandwidths#Computer_buses

I think the reason why Intel developed their Common Systems Interconnect
(now called QuickPath Interconnect) was to address its shortcomings.

Both AMD and Intel are looking at I/O because it is and will be a
limiting factor when scaling to higher core counts.

>
> - erik
>
>





* Re: [9fans] threads vs forks
  2009-03-04  4:44                       ` James Tomaschke
@ 2009-03-04  5:05                         ` J.R. Mauro
  2009-03-04  5:50                           ` erik quanstrom
  0 siblings, 1 reply; 71+ messages in thread
From: J.R. Mauro @ 2009-03-04  5:05 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Mar 3, 2009 at 11:44 PM, James Tomaschke <james@orcasystems.com> wrote:
> erik quanstrom wrote:
>>>
>>> I think the reason why you didn't see parallelism come out earlier in the
>>> PC market was because they needed to create new mechanisms for I/O.  AMD did
>>> this with Hypertransport, and I've seen 32-core (8-socket) systems with
>>> this.  Now Intel has their own I/O rethink out there.
>>
>> i think what you're saying is equivalent to saying
>> (in terms i understand) that memory bandwidth was
>> so bad that a second processor couldn't do much work.
>
> Yes bandwidth and latency.
>>
>> but i haven't found this to be the case.  even the
>> highly constrained pentium 4 gets some milage out of
>> hyperthreading for the tests i've run.
>>
>> the intel 5000-series still use a fsb.  and they seem to
>> scale well from 1 to 4 cores.
>
> Many of the circuit simulators I use fall flat on their face after 4 cores,
> say.  However I blame this on their algorithm not hardware.
>
> I wasn't making an AMD vs Intel comment, just that AMD had created HTX along
> with their K8 platform to address scalability concerns with I/O.
>
>> are there benchmarks that show otherwise similar
>> hypertransport systems trouncing intel in multithreaded
>> performance?  i don't recall seeing anything more than
>> a moderate (15-20%) advantage.
>
> I don't have a 16-core Intel system to compare with, but:
> http://en.wikipedia.org/wiki/List_of_device_bandwidths#Computer_buses
>
> I think the reason why Intel developed their Common Systems Interconnect
> (now called QuickPath Interconnect) was to address it's shortcomings.
>
> Both AMD and Intel are looking at I/O because it is and will be a limiting
> factor when scaling to higher core counts.

And soon hard disk latencies are really going to start hurting (they
already are hurting some, I'm sure), and I'm not convinced of the
viability of SSDs.


There was an interesting article I came across that compared the
latencies of accessing a register, a CPU cache, main memory, and disk,
which put them in human terms. As much as we like to say we understand
the difference between a millisecond and a nanosecond, seeing cache
access expressed in terms of moments and a disk access in terms of
years was rather illuminating, if only to me.

The same article also put a Google search at only slightly higher
latency than a hard disk access. The internet really is becoming the
computer, I suppose.

>
>>
>> - erik
>>
>>
>
>
>




* Re: [9fans] threads vs forks
  2009-03-03 18:11   ` Roman V. Shaposhnik
  2009-03-03 18:38     ` Bakul Shah
  2009-03-03 23:08     ` J.R. Mauro
@ 2009-03-04  5:07     ` David Leimbach
  2009-03-04  5:35     ` John Barham
  3 siblings, 0 replies; 71+ messages in thread
From: David Leimbach @ 2009-03-04  5:07 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


On Tue, Mar 3, 2009 at 10:11 AM, Roman V. Shaposhnik <rvs@sun.com> wrote:

> On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
>
> > My knowledge on this subject is about 8 or 9 years old, so check with
> your local Python guru....
> >
> >
> > The last I'd heard about Python's threading is that it was cooperative
> > only, and that you couldn't get real parallelism out of it.  It serves
> > as a means to organize your program in a concurrent manner.
> >
> >
> > In other words no two threads run at the same time in Python, even if
> > you're on a multi-core system, due to something they call a "Global
> > Interpreter Lock".
>
> I believe GIL is as present in Python nowadays as ever. On a related
> note: does anybody know any sane interpreted languages with a decent
> threading model to go along? Stackless python is the only thing that
> I'm familiar with in that department.


I'm a fan of Erlang.  Though I guess it's technically a compiled virtual
machine of sorts, even when it's "escript".

But I've had an absolutely awesome experience over the last year using it,
and so far only wishing it came with the type safety of Haskell :-).

I love Haskell's threading model, actually; whether via data parallelism or
the forkIO interface, it's pretty sane.  Typed data channels, even between
forkIO'd threads.
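As a footnote on the GIL point quoted above, here is a minimal CPython experiment (`busy_sum` is a made-up CPU-bound stand-in): threads give concurrency but no CPU parallelism, while separate processes sidestep the GIL. Only correctness is asserted, since timings vary by machine:

```python
import threading
from multiprocessing import Pool

def busy_sum(n):
    # CPU-bound stand-in: sum of squares below n.
    return sum(i * i for i in range(n))

def run_threaded(args):
    # Threads share one interpreter; under CPython's GIL only one
    # thread runs bytecode at a time, so this gains no CPU parallelism.
    results = [None] * len(args)

    def work(i, n):
        results[i] = busy_sum(n)

    threads = [threading.Thread(target=work, args=(i, n))
               for i, n in enumerate(args)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def run_forked(args):
    # Separate processes each get their own interpreter (and GIL),
    # so the same work can occupy several cores.
    with Pool() as pool:
        return pool.map(busy_sum, args)

if __name__ == "__main__":
    work_items = [200_000] * 4
    assert run_threaded(work_items) == run_forked(work_items)
```

Wall-clock timing the two calls on a multi-core box shows the difference the GIL makes for CPU-bound work.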


>
>
> Thanks,
> Roman.
>
>
>



* Re: [9fans] threads vs forks
  2009-03-04  1:54                 ` J.R. Mauro
  2009-03-04  3:18                   ` James Tomaschke
@ 2009-03-04  5:19                   ` David Leimbach
  1 sibling, 0 replies; 71+ messages in thread
From: David Leimbach @ 2009-03-04  5:19 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


On Tue, Mar 3, 2009 at 5:54 PM, J.R. Mauro <jrm8005@gmail.com> wrote:

> On Tue, Mar 3, 2009 at 7:54 PM, erik quanstrom <quanstro@quanstro.net>
> wrote:
> >> I should have qualified. I mean *massive* parallelization when applied
> >> to "average" use cases. I don't think it's totally unusable (I
> >> complain about synchronous I/O on my phone every day), but it's being
> >> pushed as a panacea, and that is what I think is wrong. Don Knuth
> >> holds this opinion, but I think he's mostly alone on that,
> >> unfortunately.
> >
> > it's interesting that parallel wasn't cool when chips were getting
> > noticably faster rapidly.  perhaps the focus on parallelization
> > is a sign there aren't any other ideas.
>
> Indeed, I think it is. The big manufacturers seem to have hit a wall
> with clock speed, done a full reverse, and are now just trying to pack
> more transistors and cores on the chip. Not that this is evil, but I
> think this is just as bad as the obsession with upping the clock
> speeds in that they're too focused on one path instead of
> incorporating other cool ideas (i.e., things Transmeta was working on
> with virtualization and hosting foreign ISAs)


Can we bring back the Burroughs? :-)


>
>
> >
> > - erik
> >
> >
>
>



* Re: [9fans] threads vs forks
  2009-03-04  0:54               ` erik quanstrom
  2009-03-04  1:54                 ` J.R. Mauro
  2009-03-04  2:47                 ` John Barham
@ 2009-03-04  5:24                 ` blstuart
  2009-03-04  5:37                   ` erik quanstrom
                                     ` (2 more replies)
  2 siblings, 3 replies; 71+ messages in thread
From: blstuart @ 2009-03-04  5:24 UTC (permalink / raw)
  To: 9fans

> it's interesting that parallel wasn't cool when chips were getting
> noticably faster rapidly.  perhaps the focus on parallelization
> is a sign there aren't any other ideas.

Gotta do something with all the extra transistors.  After all, Moore's
law hasn't been repealed.  And pipelines and traditional caches
are pretty good examples of diminishing returns.  So multiple cores
seem a pretty straightforward approach.

Now there is another use that would at least be intellectually interesting
and possibly useful in practice.  Use the transistors for a really big
memory running at cache speed.  But instead of it being a hardware
cache, manage it explicitly.  In effect, we have a very high speed
main memory, and the traditional main memory is backing store.
It'd give a use for all those paging algorithms that aren't particularly
justified at the main memory-disk boundary any more.  And you
can fit a lot of Plan 9 executable images in a 64MB on-chip memory
space.  Obviously, it wouldn't be a good fit for severely memory-hungry
apps, and it might be a dead end overall, but it'd at least be something
different...

BLS





* Re: [9fans] threads vs forks
  2009-03-03 18:11   ` Roman V. Shaposhnik
                       ` (2 preceding siblings ...)
  2009-03-04  5:07     ` David Leimbach
@ 2009-03-04  5:35     ` John Barham
  3 siblings, 0 replies; 71+ messages in thread
From: John Barham @ 2009-03-04  5:35 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> I believe GIL is as present in Python nowadays as ever. On a related
> note: does anybody know any sane interpreted languages with a decent
> threading model to go along? Stackless python is the only thing that
> I'm familiar with in that department.

Check out Lua's coroutines: http://www.lua.org/manual/5.1/manual.html#2.11

Here's an implementation of the sieve of Eratosthenes using Lua
coroutines similar to the Limbo one:
http://www.lua.org/cgi-bin/demo?sieve
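In the thread's lingua franca, Python generators support the same pipeline style. This is a rough sketch of the coroutine sieve, not a transcription of the Lua code at the link:

```python
def numbers():
    # Source stage: yields 2, 3, 4, ...
    n = 2
    while True:
        yield n
        n += 1

def remove_multiples(src, prime):
    # Filter stage: pass along only values not divisible by prime.
    for n in src:
        if n % prime:
            yield n

def primes(limit):
    # Each prime found chains a new filter onto the pipeline,
    # just as the coroutine sieve does.
    src = numbers()
    while True:
        p = next(src)
        if p > limit:
            return
        yield p
        src = remove_multiples(src, p)

print(list(primes(30)))
```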




* Re: [9fans] threads vs forks
  2009-03-04  5:24                 ` blstuart
@ 2009-03-04  5:37                   ` erik quanstrom
  2009-03-04 16:29                   ` Roman V Shaposhnik
  2009-03-04 16:56                   ` john
  2 siblings, 0 replies; 71+ messages in thread
From: erik quanstrom @ 2009-03-04  5:37 UTC (permalink / raw)
  To: 9fans

> Now there is another use that would at least be intellectually interesting
> and possible useful in practice.  Use the transistors for a really big
> memory running at cache speed.  But instead of it being a hardware
> cache, manage it explicitly.  In effect, we have a very high speed
> main memory, and the traditional main memory is backing store.
> It'd give a use for all those paging algorithms that aren't particularly
> justified at the main memory-disk boundary any more.  And you
> can fit a lot of Plan 9 executable images in a 64MB on-chip memory
> space.  Obviously, it wouldn't be a good fit for severely memory-hungry
> apps, and it might be a dead end overall, but it'd at least be something
> different...

ken's fs already has the machinery to handle this.  one could imagine
a cachefs that knew how to manage this for venti.  (though venti seems
like a poor fit.)  there are lots of interesting uses of explicitly managed,
hierarchical caches.  yet so far hardware has done its level best to hide
this.

- erik




* Re: [9fans] threads vs forks
  2009-03-04  5:05                         ` J.R. Mauro
@ 2009-03-04  5:50                           ` erik quanstrom
  2009-03-04  6:08                             ` andrey mirtchovski
  2009-03-04 16:52                             ` J.R. Mauro
  0 siblings, 2 replies; 71+ messages in thread
From: erik quanstrom @ 2009-03-04  5:50 UTC (permalink / raw)
  To: 9fans

> >
> > Both AMD and Intel are looking at I/O because it is and will be a limiting
> > factor when scaling to higher core counts.

i/o starts sucking wind with one core.
that's why we differentiate i/o from everything
else we do.

> And soon hard disk latencies are really going to start hurting (they
> already are hurting some, I'm sure), and I'm not convinced of the
> viability of SSDs.

i'll assume you mean throughput.  hard drive latency has been a big deal
for a long time.  tanenbaum integrated knowledge of track layout into
his minix elevator algorithm.

i think the gap between cpu performance and hd performance is narrowing,
not getting wider.

i don't have accurate measurements on how much real-world performance
difference there is between a core i7 and an intel 5000.  it's generally not
spectacular, clock-for-clock. on the other hand, when the intel 5000-series
was released, the rule of thumb for a sata hd was 50mb/s.  it's not too hard
to find regular sata hard drives that do 110mb/s today.  the ssd drives we've
(coraid) tested have been spectacular --- reading at > 200mb/s.  if you want
to talk latency, ssds can deliver 1/100th the latency of spinning media.
there's no way that the core i7 is 100x faster than the intel 5000.

- erik




* Re: [9fans] threads vs forks
  2009-03-04  5:50                           ` erik quanstrom
@ 2009-03-04  6:08                             ` andrey mirtchovski
  2009-03-04 16:52                             ` J.R. Mauro
  1 sibling, 0 replies; 71+ messages in thread
From: andrey mirtchovski @ 2009-03-04  6:08 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> the ssd drives we've
> (coraid) tested have been spectacular --- reading at > 200mb/s.

you know, i've read all the reviews and seen all the windows
benchmarks.  but this info, coming from somebody on this list, is much
more reassuring than all the slashdot articles.

the tests didn't involve plan9 by any chance, did they? ;)




* Re: [9fans] threads vs forks
  2009-03-03 16:47 ` John Barham
@ 2009-03-04  9:37   ` Vincent Schut
  2009-03-04  9:58     ` hugo rivera
  0 siblings, 1 reply; 71+ messages in thread
From: Vincent Schut @ 2009-03-04  9:37 UTC (permalink / raw)
  To: 9fans

John Barham wrote:
> On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera <uair00@gmail.com> wrote:
>
>> I have to launch many tasks running in parallel (~5000) in a
>> cluster running linux. Each of the task performs some astronomical
>> calculations and I am not pretty sure if using fork is the best answer
>> here.
>> First of all, all the programming is done in python and c...
>
> Take a look at the multiprocessing package
> (http://docs.python.org/library/multiprocessing.html), newly
> introduced with Python 2.6 and 3.0:
>
> "multiprocessing is a package that supports spawning processes using
> an API similar to the threading module. The multiprocessing package
> offers both local and remote concurrency, effectively side-stepping
> the Global Interpreter Lock by using subprocesses instead of threads."
>
> It should be a quick and easy way to set up a cluster-wide job
> processing system (provided all your jobs are driven by Python).

Better: use parallelpython (www.parallelpython.org). Afaik
multiprocessing is geared towards multi-core systems (one machine),
while pp is also suitable for real clusters with multiple PCs. No special
cluster software is needed. It will start (here's your fork) one or more
python interpreters on each node, and then you can submit jobs to those
'workers'. The interpreters are kept alive between jobs, so the startup
penalty becomes negligible when the number of jobs is large enough.
Using it here to process massive amounts of satellite data, works like a
charm.
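For a single machine, the worker-pool pattern both suggestions share boils down to a few lines with the standard-library multiprocessing module. A minimal sketch, where `calculate` is a hypothetical stand-in for one astronomical task:

```python
from multiprocessing import Pool

def calculate(task_id):
    # Placeholder for one astronomical calculation.
    return task_id * task_id

if __name__ == "__main__":
    tasks = range(5000)
    # One worker process per core; jobs are distributed automatically,
    # and the GIL doesn't matter because each worker is a separate process.
    with Pool() as pool:
        results = pool.map(calculate, tasks)
    print(len(results))
```

Distributing across multiple machines still needs something like pp or the cluster's own scheduler on top.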

Vincent.
>
> It also looks like it's been (partially?) back-ported to Python 2.4
> and 2.5: http://pypi.python.org/pypi/processing.
>
>   John
>
>





* Re: [9fans] threads vs forks
  2009-03-04  9:37   ` Vincent Schut
@ 2009-03-04  9:58     ` hugo rivera
  2009-03-04 10:30       ` Vincent Schut
  0 siblings, 1 reply; 71+ messages in thread
From: hugo rivera @ 2009-03-04  9:58 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Thanks for the advice.
Nevertheless, I am in no position to decide what pieces of software the
cluster will run; I just have to deal with what I have. But anyway, I
can suggest other possibilities.

2009/3/4, Vincent Schut <schut@sarvision.nl>:
> John Barham wrote:
>
> > On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera <uair00@gmail.com> wrote:
> >
> >
> > > I have to launch many tasks running in parallel (~5000) in a
> > > cluster running linux. Each of the task performs some astronomical
> > > calculations and I am not pretty sure if using fork is the best answer
> > > here.
> > > First of all, all the programming is done in python and c...
> > >
> >
> > Take a look at the multiprocessing package
> > (http://docs.python.org/library/multiprocessing.html),
> newly
> > introduced with Python 2.6 and 3.0:
> >
> > "multiprocessing is a package that supports spawning processes using
> > an API similar to the threading module. The multiprocessing package
> > offers both local and remote concurrency, effectively side-stepping
> > the Global Interpreter Lock by using subprocesses instead of threads."
> >
> > It should be a quick and easy way to set up a cluster-wide job
> > processing system (provided all your jobs are driven by Python).
> >
>
>  Better: use parallelpython (www.parallelpython.org). Afaik multiprocessing
> is geared towards multi-core systems (one machine), while pp is also
> suitable for real clusters with more pc's. No special cluster software
> needed. It will start (here's your fork) a (some) python interpreters on
> each node, and then you can submit jobs to those 'workers'. The interpreters
> are kept alive between jobs, so the startup penalty becomes neglectibly when
> the number of jobs is large enough.
>  Using it here to process massive amounts of satellite data, works like a
> charm.
>
>  Vincent.
>
>
> >
> > It also looks like it's been (partially?) back-ported to Python 2.4
> > and 2.5: http://pypi.python.org/pypi/processing.
> >
> >  John
> >
> >
> >
>
>
>


--
Hugo




* Re: [9fans] threads vs forks
  2009-03-04  9:58     ` hugo rivera
@ 2009-03-04 10:30       ` Vincent Schut
  2009-03-04 10:45         ` hugo rivera
  2009-03-04 14:57         ` ron minnich
  0 siblings, 2 replies; 71+ messages in thread
From: Vincent Schut @ 2009-03-04 10:30 UTC (permalink / raw)
  To: 9fans

hugo rivera wrote:
> Thanks for the advice.
> Nevertheless I am in no position to decide what pieces of software the
> cluster will run, I just have to deal with what I have, but anyway I
> can suggest other possibilities.

Well, it depends on how you define 'software the cluster will run'. Do you
mean cluster management software, or really any program or script or
python module that needs to be installed on each node? Because for pp,
you won't need any cluster software. pp is just some python module and
helper scripts. You *do* need to install this (pure python) module on
each node, yes, but that's it, nothing else needed.
Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
expert, but I don't think you can do threading/forking from one machine
to another (on linux). So I suppose there already is some cluster
management software involved? And while you appear to be "in no position
to decide what pieces of software the cluster will run", you might want
to enlighten us on what this cluster /will/ run? Your best solution
might depend on that...

Cheers,
Vincent.





* Re: [9fans] threads vs forks
  2009-03-04 10:30       ` Vincent Schut
@ 2009-03-04 10:45         ` hugo rivera
  2009-03-04 11:15           ` Vincent Schut
  2009-03-04 14:57         ` ron minnich
  1 sibling, 1 reply; 71+ messages in thread
From: hugo rivera @ 2009-03-04 10:45 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

The cluster has torque installed as the resource manager. I think it
runs on top of pbs (an older project).
As far as I know, I just have to call a qsub command to submit my
jobs to a queue; the resource manager then allocates a processor in
the cluster for my process to run on till it is finished.
And I am not really sure I have access to all the nodes, so I may not
be able to install pp on each one of them.

2009/3/4, Vincent Schut <schut@sarvision.nl>:
> hugo rivera wrote:
>
> > Thanks for the advice.
> > Nevertheless I am in no position to decide what pieces of software the
> > cluster will run, I just have to deal with what I have, but anyway I
> > can suggest other possibilities.
> >
>
>  Well, depends on how you define 'software the cluster will run'. Do you
> mean cluster management software, or really any program or script or python
> module that needs to be installed on each node? Because for pp, you won't
> need any cluster software. pp is just some python module and helper scripts.
> You *do* need to install this (pure python) module on each node, yes, but
> that's it, nothing else needed.
>  Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
> expert, but I don't think you can do threading/forking from one machine to
> another (on linux). So I suppose there already is some cluster management
> software involved? And while you appear to be "in no position to decide what
> pieces of software the cluster will run", you might want to enlighten us on
> what this cluster /will/ run? Your best solution might depend on that...
>
>  Cheers,
>  Vincent.
>
>
>


--
Hugo




* Re: [9fans] threads vs forks
  2009-03-04 10:45         ` hugo rivera
@ 2009-03-04 11:15           ` Vincent Schut
  2009-03-04 11:33             ` hugo rivera
  0 siblings, 1 reply; 71+ messages in thread
From: Vincent Schut @ 2009-03-04 11:15 UTC (permalink / raw)
  To: 9fans

hugo rivera wrote:
> The cluster has torque installed as the resource manager. I think it
> runs of top of pbs (an older project).
> As far as I know now I just have to call a qsub command to submit my
> jobs on a queue, then the resource manager allocates a processor in
> the cluster for my process to run till is finished.

Well, I know neither torque nor pbs, but I'm guessing that when you
submit a job, this job will be some program or script that is run on the
allocated processor? If so, your initial question of forking vs
threading is bogus. Your cluster manager will run (exec) your job, which,
if it is a python script, will start a python interpreter for each job. I
guess that's the overhead you get for running a flexible cluster
system, flexible meaning that it can run any type of job (shell script,
binary executable, python script, perl, etc.).
However, while the overhead of starting a new python process each time
may seem significant in absolute terms, if each job processes lots of
data and takes, as you said, 5 min to run on a decent processor, don't
you think the startup time for the python process becomes
insignificant? For example, on a decent machine here, the first
time python takes 0.224 secs to start and shut down immediately, and
consecutive starts take only about 0.009 secs because everything is
still in memory. Let's take the 0.224 secs as a worst-case scenario.
That would be approx 0.075 percent of your job execution time. Now let's
say you have 6 machines with 8 cores each and perfect scaling: all your
jobs would take 6000 / (6*8) * 5 min = 625 minutes (10 hours 25 mins)
without python starting each time, and 625 minutes and 28 seconds with
python starting anew for each job. Don't you think you could just live
with those 28 extra seconds? Just reading this message might already
have taken you more than 28 seconds...
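The arithmetic holds up; as a quick check of the figures in this message:

```python
# Reproduce the estimate above: total runtime and the added python
# startup cost across 6 machines x 8 cores, assuming perfect scaling.
jobs = 6000
workers = 6 * 8
job_minutes = 5
startup_secs = 0.224          # worst-case interpreter startup

jobs_per_worker = jobs / workers                  # 125 jobs each
total_minutes = jobs_per_worker * job_minutes     # 625 minutes total
extra_seconds = jobs_per_worker * startup_secs    # ~28 extra seconds
print(total_minutes, extra_seconds)
```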

Vincent.

> And I am not really sure if I have access to all the nodes, so I can
> install pp on each one of them.
>
> 2009/3/4, Vincent Schut <schut@sarvision.nl>:
>> hugo rivera wrote:
>>
>>> Thanks for the advice.
>>> Nevertheless I am in no position to decide what pieces of software the
>>> cluster will run, I just have to deal with what I have, but anyway I
>>> can suggest other possibilities.
>>>
>>  Well, depends on how you define 'software the cluster will run'. Do you
>> mean cluster management software, or really any program or script or python
>> module that needs to be installed on each node? Because for pp, you won't
>> need any cluster software. pp is just some python module and helper scripts.
>> You *do* need to install this (pure python) module on each node, yes, but
>> that's it, nothing else needed.
>>  Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
>> expert, but I don't think you can do threading/forking from one machine to
>> another (on linux). So I suppose there already is some cluster management
>> software involved? And while you appear to be "in no position to decide what
>> pieces of software the cluster will run", you might want to enlighten us on
>> what this cluster /will/ run? Your best solution might depend on that...
>>
>>  Cheers,
>>  Vincent.
>>
>>
>>
>
>





* Re: [9fans] threads vs forks
  2009-03-04 11:15           ` Vincent Schut
@ 2009-03-04 11:33             ` hugo rivera
  2009-03-04 13:23               ` Uriel
  0 siblings, 1 reply; 71+ messages in thread
From: hugo rivera @ 2009-03-04 11:33 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

You are right. I was totally confused at the beginning.
Thanks a lot.

2009/3/4, Vincent Schut <schut@sarvision.nl>:
> hugo rivera wrote:
>
> > The cluster has torque installed as the resource manager. I think it
> > runs of top of pbs (an older project).
> > As far as I know now I just have to call a qsub command to submit my
> > jobs on a queue, then the resource manager allocates a processor in
> > the cluster for my process to run till it is finished.
> >
>
>  Well, I don't know torque or pbs, but I'm guessing that when you
> submit a job, this job will be some program or script that is run on the
> allocated processor? If so, your initial question of forking vs threading is
> bogus. Your cluster manager will run (exec) your job, which if it is a
> python script will start a python interpreter for each job. I guess that's
> the overhead you get when running a flexible cluster system, flexible
> meaning that it can run any type of job (shell script, binary executable,
> python script, perl, etc.).
>  However, your overhead of starting new python processes each time may seem
> significant when viewed in absolute terms, but if each job processes lots of
> data and takes, as you said, 5 min to run on a decent processor, don't you
> think the startup time for the python process would become non-significant?
> For example, on a decent machine here, the first time python takes 0.224
> secs to start and shut down immediately, and consecutive starts take only
> about 0.009 secs because everything is still in memory. Let's take the 0.224
> secs for a worst case scenario. That would be approx 0.075 percent of your
> job execution time. Now let's say you have 6 machines with 8 cores each and
> perfect scaling, all your jobs would take 6000 / (6*8) *5min = 625 minutes
> (10 hours 25 mins) without python starting each time, and 625 minutes and 28
> seconds with python starting anew each job. Don't you think you could just
> live with these 28 seconds more? Just reading this message might already
> have taken you more than those 28 seconds...
>
>  Vincent.
>
>
>
> > And I am not really sure if I have access to all the nodes, so I can
> > install pp on each one of them.
> >
> > 2009/3/4, Vincent Schut <schut@sarvision.nl>:
> >
> > > hugo rivera wrote:
> > >
> > >
> > > > Thanks for the advice.
> > > > Nevertheless I am in no position to decide what pieces of software the
> > > > cluster will run, I just have to deal with what I have, but anyway I
> > > > can suggest other possibilities.
> > > >
> > > >
> > >  Well, depends on how you define 'software the cluster will run'. Do you
> > > mean cluster management software, or really any program or script or
> python
> > > module that needs to be installed on each node? Because for pp, you
> won't
> > > need any cluster software. pp is just some python module and helper
> scripts.
> > > You *do* need to install this (pure python) module on each node, yes,
> but
> > > that's it, nothing else needed.
> > >  Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
> > > expert, but I don't think you can do threading/forking from one machine
> to
> > > another (on linux). So I suppose there already is some cluster
> management
> > > software involved? And while you appear to be "in no position to decide
> what
> > > pieces of software the cluster will run", you might want to enlighten us
> on
> > > what this cluster /will/ run? Your best solution might depend on that...
> > >
> > >  Cheers,
> > >  Vincent.
> > >
> > >
> > >
> > >
> >
> >
> >
>
>
>


--
Hugo
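The qsub-based workflow the thread converged on can be sketched in a few lines. This is a hypothetical illustration: the script name "reduce.sh" and the ARGS variable are made-up placeholders, not anything from hugo's actual setup; qsub's `-v` option (pass environment variables to the job) is standard in PBS/Torque.

```python
# One qsub submission per input data set; torque/PBS does the placement,
# so no forking or threading is needed on the submitting side.
import subprocess

def build_qsub_cmd(script, args):
    """Build argv for one submission; -v passes variables into the job env."""
    return ["qsub", "-v", "ARGS=%s" % " ".join(args), script]

def submit(script, args):
    """Submit one job and return the completed qsub process."""
    return subprocess.run(build_qsub_cmd(script, args),
                          capture_output=True, text=True)

if __name__ == "__main__":
    for field in ["field001", "field002"]:  # hypothetical data set names
        print(" ".join(build_qsub_cmd("reduce.sh", [field])))
```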



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04 11:33             ` hugo rivera
@ 2009-03-04 13:23               ` Uriel
  0 siblings, 0 replies; 71+ messages in thread
From: Uriel @ 2009-03-04 13:23 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

What about xcpu?


On Wed, Mar 4, 2009 at 12:33 PM, hugo rivera <uair00@gmail.com> wrote:
> you are right. I was totally confused at the beggining.
> Thanks a lot.
>
> 2009/3/4, Vincent Schut <schut@sarvision.nl>:
>> hugo rivera wrote:
>>
>> > The cluster has torque installed as the resource manager. I think it
>> > runs on top of pbs (an older project).
>> > As far as I know now I just have to call a qsub command to submit my
>> > jobs on a queue, then the resource manager allocates a processor in
>> > the cluster for my process to run till it is finished.
>> >
>>
>>  Well, I don't know torque or pbs, but I'm guessing that when you
>> submit a job, this job will be some program or script that is run on the
>> allocated processor? If so, your initial question of forking vs threading is
>> bogus. Your cluster manager will run (exec) your job, which if it is a
>> python script will start a python interpreter for each job. I guess that's
>> the overhead you get when running a flexible cluster system, flexible
>> meaning that it can run any type of job (shell script, binary executable,
>> python script, perl, etc.).
>>  However, your overhead of starting new python processes each time may seem
>> significant when viewed in absolute terms, but if each job processes lots of
>> data and takes, as you said, 5 min to run on a decent processor, don't you
>> think the startup time for the python process would become non-significant?
>> For example, on a decent machine here, the first time python takes 0.224
>> secs to start and shut down immediately, and consecutive starts take only
>> about 0.009 secs because everything is still in memory. Let's take the 0.224
>> secs for a worst case scenario. That would be approx 0.075 percent of your
>> job execution time. Now let's say you have 6 machines with 8 cores each and
>> perfect scaling, all your jobs would take 6000 / (6*8) *5min = 625 minutes
>> (10 hours 25 mins) without python starting each time, and 625 minutes and 28
>> seconds with python starting anew each job. Don't you think you could just
>> live with these 28 seconds more? Just reading this message might already
>> have taken you more than those 28 seconds...
>>
>>  Vincent.
>>
>>
>>
>> > And I am not really sure if I have access to all the nodes, so I can
>> > install pp on each one of them.
>> >
>> > 2009/3/4, Vincent Schut <schut@sarvision.nl>:
>> >
>> > > hugo rivera wrote:
>> > >
>> > >
>> > > > Thanks for the advice.
>> > > > Nevertheless I am in no position to decide what pieces of software the
>> > > > cluster will run, I just have to deal with what I have, but anyway I
>> > > > can suggest other possibilities.
>> > > >
>> > > >
>> > >  Well, depends on how you define 'software the cluster will run'. Do you
>> > > mean cluster management software, or really any program or script or
>> python
>> > > module that needs to be installed on each node? Because for pp, you
>> won't
>> > > need any cluster software. pp is just some python module and helper
>> scripts.
>> > > You *do* need to install this (pure python) module on each node, yes,
>> but
>> > > that's it, nothing else needed.
>> > >  Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
>> > > expert, but I don't think you can do threading/forking from one machine
>> to
>> > > another (on linux). So I suppose there already is some cluster
>> management
>> > > software involved? And while you appear to be "in no position to decide
>> what
>> > > pieces of software the cluster will run", you might want to enlighten us
>> on
>> > > what this cluster /will/ run? Your best solution might depend on that...
>> > >
>> > >  Cheers,
>> > >  Vincent.
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> >
>>
>>
>>
>
>
> --
> Hugo
>
>



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04 10:30       ` Vincent Schut
  2009-03-04 10:45         ` hugo rivera
@ 2009-03-04 14:57         ` ron minnich
  1 sibling, 0 replies; 71+ messages in thread
From: ron minnich @ 2009-03-04 14:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: 9fans

On Wed, Mar 4, 2009 at 2:30 AM, Vincent Schut <schut@sarvision.nl> wrote:
> hugo rivera wrote:
> Now I'm not an
> expert, but I don't think you can do threading/forking from one machine to
> another (on linux).

You can with bproc, but it's not supported past 2.6.21 or so.

ron



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04  5:24                 ` blstuart
  2009-03-04  5:37                   ` erik quanstrom
@ 2009-03-04 16:29                   ` Roman V Shaposhnik
  2009-03-04 16:56                   ` john
  2 siblings, 0 replies; 71+ messages in thread
From: Roman V Shaposhnik @ 2009-03-04 16:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 2009-03-03 at 23:24 -0600, blstuart@bellsouth.net wrote:
> > it's interesting that parallel wasn't cool when chips were getting
> > noticeably faster rapidly.  perhaps the focus on parallelization
> > is a sign there aren't any other ideas.
>
> Gotta do something with all the extra transistors.  After all, Moore's
> law hasn't been repealed.  And pipelines and traditional caches
> are pretty good examples of diminishing returns.  So multiple cores
> seems a pretty straightforward approach.

Our running joke circa '05 was that the industry was suffering from
the "transistor overproduction crisis". One only needs to look at other
"overproduction crises" (especially in the food industry) to appreciate
the similarities.

> Now there is another use that would at least be intellectually interesting
> and possibly useful in practice.  Use the transistors for a really big
> memory running at cache speed.  But instead of it being a hardware
> cache, manage it explicitly.  In effect, we have a very high speed
> main memory, and the traditional main memory is backing store.
> It'd give a use for all those paging algorithms that aren't particularly
> justified at the main memory-disk boundary any more.  And you
> can fit a lot of Plan 9 executable images in a 64MB on-chip memory
> space.  Obviously, it wouldn't be a good fit for severely memory-hungry
> apps, and it might be a dead end overall, but it'd at least be something
> different...

One could argue that the transactional memory model is supposed to be
exactly that.

Thanks,
Roman.
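Brian's explicitly managed fast memory is, in effect, ordinary paging done in software. A toy sketch of the idea (an LRU-managed "on-chip" store with main memory as backing store; all names here are illustrative, not a hardware proposal):

```python
from collections import OrderedDict

class ExplicitCache:
    """Toy model: a small, explicitly managed fast memory with main
    memory as backing store. Victims are evicted LRU-style and written
    back, exactly the paging algorithms mentioned above."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing      # dict standing in for main memory
        self.fast = OrderedDict()   # the "on-chip" memory

    def load(self, addr):
        if addr in self.fast:               # hit: refresh recency
            self.fast.move_to_end(addr)
            return self.fast[addr]
        value = self.backing[addr]          # miss: page in from backing store
        self.fast[addr] = value
        if len(self.fast) > self.capacity:  # evict least-recently-used
            victim, data = self.fast.popitem(last=False)
            self.backing[victim] = data     # write the victim back
        return value
```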




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04  5:50                           ` erik quanstrom
  2009-03-04  6:08                             ` andrey mirtchovski
@ 2009-03-04 16:52                             ` J.R. Mauro
  2009-03-04 17:14                               ` ron minnich
  1 sibling, 1 reply; 71+ messages in thread
From: J.R. Mauro @ 2009-03-04 16:52 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 4, 2009 at 12:50 AM, erik quanstrom <quanstro@quanstro.net> wrote:
>> >
>> > Both AMD and Intel are looking at I/O because it is and will be a limiting
>> > factor when scaling to higher core counts.
>
> i/o starts sucking wind with one core.
> that's why we differentiate i/o from everything
> else we do.
>
>> And soon hard disk latencies are really going to start hurting (they
>> already are hurting some, I'm sure), and I'm not convinced of the
>> viability of SSDs.
>
> i'll assume you mean throughput.  hard drive latency has been a big deal
> for a long time.  tanenbaum integrated knowledge of track layout into
> his minix elevator algorithm.

Yes, sorry.

>
> i think the gap between cpu performance and hd performance is narrowing,
> not getting wider.
>
> i don't have accurate measurements on how much real-world performance
> difference there is between a core i7 and an intel 5000.  it's generally not
> spectacular, clock-for-clock. on the other hand, when the intel 5000-series
> was released, the rule of thumb for a sata hd was 50mb/s.  it's not too hard
> to find regular sata hard drives that do 110mb/s today.  the ssd drives we've
> (coraid) tested have been spectacular --- reading at > 200mb/s.  if you want
> to talk latency, ssds can deliver 1/100th the latency of spinning media.
> there's no way that the core i7 is 100x faster than the intel 5000.

For the costs (in terms of power and durability) hard drives are
really a pain, not just for some of the companies I've talked to that
are burning out terabyte drives in a matter of weeks, but for "mere
mortals" as well. And I'm sorry but the performance of hard drives is
*not* very good, despite it improving. Every time I do something on a
large directory tree, my drive (which is a model from last year)
grinds and moans and takes, IMO, too long to do things. Putting 4GB of
RAM in my computer helped, but the buffering algorithms aren't
psychic, so I still pay a penalty the first time I use certain
directories.

Now I haven't tested an SSD for performance, but I know they are
better. If I got one, this problem would likely subside, but I'm not
convinced that SSDs are durable enough, despite what the manufacturers
say. I haven't seen many torture tests on them, but the fact that
erasing a block destroys it a little bit is scary. I do a lot of
sustained writes with my typical desktop workload over the same files,
and I'd rather not trust my data to something delicate enough that
filesystem algorithms have to be optimized so it doesn't "wear out".

I guess, in essence, I just want my flying car today.

>
> - erik
>
>



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04  5:24                 ` blstuart
  2009-03-04  5:37                   ` erik quanstrom
  2009-03-04 16:29                   ` Roman V Shaposhnik
@ 2009-03-04 16:56                   ` john
  2 siblings, 0 replies; 71+ messages in thread
From: john @ 2009-03-04 16:56 UTC (permalink / raw)
  To: 9fans

>> it's interesting that parallel wasn't cool when chips were getting
>> noticeably faster rapidly.  perhaps the focus on parallelization
>> is a sign there aren't any other ideas.
>
> Gotta do something with all the extra transistors.  After all, Moore's
> law hasn't been repealed.  And pipelines and traditional caches
> are pretty good examples of diminishing returns.  So multiple cores
> seems a pretty straightforward approach.
>
> Now there is another use that would at least be intellectually interesting
> and possibly useful in practice.  Use the transistors for a really big
> memory running at cache speed.  But instead of it being a hardware
> cache, manage it explicitly.  In effect, we have a very high speed
> main memory, and the traditional main memory is backing store.
> It'd give a use for all those paging algorithms that aren't particularly
> justified at the main memory-disk boundary any more.  And you
> can fit a lot of Plan 9 executable images in a 64MB on-chip memory
> space.  Obviously, it wouldn't be a good fit for severely memory-hungry
> apps, and it might be a dead end overall, but it'd at least be something
> different...
>
> BLS

64 MB is enough to run a lot of Plan 9 apps and the kernel
simultaneously, sure.  But you can't fit Windows or Firefox in there,
so it's probably not going to happen--if you can't fit either of the
world's two most-used consumer apps, I don't think Intel will bother.
Besides that, doing such a thing would involve departing from the
hallowed CPU-cache-memory-swap-disk architecture we've held so dear
since dinosaurs roamed the earth.

Better off to just beef up the caches; there are big benefits and
cash prizes to be had from higher L1 hit rates.


John Floren




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04 16:52                             ` J.R. Mauro
@ 2009-03-04 17:14                               ` ron minnich
  2009-03-04 17:27                                 ` William Josephson
                                                   ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: ron minnich @ 2009-03-04 17:14 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 4, 2009 at 8:52 AM, J.R. Mauro <jrm8005@gmail.com> wrote:

> Now I haven't tested an SSD for performance, but I know they are
> better.

Well that I don't understand at all. Is this "faith-based" performance
measurement? :-)

I have a friend who is doing lots of SSD testing and they're not
always better. For some cases, you pay a whole lot more for 2x greater
throughput.

it's not as simple as "know they are better".

>If I got one, this problem would likely subside, but I'm not
> convinced that SSDs are durable enough, despite what the manufacturers
> say. I haven't seen many torture tests on them, but the fact that
> erasing a block destroys it a little bit is scary. I do a lot of
> sustained writes with my typical desktop workload over the same files,
> and I'd rather not trust them to something that is delicate enough to
> need filesystem algorithms to be optimized for so they don't "wear
> out".

in most cases wear leveling is not in the file system. It's in the
hardware or in a PowerPC that is in the SSD controller.  It's worth
doing some reading here.

That said, I sure would like to have a fusion IO card for venti. From
what my friend is telling me the fusion card would be ideal for venti
-- as long as we keep only the arenas on it.

ron
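The controller-side wear leveling ron describes can be illustrated with a toy remapping table. This is only a sketch of the idea that the logical-to-physical remap lives below the file system; real SSD firmware is far more involved:

```python
class WearLeveler:
    """Toy controller-side wear leveling: each write of a logical block
    is redirected to the least-erased available physical block."""

    def __init__(self, nblocks):
        self.erases = [0] * nblocks   # per-physical-block erase count
        self.map = {}                 # logical block -> physical block

    def write(self, logical):
        used = set(self.map.values())
        # candidates: unmapped physical blocks, plus this block's old slot
        free = [p for p in range(len(self.erases))
                if p not in used or p == self.map.get(logical)]
        phys = min(free, key=lambda p: self.erases[p])
        self.map[logical] = phys
        self.erases[phys] += 1        # each write costs an erase cycle
        return phys
```

Rewriting the same logical block repeatedly spreads the erase cycles evenly across all physical blocks instead of burning out one, which is why the file system above never sees it.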



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04 17:14                               ` ron minnich
@ 2009-03-04 17:27                                 ` William Josephson
  2009-03-04 18:15                                 ` erik quanstrom
  2009-03-05  3:32                                 ` J.R. Mauro
  2 siblings, 0 replies; 71+ messages in thread
From: William Josephson @ 2009-03-04 17:27 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 04, 2009 at 09:14:35AM -0800, ron minnich wrote:
> in most cases wear leveling is not in the file system. It's in the
> hardware or in a PowerPC that is in the SSD controller.  It's worth
> doing some reading here.

With the exception of a few embedded systems running things
like YAFFS and JFFS2, wear levelling is done in the controller.

> That said, I sure would like to have a fusion IO card for venti. From
> what my friend is telling me the fusion card would be ideal for venti
> -- as long as we keep only the arenas  on it.

The Fusion IO cards are pretty impressive: I have one here.
I am curious why arena only, however.  One thing I have found
is that to get good performance requires a fair bit of concurrency.

 -WJ



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04 17:14                               ` ron minnich
  2009-03-04 17:27                                 ` William Josephson
@ 2009-03-04 18:15                                 ` erik quanstrom
  2009-03-05  3:32                                 ` J.R. Mauro
  2 siblings, 0 replies; 71+ messages in thread
From: erik quanstrom @ 2009-03-04 18:15 UTC (permalink / raw)
  To: 9fans

> That said, I sure would like to have a fusion IO card for venti. From
> what my friend is telling me the fusion card would be ideal for venti
> -- as long as we keep only the arenas  on it.

even better for ken's fs.  i would imagine the performance difference
between the fusion i/o card and mass storage is similar to that between
wrens and the jukebox.

- erik



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-04 17:14                               ` ron minnich
  2009-03-04 17:27                                 ` William Josephson
  2009-03-04 18:15                                 ` erik quanstrom
@ 2009-03-05  3:32                                 ` J.R. Mauro
  2009-03-05  3:39                                   ` erik quanstrom
  2009-03-05  3:55                                   ` William K. Josephson
  2 siblings, 2 replies; 71+ messages in thread
From: J.R. Mauro @ 2009-03-05  3:32 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 4, 2009 at 12:14 PM, ron minnich <rminnich@gmail.com> wrote:
> On Wed, Mar 4, 2009 at 8:52 AM, J.R. Mauro <jrm8005@gmail.com> wrote:
>
>> Now I haven't tested an SSD for performance, but I know they are
>> better.
>
> Well that I don't understand at all. Is this "faith-based" performance
> measurement? :-)

No, I have seen several benchmarks. The benchmarks I haven't seen are
ones for "how long does it take to actually break these drives?" from
anyone other than the manufacturer.

>
> I have a friend who is doing lots of SSD testing and they're not
> always better. For some cases, you pay a whole lot more for 2x greater
> throughput.
>
> it's not as simple as "know they are better".

What types of things degrade their performance? I'm interested in
seeing other data than the handful of benchmarks I've seen. I imagine
writes would be the culprit since you have to erase a whole block
first?

>
>>If I got one, this problem would likely subside, but I'm not
>> convinced that SSDs are durable enough, despite what the manufacturers
>> say. I haven't seen many torture tests on them, but the fact that
>> erasing a block destroys it a little bit is scary. I do a lot of
>> sustained writes with my typical desktop workload over the same files,
>> and I'd rather not trust them to something that is delicate enough to
>> need filesystem algorithms to be optimized for so they don't "wear
>> out".
>
> in most cases wear leveling is not in the file system. It's in the
> hardware or in a PowerPC that is in the SSD controller.  It's worth
> doing some reading here.

I've seen a lot about optimizing the next-generation filesystems for
flash. Despite the claims that the hardware-based solutions will be
satisfactory, there are a lot of people interested in making existing
filesystems smarter about SSDs, both for wear and for optimizing
read/write.

Beyond that, though, I feel very shaky just hearing the term "wear
leveling". I've had more flash-based devices fail on me than hard
drives, but maybe I'm just crazy and the technology has gotten decent
enough in the past couple years to allay my worrying. It would just be
nice to see a bit stronger alternative being pushed as hard as SSDs.

>
> That said, I sure would like to have a fusion IO card for venti. From
> what my friend is telling me the fusion card would be ideal for venti
> -- as long as we keep only the arenas  on it.
>
> ron
>
>



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-05  3:32                                 ` J.R. Mauro
@ 2009-03-05  3:39                                   ` erik quanstrom
  2009-03-05  3:55                                   ` William K. Josephson
  1 sibling, 0 replies; 71+ messages in thread
From: erik quanstrom @ 2009-03-05  3:39 UTC (permalink / raw)
  To: 9fans

> On Wed, Mar 4, 2009 at 12:14 PM, ron minnich <rminnich@gmail.com> wrote:
> > On Wed, Mar 4, 2009 at 8:52 AM, J.R. Mauro <jrm8005@gmail.com> wrote:
> >
> >> Now I haven't tested an SSD for performance, but I know they are
> >> better.
> >
> > Well that I don't understand at all. Is this "faith-based" performance
> > measurement? :-)
>
> No, I have seen several benchmarks. The benchmarks I haven't seen are
> ones for "how long does it take to actually break these drives?" from
> anyone other than the manufacturer.

have you seen the same benchmarks from
hard drive mfgrs?

- erik



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-05  3:32                                 ` J.R. Mauro
  2009-03-05  3:39                                   ` erik quanstrom
@ 2009-03-05  3:55                                   ` William K. Josephson
  2009-03-05  4:00                                     ` erik quanstrom
  1 sibling, 1 reply; 71+ messages in thread
From: William K. Josephson @ 2009-03-05  3:55 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 04, 2009 at 10:32:55PM -0500, J.R. Mauro wrote:
> What types of things degrade their performance? I'm interested in
> seeing other data than the handful of benchmarks I've seen. I imagine
> writes would be the culprit since you have to erase a whole block
> first?

Being full.  Small random writes, too, although much more so for
run-of-the-mill SSDs than for FusionIO.
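The small-random-writes point can be poked at with a few lines of Python. This is a rough, hypothetical sketch only (tiny sizes, no O_DIRECT, single run), nothing like a real SSD benchmark; `os.pwrite` is POSIX-only:

```python
# Compare sequential vs small random 4 KiB writes to the same file.
import os
import random
import tempfile
import time

BLOCK = 4096
NBLOCKS = 256  # 1 MiB total; far too small for a meaningful benchmark

def timed_writes(offsets):
    """Write one zeroed block at each offset (in blocks); return seconds."""
    buf = b"\0" * BLOCK
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, BLOCK * NBLOCKS)
        t0 = time.perf_counter()
        for off in offsets:
            os.pwrite(fd, buf, off * BLOCK)
        os.fsync(fd)
        return time.perf_counter() - t0
    finally:
        os.close(fd)
        os.unlink(path)

if __name__ == "__main__":
    seq = list(range(NBLOCKS))
    rnd = random.sample(seq, NBLOCKS)
    print("sequential: %.4fs  random: %.4fs"
          % (timed_writes(seq), timed_writes(rnd)))
```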

> Beyond that, though, I feel very shaky just hearing the term "wear
> leveling". I've had more flash-based devices fail on me than hard
> drives, but maybe I'm just crazy and the technology has gotten decent
> enough in the past couple years to allay my worrying. It would just be
> nice to see a bit stronger alternative being pushed as hard as SSDs.

Using MLC or SLC NAND? :-p



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-05  3:55                                   ` William K. Josephson
@ 2009-03-05  4:00                                     ` erik quanstrom
  2009-03-05  4:16                                       ` William K. Josephson
  0 siblings, 1 reply; 71+ messages in thread
From: erik quanstrom @ 2009-03-05  4:00 UTC (permalink / raw)
  To: 9fans

> On Wed, Mar 04, 2009 at 10:32:55PM -0500, J.R. Mauro wrote:
> > What types of things degrade their performance? I'm interested in
> > seeing other data than the handful of benchmarks I've seen. I imagine
> > writes would be the culprit since you have to erase a whole block
> > first?
>
> Being full.  Small random writes, too, although much more so for
> run-of-the-mill SSDs than for FusionIO.

[citation needed]

- erik



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-05  4:00                                     ` erik quanstrom
@ 2009-03-05  4:16                                       ` William K. Josephson
  2009-03-07  3:01                                         ` William Josephson
  0 siblings, 1 reply; 71+ messages in thread
From: William K. Josephson @ 2009-03-05  4:16 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 04, 2009 at 11:00:25PM -0500, erik quanstrom wrote:
> > On Wed, Mar 04, 2009 at 10:32:55PM -0500, J.R. Mauro wrote:
> > > What types of things degrade their performance? I'm interested in
> > > seeing other data than the handful of benchmarks I've seen. I imagine
> > > writes would be the culprit since you have to erase a whole block
> > > first?
> >
> > Being full.  Small random writes, too, although much more so for
> > run-of-the-mill SSDs than for FusionIO.
>
> [citation needed]

Not really.  The only question mark is the FusionIO device
and for now you'll just have to take it from me :-)  Hopefully
there will be a paper sooner rather than later.



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-03 23:54           ` Devon H. O'Dell
  2009-03-04  0:33             ` J.R. Mauro
@ 2009-03-06  9:39             ` maht
  1 sibling, 0 replies; 71+ messages in thread
From: maht @ 2009-03-06  9:39 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


> That's a fact. If you have access to The ACM Queue, check out
> p16-cantrill-concurrency.pdf (Cantrill and Bonwich on concurrency).
>
Or you can rely on one of the hackish attempts at email attachment
management or whatever conceptual error led to this:

https://agora.cs.illinois.edu/download/attachments/18744240/p16-cantrill.pdf?version=1


courtesy of a google datacentre near you





^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-03 18:38     ` Bakul Shah
@ 2009-03-06 18:47       ` Roman V Shaposhnik
  2009-03-06 20:38         ` David Leimbach
  2009-03-07  0:21         ` Bakul Shah
  0 siblings, 2 replies; 71+ messages in thread
From: Roman V Shaposhnik @ 2009-03-06 18:47 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Clojure is definitely something that I would like to play
with extensively. Looks very promising from the outset,
so the only question I have is how it feels
when used for substantial things.

Thanks,
Roman.

P.S. My belief in it was actually reaffirmed by a raving
endorsement it got from an old LISP community. Those
guys are a bit like 9fans, if you know what I mean ;-)


On Tue, 2009-03-03 at 10:38 -0800, Bakul Shah wrote:
> On Tue, 03 Mar 2009 10:11:10 PST "Roman V. Shaposhnik" <rvs@sun.com>  wrote:
> > On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
> >
> > > My knowledge on this subject is about 8 or 9 years old, so check with your
> > local Python guru....
> > >
> > >
> > > The last I'd heard about Python's threading is that it was cooperative
> > > only, and that you couldn't get real parallelism out of it.  It serves
> > > as a means to organize your program in a concurrent manner.
> > >
> > >
> > > In other words no two threads run at the same time in Python, even if
> > > you're on a multi-core system, due to something they call a "Global
> > > Interpreter Lock".
> >
> > I believe GIL is as present in Python nowadays as ever. On a related
> > note: does anybody know any sane interpreted languages with a decent
> > threading model to go along? Stackless python is the only thing that
> > I'm familiar with in that department.
>
> Depends on what you mean by "sane interpreted language with a
> decent threading model" and what you want to do with it but
> check out www.clojure.org.  Then there is Erlang.  Its
> wikipedia entry has this to say:
>     Although Erlang was designed to fill a niche and has
>     remained an obscure language for most of its existence,
>     it is experiencing a rapid increase in popularity due to
>     increased demand for concurrent services, inferior models
>     of concurrency in most mainstream programming languages,
>     and its substantial libraries and documentation.[7][8]
>     Well-known applications include Amazon SimpleDB,[9]
>     Yahoo! Delicious,[10] and the Facebook Chat system.[11]
>
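The GIL point above can be shown structurally with a small sketch: CPU-bound work gains nothing from CPython threads (they serialize on the interpreter lock) but does scale across processes. Timings vary by machine, so no numbers are claimed here:

```python
import threading
from multiprocessing import Pool

def burn(n):
    """A purely CPU-bound task."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def with_threads(n, workers=4):
    """All workers share one GIL, so this runs effectively serially."""
    results = []
    def worker():
        results.append(burn(n))
    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def with_processes(n, workers=4):
    """Separate interpreters, separate GILs: true parallelism."""
    with Pool(workers) as pool:
        return pool.map(burn, [n] * workers)

if __name__ == "__main__":
    print(with_threads(10**6))
    print(with_processes(10**6))
```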




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-06 18:47       ` Roman V Shaposhnik
@ 2009-03-06 20:38         ` David Leimbach
  2009-03-07  8:00           ` Bakul Shah
  2009-03-07  0:21         ` Bakul Shah
  1 sibling, 1 reply; 71+ messages in thread
From: David Leimbach @ 2009-03-06 20:38 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 2460 bytes --]

Things like Clojure, or Scala become a bit more interesting when the VM is
extended to allow tail recursion to happen in a nice way.

On Fri, Mar 6, 2009 at 10:47 AM, Roman V Shaposhnik <rvs@sun.com> wrote:

> Clojure is definitely something that I would like to play
> with extensively. Looks very promising from the outset,
> so the only question that I have is how does it feel
> when used for substantial things.
>
> Thanks,
> Roman.
>
> P.S. My belief in it was actually reaffirmed by a raving
> endorsement it got from an old LISP community. Those
> guys are a bit like 9fans, if you know what I mean ;-)
>
>
> On Tue, 2009-03-03 at 10:38 -0800, Bakul Shah wrote:
> > On Tue, 03 Mar 2009 10:11:10 PST "Roman V. Shaposhnik" <rvs@sun.com>
>  wrote:
> > > On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
> > >
> > > > My knowledge on this subject is about 8 or 9 years old, so check with
> your
> > > local Python guru....
> > > >
> > > >
> > > > The last I'd heard about Python's threading is that it was
> cooperative
> > > > only, and that you couldn't get real parallelism out of it.  It
> serves
> > > > as a means to organize your program in a concurrent manner.
> > > >
> > > >
> > > > In other words no two threads run at the same time in Python, even if
> > > > you're on a multi-core system, due to something they call a "Global
> > > > Interpreter Lock".
> > >
> > > I believe GIL is as present in Python nowadays as ever. On a related
> > > note: does anybody know any sane interpreted languages with a decent
> > > threading model to go along? Stackless python is the only thing that
> > > I'm familiar with in that department.
> >
> > Depends on what you mean by "sane interpreted language with a
> > decent threading model" and what you want to do with it but
> > check out www.clojure.org.  Then there is Erlang.  Its
> > wikipedia entry has this to say:
> >     Although Erlang was designed to fill a niche and has
> >     remained an obscure language for most of its existence,
> >     it is experiencing a rapid increase in popularity due to
> >     increased demand for concurrent services, inferior models
> >     of concurrency in most mainstream programming languages,
> >     and its substantial libraries and documentation.[7][8]
> >     Well-known applications include Amazon SimpleDB,[9]
> >     Yahoo! Delicious,[10] and the Facebook Chat system.[11]
> >
>
>
>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [9fans] threads vs forks
  2009-03-06 18:47       ` Roman V Shaposhnik
  2009-03-06 20:38         ` David Leimbach
@ 2009-03-07  0:21         ` Bakul Shah
  2009-03-07  2:20           ` Brian L. Stuart
  1 sibling, 1 reply; 71+ messages in thread
From: Bakul Shah @ 2009-03-07  0:21 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, 06 Mar 2009 10:47:20 PST Roman V Shaposhnik <rvs@sun.com>  wrote:
> Clojure is definitely something that I would like to play
> with extensively. Looks very promising from the outset,
> so the only question that I have is how does it feel
> when used for substantial things.

You can browse various Clojure related google groups but
there is only one way to find out if it is for you!

> P.S. My belief in it was actually reaffirmed by a raving
> endorsement it got from an old LISP community. Those
> guys are a bit like 9fans, if you know what I mean ;-)

No comment :-)




* Re: [9fans] threads vs forks
  2009-03-07  0:21         ` Bakul Shah
@ 2009-03-07  2:20           ` Brian L. Stuart
  0 siblings, 0 replies; 71+ messages in thread
From: Brian L. Stuart @ 2009-03-07  2:20 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> P.S. My belief in it was actually reaffirmed by a raving
> endorsement it got from an old LISP community. Those
> guys are a bit like 9fans, if you know what I mean ;-)

You mean intelligent people who appreciate elegance? :)

Sorry.  Couldn't resist.

BLS





* Re: [9fans] threads vs forks
  2009-03-05  4:16                                       ` William K. Josephson
@ 2009-03-07  3:01                                         ` William Josephson
  2009-03-07  3:31                                           ` erik quanstrom
  2009-03-07  5:00                                           ` lucio
  0 siblings, 2 replies; 71+ messages in thread
From: William Josephson @ 2009-03-07  3:01 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 04, 2009 at 11:16:33PM -0500, William K. Josephson wrote:
> On Wed, Mar 04, 2009 at 11:00:25PM -0500, erik quanstrom wrote:
> > > On Wed, Mar 04, 2009 at 10:32:55PM -0500, J.R. Mauro wrote:
> > > > What types of things degrade their performance? I'm interested in
> > > > seeing other data than the handful of benchmarks I've seen. I imagine
> > > > writes would be the culprit since you have to erase a whole block
> > > > first?
> > >
> > > Being full.  Small random writes, too, although much more so for
> > > run-of-the-mill SSDs than for FusionIO.
> >
> > [citation needed]
>
> Not really.

To be less flippant, what makes high performance flash difficult
is the slow erasure time and large erasure blocks relative to
the size of individual flash pages.  Being full hurts since the
flash is typically managed by a log structured storage system
with a garbage collector.  Small random writes require updating
the logical->physical mapping efficiently and crash recoverably.
You also need to do copy-on-write which leads to what is commonly
called write amplification, which reduces the usable number of
writes.  Small writes tend to exacerbate a lot of these problems.

Where does all this fancy stuff belong?  In the storage medium,
in the HBA, in the device driver, in the file system, or in the
application?
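
The "being full hurts" point can be put in rough numbers.  A toy write-amplification model for a log-structured FTL, with made-up constants (no particular device):

```python
# Toy write-amplification arithmetic for a log-structured FTL.
# ERASE_BLOCK and the utilizations below are illustrative only.
ERASE_BLOCK = 64  # pages per erase block

def write_amplification(valid_fraction):
    # Reclaiming one erase block whose pages are `valid_fraction`
    # still live frees ERASE_BLOCK*(1-valid_fraction) pages for new
    # writes, but costs ERASE_BLOCK*valid_fraction internal copies.
    free = ERASE_BLOCK * (1 - valid_fraction)
    copied = ERASE_BLOCK * valid_fraction
    return (free + copied) / free  # physical writes per logical write

for u in (0.0, 0.5, 0.9):
    print(u, round(write_amplification(u), 2))
```

The ratio works out to 1/(1 - u): an empty device rewrites nothing extra, a half-full one writes every byte twice, and at 90% utilization every logical write costs ten physical ones, which is why small random writes on a nearly full device are so painful.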




* Re: [9fans] threads vs forks
  2009-03-07  3:01                                         ` William Josephson
@ 2009-03-07  3:31                                           ` erik quanstrom
  2009-03-07  6:00                                             ` William Josephson
  2009-03-07  5:00                                           ` lucio
  1 sibling, 1 reply; 71+ messages in thread
From: erik quanstrom @ 2009-03-07  3:31 UTC (permalink / raw)
  To: 9fans

> To be less flippant, what makes high performance flash difficult
> is the slow erasure time and large erasure blocks relative to
> the size of individual flash pages.  Being full hurts since the
> flash is typically managed by a log structured storage system
> with a garbage collector.  Small random writes require updating
> the logical->physical mapping efficiently and crash recoverably.
> You also need to do copy-on-write which leads to what is commonly
> called write amplification, which reduces the usable number of
> writes.  Small writes tend to exacerbate a lot of these problems.
>
> Where does all this fancy stuff belong?  In the storage medium,
> in the HBA, in the device driver, in the file system, or in the
> application?

it's interesting to note that the quoted mtbf numbers for ssds are
within a factor of 2 of enterprise hard drives.  if one considers that
one needs ~4 ssds to cover the capacity of 1 hard drive, the quoted
mtbf/byte is worse for ssd.

the obvious conclusion is that if you think you need raid for hard
drives, then you also need raid for ssds.  at least if you believe the
mtbf numbers.

i think that it's a real good question where the fancy flash
tricks belong.  the naive guess would be that for backwards compatibility
reasons, the media will get much of the smarts.

- erik
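
The mtbf-per-byte argument above is easy to sanity-check with back-of-envelope figures (the numbers below are illustrative guesses, not vendor specs):

```python
# Back-of-envelope check of the MTBF-per-capacity comparison.
# All figures are invented for illustration, not from any datasheet.
HD_MTBF_HOURS  = 1_200_000
SSD_MTBF_HOURS = 600_000          # "within a factor of 2"
HD_CAP_GB, SSD_CAP_GB = 300, 80   # ~4 ssds per hard drive

n = HD_CAP_GB / SSD_CAP_GB        # ssds needed to match one drive

# With independent exponential failures, an array that dies when any
# of its n members dies has MTBF = device_MTBF / n.
equal_capacity_mtbf = SSD_MTBF_HOURS / n
print(equal_capacity_mtbf)        # 160000.0, well under the drive's
```

which is the raid-for-ssds conclusion in miniature: at equal capacity these quoted numbers favor the hard drive by roughly 7.5x.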




* Re: [9fans] threads vs forks
  2009-03-07  3:01                                         ` William Josephson
  2009-03-07  3:31                                           ` erik quanstrom
@ 2009-03-07  5:00                                           ` lucio
  2009-03-07  5:08                                             ` William Josephson
  1 sibling, 1 reply; 71+ messages in thread
From: lucio @ 2009-03-07  5:00 UTC (permalink / raw)
  To: 9fans

> Where does all this fancy stuff belong?  In the storage medium,
> in the HBA, in the device driver, in the file system, or in the
> application?

In a very intelligent cache?  Or did you mention that above and in my
ignorance I missed it?

OK, let's try this:

. Storage medium: only the hardware developers have access to that and
  they have never seemed interested in matching anyone else's
  requirements or suggestions.

. The HBA (?).  If that's the device adapter, the same applies as
  above.

. The device driver should not be very complex and the block handling
  should hopefully be shared by more than one device driver, which
  with the effective demise of Streams is not a very easy thing to
  implement without resorting to jumping through flaming hoops.

. The application?  That's being facetious, surely?

. A cache?  As quanstro pointed out, flash makes a wonderful WORM.
  Now to get Fossil to work as originally intended, or a more suitable
  design and implementation to take its place in this role and we have
  a winner.

++L





* Re: [9fans] threads vs forks
  2009-03-07  5:00                                           ` lucio
@ 2009-03-07  5:08                                             ` William Josephson
  2009-03-07  5:19                                               ` erik quanstrom
  2009-03-07  5:24                                               ` [9fans] threads vs forks lucio
  0 siblings, 2 replies; 71+ messages in thread
From: William Josephson @ 2009-03-07  5:08 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sat, Mar 07, 2009 at 07:00:26AM +0200, lucio@proxima.alt.za wrote:
> . A cache?  As quanstro pointed out, flash makes a wonderful WORM.
>   Now to get Fossil to work as originally intended, or a more suitable
>   design and implementation to take its place in this role and we have
>   a winner.

Sadly, if a WORM is your only application, then no one cares.
At least not enough to pony up for real performance.  The folks
at places like Sandia are interested in running HPC applications
and there are a lot of people in other industries such as big oil
and finance that are willing to pay for performance for running
HPC applications, VMs which tend to have high I/O requirements when
an OS patch comes out, etc.

In the near term, the performance leaders in the flash market
ship a device driver plus hardware.  Much of the intelligence
actually resides in the device driver.  It is that secret sauce
that gets you good performance.  In theory it could be pushed
down, but it takes CPU, memory, and memory bandwidth that may
not be cost effective there.

 -WJ




* Re: [9fans] threads vs forks
  2009-03-07  5:08                                             ` William Josephson
@ 2009-03-07  5:19                                               ` erik quanstrom
  2009-03-07  5:45                                                 ` [9fans] Flash William K. Josephson
  2009-03-07  5:24                                               ` [9fans] threads vs forks lucio
  1 sibling, 1 reply; 71+ messages in thread
From: erik quanstrom @ 2009-03-07  5:19 UTC (permalink / raw)
  To: 9fans

> Sadly, if a WORM is your only application, then no one cares.
> At least not enough to pony up for real performance.  The folks
> at places like Sandia are interested in running HPC applications
> and there are a lot of people in other industries such as big oil
> and finance that are willing to pay for performance for running
> HPC applications, VMs which tend to have high I/O requirements when
> an OS patch comes out, etc.

ask not what a technology can do for the world,
ask what a technology can do for you!

- erik




* Re: [9fans] threads vs forks
  2009-03-07  5:08                                             ` William Josephson
  2009-03-07  5:19                                               ` erik quanstrom
@ 2009-03-07  5:24                                               ` lucio
  1 sibling, 0 replies; 71+ messages in thread
From: lucio @ 2009-03-07  5:24 UTC (permalink / raw)
  To: 9fans

> Much of the intelligence
> actually resides in the device driver.  It is that secret sauce
> that gets you good performance.  In theory it could be pushed
> down, but it takes CPU, memory, and memory bandwidth that may
> not be cost effective there.

That would entail a really intelligent controller, which brings us
back to a cache, does it not, this time hidden inside a black box.  I
have been thinking that the obsession with SMP has a negative impact
on diverse engineering where intelligent peripherals take over
operations that are too slow or too demanding on the generic CPU.
Smacks of AoE to me, with a lot more packed into the A.

But I'm just an old software developer with a hobbyist interest in
electronic engineering and my opinions are not backed by much
research.

++L





* Re: [9fans] Flash
  2009-03-07  5:19                                               ` erik quanstrom
@ 2009-03-07  5:45                                                 ` William K. Josephson
  2009-03-07 14:42                                                   ` erik quanstrom
  0 siblings, 1 reply; 71+ messages in thread
From: William K. Josephson @ 2009-03-07  5:45 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sat, Mar 07, 2009 at 12:19:25AM -0500, erik quanstrom wrote:
> > Sadly, if a WORM is your only application, then no one cares.
> > At least not enough to pony up for real performance.  The folks
>
> ask not what a technology can do for the world,
> ask what a technology can do for you!

The thing is, flash isn't going to replace disk for WORM-like
applications due to capacity.  It might be very interesting
for things like the index and to replace battery-backed DRAM
for the NVRAM component in an appliance.




* Re: [9fans] threads vs forks
  2009-03-07  3:31                                           ` erik quanstrom
@ 2009-03-07  6:00                                             ` William Josephson
  2009-03-07 13:58                                               ` erik quanstrom
  0 siblings, 1 reply; 71+ messages in thread
From: William Josephson @ 2009-03-07  6:00 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, Mar 06, 2009 at 10:31:59PM -0500, erik quanstrom wrote:
> it's interesting to note that the quoted mtbf numbers for ssds are
> within a factor of 2 of enterprise hard drives.  if one considers that
> one needs ~4 ssds to cover the capacity of 1 hard drive, the quoted
> mtbf/byte is worse for ssd.

That's only if you think of flash as a direct replacement for disk.
SSDs are expensive on a $/MB basis compared to disks.  The good ones
start looking cheap if you instead price on $/IOPS.  I think you'll
start seeing a lot more of these in things like NAS appliances.  For
short-lived data you only need go over the I/O bus twice vs. three
times for most NVRAMs based on battery-backed DRAM.

 -WJ
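
The $/MB-vs-$/IOPS distinction is just a change of denominator.  With invented but roughly 2009-shaped figures (no real product is being quoted):

```python
# Price/performance under two denominators.  All numbers are made up
# for illustration; only the relative shape matters.
devices = {
    #                 ($,   GB,  random IOPS)
    "15k rpm disk": (350,  300,    350),
    "fast SSD":     (700,  160, 10_000),
}

for name, (price, cap, iops) in devices.items():
    print(f"{name}: {price / cap:.2f} $/GB, {price / iops:.4f} $/IOPS")
```

With these figures the disk wins on $/GB by about 4x while the SSD wins on $/IOPS by more than 10x, which is why the interesting markets are the IOPS-bound ones (filers, databases, VM farms) rather than bulk capacity.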




* Re: [9fans] threads vs forks
  2009-03-06 20:38         ` David Leimbach
@ 2009-03-07  8:00           ` Bakul Shah
  0 siblings, 0 replies; 71+ messages in thread
From: Bakul Shah @ 2009-03-07  8:00 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, 06 Mar 2009 12:38:57 PST David Leimbach <leimy2k@gmail.com>  wrote:
>
> Things like Clojure, or Scala become a bit more interesting when the VM is
> extended to allow tail recursion to happen in a nice way.

A lack of TCO is not something that will prevent you from
writing many interesting programs (except things like a state
machine as a set of mutually calling functions!).

There is nothing in Clojure, or C for that matter, that will
disallow tail call optimization should an implementation
provide it.  It is just that unlike Scheme most programming
languages do not *mandate* that tail calls be optimized.
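
The state-machine caveat above has a standard workaround on VMs without TCO: a trampoline, where each "tail call" returns a thunk instead of calling.  A sketch in Python (chosen because it opened the thread; Clojure ships the same idea as `trampoline`):

```python
# Mutually recursive state machine run in constant stack space via a
# trampoline: each state returns either a result or a zero-argument
# thunk for the next state, and the driver loop "bounces" until a
# non-callable value comes back.
def even(n):
    return True if n == 0 else (lambda: odd(n - 1))

def odd(n):
    return False if n == 0 else (lambda: even(n - 1))

def trampoline(f, *args):
    r = f(*args)
    while callable(r):
        r = r()
    return r

# Direct mutual recursion would overflow Python's recursion limit
# long before 100000 frames; the trampoline does not care.
print(trampoline(even, 100_001))  # False
```

The cost is syntactic: every tail position must be written as a thunk, which is exactly the ugliness that mandated TCO saves Scheme programmers from.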




* Re: [9fans] threads vs forks
  2009-03-07  6:00                                             ` William Josephson
@ 2009-03-07 13:58                                               ` erik quanstrom
  2009-03-07 14:37                                                 ` William Josephson
  0 siblings, 1 reply; 71+ messages in thread
From: erik quanstrom @ 2009-03-07 13:58 UTC (permalink / raw)
  To: 9fans

On Sat Mar  7 01:02:31 EST 2009, jkw@eecs.harvard.edu wrote:
> On Fri, Mar 06, 2009 at 10:31:59PM -0500, erik quanstrom wrote:
> > it's interesting to note that the quoted mtbf numbers for ssds are
> > within a factor of 2 of enterprise hard drives.  if one considers that
> > one needs ~4 ssds to cover the capacity of 1 hard drive, the quoted
> > mtbf/byte is worse for ssd.
>
> That's only if you think of flash as a direct replacement for disk.

i think that's why they put them in a 2.5" form factor with a standard
SATA interface.  what are you thinking of?

> SSDs are expensive on a $/MB basis compared to disks.  The good ones

not as much as you think.  a top-drawer 15k sas drive is on the order
of 300GB and $350+.  the intel ssd is only twice as much.  if you compare
the drives supported by the big-iron vendors, intel ssd already has cost
parity.

> For short-lived data you only need go over the I/O bus twice vs. three
> times for most NVRAMs based on battery-backed DRAM.

i'm missing something here.  what are your assumptions
on how things are connected?  also, isn't there an assumption
that you don't want to be writing short-lived data to flash if
possible?

- erik




* Re: [9fans] threads vs forks
  2009-03-07 13:58                                               ` erik quanstrom
@ 2009-03-07 14:37                                                 ` William Josephson
  2009-03-07 15:05                                                   ` erik quanstrom
  0 siblings, 1 reply; 71+ messages in thread
From: William Josephson @ 2009-03-07 14:37 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sat, Mar 07, 2009 at 08:58:42AM -0500, erik quanstrom wrote:
> i think that's why they put them in a 2.5" form factor with a standard
> SATA interface.  what are you thinking of?

No, the reason they do that is for backwards compatibility.

> > SSDs are expensive on a $/MB basis compared to disks.  The good ones
>
> not as much as you think.  a top-drawer 15k sas drive is on the order
> of 300GB and $350+.  the intel ssd is only twice as much.  if you compare
> the drives supported by the big-iron vendors, intel ssd already has cost
> parity.

The Intel SSD is cheap and slow :-)

> > For short-lived data you only need go over the I/O bus twice vs. three
> > times for most NVRAMs based on battery-backed DRAM.
>
> i'm missing something here.  what are your assumptions
> on how things are connected?  also, isn't there an assumption
> that you don't want to be writing short-lived data to flash if
> possible?

Take a gander at the NetApp NAS filers or DataDomain restorers.
Things have to go to NVRAM in case of a crash.  The DRAM based
NVRAMs are typically quite a bit more expensive than flash
based ones and so you have less of it.  That means more data
has to get flushed to disk since it can't live in battery-backed
NVRAM permanently.




* Re: [9fans] Flash
  2009-03-07  5:45                                                 ` [9fans] Flash William K. Josephson
@ 2009-03-07 14:42                                                   ` erik quanstrom
  2009-03-07 14:56                                                     ` William Josephson
  2009-03-07 15:39                                                     ` Russ Cox
  0 siblings, 2 replies; 71+ messages in thread
From: erik quanstrom @ 2009-03-07 14:42 UTC (permalink / raw)
  To: 9fans

On Sat Mar  7 00:47:33 EST 2009, jkw@eecs.harvard.edu wrote:
> On Sat, Mar 07, 2009 at 12:19:25AM -0500, erik quanstrom wrote:
> > > Sadly, if a WORM is your only application, then no one cares.
> > > At least not enough to pony up for real performance.  The folks
> >
> > ask not what a technology can do for the world,
> > ask what a technology can do for you!
>
> The thing is, flash isn't going to replace disk for WORM-like
> applications due to capacity.  It might be very interesting
> for things like the index and to replace battery-backed DRAM
> for the NVRAM component in an appliance.

i've been evaluating replacing the worm in the coraid fileserver's
sr15x1 with ssd.  so far it looks promising.

- erik




* Re: [9fans] Flash
  2009-03-07 14:42                                                   ` erik quanstrom
@ 2009-03-07 14:56                                                     ` William Josephson
  2009-03-07 15:39                                                     ` Russ Cox
  1 sibling, 0 replies; 71+ messages in thread
From: William Josephson @ 2009-03-07 14:56 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sat, Mar 07, 2009 at 09:42:00AM -0500, erik quanstrom wrote:
> On Sat Mar  7 00:47:33 EST 2009, jkw@eecs.harvard.edu wrote:
> > On Sat, Mar 07, 2009 at 12:19:25AM -0500, erik quanstrom wrote:
> > > > Sadly, if a WORM is your only application, then no one cares.
> > > > At least not enough to pony up for real performance.  The folks
> > >
> > > ask not what a technology can do for the world,
> > > ask what a technology can do for you!
> >
> > The thing is, flash isn't going to replace disk for WORM-like
> > applications due to capacity.  It might be very interesting
> > for things like the index and to replace battery-backed DRAM
> > for the NVRAM component in an appliance.
>
> i've been evaluating replacing the worm in the coraid fileserver's
> sr15x1 with ssd.  so far it looks promising.

My point isn't that it is a bad idea, just that it isn't
likely to provide enough business to keep manufacturers
interested.  Moreover, for capacity disks will keep on
winning for a long time.  They just start to look more
and more like tape.

 -WJ




* Re: [9fans] threads vs forks
  2009-03-07 14:37                                                 ` William Josephson
@ 2009-03-07 15:05                                                   ` erik quanstrom
  2009-03-07 15:28                                                     ` William K. Josephson
  0 siblings, 1 reply; 71+ messages in thread
From: erik quanstrom @ 2009-03-07 15:05 UTC (permalink / raw)
  To: 9fans

On Sat Mar  7 09:39:38 EST 2009, jkw@eecs.harvard.edu wrote:
> On Sat, Mar 07, 2009 at 08:58:42AM -0500, erik quanstrom wrote:
> > i think that's why they put them in a 2.5" form factor with a standard
> > SATA interface.  what are you thinking of?
>
> No, the reason they do that is for backwards compatibility.

it's kind of funny to call sata "backwards compatibility".  if
things go as you suggest — pcie connected, i think we'll all
long for the day when we could write one driver per hba rather
than one driver per storage device.

new boss, same as the old boss.

> > > SSDs are expensive on a $/MB basis compared to disks.  The good ones
> >
> > not as much as you think.  a top-drawer 15k sas drive is on the order
> > of 300GB and $350+.  the intel ssd is only twice as much.  if you compare
> > the drives supported by the big-iron vendors, intel ssd already has cost
> > parity.
>
> The Intel SSD is cheap and slow :-)

pick a lane!  first you argued that they are expensive. ☺

> Take a gander at the NetApp NAS filers or DataDomain restorers.

so you're saying that these machines don't differentiate between
primary cache and their write log (or whatever they call it)?

> My point isn't that it is a bad idea, just that it isn't
> likely to provide enough business to keep manufacturers
> interested.  Moreover, for capacity disks will keep on
> winning for a long time.  They just start to look more
> and more like tape.

no.  i agree.  worm storage in general is not a popular topic,
but the few companies that do use it pay the big bucks for it.

it's always great when the backup media is less reliable
than the primary media.

- erik




* Re: [9fans] threads vs forks
  2009-03-07 15:05                                                   ` erik quanstrom
@ 2009-03-07 15:28                                                     ` William K. Josephson
  0 siblings, 0 replies; 71+ messages in thread
From: William K. Josephson @ 2009-03-07 15:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sat, Mar 07, 2009 at 10:05:44AM -0500, erik quanstrom wrote:
> it's kind of funny to call sata "backwards compatibility".  if
> things go as you suggest — pcie connected, i think we'll all
> long for the day when we could write one driver per hba rather
> than one driver per storage device.

Or perhaps we'll get a new version of sata that has a better
interface.  I think that treating flash as a linear array
of blocks is not the right abstraction.  I'm less interested
in the physical interconnect.

> > The Intel SSD is cheap and slow :-)
>
> pick a lane!  first you argued that they are expensive. ☺

Fast SSDs are expensive; Intel SSDs aren't fast :-)

> no.  i agree.  worm storage in general is not a popular topic,
> but the few companies that do use it pay the big bucks for it.

That isn't really enough to keep an industry going.
What has driven flash is volume, the simplicity of
manufacturing it, and the resulting yield.  All
three have driven down the price a lot.

 -WJ




* Re: [9fans] Flash
  2009-03-07 14:42                                                   ` erik quanstrom
  2009-03-07 14:56                                                     ` William Josephson
@ 2009-03-07 15:39                                                     ` Russ Cox
  2009-03-07 16:34                                                       ` erik quanstrom
  1 sibling, 1 reply; 71+ messages in thread
From: Russ Cox @ 2009-03-07 15:39 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> i've been evaluating replacing the worm in the coraid fileserver's
>  sr15x1 with ssd.  so far it looks promising.

i don't understand this.  your "worm" is magnetic disk, right?
why would you put the ssd in the slow half (the worm)
instead of the fast half (the cache)?

similarly, someone on this thread said they'd use ssd
for just the arenas (which are mostly linear access),
when if i had to make the choice i would use it for just
the index (which is mostly random access and would
benefit more from dropping the seek penalties).

russ



* Re: [9fans] Flash
  2009-03-07 15:39                                                     ` Russ Cox
@ 2009-03-07 16:34                                                       ` erik quanstrom
  0 siblings, 0 replies; 71+ messages in thread
From: erik quanstrom @ 2009-03-07 16:34 UTC (permalink / raw)
  To: 9fans

> i don't understand this.  your "worm" is magnetic disk, right?
> why would you put the ssd in the slow half (the worm)
> instead of the fast half (the cache)?

the current setup is that the worm and the "cache" are
the same speed (in fact, the cache is slower because
it's on two disks rather than 4).  i found it was needlessly
expensive to copy data from the worm to cache, so the
only data that makes it to the cache are dirty blocks.
it's used as a write buffer, not a cache.
http://www.quanstro.net/plan9/disklessfs.pdf.

if i simply reenabled the cache and replaced the two
hard drives with ssd, i don't think we'd see much performance
increase as we're not thrashing the ram cache yet.  and
for heavy write loads with a caching appliance, i think that
8 disk-limited disks would compete well with 2 sata-limited
drives.  i think  one would get better bang/buck by replacing
the worm drives with a greater number of smaller hard drives;
ssds would be even better.

on the other hand, a fusion i/o device would be a compelling
reason to reinstitute a true cache.

> similarly, someone on this thread said they'd use ssd
> for just the arenas (which are mostly linear access),
> when if i had to make the choice i would use it for just
> the index (which is mostly random access and would
> benefit more from dropping the seek penalities).

i guess the $640 question is, how good is that wear leveling.

- erik




end of thread, other threads:[~2009-03-07 16:34 UTC | newest]

Thread overview: 71+ messages
2009-03-03 11:52 [9fans] threads vs forks hugo rivera
2009-03-03 15:19 ` David Leimbach
2009-03-03 15:32   ` Uriel
2009-03-03 16:15     ` hugo rivera
2009-03-03 15:33   ` hugo rivera
2009-03-03 18:11   ` Roman V. Shaposhnik
2009-03-03 18:38     ` Bakul Shah
2009-03-06 18:47       ` Roman V Shaposhnik
2009-03-06 20:38         ` David Leimbach
2009-03-07  8:00           ` Bakul Shah
2009-03-07  0:21         ` Bakul Shah
2009-03-07  2:20           ` Brian L. Stuart
2009-03-03 23:08     ` J.R. Mauro
2009-03-03 23:15       ` Uriel
2009-03-03 23:23         ` J.R. Mauro
2009-03-03 23:54           ` Devon H. O'Dell
2009-03-04  0:33             ` J.R. Mauro
2009-03-04  0:54               ` erik quanstrom
2009-03-04  1:54                 ` J.R. Mauro
2009-03-04  3:18                   ` James Tomaschke
2009-03-04  3:30                     ` erik quanstrom
2009-03-04  4:44                       ` James Tomaschke
2009-03-04  5:05                         ` J.R. Mauro
2009-03-04  5:50                           ` erik quanstrom
2009-03-04  6:08                             ` andrey mirtchovski
2009-03-04 16:52                             ` J.R. Mauro
2009-03-04 17:14                               ` ron minnich
2009-03-04 17:27                                 ` William Josephson
2009-03-04 18:15                                 ` erik quanstrom
2009-03-05  3:32                                 ` J.R. Mauro
2009-03-05  3:39                                   ` erik quanstrom
2009-03-05  3:55                                   ` William K. Josephson
2009-03-05  4:00                                     ` erik quanstrom
2009-03-05  4:16                                       ` William K. Josephson
2009-03-07  3:01                                         ` William Josephson
2009-03-07  3:31                                           ` erik quanstrom
2009-03-07  6:00                                             ` William Josephson
2009-03-07 13:58                                               ` erik quanstrom
2009-03-07 14:37                                                 ` William Josephson
2009-03-07 15:05                                                   ` erik quanstrom
2009-03-07 15:28                                                     ` William K. Josephson
2009-03-07  5:00                                           ` lucio
2009-03-07  5:08                                             ` William Josephson
2009-03-07  5:19                                               ` erik quanstrom
2009-03-07  5:45                                                 ` [9fans] Flash William K. Josephson
2009-03-07 14:42                                                   ` erik quanstrom
2009-03-07 14:56                                                     ` William Josephson
2009-03-07 15:39                                                     ` Russ Cox
2009-03-07 16:34                                                       ` erik quanstrom
2009-03-07  5:24                                               ` [9fans] threads vs forks lucio
2009-03-04  5:19                   ` David Leimbach
2009-03-04  2:47                 ` John Barham
2009-03-04  5:24                 ` blstuart
2009-03-04  5:37                   ` erik quanstrom
2009-03-04 16:29                   ` Roman V Shaposhnik
2009-03-04 16:56                   ` john
2009-03-06  9:39             ` maht
2009-03-04  5:07     ` David Leimbach
2009-03-04  5:35     ` John Barham
2009-03-03 16:00 ` ron minnich
2009-03-03 16:28   ` hugo rivera
2009-03-03 17:31     ` ron minnich
2009-03-03 16:47 ` John Barham
2009-03-04  9:37   ` Vincent Schut
2009-03-04  9:58     ` hugo rivera
2009-03-04 10:30       ` Vincent Schut
2009-03-04 10:45         ` hugo rivera
2009-03-04 11:15           ` Vincent Schut
2009-03-04 11:33             ` hugo rivera
2009-03-04 13:23               ` Uriel
2009-03-04 14:57         ` ron minnich
