From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@cse.psu.edu
From: Vincent Schut
Date: Wed, 4 Mar 2009 12:15:50 +0100
Message-ID:
References: <138575260903030352s623807d7p5a3075b1f7a591f6@mail.gmail.com>
 <4f34febc0903030847t9aedad9haf4355e74953e6a3@mail.gmail.com>
 <138575260903040158r3ebc4e76haa5a328d2840bd5f@mail.gmail.com>
 <138575260903040245w3e8ede69t42d91f290ff82523@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
User-Agent: Thunderbird 2.0.0.19 (X11/20090213)
In-Reply-To: <138575260903040245w3e8ede69t42d91f290ff82523@mail.gmail.com>
Subject: Re: [9fans] threads vs forks
Topicbox-Message-UUID: af9ad990-ead4-11e9-9d60-3106f5b1d025

hugo rivera wrote:
> The cluster has torque installed as the resource manager. I think it
> runs on top of PBS (an older project).
> As far as I know, I just have to call a qsub command to submit my
> jobs to a queue; the resource manager then allocates a processor in
> the cluster for my process to run until it is finished.

Well, I know neither torque nor PBS, but I'm guessing that when you
submit a job, that job is some program or script that gets run on the
allocated processor? If so, your initial question of forking vs.
threading is bogus: your cluster manager will run (exec) your job,
which, if it is a Python script, will start a Python interpreter for
each job. I guess that's the overhead you get when running a flexible
cluster system, flexible meaning that it can run any type of job
(shell script, binary executable, Python script, Perl, etc.). The
overhead of starting a new Python process each time may seem
significant in absolute terms, but if each job processes lots of data
and takes, as you said, 5 minutes to run on a decent processor, don't
you think the Python startup time becomes insignificant?
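That startup overhead is easy to measure for yourself. A minimal sketch (the helper name `startup_seconds` is illustrative, not from any library, and the exact figures will of course vary per machine):

```python
# Sketch: measure Python interpreter startup overhead, cold vs. warm.
import subprocess
import sys
import time

def startup_seconds():
    """Time one full start-and-exit of a fresh Python interpreter."""
    t0 = time.perf_counter()
    subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - t0

cold = startup_seconds()                         # first start: nothing cached yet
warm = min(startup_seconds() for _ in range(5))  # later starts: binary and libs cached
print(f"cold: {cold:.3f}s  warm: {warm:.3f}s")
```

On the machine here this prints something on the order of the numbers quoted below, but treat any specific figure as machine-dependent.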
For example, on a decent machine here, the first Python start takes
0.224 secs to start and shut down immediately, and subsequent starts
take only about 0.009 secs because everything is still in memory.
Let's take the 0.224 secs as the worst-case scenario: that is approx
0.075 percent of your job's execution time. Now let's say you have 6
machines with 8 cores each and perfect scaling: all your jobs would
take 6000 / (6*8) * 5 min = 625 minutes (10 hours 25 mins) without
Python starting each time, and 625 minutes and 28 seconds with Python
starting anew for each job. Don't you think you could just live with
those 28 extra seconds? Just reading this message might already have
taken you more than those 28 seconds...

Vincent.

> And I am not really sure if I have access to all the nodes, so I can
> install pp on each one of them.
>
> 2009/3/4, Vincent Schut :
>> hugo rivera wrote:
>>
>>> Thanks for the advice.
>>> Nevertheless, I am in no position to decide what pieces of software
>>> the cluster will run; I just have to deal with what I have. But
>>> anyway, I can suggest other possibilities.
>>>
>> Well, that depends on how you define 'software the cluster will
>> run'. Do you mean cluster management software, or really any
>> program, script, or Python module that needs to be installed on each
>> node? Because for pp, you won't need any cluster software. pp is
>> just a Python module and some helper scripts. You *do* need to
>> install this (pure Python) module on each node, yes, but that's it;
>> nothing else is needed.
>> Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not
>> an expert, but I don't think you can do threading/forking from one
>> machine to another (on Linux). So I suppose there already is some
>> cluster management software involved? And while you appear to be "in
>> no position to decide what pieces of software the cluster will run",
>> you might want to enlighten us on what this cluster /will/ run? Your
>> best solution might depend on that...
>>
>> Cheers,
>> Vincent.
>>
>>
>>
>
>
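The back-of-the-envelope arithmetic in the message above can be double-checked with a short script (values taken straight from the message; perfect scaling across 6 machines of 8 cores is assumed, as stated):

```python
# Reproducing the overhead estimate from the message above.
jobs = 6000            # total number of jobs
cores = 6 * 8          # 6 machines x 8 cores, perfect scaling assumed
job_min = 5.0          # minutes per job
startup_s = 0.224      # worst-case Python startup, seconds

total_minutes = jobs / cores * job_min           # wall-clock time without startup cost
extra_seconds = jobs / cores * startup_s         # added startup time per core's queue
overhead_pct = startup_s / (job_min * 60) * 100  # startup as a share of one job

print(total_minutes, extra_seconds, round(overhead_pct, 3))
# -> 625.0 28.0 0.075
```

That is, 625 minutes (10 h 25 min) of useful work versus 28 extra seconds of interpreter startup, or about 0.075 percent per job, matching the figures in the message.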