From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To:
References: <138575260903030352s623807d7p5a3075b1f7a591f6@mail.gmail.com>
 <4f34febc0903030847t9aedad9haf4355e74953e6a3@mail.gmail.com>
 <138575260903040158r3ebc4e76haa5a328d2840bd5f@mail.gmail.com>
 <138575260903040245w3e8ede69t42d91f290ff82523@mail.gmail.com>
Date: Wed, 4 Mar 2009 12:33:59 +0100
Message-ID: <138575260903040333i262ec35cr345075427c2b6bed@mail.gmail.com>
From: hugo rivera
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] threads vs forks
Topicbox-Message-UUID: afa58c1e-ead4-11e9-9d60-3106f5b1d025

you are right. I was totally confused at the beginning. Thanks a lot.

2009/3/4, Vincent Schut:
> hugo rivera wrote:
>
> > The cluster has torque installed as the resource manager. I think it
> > runs on top of pbs (an older project).
> > As far as I know now I just have to call a qsub command to submit my
> > jobs on a queue, then the resource manager allocates a processor in
> > the cluster for my process to run till it is finished.
> >
>
> Well, I don't know torque or pbs, but I'm guessing that when you
> submit a job, this job will be some program or script that is run on the
> allocated processor? If so, your initial question of forking vs threading
> is bogus. Your cluster manager will run (exec) your job, which if it is a
> python script will start a python interpreter for each job. I guess that's
> the overhead you get when running a flexible cluster system, flexible
> meaning that it can run any type of job (shell script, binary executable,
> python script, perl, etc.).
> However, your overhead of starting new python processes each time may seem
> significant when viewed in absolute terms, but if each job processes lots
> of data and takes, as you said, 5 min to run on a decent processor, don't
> you think the startup time for the python process would become
> non-significant?
> For example, on a decent machine here, the first time python takes 0.224
> secs to start and shut down immediately, and consecutive starts take only
> about 0.009 secs because everything is still in memory. Let's take the
> 0.224 secs as a worst-case scenario. That would be approx 0.075 percent of
> your job execution time. Now let's say you have 6 machines with 8 cores
> each and perfect scaling, all your jobs would take 6000 / (6*8) * 5 min =
> 625 minutes (10 hours 25 mins) without python starting each time, and 625
> minutes and 28 seconds with python starting anew each job. Don't you think
> you could just live with these 28 seconds more? Just reading this message
> might already have taken you more than those 28 seconds...
>
> Vincent.
>
> >
> > And I am not really sure if I have access to all the nodes, so I can
> > install pp on each one of them.
> >
> > 2009/3/4, Vincent Schut:
> >
> > > hugo rivera wrote:
> > >
> > > > Thanks for the advice.
> > > > Nevertheless I am in no position to decide what pieces of software the
> > > > cluster will run, I just have to deal with what I have, but anyway I
> > > > can suggest other possibilities.
> > > >
> > >
> > > Well, depends on how you define 'software the cluster will run'. Do you
> > > mean cluster management software, or really any program or script or
> > > python module that needs to be installed on each node? Because for pp,
> > > you won't need any cluster software. pp is just some python module and
> > > helper scripts. You *do* need to install this (pure python) module on
> > > each node, yes, but that's it, nothing else needed.
> > > Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
> > > expert, but I don't think you can do threading/forking from one machine
> > > to another (on linux). So I suppose there already is some cluster
> > > management software involved? And while you appear to be "in no position
> > > to decide what pieces of software the cluster will run", you might want
> > > to enlighten us on what this cluster /will/ run? Your best solution
> > > might depend on that...
> > >
> > > Cheers,
> > > Vincent.
> > >

-- 
Hugo
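
For anyone who wants to recheck the back-of-the-envelope estimate quoted above, here is a minimal sketch in Python. The inputs (6000 jobs, 6 machines with 8 cores each, 5 minutes per job, 0.224 secs of interpreter startup) are the figures from Vincent's message; the function and variable names are made up for illustration, and the model assumes perfect scaling exactly as the message does.

```python
# Rough check of the figures quoted in the thread.
# Inputs come from the quoted message; all names here are hypothetical.

def batch_estimate(jobs, machines, cores_per_machine, job_minutes, startup_secs):
    """Return (base wall-clock minutes, extra seconds spent on startup).

    Assumes perfect scaling: every core runs the same number of jobs,
    one after another, with no scheduling or I/O overhead.
    """
    slots = machines * cores_per_machine
    jobs_per_slot = jobs / slots
    base_minutes = jobs_per_slot * job_minutes
    extra_secs = jobs_per_slot * startup_secs
    return base_minutes, extra_secs


def startup_share_percent(startup_secs, job_minutes):
    """Interpreter startup as a percentage of one job's run time."""
    return 100.0 * startup_secs / (job_minutes * 60)


if __name__ == "__main__":
    base, extra = batch_estimate(6000, 6, 8, 5, 0.224)
    print(f"base run time: {base:.0f} min")
    print(f"added by interpreter startup: {extra:.0f} s")
    print(f"startup share per job: {startup_share_percent(0.224, 5):.3f} %")
```

With those inputs the sketch reproduces the numbers in the message: 625 minutes, about 28 extra seconds, and roughly 0.075 percent per job. Swapping in the warm-start figure of 0.009 secs makes the overhead smaller still.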