From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <138575260903040333i262ec35cr345075427c2b6bed@mail.gmail.com>
References: <138575260903030352s623807d7p5a3075b1f7a591f6@mail.gmail.com> <4f34febc0903030847t9aedad9haf4355e74953e6a3@mail.gmail.com> <138575260903040158r3ebc4e76haa5a328d2840bd5f@mail.gmail.com> <138575260903040245w3e8ede69t42d91f290ff82523@mail.gmail.com> <138575260903040333i262ec35cr345075427c2b6bed@mail.gmail.com>
Date: Wed, 4 Mar 2009 14:23:14 +0100
Message-ID: <5d375e920903040523j50625de0r9cdb6e3a51095d1b@mail.gmail.com>
From: Uriel
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [9fans] threads vs forks
Topicbox-Message-UUID: b0550b44-ead4-11e9-9d60-3106f5b1d025

What about xcpu?

On Wed, Mar 4, 2009 at 12:33 PM, hugo rivera wrote:
> you are right. I was totally confused at the beginning.
> Thanks a lot.
>
> 2009/3/4, Vincent Schut :
>> hugo rivera wrote:
>>
>> > The cluster has torque installed as the resource manager. I think it
>> > runs on top of pbs (an older project).
>> > As far as I know now I just have to call a qsub command to submit my
>> > jobs on a queue, then the resource manager allocates a processor in
>> > the cluster for my process to run till it is finished.
>> >
>>
>> Well, I don't know torque or pbs, but I'm guessing that when you
>> submit a job, this job will be some program or script that is run on the
>> allocated processor? If so, your initial question of forking vs threading is
>> bogus. Your cluster manager will run (exec) your job, which if it is a
>> python script will start a python interpreter for each job. I guess that's
>> the overhead you get when running a flexible cluster system, flexible
>> meaning that it can run any type of job (shell script, binary executable,
>> python script, perl, etc.).
>> However, your overhead of starting new python processes each time may seem
>> significant when viewed in absolute terms, but if each job processes lots of
>> data and takes, as you said, 5 min to run on a decent processor, don't you
>> think the startup time for the python process would become insignificant?
>> For example, on a decent machine here, the first time python takes 0.224
>> secs to start and shut down immediately, and consecutive starts take only
>> about 0.009 secs because everything is still in memory. Let's take the 0.224
>> secs for a worst case scenario. That would be approx 0.075 percent of your
>> job execution time. Now let's say you have 6 machines with 8 cores each and
>> perfect scaling, all your jobs would take 6000 / (6*8) * 5 min = 625 minutes
>> (10 hours 25 mins) without python starting each time, and 625 minutes and 28
>> seconds with python starting anew each job. Don't you think you could just
>> live with these 28 seconds more? Just reading this message might already
>> have taken you more than those 28 seconds...
>>
>> Vincent.
>>
>>
>>
>> > And I am not really sure if I have access to all the nodes, so I can
>> > install pp on each one of them.
>> >
>> > 2009/3/4, Vincent Schut :
>> >
>> > > hugo rivera wrote:
>> > >
>> > >
>> > > > Thanks for the advice.
>> > > > Nevertheless I am in no position to decide what pieces of software the
>> > > > cluster will run, I just have to deal with what I have, but anyway I
>> > > > can suggest other possibilities.
>> > > >
>> > > >
>> > > Well, depends on how you define 'software the cluster will run'. Do you
>> > > mean cluster management software, or really any program or script or python
>> > > module that needs to be installed on each node? Because for pp, you won't
>> > > need any cluster software. pp is just some python module and helper scripts.
>> > > You *do* need to install this (pure python) module on each node, yes, but
>> > > that's it, nothing else needed.
>> > > Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
>> > > expert, but I don't think you can do threading/forking from one machine to
>> > > another (on linux). So I suppose there already is some cluster management
>> > > software involved? And while you appear to be "in no position to decide what
>> > > pieces of software the cluster will run", you might want to enlighten us on
>> > > what this cluster /will/ run? Your best solution might depend on that...
>> > >
>> > > Cheers,
>> > > Vincent.
>> > >
>> > >
>> >
>> >
>>
>>
>
> --
> Hugo
>
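
For anyone who wants to redo Vincent's estimate with their own numbers, here is a minimal sketch of the arithmetic in Python. The figures (6000 jobs, 6 machines with 8 cores each, 5 min per job, 0.224 secs cold startup) are the ones from his message; the function and variable names are made up for illustration, and perfect scaling is assumed, as in the original estimate.

```python
# Back-of-the-envelope check of the interpreter startup overhead.
# Figures are taken from the quoted message; names are illustrative only.

JOBS = 6000              # total number of jobs to run
MACHINES = 6
CORES_PER_MACHINE = 8
JOB_MINUTES = 5          # run time of a single job
STARTUP_SECONDS = 0.224  # worst-case (cold) python startup time


def estimate(jobs, machines, cores_per_machine, job_minutes, startup_seconds):
    """Return (compute_minutes, overhead_seconds, overhead_percent).

    Assumes perfect scaling: jobs are spread evenly over all cores.
    """
    cores = machines * cores_per_machine
    jobs_per_core = jobs / cores
    compute_minutes = jobs_per_core * job_minutes
    # Each job pays the interpreter startup cost once.
    overhead_seconds = jobs_per_core * startup_seconds
    # Startup cost relative to the run time of one job.
    overhead_percent = 100 * startup_seconds / (job_minutes * 60)
    return compute_minutes, overhead_seconds, overhead_percent


if __name__ == "__main__":
    minutes, seconds, percent = estimate(
        JOBS, MACHINES, CORES_PER_MACHINE, JOB_MINUTES, STARTUP_SECONDS
    )
    print(f"compute time:     {minutes:.0f} minutes")   # 625 minutes
    print(f"startup overhead: {seconds:.0f} seconds")   # 28 seconds
    print(f"per-job overhead: {percent:.3f} percent")   # 0.075 percent
```

Swapping in the warm-start figure of 0.009 secs instead of 0.224 shrinks the total overhead to roughly one second over the whole run.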