From: hugo rivera
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Date: Wed, 4 Mar 2009 10:58:31 +0100
Subject: Re: [9fans] threads vs forks

Thanks for the advice. Nevertheless, I am in no position to decide which
pieces of software the cluster will run; I just have to deal with what I
have. But I can still suggest other possibilities. (For concreteness, I
have sketched both suggestions in a postscript below.)

2009/3/4, Vincent Schut:
> John Barham wrote:
> >
> > On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera wrote:
> > >
> > > I have to launch many tasks running in parallel (~5000) on a
> > > cluster running Linux. Each task performs some astronomical
> > > calculations, and I am not sure that using fork is the best
> > > answer here. First of all, all the programming is done in Python
> > > and C...
> >
> > Take a look at the multiprocessing package
> > (http://docs.python.org/library/multiprocessing.html), newly
> > introduced with Python 2.6 and 3.0:
> >
> > "multiprocessing is a package that supports spawning processes using
> > an API similar to the threading module. The multiprocessing package
> > offers both local and remote concurrency, effectively side-stepping
> > the Global Interpreter Lock by using subprocesses instead of threads."
> >
> > It should be a quick and easy way to set up a cluster-wide job
> > processing system (provided all your jobs are driven by Python).
>
> Better: use parallelpython (www.parallelpython.org). AFAIK
> multiprocessing is geared towards multi-core systems (one machine),
> while pp is also suitable for real clusters with multiple PCs. No
> special cluster software is needed. It will start (here's your fork)
> one or more Python interpreters on each node, and then you can submit
> jobs to those 'workers'. The interpreters are kept alive between jobs,
> so the startup penalty becomes negligible when the number of jobs is
> large enough.
> We use it here to process massive amounts of satellite data; it works
> like a charm.
>
> Vincent.

> > It also looks like it's been (partially?) back-ported to Python 2.4
> > and 2.5: http://pypi.python.org/pypi/processing.
> >
> > John

--
Hugo
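
P.S. To make John's suggestion concrete, here is a minimal sketch of the
multiprocessing approach: fan the independent tasks out over a pool of
worker processes on one machine. The task function and its workload are
made up for illustration, not taken from my real code.

from multiprocessing import Pool

def calculate(task_id):
    # stand-in for the real astronomical calculation
    return task_id, sum(i * i for i in range(100000))

if __name__ == '__main__':
    pool = Pool()  # defaults to one worker process per core
    results = pool.map(calculate, range(5000))  # ~5000 independent tasks
    pool.close()
    pool.join()
    print('%d tasks finished' % len(results))

Because every worker is a separate process, the Global Interpreter Lock
never serializes the calculations.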
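
And a rough sketch of the pp workflow Vincent describes, assuming
ppserver.py has already been started on each node; the node names and
port below are hypothetical.

import pp

# hypothetical cluster nodes, each already running ppserver.py
ppservers = ("node01:60000", "node02:60000")
job_server = pp.Server(ppservers=ppservers)

def calculate(task_id):
    # stand-in for the real astronomical calculation
    return task_id, sum(i * i for i in range(100000))

# submit everything up front; pp spreads the jobs over the local cores
# and the remote workers
jobs = [job_server.submit(calculate, (n,)) for n in range(5000)]
results = [job() for job in jobs]  # calling a job blocks until it finishes
job_server.print_stats()

The worker interpreters stay alive between submissions, which is where
the amortized startup cost Vincent mentions comes from.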