* Re: [Caml-list] Smart ways to implement worker threads
2010-07-15 18:24 ` Goswin von Brederlow
@ 2010-07-15 18:37 ` David McClain
2010-07-15 18:40 ` David McClain
2010-07-15 19:56 ` Rich Neswold
2 siblings, 0 replies; 31+ messages in thread
From: David McClain @ 2010-07-15 18:37 UTC (permalink / raw)
To: caml-list
[-- Attachment #1: Type: text/plain, Size: 10990 bytes --]
> It is too bad I don't want to learn CML but use OCaml. The CML examples
> from the book don't translate into OCaml since the interface is just a
> little bit different and those differences are what throws me off.
That may appear to be the case from only a cursory review of CML. But
I find that OCaml's notions of Events, Channels, etc. correspond
quite closely to what John Reppy describes.
The whole point of Reppy's work was to show how "Events" could be
made into functional objects, with operations for combination among
them.
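For reference, OCaml's Event module (in the threads library) exposes the same combinators Reppy describes: guard, wrap, choose, sync, and so on. A minimal sketch of treating an event as a first-class value (my own toy example, compiled with -thread unix.cma threads.cma):

```ocaml
(* Sketch: an Event.t is an ordinary value that can be built from
   combinators and synchronized on later. guard defers construction
   to sync time; wrap transforms the rendezvous result. *)
open Event

let () =
  let ch = new_channel () in
  (* An event value: on sync, log the attempt, then receive on ch
     and double the result. *)
  let doubled : int event =
    guard (fun () ->
        print_endline "syncing...";
        wrap (receive ch) (fun n -> 2 * n))
  in
  (* A sender thread offers 21 on the channel. *)
  ignore (Thread.create (fun () -> sync (send ch 21)) ());
  Printf.printf "%d\n" (sync doubled)  (* prints "syncing..." then 42 *)
```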
I don't have the "sort" routine translated, but here is some Lisp
code that attempts to provide the multiple-readers / single-writer
locks as might be used in a database application. It demonstrates the
use of wrap, sync, etc...
-----------------------------------
;; rwgate.lisp -- Multiple Reader/Single Writer using Reppy's Channels
;;
;; DM/MCFA 01/00
;; ----------------------------------------------------
(defpackage "RWGATE"
(:use "USEFUL-MACROS" "COMMON-LISP" "REPPY-CHANNELS" "SPM"
"LISPWORKS")
(:export "MAKE-LOCK"
"WRAP-RDLOCKEVT"
"WRAP-WRLOCKEVT"
"WITH-READLOCK"
"WITH-WRITELOCK"))
(in-package "RWGATE")
;; ---------------------------------------------------------------
;; This package implements a multiple-reader/single-writer lock
;; protocol using the amazing capabilities of the Reppy channels.
;;
;; Rules of engagement:
;;
;; 1. A lock is available for reading if no write locks are in place,
;; or else the read lock requestor is equal to the write lock holder.
;;
;; 2. A lock is available for writing if no read locks and no write locks
;; are in place,
;; or else the write lock requestor is equal to the write lock holder,
;; or else the write lock requestor is equal to every read lock holder.
;;
;; These rules ensure that multiple readers can run, while only one
;; writer can run. No requirements for nesting of read/write lock
;; requests. That is, a writer can request a read lock and vice versa,
;; and issue lock releases in any order.
;;
;; A lock holder can request any number of additional locks. The lock will
;; actually be released when an equal number of releases of like kind
;; have been obtained. For every write lock there is a write release,
;; and for every read lock there is a read release.
;;
;; The Reppy protocol is protected with UNWIND-PROTECT to ensure that
;; locks held are released on exit from the function block being executed
;; within the province of a lock. Lock releases are handled transparently
;; to the user.
;;
;; ---------------------------------------------------------------
;; Lock server protocol with event combinators
(defclass rw-lock (<serviceable-protocol-mixin>)
((rdlocks :accessor rw-lock-rdlocks :initform 0)
(wrlocks :accessor rw-lock-wrlocks :initform 0)
(wrowner :accessor rw-lock-wrowner :initform nil)
(rdqueue :accessor rw-lock-rdqueue :initform nil)
(rdowners :accessor rw-lock-rdowners :initform nil)
(wrqueue :accessor rw-lock-wrqueue :initform nil)))
(defun make-lock ()
(make-instance
'rw-lock
:handlers (list
:read
#'(lambda (req gate who)
(declare (ignore req))
(labels ((take-it ()
(incf (rw-lock-rdlocks gate))
(push who (rw-lock-rdowners gate))
(spawn #'send who t)))
(cond ((eq who (rw-lock-wrowner gate))
;; we own a write lock already so go ahead...
(take-it))
((plusp (rw-lock-wrlocks gate))
;; outstanding write lock so enqueue in
;; pending readers queue...
(push who (rw-lock-rdqueue gate)))
(t
;; no outstanding writer so take it...
(take-it))
)))
:release-read
#'(lambda (req gate who)
(declare (ignore req))
(removef (rw-lock-rdowners gate) who :count 1)
(if (and (zerop (decf (rw-lock-rdlocks gate)))
(zerop (rw-lock-wrlocks gate)))
;; no more readers and no more writers
;; (a writer might have been me...)
;; so go ahead and start writers
;; there should be no pending readers
;; since there were no writers
(let ((writer (pop (rw-lock-wrqueue gate))))
(when writer
(incf (rw-lock-wrlocks gate))
(setf (rw-lock-wrowner gate) writer)
(spawn #'send writer t))
)))
:write
#'(lambda (req gate who)
(declare (ignore req))
(labels ((take-it ()
(incf (rw-lock-wrlocks gate))
(setf (rw-lock-wrowner gate) who)
(spawn #'send who t)))
(cond ((and (zerop (rw-lock-rdlocks gate))
(zerop (rw-lock-wrlocks gate)))
;; gate available so take it
(take-it))
((eq who (rw-lock-wrowner gate))
;; gate already owned by requestor
;; so incr lock count and tell him it's okay...
(take-it))
((every #'(lambda (rdr)
(eq rdr who))
(rw-lock-rdowners gate))
;; only one reader and it is me...
;; but I may be in the list numerous times...
;; so go ahead and grab a write lock.
(take-it))
(t
;; gate not available -- put caller on
;; waiting writers queue
(conc1f (rw-lock-wrqueue gate) who))
)))
:release-write
#'(lambda (req gate who)
(declare (ignore req who))
(labels
((run-writer ()
(let ((writer (pop (rw-lock-wrqueue gate))))
(if writer
(progn
(incf (rw-lock-wrlocks gate))
(setf (rw-lock-wrowner gate) writer)
(spawn #'send writer t)
t)
nil)))
(run-readers ()
(let ((readers (rw-lock-rdqueue gate)))
(if readers
(progn
(setf (rw-lock-rdqueue gate) nil)
(appendf (rw-lock-rdowners gate) readers)
(incf (rw-lock-rdlocks gate)
(length readers))
(dolist (reader readers)
(spawn #'send reader t))
t)
nil))))
(when (zerop (decf (rw-lock-wrlocks gate)))
;; no more writers (was only me anyway...)
(setf (rw-lock-wrowner gate) nil)
(if (zerop (rw-lock-rdlocks gate))
;; if no active readers either
;; then it is a toss up whether to
;; start writers or readers
(if (zerop (random 2)) ;; add some non-determinism
(unless (run-writer)
(run-readers))
(unless (run-readers)
(run-writer)))
;; but if I was a reader too,
;; then it is only safe to start other
;; readers.
(run-readers)))
))
)))
(defun wrap-lockEvt (lock fn args req rel)
(guard
#'(lambda ()
(let ((replyCh (make-channel)))
(labels
((acquire-lock ()
(service-request req lock replyCh))
(release-lock ()
(service-request rel lock replyCh)))
(spawn #'acquire-lock)
(wrap-abort
(wrap (recvEvt replyCh)
#'(lambda (reply)
(declare (ignore reply))
(unwind-protect
(apply fn args)
(spawn #'release-lock))))
#'(lambda ()
(spawn #'(lambda ()
(recv replyCh)
(release-lock)))))
)))
))
(defmethod wrap-rdLockEvt ((lock rw-lock) fn &rest args)
(wrap-lockEvt lock fn args :read :release-read))
(defmethod wrap-wrLockEvt ((lock rw-lock) fn &rest args)
(wrap-lockEvt lock fn args :write :release-write))
(defmethod with-readlock ((lock rw-lock) fn &rest args)
(sync (apply #'wrap-rdLockEvt lock fn args)))
(defmethod with-writelock ((lock rw-lock) fn &rest args)
(sync (apply #'wrap-wrLockEvt lock fn args)))
-----------------------------------
Dr. David McClain
Chief Technical Officer
Refined Audiometrics Laboratory
4391 N. Camino Ferreo
Tucson, AZ 85750
email: dbm@refined-audiometrics.com
phone: 1.520.390.3995
web: http://refined-audiometrics.com
On Jul 15, 2010, at 11:24, Goswin von Brederlow wrote:
> Rich Neswold <rich.neswold@gmail.com> writes:
>
>> On Wed, Jul 14, 2010 at 11:09 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>
>> 4) Do some magic with Event.t?
>>
>> Problem: never used this and I could use a small example how
>> to use
>> this.
>>
>>
>> Event.t (and its associated library) *is* magical in that it
>> provides an
>> absolutely beautiful concurrent programming model. Forget about
>> select() and
>> mutexes and other ugly threading concepts. Event.t and friends is
>> how it should
>> be done.
>>
>> John H. Reppy's "Concurrent Programming in ML" provides a thorough
>> understanding of how to use this module effectively. This book
>> presents the
>> material in a very understandable way: deficiencies in current
>> threading
>> models are discussed as well as how CML solves the limitations and
>> constraints.
>> The book can be purchased or downloaded free online.
>
> It is too bad I don't want to learn CML but use OCaml. The CML examples
> from the book don't translate into OCaml since the interface is just a
> little bit different and those differences are what throws me off. I
> figure spawn becomes Thread.create and I have to add Event.sync or
> Event.select and Event.wrap at some points. At which point the book
> becomes useless for understanding how OCaml's Event module is to be used,
> I'm afraid. It also doesn't tell me what OCaml's limitations and
> constraints are.
>
> So if you have used this in ocaml could you give a short example?
> E.g. the merge sort from the book.
>
> MfG
> Goswin
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
[-- Attachment #2: Type: text/html, Size: 30795 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-15 18:24 ` Goswin von Brederlow
2010-07-15 18:37 ` David McClain
@ 2010-07-15 18:40 ` David McClain
2010-07-15 19:56 ` Rich Neswold
2 siblings, 0 replies; 31+ messages in thread
From: David McClain @ 2010-07-15 18:40 UTC (permalink / raw)
To: caml-list
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-15 18:24 ` Goswin von Brederlow
2010-07-15 18:37 ` David McClain
2010-07-15 18:40 ` David McClain
@ 2010-07-15 19:56 ` Rich Neswold
2010-07-16 4:02 ` Goswin von Brederlow
2 siblings, 1 reply; 31+ messages in thread
From: Rich Neswold @ 2010-07-15 19:56 UTC (permalink / raw)
To: Goswin von Brederlow; +Cc: caml-list
[-- Attachment #1.1: Type: text/plain, Size: 850 bytes --]
On Thu, Jul 15, 2010 at 1:24 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> It is too bad I don't want to learn CML but use Ocaml. The CML examples
> from the book don't translate into ocaml since the interface is just a
> little bit different and those differences are what throws me off.
>
> So could you give a short example? E.g. the merge sort from the book.
>
You're right: OCaml's syntax differs enough from the text that it's annoying
to cut-n-paste the examples. Fortunately most examples are a few lines of
code, so I didn't realize how difficult it could be on a larger example
(like the merge sort example.)
Attached is my translation of the mergeSort from the book. You get to play
with it and see if it works :)
--
Rich
Google Reader: https://www.google.com/reader/shared/rich.neswold
Jabber ID: rich@neswold.homeunix.net
[-- Attachment #1.2: Type: text/html, Size: 1444 bytes --]
[-- Attachment #2: mergesort.ml --]
[-- Type: text/x-ocaml, Size: 1487 bytes --]
open Event
let mySend c d = sync (send c d)
and myRecv c = sync (receive c)
let spawn f =
ignore (Thread.create f ())
let split (inCh, outCh1, outCh2) =
let rec loop = function
| (None, _, _) -> (mySend outCh1 None; mySend outCh2 None)
| (x, out1, out2) -> (mySend out1 x; loop (myRecv inCh, out2, out1))
in
loop (myRecv inCh, outCh1, outCh2)
let merge (inCh1, inCh2, (outCh : int option channel)) =
let rec copy (fromCh, toCh) =
let rec loop v =
begin
mySend toCh v;
match v with
| Some _ -> loop (myRecv fromCh)
| None -> ()
end
in
loop (myRecv fromCh)
and mergep (from1, from2) =
match (from1, from2) with
| (None, None) -> mySend outCh None
| (_, None) -> (mySend outCh from1; copy (inCh1, outCh))
| (None, _) -> (mySend outCh from2; copy (inCh2, outCh))
| (Some a, Some b) ->
if a < b then
(mySend outCh from1; mergep (myRecv inCh1, from2))
else
(mySend outCh from2; mergep (from1, myRecv inCh2))
in
mergep (myRecv inCh1, myRecv inCh2)
let rec mergeSort () =
let ch = new_channel() in
let sort () =
(match myRecv ch with
| None -> ()
| v1 ->
begin
(match myRecv ch with
| None -> mySend ch v1
| v2 ->
let ch1 = mergeSort()
and ch2 = mergeSort()
in
begin
mySend ch1 v1;
mySend ch2 v2;
split (ch, ch1, ch2);
merge (ch1, ch2, ch)
end);
mySend ch None
end)
in
(spawn sort; ch)
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-15 19:56 ` Rich Neswold
@ 2010-07-16 4:02 ` Goswin von Brederlow
2010-07-16 4:23 ` Rich Neswold
2010-07-17 18:34 ` Eray Ozkural
0 siblings, 2 replies; 31+ messages in thread
From: Goswin von Brederlow @ 2010-07-16 4:02 UTC (permalink / raw)
To: Rich Neswold; +Cc: Goswin von Brederlow, caml-list
Rich Neswold <rich.neswold@gmail.com> writes:
> On Thu, Jul 15, 2010 at 1:24 PM, Goswin von Brederlow <goswin-v-b@web.de>
> wrote:
>
> It is too bad I don't want to learn CML but use Ocaml. The CML examples
> from the book don't translate into ocaml since the interface is just a
> little bit different and those differences are what throws me off.
>
>
>
> So could you give a short example? E.g. the merge sort from the book.
>
>
> You're right: OCaml's syntax differs enough from the text that it's annoying to
> cut-n-paste the examples. Fortunately most example are a few lines of code, so
> I didn't realize how difficult it could be on a larger example (like the merge
> sort example.)
>
> Attached is my translation of the mergeSort from the book. You get to play with
> it and see if it works :)
Thanks. That is about what I got so I do seem to understand the
differences right.
For my use case this would then come down to implement solution 3 with
channels instead of my own queues. Well, channels are thread safe queues
just by another name. I think I see now how they make the code simpler
to write.
MfG
Goswin
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-16 4:02 ` Goswin von Brederlow
@ 2010-07-16 4:23 ` Rich Neswold
2010-07-16 13:02 ` Goswin von Brederlow
2010-07-17 18:34 ` Eray Ozkural
1 sibling, 1 reply; 31+ messages in thread
From: Rich Neswold @ 2010-07-16 4:23 UTC (permalink / raw)
To: Goswin von Brederlow; +Cc: caml-list
[-- Attachment #1: Type: text/plain, Size: 1065 bytes --]
On Thu, Jul 15, 2010 at 11:02 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Rich Neswold <rich.neswold@gmail.com> writes:
>
> Thanks. That is about what I got so I do seem to understand the
> differences right.
>
> For my use case this would then come down to implement solution 3 with
> channels instead of my own queues. Well, channels are thread safe queues
> just by another name. I think I see now how they make the code simpler
> to write.
>
Channels are thread-safe communication paths of depth one (i.e. you can
only pass one item at a time). The channel is a primitive that allows
reliable synchronized communication between threads. The Reppy book
describes in later chapters how to use the channel primitive to build up
queues and other, more complicated constructs (like RPCs and multicasting
to many processes).
In fact, you might use the RPC ideas to pass your checksumming requests to
worker tasks and receive the results.
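To make the queue-on-channels idea concrete, here is a hypothetical sketch (the names make/push/pop are my own, not from any library) of an unbounded mailbox built from two Event channels and a server thread, along the lines of the buffered-channel construction in Reppy's book:

```ocaml
(* Sketch: an unbounded mailbox on top of synchronous channels. A
   server thread owns the queue, so a push only blocks for one
   rendezvous with the server, never until a consumer arrives. *)
open Event

type 'a mbox = { inch : 'a channel; outch : 'a channel }

let make () =
  let inch = new_channel () and outch = new_channel () in
  let rec serve q =
    if Queue.is_empty q then begin
      (* Empty: the only thing we can do is accept a push. *)
      Queue.add (sync (receive inch)) q;
      serve q
    end else begin
      (* Non-empty: offer the front element to a popper while still
         accepting pushes; choose commits to whichever happens. *)
      let x = Queue.peek q in
      sync (choose
              [ wrap (receive inch) (fun y -> Queue.add y q);
                wrap (send outch x) (fun () -> ignore (Queue.pop q)) ]);
      serve q
    end
  in
  ignore (Thread.create serve (Queue.create ()));
  { inch; outch }

let push mb x = sync (send mb.inch x)  (* returns once the server takes it *)
let pop mb = receive mb.outch          (* an event; sync on it to block *)
```

With this, two pushes in a row no longer deadlock the sender, which is exactly the asynchronous behavior asked about later in the thread.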
--
Rich
Google Reader: https://www.google.com/reader/shared/rich.neswold
Jabber ID: rich@neswold.homeunix.net
[-- Attachment #2: Type: text/html, Size: 1570 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-16 4:23 ` Rich Neswold
@ 2010-07-16 13:02 ` Goswin von Brederlow
2010-07-16 14:40 ` Dawid Toton
` (2 more replies)
0 siblings, 3 replies; 31+ messages in thread
From: Goswin von Brederlow @ 2010-07-16 13:02 UTC (permalink / raw)
To: Rich Neswold; +Cc: Goswin von Brederlow, caml-list
Rich Neswold <rich.neswold@gmail.com> writes:
> On Thu, Jul 15, 2010 at 11:02 PM, Goswin von Brederlow <goswin-v-b@web.de>
> wrote:
>
> Rich Neswold <rich.neswold@gmail.com> writes:
>
> Thanks. That is about what I got so I do seem to understand the
> differences right.
>
> For my use case this would then come down to implement solution 3 with
> channels instead of my own queues. Well, channels are thread safe queues
> just by another name. I think I see now how they make the code simpler
> to write.
>
>
> Channels are thread-safe communication paths of depth one (i.e. you can
> only pass one item at a time). The channel is a primitive that allows reliable
Urgs, so what happens if I call "sync (send ...)" twice without the
other end calling receive? Let's test:
let ch = Event.new_channel ()

let receiver () =
  for i = 0 to 10 do
    Printf.printf "received %d\n" (Event.sync (Event.receive ch));
    flush_all ();
    Unix.sleep 2;
  done

let _ =
  ignore (Thread.create receiver ());
  for i = 0 to 10 do
    Printf.printf "sending %d\n" i;
    flush_all ();
    Event.sync (Event.send ch i);
    Unix.sleep 1;
  done

% ocamlopt -thread -o foo unix.cmxa threads.cmxa foo.ml && ./foo
sending 0
received 0
sending 1
received 1
sending 2
received 2
sending 3
received 3
...
So the send blocks until the event is received. That certainly isn't
helpful for me. One could say I want asynchronous remote function calls
(and returns).
> synchronized communication between threads. The Reppy book describes in later
> chapters how to use the channel primitive to build up queues and other, more
> complicated constructs (like RPCs and multicasting to many processes.)
>
> In fact, you might use the RPC ideas to pass your checksumming requests to
> worker tasks and receive the results.
Yeah. But then why not build it around the simple Mutex and Condition
modules instead of Event? At first glance the blocking aspect of Events
seems to be more of a hindrance than a help.
I find it odd that there is no Threaded_Queue module, a thread-safe
version of Queue with 2 extra functions:
val wait_and_take : 'a t -> 'a
wait_and_take q waits for the queue q to be not empty and removes and
returns the first element in queue q. Raises Empty when the queue is
closed.
val close : 'a t -> unit
close q closes the queue and wakes up waiting threads.
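For what it's worth, a minimal sketch of such a Threaded_Queue on top of Mutex and Condition might look like this (the names wait_and_take and close follow the wishlist above; the rest, including the reading that Empty is raised only once the queue is closed *and* drained, is my own assumption):

```ocaml
(* Sketch: a thread-safe queue with blocking take and close. *)
type 'a t = {
  q : 'a Queue.t;
  m : Mutex.t;
  c : Condition.t;
  mutable closed : bool;
}

let create () =
  { q = Queue.create (); m = Mutex.create ();
    c = Condition.create (); closed = false }

let add x t =
  Mutex.lock t.m;
  Queue.add x t.q;
  Condition.signal t.c;     (* wake one waiting taker *)
  Mutex.unlock t.m

let wait_and_take t =
  Mutex.lock t.m;
  (* Sleep until an element arrives or the queue is closed. *)
  while Queue.is_empty t.q && not t.closed do
    Condition.wait t.c t.m
  done;
  if Queue.is_empty t.q then begin
    (* Closed and drained. *)
    Mutex.unlock t.m;
    raise Queue.Empty
  end else begin
    let x = Queue.pop t.q in
    Mutex.unlock t.m;
    x
  end

let close t =
  Mutex.lock t.m;
  t.closed <- true;
  Condition.broadcast t.c;  (* wake every waiting taker *)
  Mutex.unlock t.m
```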
MfG
Goswin
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Smart ways to implement worker threads
2010-07-16 13:02 ` Goswin von Brederlow
@ 2010-07-16 14:40 ` Dawid Toton
2010-07-16 16:18 ` [Caml-list] " Rich Neswold
2010-07-20 4:54 ` Satoshi Ogasawara
2 siblings, 0 replies; 31+ messages in thread
From: Dawid Toton @ 2010-07-16 14:40 UTC (permalink / raw)
To: caml-list
> I find it odd that there is no Threaded_Queue module, a thread-safe
> version of Queue with 2 extra functions: (...)
>
I use such a thread-safe queue a lot [1]. This is very simple yet
universal enough. Honestly I can hardly remember using more elaborate
constructs.
Dawid
[1] http://pfpleia.if.uj.edu.pl/projects/HLibrary/browser/HQueue.ml
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-16 13:02 ` Goswin von Brederlow
2010-07-16 14:40 ` Dawid Toton
@ 2010-07-16 16:18 ` Rich Neswold
2010-07-17 17:53 ` Eray Ozkural
2010-07-20 4:54 ` Satoshi Ogasawara
2 siblings, 1 reply; 31+ messages in thread
From: Rich Neswold @ 2010-07-16 16:18 UTC (permalink / raw)
To: Goswin von Brederlow; +Cc: caml-list
[-- Attachment #1: Type: text/plain, Size: 1563 bytes --]
On Fri, Jul 16, 2010 at 8:02 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Yeah. But then why not build it around the simple Mutex and Condition
> modules instead of Event? At first glance the blocking aspect of Events
> seem to be more of a hindrance than help.
>
The problem with Mutex and Condition is all the management that goes with
them. You must make sure your unlocks match your locks -- even in the
presence of exceptions. None of this bookkeeping is required from the
programmer when using Event.
The bigger win for Events, though, is that they're composable.
Let's say you decide to have two queues to communicate with two workers.
With the mutex/queue scenario, you'd have to resort to polling each queue --
locking and unlocking each mutex in turn (or decide that a single mutex
protects both queues ... but this approach doesn't scale well.)
With events, you pass a list of events to 'Event.choose' and the process
will be awakened when one of the events returns a value. No need to widen
the protection of mutexes; adding more events to the list is enough.
Granted, if you mix and match event types (sends and receives, for instance)
you may have to use 'Event.wrap' to make sure they return a compatible type.
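A small, hypothetical sketch of the pattern just described (two made-up worker channels, Event.choose plus Event.wrap to unify the result types):

```ocaml
(* Sketch: waiting on two worker reply channels at once. wrap gives
   both branches a common string type; choose wakes us for whichever
   rendezvous is ready first -- no polling, no widened mutex. *)
open Event

let collect () =
  let ch_a = new_channel () and ch_b = new_channel () in
  (* Two "workers", each delivering one result. *)
  ignore (Thread.create (fun () -> sync (send ch_a 42)) ());
  ignore (Thread.create (fun () -> sync (send ch_b "done")) ());
  let got = ref [] in
  for _round = 1 to 2 do
    let msg =
      sync (choose
              [ wrap (receive ch_a) (fun n -> "A: " ^ string_of_int n);
                wrap (receive ch_b) (fun s -> "B: " ^ s) ])
    in
    got := msg :: !got
  done;
  (* Arrival order is nondeterministic; sort for a stable result. *)
  List.sort compare !got

let () = List.iter print_endline (collect ())
```

Adding a third worker is just one more element in the list passed to choose.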
Event may not be the best solution for your problem, or it may even lie
outside your comfort zone. Hopefully this discussion has, at the very least,
made some OCaml users take another look at the Event module.
--
Rich
Google Reader: https://www.google.com/reader/shared/rich.neswold
Jabber ID: rich@neswold.homeunix.net
[-- Attachment #2: Type: text/html, Size: 2091 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-16 16:18 ` [Caml-list] " Rich Neswold
@ 2010-07-17 17:53 ` Eray Ozkural
0 siblings, 0 replies; 31+ messages in thread
From: Eray Ozkural @ 2010-07-17 17:53 UTC (permalink / raw)
To: Rich Neswold; +Cc: Goswin von Brederlow, caml-list
It does encourage me to take a shot at using the Event module.
One of the promised advantages of object orientation was that event-driven programming and concurrent processes would fit it well.
Yet years later, event-driven programming exists mostly as the main loop of GUI libraries, implemented in cumbersome imperative languages.
It has been suggested that traditional synchronization primitives are deficient. As with message-passing libraries, it is too easy to write incorrect code with them (since they assume all programmers must be naive enough to still code in C). I think it is so much easier to make a buggy implementation in POSIX threads that I advise against (having to) use them in any language.
In my experience with shared memory programming I found it a better option to write code in OpenMP. Charm++-like spawn directives aren't too bad either, and they fit functional languages.
I suppose with some compiler help we could have such parallelization; maybe some camlp4 macros are enough. After all, it's not a bad idea to isolate synchronization in the program, and I bet it can be done in a safe way. What would be needed to adapt the OpenMP idea to OCaml, assuming we have proper multicore support (lock-free parallel garbage collector, real threads, etc.)?
Considering that architectural details cannot be so easily ignored on a multicore architecture, I wonder how to do this best. Perhaps some way to specify fine-grained parallelism would be more flexible (like fine-grained kernels in CUDA); I think architectural details would be important for parallel code optimizations.
We had used a thread/critical-region based intermediate program representation for C, but I think we could do better with functional languages, towards a fully implicitly parallel PL.
Of course there are various existing approaches to implementing explicit parallelism on top of OCaml. I wonder if the INRIA team would consider adapting those to a shared memory architecture. Short of the high technology we need (implicit parallelism), we can be satisfied with cool functional and architecture-agnostic parallelism primitives.
So my wishlist is:
1) low level shared memory programming primitives and lang support
2) high level parallel programming facility that is completely independent of target architecture
3) implicit parallelism that uses an automatic parallelization approach
Cheers,
Eray
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-16 13:02 ` Goswin von Brederlow
2010-07-16 14:40 ` Dawid Toton
2010-07-16 16:18 ` [Caml-list] " Rich Neswold
@ 2010-07-20 4:54 ` Satoshi Ogasawara
2 siblings, 0 replies; 31+ messages in thread
From: Satoshi Ogasawara @ 2010-07-20 4:54 UTC (permalink / raw)
To: Goswin von Brederlow; +Cc: Rich Neswold, caml-list
On 2010/07/16, at 22:02, Goswin von Brederlow wrote:
> Urgs, so what happens if I call "sync (send ...)" twice without the
> other end calling recieve? Lets test:
>
> let ch = Event.new_channel ()
> ...
That's not a good use of synchronous channels. If you want asynchronous
behavior, try the Mbox module in ccell (concurrent cell):
https://forge.ocamlcore.org/scm/viewvc.php/trunk/mbox.mli?view=markup&root=ccell
open Printf
open Ccell
open Event
let mbox = Mbox.make ()
let receiver () =
for i = 0 to 10 do
printf "received %d\n%!" (sync (Mbox.pop mbox));
Thread.delay 2.;
done
let _ =
ignore (Thread.create receiver ());
for i = 0 to 10 do
printf "sending %d\n%!" i;
sync (Mbox.push mbox i);
Thread.delay 1.;
done
wednesday:tmp osiire$ ocamlc -thread unix.cma threads.cma -I +site-lib/ccell ccell.cma async.ml && ./a.out
sending 0
received 0
sending 1
received 1
sending 2
sending 3
received 2
sending 4
sending 5
received 3
sending 6
sending 7
received 4
sending 8
sending 9
received 5
sending 10
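For contrast, here is a minimal sketch (stdlib Event only, no Ccell) of why the quoted "sync (send ...) twice" question blocks: a channel send is a rendezvous, not a buffered queue, so each sync waits for a matching receive in another thread. Compile with the usual -thread flags.

```ocaml
(* With the stdlib Event module, sync (Event.send ch v) is a rendezvous:
   it blocks until another thread syncs on a matching receive. *)
let received = ref []

let () =
  let ch = Event.new_channel () in
  let t =
    Thread.create
      (fun () ->
         for _i = 0 to 2 do
           received := Event.sync (Event.receive ch) :: !received
         done)
      ()
  in
  for i = 0 to 2 do
    (* blocks here until the receiver thread is ready to take the value *)
    Event.sync (Event.send ch i)
  done;
  Thread.join t
```

Each send only completes once the receiver commits, which is exactly why an unbuffered producer cannot run ahead of its consumer.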
With the Mbox module, you can also wait for and select among long-running calculation results like this.
open Printf
open Ccell
open Event

let rec forever f x =
  let v = f x in forever f v

let spawn_loop f x =
  ignore (Thread.create (forever f) x)

let make_worker f =
  let input, output = Mbox.make (), Mbox.make () in
  let work () =
    sync (Mbox.push output (f (sync (Mbox.pop input))))
  in
  spawn_loop work ();
  input, output

let request (input, _) p =
  sync (Mbox.push input p)

let response (_, output) =
  Mbox.pop output

let worker1 = make_worker (printf "action worker1 %d\n%!")
let worker2 = make_worker (printf "action worker2 %d\n%!")

let after_long_calc (e1, e2) =
  select [
    wrap e1 (fun _ -> printf "after work1\n%!"; (response worker1, e2));
    wrap e2 (fun _ -> printf "after work2\n%!"; (e1, response worker2));
  ]

let _ =
  spawn_loop after_long_calc (response worker1, response worker2);
  request worker1 1;
  Thread.delay 1.;
  request worker2 2;
  Thread.delay 1.;
  request worker2 3;
  Thread.delay 1.;
  request worker1 4;
  Thread.delay 2.
wednesday:tmp osiire$ ocamlc -thread unix.cma threads.cma -I +site-lib/ccell ccell.cma worker.ml && ./a.out
action worker1 1
after work1
action worker2 2
after work2
action worker2 3
after work2
action worker1 4
after work1
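For reference, the select/wrap pair used above is also available in the stdlib Event module with the same semantics: select syncs on whichever event in the list commits first and runs that event's wrapper. A small self-contained sketch:

```ocaml
(* select = sync of choose: wait on several events at once; the wrapper
   of whichever event commits first produces the result. *)
let result = ref ""

let () =
  let ch1 = Event.new_channel () and ch2 = Event.new_channel () in
  (* only ch2 ever gets a sender, so select must commit on ch2 *)
  ignore (Thread.create (fun () -> Event.sync (Event.send ch2 "two")) ());
  result :=
    Event.select [
      Event.wrap (Event.receive ch1) (fun s -> "got ch1: " ^ s);
      Event.wrap (Event.receive ch2) (fun s -> "got ch2: " ^ s);
    ]
```

Because events are first-class values, the two wrapped receives can be built, stored, and combined before anything blocks, which is the point of Reppy's design.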
I hope this will be helpful for you.
---
satoshi ogasawara
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-16 4:02 ` Goswin von Brederlow
2010-07-16 4:23 ` Rich Neswold
@ 2010-07-17 18:34 ` Eray Ozkural
2010-07-17 19:35 ` Goswin von Brederlow
1 sibling, 1 reply; 31+ messages in thread
From: Eray Ozkural @ 2010-07-17 18:34 UTC (permalink / raw)
To: Goswin von Brederlow; +Cc: Rich Neswold, caml-list
When I'm implementing a parallel/distributed algorithm, I take care to make the communication code abstract enough to be re-used. Since abstractions in C-derivative languages (function pointers, templates, etc.) are a joke, one need not bother with this, as the expected future code re-use isn't much.
On the other hand, in OCaml we have the best module system, which can be used to create advanced parallel data structures and algorithms. A shared-memory queue would be among the most trivial of such constructs. If we think of world domination, we have to see the whole picture. Parallel programming is considered difficult because common programmers don't understand enough algebra to see that most problems could be solved by substituting an operator into a generic parallel algorithm template. Or that optimal algorithms could be re-used combinatorially (consider the relation of a mesh topology to a linear topology with s/f routing).
The fact is that an efficient parallel algorithm need not be long (theory suggests that fastest is shortest). It is our collective lack of creativity that usually makes it long.
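The "substitute an operator into a generic template" idea can be made concrete with the module system. A sketch (the names MONOID, ParReduce, and IntSum are illustrative, and under OCaml's single runtime lock the helper thread adds no real speedup; it only shows the shape of the skeleton):

```ocaml
(* A reduction skeleton parameterized over any associative operator.
   Swapping the operator (sum, max, matrix product, ...) reuses the
   whole parallel structure unchanged. *)
module type MONOID = sig
  type t
  val unit : t
  val op : t -> t -> t  (* assumed associative *)
end

module ParReduce (M : MONOID) = struct
  let seq_reduce a lo hi =
    let acc = ref M.unit in
    for i = lo to hi - 1 do acc := M.op !acc a.(i) done;
    !acc

  (* Reduce one half in a helper thread, the other half in the current
     thread, then combine the partial results. *)
  let reduce a =
    let n = Array.length a in
    let mid = n / 2 in
    let left = ref M.unit in
    let t = Thread.create (fun () -> left := seq_reduce a 0 mid) () in
    let right = seq_reduce a mid n in
    Thread.join t;
    M.op !left right
end

module IntSum = ParReduce (struct
  type t = int
  let unit = 0
  let op = ( + )
end)
```

Instantiating the functor with a different monoid, say max with `min_int` as unit, gives a parallel maximum with no change to the skeleton.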
Cheers,
Eray
PS: parallelism is not tangential here. I believe it is unnecessary to implement asynchronous processes just for the sake of handling overlapping I/O and computation. That's like parallelism for a high-school programming class.
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-17 18:34 ` Eray Ozkural
@ 2010-07-17 19:35 ` Goswin von Brederlow
2010-07-17 22:00 ` Eray Ozkural
0 siblings, 1 reply; 31+ messages in thread
From: Goswin von Brederlow @ 2010-07-17 19:35 UTC (permalink / raw)
To: Eray Ozkural; +Cc: Goswin von Brederlow, Rich Neswold, caml-list
Eray Ozkural <examachine@gmail.com> writes:
> When I'm implementing a parallel/dist algorithm I take care of making
> the communication code abstract enough to be re-used. Since
> abstraction in C derivative languages (function pointers, templates
> etc) are a joke; one need not bother with this as expected future code
> re-use isn't much.
>
> On the other hand in ocaml we have the best module system which can be
> used to create advanced parallel data structures and algorithms. A
> shared mem queue would be among the most trivial of such
> constructs. If we think of world domination we have to see the whole
> picture. Parallel programming is considered difficult because common
> programmers don't understand enough algebra to see that most problems
> could be solved by substituting an operator in a generic parallel
> algorithm template. Or that optimal algorithms could be re-used
> combinatorially (consider the relation of a mesh topology to a linear
> topology with s/f routing)
Yeah. I would love to have all the basic OCaml modules also in a thread-safe
flavour.
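A sketch of what such a thread-safe flavour could look like for Queue, using Mutex and Condition from the threads library (the module name SafeQueue is illustrative, not an existing module):

```ocaml
(* A blocking, thread-safe wrapper around the stdlib Queue:
   every operation holds a mutex, and pop waits on a condition
   variable until an element is available. *)
module SafeQueue : sig
  type 'a t
  val create : unit -> 'a t
  val push : 'a -> 'a t -> unit
  val pop : 'a t -> 'a  (* blocks until an element is available *)
end = struct
  type 'a t = { q : 'a Queue.t; m : Mutex.t; c : Condition.t }

  let create () =
    { q = Queue.create (); m = Mutex.create (); c = Condition.create () }

  let push x t =
    Mutex.lock t.m;
    Queue.push x t.q;
    Condition.signal t.c;  (* wake one waiting consumer *)
    Mutex.unlock t.m

  let pop t =
    Mutex.lock t.m;
    while Queue.is_empty t.q do Condition.wait t.c t.m done;
    let x = Queue.pop t.q in
    Mutex.unlock t.m;
    x
end
```

The while loop around Condition.wait is deliberate: a woken thread must re-check emptiness because another consumer may have taken the element first.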
> The fact is that an efficient parallel algorithm need not be long
> (theory suggests that fastest is shortest). It is our collective lack
> of creativity that usually makes it long.
>
> Cheers,
>
> Eray
>
> Ps: parallelism is not tangential here. I believe it is unnecessary to
> implement asynchronous processes just for the sake of handling
> overlapping I/O and computation. That's like parallelism for high
> school programming class.
I'm a big fan of doing IO asynchronously in a single thread. Given that
OCaml can only run one thread at a time anyway, there is usually no speed
gain in using multiple threads. The overhead of making the code thread-safe
is big in complexity (if only we had thread-safe modules :) and
all you end up doing is waiting on more cores.
The exception is when you offload work to C code that can run within
enter_blocking_section() / leave_blocking_section() and you have enough
work to keep multiple cores busy. For example, doing blockwise sha256
sums with 4 cores is 3-3.8 times as fast as single-threaded, depending on
block size.
MfG
Goswin
* Re: [Caml-list] Smart ways to implement worker threads
2010-07-17 19:35 ` Goswin von Brederlow
@ 2010-07-17 22:00 ` Eray Ozkural
0 siblings, 0 replies; 31+ messages in thread
From: Eray Ozkural @ 2010-07-17 22:00 UTC (permalink / raw)
To: Goswin von Brederlow; +Cc: Goswin von Brederlow, Rich Neswold, caml-list
It looks like multi-core architectures will be around for a while, only to be superseded by more advanced parallel micro-architectures.
Of course computation-bound apps will benefit more, but this does not mean that memory-intensive apps will not run well on multi-core architectures. Carefully optimized code will run great on multi-core architectures with high memory bandwidth, depending on the application of course.
Right now it seems to me that shared memory is more suitable for data-mining apps (and many others, including scientific computing apps). Data mining has never been embarrassingly parallel, and it's difficult to get good speedup on distributed architectures. I should know; I've worked on the parallel solution of two such problems. Provided that the space complexity is manageable, I think such network-bound cluster algorithms have more efficient counterparts on multi-core. In data-mining algos some data may have to be replicated or communicated. If the space doesn't blow up, NVIDIA Fermi would be ideal for those problems. No more copying. On the new Tesla the maximum memory will be 6x4 = 24 GB.
For all I care Intel is obsolete until they give me a thousand cores.
Shared memory support is essential at any rate. I don't see why we can't have it in an excellent way!
Best,
Eray