CallCC using fork (and garbage collected processes)

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* CallCC using fork (and garbage collected processes)
@ 2007-07-16  5:30 Jon Harrop
  2007-07-16  6:07 ` [Caml-list] " skaller
  0 siblings, 1 reply; 5+ messages in thread
From: Jon Harrop @ 2007-07-16  5:30 UTC (permalink / raw)
  To: caml-list

Just another crazy idea. So Oleg's callcc for OCaml works by copying the OCaml 
bytecode stack and its rather nifty because you can trivially control invert 
parsers and so forth.

You can replicate the native-code stack using fork. What if you wrap the 
forked process in an object and set the finalizer to pass it a message 
telling it to die. Then you could invoke the process (continuation) to 
propagate computation. Maybe you can pass your child the pipe from your 
parent so that it can respond directly and then die yourself (a tail call!).

I know this is really silly because you've got the OCaml GC collecting 
processes (which is even worse than GCing OpenGL textures) but my machine can 
fork processes pretty quickly and I'm just wondering if anyone's tried it?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] CallCC using fork (and garbage collected processes)
  2007-07-16  5:30 CallCC using fork (and garbage collected processes) Jon Harrop
@ 2007-07-16  6:07 ` skaller
  2007-07-16 16:46   ` Richard Jones
  0 siblings, 1 reply; 5+ messages in thread
From: skaller @ 2007-07-16  6:07 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

On Mon, 2007-07-16 at 06:30 +0100, Jon Harrop wrote:
> Just another crazy idea. So Oleg's callcc for OCaml works by copying the OCaml 
> bytecode stack and its rather nifty because you can trivially control invert 
> parsers and so forth.
> 
> You can replicate the native-code stack using fork. What if you wrap the 
> forked process in an object and set the finalizer to pass it a message 
> telling it to die. Then you could invoke the process (continuation) to 
> propagate computation. Maybe you can pass your child the pipe from your 
> parent so that it can respond directly and then die yourself (a tail call!).
> 
> I know this is really silly because you've got the OCaml GC collecting 
> processes (which is even worse than GCing OpenGL textures) but my machine can 
> fork processes pretty quickly and I'm just wondering if anyone's tried it?

A much reduced version of this idea has been tried
in C, swapping the C stack pointer and registers using
setjmp/longjmp and some assembler.

This is, of course, much faster than forking.

However the C method doesn't work, because Unix uses
linear bounded stacks, and there isn't enough address 
space to create enough stacks.

The forking idea has some interest because this problem
might go away. However it isn't necessary if you COPY
the stack, because the copy isn't executing so doesn't
need a bound. Of course, forking copies the stack
lazily (because it copies everything lazily) so it
actually might be quite efficient compared to copying.

Note the Fork idea will NOT work in ML anyhow, since
ML (including Ocaml) is a procedural programming language.
Fork won't support writing to shared memory (eg ref types).

Ultimately though, the only viable way to do this properly
is to use a spaghetti stack on the heap as Felix does.
If you're going to run with user space threads, you want
to be able to run millions of them and context switch
very quickly .. otherwise you can just use Threads
and the Event module for communications.

Felix has no trouble with a million threads on an AMD64 with
1G of RAM ..somehow I doubt your box has enough process
handles to spawn that many forked processes .. :)

MLton solves this problem by mmap()ing the stacks:
fast context switching AND solves the bound problem.
Stacks grow linearly and then get compacted .. its very cute!

The bottom line here is that programming language execution
models have to STOP using the stack except for very temporary
storage -- but even that isn't going to work, if you're calling
C code which calls back into your language.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] CallCC using fork (and garbage collected processes)
  2007-07-16  6:07 ` [Caml-list] " skaller
@ 2007-07-16 16:46   ` Richard Jones
  2007-07-16 17:25     ` skaller
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Jones @ 2007-07-16 16:46 UTC (permalink / raw)
  To: skaller; +Cc: Jon Harrop, caml-list

On Mon, Jul 16, 2007 at 04:07:23PM +1000, skaller wrote:
> A much reduced version of this idea has been tried
> in C, swapping the C stack pointer and registers using
> setjmp/longjmp and some assembler.

It's even easier to use makecontext(3) et al, if you have them.  I
wrote a C implementation which tries a variety of strategies
(makecontext, setjmp, assembly), here:
http://annexia.org/freeware/pthrlib

> This is, of course, much faster than forking.
> 
> However the C method doesn't work, because Unix uses
> linear bounded stacks, and there isn't enough address 
> space to create enough stacks.

This is more of a problem with 32 bit machines.  64 bit machines
shouldn't have an issue.  Even on a 32 bit machine I could get
thousands of threads before hitting a limit.

Rich.

-- 
Richard Jones
Red Hat

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] CallCC using fork (and garbage collected processes)
  2007-07-16 16:46   ` Richard Jones
@ 2007-07-16 17:25     ` skaller
  2007-07-16 17:57       ` Tom
  0 siblings, 1 reply; 5+ messages in thread
From: skaller @ 2007-07-16 17:25 UTC (permalink / raw)
  To: Richard Jones; +Cc: Jon Harrop, caml-list

On Mon, 2007-07-16 at 17:46 +0100, Richard Jones wrote:

> > However the C method doesn't work, because Unix uses
> > linear bounded stacks, and there isn't enough address 
> > space to create enough stacks.
> 
> This is more of a problem with 32 bit machines.  64 bit machines
> shouldn't have an issue.  Even on a 32 bit machine I could get
> thousands of threads before hitting a limit.

Yeah, 16 bit machines have some problems but 32 bits is HEAPS .. 
woops .. 32 bit machines have some problems but 64 bits is HEAPS ..
woops .. :)

Yes, more can be done with 64 bits: say you allocate 8 bits for
stack choices, that's 256 stacks, 16 bits is 64K .. still
of your pair of 32 bit half words, say one half is used up,
the 16 bits is half the bits you have left .. just for 64K
threads .. which isn't really a lot.

Minimum stack size is one 4K page .. 4K x 64K = 256Meg ..
a quarter of my RAM wasted .. goto 18 bits and ALL my RAM 
is wasted (on 1G machine).

Yes, 64 bit box does give much higher capacity .. but Felix
can run several million threads on the same box without
any such problems because it uses linked heap objects instead
of a linear stack.

One stack is faster than the heap.. it isn't clear that
millions of stacks is faster though, especially if you have
a good heap manager such as the Ocaml gc. 

Actually be interesting to measure the relative performance
of spaghetti and linear stack on 64 bit box.

The one to look at here is MLton IMHO .. since it uses
a garbage collected (compacted) linear stack with mmap() 
extension .. hopefully obtaining the benefits of both
linear performance from linear addresing and low
resource usage from spaghetti stack implementation.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] CallCC using fork (and garbage collected processes)
  2007-07-16 17:25     ` skaller
@ 2007-07-16 17:57       ` Tom
  0 siblings, 0 replies; 5+ messages in thread
From: Tom @ 2007-07-16 17:57 UTC (permalink / raw)
  To: skaller; +Cc: Richard Jones, caml-list

[-- Attachment #1: Type: text/plain, Size: 496 bytes --]

On 16/07/07, skaller <skaller@users.sourceforge.net> wrote:
>
>
> The one to look at here is MLton IMHO .. since it uses
> a garbage collected (compacted) linear stack with mmap()
> extension .. hopefully obtaining the benefits of both
> linear performance from linear addresing and low
> resource usage from spaghetti stack implementation.
>

How exactly does that work? Is there any reference where I could read more
about that (except the MLton sources)? Is using mmap() crossplatform?

- Tom

[-- Attachment #2: Type: text/html, Size: 797 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-07-16 17:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-07-16  5:30 CallCC using fork (and garbage collected processes) Jon Harrop
2007-07-16  6:07 ` [Caml-list] " skaller
2007-07-16 16:46   ` Richard Jones
2007-07-16 17:25     ` skaller
2007-07-16 17:57       ` Tom

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).