caml-list - the Caml user's mailing list
* [Caml-list] Why systhreads?
@ 2002-11-23  9:08 Lauri Alanko
  2002-11-24  7:36 ` Sven Luther
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Lauri Alanko @ 2002-11-23  9:08 UTC (permalink / raw)
  To: caml-list

Hello.

A simple, fundamental question: why is native-code threading done using
system threads? Why isn't pure user-level scheduling used as with
bytecode?

It seems that incompatibilities and deficiencies in Win32 threads and
pthreads cause no end of trouble all the time; for instance, they fail
to support the asynchronous exceptions which I yearned for.

Since there is a single heap and threads are run in a strictly
serialized order, system threads don't even give any support for
parallelism. So user-level threading seems like the sensible option. For
instance, the GHC Haskell compiler uses pure user-level threading both
in native code and when interpreted, and it works pretty well. (All
right, there's now talk of adding systhread support, but only for
foreign interface issues.)

I cannot believe that supporting many different system thread interfaces
is easier than managing native-code stacks manually. So could someone
please clarify what the motivation here is?

Thanks.


Lauri Alanko
la@iki.fi
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


* Re: [Caml-list] Why systhreads?
  2002-11-23  9:08 [Caml-list] Why systhreads? Lauri Alanko
@ 2002-11-24  7:36 ` Sven Luther
  2002-11-24 17:41   ` Chris Hecker
  2002-11-24 17:14 ` Vitaly Lugovsky
  2002-11-25 10:01 ` Xavier Leroy
  2 siblings, 1 reply; 28+ messages in thread
From: Sven Luther @ 2002-11-24  7:36 UTC (permalink / raw)
  To: Lauri Alanko; +Cc: caml-list

On Sat, Nov 23, 2002 at 11:08:06AM +0200, Lauri Alanko wrote:
> Hello.
> 
> A simple, fundamental question: why is native-code threading done using
> system threads? Why isn't pure user-level scheduling used as with
> bytecode?

I don't really know about Windows (which is a pain to use OCaml on
anyway), but on Unix you can choose at compile time to use either
systhreads or ocamlthreads.

Friendly,

Sven Luther

* Re: [Caml-list] Why systhreads?
  2002-11-23  9:08 [Caml-list] Why systhreads? Lauri Alanko
  2002-11-24  7:36 ` Sven Luther
@ 2002-11-24 17:14 ` Vitaly Lugovsky
  2002-11-24 17:18   ` Lauri Alanko
  2002-11-24 18:27   ` Dmitry Bely
  2002-11-25 10:01 ` Xavier Leroy
  2 siblings, 2 replies; 28+ messages in thread
From: Vitaly Lugovsky @ 2002-11-24 17:14 UTC (permalink / raw)
  To: Lauri Alanko; +Cc: caml-list

On Sat, 23 Nov 2002, Lauri Alanko wrote:

> A simple, fundamental question: why is native-code threading done using
> system threads? Why isn't pure user-level scheduling used as with
> bytecode?

 How will you manage SMP scheduling then? Maybe something like OpenMP would
be nice, but it's not as generic as plain native threads.



* Re: [Caml-list] Why systhreads?
  2002-11-24 17:14 ` Vitaly Lugovsky
@ 2002-11-24 17:18   ` Lauri Alanko
  2002-11-24 18:27   ` Dmitry Bely
  1 sibling, 0 replies; 28+ messages in thread
From: Lauri Alanko @ 2002-11-24 17:18 UTC (permalink / raw)
  To: caml-list

On Sun, Nov 24, 2002 at 08:14:50PM +0300, Vitaly Lugovsky wrote:
>  How will you manage SMP scheduling then? Maybe something like OpenMP would
> be nice, but it's not as generic as plain native threads.

Unless I have _very_ serious misconceptions, ocaml's threads _always_
run in a strictly serialized order, since they share a common heap and
it'd be horrendous to lock the heap at every allocation. So using
systhreads does _not_ buy us any parallelization with SMP.


Lauri Alanko
la@iki.fi

* Re: [Caml-list] Why systhreads?
  2002-11-24  7:36 ` Sven Luther
@ 2002-11-24 17:41   ` Chris Hecker
  2002-11-24 18:12     ` Basile STARYNKEVITCH
  0 siblings, 1 reply; 28+ messages in thread
From: Chris Hecker @ 2002-11-24 17:41 UTC (permalink / raw)
  To: Sven Luther, Lauri Alanko; +Cc: caml-list


>I don't really know about windows (which is a pain to use ocaml on
>anyway) but on unix, you can choose at compile time to use either
>systhreads or ocamlthreads.

For bytecode or for native?  His question was about native.

On a related note, now that the first CPUs with HyperThreading are 
shipping, is there any plan to multithread the GC so caml programs can take 
advantage of HT?  I can understand why it was not a high priority to 
support real threads for multiprocessor machines when that was the only way 
to get parallelism with threads, but once HT is ubiquitous, it has the 
potential to make it worth the trouble to thread a regular application to 
increase performance.  I don't think this is a high priority now, because 
there's 0% penetration of HT right now, but hopefully there's some plan for 
the future.

I guess the question is, is a multithreaded GC an open research problem, or 
is there a known good solution and it just hasn't gotten to the top of the 
priority list yet?

Chris



* Re: [Caml-list] Why systhreads?
  2002-11-24 17:41   ` Chris Hecker
@ 2002-11-24 18:12     ` Basile STARYNKEVITCH
  2002-11-24 21:10       ` Christopher Quinn
  0 siblings, 1 reply; 28+ messages in thread
From: Basile STARYNKEVITCH @ 2002-11-24 18:12 UTC (permalink / raw)
  To: Chris Hecker; +Cc: Sven Luther, Lauri Alanko, caml-list

>>>>> "Chris" == Chris Hecker <checker@d6.com> writes:


    Chris> On a related note, now that the first CPUs with
    Chris> HyperThreading are shipping, is there any plan to
    Chris> multithread the GC so caml programs can take advantage of
    Chris> HT?  I can understand why it was not a high priority to
    Chris> support real threads for multiprocessor machines when that
    Chris> was the only way to get parallelism with threads, but once
    Chris> HT is ubiquitous, it has the potential to make it worth the
    Chris> trouble to thread a regular application to increase
    Chris> performance.  I don't think this is a high priority now,
    Chris> because there's 0% penetration of HT right now, but
    Chris> hopefully there's some plan for the future.

I would suppose that HyperThreading chips can already be used
successfully on Linux (I was told, IIRC, that such a chip shows up as 2
CPUs in /proc/cpuinfo). So I would suppose that they can already take
advantage of native threads in OCaml.

    Chris> I guess the question is, is a multithreaded GC an open
    Chris> research problem, or is there a known good solution and it
    Chris> just hasn't gotten to the top of the priority list yet?

I have a question for the OCaml team: could they explain the
current (and perhaps near-future) status of multithreading in OCaml,
notably with respect to garbage collection?

I thought that the current GC in OCaml (3.06) was already
multithread-capable, at least in that each thread has its own birth
region and can do minor garbage collections independently of other
threads, so that threads have to synchronize only on major (i.e. full)
garbage collections. Is my assumption correct? I just had a glance
at ocaml/byterun/minor_gc.c and did not find any thread-local
variables there... Also, ocaml/otherlibs/systhreads/posix.c mentions
/* The global mutex used to ensure that at most one thread is running
Caml code */

What is the use of multithreading in OCaml (with ocamlopt -thread on
x86/Linux)? I really thought that it amounted, in practical terms, to
biprocessor support, not just some limited kind of (throwable once
only) continuations.

Regards.
-- 

Basile STARYNKEVITCH         http://starynkevitch.net/Basile/ 
email: basile<at>starynkevitch<dot>net 
alias: basile<at>tunes<dot>org 
8, rue de la Faïencerie, 92340 Bourg La Reine, France

* Re: [Caml-list] Why systhreads?
  2002-11-24 17:14 ` Vitaly Lugovsky
  2002-11-24 17:18   ` Lauri Alanko
@ 2002-11-24 18:27   ` Dmitry Bely
  2002-11-24 23:14     ` Vitaly Lugovsky
  1 sibling, 1 reply; 28+ messages in thread
From: Dmitry Bely @ 2002-11-24 18:27 UTC (permalink / raw)
  To: caml-list

Vitaly Lugovsky <vsl@ontil.ihep.su> writes:

>> A simple, fundamental question: why is native-code threading done using
>> system threads? Why isn't pure user-level scheduling used as with
>> bytecode?
>
>  How will you manage SMP scheduling then?

AFAIK an OCaml program cannot utilise SMP even with native threads (due to
the single master lock).

- Dmitry Bely



* Re: [Caml-list] Why systhreads?
  2002-11-24 18:12     ` Basile STARYNKEVITCH
@ 2002-11-24 21:10       ` Christopher Quinn
  0 siblings, 0 replies; 28+ messages in thread
From: Christopher Quinn @ 2002-11-24 21:10 UTC (permalink / raw)
  Cc: caml-list

amazingly the threading of caml was done back in '93!
here is the paper on it:
http://pauillac.inria.fr/~xleroy/publi/concurrent-gc.ps.gz

but the runtime of the current distro is not so threaded. parallelism is limited to those system functions (for I/O) in the C source files you see surrounded by enter_blocking_section()/leave_blocking_section().
these functions mask whether a new, real thread is created for the duration of the call, or the descriptor is given to select() in the case of bytecode. i think they enforce the 'global lock' on the runtime.

i imagine the performance cost of threading the runtime is rather too high (just what is it that makes java so slow anyway - a multitude of resource locks? )
 
my particular wish is to see the runtime with a compile option to eliminate static global state (make it thread local?) to enable multiple instances of the runtime to operate in the same address space, albeit completely independently.

- chris


* Re: [Caml-list] Why systhreads?
  2002-11-24 18:27   ` Dmitry Bely
@ 2002-11-24 23:14     ` Vitaly Lugovsky
  2002-11-27 14:33       ` Tim Freeman
  0 siblings, 1 reply; 28+ messages in thread
From: Vitaly Lugovsky @ 2002-11-24 23:14 UTC (permalink / raw)
  To: Dmitry Bely; +Cc: caml-list

On Sun, 24 Nov 2002, Dmitry Bely wrote:

> >> A simple, fundamental question: why is native-code threading done using
> >> system threads? Why isn't pure user-level scheduling used as with
> >> bytecode?
> >
> >  How will you manage SMP scheduling then?
> 
> AFAIK Ocaml program cannot utilise SMP even with the native threads (due to
> the single master lock).

 I tried OCaml in non-memory-intensive numerical applications on SMP.
It seems to work well enough (100% load on all the processors).



* Re: [Caml-list] Why systhreads?
  2002-11-23  9:08 [Caml-list] Why systhreads? Lauri Alanko
  2002-11-24  7:36 ` Sven Luther
  2002-11-24 17:14 ` Vitaly Lugovsky
@ 2002-11-25 10:01 ` Xavier Leroy
  2002-11-25 14:20   ` Markus Mottl
                     ` (3 more replies)
  2 siblings, 4 replies; 28+ messages in thread
From: Xavier Leroy @ 2002-11-25 10:01 UTC (permalink / raw)
  To: Lauri Alanko; +Cc: caml-list

It seems that the annual discussion on threads has started again.  Allow
me to deliver my standard lecture on this topic once more.

Threads have at least three different purposes:

1- Parallelism on shared-memory multiprocessors.
2- Overlapping I/O and computation (while a thread is blocked on a network
   read, other threads may proceed).
3- Supporting the "coroutine" programming style
   (e.g. if a program has a GUI but performs long computations,
    using threads is a nicer way to structure the program than
    trying to wrap the long computation around the GUI event loop).

The goals of OCaml threads are (2) and (3) but not (1) (for reasons
that I'll get into later), with historical emphasis on (2) due to the
MMM (Web browser) and V6 (HTTP proxy) applications.

Pure user-level scheduling, or equivalently control operators (call/cc),
provide (3) but not (2).
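
To make the distinction concrete, here is a minimal sketch of pure user-level scheduling. It is not the bytecode thread library's implementation; the names `task`, `run_round_robin`, `counter` and `trace` are made up for the illustration. Each task is a step function, and the scheduler is an ordinary loop over a queue.

```ocaml
(* A toy user-level scheduler: each task is a step function that does a
   little work and returns [true] if it wants to run again. *)
type task = unit -> bool

let run_round_robin (tasks : task list) : unit =
  let ready = Queue.create () in
  List.iter (fun t -> Queue.add t ready) tasks;
  while not (Queue.is_empty ready) do
    let t = Queue.pop ready in
    (* Control comes back to the scheduler only when the task returns
       from its step voluntarily. *)
    if t () then Queue.add t ready
  done

(* Hypothetical helper: a task that records [name] in [log], [n] times. *)
let counter log name n : task =
  let remaining = ref n in
  fun () ->
    log := name :: !log;
    decr remaining;
    !remaining > 0

(* Interleave a short "gui" task with a longer "compute" task and return
   the order in which their steps ran. *)
let trace () =
  let log = ref [] in
  run_round_robin [ counter log "gui" 2; counter log "compute" 3 ];
  List.rev !log
```

The two tasks interleave, which is purpose (3). If any step made a blocking system call, though, the whole loop (and so every other task) would stall with it, which is why this style alone does not give purpose (2).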

To achieve (2) with a user-level scheduler such as OCaml's bytecode
thread library requires all sorts of hacks, such as non-blocking I/O
and select() under Unix, plus wrapping of all I/O operations so that
they call the user-level scheduler in cases where they are about to
block.  (Otherwise, the whole process would block, and not just the
calling thread.)
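
The kind of wrapping described above might look roughly like the following sketch. It assumes the Unix library is linked; `read_cooperatively`, `demo` and the `yield` hook are hypothetical names standing in for the scheduler's entry point, not functions from the OCaml distribution.

```ocaml
(* Requires the Unix library. *)

(* [yield] stands in for the user-level scheduler's "run someone else"
   entry point; it is a hypothetical hook, not a library function. *)
let rec read_cooperatively ~(yield : unit -> unit) fd buf ofs len =
  match Unix.read fd buf ofs len with
  | n -> n
  | exception Unix.Unix_error ((Unix.EAGAIN | Unix.EWOULDBLOCK), _, _) ->
      (* The read would have blocked the whole process: let other
         user-level threads run, then wait briefly on the descriptor. *)
      yield ();
      let _ = Unix.select [ fd ] [] [] 0.01 in
      read_cooperatively ~yield fd buf ofs len

let demo () =
  let rd, wr = Unix.pipe () in
  Unix.set_nonblock rd;
  let yields = ref 0 in
  let yield () =
    incr yields;
    (* Pretend another thread produces the data once we have yielded. *)
    if !yields = 1 then ignore (Unix.write_substring wr "hi" 0 2)
  in
  let buf = Bytes.create 16 in
  let n = read_cooperatively ~yield rd buf 0 16 in
  Unix.close rd;
  Unix.close wr;
  (Bytes.sub_string buf 0 n, !yields)
```

A real scheduler would presumably collect the descriptors of all waiting threads into a single `select` call rather than polling one at a time, and every other potentially blocking primitive would need the same treatment.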

Not only is this ugly (read the sources of the bytecode thread library
to get an idea) and inefficient, but it interacts very poorly with
external libraries written in C.  For instance, deep inside the C
implementation of gethostbyname(), there are network reads that can
block; there is no way to wrap these with scheduler calls, short of
rewriting gethostbyname() entirely.

To make things worse, non-blocking I/O is done completely differently
under Unix and under Win32.  I'm not even sure Win32 provides enough
support for async I/O to write a real user-level scheduler.

Another issue with user-level threads, at least in native code, is the
handling of the thread stacks, especially if we wish to have thread
stacks that start small and grow on demand.  It can be done, but is
highly processor- and OS-dependent.  (For instance, stack handling on
the IA64 is, ah, peculiar: there are actually two stacks that grow in
opposite directions within the same memory area...)

One aspect of wisdom is to know when not to do something oneself, but
leave it to others.  Scheduling I/O and computation concurrently, and
managing process stacks, is the job of the operating system.  Trying
to do it entirely in a user-mode program is just not reasonable.
(For another reference point, see Java's move away from "green
threads" and towards system threads.)

What about parallelism on SMP machines?  The main issue here is that
the runtime system, and in particular the garbage collector and memory
manager, must be MP-safe.  This means minimizing global state, and
introducing locking around accesses to shared resources.  If done
naively (e.g. locking at each heap allocation), this can be extremely
costly; it also complicates the runtime system a lot.  Finally,
garbage collection can become a limiting factor if it is done in the
"stop the world" fashion (all threads stop during GC); a concurrent GC
avoids this problem, but adds tremendous complexity.

(Of course, all this SMP support stuff slows down the runtime system
even if there is only one processor, which is the case for almost all
our users...)

All this has been done before in the context of Caml: that was
Damien Doligez's Concurrent Caml Light system, in the early 90s.
Indeed, the incremental major GC that we have in OCaml is a
simplification of Damien's concurrent GC.  If you're interested, have
a look at Damien's publications.

Why was Concurrent Caml Light abandoned?  Too complex; too hard to debug
(despite the existence of a machine-checked proof of correctness);
and dubious practical interest.  Shared-memory multiprocessors have
never really "taken off", at least in the general public.  For large
parallel computations, clusters (distributed-memory systems) are the
norm.  For desktop use, monoprocessors are plenty fast.  Even if you
have a 4-processor SMP machine, it isn't clear whether you should
write your program using shared memory or using message passing -- the
latter is slightly more expensive, but scales to clusters...

What about hyperthreading?  Well, I believe it's the last convulsive
movement of SMP's corpse :-)  We'll see how it goes market-wise.  At
any rate, the speedups announced for hyperthreading in the Pentium 4
are below a factor of 1.5; probably not enough to offset the overhead
of making the OCaml runtime system thread-safe.  

In summary: there is no SMP support in OCaml, and it is very very
unlikely that there will ever be.  If you're into parallelism, better
investigate message-passing interfaces.
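
As one possible reading of the message-passing advice, the sketch below runs work in separate Unix processes and sends each result back over a pipe. It assumes the Unix library and a system where `Unix.fork` is available (so not native Win32), and uses `Unix._exit`, which assumes a reasonably recent OCaml. The names `in_child`, `sum_range` and `parallel_sum` are invented for the example.

```ocaml
(* Requires the Unix library; [Unix.fork] is not available on native Win32. *)

(* Start [f x] in a child process. The returned function waits for the
   child's result, which arrives over a pipe as one marshalled message. *)
let in_child (f : 'a -> 'b) (x : 'a) : unit -> 'b =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      (* Child: compute, send the result, and exit without returning
         into the parent's code. *)
      Unix.close rd;
      let oc = Unix.out_channel_of_descr wr in
      (try
         Marshal.to_channel oc (f x) [];
         close_out oc
       with _ -> Unix._exit 1);
      Unix._exit 0
  | pid ->
      Unix.close wr;
      fun () ->
        let ic = Unix.in_channel_of_descr rd in
        let result : 'b = Marshal.from_channel ic in
        close_in ic;
        ignore (Unix.waitpid [] pid);
        result

let sum_range (lo, hi) =
  let s = ref 0 in
  for i = lo to hi do s := !s + i done;
  !s

let parallel_sum () =
  (* Two workers, each a separate process with its own heap and GC. *)
  let left = in_child sum_range (1, 500) in
  let right = in_child sum_range (501, 1000) in
  left () + right ()
```

Because each worker has its own heap, no locking of the runtime is needed, and the operating system is free to place the two processes on different processors. The price is that arguments and results are copied rather than shared.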

- Xavier Leroy

* Re: [Caml-list] Why systhreads?
  2002-11-25 10:01 ` Xavier Leroy
@ 2002-11-25 14:20   ` Markus Mottl
  2002-11-25 19:01   ` Blair Zajac
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 28+ messages in thread
From: Markus Mottl @ 2002-11-25 14:20 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Lauri Alanko, caml-list

On Mon, 25 Nov 2002, Xavier Leroy wrote:
> In summary: there is no SMP support in OCaml, and it is very very
> unlikely that there will ever be.  If you're into parallelism, better
> investigate message-passing interfaces.

To make at least some users happy: it is indeed possible to exploit
SMP-machines with native threads in OCaml, but those benefits only occur
when calling external functions that do not interfere with the OCaml
runtime. E.g. LACAML (the LAPACK-interface for OCaml) makes use of this,
which means that you can, say, crunch several matrices in parallel. Due
to the elegant handling of threads in OCaml, this is much nicer to do
than in C.

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus

* Re: [Caml-list] Why systhreads?
  2002-11-25 10:01 ` Xavier Leroy
  2002-11-25 14:20   ` Markus Mottl
@ 2002-11-25 19:01   ` Blair Zajac
  2002-11-25 21:06     ` james woodyatt
  2002-11-26  9:02     ` [Caml-list] Why systhreads? Xavier Leroy
  2002-11-26 19:04   ` Dave Berry
  2002-11-27  0:07   ` Lauri Alanko
  3 siblings, 2 replies; 28+ messages in thread
From: Blair Zajac @ 2002-11-25 19:01 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Lauri Alanko, caml-list

Xavier Leroy wrote:
> 
> It seems that the annual discussion on threads started again.  Allow
> me to deliver again my standard lecture on this topic.
> 
> Threads have at least three different purposes:
> 
> 1- Parallelism on shared-memory multiprocessors.
> 2- Overlapping I/O and computation (while a thread is blocked on a network
>    read, other threads may proceed).
> 3- Supporting the "coroutine" programming style
>    (e.g. if a program has a GUI but performs long computations,
>     using threads is a nicer way to structure the program than
>     trying to wrap the long computation around the GUI event loop).

[Discussion on (1), (2) and (3) removed].

To summarize, for (2) system threads are required and you can't
prevent blocking with user level threads easily or at all.  For (3),
making the Ocaml system support SMP is "Too complex; too hard to
debug" and SMP boxes aren't all that popular.

Aren't these contradictory statements?

If OCaml lets one thread of an OCaml program block on a system call
while the other threads continue, doesn't that amount to supporting
SMP?  Does OCaml support this?

I need the functionality to have multiple threads where one thread can
block and not stop the others, either due to the OS or to the Ocaml
runtime system.

What am I missing here?

Best,
Blair

-- 
Blair Zajac <blair@orcaware.com>
Web and OS performance plots - http://www.orcaware.com/orca/

* Re: [Caml-list] Why systhreads?
  2002-11-25 19:01   ` Blair Zajac
@ 2002-11-25 21:06     ` james woodyatt
  2002-11-25 22:20       ` Chris Hecker
  2002-11-26  9:02     ` [Caml-list] Why systhreads? Xavier Leroy
  1 sibling, 1 reply; 28+ messages in thread
From: james woodyatt @ 2002-11-25 21:06 UTC (permalink / raw)
  To: Blair Zajac; +Cc: The Trade

[this thread should probably migrate to ocaml_beginners@yahoogroups.com]

On Monday, Nov 25, 2002, at 11:01 US/Pacific, Blair Zajac wrote:
> Xavier Leroy wrote:
>>
>> Threads have at least three different purposes:
>> 1- Parallelism on shared-memory multiprocessors.
>
> [Discussion on (1), (2) and (3) removed].
>
> To summarize, for (2) system threads are required and you can't
> prevent blocking with user level threads easily or at all.  For (3),
> making the Ocaml system support SMP is "Too complex; too hard to
> debug" and SMP boxes aren't all that popular.
>
> Aren't these contradictory statements?

Assuming you meant (1) not (3), then the answer is: No.  They're not.

> For Ocaml to support a Ocaml program to have one thread to block on a
> system call and to allow other threads to continue, doesn't this 
> support
> SMP?

Not necessarily.

> Does Ocaml support this?

No.  All threads are serialized, so an SMP machine only loads one 
processor at a time.

> I need the functionality to have multiple threads where one thread can
> block and not stop the others, either due to the OS or to the Ocaml
> runtime system.
>
> What am I missing here?

If I had to guess, I would say you are probably missing how your 
application is covered by case (2) or case (3) in M. Leroy's standard 
lecture on the subject.

I've been a very long way down this road myself, and I agree with him.  
If you want your application to parallelize well, the winning design 
pattern seems to be message passing between distributed memory 
processes.


-- 
j h woodyatt <jhw@wetware.com>
markets are only free to the people who own them.


* Re: [Caml-list] Why systhreads?
  2002-11-25 21:06     ` james woodyatt
@ 2002-11-25 22:20       ` Chris Hecker
  2002-11-26  6:49         ` Sven Luther
  2002-11-27 13:12         ` Damien Doligez
  0 siblings, 2 replies; 28+ messages in thread
From: Chris Hecker @ 2002-11-25 22:20 UTC (permalink / raw)
  To: james woodyatt, Blair Zajac; +Cc: The Trade


>If you want your application to parallelize well, the winning design 
>pattern seems to be message passing between distributed memory processes.

I was going to let it drop after the "lecture" (which should be put in a 
faq or something), but come on, this is a silly generalization.  I have 
colleagues who have gotten very large speedups from hyperthreading on 
commercial applications, not demos.  The point is, it's "free" for Intel to 
put it in, and your app is waiting on cache misses and pipeline stalls 
anyway, so you might as well do something with those cycles.  Now you can 
get extra work done during those times in C, but you won't be able to in 
caml, and that's a bummer.  It's not a showstopper, since you can always 
call out to C, but it is yet another thing in the list of features that 
aren't natively exploitable in caml.  Of course there's a cost to enabling 
this in caml, and it may be that there's no good way to do it or that it's 
not worth it cost/benefit-wise, but saying "you don't want to do it anyway" 
is just apologist.

Xavier saying 1.5x is not worth it is really strange to me; most 
performance sensitive programmers I know would kill their mother to get 
1.5x.  I wonder what factor would be worth it for Xavier?

I think the overriding point here is that in the past SMP has not taken off 
on the desktop, so it wasn't worth worrying about for end-user 
applications.  That will no longer be true, simply because it was so cheap 
for Intel to add HT.  From now on, almost all chips they ship will be 
"logically" SMP (barring some unforeseen thing where HT isn't used at all 
and becomes expensive to keep in the chip...I assume this is what Xavier 
meant by "last gasp", but I doubt it based on Intel's historic behavior 
with other CPU features).  For commercial application developers, that 
changes the landscape a bit.

It's very similar to MMX and SSE.  Neither technology revolutionized the 
world (like the hype suggested), but once all viable end user machines have 
it, it becomes cost effective to use.  HT is even easier, because unlike 
MMX and SSE, it involves no compiler changes (for C compilers) and is 
backwards compatible.

I am not a big fan of threading; in fact, I think it's almost always a 
cost/benefit loss (except when used to simulate async io) for my kinds of 
applications (games).  However, HT changes the cost/benefit equation.  How 
much remains to be seen, of course.

Chris



* Re: [Caml-list] Why systhreads?
  2002-11-25 22:20       ` Chris Hecker
@ 2002-11-26  6:49         ` Sven Luther
  2002-11-27 13:12         ` Damien Doligez
  1 sibling, 0 replies; 28+ messages in thread
From: Sven Luther @ 2002-11-26  6:49 UTC (permalink / raw)
  To: Chris Hecker; +Cc: james woodyatt, Blair Zajac, The Trade

On Mon, Nov 25, 2002 at 02:20:11PM -0800, Chris Hecker wrote:
> 
> >If you want your application to parallelize well, the winning design 
> >pattern seems to be message passing between distributed memory processes.
> 
> I was going to let it drop after the "lecture" (which should be put in a 
> faq or something), but come on, this is a silly generalization.  I have 
> colleagues who have gotten very large speedups from hyperthreading on 
> commercial applications, not demos.  The point is, it's "free" for Intel to 
> put it in, and your app is waiting on cache misses and pipeline stalls 
> anyway, so you might as well do something with those cycles.  Now you can 
> get extra work done during those times in C, but you won't be able to in 
> caml, and that's a bummer.  It's not a showstopper, since you can always 
> call out to C, but it is yet another thing in the list of features that 
> aren't natively exploitable in caml.  Of course there's a cost to enabling 
> this in caml, and it may be that there's no good way to do it or that it's 
> not worth it cost/benefit-wise, but saying "you don't want to do it anyway" 
> is just apologist.
> 
> Xavier saying 1.5x is not worth it is really strange to me; most 
> performance sensitive programmers I know would kill their mother to get 
> 1.5x.  I wonder what factor would be worth it for Xavier?

I think he said that the 1.5x would not cover the cost of adding SMP
support in the first place. Besides, that added cost would also be
incurred by single-processor users. HT technology is all fine, but it
will be some time before it is widely available. Maybe then this issue
will come up again, and a different answer will be given.

Friendly,

Sven Luther

* Re: [Caml-list] Why systhreads?
  2002-11-25 19:01   ` Blair Zajac
  2002-11-25 21:06     ` james woodyatt
@ 2002-11-26  9:02     ` Xavier Leroy
  2002-11-26  9:29       ` Sven Luther
  2002-11-26 18:42       ` Chris Hecker
  1 sibling, 2 replies; 28+ messages in thread
From: Xavier Leroy @ 2002-11-26  9:02 UTC (permalink / raw)
  To: Blair Zajac; +Cc: caml-list

Blair Zajac wrote:

> To summarize, for (2) system threads are required and you can't
> prevent blocking with user level threads easily or at all.  For (3),
> making the Ocaml system support SMP is "Too complex; too hard to
> debug" and SMP boxes aren't all that popular.
> Aren't these contradictory statements?
> 
> For Ocaml to support a Ocaml program to have one thread to block on a
> system call and to allow other threads to continue, doesn't this support
> SMP?  Does Ocaml support this?

No to the first question.  Yes to the second.

By "supporting SMP", I mean having several threads executing Caml code
in parallel, thus using the Caml runtime system in a concurrent fashion.
This is the hard part.  

In the current implementation of systhreads, the Caml executor and
runtime system is one big critical section: at most one thread can
execute Caml code at a given time, but arbitrarily many other threads
can be blocked on I/O (and thus aren't calling the Caml runtime system).
Each thread leaves the critical section before calling a potentially
blocking I/O operation, and re-enters it when the I/O completes.
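
The behaviour described above can be observed with a small experiment. This is only a sketch, assuming the program is linked with the system threads library; `overlap_demo` is a made-up name, and `Thread.delay` stands in for a blocking I/O call.

```ocaml
(* Requires the threads library (system threads). *)

let overlap_demo () =
  let finished_io = ref false in
  (* This thread blocks in a system call. It leaves the critical section
     while blocked, so other Caml threads can keep running. *)
  let io_thread =
    Thread.create (fun () -> Thread.delay 0.2; finished_io := true) ()
  in
  (* Meanwhile the main thread executes Caml code and counts how many
     steps it completed before the "I/O" finished. *)
  let steps = ref 0 in
  while not !finished_io do
    incr steps;
    (* Give the other thread a chance to re-enter the critical section. *)
    Thread.yield ()
  done;
  Thread.join io_thread;
  !steps
```

The main thread makes progress while the other one is blocked, which is the overlapping of I/O and computation. At no point are both threads executing Caml code at the same instant, so a second processor would not make the loop itself run any faster.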

> I need the functionality to have multiple threads where one thread can
> block and not stop the others, either due to the OS or to the Ocaml
> runtime system.

You have that functionality.  What you don't have is the ability to
keep several processors busy running Caml code.  (As Markus said, you
can still have C code running concurrently with Caml code, provided
that the C code doesn't call the Caml runtime system.)

Chris Hecker wrote:

> Xavier saying 1.5x is not worth it is really strange to me; most 
> performance sensitive programmers I know would kill their mother to get 
> 1.5x.  I wonder what factor would be worth it for Xavier?

Factors of 10 are always nice :-)  Just kidding.  What I meant is the
following: assume making the Caml runtime system thread-safe entails a
25% slowdown on program execution.  (This can easily happen if e.g. we
have to lock a mutex at each heap allocation.)  Further assume that by
doing so, you get a 1.5 speedup from hyperthreading.  In the end, your
program will run 1.5 * 0.75 = 1.12 times faster than its equivalent
running on the standard, single-processor Caml runtime.  It's not
worth the effort.
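
The arithmetic above can be written out as a small OCaml sketch. The function names are illustrative only, and "slowdown" is read the way the message uses it, as the fraction of the original speed that is lost. With a 25% slowdown and a 1.5x gain the result is 1.125, which the message rounds to 1.12.

```ocaml
(* Net speedup when the runtime loses [slowdown] (a fraction of its
   original speed) but the hardware gives [parallel_gain] in return. *)
let net_speedup ~slowdown ~parallel_gain =
  parallel_gain *. (1.0 -. slowdown)

(* Largest slowdown that still leaves the program no slower than before. *)
let break_even_slowdown ~parallel_gain =
  1.0 -. 1.0 /. parallel_gain

let () =
  Printf.printf "25%% slowdown, 1.5x gain: %.3f\n"
    (net_speedup ~slowdown:0.25 ~parallel_gain:1.5);
  Printf.printf "break-even slowdown at 1.5x: %.3f\n"
    (break_even_slowdown ~parallel_gain:1.5)
```

Under the same assumptions, a 1.5x hardware gain is cancelled out entirely once the thread-safe runtime costs about a third of the original speed.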

- Xavier Leroy




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-26  9:02     ` [Caml-list] Why systhreads? Xavier Leroy
@ 2002-11-26  9:29       ` Sven Luther
  2002-11-26  9:34         ` Xavier Leroy
  2002-11-26 18:42       ` Chris Hecker
  1 sibling, 1 reply; 28+ messages in thread
From: Sven Luther @ 2002-11-26  9:29 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Blair Zajac, caml-list

On Tue, Nov 26, 2002 at 10:02:54AM +0100, Xavier Leroy wrote:
> Blair Zajac wrote:
> 
> > To summarize, for (2) system threads are required and you can't
> > prevent blocking with user level threads easily or at all.  For (3),
> > making the Ocaml system support SMP is "Too complex; too hard to
> > debug" and SMP boxes aren't all that popular.
> > Aren't these contradictory statements?
> > 
> > For Ocaml to support a Ocaml program to have one thread to block on a
> > system call and to allow other threads to continue, doesn't this support
> > SMP?  Does Ocaml support this?
> 
> No to the first question.  Yes to the second.
> 
> By "supporting SMP", I mean having several threads executing Caml code
> in parallel, thus using the Caml runtime system in a concurrent fashion.
> This is the hard part.  
> 
> In the current implementation of systhreads, the Caml executor and
> runtime system form one big critical section: at most one thread can
> execute Caml code at a given time, but arbitrarily many other threads
> can be blocked on I/O (and thus aren't calling the Caml runtime system).
> Each thread leaves the critical section before calling a potentially
> blocking I/O operation, and re-enters it when the I/O completes.

In the case i have a multi-threaded lablgtk executable, having one
thread managing the interface and the other running programs, that even
if i had a way of killing a thread (or setting a mutex or whatever to
signal it to stop), that if the running thread is looping, i will never
be able to execute the interface thread which will (through a callback)
set the mutex to the stop option, because the running thread doesn't do
blocking IO ?

Mmm, checking the mutex is a blocking IO though, isn't it ? 

keeping the GUI alive even if some other stuff is taking time or looping
forever is a nice application of threading support.

Friendly,

Sven Luther


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-26  9:29       ` Sven Luther
@ 2002-11-26  9:34         ` Xavier Leroy
  2002-11-26  9:39           ` Sven Luther
  0 siblings, 1 reply; 28+ messages in thread
From: Xavier Leroy @ 2002-11-26  9:34 UTC (permalink / raw)
  To: Sven Luther; +Cc: Blair Zajac, caml-list

> In the case i have a multi-threaded lablgtk executable, having one
> thread managing the interface and the other running programs, that even
> if i had a way of killing a thread (or setting a mutex or whatever to
> signal it to stop), that if the running thread is looping, i will never
> be able to execute the interface thread which will (through a callback)
> set the mutex to the stop option, because the running thread doesn't do
> blocking IO ?

That's a long question.  Had to read it three times to see what you mean :-)

The answer to your question is that Caml systhreads do support
preemption: a timer forces the currently running thread to call
Thread.yield() at regular intervals.  In turn, Thread.yield()
releases the master mutex, calls sched_yield(), and re-acquires the
master mutex, giving other threads a chance to grab the master mutex
and run.
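
The scenario in the question (a looping worker, and another thread that must get a turn in order to tell it to stop) can be tried with a short OCaml sketch. The names `run_demo`, `stop` and `count` are hypothetical, the main thread plays the role of the GUI thread, and `Atomic` assumes OCaml 4.12 or later. The loop allocates on every iteration as a precaution, so that native code reaches a point where the runtime can act on the timer tick.

```ocaml
(* Needs the threads library, e.g.
   ocamlfind ocamlopt -package threads.posix -linkpkg preempt.ml *)

(* Run a worker that loops until told to stop, and return how many
   iterations it managed before the main thread stopped it. *)
let run_demo () =
  let stop = Atomic.make false in
  let count = ref 0 in
  let worker () =
    while not (Atomic.get stop) do
      incr count;
      (* Allocate on each iteration: allocation points are where native
         code notices pending signals, which keeps the loop preemptible. *)
      ignore (Sys.opaque_identity (ref !count))
    done
  in
  let t = Thread.create worker () in
  Thread.delay 0.1;        (* the "GUI" thread is idle; the worker runs *)
  Atomic.set stop true;    (* stands in for the callback setting the flag *)
  Thread.join t;
  !count
```

The worker never calls `Thread.yield` and never blocks, yet `run_demo` returns: the main thread only gets to set `stop` because the worker is made to give up the master lock periodically.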

> keeping the GUI alive even if some other stuff is taking time or looping
> forever is a nice application of threading support.

Sure.  But this is all taken care of.  

- Xavier Leroy


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-26  9:34         ` Xavier Leroy
@ 2002-11-26  9:39           ` Sven Luther
  0 siblings, 0 replies; 28+ messages in thread
From: Sven Luther @ 2002-11-26  9:39 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Sven Luther, Blair Zajac, caml-list

On Tue, Nov 26, 2002 at 10:34:01AM +0100, Xavier Leroy wrote:
> > In the case i have a multi-threaded lablgtk executable, having one
> > thread managing the interface and the other running programs, that even
> > if i had a way of killing a thread (or setting a mutex or whatever to
> > signal it to stop), that if the running thread is looping, i will never
> > be able to execute the interface thread which will (through a callback)
> > set the mutex to the stop option, because the running thread doesn't do
> > blocking IO ?
> 
> That's a long question.  Had to read it three times to see what you mean :-)

Yes, sorry about that.

> The answer to your question is that Caml systhreads do support
> preemption: a timer forces the currently running thread to call
> Thread.yield() at regular intervals.  In turn, Thread.yield()
> releases the master mutex, calls sched_yield(), and re-acquires the
> master mutex, giving other threads a chance to grab the master mutex
> and run.

So it is not necessary to call Thread.yield() myself before the blocking
code, right ?

> > keeping the GUI alive even if some other stuff is taking time or looping
> > forever is a nice application of threading support.
> 
> Sure.  But this is all taken care of.  

:)))

Friendly,

Sven Luther


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-26  9:02     ` [Caml-list] Why systhreads? Xavier Leroy
  2002-11-26  9:29       ` Sven Luther
@ 2002-11-26 18:42       ` Chris Hecker
  1 sibling, 0 replies; 28+ messages in thread
From: Chris Hecker @ 2002-11-26 18:42 UTC (permalink / raw)
  To: Xavier Leroy, Blair Zajac; +Cc: caml-list


>Factors of 10 are always nice :-)  Just kidding.  What I meant is the
>following: assume making the Caml runtime system thread-safe entails a
>25% slowdown on program execution.  (This can easily happen if e.g. we
>have to lock a mutex at each heap allocation.)  Further assume that by
>doing so, you get a 1.5 speedup from hyperthreading.  In the end, your
>program will run 1.5 * 0.75 = 1.12 times faster than its equivalent
>running on the standard, single-processor Caml runtime.  It's not
>worth the effort.

Sure, that's kinda obvious.  My original question was whether there was a 
known way to do a multithreaded gc that doesn't suck (costing 25% on 
nonthreaded applications does not count as not sucking) that ocaml could 
use if it became worth it (ie. in the event HT was widely adopted and 
actually worked well in practice).  If you're saying the above is the state 
of the art in multithreaded gc, then yes, it's not worth it.  If there was 
a multithreaded gc technique that cost 3% for single threaded apps, and all 
processors in existence were HT-enabled, then the equation starts to look 
different.  I never said this was the case now, or somebody should start 
typing this new gc in, I just wondered if the technology existed in case it 
became interesting.

I find it slightly ironic that I'm the "HyperThread guy" in this thread, 
since I'm pretty anti-hype myself.  Oh well.  Another slightly frustrating 
thing is that 90% of this thread was taken up by stuff that's documented 
(poorly, but still) on the net (whether SMP is supported now, whether async 
io works, whether non-systhreads work in native code, how the global mutex 
works, etc.).  It would be so nice if the FAQ was better formatted and we 
had a way of quickly updating it, but no, I don't have any time for that 
and I'm sure nobody else does either.  And, of course, nobody reads the FAQ 
before posting anyway.  :)

Chris




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-25 10:01 ` Xavier Leroy
  2002-11-25 14:20   ` Markus Mottl
  2002-11-25 19:01   ` Blair Zajac
@ 2002-11-26 19:04   ` Dave Berry
  2002-11-27  0:07   ` Lauri Alanko
  3 siblings, 0 replies; 28+ messages in thread
From: Dave Berry @ 2002-11-26 19:04 UTC (permalink / raw)
  To: Xavier Leroy, Lauri Alanko; +Cc: caml-list

At 11:01 25/11/2002, Xavier Leroy wrote:
>For large
>parallel computations, clusters (distributed-memory systems) are the
>norm.

I think this is an exaggeration.  I've just started work at the UK National
e-Science Centre, which is linked to the Edinburgh Parallel Computing
Centre.  We have several new multiprocessor machines (16 processors, 64
processors, etc.), and there doesn't seem to be a shortage of uses for them.

Dave.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-25 10:01 ` Xavier Leroy
                     ` (2 preceding siblings ...)
  2002-11-26 19:04   ` Dave Berry
@ 2002-11-27  0:07   ` Lauri Alanko
  3 siblings, 0 replies; 28+ messages in thread
From: Lauri Alanko @ 2002-11-27  0:07 UTC (permalink / raw)
  To: caml-list

On Mon, Nov 25, 2002 at 11:01:33AM +0100, Xavier Leroy wrote:
> One aspect of wisdom is to know when not to do something oneself, but
> leave it to others.  Scheduling I/O and computation concurrently, and
> managing process stacks, is the job of the operating system.  Trying
> to do it entirely in a user-mode program is just not reasonable.

Nevertheless that is the way many language implementations do it, mainly
because their idea of what a thread should look like and how it should
be used differs from what eg. Posix threads (or at least their common
implementations) provide. Pthreads are just too heavy.

So if I understand correctly, benefits of user-level threads include:

* Thread creation speed (no context switches)
* Minimal memory footprint
* Flexibility (eg. inter-thread exceptions)

whereas using system threads gives us:

* Ease of implementation
* Better handling of blocking functions in foreign libraries

Now this is of course a matter of taste, but I'd say that the former
weighs much more than the latter. The problems with gethostbyname can be
averted even with user-level threads (the standard way is spawning an
external server process for each gethostbyname call), whereas there's no
way to get the benefits of user-level threads while using system
threads (short of writing one's own threading system, which is also
pretty much impossible unless you have at least continuations...)

Thankfully it seems like system threads will be much lighter at least in
Linux 2.6...


Lauri Alanko
la@iki.fi


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-25 22:20       ` Chris Hecker
  2002-11-26  6:49         ` Sven Luther
@ 2002-11-27 13:12         ` Damien Doligez
  2002-11-27 18:04           ` Chris Hecker
  1 sibling, 1 reply; 28+ messages in thread
From: Damien Doligez @ 2002-11-27 13:12 UTC (permalink / raw)
  To: caml-list

On Monday, Nov 25, 2002, at 23:20 Europe/Paris, Chris Hecker wrote:

> However, HT changes the cost/benefit equation.  How much remains to be 
> seen, of course.

Do you really think so ?  In my experience, 95% of the costs of threads
(with shared memory) are in the debugging (of the threads implementation,
AND of the programs).  Cheap SMP machines and HT do not change the
cost/benefit equation very much.

More important, you don't need threads and shared memory to make use
of a SMP machine.  Any kind of parallelism will do.  Several processes
with message-passing can easily get you 100% load on all your processors.
Also, message-passing is more general; for example it will work on clusters.
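
A minimal sketch of the "several processes with message-passing" approach, assuming a Unix-like system and the `unix` library. The helpers `sum_squares`, `spawn`, `collect` and `parallel_sum_squares` are hypothetical names invented for this illustration; each child process computes part of a sum and sends its result back to the parent over a pipe with `Marshal`. `Unix._exit` is used so the child does not run the parent's exit handlers; as far as I know it requires OCaml 4.12 or later.

```ocaml
(* Needs the unix library, e.g.
   ocamlfind ocamlopt -package unix -linkpkg procs.ml *)

(* The per-worker job in this sketch: sum of i*i for lo <= i < hi. *)
let sum_squares lo hi =
  let acc = ref 0 in
  for i = lo to hi - 1 do acc := !acc + i * i done;
  !acc

(* Fork a child that computes [job ()] and writes the result to a pipe.
   Returns the child's pid and a channel the parent can read from. *)
let spawn job =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      Unix.close rd;
      let oc = Unix.out_channel_of_descr wr in
      Marshal.to_channel oc (job ()) [];
      close_out oc;
      Unix._exit 0
  | pid ->
      Unix.close wr;
      (pid, Unix.in_channel_of_descr rd)

(* Read one child's integer result and reap the process. *)
let collect (pid, ic) =
  let v : int = Marshal.from_channel ic in
  close_in ic;
  ignore (Unix.waitpid [] pid);
  v

(* Split the range [0, n) across [workers] processes and add the parts. *)
let parallel_sum_squares ~workers n =
  let chunk = (n + workers - 1) / workers in
  let children =
    List.init workers (fun w ->
      let lo = w * chunk and hi = min n ((w + 1) * chunk) in
      spawn (fun () -> sum_squares lo hi))
  in
  List.fold_left (fun acc child -> acc + collect child) 0 children
```

Each child has its own heap and its own runtime, so no runtime lock is shared between them and the operating system is free to place them on different processors. The same structure carries over to a cluster if the pipes are replaced by sockets.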

So my opinion is: multiprocessing good, threads bad.

-- Damien



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-24 23:14     ` Vitaly Lugovsky
@ 2002-11-27 14:33       ` Tim Freeman
  2002-11-29 13:25         ` Vitaly Lugovsky
  0 siblings, 1 reply; 28+ messages in thread
From: Tim Freeman @ 2002-11-27 14:33 UTC (permalink / raw)
  To: vsl; +Cc: dbely, caml-list

> I tried OCaml in non-memory-consuming numerical applications on SMP.
>Seems to work well enough (100% load of all the processors).

Wrong metric.  You want speedup, not CPU utilization.  You can get CPU
utilization for free by running an infinite loop. Did the application
run anywhere near N times faster when you were using N processors?
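
The metric being asked for can be written down as a short OCaml sketch. The function names are illustrative, and the timings in the example are hypothetical values chosen for the illustration, not measurements from this thread.

```ocaml
(* Speedup: wall-clock time on one processor over time on N processors. *)
let speedup ~t1 ~tn = t1 /. tn

(* Parallel efficiency: speedup divided by the number of processors.
   1.0 means perfect scaling. *)
let efficiency ~t1 ~tn ~procs = speedup ~t1 ~tn /. float_of_int procs

let () =
  (* Hypothetical timings in seconds, not measurements from the thread. *)
  let t1 = 120.0 and tn = 40.0 and procs = 4 in
  Printf.printf "speedup %.2f, efficiency %.2f\n"
    (speedup ~t1 ~tn) (efficiency ~t1 ~tn ~procs)
```

A run that keeps all four processors at 100% load but only achieves this kind of speedup is spending a quarter of its CPU time on something other than useful work, which is why load alone says little.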

-- 
Tim Freeman       
tim@fungible.com
GPG public key fingerprint ECDF 46F8 3B80 BB9E 575D  7180 76DF FE00 34B1 5C78 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-27 13:12         ` Damien Doligez
@ 2002-11-27 18:04           ` Chris Hecker
  2002-11-27 21:04             ` Gerd Stolpmann
  0 siblings, 1 reply; 28+ messages in thread
From: Chris Hecker @ 2002-11-27 18:04 UTC (permalink / raw)
  To: Damien Doligez, caml-list


[sorry for the longwinded response]

>Do you really think so ?  In my experience, 95% of the costs of threads
>(with shared memory) are in the debugging (of the threads implementation,
>AND of the programs).  Cheap SMP machines and HT do not change the
>cost/benefit equation very much.

Like I said in my previous mail, I think it's going to be similar to 
MMX/SSE.  The performance improvement you get is not worth the development 
and support headache, until the technology is ubiquitous.  Once it's 
everywhere, it becomes worthwhile.  I'm using a middleware library for my 
game right now that requires MMX.  That's finally an acceptable 
requirement.  On xbox, which is a fixed platform with a known cpu, every 
game uses SSE, because it's just guaranteed to be there, and can make a big 
difference if you're willing to work with its problems (using structure of 
arrays layout, etc.).  And let's not even talk about the insanity of the 
PS2 architecture.  Xbox2 will use a CPU with HT, because there won't be any 
Intel CPUs that don't have HT, so it'll get used there by apps.

Now, as you point out, threads are complicated to design, program, and 
debug.  I agree with this completely.  As I said, I never use threaded 
designs if I can avoid it.  However, if it becomes very easy to spawn very 
small scale parallel threads in C on an HT processor, then it could make a 
big performance difference for some algorithms.  People are working on C 
compilers that have these extensions built in.  Intel's got one 
now.  They'll be first, everyone will ignore it until the installed base is 
big enough, and then it'll go into msvc.  MMX, SSE, and 3dnow followed the 
exact same path.

The reason this is different (or has the potential to be different) with HT 
compared to discrete cpus is that a) HT is free so it will be ubiquitous 
eventually, and b) HT drops the thread context switch time to 0.  It's not 
worth starting up a thread on another cpu to do a few instructions worth of 
work, but it is conceivable that it would be for HT.  Again, I think this 
will mirror MMX.  The original version of MMX had a horrible context switch 
time, and overloaded the FPU registers.  It was worthless.  They fixed 
it.  I assume there are similar gotchas with the first version of HT.  But, 
in a couple revs, they'll fix it and it will be possible to have a second 
thread do half the work in a small loop, with no overhead (there'll be a hw 
thread pool, hw wait on mutex/sleep, etc.).

The reason HT can make a performance difference is that your app is 
stalling in the CPU all the time anyway.  Even tight loops aren't memory 
bandwidth bound (unless it's a copy or fill), they're memory access bound; 
there's a huge difference between the two.  HT can take advantage of the 
latter and give you way more utilization, even on a smallscale loop.  In 
theory, anyway.  :)  But, as I said, I have [non-Intel] colleagues who have 
seen big wins with HT on some applications, enough to make them say, "huh, 
this actually works!"

Now, you could just say, "hey, caml's not for that kind of lowlevel stuff", 
which is a fine response.  However, I've been doing a lot of lowlevel stuff 
in my game, all in caml (linear algebra, 3d transforms, bitmap operations, 
etc.), and it's so close to being good enough to just stay in caml and not 
have to drop to C.  I understand the point of using the right tool for the 
job, but there is overhead (both cognitive and development-process-wise, 
both important) associated with hooking something in C, and so it would be 
really nice to stay in caml all the time.  Bringing this back to HT, this 
is the kind of feature that requires inria to do it, because I don't think 
anybody else understands the gc.  By contrast, I could probably get an SSE 
code generator working if I thought it was worth it.  But there's no way I 
could multithread the gc.  :)

>More important, you don't need threads and shared memory to make use
>of a SMP machine.  Any kind of parallelism will do.  Several processes
>with message-passing can easily get you 100% load on all your processors.
>Also, message-passing is more general; for example it will work on clusters.

Sure, but an HT cpu shares L1 and L2 caches between the threads.  This 
means that you really want your threads to be working on the same data and 
code if you can help it.  It'll still work for processes, but you're going 
to thrash way more than if you're doing local stuff.

Again, I'm not an HT zealot; I don't even know if it's going to 
succeed.  But, I do think it has the potential to have a big impact on 
performance oriented programming, and it would be great if there's a plan 
for supporting it in caml if it actually works.  If it's simply not 
possible to multithread the gc well, then that's that.  But it seems like 
something you want to have simmering on the mental back burner in case it 
turns out you want it later.

Sorry for the huge post,
Chris




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-27 18:04           ` Chris Hecker
@ 2002-11-27 21:04             ` Gerd Stolpmann
  2002-11-27 21:45               ` [Caml-list] Calling ocaml from external threads Quetzalcoatl Bradley
  0 siblings, 1 reply; 28+ messages in thread
From: Gerd Stolpmann @ 2002-11-27 21:04 UTC (permalink / raw)
  To: Chris Hecker; +Cc: Damien Doligez, caml-list


Am 2002.11.27 19:04 schrieb(en) Chris Hecker:
> Now, as you point out, threads are complicated to design, program, and 
> debug.  I agree with this completely.  As I said, I never use threaded 
> designs if I can avoid it.  However, if it becomes very easy to spawn very 
> small scale parallel threads in C on an HT processor, then it could make a 
> big performance difference for some algorithms.  People are working on C 
> compilers that have these extensions built in.  Intel's got one 
> now.  They'll be first, everyone will ignore it until the installed base is 
> big enough, and then it'll go into msvc.  MMX, SSE, and 3dnow followed the 
> exact same path.

If it is really easy to spawn a second thread (or wake up an existing thread),
this could be useful for OCaml's runtime system internally. I can imagine that it is
not that difficult to rewrite the GC such that it runs in two threads. I don't mean
that it runs in parallel with the rest of the program (expensive locking problems),
but that the runtime system wakes up two GC threads when necessary, and
waits until both threads have done their job. That would reduce the time
spent in GC, maybe from 30% to 20% for a typical program. Of course, this
is only possible if there are good ways to parallelize the GC such that the
extra coordination time does not eat up the extra CPU power.

Just an idea, I really do not know whether it is doable (or worth doing).

Gerd
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Caml-list] Calling ocaml from external threads
  2002-11-27 21:04             ` Gerd Stolpmann
@ 2002-11-27 21:45               ` Quetzalcoatl Bradley
  0 siblings, 0 replies; 28+ messages in thread
From: Quetzalcoatl Bradley @ 2002-11-27 21:45 UTC (permalink / raw)
  To: caml-list


While the topic of threads is fresh...

Suppose you have an OCaml library (native code) to be embedded in a 
multithreaded C application.  The library is compiled with -output-obj 
unix.cmxa threads.cmxa and the C program calls caml_startup at the 
beginning.

Then I create a few C threads and they all call into the ocaml library 
occasionally.  Before calling in they first call leave_blocking_section, and 
afterwards they call enter_blocking_section.

At this point, the first time a call is made into the ocaml code, 
Mutex.create is called, which crashes during a GC inside 
"oldify_local_roots".  Hash_retaddr(retaddr) is called, and the result 
is looked up in the frame_descriptors table, but the result is a NULL 
pointer which crashes when dereferenced.

Is there anything special that needs to be done to "bless" external 
threads before they call into ocaml?

Unfortunately it isn't really feasible for me to have ocaml create all 
the threads and have the C code called from ocaml instead.  I presume 
that would be easier though.

Thanks,

Quetzalcoatl Bradley
qbradley@blackfen.com


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Caml-list] Why systhreads?
  2002-11-27 14:33       ` Tim Freeman
@ 2002-11-29 13:25         ` Vitaly Lugovsky
  0 siblings, 0 replies; 28+ messages in thread
From: Vitaly Lugovsky @ 2002-11-29 13:25 UTC (permalink / raw)
  To: Tim Freeman; +Cc: dbely, caml-list

On Wed, 27 Nov 2002, Tim Freeman wrote:

> > I tried OCaml in non-memory-consuming numerical applications on SMP.
> >Seems to work well enough (100% load of all the processors).
> 
> Wrong metric.  You want speedup, not CPU utilization.  You can get CPU
> utilization for free by running an infinite loop. Did the application
> run anywhere near N times faster when you were using N processors?

 Yes. But, >80% of the system time was in external C functions, without
any memory management. So, really wrong metric...







^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2002-11-29 13:25 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-11-23  9:08 [Caml-list] Why systhreads? Lauri Alanko
2002-11-24  7:36 ` Sven Luther
2002-11-24 17:41   ` Chris Hecker
2002-11-24 18:12     ` Basile STARYNKEVITCH
2002-11-24 21:10       ` Christopher Quinn
2002-11-24 17:14 ` Vitaly Lugovsky
2002-11-24 17:18   ` Lauri Alanko
2002-11-24 18:27   ` Dmitry Bely
2002-11-24 23:14     ` Vitaly Lugovsky
2002-11-27 14:33       ` Tim Freeman
2002-11-29 13:25         ` Vitaly Lugovsky
2002-11-25 10:01 ` Xavier Leroy
2002-11-25 14:20   ` Markus Mottl
2002-11-25 19:01   ` Blair Zajac
2002-11-25 21:06     ` james woodyatt
2002-11-25 22:20       ` Chris Hecker
2002-11-26  6:49         ` Sven Luther
2002-11-27 13:12         ` Damien Doligez
2002-11-27 18:04           ` Chris Hecker
2002-11-27 21:04             ` Gerd Stolpmann
2002-11-27 21:45               ` [Caml-list] Calling ocaml from external threads Quetzalcoatl Bradley
2002-11-26  9:02     ` [Caml-list] Why systhreads? Xavier Leroy
2002-11-26  9:29       ` Sven Luther
2002-11-26  9:34         ` Xavier Leroy
2002-11-26  9:39           ` Sven Luther
2002-11-26 18:42       ` Chris Hecker
2002-11-26 19:04   ` Dave Berry
2002-11-27  0:07   ` Lauri Alanko
