caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] How does the Thread module work with the Gc?
@ 2014-04-17  8:52 Goswin von Brederlow
  2014-04-17 10:57 ` Mark Shinwell
  0 siblings, 1 reply; 3+ messages in thread
From: Goswin von Brederlow @ 2014-04-17  8:52 UTC (permalink / raw)
  To: caml-list

Hi,

as mentioned before I'm porting ocaml to run baremetal on a Raspberry
Pi and I'm getting close to a first release. Maybe porting is the
wrong word. I'm writing an exo kernel that you link to the ocamlopt
output to create a bootable kernel image. So just a verry thin layer
between hardware and ocaml.

What I'm currently a bit stuck with is the threading support and
interrupts. Since I don't plan to implement the pthread interface I
have to write my own Thread module. But it is verry similar to
otherlibs/systhread/ by necessity.

There seems to be 3 parts where the Thread module interacts with the Gc:

1) scan_roots_hook

This handles the per thread local roots, stacks and gc registers. This
ensures that data seen by any thread remains alive. Otherwise only
data seen by the current thread would be seen by the Gc.

2) stack usage estimation

No idea what that is for. It seems to add up the stack usage per
thread.

3) enter/leave_blocking_section

There is a mutex that prevents any two threads from leaving the
blocking section. I.e. no two threads can ever run ocaml code.

This is where the thread switching happens in the Thread module. This
causes threads to switch when you call e.g. Printf.printf.


But here is where my problem starts:

How does ocaml switch threads when a signal occurs? What if a thread
never enters a blocking section? Isn't there some other point where
tasks can be switched other than enter/leave_blocking_section?

I looked at the implementation for signals and they seem to set the
allocation limit for the thread to 0 so the next allocation will
trigger a Gc run. But how does that lead to a thread switch? What am I
missing here?

MfG
	Goswin


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Caml-list] How does the Thread module work with the Gc?
  2014-04-17  8:52 [Caml-list] How does the Thread module work with the Gc? Goswin von Brederlow
@ 2014-04-17 10:57 ` Mark Shinwell
  2014-04-17 22:10   ` Goswin von Brederlow
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Shinwell @ 2014-04-17 10:57 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

On 17 April 2014 09:52, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> How does ocaml switch threads when a signal occurs? What if a thread
> never enters a blocking section? Isn't there some other point where
> tasks can be switched other than enter/leave_blocking_section?
>
> I looked at the implementation for signals and they seem to set the
> allocation limit for the thread to 0 so the next allocation will
> trigger a Gc run. But how does that lead to a thread switch? What am I
> missing here?

Gah, this is a complicated part of the world.

It works as follows when there is no explicit call to release the
OCaml lock.  The ticker thread records a "preemption signal" and sets
the minor heap limit such that the next minor allocation goes to
[caml_garbage_collection].  At this point, pending signals are
processed; and then here is the piece I think you've missed: the
systhreads library has already registered a handler on that signal
(see otherlibs/systhreads/thread.ml) and that handler will cause
[Thread.yield] to be called.  This in turn calls the "yield" operation
in st_posix.h.  On Linux this does nothing; see the comment.  However,
that call (see otherlibs/systhreads/st_stubs.c) is also wrapped in an
enter/leave blocking section, which causes the mutex to be dropped.
At this time, if the stars align and the operating system scheduler is
willing, a thread switch may occur.  There are no guarantees about
fairness made in the runtime at this point (see Xavier's comment on
mantis 5373), so you may in fact find that sometimes a thread does not
yield.  (Also, code that never allocates will not yield unless it
explicitly drops the lock for some other reason.)

Mark

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Caml-list] How does the Thread module work with the Gc?
  2014-04-17 10:57 ` Mark Shinwell
@ 2014-04-17 22:10   ` Goswin von Brederlow
  0 siblings, 0 replies; 3+ messages in thread
From: Goswin von Brederlow @ 2014-04-17 22:10 UTC (permalink / raw)
  To: caml-list

On Thu, Apr 17, 2014 at 11:57:11AM +0100, Mark Shinwell wrote:
> On 17 April 2014 09:52, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> > How does ocaml switch threads when a signal occurs? What if a thread
> > never enters a blocking section? Isn't there some other point where
> > tasks can be switched other than enter/leave_blocking_section?
> >
> > I looked at the implementation for signals and they seem to set the
> > allocation limit for the thread to 0 so the next allocation will
> > trigger a Gc run. But how does that lead to a thread switch? What am I
> > missing here?
> 
> Gah, this is a complicated part of the world.
> 
> It works as follows when there is no explicit call to release the
> OCaml lock.  The ticker thread records a "preemption signal" and sets
> the minor heap limit such that the next minor allocation goes to
> [caml_garbage_collection].  At this point, pending signals are
> processed; and then here is the piece I think you've missed: the
> systhreads library has already registered a handler on that signal
> (see otherlibs/systhreads/thread.ml) and that handler will cause
> [Thread.yield] to be called.  This in turn calls the "yield" operation
> in st_posix.h.  On Linux this does nothing; see the comment.  However,
> that call (see otherlibs/systhreads/st_stubs.c) is also wrapped in an
> enter/leave blocking section, which causes the mutex to be dropped.
> At this time, if the stars align and the operating system scheduler is
> willing, a thread switch may occur.  There are no guarantees about
> fairness made in the runtime at this point (see Xavier's comment on
> mantis 5373), so you may in fact find that sometimes a thread does not
> yield.  (Also, code that never allocates will not yield unless it
> explicitly drops the lock for some other reason.)
> 
> Mark

I took another stab at that and that worked eventually. Tanks, that
pointer helped. But ...

First snag was that ocaml signals are negative and get converted when
installing the signal handler. I decided to simply use signal 0 for
now instead of Sys.sigvtalrm since positive signal numbers are used as
is.

Then I thought I needed enter/leave blocking section and had to
schedule threads in leave blocking section, like otherlibs/systhreads
does. That kind of works too but gets triggered far too often.
Specifically Printf.printf calls it in the midle of output and the
channel structure is not thread safe. So the first time 2 threads
output something it blows up.

For scheduling in enter/leave blocking section to work one has to add
the channel mutex hooks:
  extern void (*caml_channel_mutex_free)(struct channel *);
  extern void (*caml_channel_mutex_lock)(struct channel *);
  extern void (*caml_channel_mutex_unlock)(struct channel *);
  extern void (*caml_channel_mutex_unlock_exn)(void);

Took me a while to get those right because the struct channel depends
on _FILE_OFFSET_BITS=64.

And after all that work I noticed two things:

1) Scheduling in enter/leave blocking section just makes the threads
fight for the channel mutex. It keeps going through all threads till
it gets to the original again so it can continue its printf. No
handling of blocked/sleeping threads yet.

This is a bit of an artefact of my test case because they printf a lot.

2) Why go through enter/leave blocking section when I can just
schedule properly in Thread.yield() itself?

I don't think I need enter/leave blocking section at all since this is
purely a single core system and being barebone there are no blocking
syscalls. And even if, scheduling in enter or leave blocking section
wouldn't help. I would need a master mutex and uncouple scheduling
from ocaml completly for that to help. But one day I might port that
to SMP so I will keep the source for the hooks.

So now when I create two threads with loop and loop2 as closure I get:

let rec fib = function
  | 0 -> (1, "0")
  | 1 -> (1, "1")
  | n ->
    (* alloc something for preemption to work later *)
    let s = Printf.sprintf "%d" n in
    let (n1, _) = fib (n - 1) in
    let (n2, _) = fib (n - 2) in
    (n1 + n2, s)

let rec loop n =
  let (res, s) = fib n
  in
  Printf.printf "fib(%s) = %d\n%!" s res;
  Thread.signal 0;
  loop (n + 1)

let rec loop2 n =
  let (res, s) = fib n
  in
  Printf.printf "fib2(%s) = %d\n%!" s res;
  Thread.signal 0;
  loop2 (n + 1)


fib2(16) = 1597
# ocaml_thread_signal(0)
# sigemptyset()
# sigaddset()
# sigprocmask()
# schedule()
# switching: old_stack = 0xC03F1838, new_stack = 0xC016DDB8
# switched: old_stack = 0xC03F1838, new_stack = 0xC016DDB8
# sigprocmask()
# write()
Signal number 0
fib(16) = 1597
# ocaml_thread_signal(0)
# sigemptyset()
# sigaddset()
# sigprocmask()
# schedule()
# switching: old_stack = 0xC016DDB8, new_stack = 0xC03F1838
# switched: old_stack = 0xC016DDB8, new_stack = 0xC03F1838
# sigprocmask()
# write()
Signal number 0
fib2(17) = 2584
# ocaml_thread_signal(0)
# sigemptyset()
# sigaddset()
# sigprocmask()
# schedule()
# switching: old_stack = 0xC03F1838, new_stack = 0xC016DDB8
# switched: old_stack = 0xC03F1838, new_stack = 0xC016DDB8
# sigprocmask()
# write()
Signal number 0
fib(17) = 2584
# ocaml_thread_signal(0)
# sigemptyset()
# sigaddset()
# sigprocmask()
# schedule()
# switching: old_stack = 0xC016DDB8, new_stack = 0xC03F1838
# switched: old_stack = 0xC016DDB8, new_stack = 0xC03F1838
# sigprocmask()
# write()
Signal number 0
fib2(18) = 4181
...

Lines with # are printf from the exo kernel and "Signal number 0" is
from the signal handler. You can see that after every fibonacci number
is printed the threads switch due to the Thread.signal call.

Time to program and activate the timer interrupt and make it call
caml_record_signal(). Then it will be nearly preemptive multitasking,
assuming threads alloc enough. I might want to add a
Thread.maybe_yield() for loops that don't alloc.



For those that are waiting to see the source I have a few more things
I want to add before the first release:

- timer interrupt for nearly preemptive multitasking
- time function (driven by the timer interrupt)
- scheduler with running/ready, waiting and sleeping thread queues
- Thread.sleep
- Thread.wait_for_signal (wait for hardware interrupt or user signal)
- Thread.signal_thread

With the 4 day weekend coming up things look promising.

MfG
	Goswin

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-04-17 22:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-17  8:52 [Caml-list] How does the Thread module work with the Gc? Goswin von Brederlow
2014-04-17 10:57 ` Mark Shinwell
2014-04-17 22:10   ` Goswin von Brederlow

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).