caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* How to re-implement the GC?
@ 2010-09-13  6:29 Eray Ozkural
  2010-09-13  7:57 ` Sylvain Le Gall
  0 siblings, 1 reply; 9+ messages in thread
From: Eray Ozkural @ 2010-09-13  6:29 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 359 bytes --]

Hi there,

What exactly are the requirements for substituting the current GC with
another, preferably non-locking, GC? Any pitfalls I wouldn't see just
reading the code?

Best,

-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct

[-- Attachment #2: Type: text/html, Size: 597 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to re-implement the GC?
  2010-09-13  6:29 How to re-implement the GC? Eray Ozkural
@ 2010-09-13  7:57 ` Sylvain Le Gall
  2010-09-13 12:02   ` [Caml-list] " Eray Ozkural
  0 siblings, 1 reply; 9+ messages in thread
From: Sylvain Le Gall @ 2010-09-13  7:57 UTC (permalink / raw)
  To: caml-list

Hi,

On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
> Hi there,
>
> What exactly are the requirements for substituting the current GC with
> another, preferably non-locking, GC? Any pitfalls I wouldn't see just
> reading the code?
>

The GC is deeply interacting with the the rest of the compiler. I think
you will spend a lot of time on this task.

I would recommend you trying OC4MC, which is probably what you are
looking for:
http://www.algo-prog.info/ocmc/web/

They show quite interesting results using Thread at the last OCaml
Meeting, though they are still some bugs (almost linear speed-up with
multicore).

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] Re: How to re-implement the GC?
  2010-09-13  7:57 ` Sylvain Le Gall
@ 2010-09-13 12:02   ` Eray Ozkural
  2010-09-13 12:22     ` Sylvain Le Gall
  0 siblings, 1 reply; 9+ messages in thread
From: Eray Ozkural @ 2010-09-13 12:02 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1349 bytes --]

On Mon, Sep 13, 2010 at 10:57 AM, Sylvain Le Gall <sylvain@le-gall.net>wrote:

> Hi,
>
> On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
> > Hi there,
> >
> > What exactly are the requirements for substituting the current GC with
> > another, preferably non-locking, GC? Any pitfalls I wouldn't see just
> > reading the code?
> >
>
> The GC is deeply interacting with the the rest of the compiler. I think
> you will spend a lot of time on this task.
>
>
Deeply interacting with the compiler, how? Not through the public interface
of GC? Do you mean it is not used in a clean way?


> I would recommend you trying OC4MC, which is probably what you are
> looking for:
> http://www.algo-prog.info/ocmc/web/
>
>
Yes, I've seen it but it's a work in progress, and it's being rewritten from
scratch.


> They show quite interesting results using Thread at the last OCaml
> Meeting, though they are still some bugs (almost linear speed-up with
> multicore).
>


What exactly is the GC being used there? Is it a custom algorithm or a known
one? Could we plug our own algorithm to the oc4mc if it has already provided
the basic changes to substitute the GC?

Best,

-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct

[-- Attachment #2: Type: text/html, Size: 2325 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to re-implement the GC?
  2010-09-13 12:02   ` [Caml-list] " Eray Ozkural
@ 2010-09-13 12:22     ` Sylvain Le Gall
  2010-09-13 14:14       ` [Caml-list] " Eray Ozkural
  0 siblings, 1 reply; 9+ messages in thread
From: Sylvain Le Gall @ 2010-09-13 12:22 UTC (permalink / raw)
  To: caml-list

On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
>> On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
>> > Hi there,
>> >
>> > What exactly are the requirements for substituting the current GC with
>> > another, preferably non-locking, GC? Any pitfalls I wouldn't see just
>> > reading the code?
>> >
>>
>> The GC is deeply interacting with the the rest of the compiler. I think
>> you will spend a lot of time on this task.
>>
>>
> Deeply interacting with the compiler, how? Not through the public interface
> of GC? Do you mean it is not used in a clean way?
>

I am not sure how you define "clean way". I think it is very efficient,
but not "modular or object-oriented". I would say that it is clean with
regard of the efficiency. But I won't use it to demonstrate how GC works
to student (but I won't either show them real world implementation of
other GC which are always more complex when optimization is required).

AFAIK, it uses some machine register to store a pointer to the minor
heap. But I am not a GC expert. 

>
>> I would recommend you trying OC4MC, which is probably what you are
>> looking for:
>> http://www.algo-prog.info/ocmc/web/
>>
>>
> Yes, I've seen it but it's a work in progress, and it's being rewritten from
> scratch.
>
>

If you stick to 3.11.1 OCaml version, you'll be able to compile with one
of their latest "stable" patch. 

To be honest, I think that if you join your efforts with theirs, you'll
probably get something quicker than going alone on this path. But this
is only my opinion.

At least, you will need the "fully-reentrant" runtime they are doing. 

>> They show quite interesting results using Thread at the last OCaml
>> Meeting, though they are still some bugs (almost linear speed-up with
>> multicore).
>>
>
>
> What exactly is the GC being used there? Is it a custom algorithm or a known
> one? Could we plug our own algorithm to the oc4mc if it has already provided
> the basic changes to substitute the GC?
>

I think you won't be able to plugin your own GC. The one they provide is
a "stop the world"... I am not sure though, ask them directly.

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] Re: How to re-implement the GC?
  2010-09-13 12:22     ` Sylvain Le Gall
@ 2010-09-13 14:14       ` Eray Ozkural
  2010-09-13 14:29         ` Sylvain Le Gall
  2010-09-13 21:25         ` Jon Harrop
  0 siblings, 2 replies; 9+ messages in thread
From: Eray Ozkural @ 2010-09-13 14:14 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 4540 bytes --]

On Mon, Sep 13, 2010 at 3:22 PM, Sylvain Le Gall <sylvain@le-gall.net>wrote:

> On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
> >> On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
> >> > Hi there,
> >> >
> >> > What exactly are the requirements for substituting the current GC with
> >> > another, preferably non-locking, GC? Any pitfalls I wouldn't see just
> >> > reading the code?
> >> >
> >>
> >> The GC is deeply interacting with the the rest of the compiler. I think
> >> you will spend a lot of time on this task.
> >>
> >>
> > Deeply interacting with the compiler, how? Not through the public
> interface
> > of GC? Do you mean it is not used in a clean way?
> >
>
> I am not sure how you define "clean way". I think it is very efficient,
> but not "modular or object-oriented". I would say that it is clean with
> regard of the efficiency. But I won't use it to demonstrate how GC works
> to student (but I won't either show them real world implementation of
> other GC which are always more complex when optimization is required).
>
>
Well, programming anything in C is messy, I suppose.


> AFAIK, it uses some machine register to store a pointer to the minor
> heap. But I am not a GC expert.
>

Ah, that's interesting. I wonder if it provides any real speedup on new
architectures compared to storing the pointer in RAM.

>
> >
> >> I would recommend you trying OC4MC, which is probably what you are
> >> looking for:
> >> http://www.algo-prog.info/ocmc/web/
> >>
> >>
> > Yes, I've seen it but it's a work in progress, and it's being rewritten
> from
> > scratch.
> >
> >
>
> If you stick to 3.11.1 OCaml version, you'll be able to compile with one
> of their latest "stable" patch.
>
>
http://www.algo-prog.info/ocmc/distribution/

Which one is it?


> To be honest, I think that if you join your efforts with theirs, you'll
> probably get something quicker than going alone on this path. But this
> is only my opinion.
>

Yes, if I decide to carry on I would try to join efforts. But I really need
to find out what's necessary first. Hence, all the pondering.


>
> At least, you will need the "fully-reentrant" runtime they are doing.
>
>
Yes, fully-entrant is necessary for any proper POSIX threads code. If the
runtime had some global state, you simply carry that to local state
(plugging into function args etc.) and you're done. Depends on how much
global state there is. In well-designed programs, there isn't much of a
global state, it's unfortunate they didn't notice the runtime wasn't
reentrant at first. They would also need to pay attention to such things as
volatile memory and synchronization. It can actually get quite difficult to
write POSIX threads code that won't deadlock or do unexpected things, while
in theory it is quite easy to write. It would be nice to have a complete
checklist of everything you need to do to make sure the multithreaded code
is correct, which I believe I never had so I prefer message passing :)


> >> They show quite interesting results using Thread at the last OCaml
> >> Meeting, though they are still some bugs (almost linear speed-up with
> >> multicore).
> >>
> >
> >
> > What exactly is the GC being used there? Is it a custom algorithm or a
> known
> > one? Could we plug our own algorithm to the oc4mc if it has already
> provided
> > the basic changes to substitute the GC?
> >
>
> I think you won't be able to plugin your own GC. The one they provide is
> a "stop the world"... I am not sure though, ask them directly.
>
>

That's unfortunate, too, because from reading their source code I had had
the impression that they had in mind an easy way to plug-in my GC. One with
global lock isn't good enough though, it will not have good performance with
memory intensive programs. Hence, my question, suppose this project actually
made progress in other parts of the code (like making the runtime fully
re-entrant) how do I go about implementing a state-of-the-art GC for this,
are there any special requirements or do I just have to implement a minor
heap and a major heap etc. to match the interface and the parameters and I
am done? I mean, is this a garbage collector as we know it, or does it have
any exotic features or requirements? I am looking to see if a competent
programmer without an intimate knowledge of the whole compilation system can
do it.

Best,

-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct

[-- Attachment #2: Type: text/html, Size: 6585 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to re-implement the GC?
  2010-09-13 14:14       ` [Caml-list] " Eray Ozkural
@ 2010-09-13 14:29         ` Sylvain Le Gall
  2010-09-13 21:25           ` [Caml-list] " Jon Harrop
  2010-09-13 21:25         ` Jon Harrop
  1 sibling, 1 reply; 9+ messages in thread
From: Sylvain Le Gall @ 2010-09-13 14:29 UTC (permalink / raw)
  To: caml-list

On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
> --===============0758070018==
> Content-Type: multipart/alternative; boundary=000e0cd18672fce48b049024b79e
>
> --000e0cd18672fce48b049024b79e
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Mon, Sep 13, 2010 at 3:22 PM, Sylvain Le Gall <sylvain@le-gall.net>wrote:
>
>> On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
>> >> On 13-09-2010, Eray Ozkural <examachine@gmail.com> wrote:
>> >> > Hi there,
>> >> >
>> >> > What exactly are the requirements for substituting the current GC with
>> >> > another, preferably non-locking, GC? Any pitfalls I wouldn't see just
>> >> > reading the code?
>> >> >
>> >>
>> >> The GC is deeply interacting with the the rest of the compiler. I think
>> >> you will spend a lot of time on this task.
>> >>
>> >>
>> > Deeply interacting with the compiler, how? Not through the public
>> interface
>> > of GC? Do you mean it is not used in a clean way?
>> >
>>
>> I am not sure how you define "clean way". I think it is very efficient,
>> but not "modular or object-oriented". I would say that it is clean with
>> regard of the efficiency. But I won't use it to demonstrate how GC works
>> to student (but I won't either show them real world implementation of
>> other GC which are always more complex when optimization is required).
>>
>>
> Well, programming anything in C is messy, I suppose.
>
>
>> AFAIK, it uses some machine register to store a pointer to the minor
>> heap. But I am not a GC expert.
>>
>
> Ah, that's interesting. I wonder if it provides any real speedup on new
> architectures compared to storing the pointer in RAM.
>

<take this with care, I am still not a GC expert>
I think it provides an ultra quick way to allocate data on the minor
heap. For heavy allocating programming languages like FP, it is a good
speedup.

Other GC algorithm for Java/C# often made the assumption of long-living
objects with mutation. This is not the case for OCaml. 
</take this with care>

>>
>> >
>> >> I would recommend you trying OC4MC, which is probably what you are
>> >> looking for:
>> >> http://www.algo-prog.info/ocmc/web/
>> >>
>> >>
>> > Yes, I've seen it but it's a work in progress, and it's being rewritten
>> from
>> > scratch.
>> >
>> >
>>
>> If you stick to 3.11.1 OCaml version, you'll be able to compile with one
>> of their latest "stable" patch.
>>
>>
> http://www.algo-prog.info/ocmc/distribution/
>
> Which one is it?
>

Maybe this one:
http://www.algo-prog.info/ocmc/distribution/oc4mc-toronto-stack32k.tar.gz

It seems to be based on 3.11.1. I really don't know in fact, I am not a
oc4mc expert.

>
>> >> They show quite interesting results using Thread at the last OCaml
>> >> Meeting, though they are still some bugs (almost linear speed-up with
>> >> multicore).
>> >>
>> >
>> >
>> > What exactly is the GC being used there? Is it a custom algorithm or a
>> known
>> > one? Could we plug our own algorithm to the oc4mc if it has already
>> provided
>> > the basic changes to substitute the GC?
>> >
>>
>> I think you won't be able to plugin your own GC. The one they provide is
>> a "stop the world"... I am not sure though, ask them directly.
>>
>>
>
> That's unfortunate, too, because from reading their source code I had had
> the impression that they had in mind an easy way to plug-in my GC. One with
> global lock isn't good enough though, it will not have good performance with
> memory intensive programs. Hence, my question, suppose this project actually
> made progress in other parts of the code (like making the runtime fully
> re-entrant) how do I go about implementing a state-of-the-art GC for this,
> are there any special requirements or do I just have to implement a minor
> heap and a major heap etc. to match the interface and the parameters and I
> am done? I mean, is this a garbage collector as we know it, or does it have
> any exotic features or requirements? I am looking to see if a competent
> programmer without an intimate knowledge of the whole compilation system can
> do it.
>

I really don't know how to answer, contact directly the OC4MC team. I
only answer you with the data they give at OCaml Meeting, back in April. 

Regards
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [Caml-list] Re: How to re-implement the GC?
  2010-09-13 14:14       ` [Caml-list] " Eray Ozkural
  2010-09-13 14:29         ` Sylvain Le Gall
@ 2010-09-13 21:25         ` Jon Harrop
  2010-09-13 21:47           ` Eray Ozkural
  1 sibling, 1 reply; 9+ messages in thread
From: Jon Harrop @ 2010-09-13 21:25 UTC (permalink / raw)
  To: 'Eray Ozkural', 'Sylvain Le Gall'; +Cc: caml-list

Hi Eray,

Retrofitting a new multicore-friendly GC onto OCaml is difficult for two
main reasons:

1. You want plug-and-play GCs but the OCaml compiler is tightly coupled to
the old GC (although OC4MC has decoupled them!).

2. Recovering similar performance whilst reusing the same data
representation is extremely difficult because the current design relies so
heavily on lightweight allocation. You really want to change the data
representation to avoid unnecessary boxing (e.g. never box or tag int,
floats or tuples) in order to reduce the allocation rate and, consequently,
the stress on the garbage collector but OCaml cannot express value types and
its ability to represent polymorphic recursion makes this extremely
difficult to implement.

As Sylvain has said, OC4MC is your best bet if you want to try to write a
new GC for OCaml. However, making more extensive changes has the potential
to address many more problems (e.g. convey run-time type information for
generic printing) so you might consider alternatives like trying to compile
OCaml's bytecode into HLVM's IR for JIT compilation because HLVM already has
a multicore friendly GC and it is much easier to develop.

> Ah, that's interesting. I wonder if it provides any real speedup on new
> architectures compared to storing the pointer in RAM.

For a multicore GC, using a register to refer to thread-local data is a huge
win because accessing thread-local data is very slow. I made that mistake in
HLVM's first multicore-capable GC and it was basically useless as a
consequence because all of the time was spent looking up thread-local data.
When I started passing the thread-local data around as an extra argument to
every function (not necessarily consuming a register all the time because
LLVM is free to spill it), performance improved enormously.

Cheers,
Jon.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [Caml-list] Re: How to re-implement the GC?
  2010-09-13 14:29         ` Sylvain Le Gall
@ 2010-09-13 21:25           ` Jon Harrop
  0 siblings, 0 replies; 9+ messages in thread
From: Jon Harrop @ 2010-09-13 21:25 UTC (permalink / raw)
  To: 'Sylvain Le Gall', caml-list

> Other GC algorithm for Java/C# often made the assumption of long-living
> objects with mutation. This is not the case for OCaml.

They do favour mutation and, consequently, have cheaper write barriers but
both the JVM and CLR use pointer-bumping minor heaps for the first "nursery"
generation to collect short-lived objects efficiently, just like OCaml.

Cheers,
Jon.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] Re: How to re-implement the GC?
  2010-09-13 21:25         ` Jon Harrop
@ 2010-09-13 21:47           ` Eray Ozkural
  0 siblings, 0 replies; 9+ messages in thread
From: Eray Ozkural @ 2010-09-13 21:47 UTC (permalink / raw)
  To: Jon Harrop; +Cc: Sylvain Le Gall, caml-list

[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]

On Tue, Sep 14, 2010 at 12:25 AM, Jon Harrop <
jonathandeanharrop@googlemail.com> wrote:

> Hi Eray,
>
> Retrofitting a new multicore-friendly GC onto OCaml is difficult for two
> main reasons:
>
> 1. You want plug-and-play GCs but the OCaml compiler is tightly coupled to
> the old GC (although OC4MC has decoupled them!).
>
>
I'm not really interested in changing anything else at the moment. I am just
looking to see if I can commission the implementation of a state-of-the-art
GC and plug it into ocaml. Naturally, like anyone, I have my own ideas about
how to correctly optimize dynamic memory allocation but I can take ocaml's
idea, that of using two heaps and go with it. So, oc4mc was successful in
decoupling after all? I need to go back and take a look at the source again.
It's getting complicated quickly :)

Cheers,

-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct

[-- Attachment #2: Type: text/html, Size: 1496 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-09-13 21:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-13  6:29 How to re-implement the GC? Eray Ozkural
2010-09-13  7:57 ` Sylvain Le Gall
2010-09-13 12:02   ` [Caml-list] " Eray Ozkural
2010-09-13 12:22     ` Sylvain Le Gall
2010-09-13 14:14       ` [Caml-list] " Eray Ozkural
2010-09-13 14:29         ` Sylvain Le Gall
2010-09-13 21:25           ` [Caml-list] " Jon Harrop
2010-09-13 21:25         ` Jon Harrop
2010-09-13 21:47           ` Eray Ozkural

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).