caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* AW: [Caml-list] The tag bit
@ 2004-08-12 15:22 Bauer, Christoph
  2004-08-13  3:53 ` Jacques Garrigue
  0 siblings, 1 reply; 11+ messages in thread
From: Bauer, Christoph @ 2004-08-12 15:22 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 902 bytes --]


>     Well, I can see several reasons :
>   * processors like powers of two, especially when it comes 
> to the size 
> of a memory address, because of cache issues, so you'd better 
> make it 32 
> or 64 words than 33 or 65.
>   * If the tag bit can be anywhere in a word you have to spend extra 
> time to extract it, whereas when it is at a fixed place, 
> especially LSB 
> or MSB, it is very cheap and easy.
>   * You would need two registers to access a value and its 
> tag instead 
> of one, and registers are very precious, at least on IA-32 
> architectures.

But who needs the tag bit? Only the garbage collector. Maybe it's
an advantage to see 32 tag bits as a whole, e.g. the question
"does the block contains any pointer" can be calculated bit-parallel.
Anyway the garbage collector could works on blocks and needs 
just one additional memory access per block.

Regards,
Christoph Bauer

[-- Attachment #2: Type: text/html, Size: 1807 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] The tag bit
  2004-08-12 15:22 AW: [Caml-list] The tag bit Bauer, Christoph
@ 2004-08-13  3:53 ` Jacques Garrigue
  2004-08-13 12:58   ` Christophe Raffalli
  0 siblings, 1 reply; 11+ messages in thread
From: Jacques Garrigue @ 2004-08-13  3:53 UTC (permalink / raw)
  To: Christoph.Bauer; +Cc: caml-list

From: "Bauer, Christoph" <Christoph.Bauer@lms-gmbh.de>
> But who needs the tag bit? Only the garbage collector. Maybe it's
> an advantage to see 32 tag bits as a whole, e.g. the question
> "does the block contains any pointer" can be calculated bit-parallel.
> Anyway the garbage collector could works on blocks and needs 
> just one additional memory access per block.

But who works in your program?
The garbage collector!
Do some profiling, and you will be surprised at how much time is used
collecting garbage in a perfectly sane program.
No surprise here: in symbolic programming, you are generating garbage
all the time. In the old lisp days, people talked of the mutator (your
program) and the collector, emphasizing that they are almost equal.
And you wouldn't give 1 bit for your brother, doing all the dirty
work? :-)

More seriously, it may be that your idea of blocks could be made to
work, but it's not going to be easy. For instance, what do you do when
manipulating data between registers? Looks like it would mean updating
some special register every time you do a move operation. This could
generate heaps of code.
GC is such an old field that I suppose all weird approaches have
already been studied in the litterature.

Jacques Garrigue

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] The tag bit
  2004-08-13  3:53 ` Jacques Garrigue
@ 2004-08-13 12:58   ` Christophe Raffalli
  2004-08-13 13:14     ` [Caml-list] Other GC in ML family ? Diego Olivier Fernandez Pons
                       ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Christophe Raffalli @ 2004-08-13 12:58 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: Christoph.Bauer, caml-list

There is a less costly way to avoid the tag bit in integer: 
"conservative GC": any int which happens to point in an alloccated block 
(or only at the beginning if you do not consider C but ML) is considered 
as a pointer. You will have very few wrong pointers (especially in the 
second case). Moreover, array of int or float, or block of memory can be 
tagged with a flag saying they do not old pointer.

The Boehm GC for C and C++ is very succefull to do that and very often 
allow you to share data-structure in C as you would in ML (not caring 
about who will release first the data) and gain both speed and memory.

Does anyone have  a comparison between two identical GC except one 
should have a tag bit and the other be conservative ?

The cost of conservative GC is the test to know if an int is pointing in 
(or at the beginning) of an allocated block which require for instance a 
hashtbl of allocated blocks by adress ranges. I don't know if the gain 
for arithmetic + easier C interface would compensate the lost in the GC 
for a real GC like Caml's.


--
Christophe Raffalli
Université de Savoie
Batiment Le Chablais, bureau 21
73376 Le Bourget-du-Lac Cedex

tél: (33) 4 79 75 81 03
fax: (33) 4 79 75 87 42
mail: Christophe.Raffalli@univ-savoie.fr
www: http://www.lama.univ-savoie.fr/~RAFFALLI
---------------------------------------------
IMPORTANT: this mail is signed using PGP/MIME
At least Enigmail/Mozilla, mutt or evolution
can check this signature
---------------------------------------------

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Caml-list] Other GC in ML family ?
  2004-08-13 12:58   ` Christophe Raffalli
@ 2004-08-13 13:14     ` Diego Olivier Fernandez Pons
  2004-08-13 13:24     ` AW: [Caml-list] The tag bit Andreas Rossberg
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Diego Olivier Fernandez Pons @ 2004-08-13 13:14 UTC (permalink / raw)
  To: Christophe Raffalli; +Cc: caml-list

    Bonjour,

On Fri, 13 Aug 2004, Christophe Raffalli wrote:

> The cost of conservative GC is the test to know if an int is
> pointing in (or at the beginning) of an allocated block which
> require for instance a hashtbl of allocated blocks by adress ranges.
> I don't know if the gain for arithmetic + easier C interface would
> compensate the lost in the GC for a real GC like Caml's.

How does the GC of mlton work ? If I understood correctly mlton does
have 32-bit integers so if it has a mark-and-copy family garbage
collector, its tags are kept apart.

ML-Kit has a region garbage collector. It collects static information
by analyzing the source-code at compile time but it is not enough so
it is combined with an other kind of 'dynamically computed' collection
(also by regions if I understood correctly the simple explanations of
the user manual)

Maybe could you find more information (or at least their filling) on
the subject ? (C interface, performances, etc.)

        Diego Olivier





-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] The tag bit
  2004-08-13 12:58   ` Christophe Raffalli
  2004-08-13 13:14     ` [Caml-list] Other GC in ML family ? Diego Olivier Fernandez Pons
@ 2004-08-13 13:24     ` Andreas Rossberg
  2004-08-13 14:32       ` T. Kurt Bond
  2004-08-13 14:28     ` skaller
  2004-08-13 15:40     ` Brian Hurt
  3 siblings, 1 reply; 11+ messages in thread
From: Andreas Rossberg @ 2004-08-13 13:24 UTC (permalink / raw)
  To: caml-list

Christophe Raffalli wrote:
> There is a less costly way to avoid the tag bit in integer: 
> "conservative GC": any int which happens to point in an alloccated block 
> (or only at the beginning if you do not consider C but ML) is considered 
> as a pointer. You will have very few wrong pointers (especially in the 
> second case). Moreover, array of int or float, or block of memory can be 
> tagged with a flag saying they do not old pointer.
> 
> The Boehm GC for C and C++ is very succefull to do that and very often 
> allow you to share data-structure in C as you would in ML (not caring 
> about who will release first the data) and gain both speed and memory.

AFAIK, a conservative collector is not allowed to move anything. Hence 
it is inherently incompatible with compacting and generational GC, like 
used in OCaml (and highly desirable).

Cheers,

	- Andreas

-- 
Andreas Rossberg, rossberg@ps.uni-sb.de

Let's get rid of those possible thingies!  -- TB

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] The tag bit
  2004-08-13 12:58   ` Christophe Raffalli
  2004-08-13 13:14     ` [Caml-list] Other GC in ML family ? Diego Olivier Fernandez Pons
  2004-08-13 13:24     ` AW: [Caml-list] The tag bit Andreas Rossberg
@ 2004-08-13 14:28     ` skaller
  2004-08-13 15:44       ` Christophe Raffalli
  2004-08-13 15:40     ` Brian Hurt
  3 siblings, 1 reply; 11+ messages in thread
From: skaller @ 2004-08-13 14:28 UTC (permalink / raw)
  To: Christophe Raffalli; +Cc: Jacques Garrigue, Christoph.Bauer, caml-list

On Fri, 2004-08-13 at 22:58, Christophe Raffalli wrote:

> Does anyone have  a comparison between two identical GC except one 
> should have a tag bit and the other be conservative ?

The Boehm collector is quite efficient: if you compare it
to hand written encodings such as reference counting
for example.

The main problem with it is that it has to 'stop the world'
whilst it is doing its thing, and so isn't useful for
real time applications such as a game where you can easily
pay 20% of all CPU for the GC -- but you simply can't freeze
up the game for 10 seconds every few minutes.

The Ocaml generational collector is likely to be much better
at this -- some of the workload is spread over time, and
the remaining major collection when needed will also be
faster, and can be called manually at appropriate points.

A second point is -- Boehm cannot defragment memory.
Ocaml can (although the compaction is 'world stop').

So .. i don't think the 'overall CPU use' of the two collector
kinds is actually what you need to compare: the real time
performance and/or ability to operate with C/C++ code
are the likely issues.


-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net



-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] The tag bit
  2004-08-13 13:24     ` AW: [Caml-list] The tag bit Andreas Rossberg
@ 2004-08-13 14:32       ` T. Kurt Bond
  2004-08-13 14:41         ` AW: [Caml-list] Conservative GC T. Kurt Bond
  2004-08-13 16:14         ` AW: [Caml-list] The tag bit Ville-Pertti Keinonen
  0 siblings, 2 replies; 11+ messages in thread
From: T. Kurt Bond @ 2004-08-13 14:32 UTC (permalink / raw)
  To: Andreas Rossberg; +Cc: caml-list

Andreas Rossberg writes:
> AFAIK, a conservative collector is not allowed to move anything. Hence 
> it is inherently incompatible with compacting and generational GC, like 
> used in OCaml (and highly desirable).

Joel F. Bartlett's 1988 paper "Compacting garbage collection with
ambiguous roots" describes a conservative "mostly copying" compacting
GC scheme; his 1989 paper "Mostly-Copying Garbage Collection Picks Up
Generations and C++" descibes a generation variation.  Frederick Smith
and Greg Morrisett's 1997 paper "Mostly-Copying Collection: A Viable
Alternative to Conservative Mark-Sweep" describes an improved variant
that they compared with the BDW by using both with the TIL/C ML
compiler.  Giuseppe Attardi, Tito Flagella, and Pietro Iglio describe
a GC in their 1998 paper "A Customisable Memory Management Framework
for C++" that uses mostly copying GC for the default heap.

-- 
T. Kurt Bond, tkb@tkb.mpl.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] Conservative GC
  2004-08-13 14:32       ` T. Kurt Bond
@ 2004-08-13 14:41         ` T. Kurt Bond
  2004-08-13 16:14         ` AW: [Caml-list] The tag bit Ville-Pertti Keinonen
  1 sibling, 0 replies; 11+ messages in thread
From: T. Kurt Bond @ 2004-08-13 14:41 UTC (permalink / raw)
  To: caml-list; +Cc: T. Kurt Bond

Conservative GC has the drawback that there *are* pathological cases
where the GC, because of its conservatism, retains memory that could
actually be discarded.  If I remember correctly, the PLT Scheme folks,
who use the BDW conservative gc, apparently ran into this with the web
server that is distributed with DrScheme, and I've seen mention of
other programs that use the BDW gc occasionally running into this
problem.  In these cases it is often possible to tune the program to
eliminate the pathological case, but it's often difficult to figure
out the source of the problem.  Accurate (that is, non-conservative)
gc, of course, does not have this problem.
-- 
T. Kurt Bond, tkb@tkb.mpl.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] The tag bit
  2004-08-13 12:58   ` Christophe Raffalli
                       ` (2 preceding siblings ...)
  2004-08-13 14:28     ` skaller
@ 2004-08-13 15:40     ` Brian Hurt
  3 siblings, 0 replies; 11+ messages in thread
From: Brian Hurt @ 2004-08-13 15:40 UTC (permalink / raw)
  To: Christophe Raffalli; +Cc: Jacques Garrigue, Christoph.Bauer, caml-list

On Fri, 13 Aug 2004, Christophe Raffalli wrote:

> There is a less costly way to avoid the tag bit in integer: 
> "conservative GC": any int which happens to point in an alloccated block 
> (or only at the beginning if you do not consider C but ML) is considered 
> as a pointer. You will have very few wrong pointers (especially in the 
> second case). Moreover, array of int or float, or block of memory can be 
> tagged with a flag saying they do not old pointer.
> 
> The Boehm GC for C and C++ is very succefull to do that and very often 
> allow you to share data-structure in C as you would in ML (not caring 
> about who will release first the data) and gain both speed and memory.

This works well for languages like C/C++, where allocation is
(compartively) rare.  Ocaml programs allocate like crazy (most of the
stuff they allocate becomes garbage almost immediately, which is why they
don't take 300 terabytes of ram to run).  Cost of allocation is a very 
important number to Ocaml's performance.

The problem with Boehm-style conservative GC is that you can't do copying 
collection with it.  You're not *sure* if that word is an integer or a 
pointer, so you can't change it to move the object it's pointing to- you 
might be changing an integer, with catastrophic consequences for the 
program.  

With copying garbage collection, allocation is very, very fast.  Ocaml, on 
the x86, takes five simple instructions to allocate a block of memory (if 
you don't kick off a garbage collection).  So a high allocation rate isn't 
a big problem.  But this only works because Ocaml keeps the heap compact 
by moving objects around.  So allocating on the heap is not much slower 
than allocating on the stack- you just bump a pointer.  If you can't move 
objects around, the heap becomes fragmented- free and used blocks mixed 
together.  And you end up searching the heap for a free space when you 
want to allocate.  This isn't a problem if allocation is rare, but it's 
deadly for Ocaml.


-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] The tag bit
  2004-08-13 14:28     ` skaller
@ 2004-08-13 15:44       ` Christophe Raffalli
  0 siblings, 0 replies; 11+ messages in thread
From: Christophe Raffalli @ 2004-08-13 15:44 UTC (permalink / raw)
  To: skaller; +Cc: Jacques Garrigue, Christoph.Bauer, caml-list

skaller wrote:
> On Fri, 2004-08-13 at 22:58, Christophe Raffalli wrote:
> 
> 
>>Does anyone have  a comparison between two identical GC except one 
>>should have a tag bit and the other be conservative ?
> 
> 
> The Boehm collector is quite efficient: if you compare it
> to hand written encodings such as reference counting
> for example.
> 
> The main problem with it is that it has to 'stop the world'
> whilst it is doing its thing, and so isn't useful for
> real time applications such as a game where you can easily
> pay 20% of all CPU for the GC -- but you simply can't freeze
> up the game for 10 seconds every few minutes.
> 
> The Ocaml generational collector is likely to be much better
> at this -- some of the workload is spread over time, and
> the remaining major collection when needed will also be
> faster, and can be called manually at appropriate points.
> 
> A second point is -- Boehm cannot defragment memory.
> Ocaml can (although the compaction is 'world stop').
> 
> So .. i don't think the 'overall CPU use' of the two collector
> kinds is actually what you need to compare: the real time
> performance and/or ability to operate with C/C++ code
> are the likely issues.
> 
> 
It is not true, on some configuration (including intel I think) Boehm's 
GC can be incremental. At least the documentation say so.
-- 
Christophe Raffalli
Université de Savoie
Batiment Le Chablais, bureau 21
73376 Le Bourget-du-Lac Cedex

tél: (33) 4 79 75 81 03
fax: (33) 4 79 75 87 42
mail: Christophe.Raffalli@univ-savoie.fr
www: http://www.lama.univ-savoie.fr/~RAFFALLI
---------------------------------------------
IMPORTANT: this mail is signed using PGP/MIME
At least Enigmail/Mozilla, mutt or evolution
can check this signature
---------------------------------------------

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: AW: [Caml-list] The tag bit
  2004-08-13 14:32       ` T. Kurt Bond
  2004-08-13 14:41         ` AW: [Caml-list] Conservative GC T. Kurt Bond
@ 2004-08-13 16:14         ` Ville-Pertti Keinonen
  1 sibling, 0 replies; 11+ messages in thread
From: Ville-Pertti Keinonen @ 2004-08-13 16:14 UTC (permalink / raw)
  To: T. Kurt Bond; +Cc: Andreas Rossberg, caml-list

T. Kurt Bond wrote:

>Joel F. Bartlett's 1988 paper "Compacting garbage collection with
>ambiguous roots" describes a conservative "mostly copying" compacting
>GC scheme; his 1989 paper "Mostly-Copying Garbage Collection Picks Up
>Generations and C++" descibes a generation variation.  Frederick Smith
>and Greg Morrisett's 1997 paper "Mostly-Copying Collection: A Viable
>Alternative to Conservative Mark-Sweep" describes an improved variant
>that they compared with the BDW by using both with the TIL/C ML
>compiler.  Giuseppe Attardi, Tito Flagella, and Pietro Iglio describe
>a GC in their 1998 paper "A Customisable Memory Management Framework
>for C++" that uses mostly copying GC for the default heap.
>  
>
Don't forget G. May Yip's "Incremental, generational mostly-copying 
garbage collection in uncooperative environments".  :)

Obviously, all of these variations are subject to the problem mentioned 
by others - movable pointers must be tagged or otherwise identifiable.

I've implemented a variation of the algorithm described in G. May Yip's 
thesis - it works nicely, but my impression of the technique is that it 
inevitably has far more processing overhead than OCaml's collector (even 
without the incremental aspect).  However, it may be worthwile for some 
applications (incremental, supports a mixed environment of arbitrary 
C/C++ code that may refer to the heap of another language that does tag 
(non-)pointers).

The thing with garbage collection is that while it's often obviously a 
good solution, for some specific requirements no garbage collector seems 
"quite right", and you have to either give up on the idea of garbage 
collection or use a massively customized algorithm.

OCaml's collector is impressive - it's not (properly) incremental or 
real-time, but fast, precise, compacting and very good for a lot of 
practical purposes.  From Xavier's papers, we can see that quite a bit 
of testing and tuning has gone into it.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2004-08-13 16:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-08-12 15:22 AW: [Caml-list] The tag bit Bauer, Christoph
2004-08-13  3:53 ` Jacques Garrigue
2004-08-13 12:58   ` Christophe Raffalli
2004-08-13 13:14     ` [Caml-list] Other GC in ML family ? Diego Olivier Fernandez Pons
2004-08-13 13:24     ` AW: [Caml-list] The tag bit Andreas Rossberg
2004-08-13 14:32       ` T. Kurt Bond
2004-08-13 14:41         ` AW: [Caml-list] Conservative GC T. Kurt Bond
2004-08-13 16:14         ` AW: [Caml-list] The tag bit Ville-Pertti Keinonen
2004-08-13 14:28     ` skaller
2004-08-13 15:44       ` Christophe Raffalli
2004-08-13 15:40     ` Brian Hurt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).