caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* finalisation
@ 2000-02-03 14:06 skaller
  0 siblings, 0 replies; 3+ messages in thread
From: skaller @ 2000-02-03 14:06 UTC (permalink / raw)
  To: caml list

I'm not sure if my previous email got through to this list,
so I'll address this issue again.

I've installed David's ocaml 2.999, which supports ocaml finalisation
functions.

The semantics provided offer a number of significant advances over other
finalisation technology, particularly the allowance of multiple
finalisers,
and the fact that finalisers can cause an object to become reachable --
without any problems. Contrast this to the mess Java made of
finalisation.

However, the code I have has a serious deficiency: there is no way
to control the _order_ in which finalisers are invoked, and in
particular,
the order implemented is 'first in, first finalised', which is usually
the _wrong_ order for me.

In Vyper, my ocaml based Python interpreter, I have now enabled
python __del__ methods, which are finalisers for python class instances.
CPython uses reference counting, and so it finalises an acyclic graph
of objects top down: first the finaliser (__del__ method) of the root
object is invoked, possibly causing some child object's ref counts
to go to zero, in which case they're finalised recursively, then the
attributes of the object are detached, causing all remaining child
reference count to be decremented, possibly going to zero (and invoking
finalisers).

This scheme has two advantages: finalisers are invoked as early as
possible, and always in the correct order, and it has a disadvantage,
that circular references leave garbage.

Using the new finalisation feature, synchronous finalisation
is lost, which is not really a problem, whereas garbage is correctly
collected, which is a distinct advantage.

However, the problem is that the finalisers are invoked
in an 'arbitrary' order, which happens to be the order of
object construction. This is no good at all. In a typical
scenario, child objects are created first, and passed as
arguments to the parent constructor. Finalising the child
objects first, before the parent, makes it impossible to
correctly execute the parent finaliser, since it assumes
the parent object maintains invariants, including the ability
to access fully constructed children.

When there is an acyclic graph of references, the parent MUST be
finalised before any children. If the order of finalisers
was first in, last finalised, then more code would work correctly
(almost nothing works correctly at the moment). However, this isn't
really an adequate solution either.

An algorithm which finds a garbage object on which no other garbage
objects depend (if any) can proceed to safely finalise that object.
By applying this algorithm repeatedly to all garbage objects, a correct
order
of finalisation is found for acyclic graphs of objects.

In principle, the garbage collector must already know how to do this,
although I do not know the details enough to know if there is a
convenient
way to reuse existing code to sequence finalisers correctly in this
case.

This leaves the problem of cyclic references. In Python, this will
rarely
be a problem because where finalisation is important, the client must
already break any cycles manually: the only potential issue would be
if the code depended on a finaliser of an object in a cycle which is not
broken NOT being executed -- and ocaml actually invoked it.

However, to extend the available functionality, it would be useful
to be able to break cycles. Clearly, in a cycle, if there is only
one object with a finaliser, it can be invoked without a problem.
[Note it is important never to _collect_ any objects which a finaliser
_might_ refer to!]

This leaves the problem of a cycle with more than one finaliser.
In this case, for Python, there is no way to control the order
of finalisation (because it is never done). So it would probably
be reasonable to invoke the finalisers in an arbitrary order.
However, in the case of _other_ software, a different strategy might
be required (weak pointers? explicit declaration of a finalisation
order?)

I'd be very interested to see if it is possible to produce a
finalisation
strategy which at least handles acyclic graphs, since this is necessary
for
my interpreter to operate correctly. I'd be even more interested
in any comments on how to handle cycles (I have no idea, really).

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
homepage: http://www.maxtal.com.au/~skaller
download: ftp://ftp.cs.usyd.edu/au/jskaller



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: finalisation
  2000-02-04 16:27 finalisation Damien Doligez
@ 2000-02-04 18:36 ` skaller
  0 siblings, 0 replies; 3+ messages in thread
From: skaller @ 2000-02-04 18:36 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list

Damien Doligez wrote:
> 
> >From: skaller <skaller@maxtal.com.au>

> If python specifies reference counting, I guess the only solution is
> to implement reference counting.  

	Python specifically _doesn't_ specify reference counting,
it specifies garbage collection. But the current implementation
is deficient in that it uses reference counting. So I need to
be compatible with existing programs at least as far as preserving
their correctness provided only that doesn't depend on synchronous
finalisation (a delay is OK, a change of order may not be).

>You can use O'Caml's finalisers to
> help you in the following way:
> 
> Add a reference count field to each finalisable object.  Initialise it
> to 1 upon allocation and register a finalisation function for this
> object.  Increment the reference count when you create a new reference
> from another finalised object, and decrement it when you remove a
> reference (you can do it by simply calling its finalisation function).

	I'm not sure how this works .. a reference from a finalised
object to another can be indirect (via another object). I'd like to
avoid
the cost of maintaining reference counts for non-finalisable objects.


-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
homepage: http://www.maxtal.com.au/~skaller
download: ftp://ftp.cs.usyd.edu/au/jskaller




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re:  finalisation
@ 2000-02-04 16:27 Damien Doligez
  2000-02-04 18:36 ` finalisation skaller
  0 siblings, 1 reply; 3+ messages in thread
From: Damien Doligez @ 2000-02-04 16:27 UTC (permalink / raw)
  To: caml-list, skaller

>From: skaller <skaller@maxtal.com.au>

>In Vyper, my ocaml based Python interpreter, I have now enabled
>python __del__ methods, which are finalisers for python class instances.
>CPython uses reference counting, and so it finalises an acyclic graph
>of objects top down: first the finaliser (__del__ method) of the root
>object is invoked, possibly causing some child object's ref counts
>to go to zero, in which case they're finalised recursively, then the
>attributes of the object are detached, causing all remaining child
>reference count to be decremented, possibly going to zero (and invoking
>finalisers).

If python specifies reference counting, I guess the only solution is
to implement reference counting.  You can use O'Caml's finalisers to
help you in the following way:

Add a reference count field to each finalisable object.  Initialise it
to 1 upon allocation and register a finalisation function for this
object.  Increment the reference count when you create a new reference
from another finalised object, and decrement it when you remove a
reference (you can do it by simply calling its finalisation function).

The finalisation function will do the following:

1. Decrement the reference count.
2. If the reference count is not 0, stop here.
3. Do your finalisation work for this object.
4. Call the finalisation function of each object that is referenced by
   this one.

The reference keeps track of other finalisable objets that point to
this one, plus one (the initial value) that keeps track of all other
pointers to this object, with the help of O'Caml's GC.


>An algorithm which finds a garbage object on which no other garbage
>objects depend (if any) can proceed to safely finalise that object.
>By applying this algorithm repeatedly to all garbage objects, a correct
>order of finalisation is found for acyclic graphs of objects.
>
>In principle, the garbage collector must already know how to do this,
>although I do not know the details enough to know if there is a
>convenient way to reuse existing code to sequence finalisers
>correctly in this case.

No.  The garbage collector never looks at the pointers between dead
objects.  That's one of the reasons why it is a lot more efficient
than reference counting.

-- Damien




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2000-02-04 18:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-02-03 14:06 finalisation skaller
2000-02-04 16:27 finalisation Damien Doligez
2000-02-04 18:36 ` finalisation skaller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).