caml-list - the Caml user's mailing list
From: "\"Martin R. Neuhäußer\"" <post@marneu.com>
To: "Török Edwin" <edwin+ml-ocaml@etorok.net>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Presumed bug in OCaml's garbage collector or in its Weak module...
Date: Fri, 17 Jun 2016 22:29:35 +0200	[thread overview]
Message-ID: <01CFEDAE-C1CB-4968-9FDF-1EC004CC027A@marneu.com> (raw)
In-Reply-To: <3673db34-9350-a9b1-fcd5-e7593ba0fd01@etorok.net>

Thanks for the good point! And yes, the weak sets use Weak.get_copy internally.

However, the exact semantics of Weak.get_copy are still a bit unclear to me. It creates a "shallow copy" of a value, and that copy might be garbage collected independently of the original. If the shallow copy can survive the finalization of its source, that might cause the problem. Are there any restrictions enforced on values obtained by Weak.get_copy? Or could this even be misused to create arbitrary copies of custom blocks that propagate anywhere?
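
To make the concern concrete, here is a minimal pure-OCaml sketch. It is not the C-stub test case from the repository: the type `handle`, the marker `invalid`, and the function `finalize` are hypothetical stand-ins for a custom block, its invalid-id marker, and its C finalizer. The finalizer is called by hand rather than by the GC so that the result does not depend on collection timing. How Weak.get_copy treats real custom blocks may differ between OCaml versions.

```ocaml
(* Hypothetical stand-in for a custom block: [res] plays the role of the
   C pointer stored inside the block. *)
type handle = { res : int ref }

(* Hypothetical marker value, loosely analogous to c_data_invalid_id. *)
let invalid = -1

(* Models what a C finalizer might do to the out-of-heap data. *)
let finalize (h : handle) = h.res := invalid

(* Returns three flags:
   - whether the copy is physically the same block as the original,
   - whether the copy shares the underlying resource with the original,
   - whether the copy observes the invalidated resource after the
     modeled finalizer has run on the original. *)
let probe () : bool * bool * bool =
  let orig = { res = ref 42 } in
  let w = Weak.create 1 in
  Weak.set w 0 (Some orig);
  match Weak.get_copy w 0 with
  | None -> failwith "weak slot unexpectedly empty"
  | Some copy ->
    let same_block = copy == orig in
    let shared = copy.res == orig.res in
    (* Run the modeled finalizer by hand instead of waiting for the GC. *)
    finalize orig;
    (same_block, shared, !(copy.res) = invalid)

let () =
  let same_block, shared, stale = probe () in
  Printf.printf
    "same block: %b, shared resource: %b, copy sees invalid: %b\n"
    same_block shared stale
```

For an ordinary heap-allocated record like this one, the copy is a separate block whose field still points at the same data, so anything done to that data on behalf of the original is visible through the copy.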

I will try to extend the example so that it triggers the problem outside the compare and hash functions, i.e. outside the functions to which the weak set code might pass a shallow copy.

Nevertheless, the behavior seems to have changed somewhere between OCaml 4.01.1 and 4.02.0. The Mantis tickets 7161 and 7157 seem to affect OCaml's behavior as well. That might explain why the test case succeeds with the first beta of OCaml 4.03.0.

Best,
Martin

> On 17.06.2016 at 21:41, Török Edwin <edwin+ml-ocaml@etorok.net> wrote:
> 
> On 06/17/2016 09:09 PM, "Martin R. Neuhäußer" wrote:
>> Dear all,
>> 
>> after an intense week of debugging some large C bindings, I believe I have found a bug in OCaml's garbage collection code or in its Weak module.
>> To summarize, it seems as if OCaml’s garbage collector and its Weak module sometimes release a custom block (by calling its finalizer) too early, i.e. when it is still reachable.
>> 
>> I hesitate a bit before opening an "official" issue on Mantis, as I might very well be overlooking some detail. Therefore I'd like to double-check some of the assumptions that I made when writing my C stubs. Any corrections are highly welcome:
>> 1. Custom blocks may be moved around in memory by the GC, but they are never duplicated.
>> 2. The finalizer for each custom block that is allocated by caml_alloc_custom is called at most once.
>> 3. The finalizer is never called for blocks that are still alive; stated otherwise, a block that has been finalized can never again be presented as a value to a C stub.
> 
> Shallow copies don't seem very safe in the presence of out-of-OCaml-heap C pointers [*].
> Perhaps Weak.get_copy should raise if it encounters a Custom_tag?
> 
> The custom value may contain C pointers that may be invalidated or changed if the custom value has a finalizer (as in the test case):
> When the original weak value has no more references, its finalizer is invoked and the value is garbage collected
> (in this case the finalizer changes unique_id to c_data_invalid_id, but it might just as well free the data instead, causing a crash).
> 
> Meanwhile the copy made through Weak.get_copy still lives, and shares the copied custom value (pointer) from the weak value that got finalized.
> Operations on the copied value (e.g. comparison/hashing) will try to access the custom value, which points to c_data_invalid_id.
> 
> 
>> There is a small example program available on github: https://github.com/martin-neuhaeusser/ocaml_bug where the above assumptions seem to be violated.
> 
> [*] AFAICT a shallow copy is created by WeakDummySet.find -> Weak.get_copy
> 
> Best regards,
> --
> Edwin Török | Co-founder and Lead Developer
> 
> Skylable open-source object storage: reliable, fast, secure
> http://www.skylable.com
> 

--
Martin Neuhäußer							Phone: +49 (911) 49051223
Normannenstraße 15						Mobile: +49 (172) 8966488
D-90461Nürnberg							GnuPG: 0x16FDB298


Thread overview: 14+ messages
2016-06-17 18:09 "Martin R. Neuhäußer"
2016-06-17 19:23 ` Gabriel Scherer
2016-06-18 15:15   ` "Martin R. Neuhäußer"
2016-06-17 19:41 ` Török Edwin
2016-06-17 20:29   ` "Martin R. Neuhäußer" [this message]
2016-06-18 10:59     ` Josh Berdine
2016-06-18 15:11   ` "Martin R. Neuhäußer"
2016-06-18 16:54   ` Leo White
2016-06-21 12:43     ` François Bobot
2016-06-21 19:37       ` Alain Frisch
2016-06-22  8:12         ` François Bobot
2016-06-18 17:39 ` "Martin R. Neuhäußer"
2016-06-21 11:55   ` François Bobot
2016-06-27 11:35     ` AW: " Neuhaeusser, Martin
