caml-list - the Caml user's mailing list
From: Samuel Hornus <Samuel.Hornus@inria.fr>
To: O Caml <caml-list@inria.fr>
Subject: Re: [Caml-list] Segfault in C++ stub with many 'new' allocations
Date: Sat, 17 Nov 2012 23:45:46 +0100
Message-ID: <E36D483A-5F66-445B-849E-62F729FDB05F@inria.fr>
In-Reply-To: <CAP_800oDXyj9nQD7ip+jfriPZcSu8zCjJ+Gzt9eSOUpyX46HgA@mail.gmail.com>

On 17 nov. 2012, at 20:46, Markus Mottl wrote:

> On Sat, Nov 17, 2012 at 1:29 PM, Samuel Hornus <Samuel.Hornus@inria.fr> wrote:
>> 1/ I'm writing a stub to the C++ ANN library [1] to find geometric neighboring points in space.
>> The constructor of the main class in this library uses a lot of allocation with the "new" C++ keyword.
>> For small input point sets (e.g. 2500 points), it all seems to work fine.
>> For larger ones (50 K points), the C++ constructor crashes.
>> My question is: is it possible that the C++ "new" allocator differs enough from C-style malloc that it interacts badly with the OCaml heap?
> 
> Having written quite some C++ bindings, I'm unaware of any such
> problems.  Segfaults are typically due to incorrect interaction with
> the OCaml runtime.

It turns out I was putting a bit too much faith in libANN. I must be using it incorrectly and need to investigate further: even after removing input points with identical coordinates, it still enters an infinite loop while computing the nearest-neighbor data structure and exhausts the stack, hence the crash.
I'm glad that this is not a C++ "new" problem, but a more easily debuggable one.

>> 2/ Regarding bigarrays: before using them, I let the C++ constructor access, and keep pointers into, regular OCaml [float array] or [float array array] values. It was working well (again, for small input point sets), but is that safe? Or can the garbage collector eventually relocate the contents of a [float array] or of a [float array array], so that the pointers kept in the C++ class would become dangling?
> 
> No, it's not safe to keep pointers into any standard OCaml array if
> allocations can happen.  Kakadu mentions register_global_root in a
> reply, but it is not correct that this prevents the GC from moving
> values, it merely protects against their reclamation (i.e. it keeps
> the values live).

OK! I'll stick to bigarrays.
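
For illustration, a minimal sketch of moving point data out of a regular [float array array] and into a bigarray, whose data is stored outside the OCaml heap. The helper name points_of_arrays is hypothetical, and the float64 element kind and C layout are assumptions about what the C++ side would expect.

```ocaml
(* Bigarray is part of the standard library since OCaml 4.07; older
   compilers need to link the separate bigarray library. *)
open Bigarray

(* Hypothetical helper: copy a float array array (one row per point)
   into a 2-D float64 bigarray with C (row-major) layout. *)
let points_of_arrays (pts : float array array)
  : (float, float64_elt, c_layout) Array2.t =
  Array2.of_array float64 c_layout pts

let () =
  let ba =
    points_of_arrays [| [| 0.0; 1.0 |]; [| 2.0; 3.0 |]; [| 4.0; 5.0 |] |] in
  Printf.printf "%d points of dimension %d\n" (Array2.dim1 ba) (Array2.dim2 ba)
```

Array2.of_array copies the data, so the original OCaml arrays can be dropped once the bigarray has been built.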

> Even if you want to use bigarrays, you will have to protect them from
> being reclaimed while your C++ code is accessing them.  But otherwise
> their contents will remain fixed in memory, because unlike with
> standard arrays it lives outside the OCaml heap.

So my Ann module defines

	type internal
	type t = internal * bigarray

and the Ann.mli abstracts the type t.

The function that creates the ANN data structure returns a block of size 1 with tag Abstract_tag containing the C++ pointer, as suggested in the OCaml manual. The resulting value, call it 'ptr', has the abstract type "internal" and is stored in the pair (ptr, ba), where ba is the bigarray containing the points, so that the bigarray lives at least as long as my ANN C++ object. A destructor is attached to that pair with Gc.finalise. I believe (perhaps naively?) that this is sufficient to avoid memory corruption (unless I write stupid code in the Ann module itself).
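
A minimal sketch of the layout described above, written in pure OCaml so that it runs without the C++ library. The functions stub_create and stub_destroy are hypothetical stand-ins for the real externals, the record type replaces the Abstract_tag block, and the 2-D float64 bigarray type is an assumption.

```ocaml
open Bigarray

(* Assumed representation of the point set. *)
type points = (float, float64_elt, c_layout) Array2.t

module Ann : sig
  type t                        (* abstract, as in Ann.mli *)
  val create : points -> t
  val size : t -> int
end = struct
  (* Stand-in for the Abstract_tag block that holds the C++ pointer. *)
  type internal = { mutable alive : bool }

  (* The pair keeps the bigarray reachable as long as the handle is. *)
  type t = internal * points

  (* Hypothetical replacements for the C++ stubs. *)
  let stub_create (_ : points) : internal = { alive = true }
  let stub_destroy (p : internal) : unit = p.alive <- false

  let create ba =
    let t = (stub_create ba, ba) in
    (* Run the destructor once the pair becomes unreachable. *)
    Gc.finalise (fun (ptr, _) -> stub_destroy ptr) t;
    t

  let size (_, ba) = Array2.dim1 ba
end

let () =
  let ba = Array2.create float64 c_layout 4 3 in
  Array2.fill ba 0.0;
  let ann = Ann.create ba in
  Printf.printf "structure built over %d points\n" (Ann.size ann)
```

Because t is abstract in the signature, client code cannot separate the handle from the bigarray it depends on.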

> For numerical calculations I'd strongly suggest using bigarrays,
> especially if the C/C++ functions can take a long time to run.  In
> this case you could then release the OCaml runtime lock and benefit
> from parallelism and/or improved latencies if your OCaml program needs
> to react to the outside world.

Thank you Markus,
-- 
Sam

