From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id F3518BC57 for ; Thu, 18 Mar 2010 11:58:03 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AhUCAFKkoUvZSMDjkWdsb2JhbACbLBUBAQEBCQsKBxMDH7dthHgE X-IronPort-AV: E=Sophos;i="4.51,265,1267398000"; d="scan'208";a="59290021" Received: from fmmailgate02.web.de ([217.72.192.227]) by mail4-smtp-sop.national.inria.fr with ESMTP; 18 Mar 2010 11:58:02 +0100 Received: from smtp05.web.de (fmsmtp05.dlan.cinetic.de [172.20.4.166]) by fmmailgate02.web.de (Postfix) with ESMTP id 8E475155B249B for ; Thu, 18 Mar 2010 11:56:41 +0100 (CET) Received: from [78.43.204.177] (helo=frosties.localdomain) by smtp05.web.de with asmtp (TLSv1:AES256-SHA:256) (WEB.DE 4.110 #4) id 1NsDP6-0001Zw-00 for caml-list@inria.fr; Thu, 18 Mar 2010 11:56:40 +0100 Received: from mrvn by frosties.localdomain with local (Exim 4.71) (envelope-from ) id 1NsDP6-0002Ll-2N for caml-list@inria.fr; Thu, 18 Mar 2010 11:56:40 +0100 From: Goswin von Brederlow To: caml-list Subject: Re: Random segfaults / out of memory References: <87wrxb5p59.fsf@frosties.localdomain> <20100317083907.GC26002@janestreet.com> <87sk7zgtpq.fsf@frosties.localdomain> <87hboehphv.fsf_-_@frosties.localdomain> Date: Thu, 18 Mar 2010 11:56:40 +0100 In-Reply-To: <87hboehphv.fsf_-_@frosties.localdomain> (Goswin von Brederlow's message of "Wed, 17 Mar 2010 17:39:08 +0100") Message-ID: <874okd2907.fsf@frosties.localdomain> User-Agent: Gnus/5.110009 (No Gnus v0.9) XEmacs/21.4.22 (linux, no MULE) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: goswin-v-b@web.de X-Sender: goswin-v-b@web.de X-Provags-ID: V01U2FsdGVkX18aAM6APv+Xm7W8VwGbQt5VZJOKmxsKicQNu2IF wvB6JyEJYYOMt9GWjeMQDGFPf6kWLaQ0J2/QD1munsGJGq0su3 0hCryp6OE= X-Spam: no; 0.00; segfaults:01 0100,:01 bigarrays:01 patched:01 ocaml:01 compiler:01 segfaults:01 allocations:01 camlparam:01 camlparam:01 finalizer:01 camlprim:01 bigarray:01 struct:01 struct:01 Goswin von Brederlow writes: > Goswin von Brederlow writes: > >>> On Wed, Mar 17, 2010 at 09:27:30AM +0100, Goswin von Brederlow wrote: >>>> I want to rewrite the Digest module to expose a more lowlevel interface >>>> to the md5 digest and add support to digest Bigarrays. I've patched the >>>> respective files involved and it all looks alright but when I try to >>>> build ocaml I get the following error: > > Ok, so I managed to bootstrap the compiler properly and build debian > packages with my new Digest interface. But something is still wrong as I > randomly get segfaults or > > Fatal error: out of memory. > > The more threads I use to compute digests in parallel the more likely > the error becomes. But that might just be an issue with more allocations > hapening and not a race condition between threads. I finaly tracked down the issue with the help of Erkki Seppala. First problem: I had the function declared as "noalloc" but used CAMLparam2() in it. That seems to cause random segfaults. I don't understand why but if I remove the "noalloc" then it works. Second problem: When I remove the CAMLparam2() the finalizer is called too early: CAMLprim value md5_update_bigarray(value context, value vb) { //CAMLparam2(context, vb); struct helper *helper = (struct helper*)Data_custom_val(context); struct MD5Context *ctx = helper->ctx; fprintf(stderr, "update_bigarray: helper = %p, ctx = %p\n", helper, ctx); struct caml_ba_array * b = Caml_ba_array_val(vb); unsigned char *data = b->data; uintnat len = caml_ba_byte_size(b); caml_enter_blocking_section(); caml_MD5Update(ctx, data, len); caml_leave_blocking_section(); //CAMLreturn(Val_unit); return Val_unit; } let rec loop () = Mutex.lock mutex; if !num = 0 then Mutex.unlock mutex else begin decr num; Mutex.unlock mutex; let context = context () in let () = update_bigarray context buf in loop () end in loop () This sometimes results in the following code flow: context () <- allocates memory update_bigarray context buf caml_enter_blocking_section(); THREAD SWITCH GC runs context is finalized <- frees memory THREAD SWITCH BACK caml_MD5Update(ctx, data, len); <- writes to ctx which is freeed Looks like ocamlopt really is so smart that is sees that context is never used after the call to update_bigarray and removes it from the root set before calling update_bigarray. It assumes the update_bigarray will hold all its arguments alive itself, which is a valid assumption. This is a tricky situation. The md5_update_bigarray() on its own is a "noalloc" function. But due to the caml_enter_blocking_section() another thread can alloc and trigger a GC run in parallel. So I guess that makes the function actually not "noalloc". Well, problem solved, lesson learned. :) MfG Goswin