From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p23IQsc7028779 for ; Thu, 3 Mar 2011 19:26:54 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjsBAAZsb01V2gB5mWdsb2JhbACmegEBAQEBCAsKBxElvlsNhVQEkAg X-IronPort-AV: E=Sophos;i="4.62,259,1297033200"; d="scan'208";a="77036767" Received: from emailfrontal2.citycable.ch ([85.218.0.121]) by mail3-smtp-sop.national.inria.fr with SMTP; 03 Mar 2011 19:26:49 +0100 X-Alinto-smtpauth-localdomain: Yes Received: from seldon (unknown [85.218.93.111]) (Authenticated sender: guillaume.yziquel@citycable.ch) by emailfrontal2.citycable.ch (Postfix) with ESMTPA id 8AEFE83408A; Thu, 3 Mar 2011 19:26:47 +0100 (CET) Received: from yziquel by seldon with local (Exim 4.72) (envelope-from ) id 1PvDCn-0000Hu-9j; Thu, 03 Mar 2011 19:24:53 +0100 Date: Thu, 3 Mar 2011 19:24:53 +0100 From: Guillaume Yziquel To: Yoann Padioleau Cc: Caml List Message-ID: <20110303182453.GF4097@localhost> References: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id p23IQsc7028779 Subject: Re: [Caml-list] tips to debug ocaml programs segfaulting Le Thursday 03 Mar 2011 à 09:56:55 (-0800), Yoann Padioleau a écrit : > Hi, > > Program received signal SIGSEGV, Segmentation fault. > [Switching to Thread 1124940096 (LWP 24796)] > 0x00000000004258d0 in caml_interprete () > (gdb) bt > #0 0x00000000004258d0 in caml_interprete () > #1 0x0000000000421c32 in caml_callbackN_exn () > #2 0x0000000000421d16 in caml_callback_exn () > #3 0x00000000004095e9 in caml_thread_start () > #4 0x000000358ac062f7 in start_thread () from /lib64/libpthread.so.0 > #5 0x000000358a0d1e3d in clone () from /lib64/libc.so.6 > (gdb) > > > At this point I don't know what to do. No idea how from this backtrace to go back to the root cause of the segfault. Any tips ? Your program seems quite complex, so it's a bit hard to tell. But since you are using bytecode callbacks, I'd be curious to know if you are having the same issue when compiled to native code. If not, then it reminds me of an issue I had. You could disassemble the caml_interprete function and follow the execution of the bytecode interpreter. A possible segfault reason in caml_interprete is the following. There is a 'movq *%rax' instruction (at least on my box) from which the interpretation of bytecode is dispatched to the appropriate case. When using bytecode closures for callbacks, it did happen that there was one indirection too much (though I do not remember doing anything wrong), and garbage was then fed to the rax register. Code jumped to a garbage position, and segfaulted. To me, that's one of the most likely segfault reasons in caml_interprete in the presence of bytecode callbacks. So disassemble caml_interprete, look at registers with info registers, and check that you're not jumping to a random position. -- Guillaume Yziquel