From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id 183CFBC57 for ; Wed, 1 Dec 2010 17:12:10 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApAFAGMC9kzV+668/2dsb2JhbACVBI8BxCeFRwSEXoky X-IronPort-AV: E=Sophos;i="4.59,283,1288566000"; d="scan'208";a="90514923" Received: from rastageeks.org (HELO mail.rastageeks.org) ([213.251.174.188]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/ADH-AES256-SHA; 01 Dec 2010 17:12:09 +0100 Received: from leonard.localnet (unknown [129.81.170.252]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.rastageeks.org (Postfix) with ESMTPSA id 610328899F for ; Wed, 1 Dec 2010 17:12:18 +0100 (CET) From: Romain Beauxis To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Tips to find the cause of a seg fault Date: Wed, 1 Dec 2010 10:15:33 -0600 User-Agent: KMail/1.13.5 (Linux/2.6.32-4-amd64; KDE/4.4.5; x86_64; ; ) References: <201011301959.04218.toots@rastageeks.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: quoted-printable Message-Id: <201012011015.34879.toots@rastageeks.org> X-Spam: no; 0.00; segfault:01 uncomment:01 uncomment:01 instanciated:01 unrolling:01 instanciated:01 gdb:01 segfault:01 backtrace:01 ocaml:01 caml-list:01 partially:02 binary:02 confusing:02 seems:03 Le mercredi 1 d=E9cembre 2010 09:17:15, Philippe Veber a =E9crit : > > The function that triggers the segfault may be confusing, in particular > > in case of a memory corruption, which I suspect here. > > The pattern matching can cause a crash because it is using a value that > > is already corrupted and because the third case is one that, for some > > random conditions, touches the part in memory that is corrupted. >=20 > How is this possible if it is never reached (no left click) ? Well, I was giving a general reply which may or may not apply here.. The fact that the problem goes away when you uncomment the unused case coul= d=20 be unrelated, though. It could also be that the issue is not related to thi= s=20 exact function but that the compiled binary has a different execution flow = when=20 you uncomment the third case.. > > In this case, I would try to unroll the code and see where the value th= at > > is > > used in this function was instanciated. >=20 > What do you mean by "unrolling the code" ? Looking backward where the value used in the function was instanciated. > > Finally, the best approach could be to actually look closely to the > > binding's > > code and try to spot anything fishy there related to your issue. This > > generaly > > worked better for me than trying to get information from gdb and the > > like.. >=20 > Many thanks for the clarification. Maybe I could (partially) "unplug" the > GC by setting space_overhead to 100 ? That could give an indication on the > moment the problem occurs ? I've never tried this. What you can try also for instance is to comment the= =20 code that finalizes a value that you suspect causes the segfault.. However, I do not think your issue is related to the Gc. The backtrace does= =20 not seem to indicate that it occurs in C code with global lock removed so=20 maybe what I said was irrelevant. I have tried your minimal example and it does not segfault here too. As for= =20 Olivier, maybe this means that this is also related to the driver you are=20 using. However, the segfault definitely seem to occur in ocaml part of the = code=20 so it seems that the problem is entangled, at least. Romain Romain