From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p23Hv5QO027364 for ; Thu, 3 Mar 2011 18:57:05 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApEBAHdlb01QDPKCkWdsb2JhbACmegEBAQEJCwoHEQMivl6FYQSFGopU X-IronPort-AV: E=Sophos;i="4.62,259,1297033200"; d="scan'208";a="101205422" Received: from smtp08.smtpout.orange.fr (HELO smtp.smtpout.orange.fr) ([80.12.242.130]) by mail1-smtp-roc.national.inria.fr with ESMTP; 03 Mar 2011 18:57:00 +0100 Received: from [172.24.131.25] ([66.220.144.74]) by mwinf5d16 with ME id Ehww1g00B1cXi5u03hww7L; Thu, 03 Mar 2011 18:56:57 +0100 From: Yoann Padioleau Content-Type: text/plain; charset=us-ascii Date: Thu, 3 Mar 2011 09:56:55 -0800 Message-Id: To: Caml List Mime-Version: 1.0 (Apple Message framework v1082) X-Mailer: Apple Mail (2.1082) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id p23Hv5QO027364 Subject: [Caml-list] tips to debug ocaml programs segfaulting Hi, I have a quite large program that segfaults. I can reproduce the segfault deterministically but have no idea how to fix it. The program is a server that given a filename lookup information in a berkley DB database on this file and then returns some results. For certain files everything is right but for other files the program just segfault. When I attach with gdb on the server here is what I get: [pad@unittest002 ~]$ gdb /home/engshare/tools/pfff_server 22436 GNU gdb Red Hat Linux (6.5-37.el5_2.2rh) ... Attaching to program: /home/engshare/tools/pfff_server, process 22436 ... Reading symbols from /lib64/libpcre.so.0...done. Loaded symbols for /lib64/libpcre.so.0 Reading symbols from /lib64/libdb-4.3.so...done. Loaded symbols for /lib64/libdb-4.3.so Reading symbols from /lib64/libpthread.so.0...done. [Thread debugging using libthread_db enabled] [New Thread 46912496215408 (LWP 22436)] [New Thread 1176140096 (LWP 23759)] Loaded symbols for /lib64/libpthread.so.0 Reading symbols from /lib64/libm.so.6...done. Loaded symbols for /lib64/libm.so.6 Reading symbols from /lib64/libdl.so.2...done. Loaded symbols for /lib64/libdl.so.2 Reading symbols from /usr/lib64/libncurses.so.5...done. Loaded symbols for /usr/lib64/libncurses.so.5 Reading symbols from /lib64/libc.so.6...done. Loaded symbols for /lib64/libc.so.6 Reading symbols from /lib64/ld-linux-x86-64.so.2...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 0x000000358ac0ceab in accept () from /lib64/libpthread.so.0 (gdb) bt #0 0x000000358ac0ceab in accept () from /lib64/libpthread.so.0 #1 0x000000000040de8f in unix_accept () #2 0x0000000000425dd9 in caml_interprete () #3 0x000000000041317a in caml_main () #4 0x00000000004249cc in main () (gdb) run The program being debugged has been started already. Start it from the beginning? (y or n) n Program not restarted. (gdb) (gdb) continue Continuing. [New Thread 1124940096 (LWP 24691)] [Thread 1124940096 (LWP 24691) exited] [New Thread 1124940096 (LWP 24723)] [Thread 1124940096 (LWP 24723) exited] [New Thread 1124940096 (LWP 24758)] [Thread 1124940096 (LWP 24758) exited] [New Thread 1124940096 (LWP 24796)] Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 1124940096 (LWP 24796)] 0x00000000004258d0 in caml_interprete () (gdb) bt #0 0x00000000004258d0 in caml_interprete () #1 0x0000000000421c32 in caml_callbackN_exn () #2 0x0000000000421d16 in caml_callback_exn () #3 0x00000000004095e9 in caml_thread_start () #4 0x000000358ac062f7 in start_thread () from /lib64/libpthread.so.0 #5 0x000000358a0d1e3d in clone () from /lib64/libc.so.6 (gdb) At this point I don't know what to do. No idea how from this backtrace to go back to the root cause of the segfault. Any tips ?