From: Yoann Padioleau
Date: Thu, 3 Mar 2011 10:10:36 -0800
To: Caml List
Subject: Re: [Caml-list] tips to debug ocaml programs segfaulting

And this is what I get in native mode:

[pad@unittest002 ~]$ gdb /home/engshare/tools/pfff_server.opt 28322
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".../home/pad/.gdbinit:1: Error in sourced command file:
Undefined command: "python". Try "help".
Using host libthread_db library "/lib64/libthread_db.so.1".
Attaching to program: /home/engshare/tools/pfff_server.opt, process 28322
Reading symbols from /lib64/libpcre.so.0...done.
Loaded symbols for /lib64/libpcre.so.0
Reading symbols from /lib64/libdb-4.3.so...done.
Loaded symbols for /lib64/libdb-4.3.so
Reading symbols from /lib64/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 46912496213936 (LWP 28322)]
[New Thread 1176140096 (LWP 28627)]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x000000358ac0ceab in accept () from /lib64/libpthread.so.0
(gdb) continue
Continuing.
[New Thread 1124940096 (LWP 28767)]
[Thread 1124940096 (LWP 28767) exited]
[New Thread 1124940096 (LWP 28808)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1124940096 (LWP 28808)]
0x00000000004d06e6 in camlVisitor_php__v_paren_685 ()
(gdb) bt
#0 0x00000000004d06e6 in camlVisitor_php__v_paren_685 ()
#1 0x00002aaaaaad71e8 in ?? ()
#2 0x00002aaaaaad7500 in ?? ()
#3 0x0000000000000b00 in ?? ()
#4 0x00000000004cdde2 in camlVisitor_php__v_variablebis_779 ()
#5 0x00002aaaaaad7fb0 in ?? ()
#6 0x000000001769e888 in ?? ()
#7 0x00000000430d2a80 in ?? ()
#8 0x00000000004caca8 in camlVisitor_php__k_1608 ()
...

The Visitor_php.v_paren function is, as the name suggests, part of a set of functions that help visit the AST of a PHP program. This AST is marshalled into Berkeley DB tables. I guess that is one possible cause of this segfault: a bug in Berkeley DB that causes the AST to be marshalled incorrectly, so that unmarshalling it causes a segfault?

On Mar 3, 2011, at 9:56 AM, Yoann Padioleau wrote:

> Hi,
>
> I have a quite large program that segfaults. I can reproduce the segfault deterministically but have no idea
> how to fix it.
> The program is a server that, given a filename, looks up information about this file in a Berkeley DB database
> and then returns some results. For certain files everything works, but for other files the program just segfaults.
> When I attach gdb to the server, here is what I get:
>
> [pad@unittest002 ~]$ gdb /home/engshare/tools/pfff_server 22436
> GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
> ...
> Attaching to program: /home/engshare/tools/pfff_server, process 22436
> ...
> Reading symbols from /lib64/libpcre.so.0...done.
> Loaded symbols for /lib64/libpcre.so.0
> Reading symbols from /lib64/libdb-4.3.so...done.
> Loaded symbols for /lib64/libdb-4.3.so
> Reading symbols from /lib64/libpthread.so.0...done.
> [Thread debugging using libthread_db enabled]
> [New Thread 46912496215408 (LWP 22436)]
> [New Thread 1176140096 (LWP 23759)]
> Loaded symbols for /lib64/libpthread.so.0
> Reading symbols from /lib64/libm.so.6...done.
> Loaded symbols for /lib64/libm.so.6
> Reading symbols from /lib64/libdl.so.2...done.
> Loaded symbols for /lib64/libdl.so.2
> Reading symbols from /usr/lib64/libncurses.so.5...done.
> Loaded symbols for /usr/lib64/libncurses.so.5
> Reading symbols from /lib64/libc.so.6...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> 0x000000358ac0ceab in accept () from /lib64/libpthread.so.0
> (gdb) bt
> #0 0x000000358ac0ceab in accept () from /lib64/libpthread.so.0
> #1 0x000000000040de8f in unix_accept ()
> #2 0x0000000000425dd9 in caml_interprete ()
> #3 0x000000000041317a in caml_main ()
> #4 0x00000000004249cc in main ()
> (gdb) run
> The program being debugged has been started already.
> Start it from the beginning? (y or n) n
> Program not restarted.
> (gdb)
> (gdb) continue
> Continuing.
> [New Thread 1124940096 (LWP 24691)]
> [Thread 1124940096 (LWP 24691) exited]
> [New Thread 1124940096 (LWP 24723)]
> [Thread 1124940096 (LWP 24723) exited]
> [New Thread 1124940096 (LWP 24758)]
> [Thread 1124940096 (LWP 24758) exited]
> [New Thread 1124940096 (LWP 24796)]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 1124940096 (LWP 24796)]
> 0x00000000004258d0 in caml_interprete ()
> (gdb) bt
> #0 0x00000000004258d0 in caml_interprete ()
> #1 0x0000000000421c32 in caml_callbackN_exn ()
> #2 0x0000000000421d16 in caml_callback_exn ()
> #3 0x00000000004095e9 in caml_thread_start ()
> #4 0x000000358ac062f7 in start_thread () from /lib64/libpthread.so.0
> #5 0x000000358a0d1e3d in clone () from /lib64/libc.so.6
> (gdb)
>
>
> At this point I don't know what to do. I have no idea how to get from this backtrace back to the root cause of the segfault. Any tips?
>
>
> --
> Caml-list mailing list. Subscription management and archives:
> https://sympa-roc.inria.fr/wws/info/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
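
A note on the unmarshalling hypothesis from the top of this message: Marshal performs no integrity or type checking, so a corrupted record can produce a malformed heap value that only crashes later, for example inside a visitor function. One way to test the hypothesis is to store a digest next to each marshalled record and verify it before unmarshalling. The sketch below is only an illustration under that assumption. The names pack and unpack and the toy expr type are hypothetical; they are not part of pfff or of any Berkeley DB binding.

```ocaml
(* Hypothetical helpers, not part of pfff or of any Berkeley DB binding:
   prefix a marshalled value with its MD5 digest so that a corrupted
   record is rejected before Marshal ever looks at it. *)

(* Serialize [v] and prepend the 16-byte MD5 digest of the payload. *)
let pack (v : 'a) : string =
  let payload = Marshal.to_string v [] in
  Digest.string payload ^ payload

(* Return [Some v] if the stored digest matches the payload, [None]
   otherwise. The caller must still annotate the expected type, because
   Marshal does not check it. *)
let unpack (s : string) : 'a option =
  let dlen = 16 in
  if String.length s < dlen then None
  else
    let digest = String.sub s 0 dlen in
    let payload = String.sub s dlen (String.length s - dlen) in
    if Digest.string payload = digest
    then Some (Marshal.from_string payload 0)
    else None

(* Toy stand-in for an AST type; the real PHP AST is much larger. *)
type expr = Var of string | Paren of expr

let () =
  let record = pack (Paren (Var "x")) in
  (* An intact record round-trips. *)
  (match (unpack record : expr option) with
   | Some (Paren (Var "x")) -> print_endline "ok"
   | _ -> print_endline "unexpected");
  (* Flip one bit of the payload to simulate on-disk corruption. *)
  let bad = Bytes.of_string record in
  let i = Bytes.length bad - 1 in
  Bytes.set bad i (Char.chr (Char.code (Bytes.get bad i) lxor 1));
  match (unpack (Bytes.to_string bad) : expr option) with
  | None -> print_endline "corrupted record rejected"
  | Some _ -> print_endline "corruption not detected"
```

If the digest check fails for exactly the files that segfault, the stored data is the likely culprit. If it passes, the corruption probably happens elsewhere, for instance a type mismatch between the writer and the reader, which a digest cannot detect.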