caml-list - the Caml user's mailing list
* Re: [Caml-list] debugger losing contact with debuggee process
       [not found] <Pine.BSF.4.44.0210200815490.98557-100000@bowser.eecs.harvard.edu>
@ 2002-10-21  6:58 ` Pierre Weis
  0 siblings, 0 replies; 4+ messages in thread
From: Pierre Weis @ 2002-10-21  6:58 UTC (permalink / raw)
  To: Lex Stein; +Cc: pierre.weis, caml-list

> > The information that ocamldebug gave you is helpful: it provides the
> > means to go back to well before the bus error (Time 290000), and guarantees
> > that the bus error will occur before Time 300000. To go (go) to just before
> > the bus error and then ask for a backtrace (bt) is just a matter of
> > dichotomy and is very fast. Once you're very near the bus error, you can
> > use stepping (s), print (p), and next event (n) to
> > understand what happens.
> 
> Yes, but then I need to start up the program again, goto time 290000 and
> step through the code until I hit the bus error. Then I need to note which
> time it happened at, reload the program again, goto that time, and do a
> backtrace.

You need not reload the program: as shown in your trace, the
debugger automatically reconnects to the nearest checkpoint.

> The stepping through the code part is of concern to me. There are 10000
> events between time 290000 and 300000. Am I really expected to press "n"
> 10000 times (this is the worst case, but the expected number is 5000,
> which is still a large number for an interactive human operation)? I think
> not.

Dichotomy is there to spare you this tedious task.
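
A minimal sketch of that dichotomy, written as plain OCaml rather than as
debugger commands. It is only an illustration: alive_at is a hypothetical
predicate standing for "goto this time in ocamldebug and check that the
connection was not lost", and the crash time 297531 is made up. For a window
of 10000 events the search needs at most 14 probes.

```ocaml
(* Sketch of the dichotomy over debugger times.
   [alive_at t] is a hypothetical predicate: it stands for "goto time t
   in the debugger and check that the debuggee is still alive". *)
let rec last_alive alive_at lo hi =
  (* Invariant: alive_at lo = true and alive_at hi = false. *)
  if hi - lo <= 1 then lo
  else
    let mid = lo + (hi - lo) / 2 in
    if alive_at mid then last_alive alive_at mid hi
    else last_alive alive_at lo mid

(* Count the probes needed for the 290000..300000 window,
   pretending the crash happens at the made-up time 297531. *)
let () =
  let probes = ref 0 in
  let alive_at t = incr probes; t < 297531 in
  let t = last_alive alive_at 290000 300000 in
  Printf.printf "last good time: %d, found in %d probes\n" t !probes
```

In practice each probe is one goto in the debugger; the returned time is the
one to go to before asking for a backtrace and stepping forward.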

> So I jump forward to 295000, see if the core dump happens between
> 290000 and 295000, and repeat. Is this the suggested approach? This is a
> binary search to find the location of the fault. I like O(log n)
> operations, but I like O(1) operations better. Loading up a core dump file
> and doing a backtrace is an O(1) operation for me.

Yes, but there is no such feature in the debugger. Sorry for that.

> Another concern I have with the approach of goto time 290000 is that the
> fault is caused by an external event (receiving an RPC) so it will not
> always be at the same time. What would I do first, the goto or trigger the
> external event ? Both would seem to be problematic.

Yes, our debugger is geared towards debugging deterministic
programs. The first thing to do in order to use it is to make your
program deterministic.

In any case, debugging non-deterministic programs is extremely challenging
and we don't know how to do it in general. (Nor is our debugger
able to help you with that.)
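
One possible way to make a program driven by external events more
deterministic is to record those events once and replay them on later runs.
The sketch below is an assumption about how that could look, not something
the debugger provides: every name in it is hypothetical, and it assumes each
event (for instance an RPC payload) fits on one line of text. If nothing else
in the program depends on timing, replayed runs should then reach the fault
at the same point each time.

```ocaml
(* Sketch: make a run reproducible by replaying recorded external events.
   All names are hypothetical; an "event" is whatever the program receives
   from outside, assumed here to be a single line of text. *)

(* Record mode: take events from the real source and append them to a log. *)
let recording_source (real : unit -> string option) (log : out_channel) =
  fun () ->
    match real () with
    | Some ev as r ->
        output_string log ev; output_char log '\n'; flush log; r
    | None -> None

(* Replay mode: ignore the outside world and read events back from the log. *)
let replaying_source (log : in_channel) =
  fun () -> try Some (input_line log) with End_of_file -> None

(* The rest of the program only sees a [unit -> string option]. *)
let rec run next handle =
  match next () with
  | Some ev -> handle ev; run next handle
  | None -> ()

let () =
  let file = Filename.temp_file "events" ".log" in
  (* First run: events come from a stand-in for the live source. *)
  let pending = ref ["put k1"; "put k2"; "get k1"] in
  let live () =
    match !pending with
    | [] -> None
    | ev :: rest -> pending := rest; Some ev in
  let oc = open_out file in
  run (recording_source live oc) (fun _ -> ());
  close_out oc;
  (* Later runs, e.g. under the debugger: same events, same order. *)
  let ic = open_in file in
  run (replaying_source ic) print_endline;
  close_in ic;
  Sys.remove file
```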

> Your help is appreciated.

You're welcome.

Regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] debugger losing contact with debuggee process
  2002-10-19 22:15 ` [Caml-list] debugger losing contact with debuggee process Lex Stein
  2002-10-20 10:06   ` Pierre Weis
@ 2002-10-21  9:11   ` Xavier Leroy
  1 sibling, 0 replies; 4+ messages in thread
From: Xavier Leroy @ 2002-10-21  9:11 UTC (permalink / raw)
  To: Lex Stein; +Cc: caml-list

> A process I am debugging has a bus error and crashes. When it crashes, it
> loses contact with ocamldebug. The output looks something like this:
> 
> calling db->put with db=3c8c0 txn=0 flags=0
> Invalid argument (error number 22)
> BDB: aborting transaction

To complement Pierre's reply: if you do the dichotomy trick that
Pierre described, you'll probably find that the crash occurs in a 
C function (declared "external" in Caml).  Thus, you'll have to run
your program under a C debugger such as gdb.  To make this easier,
try to compile the C code with "-g", and link the Caml code with
"ocamlc -o myprog -custom -ccopt -g".  This way, you'll get a
standalone executable named myprog, with debug information attached.

Then, do "gdb myprog" and "run", and wait until it crashes.  gdb's "bt" command
will show you where the crash is located.  If it's in a C function
called directly or indirectly from OCaml's "interprete" function,
you're lucky: the error is indeed inside C code, and can be tracked
down just like you'd do for a C program.  If the crash is in
"interprete" or some other function of the OCaml runtime system,
things will be harder: presumably, some C code returned an illegal
Caml value, or messed up the GC, causing a crash later in the
OCaml runtime system.  A good way to attack these problems is to
conduct a careful code review of the C/OCaml stub code, questioning
every single allocation and construction of OCaml values.

> [ I posted this question to the ocaml_beginners list. After receiving no
> replies on that list after 12 hours, I conclude that the people on that
> list don't have the experience with ocamldebug to answer the question and
> am posting it to this list. ]

Your post is on-topic for this list.  However, your expectation that
you should get answers within 12 hours is ridiculous.  Even if you
paid a hefty support contract for a commercial development tool, you
would not get that.

- Xavier Leroy

* Re: [Caml-list] debugger losing contact with debuggee process
  2002-10-19 22:15 ` [Caml-list] debugger losing contact with debuggee process Lex Stein
@ 2002-10-20 10:06   ` Pierre Weis
  2002-10-21  9:11   ` Xavier Leroy
  1 sibling, 0 replies; 4+ messages in thread
From: Pierre Weis @ 2002-10-20 10:06 UTC (permalink / raw)
  To: Lex Stein; +Cc: caml-list

Hello,

> A process I am debugging has a bus error and crashes. When it crashes, it
> loses contact with ocamldebug. The output looks something like this:
> 
> calling db->put with db=3c8c0 txn=0 flags=0
> Invalid argument (error number 22)
> BDB: aborting transaction
> Lost connection with process 3531 (active process)
> between time 290000 and time 300000
               ^^^^^^          ^^^^^^
These numbers are very valuable information...

> Trying to recover...
> Time : 290000 - pc : 59612 - module Printf
         ^^^^^^
> 186 <|b|>let res = Buffer.contents dest in
> 
> The debugger loses contact with the debuggee process because the debuggee
> has a bus error and terminates.
> 
> The information provided by ocamldebug above isn't very helpful. How do I
> get a backtrace at the time of the bus error? Something along the lines
> of a backtrace on a core dump file would be great. How does one get this
> information using ocamldebug?

The information that ocamldebug gave you is helpful: it provides the
means to go back to well before the bus error (Time 290000), and guarantees
that the bus error will occur before Time 300000. To go (go) to just before
the bus error and then ask for a backtrace (bt) is just a matter of
dichotomy and is very fast. Once you're very near the bus error, you can
use stepping (s), print (p), and next event (n) to
understand what happens.

(Use help in the debugger to get help in the debugger.)

> Lex
> 
> [ I posted this question to the ocaml_beginners list. After receiving no
> replies on that list after 12hours, I conclude that the people on that
> list don't have the experience with ocamldebug to answer the question and
> am posting it to this list. ]

This is a bit fast: don't forget the time difference between you and
the rest of the world! Also consider that people may have other things
to do than answer your message right away!

Sincerely,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/

* [Caml-list] debugger losing contact with debuggee process
  2002-10-19  9:16 [Caml-list] Array.resize ? Eray Ozkural
@ 2002-10-19 22:15 ` Lex Stein
  2002-10-20 10:06   ` Pierre Weis
  2002-10-21  9:11   ` Xavier Leroy
  0 siblings, 2 replies; 4+ messages in thread
From: Lex Stein @ 2002-10-19 22:15 UTC (permalink / raw)
  To: caml-list


Hello,

A process I am debugging has a bus error and crashes. When it crashes, it
loses contact with ocamldebug. The output looks something like this:

calling db->put with db=3c8c0 txn=0 flags=0
Invalid argument (error number 22)
BDB: aborting transaction
Lost connection with process 3531 (active process)
between time 290000 and time 300000
Trying to recover...
Time : 290000 - pc : 59612 - module Printf
186 <|b|>let res = Buffer.contents dest in

The debugger loses contact with the debuggee process because the debuggee
has a bus error and terminates.

The information provided by ocamldebug above isn't very helpful. How do I
get a backtrace at the time of the bus error? Something along the lines
of a backtrace on a core dump file would be great. How does one get this
information using ocamldebug?

Sincerely,
Lex

[ I posted this question to the ocaml_beginners list. After receiving no
replies on that list after 12 hours, I conclude that the people on that
list don't have the experience with ocamldebug to answer the question and
am posting it to this list. ]
