caml-list - the Caml user's mailing list
* [Caml-list] Exception backtraces and stack overflows
@ 2012-07-16 13:51 Alexey Rodriguez
  2012-07-16 15:06 ` [Caml-list] " Alexey Rodriguez
  0 siblings, 1 reply; 3+ messages in thread
From: Alexey Rodriguez @ 2012-07-16 13:51 UTC (permalink / raw)
  To: OCaml List

Hi,

I am having trouble understanding exception backtraces for stack overflows.

Sometimes the backtrace only contains entries for the function that filled
the stack with frames (you would see many backtrace entries pointing to
List.map if you were trying to map a very long list). Such traces are
useless for fixing the stack overflow, since you cannot use them to find the
code path that leads to List.map.

In other situations, the backtrace contains the full path from the OCaml
entry point to the recursive function that is blowing up the stack. In
these situations the backtrace appears to have "compressed" the hundreds of
thousands of frames that the recursive calls generated, since there is only
one entry for List.map.

Is there documentation that explains when you get one backtrace or the
other? I tried to understand the source code of caml_stash_backtrace and
there it seems that all the stack frames are captured (if the backtrace
buffer size allows). Casual inspection with gdb shows that
caml_stash_backtrace does not get the full stack at the moment of the
fault. Maybe the signal handler is skipping over the hundreds of thousands
of frames somehow? If someone can elucidate this mystery for me I'll be
very grateful!

I can provide more details if needed, but probably someone on the list can
already help with this short description.

Oh, one more question on backtraces. I see that when tracing is enabled,
caml_stash_backtrace is called whenever an exception is raised. This might
be expensive, as Not_found is raised by many functions in the standard
library. Is there a high overhead in leaving tracing enabled? Leaving it
enabled is useful in production systems, since very often the original
inputs are not available to reproduce the bug in a debug build.
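
For reference, a minimal sketch of how recording could be limited to selected
code regions using the standard Printexc module. It does not measure the
overhead asked about; it only shows the toggle. The helper name run_recorded
and the polymorphic variants it returns are hypothetical, made up for this
illustration.

```ocaml
(* Sketch only: run [f] with backtrace recording switched on, then
   restore the previous setting. [run_recorded] is a hypothetical
   helper, not part of the standard library. *)
let run_recorded f =
  let previous = Printexc.backtrace_status () in
  Printexc.record_backtrace true;
  let outcome =
    try `Value (f ())
    with e ->
      (* Fetch the backtrace first, before anything else can raise
         and overwrite the recorded trace. *)
      let trace = Printexc.get_backtrace () in
      `Failed (Printexc.to_string e, trace)
  in
  Printexc.record_backtrace previous;
  outcome
```

The trace string may be empty if the program was not compiled with -g.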

Thanks!

Alexey


* [Caml-list] Re: Exception backtraces and stack overflows
  2012-07-16 13:51 [Caml-list] Exception backtraces and stack overflows Alexey Rodriguez
@ 2012-07-16 15:06 ` Alexey Rodriguez
  2012-07-16 22:09   ` Fabrice Le Fessant
  0 siblings, 1 reply; 3+ messages in thread
From: Alexey Rodriguez @ 2012-07-16 15:06 UTC (permalink / raw)
  To: OCaml List


Hi again,

A colleague suggested doing the following experiment: call List.map on a
large list and throw an exception from deep down in the call chain.

Now the backtrace I get contains 1022 entries for map, an entry for the
raise site and some other entry. This matches the 1024 limit of
BACKTRACE_BUFFER_SIZE. Since the limit has been reached, the backtrace is
useless to diagnose the stack overflow. This matches my understanding of
caml_stash_backtrace: all stack frames are inspected and reported as long
as there is space in the trace buffer.
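
A reduced sketch of that experiment, assuming a native build with -g. The
names descend, count_lines and backtrace_entries are made up for this
illustration, and the depth of 10_000 is an arbitrary value chosen to stay
well below the stack limit. The number of lines reported depends on the
compiler version and flags, so no expected output is given.

```ocaml
exception Die_die_die

(* Non-tail-recursive descent: every level keeps a frame on the stack
   until the exception is raised at the bottom. *)
let rec descend n =
  if n = 0 then raise Die_die_die
  else 1 + descend (n - 1)

(* Count newline characters, i.e. lines in the printed backtrace. *)
let count_lines s =
  let n = ref 0 in
  String.iter (fun c -> if c = '\n' then incr n) s;
  !n

(* Raise from [depth] frames down and report how many backtrace lines
   the runtime recorded for the caught exception. *)
let backtrace_entries depth =
  Printexc.record_backtrace true;
  try
    ignore (descend depth);
    0
  with Die_die_die -> count_lines (Printexc.get_backtrace ())

let () =
  Printf.printf "%d backtrace lines\n" (backtrace_entries 10_000)
```

Comparing the printed count with the recursion depth shows whether the
trace buffer limit was hit.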

So it seems there is something funny happening when a stack overflow is
detected in the SIGSEGV handler: there are only 3 trace entries, whereas
the stack contains over a hundred thousand frames. Is this intended
behavior?

If it is of any help, I am including the test program. I am using OCaml
3.12.0 on an x86-64 platform.

Cheers,

Alexey

[-- Attachment #2: stack_overflow.ml --]


let make_list n =
  let rec go accum = function
    | 0 -> accum
    | n -> go (n::accum) (n-1)
  in
  go [] n

let rec my_map f = function
  | [] -> []
  | (x::xs) ->
      let y = f x in
      let ys = my_map f xs in
      y :: ys

exception Die_die_die

(* Make this false to generate a "normally handled" exception and get
 * many backtrace entries. *)
let generate_stack_overflow = true

let inc n =
  if not generate_stack_overflow && n = 10000 then raise Die_die_die;
  n + 1

let main () =
  Printf.fprintf stderr "Making list\n";
  let l = make_list (1000 * 400) in
  Printf.fprintf stderr "Mapping\n";
  let _l2 = my_map inc l in
  Printf.fprintf stderr "Done\n"

let () = main ()



* Re: [Caml-list] Re: Exception backtraces and stack overflows
  2012-07-16 15:06 ` [Caml-list] " Alexey Rodriguez
@ 2012-07-16 22:09   ` Fabrice Le Fessant
  0 siblings, 0 replies; 3+ messages in thread
From: Fabrice Le Fessant @ 2012-07-16 22:09 UTC (permalink / raw)
  To: Alexey Rodriguez; +Cc: OCaml List

The problem with backtraces on SIGSEGV is that the stack trace starts
from the last stack pointer on which the runtime can rely, one with
a corresponding return address in the set of OCaml stack frames. The
only such pointer available in the SIGSEGV handler is stored in
caml_bottom_of_stack (with the PC in caml_last_return_address), and
these pointers are updated only when you make a C call. Since nothing
in your program makes a C call during the recursion, this pointer is
never updated while recursing, and the backtrace only contains what
was on the stack at the last C call.

In your program, you can test this by adding "let z = [x] in"
before the recursive call in "my_map". This will allocate something,
and the GC will be triggered at some point, so that you will get the
full backtrace... at least from the point where the GC was called,
before the stack overflow.
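
The change suggested above could look like the sketch below. The name
my_map_alloc is hypothetical, and wrapping the allocation in
Sys.opaque_identity (available from OCaml 4.03, so not in 3.12) is an
assumption added here in case a newer compiler drops an unused allocation.
On 3.12 the plain "let z = [x] in" form applies.

```ocaml
(* Variant of my_map that allocates before each recursive call, so the
   GC gets a chance to run during the recursion. *)
let rec my_map_alloc f = function
  | [] -> []
  | x :: xs ->
      let y = f x in
      (* Allocation added only for its side effect on the runtime.
         Sys.opaque_identity is an assumption of this sketch, meant to
         keep the otherwise unused value from being optimized away. *)
      let _z = Sys.opaque_identity [x] in
      let ys = my_map_alloc f xs in
      y :: ys
```

The result is the same as with my_map; only the allocation behavior differs.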

Another funny example is to replace the test in "inc" with:
  if n mod 100000 = 0 then print_char 'x';

Then, whatever you do, the backtrace will be restricted to:
Raised at file "pervasives.ml", line 363, characters 19-39
In fact, the stack overflow did not happen in that function (check
using gdb; the backtrace printed by ocaml is actually completely
wrong), but in "my_map": "caml_bottom_of_stack" and
"caml_last_return_address" point to "print_char", so this location is
printed, and then the scan of the stack immediately stops when it
discovers that the stack does not correspond to that (believing that
it's probably because -g was forgotten).

Maybe this behavior could be improved, at the cost of a more expensive
scan of the stack (as done in bytecode), done only in the case of a
stack overflow.
-Fabrice


-- 
Fabrice LE FESSANT
