Native code stack overflow detection guarantees

caml-list - the Caml user's mailing list

* Native code stack overflow detection guarantees
@ 2010-07-09  0:59 Michael Ekstrand
  2010-07-09  8:39 ` [Caml-list] " Mark Shinwell
  2010-07-11 11:03 ` [Caml-list] Native code stack overflow detection guarantees Goswin von Brederlow
  0 siblings, 2 replies; 4+ messages in thread
From: Michael Ekstrand @ 2010-07-09  0:59 UTC (permalink / raw)
  To: caml-list

Some time ago, I saw someone mention non-tail-recursive functions in
native code as a security problem.  Unfortunately, I cannot find where I
read that again, but the basic idea was that, if you have a recursive
function that uses stack linear in user-provided input, then the user
can trigger a stack overflow which, in native code, can allow your stack
pointer to go waltzing through memory and wreak general havoc since
stack overflows are not trapped.
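
To make "stack linear in user-provided input" concrete, here is a minimal sketch (the function names are made up for illustration and do not come from any library). The first function keeps one stack frame alive per list element; the second carries an accumulator so that the recursive call is in tail position.

```ocaml
(* Non-tail-recursive: the addition happens after the recursive call
   returns, so every list element keeps one stack frame alive. *)
let rec sum_naive = function
  | [] -> 0
  | x :: rest -> x + sum_naive rest

(* Tail-recursive: the running total travels in an accumulator, so the
   recursive call is the last thing the function does and the stack
   depth stays constant. *)
let sum_tail xs =
  let rec go acc = function
    | [] -> acc
    | x :: rest -> go (acc + x) rest
  in
  go 0 xs

let () =
  (* A list this long is fine for sum_tail.  Whether sum_naive survives
     it depends on the stack limit of the system, so it is not called. *)
  let big = List.init 1_000_000 (fun i -> i) in
  Printf.printf "%d\n" (sum_tail big)
```

If an attacker controls the length of the list, only sum_naive lets them choose the stack depth.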

I was attempting to do some further research on this subject today and
found section 11.5 in the OCaml manual (ocamlopt compatibility with the
bytecode compiler), which describes how native code handles a stack
overflow, depending on the platform:

 * By raising a Stack_overflow exception, like the bytecode compiler
   does. (IA32/Linux, AMD64/Linux, PowerPC/MacOSX, MS Windows 32-bit ports).
 * By aborting the program on a “segmentation fault” signal. (All other
   Unix systems.)
 * By terminating the program silently. (MS Windows 64 bits).
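
On platforms in the first group, the overflow surfaces as an ordinary exception that can be caught. A minimal sketch, assuming a hypothetical wrapper called protect_depth; the overflow is simulated here by raising Stack_overflow by hand, because a real one is only delivered as an exception on some platforms:

```ocaml
(* Hypothetical helper: run [f x] and turn a Stack_overflow exception
   into an Error value.  On platforms that abort or terminate instead,
   this handler never runs. *)
let protect_depth f x =
  try Ok (f x) with Stack_overflow -> Error "stack overflow"

let () =
  (* Simulated overflow: Stack_overflow is a predefined exception and
     can be raised like any other. *)
  match protect_depth (fun () -> raise Stack_overflow) () with
  | Ok () -> print_endline "completed"
  | Error msg -> print_endline ("recovered from: " ^ msg)
```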

I have also turned up mailing list posts indicating that the stack
overflow exception is generated by a sigsegv signal handler on Linux.

My goal is to determine whether the ability for user input to cause a
stack overflow is merely a denial-of-service vulnerability (they shut
down the process, but can't do anything else), or whether it represents
an attack vector whereby they could modify memory, execute arbitrary
code, or perform other such shenanigans.

Therefore, I am wondering: are there documented guarantees on which the
native code stack overflow behavior rests?  Linux processes usually
receive a segmentation fault when they run out of stack space; is that
guaranteed, or is it simply the usual convenient behavior?  What about
for other systems?

If the program will always die, either via Stack_overflow or a
segmentation fault, as soon as the stack overflow occurs and before there
is any opportunity for more thorough memory corruption, then I think I can
treat user-triggerable stack overflows as inconvenient and undesirable but
not a viable attack vector other than for DoS attacks.

Thank you for any answers or insight you might be able to provide.

- Michael


* Re: [Caml-list] Native code stack overflow detection guarantees
  2010-07-09  0:59 Native code stack overflow detection guarantees Michael Ekstrand
@ 2010-07-09  8:39 ` Mark Shinwell
  2010-07-09  8:46   ` [Caml-list] Native code stack overflow detection guarantees - followup Mark Shinwell
  2010-07-11 11:03 ` [Caml-list] Native code stack overflow detection guarantees Goswin von Brederlow
  1 sibling, 1 reply; 4+ messages in thread
From: Mark Shinwell @ 2010-07-09  8:39 UTC (permalink / raw)
  To: Michael Ekstrand; +Cc: caml-list

On Thu, Jul 08, 2010 at 08:59:30PM -0400, Michael Ekstrand wrote:
> Therefore, I am wondering: are there documented guarantees on which the
> native code stack overflow behavior rests?  Linux processes usually
> receive a segmentation fault when they run out of stack space; is that
> guaranteed, or is it simply the usual convenient behavior?  What about
> for other systems?

I am not an expert on all the various cases that might arise during a stack
overflow, but I believe that on Linux it is exposed to userland as a
segmentation fault.  Exactly how this is exposed shouldn't be any different
when executing Caml native-compiled code or C code, for example.  As for
Caml-specific behaviour, assuming that the user's code does not itself alter
the signal handling behaviour, the runtime should catch the stack overflow via
the SIGSEGV handler.  The intuition behind what is supposed to happen next is
as follows: if the faulting address was in the stack, and the program counter
(PC) was "in your Caml program", then we produce a Stack_overflow exception.
Otherwise we will just invoke the default signal action for the segmentation
fault, which on Linux will terminate the program.

(There are lots of functions in the runtime which could cause a stack overflow
in the case of a bug in their own code; and it would probably be bad if those
ended up with a Stack_overflow exception rather than a segfault.  This seems to
me to be at least one reason why you probably don't want to turn every segfault
in the stack into a Stack_overflow exception; instead, we try to distinguish
based on the PC.)

Unfortunately, there isn't really any wholly satisfactory notion of "in your
Caml program".  For example, you might end up getting close to blowing the
stack in a recursive Caml function which just computes things without making
any library calls; and then you call a glibc function and hit the stack limit.
Such a fault might morally be "in your code" even though the PC is pointing to
somewhere inside libc.so.  Yet, you could also fail if you don't do any
recursing at all in your Caml program, and happen to call a function in glibc
which chooses to eat all the stack on Fridays only.  The symptom would be the
same; the PC is still inside libc.so, except this time glibc is at fault.

The approximation used by the runtime for "in your Caml program" is whether
the PC lies within the compiled Caml code of your program.  Thus, in
particular, if you get close to blowing the stack in your recursive function
and then happen to fail inside the runtime function [compare], for example,
it will not be reported as a Stack_overflow exception but rather as a segfault.
As such, you can't rely on exactly what will happen; from the outside this
appears non-deterministic.
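
The decision described above can be modelled as a small pure function. This is only a sketch of the logic as stated in this message, not the runtime's actual C code: addresses are plain integers, and the type names, the ranges, and classify_fault are all made up for illustration.

```ocaml
(* Half-open address range [lo, hi). *)
type region = { lo : int; hi : int }

let contains r addr = r.lo <= addr && addr < r.hi

type action =
  | Raise_stack_overflow    (* turn the fault into a Caml exception *)
  | Default_sigsegv_action  (* let the segfault kill the process *)

(* The rule as described: the faulting address must be in the stack AND
   the PC must lie within the compiled Caml code of the program. *)
let classify_fault ~stack ~caml_code ~fault_addr ~pc =
  if contains stack fault_addr && contains caml_code pc
  then Raise_stack_overflow
  else Default_sigsegv_action

(* Made-up layout for the example. *)
let stack = { lo = 0x70000; hi = 0x78000 }
let caml_code = { lo = 0x10000; hi = 0x20000 }
let pc_in_caml = 0x10400
let pc_in_runtime = 0x30000  (* e.g. inside [compare] or libc *)

let () =
  let show = function
    | Raise_stack_overflow -> "Stack_overflow"
    | Default_sigsegv_action -> "segfault"
  in
  let fault_addr = 0x70010 in
  Printf.printf "PC in Caml code: %s\n"
    (show (classify_fault ~stack ~caml_code ~fault_addr ~pc:pc_in_caml));
  Printf.printf "PC in runtime:   %s\n"
    (show (classify_fault ~stack ~caml_code ~fault_addr ~pc:pc_in_runtime))
```

The same faulting address gives two different outcomes depending only on where the PC happens to be, which is why the result looks non-deterministic from the outside.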

A further complication is addressed in Mantis 4746, which notes that the
address space randomization features of recent Linux kernels also have the
potential to mess up the runtime's metrics for calculating whether the
faulting address lies in the stack or not.  This can also affect, basically
randomly, whether you get Stack_overflow or a segfault.

Mark


* Re: [Caml-list] Native code stack overflow detection guarantees - followup
  2010-07-09  8:39 ` [Caml-list] " Mark Shinwell
@ 2010-07-09  8:46   ` Mark Shinwell
  0 siblings, 0 replies; 4+ messages in thread
From: Mark Shinwell @ 2010-07-09  8:46 UTC (permalink / raw)
  To: Michael Ekstrand; +Cc: caml-list

On Fri, Jul 09, 2010 at 09:39:06AM +0100, Mark Shinwell wrote:
> On Thu, Jul 08, 2010 at 08:59:30PM -0400, Michael Ekstrand wrote:
> > Therefore, I am wondering: are there documented guarantees on which the
> > native code stack overflow behavior rests?  Linux processes usually receive
> > a segmentation fault when they run out of stack space; is that guaranteed,
> > or is it simply the usual convenient behavior?  What about for other
> > systems?
> 
> I am not an expert on all the various cases which might arise during a case
> of stack overflow, but I believe on Linux it is exposed to userland as a
> segmentation fault.  Exactly how this is exposed shouldn't be any different
> when executing Caml native-compiled code or C code, for example.  In terms of
> Caml-specific behaviour (and assuming that the user's code does not itself
> alter the signal handling behaviour) then the runtime should catch the stack
> overflow via the SIGSEGV handler.  The intuition behind what is supposed to
> happen next is as follows: if the faulting address was in the stack, and the
> program counter (PC) was "in your Caml program", then we produce a
> Stack_overflow exception.  Otherwise we will just invoke the default signal
> action for the segmentation fault, which on Linux will terminate the program.
> 
> (There are lots of functions in the runtime which could cause a stack
> overflow in the case of a bug in their own code; and it would probably be bad
> if those ended up with a Stack_overflow exception rather than a segfault.
> This seems to me to be at least one reason why you probably don't want to
> turn every segfault in the stack into a Stack_overflow exception; instead, we
> try to distinguish based on the PC.)

I should add that what I wrote was for Linux/x86.  On other platforms the
behaviour may differ depending on what system support is available.
You need HAS_STACK_OVERFLOW_DETECTION (see the Caml configure script) set
to get any of this at all; and to get the distinction based on the
program counter location, you need CONTEXT_PC to have been defined
(see asmrun/signals_osdep.h in the Caml source).

Mark



* Re: [Caml-list] Native code stack overflow detection guarantees
  2010-07-09  0:59 Native code stack overflow detection guarantees Michael Ekstrand
  2010-07-09  8:39 ` [Caml-list] " Mark Shinwell
@ 2010-07-11 11:03 ` Goswin von Brederlow
  1 sibling, 0 replies; 4+ messages in thread
From: Goswin von Brederlow @ 2010-07-11 11:03 UTC (permalink / raw)
  To: Michael Ekstrand; +Cc: caml-list

Michael Ekstrand <michael@elehack.net> writes:

> Some time ago, I saw someone mention non-tail-recursive functions in
> native code as a security problem.  Unfortunately, I cannot find where I
> read that again, but the basic idea was that, if you have a recursive
> function that uses stack linear in user-provided input, then the user
> can trigger a stack overflow which, in native code, can allow your stack
> pointer to go waltzing through memory and wreak general havoc since
> stack overflows are not trapped.

For that to happen you would have to get the stack pointer to overflow
so far that it actually points into an allocated memory region
again. Stack frames usually aren't that big and I'm pretty certain there
will be some unallocated space around the stack to catch
overflows. Isn't a stack frame for a recursive call in ocaml limited in
size (<< PAGE_SIZE)? Unless you have some varargs in there.
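
The argument can be checked with a toy model. This is not real memory: addresses are plain integers, the layout and the one-page guard region are assumptions made up for the example, and first_touch_below_stack is a hypothetical name. The stack pointer steps down one frame at a time, and the model reports where it first lands below the stack.

```ocaml
type outcome =
  | Trapped of int  (* landed in the unmapped guard region: fault *)
  | Skipped of int  (* jumped over the guard region into other memory *)

(* The stack occupies [guard_hi, stack_top]; the unmapped guard region
   is [guard_lo, guard_hi); everything below guard_lo is assumed mapped. *)
let first_touch_below_stack ~stack_top ~guard_hi ~guard_lo ~frame_size =
  if frame_size <= 0 then invalid_arg "frame_size must be positive";
  let rec descend sp =
    if sp >= guard_hi then descend (sp - frame_size)  (* still in the stack *)
    else if sp >= guard_lo then Trapped sp
    else Skipped sp
  in
  descend stack_top

(* Assumed layout: one 4096-byte guard page below an 8 MiB stack. *)
let page_size = 4096
let guard_lo = 0x10000
let guard_hi = guard_lo + page_size
let stack_top = guard_hi + (8 * 1024 * 1024)

let describe = function
  | Trapped sp -> Printf.sprintf "trapped at 0x%x" sp
  | Skipped sp -> Printf.sprintf "skipped the guard, landed at 0x%x" sp

let () =
  List.iter
    (fun frame_size ->
      Printf.printf "frame %5d: %s\n" frame_size
        (describe
           (first_touch_below_stack ~stack_top ~guard_hi ~guard_lo ~frame_size)))
    [ 64; 4096; 8192 ]
```

In this model, a frame no larger than the guard region can never step over it: the last position inside the stack is at least guard_hi, so the next one is at least guard_hi - frame_size, which is at least guard_lo. Only a frame larger than the guard region can skip it.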

I don't see any security problem there other than DoS attacks. With an
exception you could catch it and continue running while a segfault kills
your program (usually). So for native code you would have to inspect your
input and check whether it will overflow the stack before calling the
recursive function. Or just write the function tail recursive.
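
A minimal sketch of the first option, checking the input before recursing. The names and the limit are made up for illustration; a real bound would depend on the frame size and the stack limit of the target system.

```ocaml
(* Hypothetical limit, chosen only for the example. *)
let max_len = 10_000

(* Non-tail-recursive map: stack depth grows with the list length. *)
let rec map_naive f = function
  | [] -> []
  | x :: rest ->
    let y = f x in
    y :: map_naive f rest

(* Reject over-long input up front.  List.compare_length_with stops
   walking the list as soon as the limit is exceeded. *)
let map_checked f xs =
  if List.compare_length_with xs max_len > 0
  then Error `Input_too_long
  else Ok (map_naive f xs)

let () =
  match map_checked succ [ 1; 2; 3 ] with
  | Ok ys -> List.iter (Printf.printf "%d ") ys; print_newline ()
  | Error `Input_too_long -> print_endline "input rejected"
```

The check bounds the recursion depth by max_len regardless of what the user sends, so the caller gets an ordinary Error value instead of depending on platform-specific overflow handling.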

MfG
        Goswin

