caml-list - the Caml user's mailing list
* Segfault in ARM EABI for program compiled with ocamlopt 3.12.0
@ 2010-11-24  0:20 rixed
From: rixed @ 2010-11-24  0:20 UTC
  To: caml-list

For some time now I have been chasing a bug that hits a program of mine
when it is compiled on ARM with ocaml 3.12.0.
I initially thought my own C code was misbehaving, but the program keeps
crashing, although not as early, if I comment out all calls to the C
functions.

The segfaults happen frequently during the GC, in caml_oldify_one or
caml_oldify_mopup, but also in a few other places such as
camlList__rev_append or caml__apply2, among others. In caml_oldify_one,
for instance, the segfault always happens at the same location: the
assertion that sz is not 0 (and of course when you read the code it's
pretty clear that sz=0 corresponds to the "already forwarded" case that
is handled at the beginning of the function).
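
For readers without the runtime sources at hand, here is a minimal sketch
of how a block header word decodes on a 32-bit target, assuming the usual
OCaml 3.12 layout (size in words in the upper 22 bits, 2 colour bits, tag
in the low 8 bits). The names header_wosize, header_tag and is_forwarded
are made up for this illustration; the runtime's own macros for the first
two are Wosize_hd and Tag_hd.

```c
#include <stdint.h>

/* Simplified model of a 32-bit OCaml block header:
   bits 31..10 = size in words, bits 9..8 = GC colour, bits 7..0 = tag.
   These helpers are illustrative only, not the runtime's own macros. */
typedef uint32_t header_t;

static uint32_t header_wosize(header_t hd) { return hd >> 10; }
static uint32_t header_tag(header_t hd)    { return hd & 0xFFu; }

/* During a minor collection, a block that has already been copied has its
   header overwritten with 0; its first field then holds the new address. */
static int is_forwarded(header_t hd) { return hd == 0; }
```

In this model a header word of 0 decodes to size 0, so seeing sz=0 past
the forwarding check would be consistent with the header having been read
through a corrupted pointer rather than from a real minor-heap block.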

The pattern, then, is that a register (usually r0, r2 or r5) is
restored from the stack after a call to a function that might call the
GC (or after a call to the GC itself), then dereferenced. Inspecting the
stack with gdb makes it obvious that this very word was changed during
the call, and a value like 0, 3 or 1024 is read back into the register
instead of an mlvalue.
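
To make the "0, 3 or 1024" observation concrete, here is a small sketch of
how an OCaml word is told apart as an immediate or a pointer: a word with
its low bit set is an immediate integer, anything else is treated as a
pointer. The helper names below are hypothetical; the runtime's own macros
for this test are Is_long and Is_block.

```c
#include <stdint.h>

typedef intptr_t value_word;  /* stand-in for the runtime's `value` type */

/* Low bit set: immediate integer n, encoded as 2n+1. */
static int word_is_immediate(value_word w) { return (w & 1) != 0; }

/* Low bit clear: the word is taken to be a pointer to a heap block. */
static int word_is_pointer(value_word w) { return (w & 1) == 0; }

/* Decode an immediate back to the integer it represents. */
static intptr_t immediate_to_int(value_word w) { return w >> 1; }
```

Under this encoding 3 is the integer 1, while 0 and 1024 have the low bit
clear and would be dereferenced as addresses 0 and 1024, which is
consistent with a segfault. As an aside, 1024 (1 << 10) is also what the
header of a one-word block with tag 0 looks like on a 32-bit target,
though that may be a coincidence.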

I have not managed (yet) to reduce the program to a small test case, and
I am under the impression that all of these components are required for
the bug to happen 'fast enough':

- threads
- floats
- calls to C functions (these greatly reduce the waiting time before the crash)

I am also under the impression that the bug is affected by the new stack
alignment requirement. In one occurrence, calling or not calling a
function that does nothing from within a function hit by the bug
drastically reduced the probability of the bug, and the major difference
I saw was that in one version of the function the stack size was 16 bytes
and in the other 24 bytes (16+4, apparently for the address of a "module"
structure, aligned up to 24 bytes). I thus manually checked the
generated frame tables, but they were all right as far as I understand
them.
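
The arithmetic in the parenthesis can be checked with a one-line helper.
This is only a sketch, assuming the 8-byte stack alignment that the ARM
EABI requires at public interfaces; align_up is a name chosen for this
example and does not come from the compiler sources.

```c
#include <stddef.h>

/* Round `size` up to the next multiple of `align`.
   `align` must be a power of two. */
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}
```

Here align_up(20, 8) gives 24 and align_up(16, 8) gives 16, matching the
two frame sizes described above; with a 4-byte alignment, 20 would have
stayed 20.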

Now I'm a little desperate, since each recompile+test cycle takes about
20 minutes and the bug is so erratic; so if someone here is familiar with
the ARM architecture, and in particular with the differences between the
old and new ABI, please suggest what I should check, or give any hint
whatsoever. I'd be very grateful, as this consumes a lot of my spare time.

Also, I'm compiling ocaml with gcc 4.2.1. Do you think the problem may
be that gcc does not follow the very same ABI?

Also, I ran the testsuite but it did not reveal anything.



* Re: [Caml-list] Segfault in ARM EABI for program compiled with ocamlopt 3.12.0
@ 2011-06-29  8:52 ` SerP
From: SerP @ 2011-06-29  8:52 UTC
  To: rixed; +Cc: caml-list

It has been a long time; did you ever find out why this bug happens?
On the iPhone I get the same bug with ocaml-3.12.

