caml-list - the Caml user's mailing list
* [Caml-list] Delimcc.0 OPAM Package Difficulty on Mac OS X
@ 2013-04-12  8:56 oleg
From: oleg @ 2013-04-12  8:56 UTC (permalink / raw)
  To: paulfsnively; +Cc: caml-list


> Unfortunately, testd0opt is my next headache: it crashes with a Bus Error on
> Darwin, but not on Debian Wheezy running in VirtualBox. VirtualBox doesn't
> do processor emulation, so the issue once again seems to be differences in
> the Darwin runtime environment from the Linux runtime environment. In the
> native context, my initial guess is that it may have something to do with
> stack alignment issues, but this is merely a guess.

I think Darwin and Debian might be using different compilers (I believe
Apple has bet on LLVM/clang). Even when both systems use GCC, chances are
they are different versions, and some optimize more aggressively than
others. The first thing to try is to compile testd0opt without
optimization (for example, with -O0). Second, if there is a way to get a
stack trace at the point of the crash (e.g., via GDB), that could be
helpful.

BTW, it helps to compile stacks-native.c with the DEBUG
option. One can either pass -DDEBUG or change
	#define DEBUG 0
at the beginning of the file so that it reads "#define DEBUG 1". Then,
when running testd0opt, we should see more output. It would be good to
get the whole output of running testd0opt.
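
The DEBUG switch described above follows a common C pattern. Here is a minimal sketch of that pattern, assuming nothing about the actual contents of stacks-native.c: the TRACE macro and the trace_demo function are hypothetical names used only for illustration. Guarding the default with #ifndef is one way to let a command-line -DDEBUG=1 take effect without editing the file.

```c
#include <stdio.h>

/* Default to quiet unless the compiler command line says otherwise,
   e.g. cc -DDEBUG=1 ... (a bare -DDEBUG also defines DEBUG as 1). */
#ifndef DEBUG
#define DEBUG 0
#endif

/* TRACE is a hypothetical helper, not taken from stacks-native.c.
   With DEBUG == 0 the call is dead code the compiler can drop, but
   the arguments are still type-checked. */
#define TRACE(...) \
  do { if (DEBUG) fprintf(stderr, __VA_ARGS__); } while (0)

/* Hypothetical function: emits a trace line when tracing is compiled
   in, and reports whether it is. */
int trace_demo(int frames)
{
  TRACE("copying %d stack frames\n", frames);
  return DEBUG != 0;
}
```

Built without any flag, trace_demo prints nothing and returns 0; built with -DDEBUG=1, it writes the trace line to stderr and returns 1.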

And another thing: for native delimcc we don't have to use dynamic
linking. One may link in delimcc.cmx along with stacks-native.o
statically. Perhaps that might help.





* [Caml-list] Delimcc.0 OPAM Package Difficulty on Mac OS X
@ 2013-04-16  6:01 oleg
From: oleg @ 2013-04-16  6:01 UTC (permalink / raw)
  To: paulfsnively; +Cc: caml-list


> ./test_samplingM
> rejection_sample: done 100 worlds
> Fatal error: exception Assert_failure("samplingM.ml", 16, 9)

Luckily, the assertion failure is harmless. Here is the complete code
where the failure occurs:

let tflip2_shared () =   (* sharing of flips *)
  let v = flip 0.5 in
  v && v;;
let () = assert (
  sample_rejection (random_selector 1) 100 tflip2_shared
    = [(0.48, V true); (0.52, V false)]);;

Essentially, we flip a coin 100 times and see how many times it comes
up heads. The `random source' (random_selector 1) seeds the random
number generator with a fixed value -- 1 in this case. Therefore,
re-running the code should give the same results. Alas, the
implementation of the random number generator may change -- and it did
between OCaml 3.11 and OCaml 3.12. Therefore, given the same seed, the
sequence of produced numbers can be different. Also, the behavior of the
random number generator may be platform-specific.

So, the good news is that the failure is harmless. The bad news is that
the change in the random number generator or the platform has
invalidated the regression tests. To see that Hansei functions
properly, one has to check the results manually. For example, 

  sample_rejection (random_selector 1) 100 tflip2_shared

may now produce [(0.49, V true); (0.51, V false)] or
[(0.52, V true); (0.48, V false)], etc.
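
The manual check described above amounts to asking whether the estimated probability is close to 0.5, rather than whether it matches one exact sequence of random numbers. A rough sketch of such a tolerance-based check, written in C for illustration only: it is not Hansei code, and xorshift32, estimate_heads, and close_enough are hypothetical names. The generator is a stand-in and is unrelated to OCaml's Random module.

```c
#include <stdint.h>

/* Hypothetical stand-in for a seeded random source: a 32-bit xorshift
   generator. It is not the generator used by OCaml or by Hansei. */
static uint32_t xorshift32(uint32_t *state)
{
  uint32_t x = *state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  *state = x;
  return x;
}

/* Fraction of `trials` fair coin flips that come up heads. */
double estimate_heads(uint32_t seed, int trials)
{
  uint32_t state = seed ? seed : 1u;  /* xorshift must not start at 0 */
  int heads = 0;
  for (int i = 0; i < trials; i++)
    if (xorshift32(&state) & 0x80000000u)  /* top bit acts as the coin */
      heads++;
  return (double)heads / trials;
}

/* Accept any estimate within `tol` of the expected probability instead
   of demanding one exact sequence of random numbers. */
int close_enough(double observed, double expected, double tol)
{
  double diff = observed - expected;
  if (diff < 0)
    diff = -diff;
  return diff <= tol;
}
```

With 100 samples the standard deviation of the estimate is about 0.05, so a tolerance of 0.15 accepts results such as 0.48, 0.49, or 0.52 while still rejecting a badly broken sampler.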


> this is a little disturbing: I got almost-correct output with -O0 and
> -DDEBUG=1, but there was still a message about "can't happen" at the
> very end, still using Apple's GCC 4.2.1 for Mac OS X 10.6.8.  I got
> similarly incorrect results using the current release, 3.2, of clang.
> I then remembered that, some time ago, I'd installed a binary of a
> much more recent GCC 4.6.2 that I could optionally use. Using it, I
> get correct results from testd0opt, even without using -O0!

Very interesting! I do use GCC 4.2.1, but on FreeBSD. I guess the
problem is platform-specific (or specific to a GCC backend). It could
be that GCC 4.2.1 over-optimized on Mac OS X, and later versions
of GCC have corrected the problem. Thank you for the investigation!




* [Caml-list] Delimcc.0 OPAM Package Difficulty on Mac OS X
@ 2013-04-09 11:08 oleg
From: oleg @ 2013-04-09 11:08 UTC (permalink / raw)
  To: paulfsnively; +Cc: caml-list



Perhaps the following comment from delimcc:stacks.c will help:

/* This function is defined in stacks.h and implemented in the ocamlrun.
   We re-define it here with the weak attribute: although 
   the function is implemented by ocamlrun and so will be available
   at run-time, it is not exported by ocamlc. Unfortunately, ocamlc, 
   when linking the bytecode executable and checking dlls requires 
   that all symbols imported by dlls be satisfied at link time.
*/
void caml_realloc_stack (asize_t required_size)
  __attribute__ ((weak));


To elaborate: caml_realloc_stack is part of the OCaml run-time, that is,
the bytecode interpreter. On my FreeBSD system:

nm -D /usr/local/bin/ocamlrun 
...
0000000000409ae0 T caml_realloc_stack
...

So, caml_realloc_stack exists and it is global (that is, exported and
available for linking). 

So, when testd0 is invoked, the bytecode interpreter ocamlrun starts
and eventually loads dlldelimcc.so. The latter needs caml_realloc_stack,
and it can find it in ocamlrun. Everything should work...

You might have noticed the strange thing that the error is reported
before testd0 gets to run -- by the compiler itself. This is because
ocamlc is a little bit too helpful. When it links the bytecode
executable, it loads all referenced DLLs -- at link time -- and checks
that all needed symbols _will_ be satisfied at run-time. (More
precisely, it calls dlopen with the flag RTLD_NOW.) Normally it is not
an ordinary linker's job to simulate run-time linking. After all, the
run-time environment may be different from the link-time
environment. And in our case, it is different indeed. If your ocamlc
is natively compiled, it does not use ocamlrun. And natively compiled
ocamlc does _not_ export caml_realloc_stack. So, the premature, too
helpful check really goes wrong here: caml_realloc_stack will be found
at run-time but cannot be found at link time (where it is not really
needed).

The weak subterfuge was sufficient to keep ocamlc happy. 

nm -D dlldelimcc.so 
                 w caml_realloc_stack

The weak reference should never cause an error (even if it remains
unresolved) -- according to my man pages. Apparently Mac OS X thinks
differently. Or findlib may set some flag that causes dlopen
to behave differently. I don't have access to Mac OS X. I guess I would
first check that dlldelimcc.so does indeed refer to
caml_realloc_stack as a weak symbol, and then ask around as to why it
causes problems on Mac OS X specifically.
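
To illustrate how a weak reference behaves when nothing defines it, here is a minimal sketch assuming an ELF system (such as Linux or FreeBSD) and GCC or clang. The symbol hypothetical_realloc_stack and the two helper functions are made-up names for illustration, not part of delimcc or the OCaml run-time. The sketch makes no claim about Mach-O on Mac OS X, where, as the thread suggests, unresolved weak references may be treated differently.

```c
#include <stddef.h>

/* hypothetical_realloc_stack is a made-up name standing in for a
   function that the host program may or may not export. The weak
   attribute (a GCC/clang extension) lets this file link even when no
   definition is available. */
void hypothetical_realloc_stack(size_t required_size)
  __attribute__((weak));

/* Returns 1 if the weak symbol was resolved, 0 if it was left
   undefined (its address is then null on ELF systems). */
int have_realloc_stack(void)
{
  return hypothetical_realloc_stack != NULL;
}

/* Calls the function only when it is actually present. */
int grow_stack_if_possible(size_t required_size)
{
  if (!have_realloc_stack())
    return -1;
  hypothetical_realloc_stack(required_size);
  return 0;
}
```

In a program that defines no such function, have_realloc_stack returns 0 and grow_stack_if_possible returns -1 instead of failing at link or load time; running nm on the resulting object should list the symbol with a "w" marker, as in the dlldelimcc.so output above.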

Of course I'd be happier if ocamlc didn't use the RTLD_NOW flag.




