caml-list - the Caml user's mailing list
* [Caml-list] OCaml binary formats -- how are they linked?
@ 2013-06-14  9:30 Ömer Sinan Ağacan
  2013-06-14 10:16 ` [Caml-list] " Dawid Toton
  0 siblings, 1 reply; 8+ messages in thread
From: Ömer Sinan Ağacan @ 2013-06-14  9:30 UTC
  To: caml-list

Hi all,

Let's say I have a C API called from OCaml. Bindings are compiled to
.cma, .cmx, .cmxa files.

What I'm wondering is whether those C objects are linked into those
binary files statically or dynamically.

Basically, what I want to do is switch between two versions of those
C objects with minimal effort. They share the same API; the only
difference is that assertions and debug info are enabled in one
version and disabled in the other. If I want debug info, I link one
version of the compiled C object files, and if I want to run faster I
link the other version.

Now, if those C objects are linked into the .cma, .cmx, .cmxa, etc.
files statically, I think I have to compile the OCaml files separately
against each version of the C objects. Is that correct?

My guess is that those C objects are linked statically, because to
compile my program with this library I only needed to point the
compiler to the .cma files. I'm not passing any parameters that give
the location of the C object files. I still wanted to be 100% sure
about that.

And if those C objects are linked statically, is there a parameter or
something to force them to be linked dynamically?

Thanks,


---
Ömer Sinan Ağacan
http://osa1.net


* [Caml-list] Re: OCaml binary formats -- how are they linked?
  2013-06-14  9:30 [Caml-list] OCaml binary formats -- how are they linked? Ömer Sinan Ağacan
@ 2013-06-14 10:16 ` Dawid Toton
  2013-06-14 12:02   ` Ömer Sinan Ağacan
  0 siblings, 1 reply; 8+ messages in thread
From: Dawid Toton @ 2013-06-14 10:16 UTC
  To: caml-list

On 06/14/2013 11:30 AM, Ömer Sinan Ağacan wrote:
> Hi all,
> 
> Let's say I have a C API called from OCaml. Bindings are compiled 
> to .cma, .cmx, .cmxa files.
> 
> What I'm wondering is that are those C objects linked to those 
> binary files statically or dynamically?
> 

Depending on what you do, some objects can be dynamically linked to an
OCaml program.

Since the OCaml↔C interface requires the C code to follow its rules
and use the provided headers (like mlvalues.h), in practice an extra
layer of code is usually added:

---- normal OCaml stuff:
 a.ml → a.cmx + a.o
---- OCaml modules with "external" functions
 b.ml → b.cmx + b.o
---- the extra layer of C code:
 wrappers.c → wrappers.o
---- the actual C code that doesn't care about special OCaml needs:
 foo.c → foo.o
----

The OCaml compiler includes b.o and every other compiled OCaml module
in the executable.
wrappers.o and foo.o can end up as part of a dynamically linked
library, but you have a choice. You can even call the linker directly
and do unusual things with all the *.o files that the C and OCaml
compilers produce.

Dawid




* Re: [Caml-list] Re: OCaml binary formats -- how are they linked?
  2013-06-14 10:16 ` [Caml-list] " Dawid Toton
@ 2013-06-14 12:02   ` Ömer Sinan Ağacan
  2013-06-14 14:26     ` Dawid Toton
  0 siblings, 1 reply; 8+ messages in thread
From: Ömer Sinan Ağacan @ 2013-06-14 12:02 UTC
  To: Dawid Toton; +Cc: caml-list

Thanks for your answer,

> The OCaml compiler includes b.o and every other compiled OCaml module
> in the executable.
> wrappers.o and foo.o can end up as a part of dynamically linked
> library, but you have choice. You can even call the linker directly
> and play weird things with all the *.o files that C and OCaml
> compilers produce.

What do you mean by 'calling the linker'? Do you mean system-level
functions like `dlopen` (or its equivalent for OCaml libraries)? Or
do you mean calling the OCaml compiler's linker at compile time?

Is there a program like `ldd` that shows the dynamically linked OCaml
libraries of an OCaml program?

Where can I learn more about OCaml binary file formats like .cma and
.cmx? I couldn't find a related section in the language manual.

---
Ömer Sinan Ağacan
http://osa1.net


* [Caml-list] Re: OCaml binary formats -- how are they linked?
  2013-06-14 12:02   ` Ömer Sinan Ağacan
@ 2013-06-14 14:26     ` Dawid Toton
  2013-06-14 19:17       ` Ömer Sinan Ağacan
  0 siblings, 1 reply; 8+ messages in thread
From: Dawid Toton @ 2013-06-14 14:26 UTC
  To: caml-list

On 06/14/2013 02:02 PM, Ömer Sinan Ağacan wrote:
> 
> What do you mean by 'calling the linker'? Do you mean system-level
> functions like `dlopen` (or its equivalent for OCaml libraries)?
> Or do you mean calling the OCaml compiler's linker at compile time?

I mean calling the system linker in the last stage of the build
process. The OCaml native compiler uses the standard assembler and
linker provided by the system. So, in principle, you can experiment
with linking like this:

ld -o myexecutable .../crt1.o .../crti.o .../usr/lib/ocaml/std_exit.o
... a.o b.o ...
where the dots stand for quite a lot of other object files.

You can see what actually happens with

ocamlopt -verbose -c a.ml
ocamlopt -verbose -cc 'gcc --verbose' a.cmx

By the way, I would suggest sticking to bytecode for regular
development, as it permits faster edit-build-debug cycles. ocamlc is
very fast. I believe one needs a very demanding test suite to benefit
from native compilation (production runs aside).

> 
> Is there a program like `ldd` that shows dynamically linked OCaml 
> libraries of a OCaml program?

Natively compiled OCaml code is put into regular executables, so you
can use standard tools. ldd should show you all you need.

Bytecode is compiled to files with a #! prefix. They are executable in
the sense that the prefix tells the system to launch ocamlrun;
otherwise the compiled files are OCaml-specific.
To look inside the OCaml-specific files, you can use
ocamlobjinfo
ocamldumpobj

On my Debian system they are both part of the standard OCaml installation.
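
As a rough illustration of the two cases above, here is a minimal
sketch in Python (not part of the OCaml tooling; the names
`executable_kind` and `executable_kind_of_file` are made up for this
example). It only peeks at the first few bytes to tell a #! bytecode
file from a regular ELF executable:

```python
def executable_kind(head):
    """Classify the leading bytes of an executable file.

    Returns "shebang" for a '#!' script-style header (what plain ocamlc
    output starts with), "elf" for an ELF binary, "unknown" otherwise.
    """
    if head[:2] == b"#!":
        return "shebang"
    if head[:4] == b"\x7fELF":
        return "elf"
    return "unknown"


def executable_kind_of_file(path):
    """Read the first four bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return executable_kind(f.read(4))
```

Calling `executable_kind_of_file("./myexecutable")` on a bytecode
program should give "shebang", and on an ocamlopt-built program on
Linux it should give "elf".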

But, personally, I would just

strace -e trace=open ./myexecutable

to see the loaded libraries, as it is a universal and easy-to-remember
tool.
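
If the trace gets long, the library lines can be filtered out
mechanically. The following is only a sketch in Python, under some
assumptions: the trace was saved to a text file (for example with
strace's -o option), each line has the plain `open(...) = N` shape
without a PID prefix, and the helper name `loaded_shared_objects` is
invented here. Newer systems tend to use `openat` rather than `open`,
so the pattern accepts both.

```python
import re

# Matches lines such as
#   open("/lib/x86_64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
#   openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY) = 3
OPEN_CALL = re.compile(
    r'^(?:open|openat)\((?:AT_FDCWD, )?"([^"]+)".*\) = (-?\d+)'
)

# File names like libfoo.so or libfoo.so.1.2 (but not ld.so.cache).
SHARED_OBJECT = re.compile(r"\.so(\.\d+)*$")


def loaded_shared_objects(strace_output):
    """Return the shared objects that were opened successfully.

    Paths are listed in the order they first appear; failed attempts
    (negative return value) and repeated opens are skipped.
    """
    found = []
    for line in strace_output.splitlines():
        match = OPEN_CALL.match(line.strip())
        if match is None:
            continue
        path, result = match.group(1), int(match.group(2))
        if result < 0 or not SHARED_OBJECT.search(path):
            continue
        if path not in found:
            found.append(path)
    return found
```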
	
> 
> Where can I learn more about OCaml binary file formats like .cma,
> .cmx ? I couldn't find a related section in language manual.
> 

Don't know, but here are the basics:

* cmi = compiled interface (module signature + CRCs of imported
interfaces + a flag for rectypes)
(see https://github.com/ocaml/ocaml/blob/trunk/typing/cmi_format.ml#L22)
	A cmi file is needed for compilation any time one uses the module it
describes. The compiler looks for cmi files implicitly; you don't
mention them on the command line.
	
* cmo = a module compiled to bytecode (actual bytecode + basic info
about the bytecode, like its size + list of imported interfaces + list
of primitives + force-link flag + debug info description)
(see https://github.com/ocaml/ocaml/blob/trunk/bytecomp/bytelink.ml#L123)
	cmo files are needed at the moment you want a bytecode executable to
be produced. You can forget about these intermediate files if you tell
the compiler (or another tool) directly that you want an executable.

* cmx = extra info about a module compiled to native code. The actual
machine code is stored in standard object files (like *.o). This means
that foobar.ml is compiled to a pair of files, {foobar.cmx, foobar.o}.
	cmx files are optionally used as a companion to cmi files to provide
extra optimization hints. I think that they are also needed just
before linking.
(see https://github.com/ocaml/ocaml/blob/trunk/asmcomp/cmx_format.mli#L25)

* cma = some linking information + a collection of bytecode-compiled
modules
	cma files are used to store a collection of compiled modules in a
single file (this sort of aggregation of files is a traditional extra
step that precedes linking; it makes published libraries look nicer,
at no real benefit nowadays). Also, they are needed whenever storage
of the extra linking info is crucial (in this case the benefit is
real).

* cmxa = some linking information together with a collection of the
pieces of info that accompany natively compiled modules (see
Cmx_format.library_infos)
	cmxa files accompany the traditional *.a files (libraries,
collections of pieces of object code). They are used similarly to cma
files, so publishing a library requires fewer files.
	*.cma, *.cmxa, and *.a files are used just before and during linking
whenever a normally packed library is in use.

* cmxs = a shared object (standard file format), named so because it
is going to be loaded by Dynlink

It looks like OCaml 4.01 will also have
* cmt = information intended for IDEs, like source file names and,
apparently, some pieces of the typed AST. cmt files are like an
extension of cmi files.

Each of the above (except cmxs) starts with a magic number
(see https://github.com/ocaml/ocaml/blob/trunk/utils/config.mlp#L51)
and the standard "file" command is able to say a bit about a file it sees.
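
To make the magic-number idea concrete, here is a small sketch in
Python. It is not how the compiler itself does it, and the names
`OCAML_MAGIC_PREFIXES`, `ocaml_file_kind` and `ocaml_file_kind_of_path`
are invented for the example. It assumes that the magic number is 12
bytes long, that its first nine bytes identify the kind of file, and
that the last three are a version counter that changes between OCaml
releases. Only the five formats listed here are covered; cmt and others
are left out.

```python
# First nine bytes of the magic number -> kind of file. The trailing
# three characters (a version counter) are deliberately ignored.
OCAML_MAGIC_PREFIXES = {
    b"Caml1999I": "cmi",   # compiled interface
    b"Caml1999O": "cmo",   # bytecode-compiled module
    b"Caml1999A": "cma",   # bytecode library
    b"Caml1999Y": "cmx",   # info for a natively compiled module
    b"Caml1999Z": "cmxa",  # info for a native library
}


def ocaml_file_kind(header):
    """Classify the first bytes of a file; None if not recognised."""
    return OCAML_MAGIC_PREFIXES.get(bytes(header[:9]))


def ocaml_file_kind_of_path(path):
    """Read the 12-byte header of `path` and classify it."""
    with open(path, "rb") as f:
        return ocaml_file_kind(f.read(12))
```

For instance, `ocaml_file_kind_of_path("foobar.cmx")` should return
"cmx" regardless of the file's actual extension, since only the
content is inspected.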

Dawid



* Re: [Caml-list] Re: OCaml binary formats -- how are they linked?
  2013-06-14 14:26     ` Dawid Toton
@ 2013-06-14 19:17       ` Ömer Sinan Ağacan
  2013-06-17  0:18         ` Philippe Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Ömer Sinan Ağacan @ 2013-06-14 19:17 UTC
  To: Dawid Toton; +Cc: caml-list

Dawid, thanks for your answer. It really helped.

And now I have some more questions (that's part of the learning process, right? ;-)

> You can see what actually happens with
>
> ocamlopt -verbose -c a.ml
> ocamlopt -verbose -cc 'gcc --verbose' a.cmx

I added the -verbose parameter in my Makefile, and the output was
interesting: it called gcc! At first I thought it was called just for
linking, but later I realized there is also a C file passed to gcc.

It first passed the output parameter (`-o executableName`), then some
-L parameters. After that it passed the file `/tmp/camlprimcb16b7.c`.
What is that file? I couldn't read it because it was deleted after
compilation.

After that, some parameters are passed for static linking (-ldl, -lm,
-lpthread, -lcamlrun, etc.).


> strace -e trace=open ./myexecutable

This is so great: a format-independent way to see dynamically linked
libraries. Thanks for the tip! (By the way, is there a different name
for the process of loading and linking libraries at run time with
`dlopen`-style calls? Is that also called `dynamically linked`?)


> Bytecode is compiled to a files with #! prefix.

Interesting. I just tried reading an OCaml executable created with
ocamlc, and it had an ELF header. Am I compiling my program to native
code by mistake? I'm not calling ocamlopt, only ocamlc.

Thanks again,


---
Ömer Sinan Ağacan
http://osa1.net


* Re: [Caml-list] Re: OCaml binary formats -- how are they linked?
  2013-06-14 19:17       ` Ömer Sinan Ağacan
@ 2013-06-17  0:18         ` Philippe Wang
  2013-06-17 12:12           ` Ömer Sinan Ağacan
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Wang @ 2013-06-17  0:18 UTC
  To: Ömer Sinan Ağacan; +Cc: Dawid Toton, OCaml Mailing List

On Fri, Jun 14, 2013 at 8:17 PM, Ömer Sinan Ağacan <omeragacan@gmail.com> wrote:
> Dawid, thanks for your answer. It really helped.

>
>> Bytecode is compiled to a files with #! prefix.
>
> Interesting, I just tried reading an OCaml executable created with
> ocamlc, and it had a ELF header. Am I compiling to my program to
> native by mistake? I'm not calling ocamlopt, only ocamlc.
>

If you're using ocamlc and get an ELF header, it probably means that
you used the -custom option. In that case, instead of a
#!/path/to/ocamlrun line, you get ocamlrun (the binary) as the first
part of the binary.
Actually, when the OCaml VM runs a bytecode executable, it reads from
the end of the file (where there is information on, e.g., where the
bytecode is), which is why virtually anything can be put before the
actual bytecode.

$ echo 'print_endline "hello";;' > p.ml
$ ocamlc p.ml -o p
$ ( head -n 1 p ; cat p.ml p ) > q
$ chmod a+x q
$ ./q
hello

Oh, a binary that contains its source code! :-)
(It's not that easy if it's compiled using -custom, because then you
have to find out where the ELF binary ends in order to insert whatever
you might want to insert.)
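
For the curious, reading from the end can be sketched in a few lines
of Python. This is only an illustration, based on my understanding of
the layout described in byterun/exec.h: section data, then a table of
(4-character name, 4-byte length) entries, then a 16-byte trailer made
of a big-endian section count and a magic number starting with
"Caml1999X". The function name `read_sections` is made up, and the
real loader does more checking than this.

```python
import struct

TRAILER_SIZE = 16  # 4-byte section count + 12-byte magic number
ENTRY_SIZE = 8     # 4-character section name + 4-byte length


def read_sections(data):
    """Parse the sections of a bytecode executable held in `data`.

    Only the tail of `data` is inspected, so whatever comes first
    (a #! line, source code, an ELF binary) is simply ignored.
    Returns a list of (name, payload) pairs in file order.
    """
    if len(data) < TRAILER_SIZE:
        raise ValueError("file too short to hold a trailer")
    count, magic = struct.unpack(">I12s", data[-TRAILER_SIZE:])
    if not magic.startswith(b"Caml1999X"):
        raise ValueError("no bytecode magic number at end of file")

    # The table of contents sits immediately before the trailer.
    table_start = len(data) - TRAILER_SIZE - ENTRY_SIZE * count
    if table_start < 0:
        raise ValueError("section table does not fit in the file")
    entries = [
        struct.unpack_from(">4sI", data, table_start + ENTRY_SIZE * i)
        for i in range(count)
    ]

    # Section payloads are stored back to back, right before the table.
    offset = table_start - sum(length for _, length in entries)
    if offset < 0:
        raise ValueError("section lengths exceed the file size")
    sections = []
    for name, length in entries:
        sections.append((name.decode("ascii"), data[offset:offset + length]))
        offset += length
    return sections
```

Prepending arbitrary bytes to `data` does not change the result, which
is the same effect as the `cat p.ml p` trick above.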

--
Philippe Wang
   mail@philippewang.info


* Re: [Caml-list] Re: OCaml binary formats -- how are they linked?
  2013-06-17  0:18         ` Philippe Wang
@ 2013-06-17 12:12           ` Ömer Sinan Ağacan
  2013-06-17 12:34             ` Philippe Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Ömer Sinan Ağacan @ 2013-06-17 12:12 UTC
  To: Philippe Wang; +Cc: Dawid Toton, OCaml Mailing List

> $ echo 'print_endline "hello";;' > p.ml
> $ ocamlc p.ml -o p
> $ ( head -n 1 p ; cat p.ml p ) > q
> $ chmod a+x q
> $ ./q
> hello
>
> Oh, a binary that contains its source code! :-)

Thanks, this is very interesting. As far as I understand, ocamlrun
looks for a magic number in the file to start interpreting, right?

---
Ömer Sinan Ağacan
http://osa1.net


* Re: [Caml-list] Re: OCaml binary formats -- how are they linked?
  2013-06-17 12:12           ` Ömer Sinan Ağacan
@ 2013-06-17 12:34             ` Philippe Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Philippe Wang @ 2013-06-17 12:34 UTC
  To: Ömer Sinan Ağacan; +Cc: Dawid Toton, OCaml Mailing List

On Mon, Jun 17, 2013 at 1:12 PM, Ömer Sinan Ağacan <omeragacan@gmail.com> wrote:
>> $ echo 'print_endline "hello";;' > p.ml
>> $ ocamlc p.ml -o p
>> $ ( head -n 1 p ; cat p.ml p ) > q
>> $ chmod a+x q
>> $ ./q
>> hello
>>
>> Oh, a binary that contains its source code! :-)
>
> Thanks, this is very interesting. As far as I understand, OCamlrun
> looks for a magic number in source file to start interpreting, right?

ocamlrun checks the magic number and a few other properties as well.
For instance, if it can't read the bytecode for some reason, it'll
raise an error. If you're interested in how it's implemented, you
might want to look into ocaml-4.00.1/byterun/fix_code.c.

--
Philippe Wang
   mail@philippewang.info
