caml-list - the Caml user's mailing list
From: Dawid Toton <d0@wp.pl>
To: caml-list@inria.fr
Subject: [Caml-list] Re: OCaml binary formats -- how are they linked?
Date: Fri, 14 Jun 2013 16:26:47 +0200
Message-ID: <kpf971$3f8$1@ger.gmane.org>
In-Reply-To: <CAMQQO3=TzNn1dqZYdZKLx_zv34H86iEKdM480++V84wRsiwKeA@mail.gmail.com>

On 06/14/2013 02:02 PM, Ömer Sinan Ağacan wrote:
> 
> What do you mean by 'calling the linker'? Do you mean system level 
> functions like `dload` (or it's equivalent for OCaml libraries) ?
> Or do you mean calling OCaml compiler's linker in compile time?

I mean calling the system linker in the last stage of the build process.
The OCaml native compiler uses the standard assembler and linker provided
by the system. So, in principle, you can experiment with linking like this:

ld -o myexecutable .../crt1.o .../crti.o .../usr/lib/ocaml/std_exit.o
... a.o b.o ...
where the dots stand for quite a lot of other object files.

You can see what actually happens with

ocamlopt -verbose -c a.ml
ocamlopt -verbose -cc 'gcc --verbose' a.cmx

By the way, I would suggest sticking to bytecode for regular
development, as it permits faster edit-build-debug cycles. ocamlc is
very fast. I believe one has to have a very demanding test suite in
order to benefit from native compilation (production runs aside).

> 
> Is there a program like `ldd` that shows dynamically linked OCaml 
> libraries of a OCaml program?

Natively compiled OCaml code is put into regular executables, so you can
use standard tools. ldd should show you all you need.

Bytecode is compiled to files with a #! prefix. They are executable in
the sense that the prefix tells the system to launch ocamlrun; otherwise
the compiled files are OCaml-specific.
In order to look inside the OCaml-specific files, you can use
ocamlobjinfo
ocamldumpobj

On my Debian system they are both part of the standard OCaml installation.
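
The #! prefix described above can be checked by hand. Here is a minimal
sketch; the function name shebang_interpreter is made up for this
example, and the ocamlrun path in the usage note is only an assumption
about a typical installation:

```shell
# Print the interpreter named on a file's "#!" line.
# Returns 1 (and prints nothing) if the file does not start with "#!".
shebang_interpreter() {
    # Only the first line matters; the rest of a bytecode file is binary.
    # NUL bytes are dropped so the shell does not complain about them.
    first_line=$(head -n 1 "$1" | tr -d '\0')
    case $first_line in
        '#!'*) printf '%s\n' "${first_line#??}" ;;  # strip the two-character prefix
        *)     return 1 ;;
    esac
}
```

For a bytecode executable, "shebang_interpreter ./myexecutable" would be
expected to print something like /usr/bin/ocamlrun (the exact path
depends on the installation). For a natively compiled one it prints
nothing and fails.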

But, personally, I would just

strace -e trace=open ./myexecutable

to see the loaded libraries, as it is a universal and easy-to-remember
tool.
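
The strace output can be long, so a small filter helps. The following is
only a sketch: opened_shared_libs is a hypothetical helper name, and it
assumes the usual strace line format (open("path", flags) = fd). It
accepts openat lines too, since newer systems tend to use that call
instead of open:

```shell
# Read strace output on stdin and print every shared library that was
# opened successfully, once each, in order of first appearance.
opened_shared_libs() {
    grep -E '^open(at)?\(' |          # keep only open/openat calls
    grep -v ' = -1 ' |                # drop failed attempts (ENOENT etc.)
    sed -n 's/^[^"]*"\([^"]*\.so[.0-9]*\)".*/\1/p' |  # path ending in .so or .so.N
    awk '!seen[$0]++'                 # remove duplicates, keep order
}

# Usage (strace writes its trace to stderr):
#   strace -e trace=open ./myexecutable 2>&1 >/dev/null | opened_shared_libs
```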
	
> 
> Where can I learn more about OCaml binary file formats like .cma,
> .cmx ? I couldn't find a related section in language manual.
> 

Don't know, but here are the basics:

* cmi = compiled interface (module signature + CRCs of imported
interfaces + a flag for rectypes)
(see https://github.com/ocaml/ocaml/blob/trunk/typing/cmi_format.ml#L22)
	A cmi file is needed for compilation any time one uses the module it
describes. The compiler looks for cmi files implicitly; you don't
mention them on the command line.
	
* cmo = a module compiled to bytecode (actual bytecode + basic info
about the bytecode like its size + list of imported interfaces + list
of primitives + force-link flag + debug info descr.)
(see https://github.com/ocaml/ocaml/blob/trunk/bytecomp/bytelink.ml#L123)
	cmo files are needed when you want to produce a bytecode executable.
You can forget about these intermediate files if you tell the compiler
(or another tool) directly that you want an executable.

* cmx = extra info about a module compiled to native code. The actual
machine code is stored in standard object files (like *.o). This means
that foobar.ml is compiled to a pair of files, {foobar.cmx, foobar.o}.
	cmx files are optionally used as a companion to cmi files to provide
extra optimization hints. I think that they are also needed just
before linking.
(see https://github.com/ocaml/ocaml/blob/trunk/asmcomp/cmx_format.mli#L25)

* cma = some linking information + a collection of bytecode-compiled
modules
	cma files are used to store a collection of compiled modules in a
single file (this sort of aggregation of files is a traditional extra
step that precedes linking; it makes published libraries look nicer, at
no real benefit nowadays). Also, they are needed whenever storage of the
extra linking info is crucial (in this case the benefit is real).

* cmxa = some linking information together with a collection of the
pieces of info that accompany natively compiled modules (see
Cmx_format.library_infos)
	cmxa files accompany the traditional *.a files (libraries,
collections of pieces of object code). They are used similarly to cma
files, so publication of a library requires fewer files.
	*.cma, *.cmxa, and *.a files are used just before and during linking
whenever a normally packed library is in use.

* cmxs = a shared object (standard file format), named so because it is
going to be loaded by Dynlink

It looks like in OCaml 4.01 we'll also have
* cmt = information intended for IDEs, like source file names and,
apparently, some pieces of the typed AST. cmt files are like an
extension of cmi files.

Each of the above (except cmxs) starts with a magic number
(see https://github.com/ocaml/ocaml/blob/trunk/utils/config.mlp#L51)
and the standard "file" command is able to say a bit about a file it sees.
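
The magic number can also be read by hand. This is a rough sketch, not
something the compiler ships: ocaml_file_kind is a made-up name, and the
letter codes (I, O, A, Y, Z after "Caml1999") are the ones I remember
from config.mlp, so check them against the file linked above. The last
three digits are a format version that changes between releases. cmt is
left out because I am not sure of its prefix.

```shell
# Guess the kind of an OCaml-specific file from its 12-byte magic number.
# Prints a short description; returns 1 for anything it does not recognize.
ocaml_file_kind() {
    # Read exactly the first 12 bytes; drop NULs so the shell stays quiet.
    magic=$(dd if="$1" bs=12 count=1 2>/dev/null | tr -d '\0')
    case $magic in
        Caml1999I???) echo "cmi (compiled interface)" ;;
        Caml1999O???) echo "cmo (bytecode object)" ;;
        Caml1999A???) echo "cma (bytecode library)" ;;
        Caml1999Y???) echo "cmx (native-code info)" ;;
        Caml1999Z???) echo "cmxa (native-code library info)" ;;
        *)            echo "unknown"; return 1 ;;
    esac
}
```

Running "ocaml_file_kind foobar.cmi" on a real compiled interface should
report a cmi, while an ordinary *.o or *.a file falls through to
"unknown" because it is in a standard system format.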

Dawid

