caml-list - the Caml user's mailing list
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Alain Frisch <alain.frisch@lexifi.com>
Cc: OCaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] ocamldep, transitive dependencies, build systems, flambda
Date: Tue, 05 Jul 2016 15:53:39 +0200
Message-ID: <1467726819.17506.22.camel@e130.lan.sumadev.de>
In-Reply-To: <a3ecd961-eb6a-27ba-823c-7e798771ceb6@lexifi.com>

Hi Alain,

my thinking is that the core of the issue is more a naming problem.
Currently, cmi and cmx files are named by the module to which they
refer. This works well as long as all modules are exposed to the user.

What you need is a way of hiding the module while keeping a reference to
the "module as such", which becomes effectively anonymous once the cmi is
removed. Let's follow this idea for a while: support for anonymous modules.
Imagine there was a command

ocamlhide A

which reads a.cmi and writes _a_900150983cd24fb0d6963f7d28e17f72.cmi, a
name made up of the original module name and the module's checksum. The
same goes for the cmx file. In this form, the cmi file can no
longer be directly used, but indirect references from other modules are
still possible (the compiler would use the checksum for resolving the
dependency). These files can then be installed into the pub directory.
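
A rough sketch of the naming idea, in Python purely for illustration. The
`ocamlhide` command does not exist, and `hidden_name` is a made-up helper.
I assume here that the checksum is an MD5 hex digest of some bytes standing
in for the module's interface; how the real digest would be obtained from
the compiler is left open.

```python
import hashlib


def hidden_name(module: str, interface_bytes: bytes, ext: str = "cmi") -> str:
    """Build the file name a hypothetical `ocamlhide` could emit.

    Assumption: the checksum is the MD5 hex digest of `interface_bytes`,
    a stand-in for whatever digest the compiler records for the module.
    """
    digest = hashlib.md5(interface_bytes).hexdigest()
    # Leading underscore plus lowercased module name, then the checksum.
    return f"_{module.lower()}_{digest}.{ext}"


print(hidden_name("A", b"abc"))
print(hidden_name("A", b"abc", ext="cmx"))
```

Two versions of A with different interfaces map to different file names,
so both could sit in the same pub directory without clashing.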

First of all, this solves an immediate problem with simply not installing
the cmi, namely a possible conflict with the name of a user's module (what
will happen when another module A enters the scene later?). Second, all
problems with different versions of the same module are gone (well, I'm
not sure about the symbols, but let's consider this as a solvable
detail). In particular, it is possible to have several versions of a
module at the same time. If a particular version of a module is demanded
for linking, it will be available and not conflict with any older
artifact.

Of course, ocamldep still has no chance to see such hidden deps by only
looking at the sources. I don't have a really good idea that would
easily integrate with build tools. If you manually manage the deps you'd
have to do it this way:

lib1/pub/B.cmx: lib1/src/B.cmx lib1/src/A.cmx
lib1/pub/B.cmi: lib1/src/B.cmi lib1/src/A.cmi

This is because, when A.cmi changes, the reference in B.cmi to the
anonymous module also changes. You cannot list the files with the checksums
as prerequisites, because the checksum is exactly the part that would be
updated by the rebuild.
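
If you wanted to generate such rules rather than write them by hand, it
could look roughly like this (Python only as illustration; `pub_rules` and
its parameters are hypothetical names, and the list of hidden deps is
assumed to be supplied manually, since ocamldep cannot discover it):

```python
def pub_rules(lib, module, hidden_deps):
    """Return make-style rules for installing `module` into lib/pub.

    `hidden_deps` lists modules of the same library that `module` refers
    to but that are only installed in anonymous (checksum-named) form.
    """
    rules = []
    for ext in ("cmx", "cmi"):
        target = f"{lib}/pub/{module}.{ext}"
        # Depend on the source-side files, never on the checksum-named ones.
        prereqs = [f"{lib}/src/{m}.{ext}" for m in [module, *hidden_deps]]
        rules.append(f"{target}: {' '.join(prereqs)}")
    return rules


for rule in pub_rules("lib1", "B", ["A"]):
    print(rule)
```

For B with the hidden dependency A, this prints the two rules shown above.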

This is just a thought. I'm not sure whether a good solution is in reach
with these ideas.

Gerd


On Monday, 04.07.2016, at 18:49 +0200, Alain Frisch wrote:
> Dear all,
> 
> I'd like to know if people have good solutions to address the problem below.
> 
> Assume a large project, with multiple libraries spread over 
> sub-directories, all managed by a single global build system that tracks 
> dependencies on a per-file basis (i.e. if a module depends on modules of 
> another library, it is not necessarily recompiled when only other modules in 
> that library are modified).
> 
> For instance, imagine a library in lib1/src with two modules A and B, 
> B.ml and B.mli both depending on A.  Thanks to ocamldep, the build 
> system learns about the following dependencies (in make syntax):
> 
>   lib1/src/B.cmx: lib1/src/A.cmi lib1/src/A.cmx
>   lib1/src/B.cmi: lib1/src/A.cmi
> 
> For various reasons, one might want to "install" some build artefacts 
> (.cmi, .cmx) in staging directories.  One possible reason is to expose 
> only a subset of a library internal modules to other libraries.  For our 
> example, imagine that both A and B are part of the public API. So we 
> create copy rules and record the associated dependencies to the build 
> system:
> 
>   lib1/pub/A.cmx: lib1/src/A.cmx
>   lib1/pub/A.cmi: lib1/src/A.cmi
>   lib1/pub/B.cmx: lib1/src/B.cmx
>   lib1/pub/B.cmi: lib1/src/B.cmi
> 
> Another library lib2/ is only allowed to see this public API, and so is 
> compiled with "-I $(ROOT)/lib1/pub" (and not "-I $(ROOT)/lib1/src").  A 
> module C in this library depends directly on B, and the build system 
> thus infers the following dependencies:
> 
>   lib2/src/C.cmx: lib1/pub/B.cmi lib1/pub/B.cmx
> 
> C has no reference to A in its source code so ocamldep has no way to 
> know that it (transitively) depends on A.  The trouble is that some 
> dependencies are effectively unknown to the build system, which can lead 
> to broken builds.  For instance, when lib1/src/A.mli is modified and one 
> asks the build system to refresh lib2/src/C.cmx, the dependencies above 
> will force only the following files to be refreshed in the process:
> 
>   lib1/pub/B.cmi lib1/pub/B.cmx lib1/src/B.cmx lib1/src/B.cmi 
> lib1/src/A.cmi lib1/src/A.cmx
> 
> So when C.ml is recompiled to produce C.cmx, it will see the old version 
> of lib1/pub/A.cmi.  But even if ocamldep does not report any dependency 
> from C to A, the type-checker might need to open A.cmi to expand e.g. 
> type aliases, hence the broken build.  I reported this problem in 
> http://caml.inria.fr/mantis/view.php?id=5624 and the fix we have in 
> place at LexiFi is to compile in a "strict" mode where the compiler 
> prevents itself from opening a .cmi file which is not a direct 
> dependency (i.e. the compiler runs ocamldep internally and restricts its 
> view of the file system accordingly).  This works fine and only forces 
> us to explicitly add some dummy references.  (Typically, if one needs 
> A.cmi to compile C.ml, one would add a dummy reference to A somewhere in 
> C.ml.  And ocamldep will thus report that C.cmx depends on A.cmi, which 
> will fix the problem above.)
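
To make the quoted scenario concrete, here is a small sketch (Python as
neutral illustration; `RULES` and `closure` are names I made up) that
encodes exactly the make-style rules quoted above and follows them from
lib2/src/C.cmx. It only models the dependency graph, not timestamps or the
compiler.

```python
# Rules from the quoted message: target -> prerequisites.
RULES = {
    "lib1/src/B.cmx": ["lib1/src/A.cmi", "lib1/src/A.cmx"],
    "lib1/src/B.cmi": ["lib1/src/A.cmi"],
    "lib1/pub/A.cmx": ["lib1/src/A.cmx"],
    "lib1/pub/A.cmi": ["lib1/src/A.cmi"],
    "lib1/pub/B.cmx": ["lib1/src/B.cmx"],
    "lib1/pub/B.cmi": ["lib1/src/B.cmi"],
    "lib2/src/C.cmx": ["lib1/pub/B.cmi", "lib1/pub/B.cmx"],
}


def closure(rules, target):
    """All files reachable from `target` by following prerequisites."""
    seen = set()
    stack = [target]
    while stack:
        for dep in rules.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


needed = closure(RULES, "lib2/src/C.cmx")
print(sorted(needed))
print("lib1/pub/A.cmi" in needed)
```

The reachable set is the six files listed in the quoted message, and
lib1/pub/A.cmi is not among them, so nothing forces it to be refreshed
before C is recompiled.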
> 
> I'm wondering how other groups manage this kind of problem.
> 
> Moreover, flambda makes the problem actually quite a bit worse.  Indeed, 
> B.cmx can now contain symbolic references to A.cmx, and when compiling 
> C.cmx, the compiler will complain that it cannot find A.cmx (typically 
> when a function in B is inlined in C and calls a function in A).  This 
> is warning 58.  Simply disabling the warning does not work, since an old 
> version of A.cmx could remain in lib1/pub, leading to mismatched 
> implementation digests and to unreliable parallel build.
> 
> One could apply the same trick as for .cmi files, i.e. prevent the 
> compiler from opening A.cmx if the current unit does not depend 
> (according to ocamldep) on A.  But this is not as good as for 
> interfaces, for two reasons:
> 
>    - It's harder for the user to figure out that an explicit dependency 
> must be forced, because this is not exposed in the published API (i.e. 
> the module interfaces), but only in the implementation.  Moreover, it 
> depends on internals of the compiler whether A.cmx is actually needed to 
> compile C.cmx (e.g. in non-flambda mode, and perhaps in flambda mode 
> with some settings, it is not needed).
> 
>    - We still want to be able *not* to install A.cmi in lib1/pub if A is 
> not part of the public API of lib1.  But this would prevent the code in 
> C from forcing a dependency on A.
> 
> 
> A different direction would be to register extra dependencies between 
> "installed" files depending on the dependencies between source units. 
> In the example above, one would register:
> 
> 
>   lib1/pub/B.cmx: lib1/pub/A.cmi lib1/pub/A.cmx
>   lib1/pub/B.cmi: lib1/pub/A.cmi lib1/src/B.cmi
> 
> 
> The problem is that this creates interactions between the copy rules 
> (which are just regular copy commands with the associated dependencies) 
> and the normal build rules for OCaml units (with automatic discovery of 
> dependencies with "ocamldep -modules").  In our case, our build system 
> is omake and these two kinds of rules are completely separated (generic 
> build rules and one or several "install" rules to expose different APIs 
> to various parts of the projects).  We don't see how to write our build 
> rules in a modular way and keep the automatic discovery of dependencies.
> 
> The core of the problem, as I see it, is that ocamldep cannot return 
> even an over-approximation of the actual dependencies of a given unit. 
> It misses "implicit" dependencies related to either aliases in the type 
> system or cross-module optimizations in cmx files (with flambda at 
> least, the problem does not seem to exist at the implementation level 
> for non-flambda mode).
> 
> 
> So if any other group has faced the same problem and found a nice 
> solution (with omake or another build system), I'd love to hear about it!
> 
> 
> -- Alain
> 

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
My OCaml site:          http://www.camlcity.org
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------

