caml-list - the Caml user's mailing list
From: Nick Chapman <nchapman@janestreet.com>
To: Alain Frisch <alain.frisch@lexifi.com>
Cc: OCaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] ocamldep, transitive dependencies, build systems, flambda
Date: Tue, 5 Jul 2016 10:17:20 +0100
Message-ID: <CANViCaQozEoDvXf4NcOt-xr3dgjWZfujBWX6zZXh5H3ExejWtg@mail.gmail.com>
In-Reply-To: <a3ecd961-eb6a-27ba-823c-7e798771ceb6@lexifi.com>


Hi Alain,

We have a setup at Jane Street quite similar to the one you describe. We
also install library artifacts into what you call a "pub" directory.

We sidestep the ocamldep issues you describe by using ocamldep only to
determine dependencies within a library, not between libraries.
Dependencies between libraries are handled by requiring the user to
explicitly list the libraries a library depends on (in a "jbuild" file).
We then set up dependencies on all public .cmi's in the listed libraries.
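
As a rough illustration of the scheme above (not our actual jenga rules),
the sketch below models it in Python. The names `compile_deps`,
`declared_libs` and `public_cmis` are hypothetical, and the ocamldep result
is passed in as plain data rather than computed.

```python
# Hypothetical model: library-level dependencies come from an explicit list
# (as in a "jbuild" file); ocamldep is consulted only within the library.

def compile_deps(lib, intra_lib_deps, declared_libs, public_cmis):
    """Return the sorted list of files a unit of `lib` must wait for.

    intra_lib_deps: files reported by ocamldep for modules of the same library
    declared_libs:  dict mapping a library to the libraries it lists explicitly
    public_cmis:    dict mapping a library to its installed public .cmi files
    """
    deps = set(intra_lib_deps)
    for dep_lib in declared_libs.get(lib, []):
        # Coarse-grained: depend on every public .cmi of each listed library.
        deps.update(public_cmis.get(dep_lib, []))
    return sorted(deps)


declared_libs = {"lib2": ["lib1"]}
public_cmis = {"lib1": ["lib1/pub/A.cmi", "lib1/pub/B.cmi"]}

# C.ml mentions only B, yet the rule for C.cmx also waits for A.cmi.
print(compile_deps("lib2", [], declared_libs, public_cmis))
```

With your example, a unit of lib2 that mentions only B still waits for
lib1/pub/A.cmi, which is what avoids the stale-interface problem at the
cost of some unnecessary rebuilds.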

This does mean our dependencies are not as fine grained as one might like.
Previously when we packed all our libraries this wasn't an issue since
there was only a single .cmi per library anyway - i.e. we had already lost
any chance to be more fine grained - but we could perhaps do better now.

There was a further issue we needed to solve to get our scheme working.
Suppose library A lists library B as a dependency and allows this
dependence to be exposed in its interface. Clients of library A will
require access to the .cmi's of library B to be compiled but it seems
unreasonable to require them to explicitly list library B as a dependency.
We solve this by automatically running ocamlobjinfo on the public .cmi's of a
library to discover additional library deps required by clients of the
library.
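
A minimal sketch of that discovery step, again in Python and only as an
illustration. `cmi_imports` stands in for the list of imported interfaces
one would read out of each public .cmi, and `lib_of_module` is a
hypothetical table mapping a module to the library that owns it. Closing
the client's library list transitively is an assumption of the sketch.

```python
def exposed_libs(lib, public_modules, cmi_imports, lib_of_module):
    """Libraries that clients of `lib` need in addition to `lib` itself.

    public_modules: dict mapping a library to its public module names
    cmi_imports:    dict mapping a module to the modules its .cmi imports
    lib_of_module:  dict mapping a module to the library that owns it
    """
    extra = set()
    for module in public_modules.get(lib, []):
        for imported in cmi_imports.get(module, []):
            owner = lib_of_module.get(imported)
            # Skip the library itself and modules we know nothing about
            # (for example the standard library).
            if owner is not None and owner != lib:
                extra.add(owner)
    return sorted(extra)


def client_libs(listed, public_modules, cmi_imports, lib_of_module):
    """Close the client's explicit library list under exposed dependencies."""
    result, todo = set(), list(listed)
    while todo:
        lib = todo.pop()
        if lib not in result:
            result.add(lib)
            todo.extend(
                exposed_libs(lib, public_modules, cmi_imports, lib_of_module)
            )
    return sorted(result)


# Library A exposes a type from library B in its interface.
public_modules = {"A": ["A_api"], "B": ["B_types"]}
cmi_imports = {"A_api": ["B_types", "Pervasives"], "B_types": ["Pervasives"]}
lib_of_module = {"A_api": "A", "B_types": "B"}

print(client_libs(["A"], public_modules, cmi_imports, lib_of_module))
```

Here a client that lists only library A ends up with both A and B on its
library list, without having to name B itself.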

We use jenga to set up all the dependencies described above.

The above is from memory and represents our approach circa a year ago or
so. Some details might be wrong now.

Nick Chapman.


On 4 July 2016 at 17:49, Alain Frisch <alain.frisch@lexifi.com> wrote:

> Dear all,
>
> I'd like to know if people have good solutions to address the problem
> below.
>
> Assume a large project, with multiple libraries spread over
> sub-directories, all managed by a single global build system that tracks
> dependencies on a per-file basis (i.e. if a module depends on modules in
> another library, it is not necessarily recompiled when only modules in that
> library are modified).
>
> For instance, imagine a library in lib1/src with two modules A and B, B.ml
> and B.mli both depending on A.  Thanks to ocamldep, the build system learns
> about the following dependencies (in make syntax):
>
>  lib1/src/B.cmx: lib1/src/A.cmi lib1/src/A.cmx
>  lib1/src/B.cmi: lib1/src/A.cmi
>
> For various reasons, one might want to "install" some build artefacts
> (.cmi, .cmx) in staging directories.  One possible reason is to expose only
> a subset of a library internal modules to other libraries.  For our
> example, imagine that both A and B are part of the public API. So we create
> copy rules and record the associated dependencies to the build system:
>
>  lib1/pub/A.cmx: lib1/src/A.cmx
>  lib1/pub/A.cmi: lib1/src/A.cmi
>  lib1/pub/B.cmx: lib1/src/B.cmx
>  lib1/pub/B.cmi: lib1/src/B.cmi
>
> Another library lib2/ is only allowed to see this public API, and so is
> compiled with "-I $(ROOT)/lib1/pub" (and not "-I $(ROOT)/lib1/src").  A
> module C in this library depends directly on B, and the build system thus
> infers the following dependencies:
>
>  lib2/src/C.cmx: lib1/pub/B.cmi lib1/pub/B.cmx
>
> C has no reference to A in its source code so ocamldep has no way to know
> that it (transitively) depends on A.  The trouble is that some dependencies
> are effectively unknown to the build system, which can lead to broken
> builds.  For instance, when lib1/src/A.mli is modified and one asks the
> build system to refresh lib2/src/C.cmx, the dependencies above will force
> only the following files to be refreshed in the process:
>
>  lib1/pub/B.cmi lib1/pub/B.cmx lib1/src/B.cmx lib1/src/B.cmi
> lib1/src/A.cmi lib1/src/A.cmx
>
> So when C.ml is recompiled to produce C.cmx, it will see the old version
> of lib1/pub/A.cmi.  But even if ocamldep does not report any dependency
> from C to A, the type-checker might need to open A.cmi to expand e.g. type
> aliases, hence the broken build.  I reported this problem in
> http://caml.inria.fr/mantis/view.php?id=5624 and the fix we have in place
> at LexiFi is to compile in a "strict" mode where the compiler prevents
> itself from opening a .cmi file which is not a direct dependency (i.e. the
> compiler runs ocamldep internally and restricts its view of the file system
> accordingly).  This works fine and only forces us to explicitly add some
> dummy references.  (Typically, if one needs A.cmi to compile C.ml, one
> would add a dummy reference to A somewhere in C.ml.  And ocamldep will thus
> report that C.cmx depends on A.cmi, which will fix the problem above.)
>
> I'm wondering how other groups manage this kind of problem.
>
> Moreover, flambda makes the problem actually quite a bit worse.  Indeed,
> B.cmx can now contain symbolic references to A.cmx, and when compiling
> C.cmx, the compiler will complain that it cannot find A.cmx (typically when
> a function in B is inlined in C and calls a function in A).  This is
> warning 58.  Simply disabling the warning does not work, since an old
> version of A.cmx could remain in lib1/pub, leading to mismatched
> implementation digests and to unreliable parallel build.
>
> One could apply the same trick as for .cmi files, i.e. prevent the
> compiler from opening A.cmx if the current unit does not depend (according
> to ocamldep) on A.  But this is not as good as for interfaces, for two
> reasons:
>
>   - It's harder for the user to figure out that an explicit dependency
> must be forced, because this is not exposed in the published API (i.e. the
> module interfaces), but only in the implementation.  Moreover, it depends
> on internals of the compiler whether A.cmx is actually needed to compile
> C.cmx (e.g. in non-flambda mode, and perhaps in flambda mode with some
> settings, it is not needed).
>
>   - We still want to be able *not* to install A.cmi in lib1/pub if A is
> not part of the public API of lib1.  But this would prevent the code in C
> from forcing a dependency on A.
>
>
> A different direction would be to register extra dependencies between
> "installed" files depending on the dependencies between source units. In
> the example above, one would register:
>
>
>  lib1/pub/B.cmx: lib1/pub/A.cmi lib1/pub/A.cmx
>  lib1/pub/B.cmi: lib1/pub/A.cmi lib1/src/B.cmi
>
>
> The problem is that this creates interactions between the copy rules
> (which are just regular copy commands with the associated dependencies) and
> the normal build rules for OCaml units (with automatic discovery of
> dependencies with "ocamldep -modules").  In our case, our build system is
> omake and these two kinds of rules are completely separated (generic build
> rules and one or several "install" rules to expose different APIs to
> various parts of the project).  We don't see how to write our build rules
> in a modular way and keep the automatic discovery of dependencies.
>
> The core of the problem, as I see it, is that ocamldep cannot return even
> an over-approximation of the actual dependencies of a given unit. It misses
> "implicit" dependencies related to either aliases in the type system or
> cross-module optimizations in cmx files (with flambda at least, the problem
> does not seem to exist at the implementation level for non-flambda mode).
>
>
> So if any other group has faced the same problem and found a nice solution
> (with omake or another build system), I'd love to hear about it!
>
>
> -- Alain
>

