From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jesper Louis Andersen
Date: Sun, 17 Apr 2016 10:43:15 +0200
References: <56DF57FA.9070309@lexifi.com>
To: Gabriel Scherer
Cc: Markus Mottl, Mark Shinwell, Yotam Barnoy, Alain Frisch, OCaml List
Subject: Re: [Caml-list] Status of Flambda in OCaml 4.03
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11411dd48492ab0530aa3e68

--001a11411dd48492ab0530aa3e68
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

I tried `opam switch ocaml-4.03.0+trunk+flambda` on the Transit format
encoder/decoder I have. I wanted to see how much faster flambda would
make the code, since I've heard 20% and 30% thrown around. It is not
very optimized code, and in particular, the encoder path is rather
elegant, but awfully slow.
Well, not anymore:

4.02:

Name           Time/Run      mWd/Run   mjWd/Run   Prom/Run   Percentage
------------ ---------- ------------ ---------- ---------- ------------
decode           2.12ms     352.86kw    34.86kw    34.86kw       27.88%
encode           5.07ms     647.93kw   263.69kw   250.40kw       66.70%
round_trip       7.61ms   1_000.79kw   298.54kw   285.26kw      100.00%

4.03.0+trunk+flambda:

│ Name       │ Time/Run │  mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │
│ decode     │   2.04ms │ 319.83kw │  35.94kw │  35.94kw │     43.97% │
│ encode     │   2.65ms │ 422.67kw │ 130.88kw │ 117.59kw │     56.95% │
│ round_trip │   4.65ms │ 742.50kw │ 164.85kw │ 151.56kw │    100.00% │

Pretty impressive result. Note that the heavy lifting is done by the
yajl JSON parser, which puts a lower bound on the run time. But I think
the speedup in the encode path is rather good.

Note that the benchmark is probably flawed and some time passed between
these two runs, so there might be a confounder hidden in other fixes,
either to yajl, or to other parts of the compiler toolchain. However, I
think the result itself stands since, in practice, my encoding time was
just cut in half.

On Thu, Mar 10, 2016 at 4:49 PM, Gabriel Scherer wrote:
> One point that is tangentially related to your message is that the
> flambda people observed that it's easy to miss cross-module
> optimizations because .cmx files are missing -- the compiler is silent
> about this. Leo White added a new warning (58) when a module does not
> find the .cmx of one of its dependencies, which interacts with -opaque
> (initially introduced in 4.02.0 when compiling implementation files)
> in the following way. In 4.03, you can compile an *interface* with
> -opaque, announcing the intent not to provide a .cmx file (or to
> choose among several implementations at link-time) for its
> implementation(s).
> Warning 58 will not warn about a missing .cmx if
> the dependency's interface was compiled opaque.
>   https://github.com/ocaml/ocaml/pull/319
>
> I think the long-term plan is to encourage people to enable the
> warning, and explicitly use -opaque on .cmi when it is their intent
> not to distribute .cmx files. That said, those things may be refined
> once we get more experience of flambda in the wild.
>
> On Thu, Mar 10, 2016 at 10:32 AM, Markus Mottl wrote:
> > Ok, that explains things. Is it realistic to assume that the size of
> > .cmx files can be substantially reduced? It seems there is a natural
> > tradeoff between "optimize well" and "compile fast". I suspect it may
> > be inevitable to add more compilation files. We actually already have
> > that situation with native code libraries: the .cmxa file is enough to
> > compile a project, but if the .cmx files of contained modules are
> > visible in the path, too, then, and only then, the compiler can and
> > will do cross-module inlining - which takes longer, of course.
> >
> > What about the following approach? - There is one "minimal" set of
> > compilation files that always allows you to quickly obtain a running
> > (albeit slow / large) executable. Additional compilation files then
> > monotonically augment this information and can be produced and
> > consumed optionally depending on compilation flags. The nice thing
> > about this approach is that you don't necessarily have to recompile
> > the whole project with different flags whenever you need a different
> > compile time / performance tradeoff. E.g. if Flambda information is
> > available for an unchanged file, you don't have to rebuild it when
> > needed. If you just want to compile quickly, you don't have to read
> > data you don't need. Separate compilation files would also integrate
> > much better with build tools (timestamping, etc.).
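
The layering described in the paragraph above could be modelled roughly
as follows. This is only a sketch of the idea: the types `artifact` and
`mode` and the function `artifacts_to_read` are hypothetical names made
up for illustration, and nothing like them is claimed to exist in the
compiler.

```ocaml
(* Hypothetical model of layered compilation artifacts; these types and
   names are illustrative and do not exist in the OCaml compiler. *)

type artifact =
  | Minimal        (* always produced: enough to build a running executable *)
  | Inlining_info  (* optional: extra data for cross-module optimization *)

type mode = Fast_compile | Optimize

(* Decide which artifacts of a dependency to read, given the ones that
   are available. Optional artifacts only ever add information, so a
   fast build can ignore them and an optimizing build falls back to the
   minimal set when they are missing. *)
let artifacts_to_read mode available =
  match mode with
  | Fast_compile -> [ Minimal ]
  | Optimize ->
    if List.mem Inlining_info available
    then [ Minimal; Inlining_info ]
    else [ Minimal ]

let () =
  let show = function
    | Minimal -> "minimal"
    | Inlining_info -> "inlining-info"
  in
  let report mode available =
    artifacts_to_read mode available
    |> List.map show
    |> String.concat ", "
    |> print_endline
  in
  report Fast_compile [ Minimal; Inlining_info ];
  report Optimize [ Minimal; Inlining_info ];
  report Optimize [ Minimal ]
```

The point of the sketch is that switching between the two modes never
requires rebuilding the dependency itself, only reading more or less of
what is already there.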
> >
> > I guess we would already be looking at OCaml version 5 for such a
> > change :)
> >
> > On Thu, Mar 10, 2016 at 2:20 AM, Mark Shinwell wrote:
> >> By "enabled at configure time" I mean that you need to pass the
> >> "-flambda" option to the configure script when building the compiler.
> >>
> >> The main reason Flambda isn't enabled by default is because we need to
> >> do further work to improve compile-time performance. There are also
> >> concerns about .cmx file size. Flambda produces larger .cmx files: it
> >> stores the entire intermediate representation of the compilation unit
> >> so that no subsequent cross-module inlining decision is compromised.
> >>
> >> There is a mode, -Oclassic, which uses Flambda but mimics the
> >> behaviour of the existing compiler; unfortunately this isn't really
> >> fast enough yet either and .cmx sizes aren't small enough.
> >>
> >> When we manage to address some of these issues further, hopefully for
> >> 4.04, we will revisit whether Flambda should be enabled by default.
> >>
> >> One of the main reasons there is a configure option rather than a
> >> runtime switch is to avoid having to re-engineer the compiler's build
> >> system to permit multiple builds of the various libraries (the stdlib,
> >> for example) with differing options that affect what appears in the
> >> .cmx files (e.g. with and without Flambda). Even if code were used to
> >> allow Flambda to read non-Flambda .cmx files, performance degradation
> >> would result.
> >>
> >> Mark
> >>
> >> On 10 March 2016 at 01:43, Markus Mottl wrote:
> >>> I agree with Yotam. Assuming that Flambda produces correct code and
> >>> doesn't cause any serious performance issues either with the generated
> >>> code or with excessive compile times, I'd prefer building it into the
> >>> compiler by default. I'd be fine if I had to pass an extra flag at
> >>> compile time to actually run Flambda optimizers, but it should at
> >>> least be available.
> >>> It doesn't have to be perfect to be useful.
> >>>
> >>> On Wed, Mar 9, 2016 at 8:32 PM, Yotam Barnoy wrote:
> >>>> While we await the manual, can you explain what you mean by
> >>>> 'enabled at configure time'? Will a -flambda -O-something argument
> >>>> passed to the normal 4.03 compiler enable flambda optimizations?
> >>>> Flambda is clearly the star of the 4.03 release, so not enabling it
> >>>> using command line options seems counter-intuitive (if this is the
> >>>> case).
> >>>>
> >>>> -Yotam
> >>>>
> >>>> On Wed, Mar 9, 2016 at 7:59 PM, Markus Mottl wrote:
> >>>>>
> >>>>> I've just tested Flambda, and it seems to already be doing a pretty
> >>>>> decent job on some non-trivial examples (e.g. inlining combinations
> >>>>> of functors and first class functions). I hope there will be a
> >>>>> stable 4.03 OPAM switch that enables it. I'm looking forward to
> >>>>> being able to write more elegant, abstract code that's still
> >>>>> efficient.
> >>>>>
> >>>>> Regards,
> >>>>> Markus
> >>>>>
> >>>>> On Wed, Mar 9, 2016 at 2:14 AM, Mark Shinwell <mshinwell@janestreet.com> wrote:
> >>>>> > It will not be enabled by default in 4.03. For the majority of
> >>>>> > programs, in the current state, it should improve performance
> >>>>> > (mainly by lowering allocation). It should never generate wrong
> >>>>> > code. However we know of examples that don't improve as much as
> >>>>> > we would like, which we will try to address for 4.04.
> >>>>> >
> >>>>> > There will be a draft version of the new Flambda manual chapter
> >>>>> > available shortly (hopefully this week). Amongst other things this
> >>>>> > documents what you found about the configure options and the
> >>>>> > flags' operation.
> >>>>> >
> >>>>> > Mark
> >>>>> >
> >>>>> > On 9 March 2016 at 03:55, Markus Mottl wrote:
> >>>>> >> Hi Alain,
> >>>>> >>
> >>>>> >> I see, thanks.
> >>>>> >> It was a little confusing, because the command line
> >>>>> >> options for tuning flambda were still available even without
> >>>>> >> Flambda being enabled.
> >>>>> >>
> >>>>> >> Will Flambda be enabled by default in OCaml 4.03 or is it still
> >>>>> >> considered to be too experimental? It could turn out to become
> >>>>> >> one of the most impactful new features in terms of how I write
> >>>>> >> code.
> >>>>> >>
> >>>>> >> Regards,
> >>>>> >> Markus
> >>>>> >>
> >>>>> >> On Tue, Mar 8, 2016 at 5:53 PM, Alain Frisch <alain.frisch@lexifi.com> wrote:
> >>>>> >>> Hi Markus,
> >>>>> >>>
> >>>>> >>> flambda needs to be enabled explicitly at configure time with
> >>>>> >>> the "-flambda" flag. The new optimizer will then be used
> >>>>> >>> unconditionally, and you can tweak it using command-line
> >>>>> >>> parameters passed to ocamlopt (see "ocamlopt -h").
> >>>>> >>>
> >>>>> >>> Alain
> >>>>> >>>
> >>>>> >>> On 08/03/2016 23:10, Markus Mottl wrote:
> >>>>> >>>>
> >>>>> >>>> Hi,
> >>>>> >>>>
> >>>>> >>>> I'm trying out OCaml 4.03.0+beta1 right now and wanted to test
> >>>>> >>>> Flambda optimizations. But looking at the generated assembly,
> >>>>> >>>> it doesn't seem to be doing much if anything on the simple test
> >>>>> >>>> examples that I thought would benefit.
> >>>>> >>>>
> >>>>> >>>> To give an example of what I expected to see, let's consider
> >>>>> >>>> this code:
> >>>>> >>>>
> >>>>> >>>> -----
> >>>>> >>>> let map_pair f (x, y) = f x, f y
> >>>>> >>>>
> >>>>> >>>> let succ x = x + 1
> >>>>> >>>> let map_pair_succ1 pair = map_pair succ pair
> >>>>> >>>> let map_pair_succ2 (x, y) = succ x, succ y
> >>>>> >>>> -----
> >>>>> >>>>
> >>>>> >>>> I would have thought that the "succ" function would be inlined
> >>>>> >>>> in "map_pair_succ1" as the compiler would do for
> >>>>> >>>> "map_pair_succ2".
> >>>>> >>>> But the generated code looks like this:
> >>>>> >>>>
> >>>>> >>>> -----
> >>>>> >>>> L101:
> >>>>> >>>>     movq  %rax, %rdi
> >>>>> >>>>     movq  %rdi, 8(%rsp)
> >>>>> >>>>     movq  %rbx, (%rsp)
> >>>>> >>>>     movq  8(%rbx), %rax
> >>>>> >>>>     movq  (%rdi), %rsi
> >>>>> >>>>     movq  %rdi, %rbx
> >>>>> >>>>     call  *%rsi
> >>>>> >>>> L102:
> >>>>> >>>>     movq  %rax, 16(%rsp)
> >>>>> >>>>     movq  (%rsp), %rax
> >>>>> >>>>     movq  (%rax), %rax
> >>>>> >>>>     movq  8(%rsp), %rbx
> >>>>> >>>>     movq  (%rbx), %rdi
> >>>>> >>>>     call  *%rdi
> >>>>> >>>> -----
> >>>>> >>>>
> >>>>> >>>> Is Flambda supposed to work out of the box with the current
> >>>>> >>>> beta? What flags or annotations should I use for testing? Any
> >>>>> >>>> showcase examples I should try out that are expected to be
> >>>>> >>>> improved?
> >>>>> >>>>
> >>>>> >>>> Regards,
> >>>>> >>>> Markus
> >>>>> >>
> >>>>> >> --
> >>>>> >> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
> >>>>> >>
> >>>>> >> --
> >>>>> >> Caml-list mailing list.  Subscription management and archives:
> >>>>> >> https://sympa.inria.fr/sympa/arc/caml-list
> >>>>> >> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> >>>>> >> Bug reports: http://caml.inria.fr/bin/caml-bugs

--
J.

--001a11411dd48492ab0530aa3e68
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a11411dd48492ab0530aa3e68--