From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id C49F47F026 for ; Fri, 11 Mar 2016 10:05:48 +0100 (CET) IronPort-PHdr: 9a23:CGyqjxWzZp/mphldc7qRtvjzpL7V8LGtZVwlr6E/grcLSJyIuqrYZxSBt8tkgFKBZ4jH8fUM07OQ6PC/HzxQqsze4TgrS99laVwssY0uhQsuAcqIWwXQDcXBSGgEJvlET0Jv5HqhMEJYS47UblzWpWCuv3ZJQk2sfTR8Kum9IIPOlcP/j7n0oM2MJVUYz2DiMPtbF1afk0b4joEum4xsK6I8mFPig0BjXKBo/15uPk+ZhB3m5829r9ZJ+iVUvO89pYYbCf2pN4xxd7FTDSwnPmYp/4Wr8ECbFUrc0EABSX0bmQZkBA3M7ReyHsug83iyiu0o9ySAMYXNUbcwQTGr6aEjHB7uhiAvODMj/CTMlst0lKdSphTnqxEpkKDOZ4TAEfNkfevmfdIcWmdFWo4FUjdBA4WjYo8LJ+gIO+tDs5PwqkdIphy7U1r/TNjzwyNF0yellZYx1P4sRESbhQE= Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=mshinwell@janestreet.com; spf=Pass smtp.mailfrom=mshinwell@janestreet.com; spf=None smtp.helo=postmaster@mxout1.mail.janestreet.com Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of mshinwell@janestreet.com) identity=pra; client-ip=38.105.200.112; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="mshinwell@janestreet.com"; x-sender="mshinwell@janestreet.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of mshinwell@janestreet.com designates 38.105.200.112 as permitted sender) identity=mailfrom; client-ip=38.105.200.112; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="mshinwell@janestreet.com"; x-sender="mshinwell@janestreet.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mxout1.mail.janestreet.com) identity=helo; client-ip=38.105.200.112; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="mshinwell@janestreet.com"; x-sender="postmaster@mxout1.mail.janestreet.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0A8AADMieJWjnDIaSZehBBtBqloiQCFKoITAQ2BbSGFbgKBLgc4FAEBAQEBAQEBEAEBAQEHFglQgi2CFQEBBBIRHQEBExkLAQ8LCwMKAgIJHQICIQESAQUBChIGEwkJEIdtAxIDC6EBgTE+MYpPZ4RBAQSGDAMKhD8BAQEBAQEBAQEBAQEBAQEBAQEBAQEPBgpyhRyDRH6CPYFMDDCCdYE6hiUMgTOPZIVthhoDgXKBZEuHI4UwhyCGExEegQ8eAQGCOB6BUGoBiRaBOgEBAQ X-IPAS-Result: A0A8AADMieJWjnDIaSZehBBtBqloiQCFKoITAQ2BbSGFbgKBLgc4FAEBAQEBAQEBEAEBAQEHFglQgi2CFQEBBBIRHQEBExkLAQ8LCwMKAgIJHQICIQESAQUBChIGEwkJEIdtAxIDC6EBgTE+MYpPZ4RBAQSGDAMKhD8BAQEBAQEBAQEBAQEBAQEBAQEBAQEPBgpyhRyDRH6CPYFMDDCCdYE6hiUMgTOPZIVthhoDgXKBZEuHI4UwhyCGExEegQ8eAQGCOB6BUGoBiRaBOgEBAQ X-IronPort-AV: E=Sophos;i="5.24,320,1454972400"; d="scan'208";a="168139522" Received: from mxout1.mail.janestreet.com ([38.105.200.112]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Mar 2016 10:05:46 +0100 Received: from tot-qpr-mailcore1.delacy.com ([172.27.56.68] helo=tot-qpr-mailcore1) by mxout1.mail.janestreet.com with esmtps (TLSv1:DHE-RSA-AES256-SHA:256) (Exim 4.82) (envelope-from ) id 1aeJ0z-00086m-RV for caml-list@inria.fr; Fri, 11 Mar 2016 04:05:45 -0500 X-JS-Flow: external Received: by tot-qpr-mailcore1 with JS-mailcore (0.1) (envelope-from ) id BW4opm-AAABz2-Lq; 2016-03-11 04:05:42.374277-05:00 Received: from mail-oi0-f51.google.com ([209.85.218.51]) by mxgoog1.mail.janestreet.com with esmtps (UNKNOWN:AES128-GCM-SHA256:128) (Exim 4.72) (envelope-from ) id 1aeJ0w-0005he-7C for caml-list@inria.fr; Fri, 11 Mar 2016 04:05:42 -0500 Received: by mail-oi0-f51.google.com with SMTP id m82so81060588oif.1 for ; Fri, 11 Mar 2016 01:05:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=google; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc; bh=oyrjfuq5/T8dfMCJu3PBcqpTSHwhOTzosiAbfKwehlk=; b=PiPvl5vwWq/ZmE3Cc2zcZRSDBFA2BMyuAMYBItqjGDw6HqFJGuhzYmjhYqVKQomLlE YmH9kRpO/ABR60KPa7O2FvpJ8tbQDyxJxNlqTxRaOA3Q8OwDatR+3uTfcIDyJz4p3/+N DC2ugG2LBf1PY/htbjuRjnYEu63K72QzFZlkk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc; bh=oyrjfuq5/T8dfMCJu3PBcqpTSHwhOTzosiAbfKwehlk=; b=E/YqAn3qBi8wYpmP3ht91Ys04P1+HYFDV5s3y/en/B75OkRivSVgJP3dPfO8TY3fY+ eui0fs9f3z/ldLyt7BEG2JI2Mne2aljWRifvZVxPJxGCk9zhjHcnWjMHH/4DIZgVTUs+ b7G4x3uRkbWTG5gf0CA8u0geU19EzoDC5to/Qwn0l1A7m/1DxdQVHRfjU60jPa7XY7d6 w+oIcCWfAalGeERNoH4gH3RYm74MVy4hRc3BNI6htNC51qcsXj3BdUhKHYwpb+j0u3QD r6Md8lSAZ3NmSb/s5HsvHMApNF4CzHbD9eJwoZmR94oqNOl+PiNJDBaT3rqWQ0JHByk3 pRrQ== X-Gm-Message-State: AD7BkJJEWzc5+jIu6mw4rkVf5IatxLwic6jqK71lbNV/qlcx2fThODgY2zl0s/aOHf8BKLy9EZzQTktujF1rTyttC6Xq2bxIPWIzjmLNzwFTY7EJVla1+aE3eCn9NUo3GHC0r6ZPN0FT3orknLvP X-Received: by 10.202.52.70 with SMTP id b67mr4803746oia.125.1457687141831; Fri, 11 Mar 2016 01:05:41 -0800 (PST) MIME-Version: 1.0 X-Received: by 10.202.52.70 with SMTP id b67mr4803734oia.125.1457687141636; Fri, 11 Mar 2016 01:05:41 -0800 (PST) Received: by 10.202.216.214 with HTTP; Fri, 11 Mar 2016 01:05:41 -0800 (PST) In-Reply-To: References: <56DF57FA.9070309@lexifi.com> <56E1D523.7000701@laposte.net> <1457650315.13223.42.camel@e130.lan.sumadev.de> Date: Fri, 11 Mar 2016 09:05:41 +0000 Message-ID: From:Mark Shinwell To:Gerd Stolpmann Cc:Pierre Chambart , Markus Mottl , OCaml List Content-Type: text/plain; charset=UTF-8 X-JS-Processed-by: mailcore X-Validation-by: mshinwell@janestreet.com Subject: Re: [Caml-list] Re: Status of Flambda in OCaml 4.03 (Sorry, I meant: "the large expression and can be eliminated") On 11 March 2016 at 08:59, Mark Shinwell wrote: > Markus: I think we should at least consider whether package management > can help with multiple installations of libraries at different > optimisation levels. As regards multiple files, I suppose it could be > configurable, but I worry that for large source trees the overhead of > having that many more compilation artifacts may be non-negligible. > Perhaps another option would be to arrange the .cmx files so that they > could be read without importing the full information for optimisation > unless requested. > > Gerd: Unless you have unlimited source files, there shouldn't be > unlimited code size blowup, because there are parameters that restrict > inlining. In particular (unless the user forces behaviour via an > attribute) there is always a calculation that weighs up the change in > code size resulting from a proposed inlining against the expected > runtime performance benefit based on which operations will be > simplified away as a result of doing such inlining. > > Also, for a function like the one you gave containing: > > if then else > > one of the reasons this should be kept in the .cmx files is because, > when the compiler comes to examine whether to inline it, it may be > able to fully evaluate . In particular when it's true then the > large expression can be eliminated completely (so long as is > not side-effecting). Another example is functions containing a large > match, where we may end up knowing which case is to be taken. > > Mark > > On 10 March 2016 at 22:51, Gerd Stolpmann wrote: >> Am Donnerstag, den 10.03.2016, 21:12 +0100 schrieb Pierre Chambart: >>> It is realistic when using the -Oclassic option that Mark mentioned. >>> By default the flambda inlining heuristic is decided at call site. Hence >>> all the information about a function needs to be available to correctly >>> decide. That means that the size of the cmx file is approximatively >>> linearly related to the .o file size. It is not easy to decide that some >>> function will never be inlined, so the information is always kept, >>> even on function annotated with [@inline never]. >> >> This assumes that the user is fine with an unlimited code size blow-up. >> So, to make an example, when you have >> >> let f() = >> if then else >> >> there is the chance that the "if-then" part can be inlined and leads to >> a speed-up at the price that the unproductive "else" part is also >> inlined. In total, there is a good chance that you see some >> acceleration. However, the question is whether the code duplication is >> acceptable or not. I guess, you need to also draw a line at the callee >> site, and disregard functions that are too large in total (though this >> limit can be way higher than the limit for "classic" inlining). >> >> Surely this will also limit the cmx size somewhat. >> >> Gerd >> >>> But I wouldn't >>> expect that to benefit that much. But for the -Oclassic mode where >>> the decision is made at the definition, it is possible to decide not >>> to include some information in the cmx. This is what happens in >>> non-flambda mode, and in flambda mode it also reduce a bit the >>> cmx size, but not as much as it could. This will probably improve >>> in 4.04 if there is sufficient interest in this -Oclassic mode. >>> -- >>> Pierre >>> >>> On 10/03/2016 16:32, Markus Mottl wrote: >>> > Ok, that explains things. Is it realistic to assume that the size of >>> > .cmx files can be substantially reduced? It seems there is a natural >>> > tradeoff between "optimize well" and "compile fast". I suspect it may >>> > be inevitable to add more compilation files. We actually already have >>> > that situation with native code libraries: the .cmxa file is enough to >>> > compile a project, but if the .cmx files of contained modules are >>> > visible in the path, too, then, and only then, the compiler can and >>> > will do cross-module inlining - which takes longer, of course. >>> > >>> > What about the following approach? - There is one "minimal" set of >>> > compilation files that always allows you to quickly obtain a running >>> > (albeit slow / large) executable. Additional compilation files then >>> > monotonically augment this information and can be produced and >>> > consumed optionally depending on compilation flags. The nice thing >>> > about this approach is that you don't necessarily have to recompile >>> > the whole project with different flags whenever you need a different >>> > compile time / performance tradeoff. E.g. if Flambda information is >>> > available for an unchanged file, you don't have to rebuild it when >>> > needed. If you just want to compile quickly, you don't have to read >>> > data you don't need. Separate compilation files would also integrate >>> > much better with build tools (timestamping, etc.). >>> > >>> > I guess we would already be looking at OCaml version 5 for such a change :) >>> > >>> > On Thu, Mar 10, 2016 at 2:20 AM, Mark Shinwell wrote: >>> >> By "enabled at configure time" I mean that you need to pass the >>> >> "-flambda" option to the configure script when building the compiler. >>> >> >>> >> The main reason Flambda isn't enabled by default is because we need to >>> >> do further work to improve compile-time performance. There are also >>> >> concerns about .cmx file size. Flambda produces larger .cmx files: it >>> >> stores the entire intermediate representation of the compilation unit >>> >> so that no subsequent cross-module inlining decision is compromised. >>> >> >>> >> There is a mode, -Oclassic, which uses Flambda but mimics the >>> >> behaviour of the existing compiler; unfortunately this isn't really >>> >> fast enough yet either and .cmx sizes aren't small enough. >>> >> >>> >> When we manage to address some of these issues further, hopefully for >>> >> 4.04, we will revisit whether Flambda should be enabled by default. >>> >> >>> >> One of the main reasons there is a configure option rather than a >>> >> runtime switch is to avoid having to re-engineer the compiler's build >>> >> system to permit multiple builds of the various libraries (the stdlib, >>> >> for example) with differing options that affect what appears in the >>> >> .cmx files (e.g. with and without Flambda). Even if code were used to >>> >> allow Flambda to read non-Flambda .cmx files, performance degradation >>> >> would result. >>> >> >>> >> Mark >>> >> >>> >> On 10 March 2016 at 01:43, Markus Mottl wrote: >>> >>> I agree with Yotam. Assuming that Flambda produces correct code and >>> >>> doesn't cause any serious performance issues either with the generated >>> >>> code or with excessive compile times, I'd prefer building it into the >>> >>> compiler by default. I'd be fine if I had to pass an extra flag at >>> >>> compile time to actually run Flambda optimizers, but it should at >>> >>> least be available. It doesn't have to be perfect to be useful. >>> >>> >>> >>> On Wed, Mar 9, 2016 at 8:32 PM, Yotam Barnoy wrote: >>> >>>> While we await the manual, can you explain what you mean by 'enabled at >>> >>>> configure time'? Will a -flambda -O-something argument passed to the normal >>> >>>> 4.03 compiler enable flambda optimizations? Flambda is clearly the star of >>> >>>> the 4.03 release, so not enabling it using command line options seems >>> >>>> counter-intuitive (if this is the case). >>> >>>> >>> >>>> -Yotam >>> >>>> >>> >>>> On Wed, Mar 9, 2016 at 7:59 PM, Markus Mottl wrote: >>> >>>>> I've just tested Flambda, and it seems to already be doing a pretty >>> >>>>> decent job on some non-trivial examples (e.g. inlining combinations of >>> >>>>> functors and first class functions). I hope there will be a stable >>> >>>>> 4.03 OPAM switch that enables it. I'm looking forward to being able >>> >>>>> to write more elegant, abstract code that's still efficient. >>> >>>>> >>> >>>>> Regards, >>> >>>>> Markus >>> >>>>> >>> >>>>> On Wed, Mar 9, 2016 at 2:14 AM, Mark Shinwell >>> >>>>> wrote: >>> >>>>>> It will not be enabled by default in 4.03. For the majority of >>> >>>>>> programs, in the current state, it should improve performance (mainly >>> >>>>>> by lowering allocation). It should never generate wrong code. >>> >>>>>> However we know of examples that don't improve as much as we would >>> >>>>>> like, which we will try to address for 4.04. >>> >>>>>> >>> >>>>>> There will be a draft version of the new Flambda manual chapter >>> >>>>>> available shortly (hopefully this week). Amongst other things this >>> >>>>>> documents what you found about the configure options and the flags' >>> >>>>>> operation. >>> >>>>>> >>> >>>>>> Mark >>> >>>>>> >>> >>>>>> On 9 March 2016 at 03:55, Markus Mottl wrote: >>> >>>>>>> Hi Alain, >>> >>>>>>> >>> >>>>>>> I see, thanks. It was a little confusing, because the command line >>> >>>>>>> options for tuning flambda were still available even without Flambda >>> >>>>>>> being enabled. >>> >>>>>>> >>> >>>>>>> Will Flambda be enabled by default in OCaml 4.03 or is it still >>> >>>>>>> considered to be too experimental? It could turn out to become one of >>> >>>>>>> the most impactful new features in terms of how I write code. >>> >>>>>>> >>> >>>>>>> Regards, >>> >>>>>>> Markus >>> >>>>>>> >>> >>>>>>> On Tue, Mar 8, 2016 at 5:53 PM, Alain Frisch >>> >>>>>>> wrote: >>> >>>>>>>> Hi Markus, >>> >>>>>>>> >>> >>>>>>>> flambda needs to be enabled explicitly at configure time with the >>> >>>>>>>> "-flambda" >>> >>>>>>>> flag. The new optimizer will then be used unconditionally, and you >>> >>>>>>>> can >>> >>>>>>>> tweak it using command-line parameters passed to ocamlopt (see >>> >>>>>>>> "ocamlopt >>> >>>>>>>> -h"). >>> >>>>>>>> >>> >>>>>>>> >>> >>>>>>>> Alain >>> >>>>>>>> >>> >>>>>>>> >>> >>>>>>>> On 08/03/2016 23:10, Markus Mottl wrote: >>> >>>>>>>>> Hi, >>> >>>>>>>>> >>> >>>>>>>>> I'm trying out OCaml 4.03.0+beta1 right now and wanted to test >>> >>>>>>>>> Flambda >>> >>>>>>>>> optimizations. But looking at the generated assembly, it doesn't >>> >>>>>>>>> seem >>> >>>>>>>>> to be doing much if anything on the simple test examples that I >>> >>>>>>>>> thought would benefit. >>> >>>>>>>>> >>> >>>>>>>>> To give an example of what I expected to see, lets consider this >>> >>>>>>>>> code: >>> >>>>>>>>> >>> >>>>>>>>> ----- >>> >>>>>>>>> let map_pair f (x, y) = f x, f y >>> >>>>>>>>> >>> >>>>>>>>> let succ x = x + 1 >>> >>>>>>>>> let map_pair_succ1 pair = map_pair succ pair >>> >>>>>>>>> let map_pair_succ2 (x, y) = succ x, succ y >>> >>>>>>>>> ----- >>> >>>>>>>>> >>> >>>>>>>>> I would have thought that the "succ" function would be inlined in >>> >>>>>>>>> "map_pair_succ1" as the compiler would do for "map_pair_succ2". >>> >>>>>>>>> But the generated code looks like this: >>> >>>>>>>>> >>> >>>>>>>>> ----- >>> >>>>>>>>> L101: >>> >>>>>>>>> movq %rax, %rdi >>> >>>>>>>>> movq %rdi, 8(%rsp) >>> >>>>>>>>> movq %rbx, (%rsp) >>> >>>>>>>>> movq 8(%rbx), %rax >>> >>>>>>>>> movq (%rdi), %rsi >>> >>>>>>>>> movq %rdi, %rbx >>> >>>>>>>>> call *%rsi >>> >>>>>>>>> L102: >>> >>>>>>>>> movq %rax, 16(%rsp) >>> >>>>>>>>> movq (%rsp), %rax >>> >>>>>>>>> movq (%rax), %rax >>> >>>>>>>>> movq 8(%rsp), %rbx >>> >>>>>>>>> movq (%rbx), %rdi >>> >>>>>>>>> call *%rdi >>> >>>>>>>>> ----- >>> >>>>>>>>> >>> >>>>>>>>> Is Flambda supposed to work out of the box with the current beta? >>> >>>>>>>>> What flags or annotations should I use for testing? Any showcase >>> >>>>>>>>> examples I should try out that are expected to be improved? >>> >>>>>>>>> >>> >>>>>>>>> Regards, >>> >>>>>>>>> Markus >>> >>>>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> -- >>> >>>>>>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com >>> >>>>>>> >>> >>>>>>> -- >>> >>>>>>> Caml-list mailing list. Subscription management and archives: >>> >>>>>>> https://sympa.inria.fr/sympa/arc/caml-list >>> >>>>>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >>> >>>>>>> Bug reports: http://caml.inria.fr/bin/caml-bugs >>> >>>>> >>> >>>>> >>> >>>>> -- >>> >>>>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com >>> >>>>> >>> >>>>> -- >>> >>>>> Caml-list mailing list. Subscription management and archives: >>> >>>>> https://sympa.inria.fr/sympa/arc/caml-list >>> >>>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >>> >>>>> Bug reports: http://caml.inria.fr/bin/caml-bugs >>> >>>> >>> >>> >>> >>> >>> >>> -- >>> >>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com >>> > >>> > >>> >>> >> >> -- >> ------------------------------------------------------------ >> Gerd Stolpmann, Darmstadt, Germany gerd@gerd-stolpmann.de >> My OCaml site: http://www.camlcity.org >> Contact details: http://www.camlcity.org/contact.html >> Company homepage: http://www.gerd-stolpmann.de >> ------------------------------------------------------------ >>