From: Mark Shinwell
To: Gerd Stolpmann
Cc: Pierre Chambart, Markus Mottl, OCaml List
Date: Fri, 11 Mar 2016 08:59:34 +0000
Subject: Re: [Caml-list] Re: Status of Flambda in OCaml 4.03

Markus: I think we should at least consider whether package management
can help with multiple installations of libraries at different
optimisation levels.
As regards multiple files, I suppose it could be configurable, but I
worry that for large source trees the overhead of having that many
more compilation artifacts may be non-negligible.  Perhaps another
option would be to arrange the .cmx files so that they could be read
without importing the full information for optimisation unless
requested.

Gerd: Unless you have unlimited source files, there shouldn't be
unlimited code size blowup, because there are parameters that restrict
inlining.  In particular (unless the user forces behaviour via an
attribute) there is always a calculation that weighs up the change in
code size resulting from a proposed inlining against the expected
runtime performance benefit based on which operations will be
simplified away as a result of doing such inlining.

Also, for a function like the one you gave containing:

  if ... then ... else ...

one of the reasons this should be kept in the .cmx files is because,
when the compiler comes to examine whether to inline it, it may be
able to fully evaluate the condition.  In particular when it's true
then the large expression can be eliminated completely (so long as
the condition is not side-effecting).  Another example is functions
containing a large match, where we may end up knowing which case is to
be taken.

Mark

On 10 March 2016 at 22:51, Gerd Stolpmann wrote:
> On Thursday, 10.03.2016, at 21:12 +0100, Pierre Chambart wrote:
>> It is realistic when using the -Oclassic option that Mark mentioned.
>> By default the flambda inlining heuristic is decided at call site. Hence
>> all the information about a function needs to be available to correctly
>> decide. That means that the size of the cmx file is approximately
>> linearly related to the .o file size. It is not easy to decide that some
>> function will never be inlined, so the information is always kept,
>> even on functions annotated with [@inline never].
>
> This assumes that the user is fine with an unlimited code size blow-up.
> So, to make an example, when you have
>
> let f() =
>   if ... then ... else ...
>
> there is the chance that the "if-then" part can be inlined and leads to
> a speed-up at the price that the unproductive "else" part is also
> inlined. In total, there is a good chance that you see some
> acceleration. However, the question is whether the code duplication is
> acceptable or not. I guess, you need to also draw a line at the callee
> site, and disregard functions that are too large in total (though this
> limit can be way higher than the limit for "classic" inlining).
>
> Surely this will also limit the cmx size somewhat.
>
> Gerd
>
>> But I wouldn't
>> expect that to benefit that much. But for the -Oclassic mode where
>> the decision is made at the definition, it is possible to decide not
>> to include some information in the cmx. This is what happens in
>> non-flambda mode, and in flambda mode it also reduces the
>> cmx size a bit, but not as much as it could. This will probably improve
>> in 4.04 if there is sufficient interest in this -Oclassic mode.
>> --
>> Pierre
>>
>> On 10/03/2016 16:32, Markus Mottl wrote:
>> > Ok, that explains things.  Is it realistic to assume that the size of
>> > .cmx files can be substantially reduced?  It seems there is a natural
>> > tradeoff between "optimize well" and "compile fast".  I suspect it may
>> > be inevitable to add more compilation files.  We actually already have
>> > that situation with native code libraries: the .cmxa file is enough to
>> > compile a project, but if the .cmx files of contained modules are
>> > visible in the path, too, then, and only then, the compiler can and
>> > will do cross-module inlining - which takes longer, of course.
>> >
>> > What about the following approach? - There is one "minimal" set of
>> > compilation files that always allows you to quickly obtain a running
>> > (albeit slow / large) executable.
>> > Additional compilation files then
>> > monotonically augment this information and can be produced and
>> > consumed optionally depending on compilation flags.  The nice thing
>> > about this approach is that you don't necessarily have to recompile
>> > the whole project with different flags whenever you need a different
>> > compile time / performance tradeoff.  E.g. if Flambda information is
>> > available for an unchanged file, you don't have to rebuild it when
>> > needed.  If you just want to compile quickly, you don't have to read
>> > data you don't need.  Separate compilation files would also integrate
>> > much better with build tools (timestamping, etc.).
>> >
>> > I guess we would already be looking at OCaml version 5 for such a change :)
>> >
>> > On Thu, Mar 10, 2016 at 2:20 AM, Mark Shinwell wrote:
>> >> By "enabled at configure time" I mean that you need to pass the
>> >> "-flambda" option to the configure script when building the compiler.
>> >>
>> >> The main reason Flambda isn't enabled by default is because we need to
>> >> do further work to improve compile-time performance.  There are also
>> >> concerns about .cmx file size.  Flambda produces larger .cmx files: it
>> >> stores the entire intermediate representation of the compilation unit
>> >> so that no subsequent cross-module inlining decision is compromised.
>> >>
>> >> There is a mode, -Oclassic, which uses Flambda but mimics the
>> >> behaviour of the existing compiler; unfortunately this isn't really
>> >> fast enough yet either and .cmx sizes aren't small enough.
>> >>
>> >> When we manage to address some of these issues further, hopefully for
>> >> 4.04, we will revisit whether Flambda should be enabled by default.
>> >>
>> >> One of the main reasons there is a configure option rather than a
>> >> runtime switch is to avoid having to re-engineer the compiler's build
>> >> system to permit multiple builds of the various libraries (the stdlib,
>> >> for example) with differing options that affect what appears in the
>> >> .cmx files (e.g. with and without Flambda).  Even if code were used to
>> >> allow Flambda to read non-Flambda .cmx files, performance degradation
>> >> would result.
>> >>
>> >> Mark
>> >>
>> >> On 10 March 2016 at 01:43, Markus Mottl wrote:
>> >>> I agree with Yotam.  Assuming that Flambda produces correct code and
>> >>> doesn't cause any serious performance issues either with the generated
>> >>> code or with excessive compile times, I'd prefer building it into the
>> >>> compiler by default.  I'd be fine if I had to pass an extra flag at
>> >>> compile time to actually run Flambda optimizers, but it should at
>> >>> least be available.  It doesn't have to be perfect to be useful.
>> >>>
>> >>> On Wed, Mar 9, 2016 at 8:32 PM, Yotam Barnoy wrote:
>> >>>> While we await the manual, can you explain what you mean by 'enabled at
>> >>>> configure time'? Will a -flambda -O-something argument passed to the normal
>> >>>> 4.03 compiler enable flambda optimizations? Flambda is clearly the star of
>> >>>> the 4.03 release, so not enabling it using command line options seems
>> >>>> counter-intuitive (if this is the case).
>> >>>>
>> >>>> -Yotam
>> >>>>
>> >>>> On Wed, Mar 9, 2016 at 7:59 PM, Markus Mottl wrote:
>> >>>>> I've just tested Flambda, and it seems to already be doing a pretty
>> >>>>> decent job on some non-trivial examples (e.g. inlining combinations of
>> >>>>> functors and first class functions).  I hope there will be a stable
>> >>>>> 4.03 OPAM switch that enables it.  I'm looking forward to being able
>> >>>>> to write more elegant, abstract code that's still efficient.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Markus
>> >>>>>
>> >>>>> On Wed, Mar 9, 2016 at 2:14 AM, Mark Shinwell wrote:
>> >>>>>> It will not be enabled by default in 4.03.  For the majority of
>> >>>>>> programs, in the current state, it should improve performance (mainly
>> >>>>>> by lowering allocation).  It should never generate wrong code.
>> >>>>>> However we know of examples that don't improve as much as we would
>> >>>>>> like, which we will try to address for 4.04.
>> >>>>>>
>> >>>>>> There will be a draft version of the new Flambda manual chapter
>> >>>>>> available shortly (hopefully this week).  Amongst other things this
>> >>>>>> documents what you found about the configure options and the flags'
>> >>>>>> operation.
>> >>>>>>
>> >>>>>> Mark
>> >>>>>>
>> >>>>>> On 9 March 2016 at 03:55, Markus Mottl wrote:
>> >>>>>>> Hi Alain,
>> >>>>>>>
>> >>>>>>> I see, thanks.  It was a little confusing, because the command line
>> >>>>>>> options for tuning flambda were still available even without Flambda
>> >>>>>>> being enabled.
>> >>>>>>>
>> >>>>>>> Will Flambda be enabled by default in OCaml 4.03 or is it still
>> >>>>>>> considered to be too experimental?  It could turn out to become one of
>> >>>>>>> the most impactful new features in terms of how I write code.
>> >>>>>>>
>> >>>>>>> Regards,
>> >>>>>>> Markus
>> >>>>>>>
>> >>>>>>> On Tue, Mar 8, 2016 at 5:53 PM, Alain Frisch wrote:
>> >>>>>>>> Hi Markus,
>> >>>>>>>>
>> >>>>>>>> flambda needs to be enabled explicitly at configure time with the
>> >>>>>>>> "-flambda" flag.  The new optimizer will then be used
>> >>>>>>>> unconditionally, and you can tweak it using command-line parameters
>> >>>>>>>> passed to ocamlopt (see "ocamlopt -h").
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Alain
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On 08/03/2016 23:10, Markus Mottl wrote:
>> >>>>>>>>> Hi,
>> >>>>>>>>>
>> >>>>>>>>> I'm trying out OCaml 4.03.0+beta1 right now and wanted to test
>> >>>>>>>>> Flambda optimizations.  But looking at the generated assembly, it
>> >>>>>>>>> doesn't seem to be doing much if anything on the simple test
>> >>>>>>>>> examples that I thought would benefit.
>> >>>>>>>>>
>> >>>>>>>>> To give an example of what I expected to see, let's consider this
>> >>>>>>>>> code:
>> >>>>>>>>>
>> >>>>>>>>> -----
>> >>>>>>>>> let map_pair f (x, y) = f x, f y
>> >>>>>>>>>
>> >>>>>>>>> let succ x = x + 1
>> >>>>>>>>> let map_pair_succ1 pair = map_pair succ pair
>> >>>>>>>>> let map_pair_succ2 (x, y) = succ x, succ y
>> >>>>>>>>> -----
>> >>>>>>>>>
>> >>>>>>>>> I would have thought that the "succ" function would be inlined in
>> >>>>>>>>> "map_pair_succ1" as the compiler would do for "map_pair_succ2".
>> >>>>>>>>> But the generated code looks like this:
>> >>>>>>>>>
>> >>>>>>>>> -----
>> >>>>>>>>> L101:
>> >>>>>>>>>         movq %rax, %rdi
>> >>>>>>>>>         movq %rdi, 8(%rsp)
>> >>>>>>>>>         movq %rbx, (%rsp)
>> >>>>>>>>>         movq 8(%rbx), %rax
>> >>>>>>>>>         movq (%rdi), %rsi
>> >>>>>>>>>         movq %rdi, %rbx
>> >>>>>>>>>         call *%rsi
>> >>>>>>>>> L102:
>> >>>>>>>>>         movq %rax, 16(%rsp)
>> >>>>>>>>>         movq (%rsp), %rax
>> >>>>>>>>>         movq (%rax), %rax
>> >>>>>>>>>         movq 8(%rsp), %rbx
>> >>>>>>>>>         movq (%rbx), %rdi
>> >>>>>>>>>         call *%rdi
>> >>>>>>>>> -----
>> >>>>>>>>>
>> >>>>>>>>> Is Flambda supposed to work out of the box with the current beta?
>> >>>>>>>>> What flags or annotations should I use for testing?  Any showcase
>> >>>>>>>>> examples I should try out that are expected to be improved?
>> >>>>>>>>>
>> >>>>>>>>> Regards,
>> >>>>>>>>> Markus
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Caml-list mailing list.
>> >>>>>>> Subscription management and archives:
>> >>>>>>> https://sympa.inria.fr/sympa/arc/caml-list
>> >>>>>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> >>>>>>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com
>> >
>> >
>>
>
> --
> ------------------------------------------------------------
> Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
> My OCaml site:          http://www.camlcity.org
> Contact details:        http://www.camlcity.org/contact.html
> Company homepage:       http://www.gerd-stolpmann.de
> ------------------------------------------------------------
>
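
A minimal sketch of the if/else point discussed in this thread, for anyone who wants to experiment. The names (expensive, f, fast, general) are hypothetical and do not come from anyone's real code; [@@inline] is only a request to the compiler, and whether the large branch is actually dropped depends on the compiler build and flags in use, which this sketch does not check.

```ocaml
(* Hypothetical stand-in for the "large" else branch. *)
let expensive x =
  let acc = ref 0 in
  for i = 1 to x do
    acc := !acc + (i * i)
  done;
  !acc

(* Hypothetical function of the shape discussed: a cheap branch and a
   large one.  [@@inline] requests inlining at call sites. *)
let f cheap x =
  if cheap then x + 1 else expensive x
[@@inline]

(* The condition is a known constant here, so once [f] is inlined an
   optimiser may reduce the body to [x + 1] and drop the large branch. *)
let fast x = f true x

(* The condition is only known at run time, so both branches must stay. *)
let general cheap x = f cheap x

let () =
  Printf.printf "%d %d %d\n" (fast 3) (general true 3) (general false 3)
```

Comparing the assembly generated for fast and general (for instance with ocamlopt -S) is one way to see whether the call to expensive has disappeared from fast.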