caml-list - the Caml user's mailing list
* [Caml-list] Status of Flambda in OCaml 4.03
@ 2016-03-08 22:10 Markus Mottl
  2016-03-08 22:53 ` Alain Frisch
  0 siblings, 1 reply; 23+ messages in thread
From: Markus Mottl @ 2016-03-08 22:10 UTC
  To: OCaml List

Hi,

I'm trying out OCaml 4.03.0+beta1 right now and wanted to test Flambda
optimizations.  But looking at the generated assembly, it doesn't seem
to be doing much if anything on the simple test examples that I
thought would benefit.

To give an example of what I expected to see, let's consider this code:

-----
let map_pair f (x, y) = f x, f y

let succ x = x + 1
let map_pair_succ1 pair = map_pair succ pair
let map_pair_succ2 (x, y) = succ x, succ y
-----

I would have thought that the "succ" function would be inlined in
"map_pair_succ1" as the compiler would do for "map_pair_succ2".
But the generated code looks like this:

-----
L101:
  movq  %rax, %rdi
  movq  %rdi, 8(%rsp)
  movq  %rbx, (%rsp)
  movq  8(%rbx), %rax
  movq  (%rdi), %rsi
  movq  %rdi, %rbx
  call  *%rsi
L102:
  movq  %rax, 16(%rsp)
  movq  (%rsp), %rax
  movq  (%rax), %rax
  movq  8(%rsp), %rbx
  movq  (%rbx), %rdi
  call  *%rdi
-----
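
For comparison, here is a sketch of the result one might hope for once
both "map_pair" and "succ" are inlined into "map_pair_succ1".  The name
"map_pair_succ1_expected" is hypothetical and written by hand for
illustration; it is not compiler output.

```ocaml
(* Original definitions from the example above. *)
let map_pair f (x, y) = (f x, f y)

let succ x = x + 1
let map_pair_succ1 pair = map_pair succ pair

(* Hand-inlined equivalent (hypothetical name): no closure is passed
   around and no indirect call is needed. *)
let map_pair_succ1_expected (x, y) = (x + 1, y + 1)

(* Both versions must agree on every input; spot-check one pair. *)
let () =
  assert (map_pair_succ1 (1, 2) = map_pair_succ1_expected (1, 2))
```

Compiling with "ocamlopt -S" and comparing the assembly for the two
functions is one way to check whether the indirect calls are gone.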

Is Flambda supposed to work out of the box with the current beta?
What flags or annotations should I use for testing?  Any showcase
examples I should try out that are expected to be improved?

Regards,
Markus

-- 
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com


* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-08 22:10 [Caml-list] Status of Flambda in OCaml 4.03 Markus Mottl
@ 2016-03-08 22:53 ` Alain Frisch
  2016-03-09  3:55   ` Markus Mottl
  0 siblings, 1 reply; 23+ messages in thread
From: Alain Frisch @ 2016-03-08 22:53 UTC
  To: Markus Mottl, OCaml List

Hi Markus,

Flambda needs to be enabled explicitly at configure time, by passing the
"-flambda" flag to the configure script when building the compiler.  The
new optimizer will then be used unconditionally, and you can tweak it
using command-line parameters passed to ocamlopt (see "ocamlopt -h").
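
On the annotation side, OCaml 4.03 also accepts "[@@inline]" on
definitions as an inlining hint.  A minimal sketch applied to the example
from the start of the thread follows.  Where the attributes are placed is
only one plausible choice, and whether the indirect calls actually
disappear still depends on how the compiler was configured.

```ocaml
(* The attribute is a hint to the optimizer; it does not change what the
   program computes. *)
let map_pair f (x, y) = (f x, f y) [@@inline]

let succ x = x + 1 [@@inline]

(* With map_pair inlined here, the optimizer can see that f is succ. *)
let map_pair_succ1 pair = map_pair succ pair

let () =
  let a, b = map_pair_succ1 (1, 2) in
  Printf.printf "%d %d\n" a b
```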


Alain


* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-08 22:53 ` Alain Frisch
@ 2016-03-09  3:55   ` Markus Mottl
  2016-03-09  7:14     ` Mark Shinwell
  0 siblings, 1 reply; 23+ messages in thread
From: Markus Mottl @ 2016-03-09  3:55 UTC
  To: Alain Frisch; +Cc: OCaml List

Hi Alain,

I see, thanks.  It was a little confusing, because the command-line
options for tuning Flambda were still available even without Flambda
being enabled.

Will Flambda be enabled by default in OCaml 4.03 or is it still
considered to be too experimental?  It could turn out to become one of
the most impactful new features in terms of how I write code.

Regards,
Markus


* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-09  3:55   ` Markus Mottl
@ 2016-03-09  7:14     ` Mark Shinwell
  2016-03-10  0:59       ` Markus Mottl
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Shinwell @ 2016-03-09  7:14 UTC
  To: Markus Mottl; +Cc: Alain Frisch, OCaml List

It will not be enabled by default in 4.03.  For the majority of
programs, in the current state, it should improve performance (mainly
by lowering allocation).  It should never generate wrong code.
However we know of examples that don't improve as much as we would
like, which we will try to address for 4.04.

There will be a draft version of the new Flambda manual chapter
available shortly (hopefully this week).  Amongst other things this
documents what you found about the configure options and the flags'
operation.

Mark


* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-09  7:14     ` Mark Shinwell
@ 2016-03-10  0:59       ` Markus Mottl
  2016-03-10  1:32         ` Yotam Barnoy
  0 siblings, 1 reply; 23+ messages in thread
From: Markus Mottl @ 2016-03-10  0:59 UTC
  To: Mark Shinwell; +Cc: Alain Frisch, OCaml List

I've just tested Flambda, and it seems to already be doing a pretty
decent job on some non-trivial examples (e.g. inlining combinations of
functors and first class functions).  I hope there will be a stable
4.03 OPAM switch that enables it.  I'm looking forward to being able
to write more elegant, abstract code that's still efficient.

Regards,
Markus


* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-10  0:59       ` Markus Mottl
@ 2016-03-10  1:32         ` Yotam Barnoy
  2016-03-10  1:43           ` Markus Mottl
  0 siblings, 1 reply; 23+ messages in thread
From: Yotam Barnoy @ 2016-03-10  1:32 UTC
  To: Markus Mottl; +Cc: Mark Shinwell, Alain Frisch, OCaml List


While we await the manual, can you explain what you mean by 'enabled at
configure time'? Will a -flambda -O-something argument passed to the normal
4.03 compiler enable Flambda optimizations? Flambda is clearly the star of
the 4.03 release, so not being able to enable it with command-line options
seems counter-intuitive (if this is the case).

-Yotam


* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-10  1:32         ` Yotam Barnoy
@ 2016-03-10  1:43           ` Markus Mottl
  2016-03-10  7:20             ` Mark Shinwell
  0 siblings, 1 reply; 23+ messages in thread
From: Markus Mottl @ 2016-03-10  1:43 UTC
  To: Yotam Barnoy; +Cc: Mark Shinwell, Alain Frisch, OCaml List

I agree with Yotam.  Assuming that Flambda produces correct code and
doesn't cause any serious performance issues either with the generated
code or with excessive compile times, I'd prefer building it into the
compiler by default.  I'd be fine if I had to pass an extra flag at
compile time to actually run Flambda optimizers, but it should at
least be available.  It doesn't have to be perfect to be useful.


* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-10  1:43           ` Markus Mottl
@ 2016-03-10  7:20             ` Mark Shinwell
  2016-03-10 15:32               ` Markus Mottl
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Shinwell @ 2016-03-10  7:20 UTC
  To: Markus Mottl; +Cc: Yotam Barnoy, Alain Frisch, OCaml List

By "enabled at configure time" I mean that you need to pass the
"-flambda" option to the configure script when building the compiler.

The main reason Flambda isn't enabled by default is that we need to
do further work to improve compile-time performance.  There are also
concerns about .cmx file size.  Flambda produces larger .cmx files: it
stores the entire intermediate representation of the compilation unit
so that no subsequent cross-module inlining decision is compromised.

There is a mode, -Oclassic, which uses Flambda but mimics the
behaviour of the existing compiler; unfortunately this isn't really
fast enough yet either and .cmx sizes aren't small enough.

When we manage to address some of these issues further, hopefully for
4.04, we will revisit whether Flambda should be enabled by default.

One of the main reasons there is a configure option rather than a
runtime switch is to avoid having to re-engineer the compiler's build
system to permit multiple builds of the various libraries (the stdlib,
for example) with differing options that affect what appears in the
.cmx files (e.g. with and without Flambda).  Even if code were added to
allow Flambda to read non-Flambda .cmx files, performance degradation
would result.

Mark


* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-10  7:20             ` Mark Shinwell
@ 2016-03-10 15:32               ` Markus Mottl
  2016-03-10 15:49                 ` Gabriel Scherer
  2016-03-10 20:12                 ` [Caml-list] <DKIM> " Pierre Chambart
  0 siblings, 2 replies; 23+ messages in thread
From: Markus Mottl @ 2016-03-10 15:32 UTC
  To: Mark Shinwell; +Cc: Yotam Barnoy, Alain Frisch, OCaml List

Ok, that explains things.  Is it realistic to assume that the size of
.cmx files can be substantially reduced?  It seems there is a natural
tradeoff between "optimize well" and "compile fast".  I suspect it may
be inevitable to add more compilation files.  We actually already have
that situation with native code libraries: the .cmxa file is enough to
compile a project, but if the .cmx files of contained modules are
visible in the path, too, then, and only then, the compiler can and
will do cross-module inlining - which takes longer, of course.

What about the following approach? - There is one "minimal" set of
compilation files that always allows you to quickly obtain a running
(albeit slow / large) executable.   Additional compilation files then
monotonically augment this information and can be produced and
consumed optionally depending on compilation flags.  The nice thing
about this approach is that you don't necessarily have to recompile
the whole project with different flags whenever you need a different
compile time / performance tradeoff.  E.g. if Flambda information is
available for an unchanged file, you don't have to rebuild it when
needed.  If you just want to compile quickly, you don't have to read
data you don't need.  Separate compilation files would also integrate
much better with build tools (timestamping, etc.).

I guess we would already be looking at OCaml version 5 for such a change :)

On Thu, Mar 10, 2016 at 2:20 AM, Mark Shinwell <mshinwell@janestreet.com> wrote:
> By "enabled at configure time" I mean that you need to pass the
> "-flambda" option to the configure script when building the compiler.
>
> The main reason Flambda isn't enabled by default is because we need to
> do further work to improve compile-time performance.  There are also
> concerns about .cmx file size.  Flambda produces larger .cmx files: it
> stores the entire intermediate representation of the compilation unit
> so that no subsequent cross-module inlining decision is compromised.
>
> There is a mode, -Oclassic, which uses Flambda but mimics the
> behaviour of the existing compiler; unfortunately this isn't really
> fast enough yet either and .cmx sizes aren't small enough.
>
> When we manage to address some of these issues further, hopefully for
> 4.04, we will revisit whether Flambda should be enabled by default.
>
> One of the main reasons there is a configure option rather than a
> runtime switch is to avoid having to re-engineer the compiler's build
> system to permit multiple builds of the various libraries (the stdlib,
> for example) with differing options that affect what appears in the
> .cmx files (e.g. with and without Flambda).  Even if code were used to
> allow Flambda to read non-Flambda .cmx files, performance degradation
> would result.
>
> Mark



-- 
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-10 15:32               ` Markus Mottl
@ 2016-03-10 15:49                 ` Gabriel Scherer
  2016-04-17  8:43                   ` Jesper Louis Andersen
  2016-03-10 20:12                 ` [Caml-list] <DKIM> " Pierre Chambart
  1 sibling, 1 reply; 23+ messages in thread
From: Gabriel Scherer @ 2016-03-10 15:49 UTC (permalink / raw)
  To: Markus Mottl; +Cc: Mark Shinwell, Yotam Barnoy, Alain Frisch, OCaml List

One point that is tangentially related to your message is that the
flambda people observed that it's easy to miss cross-module
optimizations because .cmx files are missing -- the compiler is silent
about this. Leo White added a new warning (58) that is emitted when the
compiler cannot find the .cmx of one of a module's dependencies. It
interacts with -opaque (initially introduced in 4.02.0 for compiling
implementation files) in the following way. In 4.03, you can compile an
*interface* with -opaque, announcing the intent not to provide a .cmx
file for its implementation(s) (or to choose among several
implementations at link time). Warning 58 will not be reported for a
missing .cmx if the dependency's interface was compiled with -opaque.
  https://github.com/ocaml/ocaml/pull/319

I think the long-term plan is to encourage people to enable the
warning, and explicitly use -opaque on .cmi when it is their intent
not to distribute .cmx files. That said, those things may be refined
once we get more experience with flambda in the wild.
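
To make the scenario concrete, here is a small single-file sketch. The
module names Dep and Client are invented for illustration; in a real
project they would live in separate files (dep.mli, dep.ml, client.ml).
The commands in the comments reflect my reading of the flags discussed
above and should be treated as assumptions, not as a verified recipe.

```ocaml
(* Stand-in for dep.mli and dep.ml (hypothetical file names). *)
module Dep : sig
  val succ : int -> int
end = struct
  let succ x = x + 1
end

(* Stand-in for client.ml. In a multi-file build, whether Dep.succ can be
   inlined here depends on dep.cmx being visible when client.ml is
   compiled. Assumed commands:

     ocamlopt -c dep.mli dep.ml      # produces dep.cmi, dep.cmx, dep.o
     ocamlopt -w +58 -c client.ml    # warning 58 if dep.cmx is not found

   Compiling the interface with -opaque instead announces that no .cmx
   is intended, so warning 58 is not reported for Dep:

     ocamlopt -opaque -c dep.mli *)
module Client = struct
  let map_pair f (x, y) = (f x, f y)
  let map_pair_succ pair = map_pair Dep.succ pair
end

let () =
  let a, b = Client.map_pair_succ (1, 2) in
  Printf.printf "%d %d\n" a b
```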

On Thu, Mar 10, 2016 at 10:32 AM, Markus Mottl <markus.mottl@gmail.com> wrote:
> Ok, that explains things.  Is it realistic to assume that the size of
> .cmx files can be substantially reduced?  It seems there is a natural
> tradeoff between "optimize well" and "compile fast".  I suspect it may
> be inevitable to add more compilation files.  We actually already have
> that situation with native code libraries: the .cmxa file is enough to
> compile a project, but if the .cmx files of contained modules are
> visible in the path, too, then, and only then, the compiler can and
> will do cross-module inlining - which takes longer, of course.
>
> What about the following approach? - There is one "minimal" set of
> compilation files that always allows you to quickly obtain a running
> (albeit slow / large) executable.   Additional compilation files then
> monotonically augment this information and can be produced and
> consumed optionally depending on compilation flags.  The nice thing
> about this approach is that you don't necessarily have to recompile
> the whole project with different flags whenever you need a different
> compile time / performance tradeoff.  E.g. if Flambda information is
> available for an unchanged file, you don't have to rebuild it when
> needed.  If you just want to compile quickly, you don't have to read
> data you don't need.  Separate compilation files would also integrate
> much better with build tools (timestamping, etc.).
>
> I guess we would already be looking at OCaml version 5 for such a change :)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-10 15:32               ` Markus Mottl
  2016-03-10 15:49                 ` Gabriel Scherer
@ 2016-03-10 20:12                 ` Pierre Chambart
  2016-03-10 21:08                   ` Markus Mottl
  2016-03-10 22:51                   ` Gerd Stolpmann
  1 sibling, 2 replies; 23+ messages in thread
From: Pierre Chambart @ 2016-03-10 20:12 UTC (permalink / raw)
  To: Markus Mottl; +Cc: OCaml List

It is realistic when using the -Oclassic option that Mark mentioned.
By default, the flambda inlining decision is made at the call site. Hence
all the information about a function needs to be available to decide
correctly. That means that the size of the .cmx file grows approximately
linearly with the .o file size. It is not easy to decide that some
function will never be inlined, so the information is always kept,
even for functions annotated with [@inline never]; I wouldn't expect
dropping it for those functions to gain much anyway. In -Oclassic mode,
however, the decision is made at the definition, so it is possible to
leave some information out of the .cmx. This is what happens in
non-flambda mode. In flambda mode, -Oclassic also reduces the .cmx size
a bit, but not as much as it could. This will probably improve
in 4.04 if there is sufficient interest in the -Oclassic mode.
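
As a small sketch of the annotations mentioned above: the function names
below are made up (reusing the map_pair example from the start of the
thread), and whether the optimizer acts on an attribute depends on how
the compiler was configured and on the flags used. The [@inline]
attributes sit on definitions, while [@inlined] is attached to one
particular call.

```ocaml
(* Definition-site attribute: ask that this function never be inlined. *)
let[@inline never] slow_path x = x * 2

(* Definition-site attribute: ask that this function be inlined. *)
let[@inline] succ x = x + 1

let map_pair f (x, y) = (f x, f y)

(* Inlining succ here requires the optimizer to inline map_pair first. *)
let map_pair_succ1 pair = map_pair succ pair

(* Call-site attribute: request inlining of these two applications. *)
let map_pair_succ2 (x, y) = ((succ [@inlined]) x, (succ [@inlined]) y)

let () =
  let a, b = map_pair_succ1 (1, 2) in
  let c, d = map_pair_succ2 (1, 2) in
  Printf.printf "%d %d %d %d %d\n" a b c d (slow_path 3)
```

Comparing the assembly produced by "ocamlopt -S" for map_pair_succ1 and
map_pair_succ2 is one way to see what was actually inlined.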
-- 
Pierre

On 10/03/2016 16:32, Markus Mottl wrote:
> Ok, that explains things.  Is it realistic to assume that the size of
> .cmx files can be substantially reduced?  It seems there is a natural
> tradeoff between "optimize well" and "compile fast".  I suspect it may
> be inevitable to add more compilation files.  We actually already have
> that situation with native code libraries: the .cmxa file is enough to
> compile a project, but if the .cmx files of contained modules are
> visible in the path, too, then, and only then, the compiler can and
> will do cross-module inlining - which takes longer, of course.
>
> What about the following approach? - There is one "minimal" set of
> compilation files that always allows you to quickly obtain a running
> (albeit slow / large) executable.   Additional compilation files then
> monotonically augment this information and can be produced and
> consumed optionally depending on compilation flags.  The nice thing
> about this approach is that you don't necessarily have to recompile
> the whole project with different flags whenever you need a different
> compile time / performance tradeoff.  E.g. if Flambda information is
> available for an unchanged file, you don't have to rebuild it when
> needed.  If you just want to compile quickly, you don't have to read
> data you don't need.  Separate compilation files would also integrate
> much better with build tools (timestamping, etc.).
>
> I guess we would already be looking at OCaml version 5 for such a change :)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-10 20:12                 ` [Caml-list] <DKIM> " Pierre Chambart
@ 2016-03-10 21:08                   ` Markus Mottl
  2016-03-10 22:51                   ` Gerd Stolpmann
  1 sibling, 0 replies; 23+ messages in thread
From: Markus Mottl @ 2016-03-10 21:08 UTC (permalink / raw)
  To: Pierre Chambart; +Cc: OCaml List

One problem I see with the current approach is with libraries: either
their .cmx files contain sufficient information for Flambda /
cross-module inlining, in which case compiling and linking with many
such libraries may be slow, or they don't, in which case the
generated executable will be slow.  I don't think people will want to
maintain multiple library installations with different compilation
flags.

Users who just want to build an application generally don't care much
about (linear) compile times, because they only rarely install or
recompile a package.  But they will typically want good performance.
Developers usually don't care much about compile times either the
first time they compile a project.  But recompilations had better be
lightning fast.

That's why I think multiple compilation files may be the better
long-term solution.  It may be slightly slower to build a project the
first time round, because you have to create more files.  But being
able to reuse compiled information in a more fine-grained way will
both satisfy developers who need fast recompilations, and users who
want good performance.  Note that sometimes developers also have to be
able to quickly switch between "slow" and "fast" compilation if they
are in the process of tuning their code.

On Thu, Mar 10, 2016 at 3:12 PM, Pierre Chambart
<pierre.chambart@laposte.net> wrote:
> It is realistic when using the -Oclassic option that Mark mentioned.
> By default, the flambda inlining decision is made at the call site. Hence
> all the information about a function needs to be available to decide
> correctly. That means that the size of the .cmx file grows approximately
> linearly with the .o file size. It is not easy to decide that some
> function will never be inlined, so the information is always kept,
> even for functions annotated with [@inline never]; I wouldn't expect
> dropping it for those functions to gain much anyway. In -Oclassic mode,
> however, the decision is made at the definition, so it is possible to
> leave some information out of the .cmx. This is what happens in
> non-flambda mode. In flambda mode, -Oclassic also reduces the .cmx size
> a bit, but not as much as it could. This will probably improve
> in 4.04 if there is sufficient interest in the -Oclassic mode.
> --
> Pierre
>
> On 10/03/2016 16:32, Markus Mottl wrote:
>> Ok, that explains things.  Is it realistic to assume that the size of
>> .cmx files can be substantially reduced?  It seems there is a natural
>> tradeoff between "optimize well" and "compile fast".  I suspect it may
>> be inevitable to add more compilation files.  We actually already have
>> that situation with native code libraries: the .cmxa file is enough to
>> compile a project, but if the .cmx files of contained modules are
>> visible in the path, too, then, and only then, the compiler can and
>> will do cross-module inlining - which takes longer, of course.
>>
>> What about the following approach? - There is one "minimal" set of
>> compilation files that always allows you to quickly obtain a running
>> (albeit slow / large) executable.   Additional compilation files then
>> monotonically augment this information and can be produced and
>> consumed optionally depending on compilation flags.  The nice thing
>> about this approach is that you don't necessarily have to recompile
>> the whole project with different flags whenever you need a different
>> compile time / performance tradeoff.  E.g. if Flambda information is
>> available for an unchanged file, you don't have to rebuild it when
>> needed.  If you just want to compile quickly, you don't have to read
>> data you don't need.  Separate compilation files would also integrate
>> much better with build tools (timestamping, etc.).
>>
>> I guess we would already be looking at OCaml version 5 for such a change :)
>>
>> On Thu, Mar 10, 2016 at 2:20 AM, Mark Shinwell <mshinwell@janestreet.com> wrote:
>>> By "enabled at configure time" I mean that you need to pass the
>>> "-flambda" option to the configure script when building the compiler.
>>>
>>> The main reason Flambda isn't enabled by default is because we need to
>>> do further work to improve compile-time performance.  There are also
>>> concerns about .cmx file size.  Flambda produces larger .cmx files: it
>>> stores the entire intermediate representation of the compilation unit
>>> so that no subsequent cross-module inlining decision is compromised.
>>>
>>> There is a mode, -Oclassic, which uses Flambda but mimics the
>>> behaviour of the existing compiler; unfortunately this isn't really
>>> fast enough yet either and .cmx sizes aren't small enough.
>>>
>>> When we manage to address some of these issues further, hopefully for
>>> 4.04, we will revisit whether Flambda should be enabled by default.
>>>
>>> One of the main reasons there is a configure option rather than a
>>> runtime switch is to avoid having to re-engineer the compiler's build
>>> system to permit multiple builds of the various libraries (the stdlib,
>>> for example) with differing options that affect what appears in the
>>> .cmx files (e.g. with and without Flambda).  Even if code were used to
>>> allow Flambda to read non-Flambda .cmx files, performance degradation
>>> would result.
>>>
>>> Mark
>>>
>>> On 10 March 2016 at 01:43, Markus Mottl <markus.mottl@gmail.com> wrote:
>>>> I agree with Yotam.  Assuming that Flambda produces correct code and
>>>> doesn't cause any serious performance issues either with the generated
>>>> code or with excessive compile times, I'd prefer building it into the
>>>> compiler by default.  I'd be fine if I had to pass an extra flag at
>>>> compile time to actually run Flambda optimizers, but it should at
>>>> least be available.  It doesn't have to be perfect to be useful.
>>>>
>>>> On Wed, Mar 9, 2016 at 8:32 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote:
>>>>> While we await the manual, can you explain what you mean by 'enabled at
>>>>> configure time'? Will a -flambda -O-something argument passed to the normal
>>>>> 4.03 compiler enable flambda optimizations? Flambda is clearly the star of
>>>>> the 4.03 release, so not enabling it using command line options seems
>>>>> counter-intuitive (if this is the case).
>>>>>
>>>>> -Yotam
>>>>>
>>>>> On Wed, Mar 9, 2016 at 7:59 PM, Markus Mottl <markus.mottl@gmail.com> wrote:
>>>>>> I've just tested Flambda, and it seems to already be doing a pretty
>>>>>> decent job on some non-trivial examples (e.g. inlining combinations of
>>>>>> functors and first class functions).  I hope there will be a stable
>>>>>> 4.03 OPAM switch that enables it.  I'm looking forward to being able
>>>>>> to write more elegant, abstract code that's still efficient.
>>>>>>
>>>>>> Regards,
>>>>>> Markus
>>>>>>
>>>>>> On Wed, Mar 9, 2016 at 2:14 AM, Mark Shinwell <mshinwell@janestreet.com>
>>>>>> wrote:
>>>>>>> It will not be enabled by default in 4.03.  For the majority of
>>>>>>> programs, in the current state, it should improve performance (mainly
>>>>>>> by lowering allocation).  It should never generate wrong code.
>>>>>>> However we know of examples that don't improve as much as we would
>>>>>>> like, which we will try to address for 4.04.
>>>>>>>
>>>>>>> There will be a draft version of the new Flambda manual chapter
>>>>>>> available shortly (hopefully this week).  Amongst other things this
>>>>>>> documents what you found about the configure options and the flags'
>>>>>>> operation.
>>>>>>>
>>>>>>> Mark
>>>>>>>
>>>>>>> On 9 March 2016 at 03:55, Markus Mottl <markus.mottl@gmail.com> wrote:
>>>>>>>> Hi Alain,
>>>>>>>>
>>>>>>>> I see, thanks.  It was a little confusing, because the command line
>>>>>>>> options for tuning flambda were still available even without Flambda
>>>>>>>> being enabled.
>>>>>>>>
>>>>>>>> Will Flambda be enabled by default in OCaml 4.03 or is it still
>>>>>>>> considered to be too experimental?  It could turn out to become one of
>>>>>>>> the most impactful new features in terms of how I write code.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Markus
>>>>>>>>
>>>>>>>> On Tue, Mar 8, 2016 at 5:53 PM, Alain Frisch <alain.frisch@lexifi.com>
>>>>>>>> wrote:
>>>>>>>>> Hi Markus,
>>>>>>>>>
>>>>>>>>> flambda needs to be enabled explicitly at configure time with the
>>>>>>>>> "-flambda" flag.  The new optimizer will then be used unconditionally,
>>>>>>>>> and you can tweak it using command-line parameters passed to ocamlopt
>>>>>>>>> (see "ocamlopt -h").
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Alain
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 08/03/2016 23:10, Markus Mottl wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm trying out OCaml 4.03.0+beta1 right now and wanted to test Flambda
>>>>>>>>>> optimizations.  But looking at the generated assembly, it doesn't seem
>>>>>>>>>> to be doing much if anything on the simple test examples that I
>>>>>>>>>> thought would benefit.
>>>>>>>>>>
>>>>>>>>>> To give an example of what I expected to see, let's consider this code:
>>>>>>>>>>
>>>>>>>>>> -----
>>>>>>>>>> let map_pair f (x, y) = f x, f y
>>>>>>>>>>
>>>>>>>>>> let succ x = x + 1
>>>>>>>>>> let map_pair_succ1 pair = map_pair succ pair
>>>>>>>>>> let map_pair_succ2 (x, y) = succ x, succ y
>>>>>>>>>> -----
>>>>>>>>>>
>>>>>>>>>> I would have thought that the "succ" function would be inlined in
>>>>>>>>>> "map_pair_succ1" as the compiler would do for "map_pair_succ2".
>>>>>>>>>> But the generated code looks like this:
>>>>>>>>>>
>>>>>>>>>> -----
>>>>>>>>>> L101:
>>>>>>>>>>    movq  %rax, %rdi
>>>>>>>>>>    movq  %rdi, 8(%rsp)
>>>>>>>>>>    movq  %rbx, (%rsp)
>>>>>>>>>>    movq  8(%rbx), %rax
>>>>>>>>>>    movq  (%rdi), %rsi
>>>>>>>>>>    movq  %rdi, %rbx
>>>>>>>>>>    call  *%rsi
>>>>>>>>>> L102:
>>>>>>>>>>    movq  %rax, 16(%rsp)
>>>>>>>>>>    movq  (%rsp), %rax
>>>>>>>>>>    movq  (%rax), %rax
>>>>>>>>>>    movq  8(%rsp), %rbx
>>>>>>>>>>    movq  (%rbx), %rdi
>>>>>>>>>>    call  *%rdi
>>>>>>>>>> -----
>>>>>>>>>>
>>>>>>>>>> Is Flambda supposed to work out of the box with the current beta?
>>>>>>>>>> What flags or annotations should I use for testing?  Any showcase
>>>>>>>>>> examples I should try out that are expected to be improved?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Markus
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>>>>>>>
>>>>>>>> --
>>>>>>>> Caml-list mailing list.  Subscription management and archives:
>>>>>>>> https://sympa.inria.fr/sympa/arc/caml-list
>>>>>>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>>>>>>>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>>>>
>>>>
>>>>
>>>> --
>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>
>>
>



-- 
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-10 20:12                 ` [Caml-list] <DKIM> " Pierre Chambart
  2016-03-10 21:08                   ` Markus Mottl
@ 2016-03-10 22:51                   ` Gerd Stolpmann
  2016-03-11  8:59                     ` Mark Shinwell
  1 sibling, 1 reply; 23+ messages in thread
From: Gerd Stolpmann @ 2016-03-10 22:51 UTC (permalink / raw)
  To: Pierre Chambart; +Cc: Markus Mottl, OCaml List

On Thursday, 10.03.2016, at 21:12 +0100, Pierre Chambart wrote:
> It is realistic when using the -Oclassic option that Mark mentioned.
> By default the flambda inlining heuristic is decided at call site. Hence
> all the information about a function needs to be available to correctly
> decide. That means that the size of the cmx file is approximately
> linearly related to the .o file size. It is not easy to decide that some
> function will never be inlined, so the information is always kept,
> even for functions annotated with [@inline never].

This assumes that the user is fine with an unlimited code size blow-up.
To give an example, when you have

let f() =
  if <expr1> then <short-expr2> else <very-long-expr3>

there is a chance that the "if-then" part can be inlined and leads to
a speed-up, at the price that the unproductive "else" part is also
inlined. Overall, there is a good chance that you will see some
speed-up. However, the question is whether the code duplication is
acceptable or not. I guess you also need to draw a line at the callee
site, and disregard functions that are too large in total (though this
limit can be way higher than the limit for "classic" inlining).

Surely this will also limit the cmx size somewhat.
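
For reference, the attribute Pierre mentions can already be used to draw
that line by hand, either at the definition or at a single call site.  A
minimal sketch, assuming a made-up function in the shape of the example
above (the names f and g and the loop body are illustrative only):

```ocaml
(* Hypothetical function: a cheap "then" branch and a larger "else" branch. *)
let f cheap x =
  if cheap then x + 1
  else begin
    (* Stands in for <very-long-expr3>. *)
    let acc = ref 0 in
    for i = 1 to 10 do
      acc := !acc + (i * x)
    done;
    !acc
  end
[@@inline never]  (* definition-site request: never inline f anywhere *)

(* Call-site request: do not inline this particular application of f. *)
let g x = (f [@inlined never]) true x
```

The attributes only affect inlining decisions, not results.  Going by
Pierre's description, in the default Flambda mode of 4.03 the body of f
would still be recorded in the .cmx file even with the attribute present.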

Gerd

>  But I wouldn't
> expect that to benefit that much. But for the -Oclassic mode where
> the decision is made at the definition, it is possible to decide not
> to include some information in the cmx. This is what happens in
> non-flambda mode, and in flambda mode it also reduces the cmx
> size a bit, but not as much as it could. This will probably improve
> in 4.04 if there is sufficient interest in this -Oclassic mode.
> -- 
> Pierre
> 

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
My OCaml site:          http://www.camlcity.org
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-10 22:51                   ` Gerd Stolpmann
@ 2016-03-11  8:59                     ` Mark Shinwell
  2016-03-11  9:05                       ` Mark Shinwell
                                         ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Mark Shinwell @ 2016-03-11  8:59 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Pierre Chambart, Markus Mottl, OCaml List

Markus: I think we should at least consider whether package management
can help with multiple installations of libraries at different
optimisation levels.  As regards multiple files, I suppose it could be
configurable, but I worry that for large source trees the overhead of
having that many more compilation artifacts may be non-negligible.
Perhaps another option would be to arrange the .cmx files so that they
could be read without importing the full information for optimisation
unless requested.

Gerd: Unless you have unlimited source files, there shouldn't be
unlimited code size blowup, because there are parameters that restrict
inlining.  In particular (unless the user forces behaviour via an
attribute) there is always a calculation that weighs up the change in
code size resulting from a proposed inlining against the expected
runtime performance benefit based on which operations will be
simplified away as a result of doing such inlining.
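
To make the shape of that calculation concrete, here is a toy model.  It
is only a sketch: the record fields, the weights, and the threshold are
invented for illustration and do not correspond to Flambda's actual
heuristics or parameters.

```ocaml
(* Toy model of a call-site inlining decision.  All names and numbers
   here are hypothetical; this is not Flambda's implementation. *)
type estimate = {
  size_before : int;       (* size of the call as written, arbitrary units *)
  size_after : int;        (* size of the inlined and simplified body *)
  removed_calls : int;     (* calls that disappear after simplification *)
  removed_allocs : int;    (* allocations that disappear *)
  removed_branches : int;  (* branches that disappear *)
}

(* Expected runtime benefit: a weighted count of the removed operations. *)
let benefit e =
  (5 * e.removed_calls) + (7 * e.removed_allocs) + (3 * e.removed_branches)

(* Inline only when the benefit outweighs the growth in code size. *)
let should_inline ?(threshold = 0) e =
  let growth = e.size_after - e.size_before in
  benefit e - growth > threshold
```

With these made-up weights, a candidate that removes one call and one
allocation while growing the code by 6 units is accepted, whereas one
that removes a single call but grows the code by 36 units is rejected.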

Also, for a function like the one you gave containing:

  if <cond> then <small expr> else <big expr>

one of the reasons this should be kept in the .cmx files is because,
when the compiler comes to examine whether to inline it, it may be
able to fully evaluate <cond>.  In particular when it's true then the
large expression can be eliminated completely (so long as <cond> is
not side-effecting).  Another example is functions containing a large
match, where we may end up knowing which case is to be taken.
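
As a concrete illustration, consider the following sketch.  The function
describe and its ~terse argument are made-up names, and whether the
inlining actually happens depends on the compiler configuration and
flags in use:

```ocaml
(* Hypothetical function of the form: if <cond> then <small> else <big>. *)
let describe ~terse n =
  if terse then string_of_int n                      (* <small expr> *)
  else begin                                         (* <big expr> *)
    let buf = Buffer.create 64 in
    Buffer.add_string buf "value=";
    Buffer.add_string buf (string_of_int n);
    Buffer.add_string buf (if n mod 2 = 0 then " (even)" else " (odd)");
    Buffer.add_string buf
      (if n < 0 then " (negative)" else " (non-negative)");
    Buffer.contents buf
  end

(* Here <cond> is the literal true.  If describe is inlined at this call
   site, the whole else branch can be dropped, leaving roughly
   string_of_int n. *)
let describe_terse n = describe ~terse:true n
```

If the body of describe had been left out of the .cmx file because of
its overall size, a call like the one in describe_terse in another
module could not be simplified this way.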

Mark

On 10 March 2016 at 22:51, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> On Thursday, 10.03.2016, at 21:12 +0100, Pierre Chambart wrote:
>> It is realistic when using the -Oclassic option that Mark mentioned.
>> By default the flambda inlining heuristic is decided at call site. Hence
>> all the information about a function needs to be available to correctly
>> decide. That means that the size of the cmx file is approximately
>> linearly related to the .o file size. It is not easy to decide that some
>> function will never be inlined, so the information is always kept,
>> even for functions annotated with [@inline never].
>
> This assumes that the user is fine with an unlimited code size blow-up.
> To give an example, when you have
>
> let f() =
>   if <expr1> then <short-expr2> else <very-long-expr3>
>
> there is a chance that the "if-then" part can be inlined and leads to
> a speed-up, at the price that the unproductive "else" part is also
> inlined. Overall, there is a good chance that you will see some
> speed-up. However, the question is whether the code duplication is
> acceptable or not. I guess you also need to draw a line at the callee
> site, and disregard functions that are too large in total (though this
> limit can be way higher than the limit for "classic" inlining).
>
> Surely this will also limit the cmx size somewhat.
>
> Gerd
>
>>  But I wouldn't
>> expect that to benefit that much. But for the -Oclassic mode where
>> the decision is made at the definition, it is possible to decide not
>> to include some information in the cmx. This is what happens in
>> non-flambda mode, and in flambda mode it also reduces the cmx
>> size a bit, but not as much as it could. This will probably improve
>> in 4.04 if there is sufficient interest in this -Oclassic mode.
>> --
>> Pierre
>>
>> On 10/03/2016 16:32, Markus Mottl wrote:
>> > Ok, that explains things.  Is it realistic to assume that the size of
>> > .cmx files can be substantially reduced?  It seems there is a natural
>> > tradeoff between "optimize well" and "compile fast".  I suspect it may
>> > be inevitable to add more compilation files.  We actually already have
>> > that situation with native code libraries: the .cmxa file is enough to
>> > compile a project, but if the .cmx files of contained modules are
>> > visible in the path, too, then, and only then, the compiler can and
>> > will do cross-module inlining - which takes longer, of course.
>> >
>> > What about the following approach? - There is one "minimal" set of
>> > compilation files that always allows you to quickly obtain a running
>> > (albeit slow / large) executable.   Additional compilation files then
>> > monotonically augment this information and can be produced and
>> > consumed optionally depending on compilation flags.  The nice thing
>> > about this approach is that you don't necessarily have to recompile
>> > the whole project with different flags whenever you need a different
>> > compile time / performance tradeoff.  E.g. if Flambda information is
>> > available for an unchanged file, you don't have to rebuild it when
>> > needed.  If you just want to compile quickly, you don't have to read
>> > data you don't need.  Separate compilation files would also integrate
>> > much better with build tools (timestamping, etc.).
>> >
>> > I guess we would already be looking at OCaml version 5 for such a change :)
>> >
>> > On Thu, Mar 10, 2016 at 2:20 AM, Mark Shinwell <mshinwell@janestreet.com> wrote:
>> >> By "enabled at configure time" I mean that you need to pass the
>> >> "-flambda" option to the configure script when building the compiler.
>> >>
>> >> The main reason Flambda isn't enabled by default is because we need to
>> >> do further work to improve compile-time performance.  There are also
>> >> concerns about .cmx file size.  Flambda produces larger .cmx files: it
>> >> stores the entire intermediate representation of the compilation unit
>> >> so that no subsequent cross-module inlining decision is compromised.
>> >>
>> >> There is a mode, -Oclassic, which uses Flambda but mimics the
>> >> behaviour of the existing compiler; unfortunately this isn't really
>> >> fast enough yet either and .cmx sizes aren't small enough.
>> >>
>> >> When we manage to address some of these issues further, hopefully for
>> >> 4.04, we will revisit whether Flambda should be enabled by default.
>> >>
>> >> One of the main reasons there is a configure option rather than a
>> >> runtime switch is to avoid having to re-engineer the compiler's build
>> >> system to permit multiple builds of the various libraries (the stdlib,
>> >> for example) with differing options that affect what appears in the
>> >> .cmx files (e.g. with and without Flambda).  Even if code were used to
>> >> allow Flambda to read non-Flambda .cmx files, performance degradation
>> >> would result.
>> >>
>> >> Mark
>> >>
>> >> On 10 March 2016 at 01:43, Markus Mottl <markus.mottl@gmail.com> wrote:
>> >>> I agree with Yotam.  Assuming that Flambda produces correct code and
>> >>> doesn't cause any serious performance issues either with the generated
>> >>> code or with excessive compile times, I'd prefer building it into the
>> >>> compiler by default.  I'd be fine if I had to pass an extra flag at
>> >>> compile time to actually run Flambda optimizers, but it should at
>> >>> least be available.  It doesn't have to be perfect to be useful.
>> >>>
>> >>> On Wed, Mar 9, 2016 at 8:32 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote:
>> >>>> While we await the manual, can you explain what you mean by 'enabled at
>> >>>> configure time'? Will a -flambda -O-something argument passed to the normal
>> >>>> 4.03 compiler enable flambda optimizations? Flambda is clearly the star of
>> >>>> the 4.03 release, so not enabling it using command line options seems
>> >>>> counter-intuitive (if this is the case).
>> >>>>
>> >>>> -Yotam
>> >>>>
>> >>>> On Wed, Mar 9, 2016 at 7:59 PM, Markus Mottl <markus.mottl@gmail.com> wrote:
>> >>>>> I've just tested Flambda, and it seems to already be doing a pretty
>> >>>>> decent job on some non-trivial examples (e.g. inlining combinations of
>> >>>>> functors and first class functions).  I hope there will be a stable
>> >>>>> 4.03 OPAM switch that enables it.  I'm looking forward to being able
>> >>>>> to write more elegant, abstract code that's still efficient.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Markus
>> >>>>>
>> >>>>> On Wed, Mar 9, 2016 at 2:14 AM, Mark Shinwell <mshinwell@janestreet.com>
>> >>>>> wrote:
>> >>>>>> It will not be enabled by default in 4.03.  For the majority of
>> >>>>>> programs, in the current state, it should improve performance (mainly
>> >>>>>> by lowering allocation).  It should never generate wrong code.
>> >>>>>> However we know of examples that don't improve as much as we would
>> >>>>>> like, which we will try to address for 4.04.
>> >>>>>>
>> >>>>>> There will be a draft version of the new Flambda manual chapter
>> >>>>>> available shortly (hopefully this week).  Amongst other things this
>> >>>>>> documents what you found about the configure options and the flags'
>> >>>>>> operation.
>> >>>>>>
>> >>>>>> Mark
>> >>>>>>
>> >>>>>> On 9 March 2016 at 03:55, Markus Mottl <markus.mottl@gmail.com> wrote:
>> >>>>>>> Hi Alain,
>> >>>>>>>
>> >>>>>>> I see, thanks.  It was a little confusing, because the command line
>> >>>>>>> options for tuning flambda were still available even without Flambda
>> >>>>>>> being enabled.
>> >>>>>>>
>> >>>>>>> Will Flambda be enabled by default in OCaml 4.03 or is it still
>> >>>>>>> considered to be too experimental?  It could turn out to become one of
>> >>>>>>> the most impactful new features in terms of how I write code.
>> >>>>>>>
>> >>>>>>> Regards,
>> >>>>>>> Markus
>> >>>>>>>
>> >>>>>>> On Tue, Mar 8, 2016 at 5:53 PM, Alain Frisch <alain.frisch@lexifi.com>
>> >>>>>>> wrote:
>> >>>>>>>> Hi Markus,
>> >>>>>>>>
>> >>>>>>>> flambda needs to be enabled explicitly at configure time with the
>> >>>>>>>> "-flambda"
>> >>>>>>>> flag.  The new optimizer will then be used unconditionally, and you
>> >>>>>>>> can
>> >>>>>>>> tweak it using command-line parameters passed to ocamlopt (see
>> >>>>>>>> "ocamlopt
>> >>>>>>>> -h").
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Alain
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On 08/03/2016 23:10, Markus Mottl wrote:
>> >>>>>>>>> Hi,
>> >>>>>>>>>
>> >>>>>>>>> I'm trying out OCaml 4.03.0+beta1 right now and wanted to test
>> >>>>>>>>> Flambda
>> >>>>>>>>> optimizations.  But looking at the generated assembly, it doesn't
>> >>>>>>>>> seem
>> >>>>>>>>> to be doing much if anything on the simple test examples that I
>> >>>>>>>>> thought would benefit.
>> >>>>>>>>>
>> >>>>>>>>> To give an example of what I expected to see, lets consider this
>> >>>>>>>>> code:
>> >>>>>>>>>
>> >>>>>>>>> -----
>> >>>>>>>>> let map_pair f (x, y) = f x, f y
>> >>>>>>>>>
>> >>>>>>>>> let succ x = x + 1
>> >>>>>>>>> let map_pair_succ1 pair = map_pair succ pair
>> >>>>>>>>> let map_pair_succ2 (x, y) = succ x, succ y
>> >>>>>>>>> -----
>> >>>>>>>>>
>> >>>>>>>>> I would have thought that the "succ" function would be inlined in
>> >>>>>>>>> "map_pair_succ1" as the compiler would do for "map_pair_succ2".
>> >>>>>>>>> But the generated code looks like this:
>> >>>>>>>>>
>> >>>>>>>>> -----
>> >>>>>>>>> L101:
>> >>>>>>>>>    movq  %rax, %rdi
>> >>>>>>>>>    movq  %rdi, 8(%rsp)
>> >>>>>>>>>    movq  %rbx, (%rsp)
>> >>>>>>>>>    movq  8(%rbx), %rax
>> >>>>>>>>>    movq  (%rdi), %rsi
>> >>>>>>>>>    movq  %rdi, %rbx
>> >>>>>>>>>    call  *%rsi
>> >>>>>>>>> L102:
>> >>>>>>>>>    movq  %rax, 16(%rsp)
>> >>>>>>>>>    movq  (%rsp), %rax
>> >>>>>>>>>    movq  (%rax), %rax
>> >>>>>>>>>    movq  8(%rsp), %rbx
>> >>>>>>>>>    movq  (%rbx), %rdi
>> >>>>>>>>>    call  *%rdi
>> >>>>>>>>> -----
>> >>>>>>>>>
>> >>>>>>>>> Is Flambda supposed to work out of the box with the current beta?
>> >>>>>>>>> What flags or annotations should I use for testing?  Any showcase
>> >>>>>>>>> examples I should try out that are expected to be improved?
>> >>>>>>>>>
>> >>>>>>>>> Regards,
>> >>>>>>>>> Markus
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>> >>>>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>> >>>>>
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>> >
>> >
>>
>>
>
> --
> ------------------------------------------------------------
> Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
> My OCaml site:          http://www.camlcity.org
> Contact details:        http://www.camlcity.org/contact.html
> Company homepage:       http://www.gerd-stolpmann.de
> ------------------------------------------------------------
>


* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-11  8:59                     ` Mark Shinwell
@ 2016-03-11  9:05                       ` Mark Shinwell
  2016-03-11  9:09                       ` Alain Frisch
  2016-03-11 16:58                       ` Markus Mottl
  2 siblings, 0 replies; 23+ messages in thread
From: Mark Shinwell @ 2016-03-11  9:05 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Pierre Chambart, Markus Mottl, OCaml List

(Sorry, I meant: "the large expression and <cond> can be eliminated")
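
To make the corrected statement concrete, here is a minimal sketch of a
call site where <cond> is known.  The names `process`, `expensive_path`
and `bump` are hypothetical and only stand in for the shapes discussed
in the quoted message; whether a given compiler build actually performs
this simplification depends on its configuration and flags.

```ocaml
(* Hypothetical stand-in for <big expr>: sums x * i for i = 1 .. 1000. *)
let expensive_path x =
  let rec loop acc i = if i = 0 then acc else loop (acc + (x * i)) (i - 1) in
  loop 0 1000

(* The function under discussion: if <cond> then <small expr> else <big expr>. *)
let[@inline] process ~fast x =
  if fast then x + 1 else expensive_path x

(* Here [fast] is the constant [true] and has no side effects, so an
   inliner that can evaluate the condition may reduce the inlined body
   to [x + 1], dropping both the test and the [else] branch. *)
let bump x = process ~fast:true x

let () =
  assert (bump 41 = 42);
  assert (process ~fast:false 1 = 500500)
```

The program behaves the same with or without the optimisation; only the
generated code for `bump` would differ.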

On 11 March 2016 at 08:59, Mark Shinwell <mshinwell@janestreet.com> wrote:
> Markus: I think we should at least consider whether package management
> can help with multiple installations of libraries at different
> optimisation levels.  As regards multiple files, I suppose it could be
> configurable, but I worry that for large source trees the overhead of
> having that many more compilation artifacts may be non-negligible.
> Perhaps another option would be to arrange the .cmx files so that they
> could be read without importing the full information for optimisation
> unless requested.
>
> Gerd: Unless you have unlimited source files, there shouldn't be
> unlimited code size blowup, because there are parameters that restrict
> inlining.  In particular (unless the user forces behaviour via an
> attribute) there is always a calculation that weighs up the change in
> code size resulting from a proposed inlining against the expected
> runtime performance benefit based on which operations will be
> simplified away as a result of doing such inlining.
>
> Also, for a function like the one you gave containing:
>
>   if <cond> then <small expr> else <big expr>
>
> one of the reasons this should be kept in the .cmx files is because,
> when the compiler comes to examine whether to inline it, it may be
> able to fully evaluate <cond>.  In particular when it's true then the
> large expression can be eliminated completely (so long as <cond> is
> not side-effecting).  Another example is functions containing a large
> match, where we may end up knowing which case is to be taken.
>
> Mark
>
> On 10 March 2016 at 22:51, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
>> Am Donnerstag, den 10.03.2016, 21:12 +0100 schrieb Pierre Chambart:
>>> It is realistic when using the -Oclassic option that Mark mentioned.
>>> By default the flambda inlining heuristic is decided at call site. Hence
>>> all the information about a function needs to be available to correctly
>>> decide. That means that the size of the cmx file is approximately
>>> linearly related to the .o file size. It is not easy to decide that some
>>> function will never be inlined, so the information is always kept,
>>> even on functions annotated with [@inline never].
>>
>> This assumes that the user is fine with an unlimited code size blow-up.
>> So, to make an example, when you have
>>
>> let f() =
>>   if <expr1> then <short-expr2> else <very-long-expr3>
>>
>> there is the chance that the "if-then" part can be inlined and leads to
>> a speed-up at the price that the unproductive "else" part is also
>> inlined. In total, there is a good chance that you see some
>> acceleration. However, the question is whether the code duplication is
>> acceptable or not. I guess, you need to also draw a line at the callee
>> site, and disregard functions that are too large in total (though this
>> limit can be way higher than the limit for "classic" inlining).
>>
>> Surely this will also limit the cmx size somewhat.
>>
>> Gerd
>>
>>>  But I wouldn't
>>> expect that to benefit that much. But for the -Oclassic mode where
>>> the decision is made at the definition, it is possible to decide not
>>> to include some information in the cmx. This is what happens in
>>> non-flambda mode, and in flambda mode it also reduces the cmx
>>> size a bit, but not as much as it could. This will probably improve
>>> in 4.04 if there is sufficient interest in this -Oclassic mode.
>>> --
>>> Pierre
>>>
>>> On 10/03/2016 16:32, Markus Mottl wrote:
>>> > Ok, that explains things.  Is it realistic to assume that the size of
>>> > .cmx files can be substantially reduced?  It seems there is a natural
>>> > tradeoff between "optimize well" and "compile fast".  I suspect it may
>>> > be inevitable to add more compilation files.  We actually already have
>>> > that situation with native code libraries: the .cmxa file is enough to
>>> > compile a project, but if the .cmx files of contained modules are
>>> > visible in the path, too, then, and only then, the compiler can and
>>> > will do cross-module inlining - which takes longer, of course.
>>> >
>>> > What about the following approach? - There is one "minimal" set of
>>> > compilation files that always allows you to quickly obtain a running
>>> > (albeit slow / large) executable.   Additional compilation files then
>>> > monotonically augment this information and can be produced and
>>> > consumed optionally depending on compilation flags.  The nice thing
>>> > about this approach is that you don't necessarily have to recompile
>>> > the whole project with different flags whenever you need a different
>>> > compile time / performance tradeoff.  E.g. if Flambda information is
>>> > available for an unchanged file, you don't have to rebuild it when
>>> > needed.  If you just want to compile quickly, you don't have to read
>>> > data you don't need.  Separate compilation files would also integrate
>>> > much better with build tools (timestamping, etc.).
>>> >
>>> > I guess we would already be looking at OCaml version 5 for such a change :)
>>> >
>>> > On Thu, Mar 10, 2016 at 2:20 AM, Mark Shinwell <mshinwell@janestreet.com> wrote:
>>> >> By "enabled at configure time" I mean that you need to pass the
>>> >> "-flambda" option to the configure script when building the compiler.
>>> >>
>>> >> The main reason Flambda isn't enabled by default is because we need to
>>> >> do further work to improve compile-time performance.  There are also
>>> >> concerns about .cmx file size.  Flambda produces larger .cmx files: it
>>> >> stores the entire intermediate representation of the compilation unit
>>> >> so that no subsequent cross-module inlining decision is compromised.
>>> >>
>>> >> There is a mode, -Oclassic, which uses Flambda but mimics the
>>> >> behaviour of the existing compiler; unfortunately this isn't really
>>> >> fast enough yet either and .cmx sizes aren't small enough.
>>> >>
>>> >> When we manage to address some of these issues further, hopefully for
>>> >> 4.04, we will revisit whether Flambda should be enabled by default.
>>> >>
>>> >> One of the main reasons there is a configure option rather than a
>>> >> runtime switch is to avoid having to re-engineer the compiler's build
>>> >> system to permit multiple builds of the various libraries (the stdlib,
>>> >> for example) with differing options that affect what appears in the
>>> >> .cmx files (e.g. with and without Flambda).  Even if code were used to
>>> >> allow Flambda to read non-Flambda .cmx files, performance degradation
>>> >> would result.
>>> >>
>>> >> Mark
>>> >>
>>> >> On 10 March 2016 at 01:43, Markus Mottl <markus.mottl@gmail.com> wrote:
>>> >>> I agree with Yotam.  Assuming that Flambda produces correct code and
>>> >>> doesn't cause any serious performance issues either with the generated
>>> >>> code or with excessive compile times, I'd prefer building it into the
>>> >>> compiler by default.  I'd be fine if I had to pass an extra flag at
>>> >>> compile time to actually run Flambda optimizers, but it should at
>>> >>> least be available.  It doesn't have to be perfect to be useful.
>>> >>>
>>> >>> On Wed, Mar 9, 2016 at 8:32 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote:
>>> >>>> While we await the manual, can you explain what you mean by 'enabled at
>>> >>>> configure time'? Will a -flambda -O-something argument passed to the normal
>>> >>>> 4.03 compiler enable flambda optimizations? Flambda is clearly the star of
>>> >>>> the 4.03 release, so not enabling it using command line options seems
>>> >>>> counter-intuitive (if this is the case).
>>> >>>>
>>> >>>> -Yotam
>>> >>>>
>>> >>>> On Wed, Mar 9, 2016 at 7:59 PM, Markus Mottl <markus.mottl@gmail.com> wrote:
>>> >>>>> I've just tested Flambda, and it seems to already be doing a pretty
>>> >>>>> decent job on some non-trivial examples (e.g. inlining combinations of
>>> >>>>> functors and first class functions).  I hope there will be a stable
>>> >>>>> 4.03 OPAM switch that enables it.  I'm looking forward to being able
>>> >>>>> to write more elegant, abstract code that's still efficient.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Markus
>>> >>>>>
>>> >>>>> On Wed, Mar 9, 2016 at 2:14 AM, Mark Shinwell <mshinwell@janestreet.com>
>>> >>>>> wrote:
>>> >>>>>> It will not be enabled by default in 4.03.  For the majority of
>>> >>>>>> programs, in the current state, it should improve performance (mainly
>>> >>>>>> by lowering allocation).  It should never generate wrong code.
>>> >>>>>> However we know of examples that don't improve as much as we would
>>> >>>>>> like, which we will try to address for 4.04.
>>> >>>>>>
>>> >>>>>> There will be a draft version of the new Flambda manual chapter
>>> >>>>>> available shortly (hopefully this week).  Amongst other things this
>>> >>>>>> documents what you found about the configure options and the flags'
>>> >>>>>> operation.
>>> >>>>>>
>>> >>>>>> Mark
>>> >>>>>>
>>> >>>>>> On 9 March 2016 at 03:55, Markus Mottl <markus.mottl@gmail.com> wrote:
>>> >>>>>>> Hi Alain,
>>> >>>>>>>
>>> >>>>>>> I see, thanks.  It was a little confusing, because the command line
>>> >>>>>>> options for tuning flambda were still available even without Flambda
>>> >>>>>>> being enabled.
>>> >>>>>>>
>>> >>>>>>> Will Flambda be enabled by default in OCaml 4.03 or is it still
>>> >>>>>>> considered to be too experimental?  It could turn out to become one of
>>> >>>>>>> the most impactful new features in terms of how I write code.
>>> >>>>>>>
>>> >>>>>>> Regards,
>>> >>>>>>> Markus
>>> >>>>>>>
>>> >>>>>>> On Tue, Mar 8, 2016 at 5:53 PM, Alain Frisch <alain.frisch@lexifi.com>
>>> >>>>>>> wrote:
>>> >>>>>>>> Hi Markus,
>>> >>>>>>>>
>>> >>>>>>>> flambda needs to be enabled explicitly at configure time with the
>>> >>>>>>>> "-flambda" flag.  The new optimizer will then be used
>>> >>>>>>>> unconditionally, and you can tweak it using command-line parameters
>>> >>>>>>>> passed to ocamlopt (see "ocamlopt -h").
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> Alain
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> On 08/03/2016 23:10, Markus Mottl wrote:
>>> >>>>>>>>> Hi,
>>> >>>>>>>>>
>>> >>>>>>>>> I'm trying out OCaml 4.03.0+beta1 right now and wanted to test
>>> >>>>>>>>> Flambda optimizations.  But looking at the generated assembly, it
>>> >>>>>>>>> doesn't seem to be doing much if anything on the simple test
>>> >>>>>>>>> examples that I thought would benefit.
>>> >>>>>>>>>
>>> >>>>>>>>> To give an example of what I expected to see, let's consider this
>>> >>>>>>>>> code:
>>> >>>>>>>>>
>>> >>>>>>>>> -----
>>> >>>>>>>>> let map_pair f (x, y) = f x, f y
>>> >>>>>>>>>
>>> >>>>>>>>> let succ x = x + 1
>>> >>>>>>>>> let map_pair_succ1 pair = map_pair succ pair
>>> >>>>>>>>> let map_pair_succ2 (x, y) = succ x, succ y
>>> >>>>>>>>> -----
>>> >>>>>>>>>
>>> >>>>>>>>> I would have thought that the "succ" function would be inlined in
>>> >>>>>>>>> "map_pair_succ1" as the compiler would do for "map_pair_succ2".
>>> >>>>>>>>> But the generated code looks like this:
>>> >>>>>>>>>
>>> >>>>>>>>> -----
>>> >>>>>>>>> L101:
>>> >>>>>>>>>    movq  %rax, %rdi
>>> >>>>>>>>>    movq  %rdi, 8(%rsp)
>>> >>>>>>>>>    movq  %rbx, (%rsp)
>>> >>>>>>>>>    movq  8(%rbx), %rax
>>> >>>>>>>>>    movq  (%rdi), %rsi
>>> >>>>>>>>>    movq  %rdi, %rbx
>>> >>>>>>>>>    call  *%rsi
>>> >>>>>>>>> L102:
>>> >>>>>>>>>    movq  %rax, 16(%rsp)
>>> >>>>>>>>>    movq  (%rsp), %rax
>>> >>>>>>>>>    movq  (%rax), %rax
>>> >>>>>>>>>    movq  8(%rsp), %rbx
>>> >>>>>>>>>    movq  (%rbx), %rdi
>>> >>>>>>>>>    call  *%rdi
>>> >>>>>>>>> -----
>>> >>>>>>>>>
>>> >>>>>>>>> Is Flambda supposed to work out of the box with the current beta?
>>> >>>>>>>>> What flags or annotations should I use for testing?  Any showcase
>>> >>>>>>>>> examples I should try out that are expected to be improved?
>>> >>>>>>>>>
>>> >>>>>>>>> Regards,
>>> >>>>>>>>> Markus
>>> >>>>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>> >>>>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> --
>>> >>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>> >
>>> >
>>>
>>>
>>
>> --
>> ------------------------------------------------------------
>> Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
>> My OCaml site:          http://www.camlcity.org
>> Contact details:        http://www.camlcity.org/contact.html
>> Company homepage:       http://www.gerd-stolpmann.de
>> ------------------------------------------------------------
>>


* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-11  8:59                     ` Mark Shinwell
  2016-03-11  9:05                       ` Mark Shinwell
@ 2016-03-11  9:09                       ` Alain Frisch
  2016-03-11  9:26                         ` Mark Shinwell
  2016-03-11 16:58                       ` Markus Mottl
  2 siblings, 1 reply; 23+ messages in thread
From: Alain Frisch @ 2016-03-11  9:09 UTC (permalink / raw)
  To: Mark Shinwell, Gerd Stolpmann; +Cc: Pierre Chambart, Markus Mottl, OCaml List

On 11/03/2016 09:59, Mark Shinwell wrote:
> Also, for a function like the one you gave containing:
>
>    if <cond> then <small expr> else <big expr>
>
> one of the reasons this should be kept in the .cmx files is because,
> when the compiler comes to examine whether to inline it, it may be
> able to fully evaluate <cond>.  In particular when it's true then the
> large expression can be eliminated completely (so long as <cond> is
> not side-effecting).  Another example is functions containing a large
> match, where we may end up knowing which case is to be taken.

For such cases, it would be worthwhile to compile the function as a small
stub that checks the condition or the match and jumps (tail call) into
the proper sub-body.  Only the stub (and potentially small enough
sub-bodies) would be inlined, and the cmx would not need to store the
large sub-bodies.  Such an approach was already taken for optional
arguments, and I think that flambda already generalizes it.  Would the
case above be treated like that?
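
Written out by hand, the stub idea above could look roughly like the
sketch below.  This is only an illustration of the shape, not something
the compiler is known to produce; `f` and `f_slow` are hypothetical
names, and the loop merely stands in for a large sub-body.

```ocaml
(* The large sub-body lives in its own function that is never inlined,
   so callers only need to call it, not to see its definition. *)
let[@inline never] f_slow x =
  let total = ref 0 in
  for i = 1 to 100 do
    total := !total + (i * x)
  done;
  !total

(* The stub: checks the condition, handles the small case in place and
   tail-calls the large sub-body otherwise.  Only this part is a good
   inlining candidate. *)
let[@inline] f x = if x < 0 then 0 else f_slow x

let () =
  assert (f (-5) = 0);
  assert (f 2 = 10100)
```

With this split, a caller that inlines `f` pays for one comparison and
either a constant or a direct call, regardless of how large `f_slow` is.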


Alain


* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-11  9:09                       ` Alain Frisch
@ 2016-03-11  9:26                         ` Mark Shinwell
  2016-03-11 14:48                           ` Yotam Barnoy
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Shinwell @ 2016-03-11  9:26 UTC (permalink / raw)
  To: Alain Frisch; +Cc: Gerd Stolpmann, Pierre Chambart, Markus Mottl, OCaml List

Not at present, although the technique of using stubs is heavily used
in Flambda for other optimisations (for example, where the argument
list of a function is going to be modified).
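
As a rough picture of what a stub for a modified argument list could
look like, here is a hand-written sketch.  It assumes the modification
is the removal of an unused parameter, which is an assumption made for
illustration (the message does not list the cases); `scale`,
`scale_worker` and `scale_stub` are hypothetical names.

```ocaml
(* Source as written: [_unused] plays no part in the result. *)
let scale x _unused = x * 2

(* After the rewrite, the real work sits in a function with the shorter
   argument list ... *)
let scale_worker x = x * 2

(* ... and the original signature survives as a tiny stub that drops the
   argument and forwards.  Inlining the stub at a call site leaves a
   direct call to the worker. *)
let[@inline] scale_stub x _unused = scale_worker x

let () = assert (scale 21 () = scale_stub 21 ())
```

Existing callers keep compiling against the old signature, while call
sites where the stub is inlined no longer pass the dead argument.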

Mark

On 11 March 2016 at 09:09, Alain Frisch <alain.frisch@lexifi.com> wrote:
> On 11/03/2016 09:59, Mark Shinwell wrote:
>>
>> Also, for a function like the one you gave containing:
>>
>>    if <cond> then <small expr> else <big expr>
>>
>> one of the reasons this should be kept in the .cmx files is because,
>> when the compiler comes to examine whether to inline it, it may be
>> able to fully evaluate <cond>.  In particular when it's true then the
>> large expression can be eliminated completely (so long as <cond> is
>> not side-effecting).  Another example is functions containing a large
>> match, where we may end up knowing which case is to be taken.
>
>
> For such cases, it is interesting to compile the function as a small stub
> that checks the condition or the match and jumps (tail call) into the proper
> sub-body.  Only the stub (and potentially small enough sub-bodies) would be
> inlined, and the cmx would not need to store the large sub-bodies.  Such
> approach was already taken for optional arguments, and I think that flambda
> already generalizes it.  Would the case above be treated like that?
>
>
> Alain


* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-11  9:26                         ` Mark Shinwell
@ 2016-03-11 14:48                           ` Yotam Barnoy
  2016-03-11 15:09                             ` Jesper Louis Andersen
  0 siblings, 1 reply; 23+ messages in thread
From: Yotam Barnoy @ 2016-03-11 14:48 UTC (permalink / raw)
  To: Mark Shinwell
  Cc: Alain Frisch, Gerd Stolpmann, Pierre Chambart, Markus Mottl, OCaml List

Another question: how will 4.03 be handled with regard to OPAM? The way I
see it, the majority of users will want Flambda activated by default.
Companies or individuals that depend on OCaml for their business will
probably want to start off with Flambda turned off, and turn it on as
needed. Additionally, to get Flambda tested by as many people as possible,
I believe we want people to use it by default.

-Yotam

On Fri, Mar 11, 2016 at 4:26 AM, Mark Shinwell <mshinwell@janestreet.com>
wrote:

> Not at present.  Although the technique of using stubs is heavily used
> in Flambda for other optimisations (for example where the argument
> list of a function is going to be modified).
>
> Mark
>
> On 11 March 2016 at 09:09, Alain Frisch <alain.frisch@lexifi.com> wrote:
> > On 11/03/2016 09:59, Mark Shinwell wrote:
> >>
> >> Also, for a function like the one you gave containing:
> >>
> >>    if <cond> then <small expr> else <big expr>
> >>
> >> one of the reasons this should be kept in the .cmx files is because,
> >> when the compiler comes to examine whether to inline it, it may be
> >> able to fully evaluate <cond>.  In particular when it's true then the
> >> large expression can be eliminated completely (so long as <cond> is
> >> not side-effecting).  Another example is functions containing a large
> >> match, where we may end up knowing which case is to be taken.
> >
> >
> > For such cases, it is interesting to compile the function as a small stub
> > that checks the condition or the match and jumps (tail call) into the
> > proper sub-body.  Only the stub (and potentially small enough sub-bodies)
> > would be inlined, and the cmx would not need to store the large
> > sub-bodies.  Such approach was already taken for optional arguments, and
> > I think that flambda already generalizes it.  Would the case above be
> > treated like that?
> >
> >
> > Alain
>


* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-11 14:48                           ` Yotam Barnoy
@ 2016-03-11 15:09                             ` Jesper Louis Andersen
  0 siblings, 0 replies; 23+ messages in thread
From: Jesper Louis Andersen @ 2016-03-11 15:09 UTC (permalink / raw)
  To: Yotam Barnoy
  Cc: Mark Shinwell, Alain Frisch, Gerd Stolpmann, Pierre Chambart,
	Markus Mottl, OCaml List

On Fri, Mar 11, 2016 at 3:48 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote:

> Another question: how will 4.03 be handled with regard to OPAM? The way I
> see it, the majority of users will want Flambda activated by default.
> Companies or individuals that depend on OCaml for their business will
> probably want to start off with Flambda turned off, and turn it on as
> needed. Additionally, to get Flambda tested by as many people as possible,
> I believe we want people to use it by default.


One of the (many) things I think OPAM gets right is the `opam switch
...` framework: you can have multiple compilers installed and move
between them with a simple switch.

Operationally, I'd ship it as an OCaml 4.03+flambda switch, and then in
point release .1 or .2 make it the default once it has seen some use by
early adopters, keeping an OCaml 4.03+no-flambda option for those who
are behind schedule. This would allow people to gracefully roll forward
one release, and gracefully roll back once the default changes, which
lessens the operational strain considerably.

The riskier alternative is to make flambda the default in 4.03 and have
a no-flambda option for those in the know, but this risks introducing
regressions to a greater extent and should be weighed against how
stable flambda has proven to be. If, for instance, it has seen major
use inside Jane Street for some time and has been generally stable,
this is a far more viable option.



-- 
J.


* Re: [Caml-list] <DKIM> Re: Status of Flambda in OCaml 4.03
  2016-03-11  8:59                     ` Mark Shinwell
  2016-03-11  9:05                       ` Mark Shinwell
  2016-03-11  9:09                       ` Alain Frisch
@ 2016-03-11 16:58                       ` Markus Mottl
  2 siblings, 0 replies; 23+ messages in thread
From: Markus Mottl @ 2016-03-11 16:58 UTC (permalink / raw)
  To: Mark Shinwell; +Cc: Gerd Stolpmann, Pierre Chambart, OCaml List

Mark: thanks to OPAM, a package management solution might be worth
trying.  It would certainly be less intrusive to the compiler and hence
at least a temporary improvement.

Given the obviously significant impact of Flambda on compilation, it
might be time for a more general overhaul of the compilation process.
Producing more files has already crept into the compiler anyway, e.g.
typed syntax tree files (.cmt).  I don't really see much of a problem
with generating more files.  Yes, it could slow things down a bit, but
only if you actually had to generate or read all of them all of the
time.  The whole point of having more files is that you only
regenerate and/or read those that you actually care about for your use
case.  For example, I would be perfectly happy if OCaml had an option
to only produce .cmi files, i.e. to only syntax- and type-check your
code.  Most recompilations likely happen because developers want to
have their code checked, not to actually produce executables.

I'd rather not package things up in one file even if we could
efficiently access parts of the file only.  What if some part suddenly
became too big to fit into its slot after recompilation?  We'd
essentially be implementing our own file system at that point.  One
could argue that with that approach we might as well put everything
(.cmi, .cmx, .cmo, etc.) in one file, which is probably not what
anybody wants.  Splitting up compilation output into several files for
different compiler use cases (just type check code, quickly produce an
executable for testing, heavily optimize an executable for tuning or
applications, etc.) seems like the easiest approach.


On Fri, Mar 11, 2016 at 3:59 AM, Mark Shinwell <mshinwell@janestreet.com> wrote:
> Markus: I think we should at least consider whether package management
> can help with multiple installations of libraries at different
> optimisation levels.  As regards multiple files, I suppose it could be
> configurable, but I worry that for large source trees the overhead of
> having that many more compilation artifacts may be non-negligible.
> Perhaps another option would be to arrange the .cmx files so that they
> could be read without importing the full information for optimisation
> unless requested.
>
> Gerd: Unless you have unlimited source files, there shouldn't be
> unlimited code size blowup, because there are parameters that restrict
> inlining.  In particular (unless the user forces behaviour via an
> attribute) there is always a calculation that weighs up the change in
> code size resulting from a proposed inlining against the expected
> runtime performance benefit based on which operations will be
> simplified away as a result of doing such inlining.
>
> Also, for a function like the one you gave containing:
>
>   if <cond> then <small expr> else <big expr>
>
> one of the reasons this should be kept in the .cmx files is because,
> when the compiler comes to examine whether to inline it, it may be
> able to fully evaluate <cond>.  In particular when it's true then the
> large expression can be eliminated completely (so long as <cond> is
> not side-effecting).  Another example is functions containing a large
> match, where we may end up knowing which case is to be taken.
>
> Mark
>
> On 10 March 2016 at 22:51, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
>> Am Donnerstag, den 10.03.2016, 21:12 +0100 schrieb Pierre Chambart:
>>> It is realistic when using the -Oclassic option that Mark mentioned.
>>> By default the flambda inlining heuristic is decided at call site. Hence
>>> all the information about a function needs to be available to correctly
>>> decide. That means that the size of the cmx file is approximately
>>> linearly related to the .o file size. It is not easy to decide that some
>>> function will never be inlined, so the information is always kept,
>>> even on functions annotated with [@inline never].
>>
>> This assumes that the user is fine with an unlimited code size blow-up.
>> So, to make an example, when you have
>>
>> let f() =
>>   if <expr1> then <short-expr2> else <very-long-expr3>
>>
>> there is the chance that the "if-then" part can be inlined and leads to
>> a speed-up at the price that the unproductive "else" part is also
>> inlined. In total, there is a good chance that you see some
>> acceleration. However, the question is whether the code duplication is
>> acceptable or not. I guess, you need to also draw a line at the callee
>> site, and disregard functions that are too large in total (though this
>> limit can be way higher than the limit for "classic" inlining).
>>
>> Surely this will also limit the cmx size somewhat.
>>
>> Gerd
>>
>>>  But I wouldn't
>>> expect that to benefit that much. But for the -Oclassic mode where
>>> the decision is made at the definition, it is possible to decide not
>>> to include some information in the cmx. This is what happens in
>>> non-flambda mode, and in flambda mode it also reduces the cmx
>>> size a bit, but not as much as it could. This will probably improve
>>> in 4.04 if there is sufficient interest in this -Oclassic mode.
>>> --
>>> Pierre
>>>
>>> On 10/03/2016 16:32, Markus Mottl wrote:
>>> > Ok, that explains things.  Is it realistic to assume that the size of
>>> > .cmx files can be substantially reduced?  It seems there is a natural
>>> > tradeoff between "optimize well" and "compile fast".  I suspect it may
>>> > be inevitable to add more compilation files.  We actually already have
>>> > that situation with native code libraries: the .cmxa file is enough to
>>> > compile a project, but if the .cmx files of contained modules are
>>> > visible in the path, too, then, and only then, the compiler can and
>>> > will do cross-module inlining - which takes longer, of course.
>>> >
>>> > What about the following approach? - There is one "minimal" set of
>>> > compilation files that always allows you to quickly obtain a running
>>> > (albeit slow / large) executable.   Additional compilation files then
>>> > monotonically augment this information and can be produced and
>>> > consumed optionally depending on compilation flags.  The nice thing
>>> > about this approach is that you don't necessarily have to recompile
>>> > the whole project with different flags whenever you need a different
>>> > compile time / performance tradeoff.  E.g. if Flambda information is
>>> > available for an unchanged file, you don't have to rebuild it when
>>> > needed.  If you just want to compile quickly, you don't have to read
>>> > data you don't need.  Separate compilation files would also integrate
>>> > much better with build tools (timestamping, etc.).
>>> >
>>> > I guess we would already be looking at OCaml version 5 for such a change :)
>>> >
>>> > On Thu, Mar 10, 2016 at 2:20 AM, Mark Shinwell <mshinwell@janestreet.com> wrote:
>>> >> By "enabled at configure time" I mean that you need to pass the
>>> >> "-flambda" option to the configure script when building the compiler.
>>> >>
>>> >> The main reason Flambda isn't enabled by default is because we need to
>>> >> do further work to improve compile-time performance.  There are also
>>> >> concerns about .cmx file size.  Flambda produces larger .cmx files: it
>>> >> stores the entire intermediate representation of the compilation unit
>>> >> so that no subsequent cross-module inlining decision is compromised.
>>> >>
>>> >> There is a mode, -Oclassic, which uses Flambda but mimics the
>>> >> behaviour of the existing compiler; unfortunately this isn't really
>>> >> fast enough yet either and .cmx sizes aren't small enough.
>>> >>
>>> >> When we manage to address some of these issues further, hopefully for
>>> >> 4.04, we will revisit whether Flambda should be enabled by default.
>>> >>
>>> >> One of the main reasons there is a configure option rather than a
>>> >> runtime switch is to avoid having to re-engineer the compiler's build
>>> >> system to permit multiple builds of the various libraries (the stdlib,
>>> >> for example) with differing options that affect what appears in the
>>> >> .cmx files (e.g. with and without Flambda).  Even if code were used to
>>> >> allow Flambda to read non-Flambda .cmx files, performance degradation
>>> >> would result.
>>> >>
>>> >> Mark
>>> >>
>>> >> On 10 March 2016 at 01:43, Markus Mottl <markus.mottl@gmail.com> wrote:
>>> >>> I agree with Yotam.  Assuming that Flambda produces correct code and
>>> >>> doesn't cause any serious performance issues either with the generated
>>> >>> code or with excessive compile times, I'd prefer building it into the
>>> >>> compiler by default.  I'd be fine if I had to pass an extra flag at
>>> >>> compile time to actually run Flambda optimizers, but it should at
>>> >>> least be available.  It doesn't have to be perfect to be useful.
>>> >>>
>>> >>> On Wed, Mar 9, 2016 at 8:32 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote:
>>> >>>> While we await the manual, can you explain what you mean by 'enabled at
>>> >>>> configure time'? Will a -flambda -O-something argument passed to the normal
>>> >>>> 4.03 compiler enable flambda optimizations? Flambda is clearly the star of
>>> >>>> the 4.03 release, so not enabling it using command line options seems
>>> >>>> counter-intuitive (if this is the case).
>>> >>>>
>>> >>>> -Yotam
>>> >>>>
>>> >>>> On Wed, Mar 9, 2016 at 7:59 PM, Markus Mottl <markus.mottl@gmail.com> wrote:
>>> >>>>> I've just tested Flambda, and it seems to already be doing a pretty
>>> >>>>> decent job on some non-trivial examples (e.g. inlining combinations of
>>> >>>>> functors and first class functions).  I hope there will be a stable
>>> >>>>> 4.03 OPAM switch that enables it.  I'm looking forward to being able
>>> >>>>> to write more elegant, abstract code that's still efficient.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Markus
>>> >>>>>
>>> >>>>> On Wed, Mar 9, 2016 at 2:14 AM, Mark Shinwell <mshinwell@janestreet.com>
>>> >>>>> wrote:
>>> >>>>>> It will not be enabled by default in 4.03.  For the majority of
>>> >>>>>> programs, in the current state, it should improve performance (mainly
>>> >>>>>> by lowering allocation).  It should never generate wrong code.
>>> >>>>>> However we know of examples that don't improve as much as we would
>>> >>>>>> like, which we will try to address for 4.04.
>>> >>>>>>
>>> >>>>>> There will be a draft version of the new Flambda manual chapter
>>> >>>>>> available shortly (hopefully this week).  Amongst other things this
>>> >>>>>> documents what you found about the configure options and the flags'
>>> >>>>>> operation.
>>> >>>>>>
>>> >>>>>> Mark
>>> >>>>>>
>>> >>>>>> On 9 March 2016 at 03:55, Markus Mottl <markus.mottl@gmail.com> wrote:
>>> >>>>>>> Hi Alain,
>>> >>>>>>>
>>> >>>>>>> I see, thanks.  It was a little confusing, because the command line
>>> >>>>>>> options for tuning flambda were still available even without Flambda
>>> >>>>>>> being enabled.
>>> >>>>>>>
>>> >>>>>>> Will Flambda be enabled by default in OCaml 4.03 or is it still
>>> >>>>>>> considered to be too experimental?  It could turn out to become one of
>>> >>>>>>> the most impactful new features in terms of how I write code.
>>> >>>>>>>
>>> >>>>>>> Regards,
>>> >>>>>>> Markus
>>> >>>>>>>
>>> >>>>>>> On Tue, Mar 8, 2016 at 5:53 PM, Alain Frisch <alain.frisch@lexifi.com>
>>> >>>>>>> wrote:
>>> >>>>>>>> Hi Markus,
>>> >>>>>>>>
>>> >>>>>>>> flambda needs to be enabled explicitly at configure time with the
>>> >>>>>>>> "-flambda"
>>> >>>>>>>> flag.  The new optimizer will then be used unconditionally, and you
>>> >>>>>>>> can
>>> >>>>>>>> tweak it using command-line parameters passed to ocamlopt (see
>>> >>>>>>>> "ocamlopt
>>> >>>>>>>> -h").
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> Alain
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> On 08/03/2016 23:10, Markus Mottl wrote:
>>> >>>>>>>>> Hi,
>>> >>>>>>>>>
>>> >>>>>>>>> I'm trying out OCaml 4.03.0+beta1 right now and wanted to test
>>> >>>>>>>>> Flambda
>>> >>>>>>>>> optimizations.  But looking at the generated assembly, it doesn't
>>> >>>>>>>>> seem
>>> >>>>>>>>> to be doing much if anything on the simple test examples that I
>>> >>>>>>>>> thought would benefit.
>>> >>>>>>>>>
>>> >>>>>>>>> To give an example of what I expected to see, let's consider this
>>> >>>>>>>>> code:
>>> >>>>>>>>>
>>> >>>>>>>>> -----
>>> >>>>>>>>> let map_pair f (x, y) = f x, f y
>>> >>>>>>>>>
>>> >>>>>>>>> let succ x = x + 1
>>> >>>>>>>>> let map_pair_succ1 pair = map_pair succ pair
>>> >>>>>>>>> let map_pair_succ2 (x, y) = succ x, succ y
>>> >>>>>>>>> -----
>>> >>>>>>>>>
>>> >>>>>>>>> I would have thought that the "succ" function would be inlined in
>>> >>>>>>>>> "map_pair_succ1" as the compiler would do for "map_pair_succ2".
>>> >>>>>>>>> But the generated code looks like this:
>>> >>>>>>>>>
>>> >>>>>>>>> -----
>>> >>>>>>>>> L101:
>>> >>>>>>>>>    movq  %rax, %rdi
>>> >>>>>>>>>    movq  %rdi, 8(%rsp)
>>> >>>>>>>>>    movq  %rbx, (%rsp)
>>> >>>>>>>>>    movq  8(%rbx), %rax
>>> >>>>>>>>>    movq  (%rdi), %rsi
>>> >>>>>>>>>    movq  %rdi, %rbx
>>> >>>>>>>>>    call  *%rsi
>>> >>>>>>>>> L102:
>>> >>>>>>>>>    movq  %rax, 16(%rsp)
>>> >>>>>>>>>    movq  (%rsp), %rax
>>> >>>>>>>>>    movq  (%rax), %rax
>>> >>>>>>>>>    movq  8(%rsp), %rbx
>>> >>>>>>>>>    movq  (%rbx), %rdi
>>> >>>>>>>>>    call  *%rdi
>>> >>>>>>>>> -----
>>> >>>>>>>>>
>>> >>>>>>>>> Is Flambda supposed to work out of the box with the current beta?
>>> >>>>>>>>> What flags or annotations should I use for testing?  Any showcase
>>> >>>>>>>>> examples I should try out that are expected to be improved?
>>> >>>>>>>>>
>>> >>>>>>>>> Regards,
>>> >>>>>>>>> Markus
>>> >>>>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>> Caml-list mailing list.  Subscription management and archives:
>>> >>>>>>> https://sympa.inria.fr/sympa/arc/caml-list
>>> >>>>>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>>> >>>>>>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>> >>>>>
>>> >>>>>
>>> >>>>> --
>>> >>>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>> >
>>> >
>>>
>>>
>>
>> --
>> ------------------------------------------------------------
>> Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
>> My OCaml site:          http://www.camlcity.org
>> Contact details:        http://www.camlcity.org/contact.html
>> Company homepage:       http://www.gerd-stolpmann.de
>> ------------------------------------------------------------
>>



-- 
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com

* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-03-10 15:49                 ` Gabriel Scherer
@ 2016-04-17  8:43                   ` Jesper Louis Andersen
  2016-04-17  8:59                     ` Mohamed Iguernlala
  0 siblings, 1 reply; 23+ messages in thread
From: Jesper Louis Andersen @ 2016-04-17  8:43 UTC (permalink / raw)
  To: Gabriel Scherer
  Cc: Markus Mottl, Mark Shinwell, Yotam Barnoy, Alain Frisch, OCaml List

I tried `opam switch ocaml-4.03.0+trunk+flambda` on the Transit format
encoder/decoder I have. I wanted to see how much faster Flambda would make
the code, since I've heard 20% and 30% thrown around. It is not very
optimized code, and in particular, the encoder path is rather elegant, but
awfully slow. Well, not anymore:

4.02:

 Name         Time/Run      mWd/Run   mjWd/Run   Prom/Run   Percentage
------------ ---------- ------------ ---------- ---------- ------------
 decode         2.12ms     352.86kw    34.86kw    34.86kw       27.88%
 encode         5.07ms     647.93kw   263.69kw   250.40kw       66.70%
 round_trip     7.61ms   1_000.79kw   298.54kw   285.26kw      100.00%

4.03.0+trunk+flambda:

 Name         Time/Run      mWd/Run   mjWd/Run   Prom/Run   Percentage
------------ ---------- ------------ ---------- ---------- ------------
 decode         2.04ms     319.83kw    35.94kw    35.94kw       43.97%
 encode         2.65ms     422.67kw   130.88kw   117.59kw       56.95%
 round_trip     4.65ms     742.50kw   164.85kw   151.56kw      100.00%

Pretty impressive result. Note that the heavy lifting is done by the yajl
JSON parser, which poses a lower bound. But I think the speedup in the
encode path is rather good.

Note that the benchmark is probably flawed: some time passed between
these two runs, so there might be a confounder hidden in other fixes,
either to yajl or to other parts of the compiler toolchain. However, I
think the result itself stands, since in practice my encoding time was just
cut in half.
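
As a rough check on the "cut in half" claim, the ratios can be computed
directly from the Time/Run columns of the two tables above.  This is only a
minimal sketch: the names `timings`, `speedup` and `reduction_pct` are
illustrative and do not come from any benchmarking library, and the numbers
are simply the ones reported in this message.

```ocaml
(* Time/Run in milliseconds, copied from the two tables in this message:
   (benchmark name, time under 4.02, time under 4.03.0+trunk+flambda). *)
let timings = [
  ("decode",     2.12, 2.04);
  ("encode",     5.07, 2.65);
  ("round_trip", 7.61, 4.65);
]

(* Speedup factor: old time divided by new time. *)
let speedup ~before ~after = before /. after

(* Relative reduction, as a percentage of the old value. *)
let reduction_pct ~before ~after = 100.0 *. (before -. after) /. before

let () =
  List.iter
    (fun (name, before, after) ->
      Printf.printf "%-10s %.2fx faster, %.1f%% less time\n"
        name (speedup ~before ~after) (reduction_pct ~before ~after))
    timings
```

On these figures the encode path comes out a little over 1.9x faster, while
the decode path, which is dominated by yajl, is nearly unchanged.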


On Thu, Mar 10, 2016 at 4:49 PM, Gabriel Scherer <gabriel.scherer@gmail.com>
wrote:

> One point that is tangentially related to your message is that the
> flambda people observed that it's easy to miss cross-module
> optimizations because .cmx files are missing -- the compiler is silent
> about this. Leo White added a new warning (58) when a module does not
> find the .cmx of one of its dependencies, which interacts with -opaque
> (initially introduced in 4.02.0 when compiling implementation files)
> in the following way. In 4.03, you can compile an *interface* with
> -opaque, announcing the intent not to provide an .cmx file (or to
> choose among several implementations at link-time) for its
> implementation(s). Warning 58 will not warn about a missing .cmx if
> the dependency's interface was compiled opaque.
>   https://github.com/ocaml/ocaml/pull/319
>
> I think the long-term plan is to encourage people to enable the
> warning, and explicitly use -opaque on .cmi when it is their intent
> not to distribute .cmx files. That said, those things may be refined
> once we get more experience of flambda in the wild.



-- 
J.

* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-04-17  8:43                   ` Jesper Louis Andersen
@ 2016-04-17  8:59                     ` Mohamed Iguernlala
  2016-04-17 15:43                       ` Markus Mottl
  0 siblings, 1 reply; 23+ messages in thread
From: Mohamed Iguernlala @ 2016-04-17  8:59 UTC (permalink / raw)
  To: Jesper Louis Andersen; +Cc: caml-list

Good results, but it would be better to compare with 4.03.0+trunk.

Have you used any particular options for Flambda (e.g. -O3 -unbox-closures)?
Have you modified the default value of the -inline option?

I noticed better performance with "-O3 -unbox-closures" on Alt-Ergo, but
I probably have to adjust the value of "-inline" (which is currently set
to 100).

- Mohamed.



* Re: [Caml-list] Status of Flambda in OCaml 4.03
  2016-04-17  8:59                     ` Mohamed Iguernlala
@ 2016-04-17 15:43                       ` Markus Mottl
  0 siblings, 0 replies; 23+ messages in thread
From: Markus Mottl @ 2016-04-17 15:43 UTC (permalink / raw)
  To: Mohamed Iguernlala; +Cc: Jesper Louis Andersen, OCaml List

Ultimately, comprehensive benchmarks will be necessary to determine
how much of a benefit Flambda really is.  But just looking at the
produced assembly and some anecdotal performance evidence, it seems
clear to me that Flambda is a substantial improvement.

If we judge a feature by its impact on how people write code, I can
already say it made a huge difference to me.  For example, before
Flambda was available, I had been using Camlp4 macros in a project,
because it was otherwise impossible to avoid significant performance
penalties, also in terms of memory consumption (closure sizes),
without a lot of code duplication.  Now I can completely avoid Camlp4
while at the same time writing much more elegant, pure OCaml that
compiles to equivalently efficient machine code - if not better.

FWIW, I usually also use "-O3 -unbox-closures", which seemed to work
best in some ad-hoc tests.
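
The kind of code in question looks like the map_pair example from the start
of this thread.  The following is only a minimal sketch, assuming a compiler
configured with Flambda: the [@@inline] attributes are hints rather than
guarantees, and whether the indirect calls really disappear has to be
checked in the generated assembly.

```ocaml
(* Possible build line, using the flags mentioned in this thread
   (-S keeps the assembly file for inspection):
     ocamlopt -O3 -unbox-closures -S example.ml *)

(* Generic helper: applies f to both components of a pair. *)
let map_pair f (x, y) = (f x, f y) [@@inline]

(* Small function we would like to see inlined at its use sites.
   As in the original example, this shadows the standard succ. *)
let succ x = x + 1 [@@inline]

(* Abstract version: passes succ as a first-class function. *)
let map_pair_succ1 pair = map_pair succ pair

(* Hand-specialised version, written without the helper. *)
let map_pair_succ2 (x, y) = (succ x, succ y)

let () =
  (* Both versions must agree on results; only the generated code may differ. *)
  assert (map_pair_succ1 (1, 2) = map_pair_succ2 (1, 2));
  let a, b = map_pair_succ1 (1, 2) in
  Printf.printf "%d %d\n" a b
```

The point of the comparison is that map_pair_succ1 is the version one would
like to write, and map_pair_succ2 is what previously had to be written by
hand (or generated with macros) to get direct calls.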




-- 
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com

Thread overview: 23+ messages
2016-03-08 22:10 [Caml-list] Status of Flambda in OCaml 4.03 Markus Mottl
2016-03-08 22:53 ` Alain Frisch
2016-03-09  3:55   ` Markus Mottl
2016-03-09  7:14     ` Mark Shinwell
2016-03-10  0:59       ` Markus Mottl
2016-03-10  1:32         ` Yotam Barnoy
2016-03-10  1:43           ` Markus Mottl
2016-03-10  7:20             ` Mark Shinwell
2016-03-10 15:32               ` Markus Mottl
2016-03-10 15:49                 ` Gabriel Scherer
2016-04-17  8:43                   ` Jesper Louis Andersen
2016-04-17  8:59                     ` Mohamed Iguernlala
2016-04-17 15:43                       ` Markus Mottl
2016-03-10 20:12                 ` [Caml-list] <DKIM> " Pierre Chambart
2016-03-10 21:08                   ` Markus Mottl
2016-03-10 22:51                   ` Gerd Stolpmann
2016-03-11  8:59                     ` Mark Shinwell
2016-03-11  9:05                       ` Mark Shinwell
2016-03-11  9:09                       ` Alain Frisch
2016-03-11  9:26                         ` Mark Shinwell
2016-03-11 14:48                           ` Yotam Barnoy
2016-03-11 15:09                             ` Jesper Louis Andersen
2016-03-11 16:58                       ` Markus Mottl
