caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Compile-time performance problem
@ 2016-11-07 17:01 Christoph Höger
  2016-11-07 17:07 ` Christoph Höger
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Höger @ 2016-11-07 17:01 UTC (permalink / raw)
  To: caml users


[-- Attachment #1.1.1: Type: text/plain, Size: 2503 bytes --]

Dear all,

I have some (autogenerated) code that (naively) implements a function in
CPS.

To give you an idea: continuations are wrapped in objects with a
_continue_ field. Functions take an argument k, which is the next
continuation and s, which depicts a monadic state.

The whole file is generated from the following input:

let foo = do
    put 1 ;
    put 2 ;
    put 3 ;
    put 4 ;
    put 5 ;
    put 6 ;
    put 7 ;
    put 8

Here, put n means: Set the monadic state to n. The semicolon is
interpreted as the monadic bind operator (without actually binding a
variable in that simple case). Due to the cps nature, I compile as
follows (compilation in brackets [| .. |]):

[| m1 ; m2 |] =^=

(fun k s -> [|m1|] (object val _continue_ fun x -> [|m2|]; ... end)

(The with_continue and continue methods are intended to mimic structural
records)

When I compile to bytecode, it just works (TM). When I compile to
machine-code, the compilation time grows very fast:

choeger@oxide /tmp % ocamlopt -dtimings perf.ml
all: 1.284s
parsing(perf.ml): 0.001s
typing(perf.ml): 0.012s
transl(perf.ml): 0.011s
generate(perf.ml): 1.238s
cmm(sourcefile(perf.ml)): 0.000s
compile_phrases(sourcefile(perf.ml)): 0.042s
selection(sourcefile(perf.ml)): 0.005s
comballoc(sourcefile(perf.ml)): 0.000s
cse(sourcefile(perf.ml)): 0.003s
deadcode(sourcefile(perf.ml)): 0.001s
spill(sourcefile(perf.ml)): 0.006s
split(sourcefile(perf.ml)): 0.002s
liveness(sourcefile(perf.ml)): 0.005s
regalloc(sourcefile(perf.ml)): 0.016s
linearize(sourcefile(perf.ml)): 0.000s
scheduling(sourcefile(perf.ml)): 0.000s
emit(sourcefile(perf.ml)): 0.003s
assemble(sourcefile(perf.ml)): 0.000s
selection(startup): 0.002s
comballoc(startup): 0.000s
cse(startup): 0.001s
deadcode(startup): 0.000s
spill(startup): 0.002s
split(startup): 0.001s
liveness(startup): 0.002s
regalloc(startup): 0.006s
linearize(startup): 0.000s
scheduling(startup): 0.000s
emit(startup): 0.001s
assemble(startup): 0.000s


The generate part is the culprit. Is there any known performance
regression in 4.03.0 that could explain this? Is it the (admittedly
naive) style of the continuations that causes this?

thanks,

Christoph
-- 
Christoph Höger

Technische Universität Berlin
Fakultät IV - Elektrotechnik und Informatik
Übersetzerbau und Programmiersprachen

Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin

Tel.: +49 (30) 314-24890
E-Mail: christoph.hoeger@tu-berlin.de

[-- Attachment #1.1.2: perf.ml --]
[-- Type: text/plain, Size: 4302 bytes --]

let (<+) a v = Array.append a [|v|] 
let foo k s =
  ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 1)
     (object (self)
        val _continue_ =
          fun x  ->
            (fun k  ->
               fun s  ->
                 ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 2)
                    (object (self)
                       val _continue_ =
                         fun x  ->
                           (fun k  ->
                              fun s  ->
                                ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 3)
                                   (object (self)
                                      val _continue_ =
                                        fun x  ->
                                          (fun k  ->
                                             fun s  ->
                                               ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 4)
                                                  (object (self)
                                                     val _continue_ =
                                                       fun x  ->
                                                         (fun k  ->
                                                            fun s  ->
                                                              ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 5)
                                                                 (object (self)
                                                                    val _continue_ =
                                                                    fun x  ->
                                                                    (fun k  ->
                                                                    fun s  ->
                                                                    ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 6)
                                                                    (object (self)
                                                                    val _continue_ =
                                                                    fun x  ->
                                                                    (fun k  ->
                                                                    fun s  ->
                                                                    ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 7)
                                                                    (object (self)
                                                                    val _continue_ = fun x  -> (fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 8) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                    end)) s) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                    end)) s) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                  end)) s) k
                                                     method with__continue_ x = {<_continue_ = x>}
                                                     method _continue_ = _continue_
                                                   end)) s) k
                                      method with__continue_ x = {<_continue_ = x>}
                                      method _continue_ = _continue_
                                    end)) s) k
                       method with__continue_ x = {<_continue_ = x>}
                       method _continue_ = _continue_
                     end)) s) k
        method with__continue_ x = {<_continue_ = x>}
        method _continue_ = _continue_
      end)) s
  

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Compile-time performance problem
  2016-11-07 17:01 [Caml-list] Compile-time performance problem Christoph Höger
@ 2016-11-07 17:07 ` Christoph Höger
  2016-11-07 17:18   ` Nicolas Ojeda Bar
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Höger @ 2016-11-07 17:07 UTC (permalink / raw)
  To: caml-list


[-- Attachment #1.1: Type: text/plain, Size: 440 bytes --]

I just checked with 4.04beta2 and could not reproduce the problem. So it
seems to be a 4.03.0 bug, but I could not find anything obviously
related in the 4.04 changelog.

-- 
Christoph Höger

Technische Universität Berlin
Fakultät IV - Elektrotechnik und Informatik
Übersetzerbau und Programmiersprachen

Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin

Tel.: +49 (30) 314-24890
E-Mail: christoph.hoeger@tu-berlin.de


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Compile-time performance problem
  2016-11-07 17:07 ` Christoph Höger
@ 2016-11-07 17:18   ` Nicolas Ojeda Bar
  2016-11-07 17:32     ` Christoph Höger
  0 siblings, 1 reply; 7+ messages in thread
From: Nicolas Ojeda Bar @ 2016-11-07 17:18 UTC (permalink / raw)
  To: Christoph Höger; +Cc: OCaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 606 bytes --]

Hi Christoph,

Are you using flambda ?

- Nicolas

2016-11-07 18:07 GMT+01:00 Christoph Höger <christoph.hoeger@tu-berlin.de>:

> I just checked with 4.04beta2 and could not reproduce the problem. So it
> seems to be a 4.03.0 bug, but I could not find anything obviously
> related in the 4.04 changelog.
>
> --
> Christoph Höger
>
> Technische Universität Berlin
> Fakultät IV - Elektrotechnik und Informatik
> Übersetzerbau und Programmiersprachen
>
> Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin
>
> Tel.: +49 (30) 314-24890
> E-Mail: christoph.hoeger@tu-berlin.de
>
>

[-- Attachment #2: Type: text/html, Size: 1154 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Compile-time performance problem
  2016-11-07 17:18   ` Nicolas Ojeda Bar
@ 2016-11-07 17:32     ` Christoph Höger
  2016-11-08  8:56       ` Christoph Höger
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Höger @ 2016-11-07 17:32 UTC (permalink / raw)
  To: caml users


[-- Attachment #1.1: Type: text/plain, Size: 1329 bytes --]

Hi Nicolas,

no, AFAIK 4.03.0 does not enable flambda by default. The 4.04beta2 I
tried had a 4.04beta2+flambda right next to it, so I assume it also does
not have flambda enabled.

regards,

Christoph

Am 07.11.2016 um 18:18 schrieb Nicolas Ojeda Bar:
> Hi Christoph,
> 
> Are you using flambda ? 
> 
> - Nicolas
> 
> 2016-11-07 18:07 GMT+01:00 Christoph Höger
> <christoph.hoeger@tu-berlin.de <mailto:christoph.hoeger@tu-berlin.de>>:
> 
>     I just checked with 4.04beta2 and could not reproduce the problem. So it
>     seems to be a 4.03.0 bug, but I could not find anything obviously
>     related in the 4.04 changelog.
> 
>     --
>     Christoph Höger
> 
>     Technische Universität Berlin
>     Fakultät IV - Elektrotechnik und Informatik
>     Übersetzerbau und Programmiersprachen
> 
>     Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin
> 
>     Tel.: +49 (30) 314-24890 <tel:%2B49%20%2830%29%20314-24890>
>     E-Mail: christoph.hoeger@tu-berlin.de
>     <mailto:christoph.hoeger@tu-berlin.de>
> 
> 


-- 
Christoph Höger

Technische Universität Berlin
Fakultät IV - Elektrotechnik und Informatik
Übersetzerbau und Programmiersprachen

Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin

Tel.: +49 (30) 314-24890
E-Mail: christoph.hoeger@tu-berlin.de


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Compile-time performance problem
  2016-11-07 17:32     ` Christoph Höger
@ 2016-11-08  8:56       ` Christoph Höger
  2016-11-08 10:10         ` Goswin von Brederlow
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Höger @ 2016-11-08  8:56 UTC (permalink / raw)
  To: caml-list


[-- Attachment #1.1.1: Type: text/plain, Size: 1849 bytes --]

I must correct myself. Version 4.04.0 does not fix the issue, just move
it slightly. It seems 4.04.0 is faster by at least a constant factor,
but asymptotically still problematic (see attachment).

The file compiles just fine to bytecode, but takes 20s to compile to
machine code.

Does anyone have a clue whether I run into a bug or if there is a
compiler pass that is non-linear in some metric of the code-size?

Am 07.11.2016 um 18:32 schrieb Christoph Höger:
> Hi Nicolas,
> 
> no, AFAIK 4.03.0 does not enable flambda by default. The 4.04beta2 I
> tried had a 4.04beta2+flambda right next to it, so I assume it also does
> not have flambda enabled.
> 
> regards,
> 
> Christoph
> 
> Am 07.11.2016 um 18:18 schrieb Nicolas Ojeda Bar:
>> Hi Christoph,
>>
>> Are you using flambda ? 
>>
>> - Nicolas
>>
>> 2016-11-07 18:07 GMT+01:00 Christoph Höger
>> <christoph.hoeger@tu-berlin.de <mailto:christoph.hoeger@tu-berlin.de>>:
>>
>>     I just checked with 4.04beta2 and could not reproduce the problem. So it
>>     seems to be a 4.03.0 bug, but I could not find anything obviously
>>     related in the 4.04 changelog.
>>
>>     --
>>     Christoph Höger
>>
>>     Technische Universität Berlin
>>     Fakultät IV - Elektrotechnik und Informatik
>>     Übersetzerbau und Programmiersprachen
>>
>>     Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin
>>
>>     Tel.: +49 (30) 314-24890 <tel:%2B49%20%2830%29%20314-24890>
>>     E-Mail: christoph.hoeger@tu-berlin.de
>>     <mailto:christoph.hoeger@tu-berlin.de>
>>
>>
> 
> 


-- 
Christoph Höger

Technische Universität Berlin
Fakultät IV - Elektrotechnik und Informatik
Übersetzerbau und Programmiersprachen

Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin

Tel.: +49 (30) 314-24890
E-Mail: christoph.hoeger@tu-berlin.de

[-- Attachment #1.1.2: perf.ml --]
[-- Type: text/plain, Size: 7602 bytes --]

let (<+) a v = Array.append a [|v|] 
let foo k s =
  ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 1)
     (object (self)
        val _continue_ =
          fun x  ->
            (fun k  ->
               fun s  ->
                 ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 2)
                    (object (self)
                       val _continue_ =
                         fun x  ->
                           (fun k  ->
                              fun s  ->
                                ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 3)
                                   (object (self)
                                      val _continue_ =
                                        fun x  ->
                                          (fun k  ->
                                             fun s  ->
                                               ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 4)
                                                  (object (self)
                                                     val _continue_ =
                                                       fun x  ->
                                                         (fun k  ->
                                                            fun s  ->
                                                              ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 5)
                                                                 (object (self)
                                                                    val _continue_ =
                                                                    fun x  ->
                                                                    (fun k  ->
                                                                    fun s  ->
                                                                    ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 6)
                                                                    (object (self)
                                                                    val _continue_ =
                                                                    fun x  ->
                                                                    (fun k  ->
                                                                    fun s  ->
                                                                    ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 7)
                                                                    (object (self)
                                                                    val _continue_ =
                                                                    fun x  ->
                                                                    (fun k  ->
                                                                    fun s  ->
                                                                    ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 9)
                                                                    (object (self)
                                                                    val _continue_ =
                                                                    fun x  ->
                                                                    (fun k  ->
                                                                    fun s  ->
                                                                    ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 10)
                                                                    (object (self)
                                                                    val _continue_ =
                                                                    fun x  ->
                                                                    (fun k  ->
                                                                    fun s  ->
                                                                    ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 11)
                                                                    (object (self)
                                                                    val _continue_ =
                                                                    fun x  ->
                                                                    (fun k  ->
                                                                    fun s  ->
                                                                    ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 12)
                                                                    (object (self)
                                                                    val _continue_ = fun x  -> (fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 13) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                    end)) s) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                    end)) s) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                    end)) s) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                    end)) s) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                    end)) s) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                    end)) s) k
                                                                    method with__continue_ x = {<_continue_ = x>}
                                                                    method _continue_ = _continue_
                                                                  end)) s) k
                                                     method with__continue_ x = {<_continue_ = x>}
                                                     method _continue_ = _continue_
                                                   end)) s) k
                                      method with__continue_ x = {<_continue_ = x>}
                                      method _continue_ = _continue_
                                    end)) s) k
                       method with__continue_ x = {<_continue_ = x>}
                       method _continue_ = _continue_
                     end)) s) k
        method with__continue_ x = {<_continue_ = x>}
        method _continue_ = _continue_
      end)) s
  

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Compile-time performance problem
  2016-11-08  8:56       ` Christoph Höger
@ 2016-11-08 10:10         ` Goswin von Brederlow
  2016-11-08 10:49           ` [Caml-list] <SPAM> " Pierre Chambart
  0 siblings, 1 reply; 7+ messages in thread
From: Goswin von Brederlow @ 2016-11-08 10:10 UTC (permalink / raw)
  To: Christoph H??ger; +Cc: caml-list

On Tue, Nov 08, 2016 at 09:56:33AM +0100, Christoph H??ger wrote:
> let (<+) a v = Array.append a [|v|] 
> let foo k s =
>   ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 1)
>      (object (self)
>         val _continue_ =
>           fun x  ->
>             (fun k  ->
>                fun s  ->
>                  ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 2)
...

Do you have to use objects there? Does the problem go away if you use
a simple type for the continuation?

The thing with CPS is that your code can be converted into simple
jumps instead of function calls. Everything is a tail call. And all
your objects are dummies that can be optimized away if you try hard
enough. Really complex dummies. I can imagine the bytecode simply
creates the objects and calls the methods while the native code
optimizes it all away. Hard to say if that is linear (and really
slow), quadratic or exponential with just one example though.

MfG
	Goswin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] <SPAM> Re: Compile-time performance problem
  2016-11-08 10:10         ` Goswin von Brederlow
@ 2016-11-08 10:49           ` Pierre Chambart
  0 siblings, 0 replies; 7+ messages in thread
From: Pierre Chambart @ 2016-11-08 10:49 UTC (permalink / raw)
  To: Goswin von Brederlow, Christoph H??ger; +Cc: caml-list

Le 08/11/2016 à 11:10, Goswin von Brederlow a écrit :
> On Tue, Nov 08, 2016 at 09:56:33AM +0100, Christoph H??ger wrote:
>> let (<+) a v = Array.append a [|v|] 
>> let foo k s =
>>   ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 1)
>>      (object (self)
>>         val _continue_ =
>>           fun x  ->
>>             (fun k  ->
>>                fun s  ->
>>                  ((fun k  -> fun s  -> (k#_continue_ (object (self)  end)) 2)
> ...
>
> Do you have to use objects there? Does the problem go away if you use
> a simple type for the continuation?
>
> The thing with CPS is that your code can be converted into simple
> jumps instead of function calls. Everything is a tail call. And all
> your objects are dummies that can be optimized away if you try hard
> enough. Really complex dummies. I can imagine the bytecode simply
> creates the objects and calls the methods while the native code
> optimizes it all away. Hard to say if that is linear (and really
> slow), quadratic or exponential with just one example though.
>
> MfG
> 	Goswin
>
It is linear with a very large constant factor.

It is linked to a known exponential case in closure conversion that was
limited: In case of deeply nested functions when the initial closeness
estimation is systematically wrong, this can fall into that case. To prevent
problems the search depth is limited to 5. See
https://github.com/ocaml/ocaml/blob/trunk/asmcomp/closure.ml#L765
and
https://github.com/ocaml/ocaml/blob/trunk/asmcomp/closure.ml#L1195

Please fill a bug on mantis for this.

Side not: Flambda does not have this kind of problem, this estimation
is done by a linear time fixpoint. (this might be the only known example
where flambda compilation is faster...)
-- 
Pierre



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-11-08 10:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-07 17:01 [Caml-list] Compile-time performance problem Christoph Höger
2016-11-07 17:07 ` Christoph Höger
2016-11-07 17:18   ` Nicolas Ojeda Bar
2016-11-07 17:32     ` Christoph Höger
2016-11-08  8:56       ` Christoph Höger
2016-11-08 10:10         ` Goswin von Brederlow
2016-11-08 10:49           ` [Caml-list] <SPAM> " Pierre Chambart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).