Good results, but it would be better to compare with 4.03.0+trunk (without flambda), so that the difference can be attributed to flambda alone.

Did you use any particular options for flambda (e.g. -O3 -unbox-closures)?
Did you change the default value of the -inline option?

I noticed better performance with "-O3 -unbox-closures" on Alt-Ergo, but
I probably have to adjust the value of "-inline" (which is currently set to 100).

- Mohamed.


On 17/04/2016 10:43, Jesper Louis Andersen wrote:
I tried `opam switch ocaml-4.03.0+trunk+flambda` on the Transit format encoder/decoder I have. I wanted to see how much faster flambda would make the code, since I've heard 20% and 30% thrown around. The code is not very optimized, and in particular, the encoder path is rather elegant but awfully slow. Well, not anymore:

4.02:
 Name         Time/Run      mWd/Run   mjWd/Run   Prom/Run   Percentage  
------------ ---------- ------------ ---------- ---------- ------------ 
 decode         2.12ms     352.86kw    34.86kw    34.86kw       27.88%  
 encode         5.07ms     647.93kw   263.69kw   250.40kw       66.70%  
 round_trip     7.61ms   1_000.79kw   298.54kw   285.26kw      100.00%  

4.03.0+trunk+flambda:
 Name         Time/Run      mWd/Run   mjWd/Run   Prom/Run   Percentage
------------ ---------- ------------ ---------- ---------- ------------
 decode         2.04ms     319.83kw    35.94kw    35.94kw       43.97%
 encode         2.65ms     422.67kw   130.88kw   117.59kw       56.95%
 round_trip     4.65ms     742.50kw   164.85kw   151.56kw      100.00%

Pretty impressive result. Note that the heavy lifting is done by the yajl JSON parser, which sets a lower bound on the run time. But I think the speedup in the encode path is rather good.

Note that the benchmark is probably flawed and some time passed between these two runs, so there might be a confounder hidden in other fixes, either to yajl, or to other parts of the compiler toolchain. However, I think the result itself stands since in practice, my encoding time was just cut in half.
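
As a rough check on the "cut in half" figure, the Time/Run columns of the two tables can be compared directly. The OCaml sketch below is only illustrative: the names `timings`, `speedup`, and `reduction_pct` are made up for this example, and the numbers are copied by hand from the tables above.

```ocaml
(* Time/Run in milliseconds, copied from the two benchmark tables:
   (benchmark name, 4.02 time, 4.03.0+trunk+flambda time). *)
let timings = [
  ("decode",     2.12, 2.04);
  ("encode",     5.07, 2.65);
  ("round_trip", 7.61, 4.65);
]

(* Speedup factor: old time divided by new time. *)
let speedup ~before ~after = before /. after

(* Percentage of run time saved, relative to the old time. *)
let reduction_pct ~before ~after = 100.0 *. (before -. after) /. before

let () =
  List.iter
    (fun (name, before, after) ->
      Printf.printf "%-10s  %.2fx faster, %.1f%% less time\n" name
        (speedup ~before ~after)
        (reduction_pct ~before ~after))
    timings
```

With these figures, encode comes out at roughly 1.9x faster (about 48% less time) and round_trip at roughly 1.6x, while decode barely moves (about 1.04x).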