TWIMC,

I've played a little with different optimization options in flambda
4.04, and finally all three versions of the loop (curried, uncurried,
and the for-loop) have the same performance, though they still lose
about 30% to the C version, due to tagging. In other words, flambda was
able to get rid of the allocation. I don't actually know which of the
options finally made the difference, but this is how I compiled it:

    ocamlopt.opt -c -S -inlining-report -unbox-closures -O3 -rounds 8 -inline-max-depth 256 -inline-max-unroll 1024 -o loop.cmx loop.ml
    ocamlopt.opt loop.cmx -o loop.native

Regards,
Ivan

On Tue, Jul 11, 2017 at 8:54 AM, Simon Cruanes wrote:

> Hello,
>
> Iterators in OCaml have been the topic of many discussions. Another
> option for fast iterators is https://github.com/c-cube/sequence ,
> which (with flambda) should compile down to loops and tests on this kind
> of benchmark. With the attached additional file on 4.04.0+flambda,
> I obtain the following (where sequence is test-seq):
>
> $ for i in test-* ; do echo $i ; time ./$i ; done
> test-c_loop
> 5000000100000000
> ./$i 0.08s user 0.00s system 97% cpu 0.085 total
> test-f_loop
> 5000000100000000
> ./$i 0.10s user 0.00s system 96% cpu 0.100 total
> test-loop
> 5000000100000000
> ./$i 0.18s user 0.00s system 97% cpu 0.184 total
> test-seq
> 5000000100000000
> ./$i 0.11s user 0.00s system 97% cpu 0.113 total
> test-stream
> 5000000100000000
> ./$i 0.44s user 0.00s system 98% cpu 0.449 total
>
>
> Note that sequence is imperative underneath, but can be safely used as a
> functional structure.
>
> --
> Simon Cruanes
>
> http://weusepgp.info/
> key 49AA62B6, fingerprint 949F EB87 8F06 59C6 D7D3 7D8D 4AC0 1D08 49AA 62B6