On Sun, 12 Dec 2010 17:14:45 -0000 "Jon Harrop" wrote: > Török Edwin wrote: > > Problem #1: Int64.rem n 2 -> another idiv instruction > > > > A C compiler would optimize this to an 'and' instruction. > > Change that to 'Int64.logand n 1L = 0L'/ > > Yes. LLVM did that for me. > > > Problem #2: Int64.div n 2 -> idiv instruction. > > > > A C compiler would optimize this to a right shift. Changing that to > > 'Int64.shift_right n 1' speeds > > up the code. > > Yes. LLVM also did that for me. In fact, I have been bitten by > ocamlopt not optimizing div and mod by a constant in real OCaml code > before. This problem also turns up in the context of hash table > implementations where you want to % by the length of the spine. Do you really need to use Int64 for that though? Won't the 63-bit version do? > > > With these changes I get almost the same speed as the C code: > > $ ocamlopt x.ml -o x && time ./x > > 837799 > > real 0m0.664s > > user 0m0.667s > > sys 0m0.000s > > > > $ gcc -O3 x.c && time ./a.out > > 837799 > > real 0m0.635s > > user 0m0.633s > > sys 0m0.000s > > > > Here's the OCaml code: > > let rec collatzLen(c, n) : int = > > if n = 1L then c else > > collatzLen (c+1, if Int64.logand n 1L = 0L then > > Int64.shift_right n 1 else Int64.add (Int64.mul 3L n) 1L);; > > > > let rec loop(i, (nlen, n)) = > > if i = 1L then n else > > let ilen = collatzLen(1, i) in > > let nlen, n = if ilen > nlen then ilen, i else nlen, n in > > loop (Int64.sub i 1L, (nlen, n));; > > > > let _ = > > let s = loop (1000000L, (1,1000000L)) in > > print_int (Int64.to_int s);; > > I am unable to reproduce your results. Here, the time falls from 24s > to 19.5s (using ocamlopt 3.12.0 on Intel x86) which is still 26× > slower than HLVM. Do you still have 'idiv' in the compiled code? See my attached assembly, and compare it with yours please. I was doing the test on 64-bit, with ocamlopt 3.11.2 and 3.12.0. FWIW the original code took 2.8 seconds here, so only 4x slower (this is an AMD Phenom II x6 1090T CPU). It probably depends how fast/slow the 'idiv' is on your CPU. --Edwin