From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gclci-caml-list@m.gmane.org>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.3 required=5.0 tests=AWL autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104])
	by yquem.inria.fr (Postfix) with ESMTP id D1F82BBCA
	for <caml-list@yquem.inria.fr>; Wed,  2 Apr 2008 11:44:07 +0200 (CEST)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AlQBALvy8kdQW+UCj2dsb2JhbACRTwEBAQEJBRAWmj4
X-IronPort-AV: E=Sophos;i="4.25,592,1199660400"; 
   d="scan'208";a="10969463"
Received: from discorde.inria.fr ([192.93.2.38])
  by mail3-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 11:44:07 +0200
Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104])
	by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id m329i6DO021731
	(version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK)
	for <caml-list@inria.fr>; Wed, 2 Apr 2008 11:44:07 +0200
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AlQBALvy8kdQW+UCj2dsb2JhbACRTwEBAQEJBRAWmj4
X-IronPort-AV: E=Sophos;i="4.25,592,1199660400"; 
   d="scan'208";a="10969443"
Received: from main.gmane.org (HELO ciao.gmane.org) ([80.91.229.2])
  by mail3-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 11:43:43 +0200
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1JgzVN-0000ao-3a
	for caml-list@inria.fr; Wed, 02 Apr 2008 09:43:41 +0000
Received: from ks300734.kimsufi.com ([91.121.65.225])
        by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
        id 1AlnuQ-0007hv-00
        for <caml-list@inria.fr>; Wed, 02 Apr 2008 09:43:41 +0000
Received: from sylvain by ks300734.kimsufi.com with local (Gmexim 0.1 (Debian))
        id 1AlnuQ-0007hv-00
        for <caml-list@inria.fr>; Wed, 02 Apr 2008 09:43:41 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: caml-list@inria.fr
From: Sylvain Le Gall <sylvain@le-gall.net>
Subject:  Re: ocaml and int64
Date: Wed, 2 Apr 2008 09:43:31 +0000 (UTC)
Message-ID:  <slrnfv6la3.tll.sylvain@gallu.homelinux.org>
References:  <d6c7b34d0804011754k34acaecfp7de0f27769b829b@mail.gmail.com>
 <200804020250.25091.jon@ffconsultancy.com>
Mime-Version:  1.0
Content-Type:  text/plain; charset=us-ascii
Content-Transfer-Encoding:  7bit
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: ks300734.kimsufi.com
User-Agent: slrn/0.9.8.1pl2 (Debian)
Sender: news <news@ger.gmane.org>
X-Miltered: at discorde with ID 47F35566.001 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)!
X-Spam: no; 0.00; le-gall:01 ocaml:01 ocaml:01 pitfalls:01 recursion:01 seq:01 seq:01 statistic:98 wrote:01 wrote:01 rec:01 compile:01 integer:01 integer:01 pps:01 

On 02-04-2008, Jon Harrop <jon@ffconsultancy.com> wrote:
> On Wednesday 02 April 2008 01:54:54 Ludovic Coquelle wrote:
>> Hi,
>>
>> I wrote a direct translation of a simple algo from F# to ocaml.
>> (details can be found here:
>> http://khigia.wordpress.com/2008/03/30/ocaml-vs-f-for-big-integer-surprisin
>>g-performance-test/ )
>>
>> The compile F# program (12s) is much faster than Ocaml (30s), probably
>> because the algo do integer arithmetic with Int64 module (thanks to David
>> for this info).
>>
>> Have someone here face this kind of problem (optimizing a code doing
>> arithmetic on big integer)?
>> Any advice to improve the Ocaml code (without changing the algo)?
>
> Get a 64-bit machine. ;-)
>
> There are some performance pitfalls in your code (which takes 21s on my 
> machine):
>
> . OCaml makes no attempt to optimize integer div and mod so avoid these at all 
> costs. In this case, use bitwise ANDs and shifts.
>
> . Int64 is boxed by default and your use of recursion is likely to worsen this 
> effect.
>
> Optimizing for these, the following code takes only 4.6s, which is 4.6x faster 
> than your original:

Nice speed up... But without changing the algo, replace seq_length by:

let limit_int =
  (max_int / 3) - 1
;;

let limit_int64 =
  Int64.of_int max_int
;;

let rec seq_length_int x n =
    match x with
    | 0 ->
        (n + 1)
    | 1 ->
        seq_length_int 0 (n + 1)
    | x when x mod 2 = 0 ->
        seq_length_int (x / 2) (n + 1)
    | _ ->
        if x < limit_int then
          seq_length_int ((3 * x) + 1) (n + 1)
        else
          seq_length_int64 (Int64.of_int x) n
and seq_length_int64 x n =
    match x with
    | 0L ->
        (n + 1)
    | 1L ->
        seq_length_int64 0L (n + 1)
    | x when x %% 2L = 0L ->
        let nx =
          x // 2L
        in
          if Int64.compare nx limit_int64 < 0 then
            seq_length_int (Int64.to_int nx) (n + 1)
          else
            seq_length_int64 nx (n + 1)
    | _ ->
        seq_length_int64 (Int64.succ (3L ** x)) (n + 1)
;;

let seq_length =
  seq_length_int64
;;

Give you a 19x speed up on 32 bit machine. On 64 bit machine this should
lead to nearly native performance (i.e. beat F# more than you think).

The trick is very simple: switch to "int" as soon as you can, only use
boxed integer when you cannot do anything else (in my case the limit is
when we go from 64bit to 31bit integer, but on 64bit machine this will
be when you go from 64bit to 63bit integer i.e. you will always compute
using "int" which will be native perf).   

You can explain the perf gain by doing some statistic on 3n + 1, n / 2,
to see how much time is spend under max_int... and explain the speed-up
;-) 

Applying the ">>>" trick of Dr. Harrop doesn't really improve perf here
(even if it is a good idea).

Regards,
Sylvain Le Gall

ps: for the sake of writing better code, there is a way to write the
code in a more generic way, but i am too lazy
pps:
gildor@peta:~/tmp/ocaml-test/eul14$ time ./eul14
837799
real    1m11.870s
user    1m11.324s
sys     0m0.428s

(with seq_length_int/seq_length_int64)
gildor@peta:~/tmp/ocaml-test/eul14$ time ./eul14
837799
real    0m3.736s
user    0m3.676s
sys     0m0.052s