From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL,HTML_MESSAGE autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id ACD99BBCA for ; Wed, 2 Apr 2008 02:54:57 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqkGAEN28kfAXQIm/2dsb2JhbACCPDajSoYc X-IronPort-AV: E=Sophos;i="4.25,589,1199660400"; d="scan'208";a="24479866" Received: from discorde.inria.fr ([192.93.2.38]) by mail4-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 02:54:57 +0200 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id m320suEN006185 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Wed, 2 Apr 2008 02:54:57 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AhsCAEN28kfRVYT5c2dsb2JhbACCPDaOWAEMAwQFCRSUPIYc X-IronPort-AV: E=Sophos;i="4.25,589,1199660400"; d="scan'208";a="9081566" Received: from an-out-0708.google.com ([209.85.132.249]) by mail2-smtp-roc.national.inria.fr with ESMTP; 02 Apr 2008 02:54:56 +0200 Received: by an-out-0708.google.com with SMTP id c24so587041ana.97 for ; Tue, 01 Apr 2008 17:54:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type; bh=N+XbaL+5NwpYDrhbmIwkr3uTe9cZuWj6WHShsPUGLaY=; b=FmEVrsT69fCHUSmgCcbbwxCL6d8UrLY4oqvXSD959WH/IttM2JR49uZ8CedUbWjj2Y7gy5C4dr63wnJPK2DkZss5GpwO3VlpyLmxnE94Z44XVWiOY4VzVdp4jwyR/InraV9kTikZTNtyuQLvfY9Hz+Utwc0/bSrxb4pepyDTGqo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=message-id:date:from:to:subject:mime-version:content-type; b=A5criCsretmgLDws2s5t6tSGqMUiC65Y6gW4wi5jokG6zHdz8XD4FbppjZJytRVMCtjDq1akd85GLGHCjAG1bfXiyXBbGEN4glVD3Q0wCSTFZYNTz25gj4Y+bIW7qS1ZR/HBGfWYIBFVLEF5ugp4BBTrTprH8f9dPfWb9ee9Vgs= Received: by 10.100.214.16 with SMTP id m16mr20905635ang.101.1207097694528; Tue, 01 Apr 2008 17:54:54 -0700 (PDT) Received: by 10.100.136.16 with HTTP; Tue, 1 Apr 2008 17:54:54 -0700 (PDT) Message-ID: Date: Wed, 2 Apr 2008 08:54:54 +0800 From: "Ludovic Coquelle" To: Caml Subject: ocaml and int64 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_1618_28729966.1207097694516" X-Miltered: at discorde with ID 47F2D960.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; ocaml:01 ocaml:01 wrote:01 wrote:01 compile:01 compile:01 integer:01 integer:01 arithmetic:01 arithmetic:01 int:01 int:01 face:97 face:97 module:03 ------=_Part_1618_28729966.1207097694516 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline Hi, I wrote a direct translation of a simple algo from F# to ocaml. (details can be found here: http://khigia.wordpress.com/2008/03/30/ocaml-vs-f-for-big-integer-surprising-performance-test/ ) The compile F# program (12s) is much faster than Ocaml (30s), probably because the algo do integer arithmetic with Int64 module (thanks to David for this info). Have someone here face this kind of problem (optimizing a code doing arithmetic on big integer)? Any advice to improve the Ocaml code (without changing the algo)? Thanks ------=_Part_1618_28729966.1207097694516 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline Hi,

I wrote a direct translation of a simple algo from F# to ocaml.
(details can be found here: http://khigia.wordpress.com/2008/03/30/ocaml-vs-f-for-big-integer-surprising-performance-test/)

The compile F# program (12s) is much faster than Ocaml (30s), probably because the algo do integer arithmetic with Int64 module (thanks to David for this info).

Have someone here face this kind of problem (optimizing a code doing arithmetic on big integer)?
Any advice to improve the Ocaml code (without changing the algo)?

Thanks
------=_Part_1618_28729966.1207097694516-- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id A08F1BBCA for ; Wed, 2 Apr 2008 03:59:53 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AoMBAH6F8kfUnw6Db2dsb2JhbACCOI8SAQwFAgUHGJpJ X-IronPort-AV: E=Sophos;i="4.25,590,1199660400"; d="scan'208";a="10297840" Received: from pih-relay04.plus.net ([212.159.14.131]) by mail1-smtp-roc.national.inria.fr with ESMTP; 02 Apr 2008 03:59:53 +0200 Received: from [80.229.56.224] (helo=beast.local) by pih-relay04.plus.net with esmtp (Exim) id 1JgsGZ-00080H-3Z for caml-list@yquem.inria.fr; Wed, 02 Apr 2008 02:59:55 +0100 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] ocaml and int64 Date: Wed, 2 Apr 2008 02:50:24 +0100 User-Agent: KMail/1.9.7 References: In-Reply-To: MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <200804020250.25091.jon@ffconsultancy.com> X-Plusnet-Relay: 6cde174ef2478657e83857a26be2c5be X-Spam: no; 0.00; ocaml:01 ocaml:01 pitfalls:01 recursion:01 seq:01 seq:01 printf:01 printf:01 trivial:01 0.6:98 frog:98 wrote:01 wrote:01 rec:01 rec:01 On Wednesday 02 April 2008 01:54:54 Ludovic Coquelle wrote: > Hi, > > I wrote a direct translation of a simple algo from F# to ocaml. > (details can be found here: > http://khigia.wordpress.com/2008/03/30/ocaml-vs-f-for-big-integer-surprisin >g-performance-test/ ) > > The compile F# program (12s) is much faster than Ocaml (30s), probably > because the algo do integer arithmetic with Int64 module (thanks to David > for this info). > > Have someone here face this kind of problem (optimizing a code doing > arithmetic on big integer)? > Any advice to improve the Ocaml code (without changing the algo)? Get a 64-bit machine. ;-) There are some performance pitfalls in your code (which takes 21s on my machine): . OCaml makes no attempt to optimize integer div and mod so avoid these at all costs. In this case, use bitwise ANDs and shifts. . Int64 is boxed by default and your use of recursion is likely to worsen this effect. Optimizing for these, the following code takes only 4.6s, which is 4.6x faster than your original: let int = Int64.to_int let int64 = Int64.of_int let ( +: ) = Int64.add let ( -: ) = Int64.sub let ( *: ) = Int64.mul let ( >>> ) = Int64.shift_right let rec seq_length x n = match x with | 0L -> n +: 1L | 1L -> seq_length 0L (n +: 1L) | x when int x land 1 = 0 -> seq_length (x >>> 1) (n +: 1L) | _ -> seq_length (3L *: x +: 1L) (n +: 1L) let rec loop i imax n = let n' = seq_length i 0L in let imax, n = if n' > n then (i, n') else (imax, n) in if i < 1000000L then loop (i +: 1L) imax n else imax let _ = print_string (Int64.to_string (loop 1L 0L 0L)) Using 63-bit ints on a 64-bit machine, the time drops to only 2s: let rec seq_length x n = match x with | 0 -> n + 1 | 1 -> seq_length 0 (n + 1) | x when x land 1 = 0 -> seq_length (x lsr 1) (n + 1) | _ -> seq_length (3*x + 1) (n + 1) let rec loop i imax n = let n' = seq_length i 0 in let imax, n = if n' > n then (i, n') else (imax, n) in if i < 1000000 then loop (i+1) imax n else imax let _ = print_string (string_of_int (loop 1 0 0)) As you have allured to, algorithmic optimizations buy you even more. The following implementation is several times faster again, taking only 0.6s to complete: let int = Int64.to_int let int64 = Int64.of_int let ( +: ) = Int64.add let ( -: ) = Int64.sub let ( *: ) = Int64.mul let rec inside a n = if n<=1L then 0 else if a.(int n)>0 then a.(int n) else let p = if int n land 1 = 0 then inside a (Int64.shift_right n 1) else let n = 3L *: n +: 1L in if n < int64(Array.length a) then inside a n else outside a n in a.(int n) <- 1 + p; 1 + p and outside a n = let n = if int n land 1 = 0 then Int64.shift_right n 1 else 3L *: n +: 1L in 1 + if n < int64(Array.length a) then inside a n else outside a n let euler14 n = let a = Array.create (n+1) 0 in let longest_n = ref 0 and longest_len = ref 0 in for n=1 to n do let len = inside a (int64 n) in if len > !longest_len then begin longest_n := n; longest_len := len end done; !longest_n, !longest_len let () = let n, len = euler14 1000000 in Printf.printf "%d: %d\n%!" n len Converting this to use native ints on my 64-bit machine the time drops to only 0.2s, which is over 100x faster than your original! However, this benchmark really plays to F#'s strengths and you will probably never beat F# here. My best F# is 3x faster than anything I can write in OCaml, not least because it is trivial to parallelize efficiently but also because F# and .NET automate many relevant optimizations. HTH. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/products/?e From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.3 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id D1F82BBCA for ; Wed, 2 Apr 2008 11:44:07 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AlQBALvy8kdQW+UCj2dsb2JhbACRTwEBAQEJBRAWmj4 X-IronPort-AV: E=Sophos;i="4.25,592,1199660400"; d="scan'208";a="10969463" Received: from discorde.inria.fr ([192.93.2.38]) by mail3-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 11:44:07 +0200 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id m329i6DO021731 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Wed, 2 Apr 2008 11:44:07 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AlQBALvy8kdQW+UCj2dsb2JhbACRTwEBAQEJBRAWmj4 X-IronPort-AV: E=Sophos;i="4.25,592,1199660400"; d="scan'208";a="10969443" Received: from main.gmane.org (HELO ciao.gmane.org) ([80.91.229.2]) by mail3-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 11:43:43 +0200 Received: from list by ciao.gmane.org with local (Exim 4.43) id 1JgzVN-0000ao-3a for caml-list@inria.fr; Wed, 02 Apr 2008 09:43:41 +0000 Received: from ks300734.kimsufi.com ([91.121.65.225]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 02 Apr 2008 09:43:41 +0000 Received: from sylvain by ks300734.kimsufi.com with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 02 Apr 2008 09:43:41 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: caml-list@inria.fr From: Sylvain Le Gall Subject: Re: ocaml and int64 Date: Wed, 2 Apr 2008 09:43:31 +0000 (UTC) Message-ID: References: <200804020250.25091.jon@ffconsultancy.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: ks300734.kimsufi.com User-Agent: slrn/0.9.8.1pl2 (Debian) Sender: news X-Miltered: at discorde with ID 47F35566.001 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; le-gall:01 ocaml:01 ocaml:01 pitfalls:01 recursion:01 seq:01 seq:01 statistic:98 wrote:01 wrote:01 rec:01 compile:01 integer:01 integer:01 pps:01 On 02-04-2008, Jon Harrop wrote: > On Wednesday 02 April 2008 01:54:54 Ludovic Coquelle wrote: >> Hi, >> >> I wrote a direct translation of a simple algo from F# to ocaml. >> (details can be found here: >> http://khigia.wordpress.com/2008/03/30/ocaml-vs-f-for-big-integer-surprisin >>g-performance-test/ ) >> >> The compile F# program (12s) is much faster than Ocaml (30s), probably >> because the algo do integer arithmetic with Int64 module (thanks to David >> for this info). >> >> Have someone here face this kind of problem (optimizing a code doing >> arithmetic on big integer)? >> Any advice to improve the Ocaml code (without changing the algo)? > > Get a 64-bit machine. ;-) > > There are some performance pitfalls in your code (which takes 21s on my > machine): > > . OCaml makes no attempt to optimize integer div and mod so avoid these at all > costs. In this case, use bitwise ANDs and shifts. > > . Int64 is boxed by default and your use of recursion is likely to worsen this > effect. > > Optimizing for these, the following code takes only 4.6s, which is 4.6x faster > than your original: Nice speed up... But without changing the algo, replace seq_length by: let limit_int = (max_int / 3) - 1 ;; let limit_int64 = Int64.of_int max_int ;; let rec seq_length_int x n = match x with | 0 -> (n + 1) | 1 -> seq_length_int 0 (n + 1) | x when x mod 2 = 0 -> seq_length_int (x / 2) (n + 1) | _ -> if x < limit_int then seq_length_int ((3 * x) + 1) (n + 1) else seq_length_int64 (Int64.of_int x) n and seq_length_int64 x n = match x with | 0L -> (n + 1) | 1L -> seq_length_int64 0L (n + 1) | x when x %% 2L = 0L -> let nx = x // 2L in if Int64.compare nx limit_int64 < 0 then seq_length_int (Int64.to_int nx) (n + 1) else seq_length_int64 nx (n + 1) | _ -> seq_length_int64 (Int64.succ (3L ** x)) (n + 1) ;; let seq_length = seq_length_int64 ;; Give you a 19x speed up on 32 bit machine. On 64 bit machine this should lead to nearly native performance (i.e. beat F# more than you think). The trick is very simple: switch to "int" as soon as you can, only use boxed integer when you cannot do anything else (in my case the limit is when we go from 64bit to 31bit integer, but on 64bit machine this will be when you go from 64bit to 63bit integer i.e. you will always compute using "int" which will be native perf). You can explain the perf gain by doing some statistic on 3n + 1, n / 2, to see how much time is spend under max_int... and explain the speed-up ;-) Applying the ">>>" trick of Dr. Harrop doesn't really improve perf here (even if it is a good idea). Regards, Sylvain Le Gall ps: for the sake of writing better code, there is a way to write the code in a more generic way, but i am too lazy pps: gildor@peta:~/tmp/ocaml-test/eul14$ time ./eul14 837799 real 1m11.870s user 1m11.324s sys 0m0.428s (with seq_length_int/seq_length_int64) gildor@peta:~/tmp/ocaml-test/eul14$ time ./eul14 837799 real 0m3.736s user 0m3.676s sys 0m0.052s