From: Goswin von Brederlow
To: Edwin
Cc: "Jon Harrop", caml-list@inria.fr
Subject: Re: Value types
Date: Tue, 14 Dec 2010 10:54:00 +0100
Message-ID: <87fwu0bryf.fsf@frosties.localnet>
In-Reply-To: <20101212175524.73a8e285@deb0>
References: <036001cb9a0c$725acef0$57106cd0$@com> <20101212175524.73a8e285@deb0>

Török
Edwin writes:

> On Sun, 12 Dec 2010 14:54:14 -0000
> "Jon Harrop" wrote:
>
>> The Haskell guys got their best performance improvement moving to
>> LLVM from the hailstone benchmark so it is interesting to examine
>> this case as well. I just added support for 64-bit ints to HLVM to
>> implement that benchmark and my code is:
>>
>> Here’s the equivalent OCaml code:
>>
>>   let rec collatzLen(c, n) : int =
>>     if n = 1L then c else
>>       collatzLen (c+1, if Int64.rem n 2L = 0L then Int64.div n 2L else
>
> OK, but boxing itself has nothing to do with the performance degradation
> here. It is the lack of compiler optimizations on the Int64 type. This
> could be solved by implementing compiler optimizations for it (or
> manually optimizing some integer arithmetic that is slow).
>
> Let's run the code under a profiler, or look at the assembly (I used
> 'perf record' and 'perf report'). 2 'idiv' instructions turn up as top
> offenders in the profile.
>
> Problem #1: Int64.rem n 2 -> another idiv instruction
>
> A C compiler would optimize this to an 'and' instruction.
> Change that to 'Int64.logand n 1L = 0L'.

But C would have a uint64_t there (if you are smart). With a signed
int64_t the two are not equivalent:

    #include <stdint.h>

    int64_t a;  /* global, so the store is not optimized away */

    void foo(int64_t x) {
      a = x % 2;
    }

    0000000000000000 <foo>:
       0:   48 89 f8                mov    %rdi,%rax
       3:   48 c1 e8 3f             shr    $0x3f,%rax
       7:   48 01 c7                add    %rax,%rdi
       a:   83 e7 01                and    $0x1,%edi
       d:   48 29 c7                sub    %rax,%rdi
      10:   48 89 3d 00 00 00 00    mov    %rdi,0x0(%rip)        # 17 <foo+0x17>
      17:   c3                      retq

> Problem #2: Int64.div n 2 -> idiv instruction.
>
> A C compiler would optimize this to a right shift. Changing that to
> 'Int64.shift_right n 1' speeds up the code.

Same thing again:

    #include <stdint.h>

    int64_t a;  /* global, so the store is not optimized away */

    void foo(int64_t x) {
      a = x / 2;
    }

    0000000000000000 <foo>:
       0:   48 89 f8                mov    %rdi,%rax
       3:   48 c1 e8 3f             shr    $0x3f,%rax
       7:   48 8d 3c 38             lea    (%rax,%rdi,1),%rdi
       b:   48 d1 ff                sar    %rdi
       e:   48 89 3d 00 00 00 00    mov    %rdi,0x0(%rip)        # 15 <foo+0x15>
      15:   c3                      retq

In both cases gcc avoids the expensive idiv instruction, but it needs a
few extra instructions to handle the sign of the number correctly.
Still faster than idiv though. A UInt64 module would be nice here.

Regards,
  Goswin
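
To make the signed/unsigned difference above concrete, here is a small C sketch. The function names are hypothetical and chosen only for illustration. The two "branchless" functions are a hand transcription of the shr/add/and/sub and shr/lea/sar sequences in the disassembly above, and they assume that `>>` on a negative int64_t is an arithmetic shift (true for gcc, but implementation-defined in ISO C).

```c
#include <stdint.h>

/* What the C source asks for: signed division truncates toward zero,
   so -3 % 2 == -1 and -3 / 2 == -1. */
int64_t rem2_signed(int64_t x) { return x % 2; }
int64_t div2_signed(int64_t x) { return x / 2; }

/* The naive replacements. On negative odd inputs they disagree with
   the functions above: -3 & 1 == 1, and -3 >> 1 == -2 when the
   compiler implements >> as an arithmetic shift (assumption). */
int64_t and1_signed(int64_t x) { return x & 1; }
int64_t sar1_signed(int64_t x) { return x >> 1; }

/* Transcription of the gcc sequences: extract the sign bit (0 or 1),
   add it as a bias before the and/shift, and for the remainder
   subtract it again afterwards. */
int64_t rem2_branchless(int64_t x) {
    int64_t sign = (int64_t)((uint64_t)x >> 63);
    return ((x + sign) & 1) - sign;
}

int64_t div2_branchless(int64_t x) {
    int64_t sign = (int64_t)((uint64_t)x >> 63);
    return (x + sign) >> 1;  /* assumes arithmetic shift on negatives */
}

/* With unsigned operands no bias is needed: a plain and/shift is
   exactly x % 2 and x / 2. */
uint64_t rem2_unsigned(uint64_t x) { return x & 1u; }
uint64_t div2_unsigned(uint64_t x) { return x >> 1; }
```

For x = -3 the signed functions return -1 and -1, while the naive replacements return 1 and -2 under the stated assumption. One caveat on the remainder case: a test against zero, as in the quoted 'Int64.logand n 1L = 0L', still gives the same answer as 'Int64.rem n 2L = 0L', because both values are zero exactly when n is even. It is the value itself, and the division, that differ for negative inputs.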