From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <Xavier.Leroy@inria.fr>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.1 required=5.0 tests=AWL autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38])
	by yquem.inria.fr (Postfix) with ESMTP id CAF87BC69
	for <caml-list@yquem.inria.fr>; Sat, 10 Feb 2007 17:10:40 +0100 (CET)
Received: from smtp2-g19.free.fr (smtp2-g19.free.fr [212.27.42.28])
	by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l1AGAeKK027618
	for <caml-list@yquem.inria.fr>; Sat, 10 Feb 2007 17:10:40 +0100
Received: from [192.168.1.2] (che78-2-82-237-71-191.fbx.proxad.net [82.237.71.191])
	by smtp2-g19.free.fr (Postfix) with ESMTP id 2971B7D4C;
	Sat, 10 Feb 2007 17:10:40 +0100 (CET)
Message-ID: <45CDEE7F.8040601@inria.fr>
Date: Sat, 10 Feb 2007 17:10:39 +0100
From: Xavier Leroy <Xavier.Leroy@inria.fr>
User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050317)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: ls-ocaml-developer-2006@m-e-leypold.de
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Multiplication of matrix in C and OCaml
References: <45CAF3E2.7020807@univ-paris12.fr>	<200702100224.32450.jon@ffconsultancy.com>	<hbtzxu13d9.fsf@hod.lan.m-e-leypold.de>	<200702101452.02955.jon@ffconsultancy.com> <n98xf6j9ic.fsf@hod.lan.m-e-leypold.de>
In-Reply-To: <n98xf6j9ic.fsf@hod.lan.m-e-leypold.de>
X-Enigmail-Version: 0.91.0.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Miltered: at discorde with ID 45CDEE80.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)!
X-Spam: no; 0.00; ocaml:01 citeseer:01 compiler:01 reals:01 gcc:01 flags:01 gcc:01 48,:98 psu:98 interchange:98 caml-list:01 pairs:01 digest:01 arithmetic:01 arithmetic:01 

> [Many questions about float arithmetic and optimizations]

Do yourself a favor and read Goldberg's excellent reference:

"What Every Computer Scientist Should Know About Floating Point Arithmetic"
ACM Computing Surveys 23(1), 5-48, 1991
http://citeseer.ist.psu.edu/goldberg91what.html

It doesn't read like _TV_digest_, but it's well worth the effort.

> Is the compiler allowed to make
> optimizations according to known mathematical laws?

Yes, provided they actually hold.  Goldberg lists a few that hold in
IEEE float arithmetic (section "Optimizers").  But pretty much every
algebraic law that holds over the reals doesn't hold in floating-point
arithmetic.
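
To make this concrete, here is a minimal OCaml sketch (not taken from
Goldberg's paper; the function names are made up for this example) that
tests a few laws on IEEE double-precision floats.  Commutativity of
addition survives for non-NaN operands, but associativity, x * 0 = 0 and
x - x = 0 can all fail:

```ocaml
(* Each function tests one algebraic law on OCaml floats (IEEE doubles).
   The names are illustrative only. *)

(* (a + b) + c = a + (b + c): true over the reals, not in floating point,
   because each intermediate sum is rounded. *)
let assoc_holds a b c = (a +. b) +. c = a +. (b +. c)

(* x * 0 = 0: fails when x is infinite or NaN (the product is NaN). *)
let mul_zero_holds x = x *. 0.0 = 0.0

(* x - x = 0: fails when x is infinite or NaN (the difference is NaN). *)
let sub_self_holds x = x -. x = 0.0

(* a + b = b + a: holds as long as neither operand is NaN. *)
let comm_holds a b = a +. b = b +. a

let () =
  Printf.printf "assoc 0.1 0.2 0.3   : %b\n" (assoc_holds 0.1 0.2 0.3);
  Printf.printf "x * 0 = 0, x = 2.0  : %b\n" (mul_zero_holds 2.0);
  Printf.printf "x * 0 = 0, x = inf  : %b\n" (mul_zero_holds infinity);
  Printf.printf "x - x = 0, x = inf  : %b\n" (sub_self_holds infinity);
  Printf.printf "comm 0.1 0.2        : %b\n" (comm_holds 0.1 0.2)
```

A compiler that rewrote (a + b) + c into a + (b + c), or x * 0 into 0,
would therefore change the result of some programs.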

> Still: With a certain Gcc version and flags combination the OP saw a
> threefold improvement in performance. That in itself is suspicious (I
> don't think that this much optimization potential was left in Gcc ...)
> and I still would check for optimization errors in this case.

That's a good idea.  Note however that there are known optimizations
(loop blocking, loop interchange) that can dramatically improve the
performance of dense matrix multiply by making better use of the caches.
Automatic vectorization (generation of SSE2 instructions that operate
over pairs of double-precision floats) could also have a significant
impact, although not by a factor of 3.  There's only one way to know:
read the assembly code generated by gcc.
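
For reference, loop interchange on a dense matrix multiply looks roughly
like the OCaml sketch below.  This is an illustration under assumptions,
not what gcc actually generated for the original poster: the function
names are invented, and matrices are represented as arrays of rows.
Moving the k loop outside the j loop makes the inner loop walk along
rows of b and c instead of down a column of b.  Note that for each
element c(i,j) the products are still added in the same order
k = 0, 1, ..., so this particular transformation should not change the
floating-point result:

```ocaml
(* Textbook i-j-k order: the inner loop reads b.(k).(j) for successive k,
   i.e. it goes down a column of b and touches a different row each step. *)
let matmul_ijk a b =
  let n = Array.length a and m = Array.length b in
  let p = if m = 0 then 0 else Array.length b.(0) in
  let c = Array.make_matrix n p 0.0 in
  for i = 0 to n - 1 do
    for j = 0 to p - 1 do
      for k = 0 to m - 1 do
        c.(i).(j) <- c.(i).(j) +. a.(i).(k) *. b.(k).(j)
      done
    done
  done;
  c

(* Interchanged i-k-j order: the inner loop scans row k of b and row i
   of c contiguously, which is friendlier to the caches. *)
let matmul_ikj a b =
  let n = Array.length a and m = Array.length b in
  let p = if m = 0 then 0 else Array.length b.(0) in
  let c = Array.make_matrix n p 0.0 in
  for i = 0 to n - 1 do
    let ci = c.(i) in
    for k = 0 to m - 1 do
      let aik = a.(i).(k) and bk = b.(k) in
      for j = 0 to p - 1 do
        ci.(j) <- ci.(j) +. aik *. bk.(j)
      done
    done
  done;
  c

let () =
  let a = [| [| 1.0; 2.0 |]; [| 3.0; 4.0 |] |] in
  let b = [| [| 5.0; 6.0 |]; [| 7.0; 8.0 |] |] in
  Printf.printf "same result: %b\n" (matmul_ijk a b = matmul_ikj a b)
```

Loop blocking goes one step further and works on sub-blocks small enough
to stay in cache; it is not shown here.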

- Xavier Leroy