From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id EAA02088; Thu, 22 Aug 2002 04:18:22 +0200 (MET DST)
X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id EAA01824 for <caml-list@pauillac.inria.fr>; Thu, 22 Aug 2002 04:18:20 +0200 (MET DST)
Received: from slave-dog.naughtydog.com (naughtydog200.brandx.net [209.55.75.200])
	by nez-perce.inria.fr (8.11.1/8.11.1) with SMTP id g7M2IGn08853
	for <caml-list@inria.fr>; Thu, 22 Aug 2002 04:18:17 +0200 (MET DST)
Received: from SMTP agent by mail gateway 
 Wed, 21 Aug 2002 19:33:22 -0800
Received: from katn-dog (engstad@katn-dog.naughtydog.com [10.0.0.90])
	by slave-dog.naughtydog.com (SGI-8.9.3/8.9.3) with ESMTP id TAA65370
	for <caml-list@inria.fr>; Wed, 21 Aug 2002 19:33:18 -0700 (PDT)
Content-Type: text/plain;
  charset="us-ascii"
From: Pal-Kristian Engstad <engstad@naughtydog.com>
Organization: Naughty Dog, Inc.
To: caml-list@inria.fr
Subject: [Caml-list] Pointers needed for optimization problem.
Date: Wed, 21 Aug 2002 19:33:21 -0700
User-Agent: KMail/1.4.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Message-Id: <200208211933.21915.engstad@naughtydog.com>
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

Hi,

I'm interested in solving a problem I have with automatically generating 
optimal (vector) floating point code. In other words, I want to code a caml 
program that transforms the input description into assembly instructions. For 
regular floating point operations, a good compiler should be able to optimize 
the code:
	
	r := A * x * x + B * x + C

into (if multiplication is more expensive than adds):

	t := A * x
	t := t + B
	t := t * x
	t := t + B
since
	r := (A * x + B) * x + C

Now, this is fairly trivial to do, but if you have an architecture with SIMD 
vector floating point registers and opcodes, a whole range of new interesting 
optimizations can be done. With vector registers, I mean registers that have 
up to 4 floating point values. For instance, a 4-vector times a 4-by-4 matrix 
operation can be coded as:

	acc.xyzw := mtxrow[0].xyzw * vec.x
	acc.xyzw += mtxrow[1].xyzw * vec.y
	acc.xyzw += mtxrow[2].xyzw * vec.z
	res.xyzw = acc.xyzw + mtxrow[3].xyzw * vec.w

Here, each line represent one assembly instruction.

So, what I am looking for is basically pointers to literature that deal with 
issues like these, and since OCaml is a great language for language 
transformations I thought someone on this list would be able to point me in 
the right direction. I _have_ been searching on the net, but I guess I don't 
know the right keywords to search on.

PKE.
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners