From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id EAA09846; Wed, 9 Jun 2004 04:41:30 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id EAA09936 for ; Wed, 9 Jun 2004 04:41:28 +0200 (MET DST) Received: from ptb-relay01.plus.net (ptb-relay01.plus.net [212.159.14.212]) by concorde.inria.fr (8.12.10/8.12.10) with ESMTP id i592fRSH020874 for ; Wed, 9 Jun 2004 04:41:27 +0200 Received: from [80.229.56.224] (helo=chetara) by ptb-relay01.plus.net with esmtp (Exim) id 1BXt1q-000BJM-HH; Wed, 09 Jun 2004 02:41:26 +0000 From: Jon Harrop Organization: University of Cambridge To: "Brandon J. Van Every" Subject: Re: [Caml-list] 3D graphics debate Date: Wed, 9 Jun 2004 03:40:01 +0100 User-Agent: KMail/1.6.2 Cc: "caml" References: In-Reply-To: MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Message-Id: <200406090340.01133.jdh30@cam.ac.uk> X-Miltered: at concorde with ID 40C678D7.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 indicative:99 complexities:01 traversing:01 ironically:01 abstraction:01 reordering:01 usefulness:01 dynamically:01 lod:99 floats:01 renderer:01 optimised:01 renderer:01 divorced:99 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk > > Are you trying to write a device driver in OCaml? > > 3D device driver problems are indicative of graphics problems in > general. There's not much 'new' at any level of the game. I would contest that, but then I am trying to make some "new" stuff... > Just things > that have been committed to HW vs. things that are still waiting to be > committed to HW, with some interrupting complexities in between. > Traversing a cell grid database is no different from rasterizing a > triangle. 3D graphics is one giant self-similar fractal. Ironically, I would say that more things should be done hierarchically, e.g. octrees rather than cell grids. :-) > > I think you are exaggerating cost of the "abstraction" of > > reordering the arguments of a function. > > And I'm thinking you've never done high performance anything in a big > architecture for a living. Prove me wrong. Oh ok then - I've just passed my PhD in computational and theoretical physics from the University of Cambridge. While I was there I developed numerous high performance programs, both for simulation (condensed matter physics) and visualisation (using OpenGL). I also invented and published research on wavelets so I've studied a lot of multiresolution analyses and hierarchical data structures... I also did some fun projects which, I believe, are valid research into the usefulness of hierarchical data structures over flat ones in the context of 2D and 3D graphics. One is a multiresolution representation of a planet which is dynamically tesselated to give just enough detail for rendering: http://www.chem.pwf.cam.ac.uk/~jdh30/programming/opengl/lod/index.html if the planet were represented to enough detail using flat data structures (equivalent to the cell grid) for that demo it would require 2^80 floats. Had I known OCaml at the time, this would have been a much smaller and faster program. :-) Another ongoing project of mine is a 2D vector graphics renderer which takes a similar, hierarchical approach to the rendering of 2D graphics (which are quite different to 3D because you're now interested in rendering the interior defined by a manifold, rather than the manifold itself): http://www.chem.pwf.cam.ac.uk/~jdh30/programming/opengl/smoke/index.html I have converted that C++ version into OCaml and optimised it so that it is now much faster, more robust and better behaved wrt real-time rendering. It is about 100 times faster than every other general purpose vector graphics renderer that I have ever come across. Of course, because it is multiresolution, it can render 1Gb vector graphics files in real-time whilst all the other programs fall over. This makes it uniquely suitable for real-time, interactive graphics which is why I intend to commercialise it... So yes, I've dabbled. :-) > > Apart from that "CPU" thing. ;-) > > I suppose you think CPU approaches are divorced from the realities of > what underlying HW expects? You want to retire things quickly, you > don't spread them out all over memory. Not unless you've got some kind > of super duper programmer visible scatter-gather DMA, and I'm not aware > of anyone building PCs in such configurations. If you want to retire things quickly, you use a hierarchical representation which allows you to retire things in O(ln n) rather than O(n) time. Who knows, it may even let you reduce your storage from O(n^2) to O(n). > > My point is that there are likely to be much more productive, > > higher-level optimisations that you could be doing. > > It is a wasted point. Performance jock do that *and* the low-level > optimizations. The key is sane architecture so you don't have to do > more than a handful of low-level optimizations. Ideally. But the vast majority of "performance jocks" ignore the mathematical proofs and stick to twiddling 32-bit floats. > No, I said that *in addition to* that rather common class of algorithms, > even sequential processing is usually accept / reject. Let's say you've > got SoA for a 4-field vector. When you accept, you're going to be > touching all 4 arrays. If you often accept, then your cache is going to > be polluted by all the rejects. You have made your problem 4 times less > cache coherent than it otherwise would be. The reality of 3D graphics > processing is you usually need all that info in the structure. That's > why it's a structure. Aren't we talking about a 4xn matrix rather than an nx4 matrix? > > Can these algorithms not be transformed so that they > > access more coherently? > > No, dammit! You Just Don't Get It [TM]. When someone tells you > something is "an invasive programming model," they're telling you > they're not looking for the opportunity to rewrite their whole software > architecture from scratch just because you personally think SoA might be > more kewl than AoS. Well, if you want to optimise your program and you wrote it badly the first time... ;-) > > I'd like more, specific examples here. What determines the > > accept or reject? > > What is the consequence of an accept or reject? > > I don't have time to educate you on the fundamentals of 3D graphics > processing. You can find all the classical, basic information via > newsgroups such as comp.graphics.algorithms. If you want to save a lot > of time going up the learning curve, you could post something like "AoS > or SoA?" and see what people say. Or Google the archives for them. I would say that I am somewhat familiar with the concepts involved. My concern is that I have only ever come across your optimisation problems in the context of shoehorning algorithms into using flat containers. > So was it 'dabbler' or not then? I suppose you think that the > implementation complexity of multiresolution representations is > desireable in an industry where a significantly faster card ships every > 6 months? Absolutely. It separates the men from the boys... Cheers, Jon. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners