From: "David McClain"
To: "caml"
Subject: [Caml-list] Wish lists...
Date: Mon, 2 Aug 2004 07:33:52 -0700

Some good points here by Ville-Pertti.

Indeed, the Scientific mode requires a balanced modulus operation on each array index, not the one presently offered by OCaml's Pervasives. But this is used in lieu of bounds checking anyway, and the world has come to accept the slight cost of array bounds checking.

There are really two issues that got mixed together here, only because BigArray mixes them up... One is the use of Scientific mode for some arrays. The other is memory-mapped arrays. These really are two separate issues, and the extra cost of accessing mmapped arrays is worth paying when the alternative is slower buffered file I/O. It would not be an acceptable cost for normal memory-bound arrays.

Some processors do have alignment requirements, but every file system I was referring to guarantees a minimal alignment based on the underlying array element type, and these alignments generally coincide with the most stringent alignment requirements in use today. Some processors, like the G4, appear to be more lax about alignment, but my bet is that misaligned data causes some slowdown. I think the x86 architectures behave this way too.

However, you do raise an interesting point about endianness. The more portable file formats have generally adopted network byte ordering, often by incorporating the old Sun XDR data representations, and for memory-mapped arrays that conversion would indeed be an extra cost. But even so, it would be far faster than buffered file I/O. My own tests show that a more or less random access pattern over an mmapped array is 200 times faster than fread/fwrite-style data access, so any additional machine cycles can easily be hidden in that performance difference.
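[A minimal sketch of the index arithmetic being contrasted above, assuming that by a "balanced" modulus what is wanted is a reduction that always lands in [0, n), unlike the built-in (mod), for which (-1) mod 5 = -1. The names wrap_index and get_wrapped are invented for illustration, not part of any existing library.]

  (* Sketch only: wrap-around index reduction into [0, n), in contrast
     to Pervasives' (mod), which follows the sign of the dividend.      *)
  let wrap_index i n =
    let r = i mod n in
    if r < 0 then r + n else r

  (* Hypothetical wrap-around accessor, standing in for a "Scientific
     mode" array read that uses the modulus in lieu of a bounds check.  *)
  let get_wrapped (a : float array) i =
    a.(wrap_index i (Array.length a))

  let () =
    let a = [| 1.0; 2.0; 3.0; 4.0; 5.0 |] in
    Printf.printf "%g\n" (get_wrapped a (-1))   (* index -1 wraps to 5 *)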
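[On the mmapping side, a minimal sketch of the kind of access being compared, assuming a file of native-endian 64-bit floats; the filename argument and the helper names map_doubles and sample are made up for illustration. Array1.map_file is the Bigarray entry point available in OCaml of this era; more recent OCaml releases moved roughly the same functionality to Unix.map_file, but the idea is unchanged. Link against the unix and bigarray libraries.]

  (* Map a whole file of 8-byte floats as a Bigarray; the validity
     tests happen once, at mapping time.                               *)
  let map_doubles filename =
    let fd = Unix.openfile filename [Unix.O_RDONLY] 0 in
    let n = (Unix.fstat fd).Unix.st_size / 8 in
    let a =
      Bigarray.Array1.map_file fd Bigarray.float64 Bigarray.c_layout false n
    in
    Unix.close fd;   (* the mapping remains valid after closing the fd *)
    a

  (* Random accesses go straight through the page cache: no seeks and
     no stdio buffer refills, which is where the large speedup over
     fread/fwrite-style access comes from.                             *)
  let sample a indices =
    Array.fold_left (fun acc i -> acc +. Bigarray.Array1.get a i) 0.0 indices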
But again, let's separate these two issues. I generally know when I'm accessing an mmapped array and when I'm not; I had to offer up a filename in order to do the mmapping. The only reason these two conversation threads merged is that when I read the BigArray documentation, I found that it offers a primitive form of mmapped access in addition to normal memory-bound array access.

Not sure what multiple mappings you were referring to... I meant to allow a kind of scatter-gather COW on normal memory-bound arrays. Memory-mapped arrays are a problem apart from this.

Despite what might appear to be a cost overhead, the savings can be quite significant when combined with smart array slicing and sectioning. For example, in my NML, whenever I do an array slice (more complex operations than BigArray supports), what I actually do is pay the price of all the if-then-else branching on only the first descent, generating a tree of lambda closures on the way back out, so that all the actual copying occurs without any more testing along the way. Sort of like reaching down your throat and pulling yourself inside out... heh! These compound slicings are enormously faster than conventional imperative logic. So while some operations are more costly, others benefit greatly from higher-order logic. (A generic sketch of this decide-once idea appears below, after the end of the message.)

In fact, a simple-minded analysis shows that if you ever intend to read or write an array that requires representation mutation, then it pays to simply create a native double array once, pay the cost of the representation mutation just once, and then allow repeated, faster, non-mutating accesses to the underlying data. Keeping the array around in a foreign format just adds incremental costs that will exceed this one copying cost if you hit every element several times. But as often as not, we do slice interesting sections from the data. Not sure if this ever happens without first hitting every element in a vectorized math op... my guess is no, and so the cost of copying must occur no matter what.

David McClain
Senior Corporate Scientist
Avisere, Inc.
+1.520.390.7738 (USA)
david.mcclain@avisere.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr
Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs
FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
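[A generic sketch of the decide-once slicing idea referred to in the message above: analyse the slice a single time, and hand back a closure that then copies with no further testing. This is an illustration of the general technique, not the NML implementation itself; the slice record and compile_slice are invented names.]

  (* A slice specification: start index, stride, number of elements. *)
  type slice = { start : int; stride : int; len : int }

  (* All the if-then-else tests happen here, exactly once; the result
     is a closure that copies without re-testing on each call.        *)
  let compile_slice (src : float array) (s : slice) : float array -> unit =
    if s.stride <= 0 || s.len < 0 || s.start < 0
       || s.start + (s.len - 1) * s.stride >= Array.length src
    then invalid_arg "compile_slice";
    if s.stride = 1 then
      (* contiguous case: one blit, no per-element work at all *)
      (fun dst -> Array.blit src s.start dst 0 s.len)
    else
      (* strided case: a tight loop with the stride baked into the closure *)
      (fun dst ->
         for i = 0 to s.len - 1 do
           dst.(i) <- src.(s.start + i * s.stride)
         done)

  let () =
    let src = Array.init 100 float_of_int in
    let copy_evens = compile_slice src { start = 0; stride = 2; len = 50 } in
    let dst = Array.make 50 0.0 in
    copy_evens dst    (* repeated calls pay no slicing tests *)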