From: "David McClain"
To: "caml"
Subject: [Caml-list] Wish lists...
Date: Mon, 2 Aug 2004 07:33:52 -0700

Some good points here by Ville-Pertti.

Indeed, the Scientific mode requires a balanced modulus operation on each array index, not the one presently offered by OCaml's Pervasives. But this is used in lieu of bounds checking anyway, and the world has come to accept the slight cost of array bounds checking.

There are really two issues that got mixed together here, only because BigArray mixes them up... One is the use of Scientific mode for some arrays. The other is memory-mapped arrays. These really are two separate issues, and the extra cost of accessing mmapped arrays is worth paying when the alternative is slower buffered file I/O. It would not be an acceptable cost for normal memory-bound arrays.

Some processors do have alignment requirements, but every file system I was referring to guarantees a minimal alignment based on the underlying array element type, and these alignments generally coincide with the most stringent alignment requirements in use today. Some processors, like the G4, appear to be more lax about alignment, but my bet is that misaligned data causes some slowdown. I think the x86 architectures behave this way too.

However, you do raise an interesting point about endianness. The more portable file formats have generally adopted network byte ordering, often by incorporating the old Sun XDR data representations, and for memory-mapped arrays that conversion would indeed be an extra cost. But even so, it would be far faster than buffered file I/O. My own tests show that a more or less random access pattern over an mmapped array is 200 times faster than fread/fwrite-style data access, so any additional machine cycles can easily be hidden in that performance difference.
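[A minimal sketch of the index arithmetic being contrasted above, assuming that by a "balanced" modulus what is wanted is a reduction that always lands in [0, n), unlike the built-in (mod), for which (-1) mod 5 = -1. The names wrap_index and get_wrapped are invented for illustration, not part of any existing library.]

  (* Sketch only: wrap-around index reduction into [0, n), in contrast
     to Pervasives' (mod), which follows the sign of the dividend.      *)
  let wrap_index i n =
    let r = i mod n in
    if r < 0 then r + n else r

  (* Hypothetical wrap-around accessor, standing in for a "Scientific
     mode" array read that uses the modulus in lieu of a bounds check.  *)
  let get_wrapped (a : float array) i =
    a.(wrap_index i (Array.length a))

  let () =
    let a = [| 1.0; 2.0; 3.0; 4.0; 5.0 |] in
    Printf.printf "%g\n" (get_wrapped a (-1))   (* index -1 wraps to 5 *)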
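[On the mmapping side, a minimal sketch of the kind of access being compared, assuming a file of native-endian 64-bit floats; the filename argument and the helper names map_doubles and sample are made up for illustration. Array1.map_file is the Bigarray entry point available in OCaml of this era; more recent OCaml releases moved roughly the same functionality to Unix.map_file, but the idea is unchanged. Link against the unix and bigarray libraries.]

  (* Map a whole file of 8-byte floats as a Bigarray; the validity
     tests happen once, at mapping time.                               *)
  let map_doubles filename =
    let fd = Unix.openfile filename [Unix.O_RDONLY] 0 in
    let n = (Unix.fstat fd).Unix.st_size / 8 in
    let a =
      Bigarray.Array1.map_file fd Bigarray.float64 Bigarray.c_layout false n
    in
    Unix.close fd;   (* the mapping remains valid after closing the fd *)
    a

  (* Random accesses go straight through the page cache: no seeks and
     no stdio buffer refills, which is where the large speedup over
     fread/fwrite-style access comes from.                             *)
  let sample a indices =
    Array.fold_left (fun acc i -> acc +. Bigarray.Array1.get a i) 0.0 indices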
But again, let's separate these two issues. I generally know when I'm accessing an mmapped array and when I'm not; I had to offer up a filename in order to do the mmapping. The only reason these two conversation threads merged is that when I read the BigArray documentation, I found that it offers a primitive form of mmapped access in addition to normal memory-bound array access.

Not sure what multiple mappings you were referring to... I meant to allow a kind of scatter-gather COW on normal memory-bound arrays. Memory-mapped arrays are a problem apart from this.

Despite what might appear to be a cost overhead, the savings can be quite significant when combined with smart array slicing and sectioning. For example, in my NML, whenever I do an array slice (more complex operations than BigArray supports), what I actually do is pay the price of all the if-then-else branching on only the first descent, generating a tree of lambda closures on the way back out, so that all the actual copying occurs without any more testing along the way. Sort of like reaching down your throat and pulling yourself inside out... heh! These compound slicings are enormously faster than conventional imperative logic. So while some operations are more costly, others benefit greatly from higher-order logic. (A generic sketch of this decide-once idea appears below, after the end of the message.)

In fact, a simple-minded analysis shows that if you ever intend to read or write an array that requires representation mutation, then it pays to simply create a native double array once, pay the cost of the representation mutation just once, and then allow repeated, faster, non-mutating accesses to the underlying data. Keeping the array around in a foreign format just adds incremental costs that will exceed this one copying cost if you hit every element several times. But as often as not, we do slice interesting sections from the data. Not sure if this ever happens without first hitting every element in a vectorized math op... my guess is no, and so the cost of copying must occur no matter what.

David McClain
Senior Corporate Scientist
Avisere, Inc.
+1.520.390.7738 (USA)
david.mcclain@avisere.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr
Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs
FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
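[A generic sketch of the decide-once slicing idea referred to in the message above: analyse the slice a single time, and hand back a closure that then copies with no further testing. This is an illustration of the general technique, not the NML implementation itself; the slice record and compile_slice are invented names.]

  (* A slice specification: start index, stride, number of elements. *)
  type slice = { start : int; stride : int; len : int }

  (* All the if-then-else tests happen here, exactly once; the result
     is a closure that copies without re-testing on each call.        *)
  let compile_slice (src : float array) (s : slice) : float array -> unit =
    if s.stride <= 0 || s.len < 0 || s.start < 0
       || s.start + (s.len - 1) * s.stride >= Array.length src
    then invalid_arg "compile_slice";
    if s.stride = 1 then
      (* contiguous case: one blit, no per-element work at all *)
      (fun dst -> Array.blit src s.start dst 0 s.len)
    else
      (* strided case: a tight loop with the stride baked into the closure *)
      (fun dst ->
         for i = 0 to s.len - 1 do
           dst.(i) <- src.(s.start + i * s.stride)
         done)

  let () =
    let src = Array.init 100 float_of_int in
    let copy_evens = compile_slice src { start = 0; stride = 2; len = 50 } in
    let dst = Array.make 50 0.0 in
    copy_evens dst    (* repeated calls pay no slicing tests *)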