From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id LAA12401; Sun, 25 Jul 2004 11:08:54 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id LAA12267 for ; Sun, 25 Jul 2004 11:08:53 +0200 (MET DST) Received: from invasion.mail.pas.earthlink.net (invasion.mail.pas.earthlink.net [207.217.120.254]) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i6P98pEV025989 for ; Sun, 25 Jul 2004 11:08:51 +0200 Received: from 65-101-178-163.tcsn.qwest.net ([65.101.178.163] helo=dylan) by invasion.mail.pas.earthlink.net with asmtp (Exim 4.34) id 1Boezy-0003Bc-7K for caml-list@inria.fr; Sun, 25 Jul 2004 02:08:50 -0700 Message-ID: <000c01c47227$226c7240$0201a8c0@dylan> From: "David McClain" To: "caml" References: <1090663185.15206.52.camel@qrnik> Subject: Re: [Caml-list] Bigarray is a pig Date: Sun, 25 Jul 2004 02:09:42 -0700 MIME-Version: 1.0 Content-Type: text/plain; charset="ISO-8859-2" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1437 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1441 X-ELNK-Trace: 7a0ab3eafc8cf994b22988ad1c62733440683398e744b8a4144c8b7fefb75e9401dcd93493553fb0667c3043c0873f7e350badd9bab72f9c350badd9bab72f9c X-Originating-IP: 65.101.178.163 X-Miltered: at nez-perce with ID 410378A3.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; mcclain:01 dmcclain:01 caml-list:01 datasets:01 buffered:01 buffered:01 stumbling:01 buffer:01 randomized:01 randomized:01 slower:01 slower:01 subroutine:01 mcclain:01 kernel:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk Here is a case where even paying the price of a function call, no matter how indirect, is well worth it... I just finished an implementation of memory mapped files for binary array access from scientific datasets stored on disk. The implementation was in C++ along with natural usage pseudo-pointers. Every array access has to check whether or not the position is currently mapped into memory. If so, it simply returns that address for get or set. If not, then it has to sync the dirty pages to disk, then remap another segment of the file into memory before continuing as before. I am in the process of writing an OCaml interface to this as we speak. But is it worth doing? My tests show that for pure sequential writing, the memory mapped I/O is about 350% faster than using buffered I/O with fwrite(). That's nice... but for pseudo-random access, where I sequentially march upward in memory and write that address and the two surrounding addresses at +/-4KB, which is similar to some scientific array access patterns, the speed of the memory mapped I/O is 200 times faster (20,000%) than using buffered I/O. Of course, I chose that 4 KB as a ticklish offset because it both matches the page frame size and will cause some stumbling in the memory mapped I/O. And it also happens to be the more or less standard size of a buffer for fwrite. I'm writing 24 MBytes of data overall, one longword at a time. The effective throughput is about 72 MB/sec for sequential access and 100 MB/sec for randomized access, compared with 20 MB/sec sequential and 0.5 MB/sec randomized for fwrite buffered I/O. Disks are slow. File systems are slower yet. By letting the Mach kernel handle the I/O directly on page faults, I end up squeezing a lot more performance from the data handling system. This is still orders of magnitude slower than my effective computation rate, and so the cost of all the bounds checking and subroutine calling is lost in the noise. [Tests were performed on a stock 1.25 GHz G4 eMac]. David McClain Senior Corporate Scientist Avisere, Inc. +1.520.390.7738 (USA) david.mcclain@avisere.com ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners