Subject: Re: [Caml-list] Re: Common IO structure
From: skaller
Reply-To: skaller@users.sourceforge.net
To: Yamagata Yoriyuki
Cc: warplayer@free.fr, caml-list
Date: 28 Apr 2004 10:20:27 +1000

On Wed, 2004-04-28 at 02:58, Yamagata Yoriyuki wrote:
> I'm interested in empirical evidence, though.

You don't need it.
It is clear that in the common case (99% of all cases) the UTF-8 representation of ISO 10646 is the same as ASCII, and that 90% of the rest is Latin-1, which converts very, very fast. In these common cases the overhead of non-inlined function calls to convert characters one at a time could be very serious. Perhaps it is, and perhaps it isn't. Who knows?

Providing bulk conversions seems a prudent way to hedge your bets. It makes the interface richer, but there is a universal default for the bulk operations, so no burden is imposed on the implementor.

To add to the argument in favour of bulk conversions: in principle, doing *any* conversions on I/O is a bad idea. The order of priority is:

1. single point codecs
2. string codecs
3. IO codecs

It doesn't really make sense to have (1) and (3) and not (2).

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia
Check out the Felix programming language http://felix.sf.net
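
To illustrate the "universal default" point about bulk conversions, here is a minimal sketch in OCaml. It assumes the default bulk operation is simply the single point decoder applied in a loop, which is one plausible reading and not something the thread specifies. The names POINT_CODEC, Bulk_default, Latin1_point and Latin1_fast are hypothetical and do not come from any existing library.

```ocaml
(* Hypothetical single point codec interface (illustrative only). *)
module type POINT_CODEC = sig
  (* [decode s pos] decodes one code point starting at byte offset [pos]
     and returns (code point, number of bytes consumed). *)
  val decode : string -> int -> int * int
end

(* Universal default: a string (bulk) decoder derived from any single
   point codec, so an implementor who only writes [decode] still gets
   [decode_string] for free. *)
module Bulk_default (C : POINT_CODEC) = struct
  let decode_string (s : string) : int array =
    let acc = ref [] in
    let pos = ref 0 in
    while !pos < String.length s do
      let cp, n = C.decode s !pos in
      acc := cp :: !acc;
      pos := !pos + n
    done;
    Array.of_list (List.rev !acc)
end

(* Latin-1: every byte is exactly one code point. *)
module Latin1_point = struct
  let decode s pos = (Char.code s.[pos], 1)
end

(* Bulk decoding obtained from the default, with no extra work. *)
module Latin1_default = Bulk_default (Latin1_point)

(* A codec may instead supply a specialised bulk routine that avoids
   one function call and one tuple per character. *)
module Latin1_fast = struct
  include Latin1_point
  let decode_string s =
    Array.init (String.length s) (fun i -> Char.code s.[i])
end
```

Both Latin1_default.decode_string and Latin1_fast.decode_string return the same array for the same input; whether the specialised version is measurably faster is exactly the open question, and the sketch only shows that offering it costs the implementor nothing.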