From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id CAA24796; Wed, 28 Apr 2004 02:20:48 +0200 (MET DST)
X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id CAA24785 for <caml-list@pauillac.inria.fr>; Wed, 28 Apr 2004 02:20:46 +0200 (MET DST)
Received: from smtp1.adl2.internode.on.net (smtp1.adl2.internode.on.net [203.16.214.181])
	by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i3S0Kfjq020784
	for <caml-list@inria.fr>; Wed, 28 Apr 2004 02:20:45 +0200
Received: from [192.168.1.200] (ppp119-113.lns1.syd2.internode.on.net [150.101.119.113])
	by smtp1.adl2.internode.on.net (8.12.9/8.12.9) with ESMTP id i3S0KSZq007888;
	Wed, 28 Apr 2004 09:50:33 +0930 (CST)
Subject: Re: [Caml-list] Re: Common IO structure
From: skaller <skaller@users.sourceforge.net>
Reply-To: skaller@users.sourceforge.net
To: Yamagata Yoriyuki <yoriyuki@mbg.ocn.ne.jp>
Cc: warplayer@free.fr, caml-list <caml-list@inria.fr>
In-Reply-To: <20040428.015800.126758722.yoriyuki@mbg.ocn.ne.jp>
References: <016401c42bc4$b6438840$19b0e152@warp>
	 <20040428.004358.45522587.yoriyuki@mbg.ocn.ne.jp>
	 <016501c42c73$24e64b30$ef01a8c0@warp>
	 <20040428.015800.126758722.yoriyuki@mbg.ocn.ne.jp>
Content-Type: text/plain
Message-Id: <1083111626.9537.757.camel@pelican.wigram>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2 (1.2.2-4) 
Date: 28 Apr 2004 10:20:27 +1000
Content-Transfer-Encoding: 7bit
X-Miltered: at nez-perce by Joe's j-chkmail ("http://j-chkmail.ensmp.fr")!
X-Loop: caml-list@inria.fr
X-Spam: no; 0.00; caml-list:01 sourceforge:01 2004:99 yamagata:01 yoriyuki:01 10646:01 latin-:01 implementor:01 9660:01 glebe:01 nsw:01 snail:02 conversions:02 conversions:02 checkout:02 
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

On Wed, 2004-04-28 at 02:58, Yamagata Yoriyuki wrote:

> I'm interested in an emprical evidence, though.

You don't need it. It is clear that there are
common (99%) of all cases where UTF-8 representation
of ISO10646 is the same as ASCII, and 90% of the
rest using Latin-1 which converts very very fast.

In these common cases the overhead of non-inlined
function calls to convert characters could be very serious.

Perhaps it isn't and perhaps it is. Who knows?
Providing bulk conversions seems a prudent way to
hedge your bets. It makes the interface richer,
but there is a universal default for the bulk
operations, so no burden is imposed on the implementor.

To add to the argument in favour of bulk conversions:
in principle, doing *any* conversions on I/O is a bad
idea. The order of priority is:

	1. single point codecs
	2. string codecs
	3. IO codecs

Doesn't really make sense to have (1) and (3) and not (2).
	

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners