From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 29 Apr 2004 09:02:40 -0500
From: John Goerzen
To: Yamagata Yoriyuki
Cc: ben@socialtools.net, caml-list@inria.fr
Subject: Re: [Caml-list] Re: Common IO structure
Message-ID: <20040429140240.GA12640@excelhustler.com>
References: <20040428214442.GE10198@excelhustler.com> <20040429.192746.48535796.yoriyuki@mbg.ocn.ne.jp> <20040429130335.GA11323@excelhustler.com> <20040429.224036.75426712.yoriyuki@mbg.ocn.ne.jp>
In-Reply-To: <20040429.224036.75426712.yoriyuki@mbg.ocn.ne.jp>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.5.5.1+cvs20040105i

On Thu, Apr 29, 2004 at 10:40:36PM +0900, Yamagata Yoriyuki wrote:
> > > > > OK, but then you can leave out readline(), readlines() and xreadlines(),
> > > > > because they don't make any sense unless you've already dealt with
> > > > > character encodings.
> > > >
> > > > No, they can simply be implemented in terms of read().
> > >
> > > It will break when UTF-16/UTF-32 are used. The line separator should
> > > be handled after code conversion. At least that is the idea of
> > > Unicode standard. (But Since Unicode standard is challenged by
> > > reality in every aspect, maybe nobody cares.)
> >
> > You are missing the point. read() could handle the code conversion.
>
> No, what I wanted to say is that the line separator should be handled
> in the Unicode level, not the byte-character level. Your design
> assumes read() always returns new line characters as in ASCII. This
> would not hold when read() returns UTF-16/UTF-32.

I don't see why that is the case. If read() returns UTF-16 data,
readlines() works with it, and would of course be scanning it for a
UTF-16 EOL character or string. I don't see where that's the problem.

-- John
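
To make the disagreement concrete, here is a minimal OCaml sketch, not taken from any proposed library. It assumes read() has already handed back a string of UTF-16LE bytes, and it treats only U+000A as the line terminator (no CR LF, no U+2028). The names find_lf_utf16le, lines_utf16le and lines_bytes are hypothetical. It compares a scan over 16-bit code units with a plain byte-level split on '\n'.

```ocaml
(* Hypothetical helper: byte offset of the first LINE FEED (U+000A) code
   unit in a UTF-16LE buffer, checking only 2-byte aligned positions
   starting from [start] (which is assumed to be even). *)
let find_lf_utf16le (s : string) (start : int) : int option =
  let n = String.length s in
  let rec go i =
    if i + 1 >= n then None
    else if s.[i] = '\x0A' && s.[i + 1] = '\x00' then Some i
    else go (i + 2)
  in
  go start

(* Split a UTF-16LE buffer into lines. Each line is still UTF-16LE
   encoded and the terminating U+000A is dropped. *)
let lines_utf16le (s : string) : string list =
  let n = String.length s in
  let rec go start acc =
    if start >= n then List.rev acc
    else
      match find_lf_utf16le s start with
      | Some i -> go (i + 2) (String.sub s start (i - start) :: acc)
      | None -> List.rev (String.sub s start (n - start) :: acc)
  in
  go 0 []

(* Byte-level split on '\n', as an ASCII-only readline would do. *)
let lines_bytes (s : string) : string list = String.split_on_char '\n' s

(* "a", LINE FEED, "b" encoded as UTF-16LE. *)
let sample = "a\x00\x0A\x00b\x00"

(* U+010A followed by LINE FEED: U+010A is the bytes 0A 01 in UTF-16LE,
   so its low byte looks like '\n' to a byte-level scanner. *)
let tricky = "\x0A\x01\x0A\x00"
```

On sample, lines_utf16le yields the two encoded lines "a" and "b", while lines_bytes cuts between the two bytes of the LINE FEED code unit and leaves a stray zero byte at the start of the second piece. On tricky, the byte-level split also breaks inside U+010A. So a readlines() built on read() can work with UTF-16 data, but only if it knows the encoding and scans code units rather than bytes.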