From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA11176; Thu, 29 Apr 2004 13:23:33 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA11173 for ; Thu, 29 Apr 2004 13:23:32 +0200 (MET DST) Received: from rabelais.socialtools.net (rabelais.socialtools.net [81.2.94.243]) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i3TBNVjq001246 for ; Thu, 29 Apr 2004 13:23:31 +0200 Received: by rabelais.socialtools.net (Postfix, from userid 108) id 4D6E2232DF; Thu, 29 Apr 2004 12:23:31 +0100 (BST) Received: from socialtools.net (chaucer.socialtools.net [81.2.94.242]) by rabelais.socialtools.net (Postfix) with ESMTP id E37B1232DB; Thu, 29 Apr 2004 12:23:29 +0100 (BST) Message-ID: <4090E597.1080603@socialtools.net> Date: Thu, 29 Apr 2004 12:23:03 +0100 From: Benjamin Geer User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113 X-Accept-Language: en-gb, en, fr, it MIME-Version: 1.0 To: John Goerzen Cc: caml-list@inria.fr Subject: Re: [Caml-list] Re: Common IO structure References: <016401c42bc4$b6438840$19b0e152@warp> <20040428.004358.45522587.yoriyuki@mbg.ocn.ne.jp> <016501c42c73$24e64b30$ef01a8c0@warp> <20040428.015800.126758722.yoriyuki@mbg.ocn.ne.jp> <408EEE3E.7050008@socialtools.net> <20040428034415.GB19564@complete.org> <40902265.9040702@socialtools.net> <20040428214442.GE10198@excelhustler.com> In-Reply-To: <20040428214442.GE10198@excelhustler.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on rabelais.socialtools.net X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham version=2.63 X-Miltered: at nez-perce by Joe's j-chkmail ("http://j-chkmail.ensmp.fr")! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 glance:01 api:01 governing:99 api:01 readline:01 wrappers:01 buffering:01 readline:01 encodings:01 implemented:01 python:01 null:01 interfaces:01 bytes:02 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk John Goerzen wrote: > I'm looking at java.io right now. I count no less than 10 interfaces > and 50 classes. Let's say that I want to open up a file for read/write > access and be able to seek around in it. Looking at the class list, I > don't know if I want BufferedInputStream, BufferedOutputStream, > BufferedReader, BufferedWriter, CharArrayReader, CharArrayWriter, > DataInputStream, DataOutputStream, File, FileDescriptor, > FileInputStream, FileOutputStream, FileReader, FileWriter, InputStream, > InputStreamReader, OutputStream, OutputStreamWriter, RandomAccessFile, > Reader, or Writer. Really, I literally *do not know how to open a > simple file*. I would not call that intuitive. You actually have to *read* the documentation, not just glance at the class names. :) That's to be expected with a powerful API. Once you understand the key concepts governing the design of the API, it makes sense, it and becomes intuitive to select the classes you need. I tried to point out these concepts in the message you replied to. To read a file containing UTF-8 text, one line at a time: BufferedReader in = new BufferedReader (new InputStreamReader (new FileInputStream(filename), "UTF8")); while (true) { String line = in.readLine(); if (line == null) { break; } System.out.println(line); } This illustrates the main design concept I was talking about. InputStream is an abstract class; different implementations either know how to get input from a particular source (like FileInputStream), or are meant to be used as wrappers around another InputStream to add functionality (like buffering). All the classes whose names end in 'Stream' deal with bytes only; the ones whose names end in 'Reader' or 'Writer' deal with characters. See? It's easy once you know the pattern. To open a file for read/write access and be able to seek around in it: RandomAccessFile file = new RandomAccessFile(filename, "rw"); The methods in RandomAccessFile are pretty self-explanatory. Ben >>OK, but then you can leave out readline(), readlines() and xreadlines(), >>because they don't make any sense unless you've already dealt with >>character encodings. > > No, they can simply be implemented in terms of read(). A line is a chunk of text, not of bytes. I don't think it makes sense to deal with text unless you know what encoding it's in. >>Then, before you can divide text into lines, you also need to know which >>newline character(s) to use. This needs to be configurable >>programmatically > > That's pretty easy as a class variable I was only pointing out that neither Python (as far as I can tell) nor Java do this. Ben ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners