From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA10041; Thu, 29 Apr 2004 17:16:32 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA10047 for ; Thu, 29 Apr 2004 17:16:31 +0200 (MET DST) Received: from rabelais.socialtools.net (rabelais.socialtools.net [81.2.94.243]) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i3TFGTjq002576 for ; Thu, 29 Apr 2004 17:16:30 +0200 Received: by rabelais.socialtools.net (Postfix, from userid 108) id 95B6C232DF; Thu, 29 Apr 2004 16:16:29 +0100 (BST) Received: from socialtools.net (chaucer.socialtools.net [81.2.94.242]) by rabelais.socialtools.net (Postfix) with ESMTP id A9AF5232DB; Thu, 29 Apr 2004 16:16:28 +0100 (BST) Message-ID: <40911C31.3080601@socialtools.net> Date: Thu, 29 Apr 2004 16:16:01 +0100 From: Benjamin Geer User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113 X-Accept-Language: en-gb, en, fr, it MIME-Version: 1.0 To: Richard Jones Cc: caml-list@inria.fr Subject: Re: [Caml-list] Re: Common IO structure References: <016401c42bc4$b6438840$19b0e152@warp> <20040428.004358.45522587.yoriyuki@mbg.ocn.ne.jp> <016501c42c73$24e64b30$ef01a8c0@warp> <20040428.015800.126758722.yoriyuki@mbg.ocn.ne.jp> <408EEE3E.7050008@socialtools.net> <20040428034415.GB19564@complete.org> <40902265.9040702@socialtools.net> <20040428214442.GE10198@excelhustler.com> <20040428224130.GA30256@redhat.com> <4090EC37.2070200@socialtools.net> <20040429120342.GB28018@redhat.com> In-Reply-To: <20040429120342.GB28018@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on rabelais.socialtools.net X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham version=2.63 X-Miltered: at nez-perce by Joe's j-chkmail ("http://j-chkmail.ensmp.fr")! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 buffer:01 unicode:01 unicode:01 bytes:02 bytes:02 string:03 string:03 wrote:03 encoding:04 converting:05 converting:05 structure:06 uses:06 benjamin:07 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk Richard Jones wrote: > Actually me too. 90% of my file IO requirement is to slurp a whole > file into a string Sometimes I also do get very large messages, so it's nice if, when converting from, say, Arabic EBCDIC encoding to Unicode, I don't have to read the entire message as EBCDIC bytes, then convert the whole thing to Unicode at once. It uses much less memory if I can read, say, 1K of EBCDIC bytes, convert them to Unicode characters, append them to a string buffer, and repeat until done. Another thing that I get fairly often (though less often) is a file containing many messages separated by some delimiter. In this case I want to read through the file, converting bytes to Unicode characters as I go along, and looking for the delimiter until I have a whole message in a string. I don't want to read the whole file at once, because it might be far too big. Ben ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners