From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id QAA06629; Wed, 10 Mar 2004 16:25:46 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id QAA06044 for ; Wed, 10 Mar 2004 16:25:45 +0100 (MET) Received: from gw01.mail.saunalahti.fi (gw01.mail.saunalahti.fi [195.197.172.115]) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i2AFQ3KW022845 for ; Wed, 10 Mar 2004 16:26:03 +0100 Received: from aka.i.naked.iki.fi (naked.iki.fi [62.142.249.112]) by gw01.mail.saunalahti.fi (Postfix) with ESMTP id 57063111CFBD; Wed, 10 Mar 2004 17:25:43 +0200 (EET) Received: from naked by aka.i.naked.iki.fi with local (Exim 3.36 #1 (Debian)) id 1B15aY-0002tZ-00; Wed, 10 Mar 2004 17:25:42 +0200 To: Eric Dahlman Cc: skaller@users.sourceforge.net, caml-list@pauillac.inria.fr Subject: Re: [Caml-list] Bug with really_input under cygwin References: <51FE3429-7219-11D8-8BF5-000393914EAA@atcorp.com> <1078888018.2452.52.camel@pelican.wigram> From: Nuutti Kotivuori Date: Wed, 10 Mar 2004 17:25:42 +0200 In-Reply-To: <1078888018.2452.52.camel@pelican.wigram> (skaller@users.sourceforge.net's message of "10 Mar 2004 14:06:59 +1100") Message-ID: <87hdww4tc9.fsf@aka.i.naked.iki.fi> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Miltered: at nez-perce by Joe's j-chkmail ("http://j-chkmail.ensmp.fr")! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 bug:01 cygwin:01 sourceforge:01 2004:99 howdy:01 rant:01 'really:01 buffer:01 abstraction:01 caml:01 byte:01 byte:01 unlikely:02 binary:02 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk X-Status: X-Keywords: X-UID: 105 skaller@users.sourceforge.net wrote: > On Wed, 2004-03-10 at 09:30, Eric Dahlman wrote: >> Howdy all, >> >> I have some code which is reads in a whole file in and returns it >> as a string. If you have a master's degree in reading in between the rant, you probably picked out the right answer from the text below. But here it is as a simple answer: Loop doing 'input' on the file, until 'input' returns zero. 'really_input' is ofcourse nice and easy, but since you have no really proper way of knowing how large the entire file is going to be in the end, you need to make a decision with the buffer size anyway. Binary or non-binary mode only affects the \r\n -> \n translation while reading the file - and vice versa while writing. > The only correct way to do this is to read a block at a time > until you get a partial block. > > This is so EVEN in 'binary' mode, which is just another > ill conceived Unix hack :-) [...] > It is unfortunate that C and Unix do not provide a coherent > abstraction in this area. Even binary I/O is ill-conceived: [...] > C has been plagued by extremely ill considered functions. > Even the basic IO operation is not correctly defined. [...] > There is no such thing as 'the number of characters > in a file'. Perhaps there is a number of bytes in a file. [...] > In MS-DOS, files *always* consist of a number of 256 > byte blocks. It is impossible to have a file with > a non-256 byte multiple size. Of course, text files > uses an encoding with a Ctrl-Z at the end. [...] > Under Linux, the Standard for text encoding is UTF-8. [...] > I personally believe the easiest way to work around this > quagmire of malspecification is to > > (a) ONLY use 8 bit binary I/O > (b) ALWAYS read and write bytes > > even if you're processing text. Never depend on the > language or OS conversion functions, its very unlikely > they'll be right. Do all the conversions needed yourself. > At least when you find a problem you're not handling > correctly you can fix it. Luckily not everybody sees the world as glum :-) -- Naked ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners