From: Basile STARYNKEVITCH
Date: Wed, 19 Mar 2003 20:08:08 +0100
To: cashin@cs.uga.edu
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Unix.lseek versus Pervasives.pos

>>>>> "cashin" == cashin writes:

    cashin> Sorry if this shows up as a duplicate.
    cashin> Basile STARYNKEVITCH writes:

    Basile>> You apparently forgot to flush the channel.

OK, I made a stupid mistake (flushing is only for writes!), but my intuition was right in the sense that buffering has to be taken into account.

    cashin> Flushes are for writes, but even when using a test program
    cashin> that just reads, zero is returned when it appears that it
    cashin> shouldn't return zero. Compare the short ocaml program
    cashin> below to the comparable C version.

OK, but the problem is the same: the OCaml I/O subsystem manages its own internal buffering. Channels are not Unix file descriptors, but a buffering layer on top of them. See the source code (in particular ocaml/byterun/io.c and io.h) for details. In particular, a channel is implemented (from io.h) as

    struct channel {
      int fd;                     /* Unix file descriptor */
      file_offset offset;         /* Absolute position of fd in the file */
      char * end;                 /* Physical end of the buffer */
      char * curr;                /* Current position in the buffer */
      char * max;                 /* Logical end of the buffer (for input) */
      void * mutex;               /* Placeholder for mutex (for systhreads) */
      struct channel * next;      /* Linear chaining of channels (flush_all) */
      int revealed;               /* For Cash only */
      int old_revealed;           /* For Cash only */
      int refcount;               /* For flush_all and for Cash */
      char buff[IO_BUFFER_SIZE];  /* The buffer itself */
    };

where IO_BUFFER_SIZE is usually 4096 bytes.

The equivalent in C would be to mix lseek with a FILE, and that gives the same mess:

    /* file main.c */
    #include <stdio.h>      /* fopen, fread, fileno, printf */
    #include <string.h>     /* memset */
    #include <sys/types.h>  /* off_t */
    #include <unistd.h>     /* lseek */

    int main(void)
    {
      FILE *f = fopen("main.c", "r");
      char buf[1024];
      int fd;

      if (f == NULL)
        return 1;
      fd = fileno(f);
      memset(buf, '\0', sizeof(buf));
      /* buffered read of 10 bytes through stdio */
      fread(buf, 1, 10, f);
      /* ask the kernel where the underlying descriptor really is */
      printf("after reading \"%s\" lseek returns %d\n",
             buf, (int) lseek(fd, 0, SEEK_CUR));
      return 0;
    }

When I run the above file with tcc (www.tinycc.org) I get

    after reading " /* file " lseek returns 483

which is as messy as I expected: I asked for 10 bytes, but stdio had already read ahead, so the descriptor is at offset 483, not 10.

In short: never mix Unix.read (or other Unix I/O) with Pervasives.* channel operations on the same file descriptor.
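
To make the two notions of "position" concrete, here is a minimal sketch in C. It is an illustration under stated assumptions, not code from the OCaml runtime: mini_channel, mini_pos_in, read_and_report and demo are hypothetical names introduced only for this example. mini_pos_in shows the arithmetic that, as far as I understand, lies behind a buffered read position (descriptor offset minus the bytes read ahead but not yet consumed), and read_and_report compares ftell, which is buffer-aware, with lseek on the raw descriptor.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, stripped-down model of the channel fields quoted above:
   only what is needed to compute a logical read position. */
struct mini_channel {
  long offset;  /* absolute position of the descriptor in the file */
  char *curr;   /* next unread byte in the buffer */
  char *max;    /* end of the valid data in the buffer */
};

/* Logical position = descriptor offset minus the bytes that were
   read ahead into the buffer but not yet handed to the caller. */
long mini_pos_in(const struct mini_channel *c)
{
  return c->offset - (long) (c->max - c->curr);
}

struct positions {
  long logical;  /* what the buffered layer reports (ftell) */
  long raw;      /* what the kernel reports for the descriptor (lseek) */
};

/* Read n bytes (at most 64) through stdio, then report both positions.
   Both fields are -1 if the read fails. */
struct positions read_and_report(FILE *f, size_t n)
{
  char buf[64];
  struct positions p = { -1, -1 };

  if (n > sizeof buf)
    n = sizeof buf;
  if (fread(buf, 1, n, f) != n)
    return p;
  p.logical = ftell(f);
  p.raw = (long) lseek(fileno(f), 0, SEEK_CUR);
  return p;
}

/* Build a temporary file of `size` bytes, rewind it, and read 10 bytes back. */
struct positions demo(size_t size)
{
  struct positions p = { -1, -1 };
  FILE *f = tmpfile();

  if (f == NULL)
    return p;
  for (size_t i = 0; i < size; i++)
    fputc('x', f);
  fflush(f);
  rewind(f);
  p = read_and_report(f, 10);
  fclose(f);
  return p;
}
```

Calling demo(5000) from a main and printing both fields should give a logical position of 10, while the raw position is larger by however much stdio chose to read ahead (often one buffer's worth, but that is up to the C library). The same reasoning applies to channels: ask the buffered layer (pos_in, or ftell in C) for the position, not the descriptor underneath it.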

As usual with advice, this is "don't do what I did" advice; shame on me :-( I must admit that I once opened a channel and then did only Unix.read operations on it, but I commented this code (open-source code in the Poesia monitor) with

    (** IMPORTANT NOTICE: here outputxchannel_t-s are only used for
        their Unix file descriptor; no output takes actually place on
        the output channel; all output is thru Unix.write *)

and later

    (** the reply channel from filter to monitor [don't use the
        Pervasives.channel; using Unix] *)

My (bad) reasons for mixing channels and Unix file descriptors (besides perhaps a design bug) are that I use non-blocking Unix I/O and that I want precise control over the actual read and write system calls, so I don't want any extra buffering.

--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile at starynkevitch dot net
aliases: basile at tunes dot org = bstarynk at nerim dot net
8, rue de la Faïencerie, 92340 Bourg La Reine, France