caml-list - the Caml user's mailing list
From: Xavier Leroy <xavier.leroy@inria.fr>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Unix file descriptors vs. in/out channels
Date: Mon, 18 Aug 2014 18:33:47 +0200
Message-ID: <53F22AEB.5070506@inria.fr>
In-Reply-To: <CAHR=VkwXynFAJm13MwKrsJNgx4HeW_LU03BQxqD4S7prG1A8Rw@mail.gmail.com>

Hi Thomas,

> [problem] I am a bit puzzled w.r.t. the interplay between Pervasives
> functions that operate on in/out channels and the Unix functions that
> operate on file descriptors. From the documentation, I assume that it
> is not possible to close an (input) channel that was created using
> Unix.in_channel_of_descr without closing the associated file
> descriptor. 

Correct.

> Therefore, I assume that I cannot use
> Unix.in_channel_of_descr and Unix.out_channel_of_descr more than once
> for my file-descriptor (because otherwise, these channels would not be
> reclaimed). 

I don't quite understand your remark.  You need to close (explicitly
and at once) all in_channels and out_channels associated with your
file_descr, once you're done with it.  The first close_in/close_out
will close the underlying FD, and the others will ignore the fact that
the FD is already closed.

> But is it safe to use both kinds of channels?

Sometimes :-)  An example is Unix.open_connection, which gives you a
pair of in/out channels on the same socket.  The only caveat is that
writes on out_channels are buffered, so you need to flush explicitly
to make sure the data is actually sent over the socket.
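The flush requirement can be sketched without a network connection.  The
example below is only an illustration under stated assumptions: it uses
Unix.socketpair in place of a real Unix.open_connection, and the function
name "roundtrip" is hypothetical.  Each end of the pair gets an in_channel
and an out_channel on the same descriptor, and every write is followed by
an explicit flush before the other end reads.

```ocaml
(* Sketch only: requires the unix library.  [roundtrip] is a hypothetical
   name; Unix.socketpair stands in for a real client/server connection. *)
let roundtrip (msg : string) : string =
  let a, b = Unix.socketpair Unix.PF_UNIX Unix.SOCK_STREAM 0 in
  (* One in/out channel pair per descriptor, as Unix.open_connection
     provides for its socket. *)
  let ic_a = Unix.in_channel_of_descr a in
  let oc_a = Unix.out_channel_of_descr a in
  let ic_b = Unix.in_channel_of_descr b in
  let oc_b = Unix.out_channel_of_descr b in
  output_string oc_a (msg ^ "\n");
  flush oc_a;  (* without this, the line may stay in oc_a's buffer *)
  let got = input_line ic_b in
  output_string oc_b (String.uppercase_ascii got ^ "\n");
  flush oc_b;
  let reply = input_line ic_a in
  (* Close every channel; the second close on each descriptor uses the
     _noerr variant so that an already-closed FD is ignored. *)
  close_out oc_a; close_in_noerr ic_a;
  close_out oc_b; close_in_noerr ic_b;
  reply
```

The messages are kept short so that a single-threaded program can write
and then read without blocking on a full socket buffer.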

> [summary] I would like to open a file in read-write mode, and use it
> (mainly) to stream a big data-structure into it and (sometimes) read
> the content of this data-structure.

For a file opened in RW mode, the problem is that reads through
the in_channel may not see the data written through the out_channel,
even if you religiously flush the out_channel before reading anything.
The reason is that in_channels are also buffered, and may hold stale
data corresponding to the state of the file before recent writes.  And
there is no flush operation for in_channels...

> Btw, while playing with this problem, I found the following strange
> behavior: if I uncomment the second line in debug (see below), I can
> read data from the input channel, while if the debug line is commented out,
> reading from the channel yields an End_of_file exception. Is this
> expected?

This is another gotcha :-)  in_channels and out_channels maintain
(their idea of) the current position in the file.  This helps avoid
unnecessary "lseek" operations to determine the current position and
length.  However, if you share a FD between two channels, each
channel's idea of the current position can become inconsistent with the
actual position of the FD.
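A small sketch of this mismatch, under stated assumptions: the function
name "positions" and the temporary file are hypothetical, and the write
goes through the descriptor directly instead of through a second channel.
The in_channel is created while the FD is at offset 0, so its cached
position is not updated when the FD advances behind its back.

```ocaml
(* Sketch only: requires the unix library.  [positions] is a hypothetical
   name used for illustration. *)
let positions () : int * int =
  let path = Filename.temp_file "pos_demo" ".dat" in
  let fd = Unix.openfile path [Unix.O_RDWR] 0o600 in
  (* The channel records its position when it is created (offset 0). *)
  let ic = Unix.in_channel_of_descr fd in
  (* Advance the FD without the channel knowing about it. *)
  let _written = Unix.write_substring fd "hello\n" 0 6 in
  let chan_pos = pos_in ic in                    (* the channel's cached idea *)
  let fd_pos = Unix.lseek fd 0 Unix.SEEK_CUR in  (* the kernel's actual offset *)
  close_in ic;  (* also closes fd *)
  Sys.remove path;
  (chan_pos, fd_pos)
```

The two components of the result differ: the channel still reports the
offset it saw at creation, while the FD sits after the six bytes written.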

Bottom line: for your intended application, it's better to use Unix
functions exclusively.  The trick with an (in_channel, out_channel) pair
does work pretty well for sockets, named pipes and terminals, though.
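As a rough sketch of the "Unix functions exclusively" approach for a
read-write file: the helpers "append" and "read_at" and the temporary file
below are hypothetical names chosen for illustration, not part of the Unix
library.  Writes go to the end of the file with Unix.lseek and Unix.write,
and reads reposition the FD explicitly, so there is no channel buffer that
could hold stale data.

```ocaml
(* Sketch only: requires the unix library.  [append], [read_at] and [demo]
   are hypothetical helpers. *)

(* Append [s] at the end of the file; return the offset where it starts. *)
let append (fd : Unix.file_descr) (s : string) : int =
  let pos = Unix.lseek fd 0 Unix.SEEK_END in
  ignore (Unix.write_substring fd s 0 (String.length s));
  pos

(* Read exactly [len] bytes starting at offset [pos]. *)
let read_at (fd : Unix.file_descr) (pos : int) (len : int) : string =
  ignore (Unix.lseek fd pos Unix.SEEK_SET);
  let buf = Bytes.create len in
  let rec fill ofs =
    if ofs < len then
      match Unix.read fd buf ofs (len - ofs) with
      | 0 -> raise End_of_file   (* file is shorter than requested *)
      | n -> fill (ofs + n)
  in
  fill 0;
  Bytes.to_string buf

let demo () : string * string =
  let path = Filename.temp_file "rw_demo" ".dat" in
  let fd = Unix.openfile path [Unix.O_RDWR] 0o600 in
  let p1 = append fd "first" in
  let p2 = append fd "second" in
  let a = read_at fd p1 5 in
  let b = read_at fd p2 6 in
  Unix.close fd;
  Sys.remove path;
  (a, b)
```

Every call goes straight to the descriptor, so a read after a write sees
the written data, at the cost of one system call per operation and no
buffering unless the application adds its own.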

- Xavier Leroy

