caml-list - the Caml user's mailing list
From: Goswin von Brederlow <goswin-v-b@web.de>
To: rixed@happyleptic.org
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Unix.single_write, doc and atomicity
Date: Sun, 17 Jun 2012 01:11:42 +0200
Message-ID: <871ulekg29.fsf@frosties.localnet>
In-Reply-To: <20120606130646.GA23115@securactive.lan> (rixed@happyleptic.org's message of "Wed, 6 Jun 2012 15:06:46 +0200")

rixed@happyleptic.org writes:

> I was reading the code for unix_single_write recently, and noticed two strange
> things that I prefer to discuss here before filling the bug tracker with
> dubious tickets.
>
> First, the doc of Unix.single_write claims that this function "attempts to write
> only once", although the underlying unix_single_write C function clearly loops
> around a write if the buffer being written is larger than the internal buffer
> (16K).  So if one large buffer is written and an error happens after the first
> 16K then the written file is now corrupt. Looks like a bug. At the very least,
> the documentation should state that single_write can atomically write only
> buffers smaller than 16K. But I'd prefer a solution based on dynamic memory
> allocation for the required iobuf.
>
> Then, while we are on atomicity, another annoying thing: as the giant lock is
> released during the write, some other thread may perform another single_write on
> the same file descriptor, and again if the written buffers are larger than 16K the
> simultaneous write loops may interleave the written chunks. In other words, the
> nice atomicity properties of Unix files opened with the O_APPEND flag no longer
> hold. Although the doc does not pretend that the writes will be atomic in this
> situation, this is quite sad.
>
> So this boils down to: unix_single_write really should perform a single write!
> What do you think?

As others have said, you were reading the wrong function.

But you do have a point about the copying of data in 16K chunks and
writing it out in bits and pieces. This destroys the atomicity of write
(as far as it is given at all; pipes guarantee only 4k atomic writes under
Linux). The copying is also a waste of CPU power.
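
To make the chunking concrete, here is a small illustration of how one
logical write is split when it goes through a fixed-size intermediate
buffer. It is only a sketch: the 16384 figure is the "16K" mentioned in
this thread, and the names chunk_size and chunks are made up for the
example, not taken from the OCaml runtime.

```ocaml
(* Illustration only: how one logical write of [len] bytes is split when
   the data is copied through a fixed-size intermediate buffer.
   [chunk_size] is the 16K figure quoted in the thread (an assumption
   about the buffer size, not read from the runtime). *)
let chunk_size = 16384

(* Return the (offset, length) pairs a chunked write loop would issue. *)
let chunks ofs len =
  let rec go ofs len acc =
    if len <= 0 then List.rev acc
    else
      let n = min len chunk_size in
      go (ofs + n) (len - n) ((ofs, n) :: acc)
  in
  go ofs len []

let () =
  (* A 40000-byte buffer turns into three separate write calls. *)
  List.iter
    (fun (o, n) -> Printf.printf "write at offset %d, %d bytes\n" o n)
    (chunks 0 40000)
```

Each of those pieces is a separate system call, so another writer on the
same descriptor can get in between any two of them.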

There is a nice solution to this: Bigarray. The data part of a Bigarray
is allocated outside the OCaml heap and is unmovable. That means
that C code can look up the address and size of the data, release the
runtime system and write the data in the background without having to
copy it first. The extunix module (http://extunix.forge.ocamlcore.org/)
has bindings for read, write, pread and pwrite functions with Bigarrays
that retain all the atomicity the system calls provide.
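
As a rough sketch of the caller's side, this is how data could be placed
in such a buffer using only the standard Bigarray module. The helper
names (buffer, bigarray_of_string, string_of_bigarray) are hypothetical;
the extunix call that would consume the buffer is deliberately left out,
since its exact name is not given here.

```ocaml
(* A one-dimensional byte buffer whose data lives outside the OCaml heap. *)
type buffer =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Copy a string into a freshly allocated Bigarray. This copy happens
   once, in OCaml; a C binding can then use the data in place. *)
let bigarray_of_string (s : string) : buffer =
  let n = String.length s in
  let ba = Bigarray.Array1.create Bigarray.char Bigarray.c_layout n in
  String.iteri (fun i c -> Bigarray.Array1.set ba i c) s;
  ba

(* Inverse helper, useful for checking what is in the buffer. *)
let string_of_bigarray (ba : buffer) : string =
  String.init (Bigarray.Array1.dim ba) (fun i -> Bigarray.Array1.get ba i)

let () =
  let ba = bigarray_of_string "hello" in
  Printf.printf "%d bytes: %s\n"
    (Bigarray.Array1.dim ba) (string_of_bigarray ba)
```

In a real program the data would ideally be produced directly in the
Bigarray, so that no copy from a string is needed at all.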

Extunix also provides read, write, pread and pwrite functions for
strings with better defined behaviour on error:

----------------------------------------------------------------------
(** [all_write fd buf ofs len] writes up to [len] bytes from the string
    [buf], starting at offset [ofs], to file descriptor [fd].

    [all_write] repeats the write operation until all characters have
    been written or an error occurs. Returns fewer characters than
    requested on EAGAIN or EWOULDBLOCK, but never 0. Continues
    the write operation on EINTR. Raises a Unix.Unix_error exception
    in all other cases. *)
external unsafe_all_write: Unix.file_descr -> string -> int -> int -> int = "caml_extunix_all_write"

let all_write fd buf ofs len =
  if ofs < 0 || len < 0 || ofs > String.length buf - len
  then invalid_arg "ExtUnix.all_write"
  else unsafe_all_write fd buf ofs len

(** [single_write fd buf ofs len] writes up to [len] bytes from the string
    [buf], starting at offset [ofs], to file descriptor [fd].

    [single_write] attempts to write only once. Returns the number of
    characters written or raises a Unix.Unix_error exception. *)
external unsafe_single_write: Unix.file_descr -> string -> int -> int -> int = "caml_extunix_single_write"

let single_write fd buf ofs len =
  if ofs < 0 || len < 0 || ofs > String.length buf - len
  then invalid_arg "ExtUnix.single_write"
  else unsafe_single_write fd buf ofs len

(** [write fd buf ofs len] writes up to [len] bytes from the string
    [buf], starting at offset [ofs], to file descriptor [fd].

    [write] repeats the write operation until all characters have
    been written or an error occurs. Raises a Unix.Unix_error exception
    if 0 characters could be written before an error occurs. Continues
    the write operation on EINTR. Returns the number of characters
    written in all other cases. *)
external unsafe_write: Unix.file_descr -> string -> int -> int -> int = "caml_extunix_write"

let write fd buf ofs len =
  if ofs < 0 || len < 0 || ofs > String.length buf - len
  then invalid_arg "ExtUnix.write"
  else unsafe_write fd buf ofs len

(** [intr_write fd buf ofs len] writes up to [len] bytes from the string
    [buf], starting at offset [ofs], to file descriptor [fd].

    [intr_write] repeats the write operation until all characters have
    been written or an error occurs. Raises a Unix.Unix_error exception
    if 0 characters could be written before an error occurs. Does NOT
    continue on EINTR. Returns the number of characters written in all
    other cases. *)
external unsafe_intr_write: Unix.file_descr -> string -> int -> int -> int = "caml_extunix_intr_write"

let intr_write fd buf ofs len =
  if ofs < 0 || len < 0 || ofs > String.length buf - len
  then invalid_arg "ExtUnix.intr_write"
  else unsafe_intr_write fd buf ofs len

----------------------------------------------------------------------
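
For readers who want to see the error handling spelled out, the
documented behaviour of [write] above can be approximated in plain OCaml
on top of the standard Unix library. This is only a sketch of the
semantics, not the extunix implementation (which is C), and
write_retrying is a made-up name. Because it goes through
Unix.single_write_substring, it still copies the data and so gives none
of the atomicity discussed above; it also needs the unix library linked.

```ocaml
(* Sketch of the documented [write] semantics: retry until everything is
   written, continue on EINTR, raise only if nothing was written before
   an error, otherwise return the partial count. *)
let write_retrying fd buf ofs len =
  (* Same bounds check as above; written this way so that [ofs + len]
     is never computed and cannot overflow. *)
  if ofs < 0 || len < 0 || ofs > String.length buf - len
  then invalid_arg "write_retrying";
  let rec loop written =
    if written >= len then written
    else
      match
        Unix.single_write_substring fd buf (ofs + written) (len - written)
      with
      | 0 -> written                      (* no progress: stop, do not spin *)
      | n -> loop (written + n)
      | exception Unix.Unix_error (Unix.EINTR, _, _) -> loop written
      | exception (Unix.Unix_error _ as e) ->
          if written = 0 then raise e else written
  in
  loop 0
```

Dropping the EINTR case would give the [intr_write] behaviour, and doing
a single call without the loop corresponds to [single_write].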

MfG
        Goswin