caml-list - the Caml user's mailing list
From: Enrico Tassi <enrico.tassi@inria.fr>
To: caml-list@inria.fr
Subject: Re: [Caml-list] unmarshaling large data from string on 32 bits
Date: Wed, 4 Feb 2015 17:47:03 +0100
Message-ID: <20150204164702.GA14942@birba.invalid>
In-Reply-To: <CAPFanBFt_=URJLAj3vSo9bpKrgJt6=+fLdXA8HSik3ga8SpKdA@mail.gmail.com>

On Mon, Feb 02, 2015 at 01:00:53PM +0100, Gabriel Scherer wrote:
> If you don't mind going through a temporary file,
> Marshal.{to,from}_channel should work fine.

Thanks for the suggestion.  If Windows is as smart as Linux, then a
tmpfile should work fine.  If not, well, better than nothing.
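
For reference, a minimal sketch of the temporary-file route, using only
the standard library.  The function name roundtrip_via_tmpfile is made up
for illustration, and in the real setting the writer and the reader would
be different processes sharing the path rather than one function doing both:

```ocaml
(* Round-trip a value through a temporary file with the channel-based
   Marshal API, so the serialized data never has to fit in one string. *)
let roundtrip_via_tmpfile (v : 'a) : 'a =
  let path = Filename.temp_file "marshal" ".bin" in
  let cleanup () = try Sys.remove path with Sys_error _ -> () in
  try
    (* Binary mode matters on Windows: no newline translation. *)
    let oc = open_out_bin path in
    Marshal.to_channel oc v [];
    close_out oc;
    let ic = open_in_bin path in
    let result : 'a = Marshal.from_channel ic in
    close_in ic;
    cleanup ();
    result
  with e ->
    cleanup ();
    raise e
```

As with any use of Marshal, the reader has to know the expected type;
nothing is checked at run time.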

> You should consider opening a problem report to OCaml upstream (
> http://caml.inria.fr/mantis/ ) explaining the use-case and asking for
> a large-string-safe API (eg. taking and returning lists of strings).

The chain of workarounds that leads here is long and ugly :-/

1. I have a problem with threads on Windows and (rarely) on Linux.
   The model is simple: Coq sits between 1 user interface and many
   (usually only 1) worker processes.  Coq's main thread talks to the
   UI via a socket and does blocking calls; worker manager
   threads (1 per worker) do the same with their respective workers.
   At some point all threads are blocked reading. Then
   a worker process writes data but no thread is woken up.
   On Linux I need at least 2 worker manager threads to see the problem,
   on Windows 1 is enough.  All that using the channels API and Marshal.

   OK, I say, let's go back to good old Unix.select to read only when some
   data is there.
 
2. The Unix module lets you get the fd number associated to the channel
   and you can use Unix.select with it.  And you can still use the channels
   API to Marshal.from_channel.  Looks good, but I still have a problem.
   I have LARGE and small messages.  The small ones fit easily in the
   channel's buffer.  Result: you have 2 "values" in the buffers of the
   OS.  Select tells you that you can read.  You Marshal.from_channel.
   Both values are moved into the channel buffer, but clearly
   "input_value" reads only the first one.  You select again, but this
   time the OS buffers are empty.  So you wait until the next message
   arrives to discover the one forgotten in the channel buffer.

   I can't bet all my money on the correctness of this diagnosis,
   but that seemed to be the cause at the time.  Artificially inflating
   messages was working, but this is not what you want.  There is no
   API, at least in 3.12, to peek at a channel and see if there is
   data (and if so, don't call select).  I tried with non-blocking
   channels, but I could not succeed in using input_value there (I don't
   recall if input_value is always blocking or something else went
   wrong).

   OK, I say, let's not use the channels and do good old Unix select and
   read.  Unfortunately the size of buffers (strings) is limited and the
   LARGE messages I have do not fit.
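
To make the last step concrete, here is a rough sketch of reading one
marshaled value straight from a file descriptor, with no channel buffer
that could swallow a second message.  It is not the actual Coq code: the
names really_read, read_value and select_and_read are made up, it uses
Bytes (so a more recent OCaml than 3.12), needs the unix library, ignores
EINTR, and assumes the stream carries nothing but values written by
Marshal.  The size check marks the spot where LARGE messages fail on
32 bits:

```ocaml
(* Read exactly [len] bytes from [fd] into [buf] starting at [off].
   Unix.read may return fewer bytes than requested, so loop. *)
let rec really_read fd buf off len =
  if len > 0 then begin
    let n = Unix.read fd buf off len in
    if n = 0 then raise End_of_file;
    really_read fd buf (off + n) (len - n)
  end

(* Read one marshaled value from [fd]: first the fixed-size header,
   then exactly the number of data bytes the header announces. *)
let read_value (fd : Unix.file_descr) : 'a =
  let header = Bytes.create Marshal.header_size in
  really_read fd header 0 Marshal.header_size;
  let data_len = Marshal.data_size header 0 in
  (* The whole message must fit in a single buffer; on 32 bits
     Sys.max_string_length is small, so large messages are rejected. *)
  if data_len > Sys.max_string_length - Marshal.header_size then
    failwith "read_value: message does not fit in a single buffer";
  let buf = Bytes.create (Marshal.header_size + data_len) in
  Bytes.blit header 0 buf 0 Marshal.header_size;
  really_read fd buf Marshal.header_size data_len;
  Marshal.from_bytes buf 0

(* Wait up to [timeout] seconds for [fd] to become readable and, if it
   does, read exactly one value from it. *)
let select_and_read fd timeout =
  match Unix.select [fd] [] [] timeout with
  | [], _, _ -> None
  | _ -> Some (read_value fd)
```

Since only the bytes of one value are consumed per call, a second small
message stays in the OS buffer and the next select still reports the
descriptor as readable.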

So yes, Marshal.from_string_list would be an option here.

I still have around a simple example that locks up on Windows,
I'll open a bug for that.

Best,
-- 
Enrico Tassi
