caml-list - the Caml user's mailing list
From: "Gerd Stolpmann" <info@gerd-stolpmann.de>
To: "Enrico Tassi" <enrico.tassi@inria.fr>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] unmarshaling large data from string on 32 bits
Date: Thu, 5 Feb 2015 00:51:24 +0100
Message-ID: <6395dd48f0abe859ff44f5095631d32c.squirrel@gps.dynxs.de>
In-Reply-To: <20150204164702.GA14942@birba.invalid>

What about this: you change the protocol so that there is a single
character, say an 'X', before any marshalled value. The 'X' is something
you can use for non-blocking reads. So (if ch is the input channel):

let read_message ch =
  let fd = Unix.descr_of_in_channel ch in
  Unix.set_nonblock fd;
  let x = input_char ch in   (* raises Sys_blocked_io if no data is available *)
  assert (x = 'X');
  Unix.clear_nonblock fd;
  Marshal.from_channel ch

This will also work when there are several messages in the input buffer,
as input_char then simply succeeds. If you get a Sys_blocked_io, you can
even revert to using select() because you know that the buffer is empty
then.
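
To make the select() fallback concrete, here is a minimal sketch of both
sides of such a protocol. The names marker, send and receive are made up
for illustration, the code needs the unix library, and it assumes OCaml
4.02 or later for "match ... with exception". It ignores EINTR and
end-of-file handling.

```ocaml
(* Sketch only: [marker], [send] and [receive] are hypothetical names. *)
let marker = 'X'

(* Writer side: prefix every marshalled value with the marker byte. *)
let send oc v =
  output_char oc marker;
  Marshal.to_channel oc v [];
  flush oc

(* Reader side: try the channel first, and call select only once
   Sys_blocked_io shows that nothing is buffered. *)
let rec receive ic =
  let fd = Unix.descr_of_in_channel ic in
  Unix.set_nonblock fd;
  match input_char ic with
  | c ->
    Unix.clear_nonblock fd;
    assert (c = marker);
    (* Back in blocking mode, so a large value can be read in full. *)
    Marshal.from_channel ic
  | exception Sys_blocked_io ->
    Unix.clear_nonblock fd;
    (* Negative timeout: wait until the descriptor becomes readable. *)
    ignore (Unix.select [fd] [] [] (-1.0));
    receive ic
```

In a real program the select call would list the descriptors of all
channels being watched, not just this one.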

Gerd


> On Mon, Feb 02, 2015 at 01:00:53PM +0100, Gabriel Scherer wrote:
>> If you don't mind going through a temporary file,
>> Marshal.{to,from}_channel should work fine.
>
> Thanks for the suggestion, if Windows is as smart as Linux then a
> tmpfile should work fine.  If not, well, better than nothing.
>
>> You should consider opening a problem report to OCaml upstream (
>> http://caml.inria.fr/mantis/ ) explaining the use-case and asking for
>> a large-string-safe API (eg. taking and returning lists of strings).
>
> The chain of workarounds that leads here is long and ugly :-/
>
> 1. I have a problem with threads on Windows and (rarely) on Linux.
>    The model is simple, Coq sits between 1 user interface and many
>    (usually only 1) worker processes.  Coq's main thread talks to the
>    UI via a socket and does blocking calls; worker manager
>    threads (1 per worker) do the same with their respective workers.
>    At some point all threads are blocked reading. Then
>    a worker process writes data but no thread is woken up.
>    On Linux I need at least 2 worker manager threads to see the problem,
>    on Windows 1 is enough.  All that using the channels API and Marshal.
>
>    OK, I say, let's go back to the good old Unix.select to read only
>    when some data is there.
>
> 2. The Unix module lets you get the fd number associated to the channel
>    and you can use Unix.select with it.  And you can still use the
>    channels API to Marshal.from_channel.  Looks good but I still have a
>    problem.  I have LARGE and small messages.  The small ones fit
>    easily in the channel's buffer.  Result: you have 2 "values" in the
>    buffers of the OS.  Select tells you that you can read.  You
>    Marshal.from_channel.  Both values are moved into the channel buffer,
>    but clearly "input_value" reads only the first one.  You select
>    again, but this time the OS buffers are empty.  So you wait until the
>    next message arrives to discover the one forgotten in the channel
>    buffer.
>
>    I can't bet all my money on the correctness of this diagnosis,
>    but that seemed the cause at the time.  Artificially inflating
>    messages was working, but this is not what you want.  There is no
>    API, at least in 3.12, to peek at a channel and see if there is
>    data (and if so, don't call select).  I tried with non blocking
>    channels, but I could not succeed using input_value there (I don't
>    recall if input_value is always blocking or something else went
>    wrong).
>
>    OK, I say, let's not use the channels and do good old Unix select and
>    read.  Unfortunately the size of buffers, strings, is limited and the
>    LARGE messages I have do not fit.
>
> So yes, Marshal.from_string_list would be an option here.
>
> I still have around a simple example that locks up on Windows,
> I'll open a bug for that.
>
> Best,
> --
> Enrico Tassi
>

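The buffering effect described in point 2 of the quoted message can be
reproduced in a few lines. This is only a sketch: a pipe stands in for the
socket, the function name is made up, and the unix library is required.
It assumes both messages are small enough to be picked up by a single
read into the channel buffer.

```ocaml
(* Sketch only: [second_value_hidden_from_select] is a hypothetical name. *)
let second_value_hidden_from_select () =
  let r, w = Unix.pipe () in
  let ic = Unix.in_channel_of_descr r in
  let oc = Unix.out_channel_of_descr w in
  (* Two small messages are written before the reader wakes up. *)
  Marshal.to_channel oc "first" [];
  Marshal.to_channel oc "second" [];
  flush oc;
  (* One read from the OS moves both messages into the channel buffer,
     but only the first one is decoded. *)
  let (first : string) = Marshal.from_channel ic in
  (* Zero timeout: poll the descriptor instead of blocking. *)
  let ready, _, _ = Unix.select [r] [] [] 0.0 in
  let os_buffer_empty = (ready = []) in
  (* The second message is still available, inside the channel buffer. *)
  let (second : string) = Marshal.from_channel ic in
  (first, os_buffer_empty, second)
```

With a blocking select in place of the poll, the reader would wait here
even though a complete message is ready to be decoded.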

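For completeness, the temporary-file route suggested earlier in the thread
could look roughly like this. It is a sketch under assumptions: the
function name and the file prefix are made up, Fun.protect needs OCaml
4.08 or later, and the result type is unchecked as with any use of Marshal.

```ocaml
(* Sketch only: [roundtrip_via_tempfile] is a hypothetical name. *)
let roundtrip_via_tempfile v =
  let path = Filename.temp_file "marshal" ".bin" in
  (* Remove the temporary file even if marshalling fails. *)
  Fun.protect ~finally:(fun () -> Sys.remove path) (fun () ->
    let oc = open_out_bin path in
    Marshal.to_channel oc v [];
    close_out oc;
    let ic = open_in_bin path in
    (* The channel reads the data piecewise, so no single string has to
       hold the whole marshalled value. *)
    let v' = Marshal.from_channel ic in
    close_in ic;
    v')
```

In the setting discussed here the writer and the reader would be
different processes, so the two halves would live in separate programs.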
-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
My OCaml site:          http://www.camlcity.org
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------





