caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: "Nathaniel Gray" <n8gray@gmail.com>
To: "Bardur Arantsson" <spam@scientician.net>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Re: Help interfacing with C
Date: Fri, 18 Aug 2006 17:33:26 -0700	[thread overview]
Message-ID: <aee06c9e0608181733t6b6f261eobdd9d1c78c91c6b6@mail.gmail.com> (raw)
In-Reply-To: <ec5en8$4eg$1@sea.gmane.org>

On 8/18/06, Bardur Arantsson <spam@scientician.net> wrote:
> Nathaniel Gray wrote:
> > On 8/16/06, Bardur Arantsson <spam@scientician.net> wrote:
> >> Nathaniel Gray wrote:
> >> > Hi folks,
> >> [--snip--]
> >
> > Thanks, but this doesn't really help.  Perhaps I should say what I'm
> > trying to accomplish instead of trying to let the types speak for me.
> > The point is that when you select on a list of file descriptors
> > there's a very good chance that you don't *only* care that file
> > descriptor f is ready, you also have some ocaml data structure
> > associated with f that you want to use with it in some way.  For
> > example, you probably have a string to write or a buffer to read into.
> >
> > The current API is sort of a minimal translation of the select system
> > call to OCaml, which is a good starting point, but not great IMHO.  In
> > order to match up each file descriptor to its data you have to scan
> > the output fd list yet again, after it's already been done once at the
> > C level, and since you don't know how/if it's sorted you end up with
> > unsatisfying time complexity for the whole thing.  It's not critical,
> > since the lists are small, but it hardly brings a smile to your face.
> >
> > On the other hand, imagine a select2 that takes lists of (fd, 'a)
> > pairs and returns the same.  The results become incredibly easy to
> > handle, with no scanning necessary.  You probably just List.iter a
> > simple read/write function over the result and you're done.
> >
>
> Have you tried to benchmark the naive implementation(*) versus a "bare"
> Unix.select? I shouldn't think you a little bit extra of scanning the
> list which is O(n) in the number of _active_ file descriptors matters
> with all the O(MAX_FD) stuff that's going on in select(). I would expect
> most of the time to be spent doing the actual system call.
>
> (*) That is, an implementation which puts (fd, associated value) in a
> hashtable indexed by the file_descr and does the scanning after doing
> the Unix.select.

Let me be clear about this -- it's not really about performance.  I'm
looking at the entire sequence of operations and data structures
needed to make a select call.  Here's a typical example with the
current API:

(* You have a data structure with some file descriptors and auxilliary data *)
let foo = [(fd1, "hello"); (fd2, "world"); (fd3, "baz")] in
(* Now you need a list of file descriptors *)
let fds = List.map fst foo in
(* Feed them to select *)
let _, outs, _ = Unix.select [] fds [] (-1.0) in
(* Now do something with each active descriptor *)
List.iter (fun fd ->
   let data = List.assoc fd foo in
   do_something data) outs

Here's how it would work with my proposed select2:

(* You have a data structure with some file descriptors and auxilliary data *)
let foo = [(fd1, "hello"); (fd2, "world"); (fd3, "baz")] in
(* Feed it to select2 *)
let _, outs, _ = Unix.select2 [] foo [] (-1.0) in
(* Now do something with the data *)
List.iter (fun (fd, data) -> do_something data) outs

The whole thing is much nicer.  You don't need a separate list of file
descriptors that exists only to be fed to select.  You don't need to
do any work afterwards to find the data associated with a file
descriptor.  The fact that it's also faster (assuming it truly *is*
faster) is just the icing on the cake.  Now I'll grant that I've
chosen a convenient data representation for my purposes, but it's not
like I've cheated badly -- a list of (fd, data) or (fd, closure) pairs
would be a sensible thing to use in this kind of code.

Now of course this could all be done at the ocaml level, but wouldn't
it be nicer to do the elegant thing in the first place instead of
cleaning up the mess afterwards?  ;^)

Cheers,
-n8

-- 
>>>-- Nathaniel Gray -- Caltech Computer Science ------>
>>>-- Mojave Project -- http://mojave.cs.caltech.edu -->


  reply	other threads:[~2006-08-19  0:44 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-08-16 19:34 Nathaniel Gray
2006-08-17  3:49 ` [Caml-list] " malc
2006-08-18 21:00   ` Nathaniel Gray
2006-08-17  5:56 ` Bardur Arantsson
2006-08-18  7:10   ` [Caml-list] " Olivier Andrieu
2006-08-18 15:50     ` Bardur Arantsson
2006-08-18 21:33   ` [Caml-list] " Nathaniel Gray
2006-08-18 22:24     ` Bardur Arantsson
2006-08-19  0:33       ` Nathaniel Gray [this message]
2006-08-19  6:03         ` Bardur Arantsson
2006-08-21 22:45           ` [Caml-list] " Nathaniel Gray
2006-08-19  8:30         ` Olivier Andrieu
2006-08-21 22:35           ` Nathaniel Gray
2006-08-19  9:03     ` Richard Jones
2006-08-19  9:41       ` skaller
2006-08-18  8:46 ` [Caml-list] " Damien Doligez
2006-08-18 20:09   ` Nathaniel Gray
2006-08-23 23:36 ` Nathaniel Gray

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aee06c9e0608181733t6b6f261eobdd9d1c78c91c6b6@mail.gmail.com \
    --to=n8gray@gmail.com \
    --cc=caml-list@inria.fr \
    --cc=spam@scientician.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).