Subject: Re: [Caml-list] Re: Select on channels (again)
From: skaller
To: Jonathan Roewen
Cc: Nathaniel Gray, Caml Mailing List
Date: Tue, 22 Aug 2006 18:15:07 +1000
Message-Id: <1156234507.5707.47.camel@rosella.wigram>

On Tue, 2006-08-22 at 18:41 +1200, Jonathan Roewen wrote:
> > It sounds simple but doesn't work.  If select tells you a file
> > descriptor doesn't have data waiting you can't be sure there isn't
> > still data in the corresponding channel's buffer.  See the thread that
> > I referenced for a good discussion of why this is annoying.  For one
> > thing, it makes it impossible to use Marshal.from_channel without
> > potentially blocking.
>
> Either one of us is misunderstanding the other.... You missed the
> first half of the discussion:
> Instead of using Pervasives.open_xxx, use Unix.openfile which returns
> Unix.file_descr, and also doesn't use internal ocaml buffering.
>
> Then, presumably, Unix.select would do what you expect, and then you
> can use Unix.in_channel_of_descr to get an ocaml in_channel to read
> from.
>
> And if I'm misunderstanding you, then perhaps the problem isn't really
> Unix.select...

The problem is that this defeats the use of all the formatting
and buffering functions that work on buffered I/O channels.

What's required is something that tells:

(a) there is some data in the buffer OR
(b) there is some data on the descriptor

so that in either case some progress can be made.

Unfortunately .. there's a reason this makes no sense:

For raw byte streams .. you can just use the file descriptors
already with select.

Otherwise, there's no way to predict if an input will block,
whether or not there is data in the buffer, and whether or not
the file descriptor is ready, because the input operation can
read some data THEN block. The same argument applies to output.

Therefore .. there is no choice but to replace all the buffering
anyhow, and in general the whole programming paradigm needs to
be replaced.

The Felix demux system already does this, I think, for both
read/write n bytes, and for read/write a line. More difficult
cases should be handled by in-core formatting, e.g.:

  print_string (string_of_int i)

is correct and

  print_int i

is wrong. The former cannot block on formatting, the latter can
(assuming nonblocking line I/O is available).

You're stuck between a rock and a hard place here :)

The read/write functions of a system are designed to provide
control inversion: data coming in or going out is naturally
interrupt (callback) driven, but it is inconvenient to program
with callbacks (I would say it more strongly -- it is *untenable*
to use callbacks).
Therefore the scheduler provides blocking I/O, and switches out
programmatic demands for I/O, effecting control inversion.

You can try to work around this with non-blocking I/O, but it
is really a hack because doing so is tantamount to writing your
own scheduler to provide control inversion, in other words,
inventing your own operating system. It is even worse if you
use event notifications to avoid polling (I mean, it is even
more complex).

In general the only really sound solution is indeed to provide
a full scale operating system abstraction layer, which requires
the underlying programming language computational model be
designed to work with it. Several systems can work this way:
Felix and Haskell both have continuations, which seem to be the
pre-requisite. MLton may also cope with this.

The Ocaml computational model doesn't provide the required
resources natively, although of course they could be implemented
in Ocaml .. but then you would be programming with, for example,
suitable monadic combinators, rather than arbitrary raw Ocaml code.

Just so it is clear: given two sockets, you want to read
integers off them. You can do this with two threads, both of
which block. Or you can block, and invoke a callback when one
conversion finally completes.

The two techniques are control inverse. The only difference is
that the thread model uses OS control inversion and the callback
model uses hand written control inversion. BOTH techniques suck.

The only way to do this properly is the way Felix does it:
you write threaded code, but language control inverts it into
callback driven code systematically, and provides its own
OS abstraction layer: this gives you the responsiveness and
performance of user space callback driven code, but the illusion
of using threads.
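For comparison, the plain thread model mentioned above (two readers,
both of which block) can be sketched in Ocaml with the Thread and
Event modules. This is only an illustration under stated assumptions,
not the Felix approach and not anything taken from the thread: the
names reader, feed and collect are hypothetical, a Unix socket pair
stands in for each of the two sockets, and the wire format is assumed
to be one decimal integer per line. It needs the unix and
threads.posix libraries.

```ocaml
(* Sketch only: link with the unix and threads.posix libraries.
   reader, feed and collect are hypothetical names for illustration. *)

(* Blocking reader: parse one integer per line (assumed wire format)
   and forward it synchronously on the channel ich. *)
let reader (fd, ich) =
  let ic = Unix.in_channel_of_descr fd in
  try
    while true do
      let x = int_of_string (String.trim (input_line ic)) in
      Event.sync (Event.send ich x)
    done
  with End_of_file -> close_in ic

(* Write a short string to a descriptor, then close it so that the
   reader on the other end eventually sees end of file. *)
let feed fd s =
  let n = Unix.write_substring fd s 0 (String.length s) in
  assert (n = String.length s);
  Unix.close fd

(* Start one blocking reader thread per input text and receive
   [count] integers from them.  [count] must equal the total number
   of integers in [inputs], otherwise a reader stays blocked in send. *)
let collect ~count inputs =
  let ich : int Event.channel = Event.new_channel () in
  let threads =
    List.map
      (fun text ->
        (* The socket pair stands in for a real connected socket. *)
        let rd, wr = Unix.socketpair Unix.PF_UNIX Unix.SOCK_STREAM 0 in
        let t = Thread.create reader (rd, ich) in
        feed wr text;
        t)
      inputs
  in
  let received = List.init count (fun _ -> Event.sync (Event.receive ich)) in
  List.iter Thread.join threads;
  (* Arrival order depends on scheduling, so sort for a stable result. *)
  List.sort compare received

let () =
  collect ~count:3 ["1\n2\n"; "10\n"]
  |> List.iter (fun x -> print_int x; print_newline ())
```

Each reader here is a real pre-emptive thread blocked in input_line,
so the control inversion is done by the OS scheduler; the Event
channel only merges the two streams.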
You will note this is not a magical silver bullet: it only
works because the user code handles more specialised cases than
a general purpose OS can handle well: if one tried to do this
with full generality you'd just end up with yet another low
performance operating system.

IMHO the key here is that application specific information ..
perhaps embodied in the type system .. can be used by the user
program and language translator, but not the underlying OS.

Just to see, in Felix you'd do it something like:

  var ich = mk_schannel[int]();

  spawn_sthread {
    forever {
      var x : int;
      read_int (sock1, &x);
      write ich, x;
    }
  }

  spawn_sthread {
    forever {
      var x : int;
      read_int (sock2, &x);
      write ich, x;
    }
  }

  forever {
    var x:int;
    read (ich, &x);
    print x; endl;
  }

The two 'threads' spawned here are NOT pre-emptive threads.
They're actually continuations, which are resumed by the
underlying demux library notification mechanism starting
them up again based on epoll/poll/kqueue/select etc.
The interaction along the channel 'ich' is entirely synchronous.

Ocaml can do this now using Event module .. but it only works
across pthread boundaries.

Strangely .. the Ocaml VM system does this stuff for the
bytecode interpreter already, interleaving bytecode to
emulate threads, and forwarding blocking operations so the
emulated threads block .. but the actual pre-emptive thread
(process) does not.

-- 
John Skaller
Felix, successor to C++: http://felix.sf.net