From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 17DF0BC8E for ; Tue, 10 May 2005 02:10:38 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j4A0AbVZ025178 for ; Tue, 10 May 2005 02:10:37 +0200 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id CAA14606 for ; Tue, 10 May 2005 02:10:37 +0200 (MET DST) Received: from swip.net (mailfe01.swip.net [212.247.154.1]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j4A0AaEE026847 for ; Tue, 10 May 2005 02:10:36 +0200 X-T2-Posting-ID: 2IIuXFmkcTpj3lKEFKW25A== Received: from [83.176.166.173] (HELO poincare) by mailfe01.swip.net (CommuniGate Pro SMTP 4.3c5) with ESMTP id 364627645; Tue, 10 May 2005 02:10:33 +0200 Received: from localhost ([127.0.0.1] ident=trch) by poincare with esmtp (Exim 4.50) id 1DVIKT-0002lz-KI; Tue, 10 May 2005 02:10:29 +0200 Date: Tue, 10 May 2005 02:10:28 +0200 (CEST) Message-Id: <20050510.021028.12907561.debian00@tiscali.be> To: info@gerd-stolpmann.de Cc: caml-list@inria.fr, ocamlnet-devel@lists.sourceforge.net Subject: Re: [Caml-list] Common CGI interface From: Christophe TROESTLER In-Reply-To: <20050506.221435.115446774.debian00@tiscali.be> References: <20050425.123834.56365302.debian00@tiscali.be> <1114513681.6324.156.camel@localhost.localdomain> <20050506.221435.115446774.debian00@tiscali.be> Organization: Universite de Mons-Hainaut (http://math.umh.ac.be/an/) X-Spook: event security Mole secure AMEMB DES Arnett credit card Belknap MDA analyzer X-Blessing: Om Ah Hum Vajra Guru Pema Siddhi Hum X-Operating-System: GNU/Linux (http://www.linux.org/) X-Mailer-URL: http://www.mew.org/ X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Multipart/Mixed; boundary="--Next_Part(Tue_May_10_02_10_28_2005_809)--" Content-Transfer-Encoding: 7bit X-Miltered: at nez-perce with ID 427FFBFD.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at concorde with ID 427FFBFC.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 christophe:01 troestler:01 defaults:01 parsing:01 val:01 val:01 config:01 config:01 sockaddr:01 sockaddr:01 bool:01 handler:01 threads:01 handler:01 X-Attachments: name="netcgi.mli" X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.5 required=5.0 tests=FROM_ENDS_IN_NUMS autolearn=disabled version=3.0.2 X-Spam-Level: ----Next_Part(Tue_May_10_02_10_28_2005_809)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Hi, Let me continue to develop some ideas about the socket-accept-handle triplet. - For [socket] (and [run]) of the AJP connector, I would add an optional parameter [?props] to be able to pass an optional property list -- the values set in this way supersede the values set by other optional arguments in order to be able to set defaults. To be more flexible, I would also replace [jvm_emu_main] by a function parsing the arguments (allowing to set more arguments if needed). That yields the following functions: val arg_parse : ?anon_fun:(string -> unit) -> ?usage_msg:string -> (Arg.key * Arg.spec * Arg.doc) list -> (string * string) list val props_of_file : string -> (string * string) list val run : ?props:(string * string) list ?config:config -> ?output_type:output_type -> ?arg_storage:(string -> Netmime.mime_header -> arg_storage) -> ?sockaddr:Unix.sockaddr -> (cgi -> unit) -> unit val socket : ?props:(string * string) list -> ?backlog:int -> ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr The equivalent to [jvm_emu_main(fun props auth addr port -> server f auth addr port)] now reads: AJP.run ~props:(AJP.arg_parse []) f - The [handle_connection] will not be in a separate module but instead adapts itself to the version present in the protocol (convenient but also useful e.g. if the app machine must handle several web servers with different versions of the protocol). How will [handle_connection] reports that the app must shutdown or restart? So far, it was raising an exception. However, I believe it is better to "force" the user to handle them, so they should be return values instead: type connection_handler = Unix.file_descr -> Unix.file_descr -> (cgi -> unit) -> connection_status and connection_status = Ok | Shutdown | Restart (The [Ok] is in case the server shuts down the connection after a request which I think it is entitled to do.) In case of threads, that makes also easy to transmit the value to the "main thread" through a ['a Event.event] -- this will generally be preferred to pipes in a multi-threaded app. I guess. The [connection_handler] takes two file descriptors, one for input and the other one for output, for flexibility (e.g. buiding a chain,...). Generally, both will be the same socket. (Maybe the distinction is overkill.) The connection handler will execute the function [cgi -> unit] for each (well formed) request coming in and catch (and log) all exceptions (it will raise no exception itself). As you see, I do not require the user to create a cgi object -- it is done automatically as most of the time one will want to do so. I think it does not make much sense to execute the function [cgi -> unit] in a separate thread or process for AJP as all requests are presented sequentially. For FCGI with mutliplexed requests it does however, so some flexibility is required in that case (such flexibility is not provided by the current FCGI interface either). - The "accept" part is the more difficult as it is where the peculiarities of the concurrent model express themselves the more. On the other hand, it is fairly generic in the sense it depends very little on the protocol one must handle. In fact, I do not think it is possible to define a function that will handle all cases -- or its interface will be prohibitively complicated, thus defeat its purpose. Therefore, I believe the best is to handle common cases and leave it to the user to define its specialized cases (plus I guess functors can be defined on top of [socket] and [handle_connection] as accept does not depend on the protocol). The first case is a new process per connection. Since we need a pipe to communicate to the father the return value of the child, the signature is: val accept_fork : ?props:(string * string) list -> ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) -> ?allow_hosts:Unix.inet_addr list -> ?fork:((Unix.file_descr -> unit) -> int * Unix.file_descr) -> connection_handler -> Unix.inet_addr -> unit where the first [Unix.file_descr] is in the child to write the return value and the second one is in the father to read it. A typical fork function is let fork f = let (infd, outfd) = Unix.pipe () in match Unix.fork() with | 0 -> f outfd; exit 0 (* or the double fork trick *) | n -> n, infd As for threads, I guess the better is to use an event -- created by accept itself -- to transmit the return value, so the following interface should be enough: val accept_threads : ?props:(string * string) list -> ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) -> ?allow_hosts:Unix.inet_addr list -> ?thread:((unit -> unit) -> unit) -> connection_handler -> Unix.inet_addr -> unit (It should be possible to start a new thread everytime or to use a thread pool with this.) Maybe I forgot some important cases and the above design should be refined but at least it convey the idea. The interface modified with the previous ideas is attached. Regards, ChriS --- P.S. If we ever go in the direction I suggest, a possibility is to develop it in a netcgi directory. Not only that will make possible for the two versions to coexist but is more natural as the library name is netcgi (think of -I +netcgi). ----Next_Part(Tue_May_10_02_10_28_2005_809)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="netcgi.mli" (* * Types and functions shared by all connectors ***********************************************************************) module Random : sig val init : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) -> string -> unit val init_from_file : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) -> ?length:int -> string -> unit val byte : unit -> int val sessionid : int -> string (** [sessionid length] generates a [8 * length] bit ([lenght] hex digits) random string which can be used for session IDs, random cookies and so on. The string returned will be very random and hard to predict *) end class type argument = (* I do not see the point of the cgi prefix *) object (* ... methods to be discussed ... *) method storage : [`Memory | `File of string] (* No need to define [store] as it is only used here -- saying it here makes it easier to figure out what the method is. *) method representation : [ `Simple of Netmime.mime_body | `MIME of Netmime.mime_message ] (* Same justification as above: having a single point of entry to undertand a method is easier. *) end type cookie = ... (* I do not see the point of the cgi prefix *) class type environment = object (* Only changed methods are listed *) method cookie : string -> cookie method cookies : (string * cookie) list (* The first one is convenient. Moreover, both should use the cookie type. *) (* Since [set_input_state] and [set_output_state] are not supposed to be for the final user, it would be nicer if they did not appear here. In the same vein, I would not include [input_ch] and [input_state] in the *public* interface: they are only useful for the [cgi] to initialize itself. *) end class type cgi = (* formerly cgi_activation *) object (* I believe short names are better so long they are as readable *) method arg : string -> argument method arg_val : ?default:string -> string -> string (*was [argument_value]*) method arg_true : string -> bool (** This method returns false if the named parameter is missing, is an empty string, or is the string ["0"]. Otherwise it returns true. Thus the intent of this is to return true in the Perl sense of the word. If a parameter appears multiple times, then this uses the first definition and ignores the others. *) method arg_all : string -> argument list (* formerly [multiple_argument] *) method args : (string * argument) list method url : ?protocol:Netcgi.protocol -> ... -> unit -> string method set_header : ?status:status -> ... -> unit -> unit method set_redirection_header : string -> unit method output : Netchannels.trans_out_obj_channel method log : Netchannels.out_obj_channel (** A channel whose data is appended to the webserver log. *) method finalize : unit -> unit method environment : Netcgi.environment method request_method : [`GET | `HEAD | `POST | `DELETE | `PUT of argument] (* Single point of doc. *) end type config = { tmp_directory : string; tmp_prefix : string; permitted_http_methods : [`GET | `HEAD | `POST | `DELETE | `PUT] list; (* Uniformity *) permitted_input_content_types : string list; input_content_length_limit : int; workarounds : [ `MSIE_Content_type_bug | `backslash_bug ] list; (* Single point of documentation. *) } (* * Connectors ***********************************************************************) (* These names better convey the intent I think *) type output_type = [`Direct | `Transactional] type arg_storage = [`Memory | `File | `Automatic] type connection_handler = Unix.file_descr -> Unix.file_descr -> (cgi -> unit) -> connection_status and connection_status = Ok | Shutdown | Restart module CGI : sig val run : ?config:config -> ?output_type:output_type -> ?arg_storage:(string -> Netmime.mime_header -> arg_storage) -> (cgi -> unit) -> unit end module FCGI : sig val run : ?config:config -> ?output_type:output_type -> ?arg_storage:(string -> Netmime.mime_header -> arg_storage) -> ?sockaddr:Unix.sockaddr -> (cgi -> unit) -> unit (* [run] must handle the fact that on windows, apache communicates with FCGI scripts through named pipes. *) (* Some flexible functions that allow any concurrency model. Here is a possibility. *) val socket : ?backlog:int -> ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr (** [socket ?backlog ?reuseaddr ?addr ?port] setup a FCGI socket listening to incomming requests. @param backlog Length of the backlog queue (connections not yet accepted by the AJP server) @param reuseaddr Whether to reuse the port @param addr defaults to localhost. @param port if not present, uses stdin (which is a socket on Unix or -- contrarily to the spec -- a pipe on win$). *) (* Functions analogous to the ones of AJP *) end module AJP : sig val arg_parse : ?anon_fun:(string -> unit) -> ?usage_msg:string -> (Arg.key * Arg.spec * Arg.doc) list -> (string * string) list val props_of_file : string -> (string * string) list val run : ?props:(string * string) list ?config:config -> ?output_type:output_type -> ?arg_storage:(string -> Netmime.mime_header -> arg_storage) -> ?sockaddr:Unix.sockaddr -> (cgi -> unit) -> unit val socket : ?props:(string * string) list -> ?backlog:int -> ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr (** [socket ?props ?backlog ?reuseaddr ?addr ?port] setup a AJP socket listening to incomming requests. @param backlog Length of the backlog queue (connections not yet accepted by the AJP server) @param reuseaddr Whether to reuse the port @param addr defaults to localhost. @param port if not present, assume the program is launched by the web server. *) val accept_fork : ?props:(string * string) list -> ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) -> ?allow_hosts:Unix.inet_addr list -> ?fork:((Unix.file_descr -> unit) -> int * Unix.file_descr) -> connection_handler -> Unix.inet_addr -> unit val accept_threads : ?props:(string * string) list -> ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) -> ?allow_hosts:Unix.inet_addr list -> ?thread:((unit -> unit) -> unit) -> connection_handler -> Unix.inet_addr -> unit val handle_connection : ?props:(string * string) list ?config:config -> ?auth:(int * string) -> connection_handler end module Test : sig val simple_arg : string -> string -> argument val mime_arg : ?work_around_backslash_bug:bool -> string -> Netmime.mime_message -> argument val run : ?config:config -> ?output_type:output_type -> ?arg_storage:(string -> Netmime.mime_header -> arg_storage) -> ?args:cgi_argument list -> ?meth:request_method -> (cgi -> unit) -> unit (* More flexibility is definitely required here -- along the lines of [custom_environment]. I am thinking one could e.g. be general enough to allow the output to be set into a frame, and another frame being used for control, logging,... -- a live debugger if you like! *) end ----Next_Part(Tue_May_10_02_10_28_2005_809)----