caml-list - the Caml user's mailing list
From: Christophe TROESTLER <debian00@tiscali.be>
To: info@gerd-stolpmann.de
Cc: caml-list@inria.fr, ocamlnet-devel@lists.sourceforge.net
Subject: Re: [Caml-list] Common CGI interface
Date: Tue, 10 May 2005 02:10:28 +0200 (CEST)
Message-ID: <20050510.021028.12907561.debian00@tiscali.be>
In-Reply-To: <20050506.221435.115446774.debian00@tiscali.be>

Hi,

Let me continue to develop some ideas about the socket-accept-handle
triplet.

- For [socket] (and [run]) of the AJP connector, I would add an
  optional parameter [?props] to pass a property list -- the values
  set in this way supersede the values set by the other optional
  arguments, so that the latter can serve as defaults.  To be more
  flexible, I would also replace [jvm_emu_main] by a function parsing
  the arguments (allowing more arguments to be set if needed).  That
  yields the following functions:

  val arg_parse : ?anon_fun:(string -> unit) -> ?usage_msg:string ->
    (Arg.key * Arg.spec * Arg.doc) list -> (string * string) list
  val props_of_file : string -> (string * string) list

  val run :
    ?props:(string * string) list ->
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit

  val socket : ?props:(string * string) list -> ?backlog:int ->
    ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr

  The equivalent to [jvm_emu_main(fun props auth addr port -> server f
  auth addr port)] now reads:

  AJP.run ~props:(AJP.arg_parse []) f
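
  As an illustration of the precedence rule above (values from
  [?props] win over the other optional arguments, which in turn win
  over built-in defaults), here is a minimal sketch.  The helper
  [resolve], the property key ["port"] and the default 8009 are
  assumptions made for the example, not part of the proposed interface:

```ocaml
(* Hypothetical helper: a value found in [props] wins; otherwise the
   optional argument is used; otherwise the built-in default. *)
let resolve ?(props = []) ?arg ~default key of_string =
  match List.assoc_opt key props with
  | Some v -> of_string v
  | None -> (match arg with Some a -> a | None -> default)

(* Hypothetical use for a [?port] argument; the key name "port" and
   the default 8009 are illustrative only. *)
let port ?props ?port () =
  resolve ?props ?arg:port ~default:8009 "port" int_of_string
```

  With this, [port ~props:["port", "7000"] ~port:9000 ()] picks the
  value from the property list, while [port ~port:9000 ()] falls back
  to the optional argument.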


- The [handle_connection] will not be in a separate module but instead
  adapts itself to the version present in the protocol (convenient but
  also useful e.g. if the app machine must handle several web servers
  with different versions of the protocol).

  How will [handle_connection] report that the app must shut down or
  restart?  So far, it was raising an exception.  However, I believe
  it is better to "force" the user to handle these cases, so they
  should be return values instead:

  type connection_handler =
      Unix.file_descr -> Unix.file_descr -> (cgi -> unit) -> connection_status
  and connection_status = Ok | Shutdown | Restart

  (The [Ok] is in case the server shuts down the connection after a
  request, which I think it is entitled to do.)

  With threads, this also makes it easy to transmit the value to the
  "main thread" through an ['a Event.event] -- I guess this will
  generally be preferred to pipes in a multi-threaded app.

  The [connection_handler] takes two file descriptors, one for input
  and the other one for output, for flexibility (e.g. building a
  chain,...).  Generally, both will be the same socket.  (Maybe the
  distinction is overkill.)

  The connection handler will execute the function [cgi -> unit] for
  each (well formed) request coming in and catch (and log) all
  exceptions (it will raise no exception itself).  As you see, I do
  not require the user to create a cgi object -- it is done
  automatically as most of the time one will want to do so.

  I think it does not make much sense to execute the function [cgi ->
  unit] in a separate thread or process for AJP, as all requests are
  presented sequentially.  For FCGI with multiplexed requests it does,
  however, so some flexibility is required in that case (such
  flexibility is not provided by the current FCGI interface either).
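
  To make the two behaviours described in this item concrete (status
  returned as a value, exceptions from the user function caught and
  logged), here is a minimal sketch.  The names [protect] and [serve]
  and the [next] callback standing for "handle one connection" are
  hypothetical:

```ocaml
(* Note: this [Ok] shadows the [Ok] constructor of the standard
   [result] type within this scope. *)
type connection_status = Ok | Shutdown | Restart

(* Hypothetical wrapper: run the user function, log any exception it
   raises and never let it escape. *)
let protect ~log f x =
  try f x with e -> log (Printexc.to_string e)

(* Hypothetical accept loop: [next ()] handles one connection and
   returns its status.  Because the status is a value, the match below
   must cover every case. *)
let rec serve ~onrestart ~onshutdown next =
  match next () with
  | Ok -> serve ~onrestart ~onshutdown next
  | Restart -> onrestart ()
  | Shutdown -> onshutdown ()
```

  Dropping one of the three cases from the match in [serve] makes the
  compiler emit a non-exhaustive match warning, which is the kind of
  "forcing" an exception cannot provide.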


- The "accept" part is the most difficult, as it is where the
  peculiarities of the concurrency model show the most.  On the other
  hand, it is fairly generic in the sense that it depends very little
  on the protocol one must handle.

  In fact, I do not think it is possible to define a function that
  will handle all cases -- or its interface will be prohibitively
  complicated, thus defeating its purpose.  Therefore, I believe the
  best is to handle the common cases and leave it to users to define
  their specialized cases (plus I guess functors can be defined on top
  of [socket] and [handle_connection], as accept does not depend on
  the protocol).

  The first case is a new process per connection.  Since we need a
  pipe to communicate the child's return value to the parent, the
  signature is:

  val accept_fork  : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?fork:((Unix.file_descr -> unit) -> int * Unix.file_descr) ->
    connection_handler -> Unix.inet_addr -> unit

  where the first [Unix.file_descr] is used in the child to write the
  return value and the second one is used in the parent to read it.  A
  typical fork function is

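  let fork f =
    (* [Unix.pipe] returns (read end, write end) *)
    let (infd, outfd) = Unix.pipe () in
    match Unix.fork() with
    | 0 -> Unix.close infd;   (* the child only writes *)
           f outfd; exit 0    (* or the double fork trick *)
    | n -> Unix.close outfd;  (* the parent only reads *)
           n, infd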
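
  How the return value is encoded on the pipe is left open above.  One
  possibility, sketched under the assumption of a one-byte encoding
  (the helper names and the byte values are hypothetical):

```ocaml
type connection_status = Ok | Shutdown | Restart

(* Hypothetical one-byte wire encoding for the pipe. *)
let char_of_status = function
  | Ok -> 'O'
  | Shutdown -> 'S'
  | Restart -> 'R'

(* Any other byte is treated as [Ok], so that garbage on the pipe
   never stops the server; this policy is an assumption. *)
let status_of_char = function
  | 'S' -> Shutdown
  | 'R' -> Restart
  | _ -> Ok

(* In the child, the byte could be sent with
     ignore (Unix.write outfd (Bytes.make 1 (char_of_status st)) 0 1)
   and the parent would read one byte from [infd] and decode it with
   [status_of_char]. *)
```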

  As for threads, I guess the best option is to use an event --
  created by accept itself -- to transmit the return value, so the
  following interface should be enough:

  val accept_threads : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?thread:((unit -> unit) -> unit) ->
    connection_handler -> Unix.inet_addr -> unit

  (It should be possible to start a new thread every time or to use a
  thread pool with this.)
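
  To illustrate why the type [(unit -> unit) -> unit] is enough for
  both strategies, here is a minimal sketch of values that could be
  passed as [?thread].  All names are hypothetical, and a plain
  [Queue] stands in for a real thread pool so that the example needs
  only the standard library:

```ocaml
(* Simplest choice: run the job in the calling thread. *)
let run_inline : (unit -> unit) -> unit = fun job -> job ()

(* Pool-like choice: jobs are queued and run later by workers.  A
   plain Queue stands in for the pool here; [drain] plays the worker. *)
let pending : (unit -> unit) Queue.t = Queue.create ()

let enqueue : (unit -> unit) -> unit = fun job -> Queue.add job pending

let drain () =
  while not (Queue.is_empty pending) do
    (Queue.pop pending) ()
  done

(* With the threads library, one thread per connection would be:
     let spawn job = ignore (Thread.create job ()) *)
```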

  Maybe I forgot some important cases and the above design should be
  refined, but at least it conveys the idea.


The interface modified with the previous ideas is attached.

Regards,
ChriS



---
P.S. If we ever go in the direction I suggest, a possibility is to
develop it in a netcgi directory.  Not only will that make it possible
for the two versions to coexist, but it is also more natural as the
library name is netcgi (think of -I +netcgi).

[-- Attachment #2: netcgi.mli --]
[-- Type: Text/Plain, Size: 7339 bytes --]

(*
 * Types and functions shared by all connectors
 ***********************************************************************)

module Random :
sig
  val init : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) -> string -> unit
  val init_from_file : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) ->
    ?length:int -> string -> unit
  val byte : unit -> int
  val sessionid : int -> string
    (** [sessionid length] generates a [8 * length] bit ([length] hex
        digits) random string which can be used for session IDs,
        random cookies and so on.  The string returned will be very
        random and hard to predict *)
end


class type argument = (* I do not see the point of the cgi prefix *)
object
  (* ... methods to be discussed ... *)

  method storage : [`Memory | `File of string]
    (* No need to define [store] as it is only used here -- saying it
       here makes it easier to figure out what the method is. *)
  method representation : [ `Simple of Netmime.mime_body
                          | `MIME of Netmime.mime_message ]
    (* Same justification as above: having a single point of entry to
       understand a method is easier. *)
end


type cookie = ... (* I do not see the point of the cgi prefix *)


class type environment =
object
  (* Only changed methods are listed *)

  method cookie : string -> cookie
  method cookies : (string * cookie) list
    (* The first one is convenient.  Moreover, both should use the
       cookie type. *)

  (* Since [set_input_state] and [set_output_state] are not supposed
     to be for the final user, it would be nicer if they did not
     appear here.

     In the same vein, I would not include [input_ch] and
     [input_state] in the *public* interface: they are only useful for
     the [cgi] to initialize itself. *)
end

class type cgi = (* formerly cgi_activation *)
object
  (* I believe short names are better so long as they are as readable *)
  method arg : string -> argument
  method arg_val : ?default:string -> string -> string (*was [argument_value]*)
  method arg_true : string -> bool
    (** This method returns false if the named parameter is missing,
        is an empty string, or is the string ["0"]. Otherwise it
        returns true. Thus the intent of this is to return true in the
        Perl sense of the word.  If a parameter appears multiple
        times, then this uses the first definition and ignores the
        others. *)
  method arg_all : string -> argument list (* formerly [multiple_argument] *)
  method args : (string * argument) list

  method url : ?protocol:Netcgi.protocol -> ... -> unit -> string

  method set_header : ?status:status -> ... -> unit -> unit
  method set_redirection_header : string -> unit
  method output : Netchannels.trans_out_obj_channel

  method log : Netchannels.out_obj_channel
    (** A channel whose data is appended to the webserver log. *)

  method finalize  : unit -> unit

  method environment : Netcgi.environment
  method request_method : [`GET | `HEAD | `POST | `DELETE | `PUT of argument]
    (* Single point of doc. *)
end



type config = {
  tmp_directory : string;
  tmp_prefix : string;
  permitted_http_methods : [`GET | `HEAD | `POST | `DELETE | `PUT] list;
                                                             (* Uniformity *)
  permitted_input_content_types : string list;
  input_content_length_limit : int;
  workarounds : [ `MSIE_Content_type_bug | `backslash_bug ] list;
    (* Single point of documentation. *)
}


(*
 * Connectors
 ***********************************************************************)

(* These names better convey the intent I think *)
type output_type = [`Direct | `Transactional]
type arg_storage = [`Memory | `File | `Automatic]

type connection_handler =
    Unix.file_descr -> Unix.file_descr -> (cgi -> unit) -> connection_status
and connection_status = Ok | Shutdown | Restart

module CGI :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    (cgi -> unit) -> unit
end

module FCGI :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit
    (* [run] must handle the fact that on windows, apache communicates
       with FCGI scripts through named pipes.  *)

  (* Some flexible functions that allow any concurrency model.  Here is
     a possibility. *)
  val socket : ?backlog:int -> ?reuseaddr:bool ->
    ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr
    (** [socket ?backlog ?reuseaddr ?addr ?port] sets up an FCGI socket
        listening for incoming requests.

        @param backlog Length of the backlog queue (connections not yet
        accepted by the FCGI server)
        @param reuseaddr Whether to reuse the port
        @param addr defaults to localhost.
        @param port if not present, uses stdin (which is a socket on
        Unix or -- contrary to the spec -- a pipe on win$).
    *)

(* Functions analogous to the ones of AJP *)
end

module AJP :
sig
  val arg_parse : ?anon_fun:(string -> unit) -> ?usage_msg:string ->
    (Arg.key * Arg.spec * Arg.doc) list -> (string * string) list
  val props_of_file : string -> (string * string) list

  val run :
    ?props:(string * string) list ->
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit


  val socket : ?props:(string * string) list -> ?backlog:int ->
    ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr
    (** [socket ?props ?backlog ?reuseaddr ?addr ?port] sets up an AJP
        socket listening for incoming requests.

        @param backlog Length of the backlog queue (connections not yet
        accepted by the AJP server)
        @param reuseaddr Whether to reuse the port
        @param addr defaults to localhost.
        @param port if not present, assume the program is launched
        by the web server.
    *)

  val accept_fork  : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?fork:((Unix.file_descr -> unit) -> int * Unix.file_descr) ->
    connection_handler -> Unix.inet_addr -> unit

  val accept_threads : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?thread:((unit -> unit) -> unit) ->
    connection_handler -> Unix.inet_addr -> unit

  val handle_connection : ?props:(string * string) list ->
    ?config:config -> ?auth:(int * string) -> connection_handler
end

module Test :
sig
  val simple_arg : string -> string -> argument
  val mime_arg : ?work_around_backslash_bug:bool ->
    string -> Netmime.mime_message -> argument

  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?args:argument list ->
    ?meth:request_method ->
    (cgi -> unit) -> unit
    (* More flexibility is definitely required here -- along the lines
       of [custom_environment].  I am thinking one could e.g. be
       general enough to allow the output to be set into a frame, and
       another frame being used for control, logging,... -- a live
       debugger if you like! *)
end
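
As an illustration of the [arg_true] semantics documented in the
attachment, here is a minimal sketch over a plain association list.
The standalone function and the (name, value) list representation are
assumptions made for the example; in the proposed interface
[arg_true] is a method of the [cgi] object:

```ocaml
(* Hypothetical standalone version of [arg_true], over a plain
   (name, value) list instead of a cgi object. *)
let arg_true (args : (string * string) list) (name : string) : bool =
  (* List.assoc_opt returns the first binding, so later duplicates
     of the same name are ignored. *)
  match List.assoc_opt name args with
  | None | Some "" | Some "0" -> false
  | Some _ -> true
```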
