From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <weis@pauillac.inria.fr>
Delivered-To: caml-list@yquem.inria.fr
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by yquem.inria.fr (Postfix) with ESMTP id A5A5EBC84
	for <caml-list@yquem.inria.fr>; Tue, 26 Apr 2005 13:08:08 +0200 (CEST)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35])
	by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j3QB8889018503
	for <caml-list@yquem.inria.fr>; Tue, 26 Apr 2005 13:08:08 +0200
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA22367 for <caml-list@pauillac.inria.fr>; Tue, 26 Apr 2005 13:08:07 +0200 (MET DST)
Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.190])
	by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j3QB87QX011018
	for <caml-list@inria.fr>; Tue, 26 Apr 2005 13:08:07 +0200
Received: from [212.227.126.160] (helo=mrelayng.kundenserver.de)
	by moutng.kundenserver.de with esmtp (Exim 3.35 #1)
	id 1DQNvC-0004q1-00; Tue, 26 Apr 2005 13:08:06 +0200
Received: from [84.167.167.82] (helo=gate.lan.gerd-stolpmann.de)
	by mrelayng.kundenserver.de with asmtp (Exim 3.35 #1)
	id 1DQNvB-0005Ew-00; Tue, 26 Apr 2005 13:08:05 +0200
Received: from flakew.lan.gerd-stolpmann.de (flakew.lan.gerd-stolpmann.de [192.168.0.32])
	by gate.lan.gerd-stolpmann.de (Postfix) with ESMTP id 287D9C30F;
	Tue, 26 Apr 2005 13:08:03 +0200 (CEST)
Subject: Re: [Caml-list] Re: Common CGI interface
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Christophe TROESTLER <debian00@tiscali.be>
Cc: caml-list@inria.fr, ocamlnet-devel@lists.sourceforge.net
In-Reply-To: <20050425.123834.56365302.debian00@tiscali.be>
References: <1113940495.6248.101.camel@localhost.localdomain>
	 <20050420.220031.69130073.debian00@tiscali.be>
	 <1114031186.6243.48.camel@localhost.localdomain>
	 <20050425.123834.56365302.debian00@tiscali.be>
Content-Type: text/plain
Date: Tue, 26 Apr 2005 13:08:01 +0200
Message-Id: <1114513681.6324.156.camel@localhost.localdomain>
Mime-Version: 1.0
X-Mailer: Evolution 2.0.2 
Content-Transfer-Encoding: 7bit
X-Provags-ID: kundenserver.de abuse@kundenserver.de auth:a6865a839c0178d9aa0ce41878507ea2
X-Miltered: at concorde with ID 426E2118.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Miltered: at nez-perce with ID 426E2117.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Spam: no; 0.00; caml-list:01 gerd:01 stolpmann:01 christophe:01 troestler:01 gerd:01 stolpmann:01 model:01 model:01 variants:01 reuse:01 descriptors:01 variants:01 prng:01 beginners:01 
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.0.2
X-Spam-Level: 

Am Montag, den 25.04.2005, 12:38 +0200 schrieb Christophe TROESTLER:
> On Wed, 20 Apr 2005, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > 
> > defining their own connector.
> 
> I understand one needs to do so to extend the library but can you name
> other situations?  My feeling is that CGI, FCGI, AJP, and test are the
> more used ones and that a custom connector is seldom needed...  so
> shouldn't the standard connectors share a common standard (of course
> with a few peculiarities to each) while the function(s) to create new
> ones are grouped into a separate module. 

In principle you are right, and in fact the connectors share a common
standard except the way they are created. It would be nice if this way
could be made similar for each. In the past, this was not high priority
because it was more important to _have_ the connectors.

The creation of connectors depends very much on the overall processing
model. The most important types of models are:

- The model is controlled by the web server. This is true for CGI where
  the server creates a new process for each request. This may be also
  the case for FastCGI, and even for AJP (although regarded as 
  obsolete).

  In this case the web server creates new processes when needed, and
  communicates somehow with existing processes. As you mention
  the function prototype establish_server: I think this is not 
  applicable here, because the application only reacts on the
  web server, but isn't a real server of its own. And wrapping the
  "reactivity" into a function establish_server would be very strange,
  at least.

- Multi-processing controlled by the application itself. Currently,
  this is only implemented for AJP. Multi-processing has several
  variants, the simplest form is "fork on new connection", but for
  high performance one needs to reuse processes, and pre-forking.

  Multi-processing is special because the connector must know about
  it - this has to do with the way file descriptors can be passed
  between processes - generally only from master to child, and 
  this means some parts of the connector run in the master process,
  and other parts in the children processes.

- Multi-threading controlled by the application itself. As for
  multi-processing, one can have several variants. It is much simpler
  for the connector, however, because one does not have the file
  descriptor passing limitations, and the connector can be made
  very configurable in this case.

There are further aspects that are different for the processing models:

- Whether only one URL is served by the application, or several, or
  a whole URL prefix.

- Sometimes persistent connections to external services (e.g. databases)
  are needed. Managing these depends very much on the model. For
  example, in a multi-threading model one usually wants a shared pool
  of external connections. In a multi-processing model pooling is not
  possible, but one can have one connection per process.

- Sometimes the instances of the application want to communicate with
  each other, e.g. fast management of sessions, or even coordinated
  shutdown.

My point is that it is not easy to find a common description of all that
such that one can have a common signature for all connectors. And even
if we had that: Do you really want to say for every CGI that you don't
have all that features that are only available for server models? Or
isn't it more honest to do without complex specifications when they
aren't available?

>  The prng* functions should
> be in the main module -- an additional [random_sessionid] function
> (generating e.g. 128 bits random strings) could be useful.

Accepted.

> > Of course, this is confusing for beginners, but I don't really see
> > how to improve this without giving up modularity (i.e. every
> > connector has its own entry point).
> 
> I am afraid that I am not sure to fully grasp which kind of modularity
> you have in mind -- certainly because of my lack of experience in web
> devel.  For example, I do not understand why
> [Netcgi_jserv.server_init] is not just included in [server_loop].

This is not possible for multi-processing models: server_init runs in
the master process, and server_loop in every child process.

> Another reason modularity is good for it multithreading (or multiple
> processes).  But there are other ways to handle that than to split
> into many functions.  For example, on can imagine
> 
>   val establish_server : ?max_conns:int -> ... ->
>     ?fork:((connection -> unit) -> connection -> unit) ->
>     (connection -> unit) -> Unix.sockaddr -> unit
> 
> (?fork can create a process or a thread).  This makes it possible to
> wrap the function handling the connection (connection -> unit) so that
> exceptions it raises are dealt with appropriately -- thus for example
> it seems possible to get rid of the care the user must exercise with
> [Signal_shutdown]...
> 
> May you explain situations for which a [establish_server] /
> [handle_request] modularity is not enough?

In general, I can imagine for every connector a single entry point
function, although the arguments of each would be very different.

The other question is whether one should hide all the internals (e.g.
"pack" them away). I think no, this is the wrong way, although some
better way of communicating where the entry-point functions are would be
desirable (better documentation, maybe special modules only for users).

> > If we added the callback method for CGI, it would be simply
> 
> I am not suggesting to simply _add_ one (that would just make the
> whole interface more confusing) but to rework the interface so that
> 
> * all connectors are treated equally (e.g. CGI is noting special
>   w.r.t. other, conceptually) and the modularity is handled the same
>   way for all of them (short of optional arguments).
> 
> * a separate module possesses the material to extend netcgi, e.g. to
>   create specially tailored connectors.

As explained, this isn't as simple as you think. The connectors aren't
equal, and the user must know that, and merging the specific differences
into a common standard is far from being trivial.

I am currently thinking about a system of configuration objects. Since
O'Caml 3.08 we have anonymous objects, and that means creating an object
is no more difficult than creating a record value, e.g.

let my_config =
  object
    method this_property = Do it this way
    method that_property = Do it that way
  end

The idea is that the web application can define one configuration
object, and every connector picks the parts of the configuration it
needs (by using subtyping). It is simple to define defaults - the object
simply inherits from a default configuration class. The point is that
every connector can request as many configuration options as it needs
from the user. Options that have the same meaning for several connectors
get the same names.

Ideally, the web application needs only to define one configuration
object, and only by calling a different entry point the connector can be
changed to a different one.

> Another thing that seems to be lacking is a uniform way to write in
> the server log.  For CGI it is stderr, FCGI uses special "channel"
> (not stderr),...  This is important e.g. to log nonfatal errors.

Accepted. The environment will have an error-logging function in the
future. (I also need that for the HTTP daemon I am currently
developing.)

> > > About arguments: is the mutability of arguments useful?  This makes
> > > the whole interface more complex for a purpose I can't see.  
> > 
> > For example, to help for debugging.
> 
> May you explain how?  Is it useful to modify the value of a param
> inside a request handling function, with global effect (i.e. not just
> for the function scope)?  Setting parameters before handling the
> request is a different matter -- a powerful "test" mode can certainly
> do this without mutability (exposed).

I guess you see a dynamic web page as a function, and of course the
arguments of a function are immutable. However, this is only one view of
the thing.

In web development, the arguments are often considered as session state,
passed from one web page to the next. In this thinking, mutability is
quite natural: the arguments are the global variables of the whole
application spanning the individual web pages it consists of.

> But then, if you do not treat CGI in a special way (i.e. have distinct
> CGI and test connectors) it is not needed.  In fact, it is not clear
> to me why it is good to have [std_environment] and [test_environment]
> in the interface as, as far as I can tell, they will just be used to
> implement the associated connectors (i.e. what this modularity brings
> you?).  [custom_environment] is fine and should be put in the
> "extension" module.

Ok, this exception exposes internals. But, as already pointed out, I
don't see why this is so bad.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------