Hi,

To start with, sorry to reply at such a slow pace but I am quite busy
with my main job...

On Tue, 26 Apr 2005, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> 
> - The model is controlled by the web server. [...] establish_server:
> I think this is not applicable here

I agree.  This is the easier case as, from the point of view of the
app writer, it basically amounts to a single process (that may serve
several requests sequentially).  [This is also the model for
mod_caml.]  I would suggest that a default function [run] takes care
of this case for every connector.  Resources "opened" (e.g. a DB
connection) before [run f] can be reused by each call of [f].  For
convenience, I would add an optional argument [?sockaddr] to [run] that
turns it into a remote app server (still sequential).
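
A rough sketch of the shape this could take.  Everything in it is
hypothetical: the [request] type, the [next_request] argument (which
stands in for the connector's protocol code) and the use of a plain
string for [?sockaddr] are assumptions made for illustration, not part
of the existing Netcgi interface.

```ocaml
(* Hypothetical sketch; none of these names exist in Netcgi. *)
type request = { uri : string }

(* [run ?sockaddr ~next_request f] calls [f] on each request in turn.
   [next_request] returns [None] when there is nothing left to serve.
   A real connector would listen on [sockaddr] when it is given; this
   sketch only reports it. *)
let run ?sockaddr ~next_request f =
  (match sockaddr with
   | Some addr -> prerr_endline ("would listen on " ^ addr)
   | None -> ());
  let rec loop () =
    match next_request () with
    | None -> ()
    | Some req -> f req; loop ()
  in
  loop ()

(* A resource "opened" before [run f] is shared by every call of [f].
   Here a list reference stands in for, say, a DB connection. *)
let served : string list ref = ref []

let () =
  let pending = ref [ { uri = "/a" }; { uri = "/b" } ] in
  let next_request () =
    match !pending with
    | [] -> None
    | r :: rest -> pending := rest; Some r
  in
  run ~next_request (fun req -> served := req.uri :: !served)
```

The point of the sketch is only that the app writer sees one
sequential loop, whatever the connector does underneath.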

No other connectors are needed for CGI and Test.

> - Multi-processing controlled by the application itself. [...]
> - Multi-threading controlled by the application itself. [...]
>
> - Whether only one URL is served by the application, or several [...]
> - [...] persistent connections to external services [...]
> - [...] instances of the application [...] communicate with each other [...]
>
> My point is that it is not easy to find a common description of all
> that [...] really want to say for every CGI that you don't have all
> that features that are only available for server models?  [...]

As said above, sequential processing should just be done through a
[run] function.  My idea is NOT to implement all the above models,
just to have the minimal set of primitives handling their respective
protocols and factored in such a way that ANY kind of concurrency one
wants CAN be programmed on top of them.  After all, as you point out,
one may not be able to describe all possibilities that the user may
want.

The two points where launching a new thread/process makes sense are:
- to accept connections on a given (list of) socket(s);
- to handle a request (when its input has been completely provided).
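
One way to picture these two points is a serving loop that takes two
"spawner" hooks and never chooses a concurrency model itself.  This is
only a sketch under stated assumptions: [spawner], [inline], [serve],
[accept] and [read_request] are made-up names, and sockets and
connections are modelled by plain list references rather than by the
Netcgi_jserv primitives.

```ocaml
(* Hypothetical sketch; these are not the Netcgi_jserv primitives. *)

(* A spawner decides HOW a job is run: inline, in a thread, in a
   forked process...  The serving code never chooses for the user. *)
type spawner = (unit -> unit) -> unit

let inline : spawner = fun job -> job ()

(* [accept sock] returns the next connection on [sock], or [None] to
   stop.  [read_request conn] returns the next completely read request
   on [conn], or [None] when the connection is finished. *)
let serve ?(on_accept = inline) ?(on_request = inline)
    ~accept ~read_request sockets handler =
  let rec requests conn =
    match read_request conn with
    | None -> ()
    | Some req ->
      on_request (fun () -> handler req);   (* second point *)
      requests conn
  in
  let rec accept_loop sock =
    match accept sock with
    | None -> ()
    | Some conn -> requests conn; accept_loop sock
  in
  (* first point: one accept loop per socket *)
  List.iter (fun sock -> on_accept (fun () -> accept_loop sock)) sockets

(* Usage with fake sockets: a socket is a queue of connections and a
   connection is a queue of requests. *)
let handled : string list ref = ref []

let () =
  let pop r = match !r with [] -> None | x :: rest -> r := rest; Some x in
  let socket = ref [ ref ["r1"; "r2"]; ref ["r3"] ] in
  serve ~accept:pop ~read_request:pop [ socket ]
    (fun req -> handled := req :: !handled)
```

With the threads library linked, passing something like
[fun job -> ignore (Thread.create job ())] as [?on_request] would
handle each request in its own thread; the serving code itself would
not change.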

As far as I understand, for AJP the second possibility is not very
interesting (requests are sequential).  It is, however, for FCGI since
requests may be multiplexed (thus one may want to continue reading the
other requests while processing some).

Currently the primitives on which multi-processes/threads apps can be
built are:
- Netcgi_jserv.server_init
- Netcgi_jserv.server_loop
- Netcgi_jserv_ajp12.serve_connection

IMO they are a bit cumbersome to use (looking at their implementation
is necessary to grasp them fully) but they are a good example of what
I mean when I'm talking about a "minimal set of primitives".  I'll
have to think more to see if I can come up with something that I like
better (and hopefully that you will too! :).  (BTW, the AJP protocol
includes its version, so I think it would be good to have a
[serve_connection] that adapts to it -- this makes
[Netcgi_jserv_app.protocol_type] useless.)


A related note on connectors is how they should handle exceptions
raised by the script.  My feeling is that they should catch them and
log them and then get ready for the next request (obviously the last
part only makes sense when several requests are handled -- maybe
sequentially -- by the script).  That seems better than simply
crashing the app.  IMO the exception [Exit] should be treated
specially and be accepted as an appropriate way of ending the script
prematurely (it is really useful for finishing early after error
reporting).
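
The behaviour described above could look roughly like this.  It is a
minimal sketch: [protect], [log] and the toy [script] are hypothetical
names, and requests are plain strings for the sake of the example.

```ocaml
(* Hypothetical sketch; [protect] and [log] are made-up names. *)

(* Run [script] on one request.  [Exit] is taken as a normal early
   termination; any other exception is logged so that the connector
   can go on with the next request instead of crashing the app. *)
let protect ~log script req =
  try script req with
  | Exit -> ()
  | e -> log (Printexc.to_string e)

let errors : string list ref = ref []
let answered : string list ref = ref []

(* A toy script: reports bad input and stops early with [Exit],
   and has a genuine bug on the request "bug". *)
let script req =
  if req = "bad-input" then begin
    answered := "error page" :: !answered;
    raise Exit
  end;
  if req = "bug" then failwith "oops";
  answered := ("ok " ^ req) :: !answered

let () =
  List.iter
    (protect ~log:(fun msg -> errors := msg :: !errors) script)
    [ "a"; "bad-input"; "bug"; "b" ]
```

All four requests are processed: the [Exit] is silent, the failure is
logged once, and the request following the failure is still served.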

> > I do not understand why [Netcgi_jserv.server_init] is not just
> > included in [server_loop].
> 
> This is not possible for multi-processing models: server_init runs in
> the master process, and server_loop in every child process.

I am wondering: why have concurrent, possibly mutex-protected, accepts
on the socket instead of having one process listening on it and
dispatching (to processes or threads) on each accept?

> > * all connectors are treated equally
>
> As explained, this isn't as simple as you think. The connectors
> aren't equal, and the user must know that, and merging the specific
> differences into a common standard is far from being trivial.

The purpose is not to hide *all* differences between the connectors
but, as much as possible, to do the same things the same way.  In
other words: have them share a common "philosophy".

> I am currently thinking about a system of configuration objects. [...]

Yes, they are a good idea and may help designing something elegant.

> > > > About arguments: is the mutability of arguments useful?
> 
> [...] the arguments are often considered as session state, passed
> from one web page to the next [...] mutability is quite natural

I would agree if there was a way to "return" them to the next page.
But of course, there can't be at the level of the "connector".  So
having mutability at this level is in fact misleading.  IMO, session
preservation belongs to a "model" of web applications -- that is to a
library built on top of connectors [1].  It is the role of that
library to provide pages-as-functions, session management through
arguments/databases/continuations/session-key/cookies,... [2].  The
bottom line is that I believe that connectors should not "push"
towards a given model -- all the more so since they can only provide a
half-baked solution [3].

---
[1] It is important, I believe, that the community that can develop such
    "higher level" libraries is offered a simple and standard
    interface to connectors (including mod_caml if you ask me).

[2] I've been dreaming of a session module that can dialog with fully
    typed templates (a la Kartz, but typed) which lets you define the
    session variables you need and manages the state transparently...

[3] The arguments being strings or files, they are very much mutable by
    their very own nature.  However, the flexibility to directly
    modify them is barely matched by the interface (e.g. there is no
    need to "open" the argument for reading AND writing).

    Also, the session management may want to have "hidden" variables
    (like a session id that is automatically generated and not user's
    business) and it is not nice that there is a way to modify those
    "behind the back" of the library.

> > [std_environment] and [test_environment]
> > [custom_environment] is fine
> 
> Ok, this exception exposes internals. But, as already pointed out, I
> don't see why this is so bad.

Well, that will not give me nightmares either.  Nonetheless, I am a
strong believer that interfaces should be minimal and hide irrelevant
details unless there is a strong case for exposing them ("it's there
but ignore it" is not neat IMO).  Note that my question (now stripped) was
broader: what is the modularity gained by providing [std_environment]
and [test_environment] ? -- seems to me [custom_environment] is all we
need.

To speak on something concrete I attach a piece of the interface as I
see it.  It does not include the "extension" interface.  If we agree
that it is only useful to define new connectors, it should either
be an inner module (hidden when -pack'ing) or a separate module.
Normal comments explain the intent or raise questions.  OCamldoc
comments document new functions.

Regards,
ChriS


---
P.S. tmp_directory in make_message_board (in file
netcgi_jserv_app.ml) should take its value from the config record.