caml-list - the Caml user's mailing list
* CamlGI question
@ 2005-04-18  6:15 Mike Hamburg
  2005-04-18  7:29 ` [Caml-list] " Robert Roessler
  2005-04-19 11:33 ` [Caml-list] CamlGI question Christophe TROESTLER
  0 siblings, 2 replies; 37+ messages in thread
From: Mike Hamburg @ 2005-04-18  6:15 UTC (permalink / raw)
  To: caml-list

Hello to the list.

Is CamlGI still actively maintained?  I'm writing a CGI/FastCGI program 
using it, and have been having some trouble with the library.

My CGI program is designed to be a more flexible way to index web 
pages.  It's not finished yet, but a not-very-polished toy example can 
be found at http://capricorn.dnsalias.org/mike/index/ .  (I may as well 
mention that "not very polished" means, among other things, "not known 
to work in browsers other than Firefox, and known to display wrong in 
Safari").

When used as a FastCGI, the indexing script hangs, either usually or 
always, after writing out exactly 8KB of data (that is, 8192 bytes, no 
more, no less).  After 30 seconds, mod_fastcgi times out the 
connection, but mysteriously writes out the rest of the script's 
output.  It is quite clear that the script has finished by the time the 
hang occurs and that all its output has been written with Unix.write, 
as it displays even if killed with signal 9 while hanging; therefore I 
think it may be a formatting error in the output routines of CamlGI.

The script also sometimes breaks its output pipe to the server, which
then stays broken until the script is restarted manually (e.g. sudo
killall index.fcgi).  This may or may not be a separate bug.

The plain CGI version works just fine, if a bit slow, but some of the 
features of the script only work in the FastCGI version, such as 
thumbnailing.

Has anyone familiar with CamlGI seen either of these issues before, or 
have any idea how to resolve them?

Thanks for your time,
--Mike Hamburg



* Re: [Caml-list] CamlGI question
  2005-04-18  6:15 CamlGI question Mike Hamburg
@ 2005-04-18  7:29 ` Robert Roessler
  2005-04-18 13:49   ` Alex Baretta
  2005-04-19 11:33 ` [Caml-list] CamlGI question Christophe TROESTLER
  1 sibling, 1 reply; 37+ messages in thread
From: Robert Roessler @ 2005-04-18  7:29 UTC (permalink / raw)
  To: Mike Hamburg; +Cc: Caml-list

Mike Hamburg wrote:

> Is CamlGI still actively maintained?  I'm writing a CGI/FastCGI program 
> using it, and have been having some trouble with the library.

I am not able to shed any light on the CamlGI question... OTOH, the 
announcement from Gerd Stolpmann a few days ago regarding Ocamlnet 1.0 
may be of interest, given that it includes "a mature implementation of 
the CGI protocol" and "an implementation of the FastCGI protocol".

http://sourceforge.net/projects/ocamlnet

Robert Roessler
roessler@rftp.com
http://www.rftp.com



* Re: [Caml-list] CamlGI question
  2005-04-18  7:29 ` [Caml-list] " Robert Roessler
@ 2005-04-18 13:49   ` Alex Baretta
  2005-04-18 14:31     ` Gerd Stolpmann
  0 siblings, 1 reply; 37+ messages in thread
From: Alex Baretta @ 2005-04-18 13:49 UTC (permalink / raw)
  To: Robert Roessler; +Cc: Mike Hamburg, Caml-list

Robert Roessler wrote:
> Mike Hamburg wrote:
> 
>> Is CamlGI still actively maintained?  I'm writing a CGI/FastCGI 
>> program using it, and have been having some trouble with the library.
> 
> 
> I am not able to shed any light on the CamlGI question... OTOH, the 
> announcement from Gerd Stolpmann a few days ago regarding Ocamlnet 1.0 
> may be of interest, given that it includes "a mature implementation of 
> the CGI protocol" and "an implementation of the FastCGI protocol".

It is worth noting that Baretta DE&IT has commissioned a full 
implementation of the HTTP/1.1 protocol from Gerd. The HTTP library will 
be based on Ocamlnet and will export more or less the same API as the 
Netcgi module. We chose this approach rather than FastCGI because the 
FastCGI project seems dead and did not look like a viable solution for 
our Xcaml application server.

Xcaml aims at being an Apache+Tomcat+JSP+Servlet replacement. The Xcaml
virtual machine and API are already complete, but the performance which
they achieve in conjunction with Apache is mediocre. Gerd's new HTTP
connector for Ocamlnet will give us top-notch performance without
sacrificing the safety guarantees of the Ocaml language and VM.

The new library will be released to the community by Baretta DE&IT and
Gerd Stolpmann jointly under the terms of the GPL. Once the integration
with the Xcaml server is done, the full Application System/Xcaml
will be released under the terms of the GPL.

Alex

-- 
*********************************************************************
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)
<http://www.asxcaml.org/>

The FreerP Project
<http://www.freerp.org/>



* Re: [Caml-list] CamlGI question
  2005-04-18 13:49   ` Alex Baretta
@ 2005-04-18 14:31     ` Gerd Stolpmann
  2005-04-18 16:04       ` Michael Alexander Hamburg
  0 siblings, 1 reply; 37+ messages in thread
From: Gerd Stolpmann @ 2005-04-18 14:31 UTC (permalink / raw)
  To: Alex Baretta; +Cc: Robert Roessler, Mike Hamburg, Caml-list

On Monday, 18.04.2005, at 15:49 +0200, Alex Baretta wrote:
> Robert Roessler wrote:
> > I am not able to shed any light on the CamlGI question... OTOH, the 
> > announcement from Gerd Stolpmann a few days ago regarding Ocamlnet 1.0 
> > may be of interest, given that it includes "a mature implementation of 
> > the CGI protocol" and "an implementation of the FastCGI protocol".
> 
> It is worth noting that Baretta DE&IT has commissioned a full 
> implementation of the HTTP/1.1 protocol from Gerd. The HTTP library will 
> be based on Ocamlnet and will export more or less the same API as the 
> Netcgi module. We chose this approach rather than FastCGI because the 
> FastCGI project seems dead and did not look like a viable solution for 
> our Xcaml application server.
> 
> Xcaml aims at being an Apache+Tomcat+JSP+Servlet replacement. The Xcaml
> virtual machine and API are already complete, but the performance which
> they achieve in conjunction with Apache is mediocre. Gerd's new HTTP
> connector for Ocamlnet will give us top-notch performance without
> sacrificing the safety guarantees of the Ocaml language and VM.

Let me also add a few words about this project. What we are going to
implement here is nothing else but a web server written in O'Caml, or
better a web server component that can be integrated into the
application it is serving. Of course, this web server will have
"industry quality", especially regarding stability and performance. The
HTTP kernel is already written, and implements event-driven message
exchange for HTTP/1.0 and 1.1 in only 1200 lines of code.

Another part of the web server is called the "reactor". It provides a
Netcgi-compatible interface into which existing applications using
Netcgi can be simply plugged in. That means it will be quite easy to add
the web server component to existing CGI applications. The reactor
processes one HTTP request after the other, and can call an arbitrary
content generator for every request. To achieve parallelism, it is
planned to integrate the reactor into a multi-threaded setup.
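
To make the reactor idea concrete, here is a minimal sketch of a loop that
takes one request after the other and calls a content generator for each.
Every name in it (request, response, route, reactor) is made up for
illustration and is not the Netcgi or Ocamlnet API; the real component will
certainly look different.

```ocaml
(* Hypothetical types for illustration only; not the Netcgi/Ocamlnet API. *)
type request = { meth : string; path : string }
type response = { status : int; body : string }

(* A content generator maps one request to one response. *)
type generator = request -> response

(* Choose a generator for a request (an assumed, hard-coded routing rule). *)
let route (req : request) : generator =
  match req.path with
  | "/hello" -> (fun _ -> { status = 200; body = "hello" })
  | _ -> (fun r -> { status = 404; body = "not found: " ^ r.meth ^ " " ^ r.path })

(* The reactor: handle requests strictly one after the other. *)
let reactor (next_request : unit -> request option)
            (send : response -> unit) : unit =
  let rec loop () =
    match next_request () with
    | None -> ()                       (* no more requests: stop *)
    | Some req ->
        send ((route req) req);        (* run the generator, send its result *)
        loop ()
  in
  loop ()
```

Because the loop never starts a request before the previous response has been
sent, parallelism has to come from outside, e.g. by running several such loops
in separate threads.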

I am also figuring out a purely event-based implementation (using only
Unix.select) in the hope that the simplification of scheduling will give
us a performance boost. This setup will be a lot more complicated, and
when carefully combined with multi-threading or -processing it will also
be possible to plug in existing Netcgi-based applications in addition to
purely event-based content generators, i.e. the best of all worlds.

As you can see, some aspects of the web server design follow
conservative ideas (like the reactor), and some are very experimental. I
hope this results in a top-performing server that can be configured in
very flexible ways.

Gerd

> The new library will be released to the community by Baretta DE&IT and 
> Gerd Stolpmann jointly under the terms of the GPL. Once the integration
> with the Xcaml server is done, the full Application System/Xcaml
> will be released under the terms of the GPL.

> Alex
> 
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------




* Re: [Caml-list] CamlGI question
  2005-04-18 14:31     ` Gerd Stolpmann
@ 2005-04-18 16:04       ` Michael Alexander Hamburg
  2005-04-18 16:28         ` Alex Baretta
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Alexander Hamburg @ 2005-04-18 16:04 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Alex Baretta, Robert Roessler, Caml-list

Given then that my application should be multithreaded, and will be
running on a webserver using Rails (which traditionally uses FastCGI),
which of these libraries do you suggest that I use?  Http, Netcgi_afp or
Netcgi_fcgi?  Or are they interoperable enough that it doesn't matter?

Thanks a lot,
Mike

On Mon, 18 Apr 2005, Gerd Stolpmann wrote:

> On Monday, 18.04.2005, at 15:49 +0200, Alex Baretta wrote:
> > Robert Roessler wrote:
> > > I am not able to shed any light on the CamlGI question... OTOH, the
> > > announcement from Gerd Stolpmann a few days ago regarding Ocamlnet 1.0
> > > may be of interest, given that it includes "a mature implementation of
> > > the CGI protocol" and "an implementation of the FastCGI protocol".
> >
> > It is worth noting that Baretta DE&IT has commissioned a full
> > implementation of the HTTP/1.1 protocol from Gerd. The HTTP library will
> > be based on Ocamlnet and will export more or less the same API as the
> > Netcgi module. We chose this approach rather than FastCGI because the
> > FastCGI project seems dead and did not look like a viable solution for
> > our Xcaml application server.
> >
> > Xcaml aims at being an Apache+Tomcat+JSP+Servlet replacement. The Xcaml
> > virtual machine and API are already complete, but the performance which
> > they achieve in conjunction with Apache is mediocre. Gerd's new HTTP
> > connector for Ocamlnet will give us top-notch performance without
> > sacrificing the safety guarantees of the Ocaml language and VM.
>
> Let me also add a few words about this project. What we are going to
> implement here is nothing else but a web server written in O'Caml, or
> better a web server component that can be integrated into the
> application it is serving. Of course, this web server will have
> "industry quality", especially regarding stability and performance. The
> HTTP kernel is already written, and implements event-driven message
> exchange for HTTP/1.0 and 1.1 in only 1200 lines of code.
>
> Another part of the web server is called the "reactor". It provides a
> Netcgi-compatible interface into which existing applications using
> Netcgi can be simply plugged in. That means it will be quite easy to add
> the web server component to existing CGI applications. The reactor
> processes one HTTP request after the other, and can call an arbitrary
> content generator for every request. To achieve parallelism, it is
> planned to integrate the reactor into a multi-threaded setup.
>
> I am also figuring out a purely event-based implementation (using only
> Unix.select) in the hope that the simplification of scheduling will give
> us a performance boost. This setup will be a lot more complicated, and
> when carefully combined with multi-threading or -processing it will also
> be possible to plug in existing Netcgi-based applications in addition to
> purely event-based content generators, i.e. the best of all worlds.
>
> As you can see, some aspects of the web server design follow
> conservative ideas (like the reactor), and some are very experimental. I
> hope this results in a top-performing server that can be configured in
> very flexible ways.
>
> Gerd
>
> > The new library will be released to the community by Baretta DE&IT and
> > Gerd Stolpmann jointly under the terms of the GPL. Once the integration
> > with the Xcaml server is done, the full Application System/Xcaml
> > will be released under the terms of the GPL.
>
> > Alex
> >
> --
> ------------------------------------------------------------
> Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
> gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
> ------------------------------------------------------------
>
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>



* Re: [Caml-list] CamlGI question
  2005-04-18 16:04       ` Michael Alexander Hamburg
@ 2005-04-18 16:28         ` Alex Baretta
  2005-04-19  3:23           ` Mike Hamburg
  0 siblings, 1 reply; 37+ messages in thread
From: Alex Baretta @ 2005-04-18 16:28 UTC (permalink / raw)
  To: Caml-list

Michael Alexander Hamburg wrote:
> Given then that my application should be multithreaded, and will be
> running on a webserver using Rails (which traditionally uses FastCGI),
> which of these libraries do you suggest that I use?  Http, Netcgi_afp or
> Netcgi_fcgi?  Or are they interoperable enough that it doesn't matter?

So long as you use Netcgi, it does not matter. The API does not expose 
the difference between the various connectors.
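
As a rough illustration of what "the API does not expose the difference"
means in practice, the sketch below writes the application against one small
interface and plugs in two toy connectors. The module type CONNECTOR and the
modules Cgi_like and Fcgi_like are invented for this example; the actual
Netcgi interface is different and richer.

```ocaml
(* Hypothetical connector interface; the real Netcgi API differs. *)
module type CONNECTOR = sig
  val name : string
  (* Run the handler once per incoming request and collect the replies. *)
  val serve : (string -> string) -> string list
end

(* The application only sees a query string and returns a page. *)
let handler (query : string) : string =
  "<p>You asked for " ^ query ^ "</p>"

(* Toy connector: one request per process, as with plain CGI. *)
module Cgi_like : CONNECTOR = struct
  let name = "cgi"
  let serve h = [ h "one-shot" ]
end

(* Toy connector: a long-lived process serving several requests. *)
module Fcgi_like : CONNECTOR = struct
  let name = "fastcgi"
  let serve h = List.map h [ "first"; "second" ]
end

(* Switching connectors is a one-line change at the entry point. *)
let run (module C : CONNECTOR) = (C.name, C.serve handler)
```

The handler is untouched when the connector changes, e.g.
run (module Cgi_like) versus run (module Fcgi_like).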

Alex

-- 
*********************************************************************
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)
<http://www.asxcaml.org/>

The FreerP Project
<http://www.freerp.org/>



* Re: [Caml-list] CamlGI question
  2005-04-18 16:28         ` Alex Baretta
@ 2005-04-19  3:23           ` Mike Hamburg
  2005-04-19  3:26             ` [Caml-list] CamlGI question [doh] Mike Hamburg
  0 siblings, 1 reply; 37+ messages in thread
From: Mike Hamburg @ 2005-04-19  3:23 UTC (permalink / raw)
  To: Alex Baretta; +Cc: Caml-list

I have ported my application to NetCGI, and it works as a CGI script,
just as it did under CamlGI.  However, if compiled as a FastCGI script,
it denies connections from the server and then dies (broken pipe).  I'd
rather run the CGI as a standalone application, preferably FastCGI as
that's in my server, but if the FastCGI code in NetCGI doesn't work, I'd
be fine compiling it as AJP.

How should I either
(1) fix a broken pipe error in NetCGI or
(2) fix the broken pipe?

Thanks again,
Mike Hamburg

Alex Baretta wrote:

> Michael Alexander Hamburg wrote:
>
>> Given then that my application should be multithreaded, and will be
>> running on a webserver using Rails (which traditionally uses FastCGI),
>> which of these libraries do you suggest that I use?  Http, Netcgi_afp or
>> Netcgi_fcgi?  Or are they interoperable enough that it doesn't matter?
>
>
> So long as you use Netcgi, it does not matter. The API does not expose
> the difference between the various connectors.
>
> Alex
>



* Re: [Caml-list] CamlGI question [doh]
  2005-04-19  3:23           ` Mike Hamburg
@ 2005-04-19  3:26             ` Mike Hamburg
  2005-04-19  9:18               ` Gerd Stolpmann
  2005-04-19  9:31               ` Alex Baretta
  0 siblings, 2 replies; 37+ messages in thread
From: Mike Hamburg @ 2005-04-19  3:26 UTC (permalink / raw)
  To: Mike Hamburg; +Cc: Alex Baretta, Caml-list

I'm obviously too tired to be programming.  That should, of course, read
(1) fix a broken pipe error in Netcgi_fast or
(2) port the application to (and configure the webserver for) AJP

--Mike

Mike Hamburg wrote:

>How should I either
>(1) fix a broken pipe error in NetCGI or
>(2) fix the broken pipe?
>



* Re: [Caml-list] CamlGI question [doh]
  2005-04-19  3:26             ` [Caml-list] CamlGI question [doh] Mike Hamburg
@ 2005-04-19  9:18               ` Gerd Stolpmann
  2005-04-19 15:28                 ` Mike Hamburg
  2005-04-19  9:31               ` Alex Baretta
  1 sibling, 1 reply; 37+ messages in thread
From: Gerd Stolpmann @ 2005-04-19  9:18 UTC (permalink / raw)
  To: Mike Hamburg; +Cc: Alex Baretta, Caml-list

On Monday, 18.04.2005, at 23:26 -0400, Mike Hamburg wrote:
> I'm obviously too tired to be programming.  That should, of course, read
> (1) fix a broken pipe error in Netcgi_fast or

Well, we would need more details to help you.

> (2) port the application to (and configure the webserver for) AJP

Porting the application is quite simple, just follow the examples coming
with Ocamlnet (in examples/jserv). For the webserver, you need mod_jk
(jk=Jakarta). The configuration reference is here:

http://jakarta.apache.org/tomcat/connectors-doc/config/apache.html

Note that Ocamlnet only supports AJP-1.2, not 1.3 which is the current
default for Tomcat.

Gerd
> 
> --Mike
> 
> Mike Hamburg wrote:
> 
> >How should I either
> >(1) fix a broken pipe error in NetCGI or
> >(2) fix the broken pipe?
> >
> 
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------




* Re: [Caml-list] CamlGI question [doh]
  2005-04-19  3:26             ` [Caml-list] CamlGI question [doh] Mike Hamburg
  2005-04-19  9:18               ` Gerd Stolpmann
@ 2005-04-19  9:31               ` Alex Baretta
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Baretta @ 2005-04-19  9:31 UTC (permalink / raw)
  To: Ocaml

Mike Hamburg wrote:
> I'm obviously too tired to be programming.  That should, of course, read
> (1) fix a broken pipe error in Netcgi_fast or
> (2) port the application to (and configure the webserver for) AJP
> 
> --Mike
> 
> Mike Hamburg wrote:
> 

If at all possible, rather than AJP I'd stick to FastCGI.  The code in 
Ocamlnet reportedly works fine, but I don't have experience with it. Are 
you sure that the broken pipe is caused by a bug in Ocamlnet and not in 
the web server?

Alex

-- 
*********************************************************************
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)
<http://www.asxcaml.org/>

The FreerP Project
<http://www.freerp.org/>



* Re: [Caml-list] CamlGI question
  2005-04-18  6:15 CamlGI question Mike Hamburg
  2005-04-18  7:29 ` [Caml-list] " Robert Roessler
@ 2005-04-19 11:33 ` Christophe TROESTLER
  2005-04-19 12:51   ` Christopher Alexander Stein
  2005-04-19 20:13   ` [Caml-list] CamlGI question Michael Alexander Hamburg
  1 sibling, 2 replies; 37+ messages in thread
From: Christophe TROESTLER @ 2005-04-19 11:33 UTC (permalink / raw)
  To: Mike Hamburg; +Cc: caml-list

On Mon, 18 Apr 2005, Mike Hamburg <hamburg@fas.harvard.edu> wrote:
> 
> Is CamlGI still actively maintained?  I'm writing a CGI/FastCGI
> program using it, and have been having some trouble with the library.

It is -- I just do not have much time to care about it.

> http://capricorn.dnsalias.org/mike/index/

I downloaded your files.  For a start, all the modules are -pack'ed
into camlGI.cm[x]a, so you only need to link with that file.  Also,
the interface of the library is in camlGI.mli with complete
documentation.  In particular, you should not use hidden submodules:
e.g. in "path.ml", say [open CamlGI] and then [Cgi.HttpError] instead
of [Cgi_types.HttpError].  Same in "index.ml": you should say [open
CamlGI.Cgi], not [open Cgi].

[Request.metavar rq.rq "SERVER_NAME"] can simply be replaced with
[Request.server_name rq.rq]

[cgi#header_was_emitted ()]: no such method exists in the public
interface.

I do not see why you set [rq=request] as the request can be gotten
from the cgi object [cgi#request].

> When used as a FastCGI, the indexing script hangs, [...] It is quite
> clear that the script has finished by the time the hang occurs

Do the examples provided with the lib work as they should?

Are you sure your [main] function actually terminates?  Indeed, the
output is buffered (at least by CamlGI) and may not be fully written out
until the script finishes.  Also, if you wish to launch a new
process/thread per request, be sure to use the [fork] optional
parameter to [handle_requests ?fork f conn] -- otherwise [f] will
return immediately and the output "channel" will be closed early.
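
To illustrate the buffering point, here is a toy output channel in which
nothing reaches the server until the handler has returned and the buffer is
flushed. This is only a sketch of the general behavior and not CamlGI's
implementation; the type out and the functions make, print, flush and
handle_request are names invented for the example.

```ocaml
(* A toy buffered output channel; not CamlGI's actual implementation. *)
type out = {
  buf : Buffer.t;               (* data accepted but not yet sent *)
  mutable sent : string list;   (* chunks that reached the server, newest first *)
}

let make () = { buf = Buffer.create 8192; sent = [] }

(* Printing only appends to the in-memory buffer. *)
let print o s = Buffer.add_string o.buf s

(* Flushing hands the buffered data over and empties the buffer. *)
let flush o =
  if Buffer.length o.buf > 0 then begin
    o.sent <- Buffer.contents o.buf :: o.sent;
    Buffer.clear o.buf
  end

(* The handler's output becomes visible only once it returns and we flush. *)
let handle_request o (f : out -> unit) =
  f o;
  flush o
```

If the handler never returns, or returns before a worker it spawned has
finished printing, the pending data is stuck or the channel is closed too
early, which would look like a hang or a truncated page.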

CamlGI follows the spec closely -- even the multiplexing part, which is
not implemented by many.

> The plain CGI version works just fine

CGI output is not buffered.

> the features of the script only work in the FastCGI version, such as
> thumbnailing.

Why is that?  Do you need persistence for that?

Hope it helps.  If it does not, send me an example (if possible
minimal but definitely self-contained) that exhibits the undesired
behavior and I'll have a look.

ChriS



* Re: [Caml-list] CamlGI question
  2005-04-19 11:33 ` [Caml-list] CamlGI question Christophe TROESTLER
@ 2005-04-19 12:51   ` Christopher Alexander Stein
  2005-04-19 19:03     ` Common CGI interface (was: [Caml-list] CamlGI question) Christophe TROESTLER
  2005-04-19 20:13   ` [Caml-list] CamlGI question Michael Alexander Hamburg
  1 sibling, 1 reply; 37+ messages in thread
From: Christopher Alexander Stein @ 2005-04-19 12:51 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: Mike Hamburg, caml-list



I think that today a working CGI interface is almost as important for
a general-purpose language as an a.out or ELF executable generator.
A database interface (MySQL) follows close behind.

Lex

On Tue, 19 Apr 2005, Christophe TROESTLER wrote:

> On Mon, 18 Apr 2005, Mike Hamburg <hamburg@fas.harvard.edu> wrote:
> >
> > Is CamlGI still actively maintained?  I'm writing a CGI/FastCGI
> > program using it, and have been having some trouble with the library.
>
> It is -- I just do not have much time to care about it.
>
> > http://capricorn.dnsalias.org/mike/index/
>
> I downloaded your files.  For a start, all the modules are -pack'ed
> into camlGI.cm[x]a, so you only need to link with that file.  Also,
> the interface of the library is in camlGI.mli with complete
> documentation.  In particular, you should not use hidden submodules:
> e.g. in "path.ml", say [open CamlGI] and then [Cgi.HttpError] instead
> of [Cgi_types.HttpError].  Same in "index.ml": you should say [open
> CamlGI.Cgi], not [open Cgi].
>
> [Request.metavar rq.rq "SERVER_NAME"] can simply be replaced with
> [Request.server_name rq.rq]
>
> [cgi#header_was_emitted ()]: no such method exists in the public
> interface.
>
> I do not see why you set [rq=request] as the request can be gotten
> from the cgi object [cgi#request].
>
> > When used as a FastCGI, the indexing script hangs, [...] It is quite
> > clear that the script has finished by the time the hang occurs
>
> Do the examples provided with the lib work as they should?
>
> Are you sure your [main] function actually terminates?  Indeed, the
> output is buffered (at least by CamlGI) and may not be fully written out
> until the script finishes.  Also, if you wish to launch a new
> process/thread per request, be sure to use the [fork] optional
> parameter to [handle_requests ?fork f conn] -- otherwise [f] will
> return immediately and the output "channel" will be closed early.
>
> CamlGI follows the spec closely -- even the multiplexing part, which is
> not implemented by many.
>
> > The plain CGI version works just fine
>
> CGI output is not buffered.
>
> > the features of the script only work in the FastCGI version, such as
> > thumbnailing.
>
> Why is that?  Do you need persistence for that?
>
> Hope it helps.  If it does not, send me an example (if possible
> minimal but definitely self-contained) that exhibits the undesired
> behavior and I'll have a look.
>
> ChriS
>



* Re: [Caml-list] CamlGI question [doh]
  2005-04-19  9:18               ` Gerd Stolpmann
@ 2005-04-19 15:28                 ` Mike Hamburg
       [not found]                   ` <1113933973.6248.76.camel@localhost.localdomain>
  0 siblings, 1 reply; 37+ messages in thread
From: Mike Hamburg @ 2005-04-19 15:28 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Alex Baretta, Caml-list

I'll try to help work out this bug in either Apache/FastCGI or Netcgi,
and if that doesn't work, I'll go to AJP.  I compiled mod_fastcgi-2.4.0
on Apache 2.0.53.  It successfully serves FastCGI pages written with
CamlGI, except that it sometimes pauses after 8k of data, and sometimes
breaks the pipe to the CGI.

Under Netcgi 1.0, the CGI version works but the FastCGI version does not
work at all.  The difference between the CGI and FastCGI versions is only:

(* FastCGI entry point *)
let () = Netcgi_fcgi.serv main (`Direct "<br><hr><br>")
(* Plain CGI entry point, used instead of the line above:
let cgi = new Netcgi.std_activation ()
let () = main cgi *)

The application reads files from the outside world, and may launch
processes with fork+execv to thumbnail images.  It is multithreaded,
with at least 4 threads at all times: the main thread, a thumbnail cache
manager, a thread which handles signals (this cannot be done in a signal
handler, as it may block, so I wake up the handler thread instead), and
a thread which rereads the configuration files periodically, so that the
user does not have to send HUP.

The application usually raises Unix.Unix_error (EPIPE, "write", "") on
the line Netcgi_fcgi.serv main (`Direct "<br><hr><br>") or rather,
inside Netcgi_fcgi.serv at some point; I believe this occurs before main
is called.  The webserver also receives EPIPE and returns error 500. 
Sometimes, I get other errors: sometimes main gets called, and I get
Failure("send_output_header").  If this happens, then rollback_output
successfully prints "<br><hr><br>" but no more (i.e. no further
diagnostics).  Sometimes, the webserver times out waiting for the first
read, in which case the application still raises EPIPE.

Is there any more information that would help to debug the problem?

Mike

Alex Baretta wrote:

> If at all possible, rather than AJP I'd stick to FastCGI.  The code in
> Ocamlnet reportedly works fine, but I don't have experience with it.
> Are you sure that the broken pipe is caused by a bug in Ocamlnet and
> not in the web server?
>
> Alex 



Gerd Stolpmann wrote:

>On Monday, 18.04.2005, at 23:26 -0400, Mike Hamburg wrote:
>
>>I'm obviously too tired to be programming.  That should, of course, read
>>(1) fix a broken pipe error in Netcgi_fast or
>
>Well, we would need more details to help you.
>
>>(2) port the application to (and configure the webserver for) AJP
>
>Porting the application is quite simple, just follow the examples coming
>with Ocamlnet (in examples/jserv). For the webserver, you need mod_jk
>(jk=Jakarta). The configuration reference is here:
>
>http://jakarta.apache.org/tomcat/connectors-doc/config/apache.html
>
>Note that Ocamlnet only supports AJP-1.2, not 1.3 which is the current
>default for Tomcat.
>
>Gerd



* Re: [Caml-list] CamlGI question [doh]
       [not found]                   ` <1113933973.6248.76.camel@localhost.localdomain>
@ 2005-04-19 18:44                     ` Eric Stokes
  2005-04-19 19:18                       ` Christophe TROESTLER
  2005-04-19 21:11                     ` Eric Stokes
  1 sibling, 1 reply; 37+ messages in thread
From: Eric Stokes @ 2005-04-19 18:44 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: caml-list, Mike Hamburg

I have several multi-threaded applications written with Netcgi_fastcgi,
and I can say that I have seen problems like this, but not to this
degree. However I've only tested on Apache 1.3.x and 2.0.x with
mod_fastcgi 2.4.2. That said, printing stuff to stdout should not cause
a problem: because stdout is a listening socket, the data would
essentially go into a buffer which will never be read. I think the
problem may actually be single_write vs. write. When I originally wrote
the fastcgi code for ocamlnet I was new to Ocaml, and I did not realize
that write was different from the libc equivalent. Under load, or in
some situations, using Unix.write will definitely cause a protocol
error and could explain some strange intermittent behavior that I have
seen. Therefore I will commit a patch to the fastcgi library to use
single_write instead of write and we'll see what happens. I'll also try
to determine the effect of an error, and get back to you.
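
For illustration only, this is roughly the kind of loop I have in mind: call a
primitive that performs a single write, look at how many bytes it actually
accepted, and continue from there. It is a sketch, not the patch itself. The
primitive is passed in as a parameter so the example runs without a socket;
in real code it would presumably be Unix.single_write applied to the
descriptor. The names write_all and make_sink are made up for this example.

```ocaml
(* [single_write buf ofs len] is an assumed primitive that writes at most
   [len] bytes starting at [ofs] and returns how many it wrote, in the same
   way as [Unix.single_write fd buf ofs len].  It must make progress
   (return > 0) whenever [len > 0], otherwise this loop would not end. *)
let rec write_all single_write buf ofs len =
  if len > 0 then begin
    let n = single_write buf ofs len in
    (* Continue with whatever part was not accepted this time. *)
    write_all single_write buf (ofs + n) (len - n)
  end

(* A fake descriptor for the example: accepts at most 3 bytes per call. *)
let make_sink () =
  let out = Buffer.create 16 in
  let single_write buf ofs len =
    let n = min 3 len in
    Buffer.add_subbytes out buf ofs n;
    n
  in
  (out, single_write)
```

With the loop under our control, the code always knows exactly how many bytes
of a record have gone out, which is what matters for keeping the protocol
stream consistent.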

Mike, thanks for sticking with us and providing data, we appreciate it 
a lot.

On Apr 19, 2005, at 11:06 AM, Gerd Stolpmann wrote:

> On Tuesday, 19.04.2005, at 11:28 -0400, Mike Hamburg wrote:
>> I'll try to help work out this bug in either Apache/FastCGI or Netcgi,
>> and if that doesn't work, I'll go to AJP.  I compiled mod_fastcgi-2.4.0
>> on Apache 2.0.53.  It successfully serves FastCGI pages written with
>> CamlGI, except that it sometimes pauses after 8k of data, and sometimes
>> breaks the pipe to the CGI.
>
> Just to make sure Netcgi works also for multi-threaded programs, I made
> a test for myself. I added a second thread to the example add_fastcgi.ml
> (of Ocamlnet), such that it reads now
>
> let a = ref ["Z"; "X"; "A"; "P"; "B" ];;
>
> Thread.create
>   (fun () ->
>      while true do
>        a := List.sort compare !a
>      done
>   )
>   ()
> ;;
>
> (* start the fastcgi server *)
> serv process buffered_transactional_optype;;
>
> at the end. So there is now a thread sorting this list over and over
> again.
>
> I tried it with Apache 2.0.50 and mod_fastcgi 2.4.2 (those versions
> coming with Ubuntu Warty). The first test failed, I had to comment out
> the FastCgiWrapper which is part of the default configuration. After
> that, the program just worked.
>
> I also looked at the sources. There might be problems because the O'Caml
> thread implementation uses signals to wake up threads in some
> situations. This may cause Unix.read/write (one should better
> use Unix.single_write, btw.) to fail with EINTR. This is not caught.
>
>> Under Netcgi 1.0, the CGI version works but the FastCGI version does not
>> work at all.  The difference between the CGI and FastCGI versions is only:
>>
>> let () = Netcgi_fcgi.serv main (`Direct "<br><hr><br>");
>> (* let cgi = new Netcgi.std_activation ()
>> let () =  main cgi *)
>>
>> The application reads files from the outside world, and may launch
>> processes with fork+execv to thumbnail images.  It is multithreaded,
>> with at least 4 threads at all times: the main thread, a thumbnail cache
>> manager, a thread which handles signals (this cannot be done in a signal
>> handler, as it may block, so I wake up the handler thread instead), and
>> a thread which rereads the configuration files periodically, so that the
>> user does not have to send HUP.
>>
>> The application usually raises Unix.Unix_error (EPIPE, "write", "") on
>> the line Netcgi_fcgi.serv main (`Direct "<br><hr><br>") or rather,
>> inside Netcgi_fcgi.serv at some point; I believe this occurs before main
>> is called.
>
> You can figure this out exactly when you byte-compile the program with
> -g, and set OCAMLRUNPARAM=b=1, e.g. using a wrapper script and
> redirecting stderr to a file.
>
> The EPIPE error is quite surprising, because Netcgi wraps all its
> exceptions into FCGI_Error.
>
>>  The webserver also receives EPIPE and returns error 500.
>> Sometimes, I get other errors: sometimes main gets called, and I get
>> Failure("send_output_header").
>
> This is very strange, too. Failure("send_output_header") can only happen
> when the HTTP header is printed at the wrong moment (i.e. too late).
>
>>  If this happens, then rollback_output
>> successfully prints "<br><hr><br>" but no more (i.e. no further
>> diagnostics).  Sometimes, the webserver times out waiting for the first
>> read, in which case the application still raises EPIPE.
>>
>> Is there any more information that would help to debug the problem?
>
> I suspect a part of the program prints something to stdout at the wrong
> moment, maybe because of a race condition. This is only my intuition,
> but this is the first thing I would try to look for. This would explain
> why the fcgi protocol is violated, and EPIPE is the logical consequence
> (the web server sees wrong data and shuts down the socket).
>
> Maybe an strace of the process can show what is really going on.
>
> Gerd
>
>>
>> Mike
>>
>> Alex Baretta wrote:
>>
>>> If at all possible, rather than AJP I'd stick to FastCGI.  The code in
>>> Ocamlnet reportedly works fine, but I don't have experience with it.
>>> Are you sure that the broken pipe is caused by a bug in Ocamlnet and
>>> not in the web server?
>>>
>>> Alex
>>
>>
>>
>> Gerd Stolpmann wrote:
>>
>>> On Monday, 18.04.2005, at 23:26 -0400, Mike Hamburg wrote:
>>>
>>>
>>>> I'm obviously too tired to be programming.  That should, of course, read
>>>> (1) fix a broken pipe error in Netcgi_fast or
>>>>
>>>>
>>>
>>> Well, we would need more details to help you.
>>>
>>>
>>>
>>>> (2) port the application to (and configure the webserver for) AJP
>>>>
>>>>
>>>
>>> Porting the application is quite simple, just follow the examples coming
>>> with Ocamlnet (in examples/jserv). For the webserver, you need mod_jk
>>> (jk=Jakarta). The configuration reference is here:
>>>
>>> http://jakarta.apache.org/tomcat/connectors-doc/config/apache.html
>>>
>>> Note that Ocamlnet only supports AJP-1.2, not 1.3 which is the current
>>> default for Tomcat.
>>>
>>> Gerd
>>>
>>>
>>
>>
> -- 
> ------------------------------------------------------------
> Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
> gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
> ------------------------------------------------------------
>
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Common CGI interface (was: [Caml-list] CamlGI question)
  2005-04-19 12:51   ` Christopher Alexander Stein
@ 2005-04-19 19:03     ` Christophe TROESTLER
  2005-04-19 19:54       ` Gerd Stolpmann
  0 siblings, 1 reply; 37+ messages in thread
From: Christophe TROESTLER @ 2005-04-19 19:03 UTC (permalink / raw)
  To: stein; +Cc: hamburg, caml-list

On Tue, 19 Apr 2005, Christopher Alexander Stein <stein@eecs.harvard.edu> wrote:
> 
> I think that today, a [...] CGI interface is [...] important

I am not sure what your point is, but the trouble right now is not that
there is no CGI library but that there are too many [1]!  So let me
place a call:

 | Would people be interested in setting up a list to discuss a common
 | CGI-like interface, i.e. a minimal set of services to be offered
 | (in the same vein as what was done for I/O objects, see
 | http://ocaml-programming.de/rec/IO-Classes.html).  [It should not
 | be hurried as for some library authors, this is not the main job.]
 | The aim is to make it possible to develop higher level libraries
 | (e.g. template libraries) that work whatever the basic interface
 | one favors.

> A database interface (MySQL) follows close behind.

There are libraries for many databases as well as a generic one: DBI
(http://savannah.nongnu.org/cgi-bin/viewcvs/modcaml/ocamldbi/).

Cheers,
ChriS

---
[1] Among others,
- Maxence Guesdon CGI (http://pauillac.inria.fr/~guesdon/Tools/cgi/)
- CamlGI (http://sourceforge.net/projects/ocaml-cgi/)
- fcgi-ocaml (http://sourceforge.net/projects/tcl-fastcgi/)
- mod_caml (https://savannah.nongnu.org/projects/modcaml/)
- OCamlnet (http://ocamlnet.sourceforge.net/)
- cgi (http://www.lri.fr/~filliatr/ftp/ocaml/cgi/)


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] CamlGI question [doh]
  2005-04-19 18:44                     ` Eric Stokes
@ 2005-04-19 19:18                       ` Christophe TROESTLER
  0 siblings, 0 replies; 37+ messages in thread
From: Christophe TROESTLER @ 2005-04-19 19:18 UTC (permalink / raw)
  To: O'Caml Mailing List

On Tue, 19 Apr 2005, Eric Stokes <eric.stokes@csun.edu> wrote:
> 
> printing stuff to stdout should not cause a problem, because stdout
> is a listening socket, the data would essentially go into a buffer
> which will never be read.

On many FastCGI implementations it indeed shouldn't be a problem, but
it may be one if they respect the spec.  Indeed, the latter mandates that
stdout and stderr be closed
(http://www.fastcgi.com/devkit/doc/fcgi-spec.html#S2.2).

> [...] problem may actually be single_write vs. write.  [...] and get
> back to you.

Get back to us all.  I am also interested whether this is indeed the
problem.

Regards,
ChriS


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Common CGI interface (was: [Caml-list] CamlGI question)
  2005-04-19 19:03     ` Common CGI interface (was: [Caml-list] CamlGI question) Christophe TROESTLER
@ 2005-04-19 19:54       ` Gerd Stolpmann
  2005-04-20  6:55         ` Jean-Christophe Filliatre
                           ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Gerd Stolpmann @ 2005-04-19 19:54 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: stein, hamburg, caml-list

On Tuesday, 19.04.2005, at 21:03 +0200, Christophe TROESTLER wrote:
> On Tue, 19 Apr 2005, Christopher Alexander Stein <stein@eecs.harvard.edu> wrote:
> > 
> > I think that today, a [...] CGI interface is [...] important
> 
> I am not sure what your point is, but the trouble right now is not that
> there is no CGI library but that there are too many [1]!  So let me
> place a call:
> 
>  | Would people be interested in setting up a list to discuss a common
>  | CGI-like interface, i.e. a minimal set of services to be offered
>  | (in the same vein as what was done for I/O objects, see
>  | http://ocaml-programming.de/rec/IO-Classes.html).  [It should not
>  | be hurried as for some library authors, this is not the main job.]
>  | The aim is to make it possible to develop higher level libraries
>  | (e.g. template libraries) that work whatever the basic interface
>  | one favors.

Good idea. However, I think it is too late for such a discussion.

First, it already happened. Do you remember Bedouin? Although this
debate was about the general design of web applications, there was also
a "branch" targeting the low-level stuff, especially CGI and other
connectors. This branch was Ocamlnet.

Second, Ocamlnet exactly defines the "minimal set of services" (besides
including several implementations). The interesting point is that it is
possible to do implementations outside Ocamlnet by just defining
compatible classes. This was a design idea from the very beginning,
realized by using classes instead of functors everywhere. Because
Ocamlnet has several layers, the developer of a new connector is even
free to choose the level of the implementation, often giving one the
chance to reuse code.
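
A rough illustration of the "compatible classes" idea, as a sketch only:
the names below (out_channel_like, render_page, buffer_out) are
hypothetical and are not Ocamlnet's actual interface. Because O'Caml
objects are typed structurally, a function written against a class type
accepts any object with matching methods, wherever that object was
defined:

```ocaml
(* Hypothetical minimal output interface; NOT Ocamlnet's real class type. *)
class type out_channel_like = object
  method output_string : string -> unit
  method flush : unit -> unit
end

(* A "higher-level library" function that depends only on the class type. *)
let render_page (out : #out_channel_like) title =
  out#output_string ("<h1>" ^ title ^ "</h1>");
  out#flush ()

(* An independent implementation that inherits from nothing; it is
   compatible simply because its methods have the right types. *)
class buffer_out (buf : Buffer.t) = object
  method output_string s = Buffer.add_string buf s
  method flush () = ()
end
```

With these definitions, render_page (new buffer_out b) "hi" writes into
the buffer b; a connector-specific class with the same two methods could
be passed instead without changing render_page.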

I am quite astonished to see so many CGI implementations. I only
knew the implementation of de Rauglaudre and Filliatre, and its
limitations were one of the motivations to develop Ocamlnet. Except
mod_caml, which is somehow a different thing, the other libraries seem
to have the same limitations: non-modular design and missing features like
upload of large (> 16 MB) files or internationalization. I don't say
Ocamlnet is perfect, but it is a step in the right direction.

Gerd

> 
> > A database interface (MySQL) follows close behind.
> 
> There are libraries for many databases as well as a generic one: DBI
> (http://savannah.nongnu.org/cgi-bin/viewcvs/modcaml/ocamldbi/).
> 
> Cheers,
> ChriS
> 
> ---
> [1] Among others,
> - Maxence Guesdon CGI (http://pauillac.inria.fr/~guesdon/Tools/cgi/)
> - CamlGI (http://sourceforge.net/projects/ocaml-cgi/)
> - fcgi-ocaml (http://sourceforge.net/projects/tcl-fastcgi/)
> - mod_caml (https://savannah.nongnu.org/projects/modcaml/)
> - OCamlnet (http://ocamlnet.sourceforge.net/)
> - cgi (http://www.lri.fr/~filliatr/ftp/ocaml/cgi/)
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
> 
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] CamlGI question
  2005-04-19 11:33 ` [Caml-list] CamlGI question Christophe TROESTLER
  2005-04-19 12:51   ` Christopher Alexander Stein
@ 2005-04-19 20:13   ` Michael Alexander Hamburg
  1 sibling, 0 replies; 37+ messages in thread
From: Michael Alexander Hamburg @ 2005-04-19 20:13 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: caml-list



On Tue, 19 Apr 2005, Christophe TROESTLER wrote:

> On Mon, 18 Apr 2005, Mike Hamburg <hamburg@fas.harvard.edu> wrote:
> >
> > http://capricorn.dnsalias.org/mike/index/
>
> I downloaded your files.  For a start, all the modules are -pack'ed
> into camlGI.cm[x]a, so you only need to link with that file.  Also,
> the interface of the library is in camlGI.mli with complete
> documentation.  In particular, you should not use hidden submodules:
> e.g. in "path.ml", say [open CamlGI] and then [Cgi.HttpError] instead
> of [Cgi_types.HttpError].  Same in "index.ml": you should say [open
> CamlGI.Cgi], not [open Cgi].
>
> [Request.metavar rq.rq "SERVER_NAME"] can simply be replaced with
> [Request.server_name rq.rq]
>
> [cgi#header_was_emitted ()]: such method does not exist in the public
> interface.

It doesn't exist at all.  I added it because I was getting mysterious
type errors from OCaml when trying to compile directly.  It was on my
"things to clean up" list.

> I do not see why you set [rq=request] as the request can be gotten
> from the cgi object [cgi#request].

You're right, I missed that.

> > When used as a FastCGI, the indexing script hangs, [...] It is quite
> > clear that the script has finished by the time the hang occurs
>
> Do the examples provided with the lib work as they should?

I'll check that too (I'm not at home right now).

> Are you sure your [main] function actually terminates?  Indeed, the
> output is buffered (at least by CamlGI) and may not be fully outputted
> until the script finishes.  Also, if you wish to launch a new
> process/thread per request, be sure to use the [fork] optional
> parameter to [handle_requests ?fork f conn] -- otherwise [f] will
> return immediately and the output "channel" will be closed early.

My main function does actually terminate.  I put a logging message at the
end of the function.  And it's not just buffering output, as killing the
CGI script with signal 9 causes no truncation.  I'll set it to create a
new thread to handle the request, although that doesn't particularly
matter (the threading is necessary to keep the thumbnail cache running,
not for load issues).

> CamlGI follows closely the spec -- even the multiplexing part which is
> not implemented by many.
>
> > The plain CGI version works just fine
>
> CGI output is not buffered.
>
> > the features of the script only work in the FastCGI version, such as
> > thumbnailing.
>
> Why is that?  Do you need persistence for that?

I can do it without persistence, but synchronization is much more
difficult in that case, e.g., preventing two processes from trying to
thumbnail the same files at the same time.

> Hope it helps.  If it does not, send me an example (if possible
> minimal but definitely self-contained) that exhibits the undesired
> behavior and I'll have a look.

I'll work on it soon and figure out whether this helps or not.  Hopefully
it will.

> ChriS
>

Thanks a lot for your help,
Mike


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] CamlGI question [doh]
       [not found]                   ` <1113933973.6248.76.camel@localhost.localdomain>
  2005-04-19 18:44                     ` Eric Stokes
@ 2005-04-19 21:11                     ` Eric Stokes
  1 sibling, 0 replies; 37+ messages in thread
From: Eric Stokes @ 2005-04-19 21:11 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: caml-list, Mike Hamburg

I've looked into this a bit more, and I have two things to say.

1. Unfortunately, it may not get any better than write. Once I slap on 
the fastcgi header telling the server how many bytes I'm going to 
write, I've committed to writing that many bytes; if I don't, it's an 
error. That means I'd end up having to do the same thing as write if I 
used single_write. I'd have to check the return value to be sure that I 
wrote the correct number of bytes and, if I didn't, call 
single_write again. This means that if there is an error on, say, the 
third invocation, I'd have a protocol error, because some data has 
already been written. I might mitigate this by writing data in smaller 
chunks, but this can affect performance. I'll do some thinking about this.

2. We are not specific in the documentation about how you actually 
write threaded apps with Netcgi_fcgi. This is very likely the cause 
of your problem. You need to call serv in each thread that you want 
to work on web stuff. The idea is that each thread calls 
accept on the listening socket, and clients are distributed to threads 
in this manner. If your thread is never going to send data back via the 
fastcgi sockets then it is fine if it does not call serv; however, any 
thread which communicates with the web server must do so only through 
serv, e.g.

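(* this is how I run threaded web applications via Ocamlnet *)
let worker_thread () =
   Netcgi_fcgi.serv req_handler
     buffered_transactional_optype;;

(* start numthreads workers; each one calls serv and accepts clients.
   Thread.create returns a thread handle, which is not needed here. *)
for i = 1 to numthreads
do
	ignore (Thread.create worker_thread ())
done;;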
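
To make point 1 above concrete, here is a minimal sketch of the "header
first, then exactly that many bytes" commitment. The 8-byte header
layout (version, type, request id, content length, padding length,
reserved) follows the FastCGI spec; record_header and write_all are
hypothetical helper names and are not functions of Ocamlnet. The writer
is passed in as a function so the loop can be read on its own; in a
real program it would be something like (Unix.single_write fd), with
EINTR additionally caught and retried.

```ocaml
(* Values from the FastCGI spec: protocol version 1, record type
   FCGI_STDOUT = 6. *)
let fcgi_version = 1
let fcgi_stdout = 6

(* Build the 8-byte record header. Once this has been sent, exactly
   [content_length] payload bytes must follow. *)
let record_header ~rtype ~request_id ~content_length =
  if content_length < 0 || content_length > 0xFFFF then
    invalid_arg "record_header: content_length";
  let b = Bytes.make 8 '\000' in
  Bytes.set_uint8 b 0 fcgi_version;
  Bytes.set_uint8 b 1 rtype;
  Bytes.set_uint16_be b 2 request_id;
  Bytes.set_uint16_be b 4 content_length;
  (* bytes 6 and 7 (padding length, reserved) stay 0 *)
  b

(* Keep calling [write] until all [len] bytes are out. [write] has the
   shape of [Unix.single_write fd]: it may write fewer bytes than asked
   and returns how many it wrote. *)
let rec write_all write buf pos len =
  if len > 0 then begin
    let n = write buf pos len in
    if n <= 0 then failwith "write_all: no progress";
    write_all write buf (pos + n) (len - n)
  end
```

If write_all fails part-way through, the stream is left in the state
described above: the header promised more bytes than were delivered, so
the only sane recovery is probably to close the connection.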



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Common CGI interface (was: [Caml-list] CamlGI question)
  2005-04-19 19:54       ` Gerd Stolpmann
@ 2005-04-20  6:55         ` Jean-Christophe Filliatre
  2005-04-20  7:22         ` Common XML interface (was: Common CGI interface) Alain Frisch
  2005-04-20 20:00         ` Common CGI interface Christophe TROESTLER
  2 siblings, 0 replies; 37+ messages in thread
From: Jean-Christophe Filliatre @ 2005-04-20  6:55 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Christophe TROESTLER, stein, hamburg, caml-list


Gerd Stolpmann writes:
 > 
 > I am quite astonished to see so many CGI implementations. I only
 > knew the implementation of de Rauglaudre and Filliatre, and its
 > limitations were one of the motivations to develop Ocamlnet. 

Just to clarify the situation (if needed): I wrote my CGI library for
my own purposes and it is not intended to be complete, RFC-compliant,
or whatever.  Even if it appears in the hump (by the time I put it
online there were not so many such libraries), it does not make much
sense to compare it today with libraries such as ocamlnet.

-- 
Jean-Christophe Filliâtre (http://www.lri.fr/~filliatr)


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Common XML interface (was: Common CGI interface)
  2005-04-19 19:54       ` Gerd Stolpmann
  2005-04-20  6:55         ` Jean-Christophe Filliatre
@ 2005-04-20  7:22         ` Alain Frisch
  2005-04-20 11:15           ` [Caml-list] " Gerd Stolpmann
  2005-04-20 13:23           ` Stefano Zacchiroli
  2005-04-20 20:00         ` Common CGI interface Christophe TROESTLER
  2 siblings, 2 replies; 37+ messages in thread
From: Alain Frisch @ 2005-04-20  7:22 UTC (permalink / raw)
  To: caml-list

Gerd Stolpmann wrote:
> On Tuesday, 19.04.2005, at 21:03 +0200, Christophe TROESTLER wrote:
>> | Would people be interested in setting up a list to discuss a common
>> | CGI-like interface, i.e. a minimal set of services to be offered
>> | (in the same vein as what was done for I/O objects, see
>> | http://ocaml-programming.de/rec/IO-Classes.html).  [...]
>
> Good idea. However, I think it is too late for such a discussion.

Another kind of library which would benefit from such an effort is
XML parsing. I know about pxp, expat, xml-light, ocaml-xmlr, tony, 
xmllexer, and there might be others.

It would be great to have some common interface. An event-driven 
interface is probably easier to agree upon. There are many points to 
address (external entities, encodings, namespace processing, ... even if 
the features are not available in all the parsers).

Anyone interested in this discussion?

-- Alain


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Common XML interface (was: Common CGI interface)
  2005-04-20  7:22         ` Common XML interface (was: Common CGI interface) Alain Frisch
@ 2005-04-20 11:15           ` Gerd Stolpmann
  2005-04-20 11:38             ` Nicolas Cannasse
  2005-04-20 13:23           ` Stefano Zacchiroli
  1 sibling, 1 reply; 37+ messages in thread
From: Gerd Stolpmann @ 2005-04-20 11:15 UTC (permalink / raw)
  To: Alain Frisch; +Cc: caml-list

On Wednesday, 20.04.2005, at 09:22 +0200, Alain Frisch wrote:
> Gerd Stolpmann wrote:
> > On Tuesday, 19.04.2005, at 21:03 +0200, Christophe TROESTLER wrote:
> >> | Would people be interested in setting up a list to discuss a common
> >> | CGI-like interface, i.e. a minimal set of services to be offered
> >> | (in the same vein to what was done I/O objects, see
> >> | http://ocaml-programming.de/rec/IO-Classes.html).  [...]
> >
> > Good idea. However, I think it is too late for such a discussion.
> 
> Another kind of library which would benefit from such effort is
> XML parsing. I know about pxp, expat, xml-light, ocaml-xmlr, tony, 
> xmllexer, and there might be others.
> 
> It would be great to have some common interface. An event-driven 
> interface is probably easier to agree upon. There are many points to 
> address (external entities, encodings, namespace processing, ... even if 
> the features are not available in all the parsers).
> 
> Anyone interested in this discussion ?

Yes, this discussion makes a lot more sense.

Gerd


> 
> -- Alain
> 
> 
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Common XML interface (was: Common CGI interface)
  2005-04-20 11:15           ` [Caml-list] " Gerd Stolpmann
@ 2005-04-20 11:38             ` Nicolas Cannasse
  0 siblings, 0 replies; 37+ messages in thread
From: Nicolas Cannasse @ 2005-04-20 11:38 UTC (permalink / raw)
  To: Gerd Stolpmann, Alain Frisch; +Cc: caml-list

> > Gerd Stolpmann wrote:
> > > On Tuesday, 19.04.2005, at 21:03 +0200, Christophe TROESTLER wrote:
> > >> | Would people be interested in setting up a list to discuss a common
> > >> | CGI-like interface, i.e. a minimal set of services to be offered
> > >> | (in the same vein to what was done I/O objects, see
> > >> | http://ocaml-programming.de/rec/IO-Classes.html).  [...]
> > >
> > > Good idea. However, I think it is too late for such a discussion.
> >
> > Another kind of library which would benefit from such effort is
> > XML parsing. I know about pxp, expat, xml-light, ocaml-xmlr, tony,
> > xmllexer, and there might be others.
> >
> > It would be great to have some common interface. An event-driven
> > interface is probably easier to agree upon. There are many points to
> > address (external entities, encodings, namespace processing, ... even if
> > the features are not available in all the parsers).
> >
> > Anyone interested in this discussion ?
>
> Yes, this discussion makes a lot more sense.
>
> Gerd

I'm also willing to make XmlLight compatible, as we did for IO :)

Nicolas Cannasse


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Common XML interface (was: Common CGI interface)
  2005-04-20  7:22         ` Common XML interface (was: Common CGI interface) Alain Frisch
  2005-04-20 11:15           ` [Caml-list] " Gerd Stolpmann
@ 2005-04-20 13:23           ` Stefano Zacchiroli
  2005-04-21  6:59             ` [Caml-list] Common XML interface Alain Frisch
  1 sibling, 1 reply; 37+ messages in thread
From: Stefano Zacchiroli @ 2005-04-20 13:23 UTC (permalink / raw)
  To: Inria Ocaml Mailing List

On Wed, Apr 20, 2005 at 09:22:23AM +0200, Alain Frisch wrote:
> It would be great to have some common interface. An event-driven 
> interface is probably easier to agree upon. There are many points to 

Even if they are certainly easier to agree upon, event-driven interfaces
for XML are harder to program against than tree-based ones.

Basic tree operations should not be that hard to agree upon ...
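
As a sketch of how the two styles relate (all type and function names
here are hypothetical and not taken from pxp, xml-light or any other
parser mentioned in this thread): if parsers agreed on a small event
type, a tree view could be layered on top of it once, for all of them.

```ocaml
(* Hypothetical common event type an event-driven parser could emit. *)
type event =
  | Start_tag of string * (string * string) list
  | End_tag of string
  | Data of string

(* Hypothetical minimal tree type. *)
type tree =
  | Element of string * (string * string) list * tree list
  | Text of string

(* Build trees from a well-nested event list. The stack holds the open
   elements as (name, attributes, children in reverse order); [top]
   collects finished top-level nodes in reverse order. *)
let trees_of_events events =
  let rec go stack top = function
    | [] ->
      if stack <> [] then failwith "unclosed element";
      List.rev top
    | Start_tag (name, attrs) :: rest ->
      go ((name, attrs, []) :: stack) top rest
    | Data s :: rest ->
      (match stack with
       | [] -> go [] (Text s :: top) rest
       | (n, a, kids) :: up -> go ((n, a, Text s :: kids) :: up) top rest)
    | End_tag name :: rest ->
      (match stack with
       | (n, a, kids) :: up when n = name ->
         let el = Element (n, a, List.rev kids) in
         (match up with
          | [] -> go [] (el :: top) rest
          | (pn, pa, pkids) :: upup ->
            go ((pn, pa, el :: pkids) :: upup) top rest)
       | _ -> failwith ("unexpected end tag: " ^ name))
  in
  go [] [] events
```

The sketch ignores entities, encodings and namespaces, which are
exactly the points on which agreement would be harder.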

-- 
Stefano Zacchiroli -*- Computer Science PhD student @ Uny Bologna, Italy
zack@{cs.unibo.it,debian.org,bononia.it} -%- http://www.bononia.it/zack/
If there's any real truth it's that the entire multidimensional infinity
of the Universe is almost certainly being run by a bunch of maniacs. -!-


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Common CGI interface
  2005-04-19 19:54       ` Gerd Stolpmann
  2005-04-20  6:55         ` Jean-Christophe Filliatre
  2005-04-20  7:22         ` Common XML interface (was: Common CGI interface) Alain Frisch
@ 2005-04-20 20:00         ` Christophe TROESTLER
  2005-04-20 21:06           ` [Caml-list] " Gerd Stolpmann
  2 siblings, 1 reply; 37+ messages in thread
From: Christophe TROESTLER @ 2005-04-20 20:00 UTC (permalink / raw)
  To: O'Caml Mailing List; +Cc: ocamlnet-devel

On Tue, 19 Apr 2005, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> 
> Good idea. However, I think it is too late for such a discussion.
> First, it already happened. [...] Ocamlnet.

Are questions welcome?

At the time I was not so much interested in web apps -- this is still
not my main concern but, at times, I have to build some and I like
both powerful and simple tools.  My experience with OCamlNet is that,
for a newcomer, it is difficult to find one's way through it.  The
library is impressive but, IMO, the interface could be made _simpler_
and more orthogonal.

For example I am wondering why standard CGI must use [let cgi = new
std_activation()] while FastCGI requires [Netcgi_fcgi.serv (fun cgi ->
...)].  Why can't the callback method be used consistently all over
the place?  Additional advantages are that it allows the library to handle
exceptions [1], to [#finalize] automatically when the request has been
dealt with (the user may still want to call [#finalize] manually but
would not be required to do so) and to [#commit]/[#flush] the output.
Finally, how are we supposed to launch different threads for different
requests [2]?
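
A sketch of what such a callback-style entry point could do on the
user's behalf. The class type request and the function serve_one are
hypothetical, not Ocamlnet's API; the treatment of [Exit] follows the
suggestion in footnote [1].

```ocaml
(* Hypothetical, deliberately tiny request interface. *)
class type request = object
  method commit : unit -> unit
  method finalize : unit -> unit
end

(* Run [handler] on one request. Output is committed if the handler
   returns normally or raises [Exit] (early termination); any other
   exception is logged and the output is not committed. [#finalize]
   is always called, even if committing fails. *)
let serve_one ~log handler (rq : request) =
  Fun.protect
    ~finally:(fun () -> rq#finalize ())
    (fun () ->
       let ok =
         try handler rq; true with
         | Exit -> true
         | e -> log ("handler raised: " ^ Printexc.to_string e); false
       in
       if ok then rq#commit ())
```

A plain CGI connector would call serve_one once and exit, while a
FastCGI connector would call it in a loop, so user code would look the
same in both cases.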

About arguments: is the mutability of arguments useful?  This makes
the whole interface more complex for a purpose I can't see.  Also, why
not distinguish simple parameters (for which a method that returns a
string is sufficient) and file uploads (for which one clearly wants
more flexibility)?

Why is there an exception [Std_environment_not_found]?  Isn't it the
role of the library to reject requests that lack information (and
log them)?  Why bother the user with that?  (I don't even think one
may want to customize the reply to such requests as they are just
bogus.)

I have a few more questions in the same vein but will stop here
waiting for reactions before bothering everybody even more! :-)

Regards,
ChriS

---
[1] I believe the library should just log them and process the next
request.  Moreover [Exit] should probably be treated specially as a
way to terminate the script early (say after an error message).

[2] I read the reply of Eric Stokes but I do not understand it: are
the various threads going to share the same socket??  And what
if requests are multiplexed (this is not currently supported by Apache
but may be eventually with the new threaded Apache and is definitely
supported by other web servers)?


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Common CGI interface
  2005-04-20 20:00         ` Common CGI interface Christophe TROESTLER
@ 2005-04-20 21:06           ` Gerd Stolpmann
  2005-04-21  7:36             ` [Ocamlnet-devel] " Florian Hars
  2005-04-25 10:38             ` Christophe TROESTLER
  0 siblings, 2 replies; 37+ messages in thread
From: Gerd Stolpmann @ 2005-04-20 21:06 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: O'Caml Mailing List, ocamlnet-devel

On Wednesday, 20.04.2005 at 22:00 +0200, Christophe TROESTLER wrote:
> On Tue, 19 Apr 2005, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > 
> > Good idea. However, I think it is too late for such a discussion.
> > First, it already happened. [...] Ocamlnet.
> 
> Are questions welcomed?

Yes, of course. Also ideas for improvements, or just impressions.

> At the time I was not so much interested in web apps -- this is still
> not my main concern but, at times, I have to build some and I like
> both powerful and simple tools.  My experience with OCamlNet is that,
> for a newcomer, it is difficult to find one's way through it.  The
> library is impressive but, IMO, the interface could be made _simpler_
> and more orthogonal.

This is quite complicated to explain. Ocamlnet exposes some of its
internal complexity to give "power users" more possibilities, for
example defining their own connector. Furthermore, it does not try to
hide the peculiarities of the various connector protocols. One sees that
every CGI request is performed by a new process, and for FastCGI and AJP
it is not hidden whether multi-processing or multi-threading is used to
parallelize requests.

Of course, this is confusing for beginners, but I don't really see how
to improve this without giving up modularity (i.e. every connector has
its own entry point).

> For example I am wondering why standard CGI must use [let cgi = new
> std_activation()] while FastCGI requires [Netcgi_fcgi.serv (fun cgi ->
> ...)].  Why can't the callback method be used consistently all over
> the place?  

For historical reasons, the CGI connector has a simplified entry point:

let cgi = new std_activation()

Why does this initialize for CGI? Because the argument ~env is missing,
and by default the library tries to take env from the process
environment, i.e. to initialize for CGI. This simply means that CGI is
implemented as the default connector at this level.

Internally, the other connectors also create a std_activation object,
but with a certain ~env argument, making it different.

If we added the callback method for CGI, it would be simply

let cgi_serv f = f (new std_activation())

(maybe with added exception handling).
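A minimal sketch of what such a wrapper with exception handling could
look like. The [activation] class is a hypothetical stand-in for
std_activation so that the fragment is self-contained, and [serve_one]
is an illustrative helper name. Logging to stderr and treating [Exit]
as a normal early termination are assumptions taken from the
suggestions in this thread, not existing Ocamlnet behaviour.

```ocaml
(* Hypothetical stand-in for Netcgi's std_activation: it only records
   whether #finalize has been called. *)
class activation () = object
  val mutable finalized = false
  method finalize () = finalized <- true
  method is_finalized = finalized
end

(* Run the callback on [cgi].  Exit counts as a normal early
   termination, any other exception is logged instead of escaping,
   and the activation is finalized in every case. *)
let serve_one (cgi : activation) (f : activation -> unit) : unit =
  (try f cgi with
   | Exit -> ()
   | e -> prerr_endline ("cgi_serv: uncaught " ^ Printexc.to_string e));
  cgi#finalize ()

(* The callback-style entry point for plain CGI. *)
let cgi_serv f = serve_one (new activation ()) f
```

With this shape the user writes [cgi_serv (fun cgi -> ...)] for CGI in
the same way as for the other connectors, and never has to call
[#finalize] by hand.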

> Additional advantages are that it allows one to handle
> exceptions [1], to [#finalize] automatically when the request has been
> dealt with (the user may still want to call [#finalize] manually but
> would not be required to do so) and to [#commit]/[#flush] the output.

Accepted, this would be better.

> Finally, how are we supposed to launch different threads for different
> requests [2]?

Maybe Eric can comment on this.

> About arguments: is the mutability of arguments useful?  This makes
> the whole interface more complex for a purpose I can't see.  

For example, to help with debugging. The command-line interface uses the
mutability of the arguments, too.

> Also, why
> not distinguish simple parameters (for which a method that returns a
> string is sufficient) and file uploads (for which one clearly wants
> more flexibility).

Because this is bullshit. It is not always a good idea to copy bad
habits of other libraries - I know that all other libraries treat simple
arguments and file arguments differently.

However, this is a difference that actually does not exist on the HTTP
level. I think it is shortsighted to distinguish artificially between
things that are in principle the same. For example, what happens when a new HTML
feature is defined by W3C that requires a new kind of argument? E.g. a
rich text editor whose contents are transported with a new kind of
header? W3C will simply represent that argument in a form-encoded
request. The point is that OCamlnet can decode and represent all
form-encoded requests, no matter whether it is a file, a simple value,
or something completely different.

Btw, the uniform representation of arguments can already be very useful
now, for example for processing non-web requests.

> Why is there an exception [Std_environment_not_found]?  Isn't it the
> role of the library to reject requests that lack information (and
> log them)?  Why bother the user with that?  (I don't even think one
> may want to customize the reply to such requests as they are just
> bogus.)

See above: CGI is the default connector, and this exception is raised
when the default does not apply.

> I have a few more questions in the same vein but will stop here
> waiting for reactions before bothering everybody even more! :-)

Ok, let's see whether this discussion is fruitful.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------




* Re: [Caml-list] Common XML interface
  2005-04-20 13:23           ` Stefano Zacchiroli
@ 2005-04-21  6:59             ` Alain Frisch
  2005-04-21 11:34               ` Gerd Stolpmann
  0 siblings, 1 reply; 37+ messages in thread
From: Alain Frisch @ 2005-04-21  6:59 UTC (permalink / raw)
  To: Stefano Zacchiroli; +Cc: Inria Ocaml Mailing List

Stefano Zacchiroli wrote:
> On Wed, Apr 20, 2005 at 09:22:23AM +0200, Alain Frisch wrote:
> 
>>It would be great to have some common interface. An event-driven 
>>interface is probably easier to agree upon. There are many points to 
> 
> 
> Even if certainly easier to agree upon, event-driven interfaces for XML
> are harder to program against than tree-based ones.

Some applications really need stream-based processing: loading the XML 
document into memory is out of the question (because it is huge) and/or 
processing needs to happen as soon as data is available (e.g. for the 
Jabber protocol).

> Basic tree operations should not be that hard to agree upon ...

I'm afraid it will be hard. To start with, do we want mutable trees, 
upward pointers ?  Do we want to keep locations, namespace declarations, 
comments, entity references ... ?  Which whitespace to remove ?

Anyway, a tree representation can easily be built on top of an 
event-driven interface. The difficult part in parsing XML is really 
lexing. We can try to agree upon one or several standard tree 
representations, but I believe we should start with an event-driven 
interface.

Is someone willing to set-up a mailing list for this discussion ?

-- Alain



* Re: [Ocamlnet-devel] Re: [Caml-list] Re: Common CGI interface
  2005-04-20 21:06           ` [Caml-list] " Gerd Stolpmann
@ 2005-04-21  7:36             ` Florian Hars
  2005-04-21 10:41               ` Gerd Stolpmann
  2005-04-25 10:38             ` Christophe TROESTLER
  1 sibling, 1 reply; 37+ messages in thread
From: Florian Hars @ 2005-04-21  7:36 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: O'Caml Mailing List, ocamlnet-devel

Gerd Stolpmann wrote:
> On Wednesday, 20.04.2005 at 22:00 +0200, Christophe TROESTLER wrote:
>>Also, why not distinguish simple parameters (for which a method that returns a
>>string is sufficient) and file uploads (for which one clearly wants
>>more flexibility).
> 
> Because this is bullshit.

Oh, by the way, ocamlnet does not work with form data sent by Acrobat Reader, 
which POSTs its application/vnd.fdf responses and then runs into
                  | _ ->
                       failwith "Netcgi.std_activation: unknown Content-type";

XForms 1.0 allows multipart/related and application/xml, and WHATWG Web Forms 2.0 
additionally specifies application/x-www-form+xml, text/plain, and any other legal 
MIME type as possible in POSTs. So something should be done here.

Yours, Florian Hars.



* Re: [Ocamlnet-devel] Re: [Caml-list] Re: Common CGI interface
  2005-04-21  7:36             ` [Ocamlnet-devel] " Florian Hars
@ 2005-04-21 10:41               ` Gerd Stolpmann
  0 siblings, 0 replies; 37+ messages in thread
From: Gerd Stolpmann @ 2005-04-21 10:41 UTC (permalink / raw)
  To: Florian Hars; +Cc: Gerd Stolpmann, O'Caml Mailing List, ocamlnet-devel

On Thursday, 21.04.2005 at 09:36 +0200, Florian Hars wrote:
> Gerd Stolpmann wrote:
> > On Wednesday, 20.04.2005 at 22:00 +0200, Christophe TROESTLER wrote:
> >>Also, why not distinguish simple parameters (for which a method that returns a
> >>string is sufficient) and file uploads (for which one clearly wants
> >>more flexibility).
> > 
> > Because this is bullshit.
> 
> Oh, by the way, ocamlnet does not work with form data sent by Acrobat Reader, 
> which POSTs its application/vnd.fdf responses and then runs into
>                   | _ ->
>                        failwith "Netcgi.std_activation: unknown Content-type";
> 
> XForms 1.0 allows multipart/related and application/xml, and WHATWG Web Forms 2.0 
> additionally specifies application/x-www-form+xml, text/plain, and any other legal 
> MIME type as possible in POSTs. So something should be done here.

Thank you for the tip. Support for the non-multipart types is quite
simple to add, one can map it to one "default" argument (and leave it to
the application to parse it). Of course, we could also directly parse
the application/xml data, but that would make Ocamlnet dependent on an
XML parser.
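A rough sketch of the mapping to one "default" argument, under stated
assumptions: the [argument] record, the [body_kind] type, the function
names and the argument name "BODY" are all hypothetical and are not
part of Ocamlnet. Only the dispatch on the media type is shown; the
actual decoding of form-encoded and multipart bodies is left out.

```ocaml
(* Hypothetical, simplified argument representation. *)
type argument = { name : string; content_type : string; value : string }

(* Strip parameters such as "; charset=utf-8" and normalise the case. *)
let media_type content_type =
  let base =
    match String.index_opt content_type ';' with
    | Some i -> String.sub content_type 0 i
    | None -> content_type
  in
  String.lowercase_ascii (String.trim base)

let has_prefix prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

type body_kind =
  | Url_encoded          (* to be decoded into one argument per field *)
  | Multipart            (* to be decoded into one argument per part *)
  | Opaque of argument   (* anything else: a single default argument *)

(* Instead of failing on an unknown Content-type, hand the raw body to
   the application as one argument and let it do the parsing. *)
let classify ~content_type ~body =
  match media_type content_type with
  | "application/x-www-form-urlencoded" -> Url_encoded
  | t when has_prefix "multipart/" t -> Multipart
  | t -> Opaque { name = "BODY"; content_type = t; value = body }
```

With this shape an application/vnd.fdf POST ends up as one opaque
argument carrying its content type, so no XML or FDF parser is needed
inside the library.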

For multipart types the question arises whether to analyze them further,
and maybe map to several arguments (but how many levels of nested
multiparts?).

Gerd

> Yours, Florian Hars.
> 
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------




* Re: [Caml-list] Common XML interface
  2005-04-21  6:59             ` [Caml-list] Common XML interface Alain Frisch
@ 2005-04-21 11:34               ` Gerd Stolpmann
  0 siblings, 0 replies; 37+ messages in thread
From: Gerd Stolpmann @ 2005-04-21 11:34 UTC (permalink / raw)
  To: Alain Frisch; +Cc: Stefano Zacchiroli, Inria Ocaml Mailing List

On Thursday, 21.04.2005 at 08:59 +0200, Alain Frisch wrote:
> Stefano Zacchiroli wrote:
> > On Wed, Apr 20, 2005 at 09:22:23AM +0200, Alain Frisch wrote:
> > 
> >>It would be great to have some common interface. An event-driven 
> >>interface is probably easier to agree upon. There are many points to 
> > 
> > 
> > Even if certainly easier to agree upon, event-driven interfaces for XML
> > are harder to program against than tree-based ones.
> 
> Some applications really need stream-based processing: loading the XML 
> document into memory is out of the question (because it is huge) and/or 
> processing needs to happen as soon as data is available (e.g. for the 
> Jabber protocol).
> 
> > Basic tree operations should not be that hard to agree upon ...
> 
> I'm afraid it will be hard. To start with, do we want mutable trees, 
> upward pointers ?  Do we want to keep locations, namespace declarations, 
> comments, entity references ... ?  Which whitespace to remove ?

For a standard representation we should use DOM, simply because lots of
XML standards refer to DOM. Of course, that doesn't answer all details.

> Anyway, a tree representation can easily be built on top of an 
> event-driven interface. The difficult part in parsing XML is really 
> lexing. We can try to agree upon one or several standard tree 
> representations, but I believe we should start with an event-driven 
> interface.

And it is much simpler.

> Is someone willing to set-up a mailing list for this discussion ?

I have set up a mailing list:

https://gps.dynxs.de/mailman/listinfo/xml-list

I would suggest we wait until Monday before starting the discussion so
everybody can sign up who is interested.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------




* Re: [Caml-list] Re: Common CGI interface
  2005-04-20 21:06           ` [Caml-list] " Gerd Stolpmann
  2005-04-21  7:36             ` [Ocamlnet-devel] " Florian Hars
@ 2005-04-25 10:38             ` Christophe TROESTLER
  2005-04-26 11:08               ` Gerd Stolpmann
  2005-04-26 16:24               ` [Caml-list] " Eric Stokes
  1 sibling, 2 replies; 37+ messages in thread
From: Christophe TROESTLER @ 2005-04-25 10:38 UTC (permalink / raw)
  To: info; +Cc: caml-list, ocamlnet-devel

On Wed, 20 Apr 2005, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> 
> defining their own connector.

I understand one needs to do so to extend the library but can you name
other situations?  My feeling is that CGI, FCGI, AJP, and test are the
most used ones and that a custom connector is seldom needed...  so
shouldn't the standard connectors share a common standard (of course
with a few peculiarities to each) while the function(s) to create new
ones are grouped into a separate module.  The prng* functions should
be in the main module -- an additional [random_sessionid] function
(generating e.g. 128 bits random strings) could be useful.
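To make the suggestion concrete, here is a minimal sketch of what
[random_sessionid] could return: 128 random bits rendered as 32
hexadecimal characters. It is only an illustration and assumes nothing
about the existing prng* functions. It uses the standard [Random]
module, which is NOT cryptographically secure; a real implementation
would have to draw from a secure source.

```ocaml
(* Seed the generator once at start-up. *)
let () = Random.self_init ()

(* Sketch only: Random is not suitable for real session ids. *)
let random_sessionid () =
  let buf = Buffer.create 32 in
  for _i = 1 to 16 do                  (* 16 bytes = 128 bits *)
    Buffer.add_string buf (Printf.sprintf "%02x" (Random.int 256))
  done;
  Buffer.contents buf
```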

> Furthermore, it does not try to hide the peculiarities of the
> various connector protocols.

The purpose of the various connectors being the same, I believe they
should share a common interface whenever possible.  It is needlessly
inconvenient to have to learn different interfaces for a given
concept.  Also, whenever possible, I believe names from the standard
library should be reused (e.g. establish_server).

> One sees that every CGI request is performed by a new process, and
> for FastCGI and AJP it is not hidden whether multi-processing or
> multi-threading is used to parallelize requests.

It is good to be able to choose.  For FCGI however, I was expecting
some comments from Eric to better understand how it works (including
multiplexing).

> Of course, this is confusing for beginners, but I don't really see
> how to improve this without giving up modularity (i.e. every
> connector has its own entry point).

I am afraid I do not fully grasp which kind of modularity
you have in mind -- certainly because of my lack of experience in web
devel.  For example, I do not understand why
[Netcgi_jserv.server_init] is not just included in [server_loop].
Another reason modularity is good is multithreading (or multiple
processes).  But there are other ways to handle that than to split
into many functions.  For example, one can imagine

  val establish_server : ?max_conns:int -> ... ->
    ?fork:((connection -> unit) -> connection -> unit) ->
    (connection -> unit) -> Unix.sockaddr -> unit

(?fork can create a process or a thread).  This makes it possible to
wrap the function handling the connection (connection -> unit) so that
exceptions it raises are dealt with appropriately -- thus for example
it seems possible to get rid of the care the user must exercise with
[Signal_shutdown]...
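A simplified sketch of how such a function could wrap the handler. To
keep it self-contained it departs from the signature above: the
[connection] record is a placeholder, an [accept] function stands in
for listening on a [Unix.sockaddr], and the [?log] argument is an
addition for illustration. None of this is existing library code.

```ocaml
(* Hypothetical connection type; a real one would wrap a socket. *)
type connection = { id : int }

(* [fork] decides how the guarded handler is run.  The default runs it
   in the current process; a threaded variant could pass
   (fun h c -> ignore (Thread.create h c)). *)
let establish_server
    ?(fork = fun handler conn -> handler conn)
    ?(log = prerr_endline)
    (handler : connection -> unit)
    (accept : unit -> connection option) : unit =
  (* Wrap the user handler so that no exception escapes the loop. *)
  let guarded conn =
    try handler conn with
    | e ->
      log (Printf.sprintf "connection %d: %s" conn.id (Printexc.to_string e))
  in
  let rec loop () =
    match accept () with
    | None -> ()                          (* no more connections *)
    | Some conn -> fork guarded conn; loop ()
  in
  loop ()
```

Because [fork] receives the already guarded handler, the exception
policy is the same whether a connection is served in place, in a
thread or in a child process.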

Could you explain situations for which an [establish_server] /
[handle_request] modularity is not enough?

> If we added the callback method for CGI, it would be simply

I am not suggesting to simply _add_ one (that would just make the
whole interface more confusing) but to rework the interface so that

* all connectors are treated equally (e.g. CGI is nothing special
  w.r.t. the others, conceptually) and the modularity is handled the same
  way for all of them (short of optional arguments).

* a separate module possesses the material to extend netcgi, e.g. to
  create specially tailored connectors.

Another thing that seems to be lacking is a uniform way to write in
the server log.  For CGI it is stderr, FCGI uses a special "channel"
(not stderr),...  This is important e.g. to log nonfatal errors.

> > About arguments: is the mutability of arguments useful?  This makes
> > the whole interface more complex for a purpose I can't see.  
> 
> For example, to help with debugging.

Could you explain how?  Is it useful to modify the value of a param
inside a request handling function, with global effect (i.e. not just
for the function scope)?  Setting parameters before handling the
request is a different matter -- a powerful "test" mode can certainly
do this without mutability (exposed).

> The command-line interface uses the mutability of the arguments,

Well, it is fine with me that the function creating the environment
can modify it.  What I am objecting to is that [cgi_activation] offers
functions to mutate them.

> > [Std_environment_not_found]
> See above: CGI is the default connector, and this exception is raised
> when the default does not apply.

But then, if you do not treat CGI in a special way (i.e. have distinct
CGI and test connectors) it is not needed.  In fact, it is not clear
to me why it is good to have [std_environment] and [test_environment]
in the interface as, as far as I can tell, they will just be used to
implement the associated connectors (i.e. what does this modularity bring
you?).  [custom_environment] is fine and should be put in the
"extension" module.

Regards,
ChriS



* Re: [Caml-list] Re: Common CGI interface
  2005-04-25 10:38             ` Christophe TROESTLER
@ 2005-04-26 11:08               ` Gerd Stolpmann
  2005-05-06 20:14                 ` Christophe TROESTLER
  2005-04-26 16:24               ` [Caml-list] " Eric Stokes
  1 sibling, 1 reply; 37+ messages in thread
From: Gerd Stolpmann @ 2005-04-26 11:08 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: caml-list, ocamlnet-devel

On Monday, 25.04.2005 at 12:38 +0200, Christophe TROESTLER wrote:
> On Wed, 20 Apr 2005, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > 
> > defining their own connector.
> 
> I understand one needs to do so to extend the library but can you name
> other situations?  My feeling is that CGI, FCGI, AJP, and test are the
> most used ones and that a custom connector is seldom needed...  so
> shouldn't the standard connectors share a common standard (of course
> with a few peculiarities to each) while the function(s) to create new
> ones are grouped into a separate module. 

In principle you are right, and in fact the connectors share a common
standard except the way they are created. It would be nice if this way
could be made similar for each. In the past, this was not high priority
because it was more important to _have_ the connectors.

The creation of connectors depends very much on the overall processing
model. The most important types of models are:

- The model is controlled by the web server. This is true for CGI where
  the server creates a new process for each request. This may also be
  the case for FastCGI, and even for AJP (although regarded as 
  obsolete).

  In this case the web server creates new processes when needed, and
  communicates somehow with existing processes. You mention
  the function prototype establish_server: I think it is not
  applicable here, because the application only reacts to the
  web server and isn't a real server of its own. Wrapping this
  "reactivity" into a function establish_server would be very strange,
  at least.

- Multi-processing controlled by the application itself. Currently,
  this is only implemented for AJP. Multi-processing has several
  variants, the simplest form is "fork on new connection", but for
  high performance one needs to reuse processes and to pre-fork them.

  Multi-processing is special because the connector must know about
  it - this has to do with the way file descriptors can be passed
  between processes - generally only from master to child, and 
  this means some parts of the connector run in the master process,
  and other parts in the child processes.

- Multi-threading controlled by the application itself. As for
  multi-processing, one can have several variants. It is much simpler
  for the connector, however, because one does not have the file
  descriptor passing limitations, and the connector can be made
  very configurable in this case.

There are further aspects that are different for the processing models:

- Whether only one URL is served by the application, or several, or
  a whole URL prefix.

- Sometimes persistent connections to external services (e.g. databases)
  are needed. Managing these depends very much on the model. For
  example, in a multi-threading model one usually wants a shared pool
  of external connections. In a multi-processing model pooling is not
  possible, but one can have one connection per process.

- Sometimes the instances of the application want to communicate with
  each other, e.g. fast management of sessions, or even coordinated
  shutdown.

My point is that it is not easy to find a common description of all that
such that one can have a common signature for all connectors. And even
if we had that: do you really want to say for every CGI that you don't
have all those features that are only available for server models? Or
isn't it more honest to do without complex specifications when they
aren't available?

>  The prng* functions should
> be in the main module -- an additional [random_sessionid] function
> (generating e.g. 128 bits random strings) could be useful.

Accepted.

> > Of course, this is confusing for beginners, but I don't really see
> > how to improve this without giving up modularity (i.e. every
> > connector has its own entry point).
> 
> I am afraid I do not fully grasp which kind of modularity
> you have in mind -- certainly because of my lack of experience in web
> devel.  For example, I do not understand why
> [Netcgi_jserv.server_init] is not just included in [server_loop].

This is not possible for multi-processing models: server_init runs in
the master process, and server_loop in every child process.

> Another reason modularity is good is multithreading (or multiple
> processes).  But there are other ways to handle that than to split
> into many functions.  For example, one can imagine
> 
>   val establish_server : ?max_conns:int -> ... ->
>     ?fork:((connection -> unit) -> connection -> unit) ->
>     (connection -> unit) -> Unix.sockaddr -> unit
> 
> (?fork can create a process or a thread).  This makes it possible to
> wrap the function handling the connection (connection -> unit) so that
> exceptions it raises are dealt with appropriately -- thus for example
> it seems possible to get rid of the care the user must exercise with
> [Signal_shutdown]...
> 
> Could you explain situations for which an [establish_server] /
> [handle_request] modularity is not enough?

In general, I can imagine for every connector a single entry point
function, although the arguments of each would be very different.

The other question is whether one should hide all the internals (e.g.
"pack" them away). I think no, this is the wrong way, although some
better way of communicating where the entry-point functions are would be
desirable (better documentation, maybe special modules only for users).

> > If we added the callback method for CGI, it would be simply
> 
> I am not suggesting to simply _add_ one (that would just make the
> whole interface more confusing) but to rework the interface so that
> 
> * all connectors are treated equally (e.g. CGI is nothing special
>   w.r.t. the others, conceptually) and the modularity is handled the same
>   way for all of them (short of optional arguments).
> 
> * a separate module possesses the material to extend netcgi, e.g. to
>   create specially tailored connectors.

As explained, this isn't as simple as you think. The connectors aren't
equal, and the user must know that, and merging the specific differences
into a common standard is far from being trivial.

I am currently thinking about a system of configuration objects. Since
O'Caml 3.08 we have anonymous objects, and that means creating an object
is no more difficult than creating a record value, e.g.

let my_config =
  object
    method this_property = Do it this way
    method that_property = Do it that way
  end

The idea is that the web application can define one configuration
object, and every connector picks the parts of the configuration it
needs (by using subtyping). It is simple to define defaults - the object
simply inherits from a default configuration class. The point is that
every connector can request as many configuration options as it needs
from the user. Options that have the same meaning for several connectors
get the same names.

Ideally, the web application needs only to define one configuration
object, and only by calling a different entry point the connector can be
changed to a different one.
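A small runnable sketch of this idea, written for a present-day O'Caml
(the "method!" override syntax is newer than 3.08). All option names
and the two "connector" functions are invented for illustration; they
only show how each connector can demand just the methods it needs
through an open object type, while defaults come from inheritance.

```ocaml
(* Hypothetical defaults; none of these option names exist in Ocamlnet. *)
class default_config = object
  method log_prefix = "app"
  method port = 8009
  method max_connections = 10
end

(* The application overrides only what it cares about. *)
let my_config = object
  inherit default_config
  method! max_connections = 50
end

(* Each connector asks only for the methods it needs; the ".." row
   lets it accept any object that has at least those methods. *)
let describe_cgi (cfg : < log_prefix : string; .. >) =
  Printf.sprintf "cgi[%s]" cfg#log_prefix

let describe_ajp
    (cfg : < log_prefix : string; port : int; max_connections : int; .. >) =
  Printf.sprintf "ajp[%s] port=%d max=%d"
    cfg#log_prefix cfg#port cfg#max_connections
```

The same [my_config] value can be passed to both functions, so
switching connectors means calling a different entry point with an
unchanged configuration object.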

> Another thing that seems to be lacking is a uniform way to write in
> the server log.  For CGI it is stderr, FCGI uses a special "channel"
> (not stderr),...  This is important e.g. to log nonfatal errors.

Accepted. The environment will have an error-logging function in the
future. (I also need that for the HTTP daemon I am currently
developing.)

> > > About arguments: is the mutability of arguments useful?  This makes
> > > the whole interface more complex for a purpose I can't see.  
> > 
> > For example, to help with debugging.
> 
> Could you explain how?  Is it useful to modify the value of a param
> inside a request handling function, with global effect (i.e. not just
> for the function scope)?  Setting parameters before handling the
> request is a different matter -- a powerful "test" mode can certainly
> do this without mutability (exposed).

I guess you see a dynamic web page as a function, and of course the
arguments of a function are immutable. However, this is only one view of
the thing.

In web development, the arguments are often considered as session state,
passed from one web page to the next. In this thinking, mutability is
quite natural: the arguments are the global variables of the whole
application spanning the individual web pages it consists of.

> But then, if you do not treat CGI in a special way (i.e. have distinct
> CGI and test connectors) it is not needed.  In fact, it is not clear
> to me why it is good to have [std_environment] and [test_environment]
> in the interface as, as far as I can tell, they will just be used to
> implement the associated connectors (i.e. what does this modularity bring
> you?).  [custom_environment] is fine and should be put in the
> "extension" module.

Ok, this exception exposes internals. But, as already pointed out, I
don't see why this is so bad.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------




* Re: [Caml-list] Re: Common CGI interface
  2005-04-25 10:38             ` Christophe TROESTLER
  2005-04-26 11:08               ` Gerd Stolpmann
@ 2005-04-26 16:24               ` Eric Stokes
  2005-05-06 20:14                 ` Christophe TROESTLER
  1 sibling, 1 reply; 37+ messages in thread
From: Eric Stokes @ 2005-04-26 16:24 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: caml-list, info, ocamlnet-devel

	As far as fastcgi's process model is concerned, let me give a few more 
details about that. There are actually N process models for fastcgi, 
divided into two groups. One group is managed by the web server in an 
"application server" environment, and the other group is stand alone. 
The common case is the first group, because it is the easiest to 
implement, and generally works quite well. In that case, concurrency is 
configured at the app server level (web server). App servers implement 
both sequential processing, and process pools (the first being a 
special case of the second where p=1). The app server is generally 
configured to start N processes per application, and each will 
sequentially process requests given to it by the web server. The web 
server will take care of routing requests to different processes (or 
push it off onto the OS scheduler). The serv function simply calls 
accept on stdin (which is a listening socket), and processes each 
request. All aspects of process creation and destruction are handled by 
the web/app server. A slight variation on this model is to create 
multiple threads within the same process, with each thread running a 
serv loop; this maximizes pseudo-parallelism within a single 
process, and can be a benefit for IO bound applications. CPU bound 
applications do not currently benefit from this model.
	Just within the first group of concurrency models it is possible to 
have the two most sought after concurrency methods in practical terms, 
process pool concurrency, and threaded concurrency. Because of this, 
very little work has been done supporting the second group, where the 
web server does not manage the application in any way, but instead 
simply connects to it as a client. However, with the current code base 
it is possible to use the second model, in which case you are free to 
implement whatever concurrency model you like. That being said, nothing 
is currently implemented for you. My view is that if you are writing an 
application for which the concurrency models provided by the web server 
are not sufficient then you are more than likely working on a very big 
project, and would most certainly reject out of hand any kind of silly 
canned attempt at a server construction kit I could provide in 
ocamlnet.
	The reason that the jserv modules provide such things, while the 
fastcgi modules do not is simply that those two standards decided to go 
in a different direction. Fastcgi tries to be just like CGI in as many 
ways as it can, so I find their decision to implement an application 
server very natural; CGI itself was perhaps one of the earliest 
application servers created. I admit that we have not provided 
extensive documentation of this in the past, and this is something I 
will try to remedy. However, there is a lot of good documentation on 
the workings of fastcgi, and on building fastcgi applications in 
general. We try to implement a fastcgi connector that is close enough 
to the standard that that documentation is useful.

On Apr 25, 2005, at 3:38 AM, Christophe TROESTLER wrote:

> On Wed, 20 Apr 2005, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
>>
>> defining their own connector.
>
> I understand one needs to do so to extend the library but can you name
> other situations?  My feeling is that CGI, FCGI, AJP, and test are the
> most used ones and that a custom connector is seldom needed...  so
> shouldn't the standard connectors share a common standard (of course
> with a few peculiarities to each) while the function(s) to create new
> ones are grouped into a separate module.  The prng* functions should
> be in the main module -- an additional [random_sessionid] function
> (generating e.g. 128 bits random strings) could be useful.
>
>> Furthermore, it does not try to hide the peculiarities of the
>> various connector protocols.
>
> The purpose of the various connectors being the same, I believe they
> should share a common interface whenever possible.  It is needlessly
> inconvenient to have to learn different interfaces for a given
> concept.  Also, whenever possible, I believe names from the standard
> library should be reused (e.g. establish_server).
>
>> One sees that every CGI request is performed by a new process, and
>> for FastCGI and AJP it is not hidden whether multi-processing or
>> multi-threading is used to parallelize requests.
>
> It is good to be able to choose.  For FCGI however, I was expecting
> some comments of Eric to understand better how it works (including
> multiplexing).
>
>> Of course, this is confusing for beginners, but I don't really see
>> how to improve this without giving up modularity (i.e. every
>> connector has its own entry point).
>
> I am afraid that I do not fully grasp which kind of modularity
> you have in mind -- certainly because of my lack of experience in web
> devel.  For example, I do not understand why
> [Netcgi_jserv.server_init] is not just included in [server_loop].
> Another thing modularity is good for is multithreading (or multiple
> processes).  But there are other ways to handle that than to split
> into many functions.  For example, one can imagine
>
>   val establish_server : ?max_conns:int -> ... ->
>     ?fork:((connection -> unit) -> connection -> unit) ->
>     (connection -> unit) -> Unix.sockaddr -> unit
>
> (?fork can create a process or a thread).  This makes it possible to
> wrap the function handling the connection (connection -> unit) so that
> exceptions it raises are dealt with appropriately -- thus for example
> it seems possible to get rid of the care the user must exercise with
> [Signal_shutdown]...
>
> May you explain situations for which a [establish_server] /
> [handle_request] modularity is not enough?
>
>> If we added the callback method for CGI, it would be simply
>
> I am not suggesting to simply _add_ one (that would just make the
> whole interface more confusing) but to rework the interface so that
>
> * all connectors are treated equally (e.g. CGI is nothing special
>   w.r.t. other, conceptually) and the modularity is handled the same
>   way for all of them (short of optional arguments).
>
> * a separate module possesses the material to extend netcgi, e.g. to
>   create specially tailored connectors.
>
> Another thing that seems to be lacking is a uniform way to write in
> the server log.  For CGI it is stderr, FCGI uses a special "channel"
> (not stderr),...  This is important e.g. to log nonfatal errors.
>
>>> About arguments: is the mutability of arguments useful?  This makes
>>> the whole interface more complex for a purpose I can't see.
>>
>> For example, to help for debugging.
>
> May you explain how?  Is it useful to modify the value of a param
> inside a request handling function, with global effect (i.e. not just
> for the function scope)?  Setting parameters before handling the
> request is a different matter -- a powerful "test" mode can certainly
> do this without mutability (exposed).
>
>> The command-line interface uses the mutability of the arguments,
>
> Well, it is fine with me that the function creating the environment
> can modify it.  What I am objecting is that [cgi_activation] offers
> functions to mutate them.
>
>>> [Std_environment_not_found]
>> See above: CGI is the default connector, and this exception is raised
>> when the default does not apply.
>
> But then, if you do not treat CGI in a special way (i.e. have distinct
> CGI and test connectors) it is not needed.  In fact, it is not clear
> to me why it is good to have [std_environment] and [test_environment]
> in the interface as, as far as I can tell, they will just be used to
> implement the associated connectors (i.e. what does this modularity
> bring you?).  [custom_environment] is fine and should be put in the
> "extension" module.
>
> Regards,
> ChriS


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Common CGI interface
  2005-04-26 11:08               ` Gerd Stolpmann
@ 2005-05-06 20:14                 ` Christophe TROESTLER
  2005-05-10  0:07                   ` [Caml-list] " Christophe TROESTLER
  2005-05-10  0:10                   ` Christophe TROESTLER
  0 siblings, 2 replies; 37+ messages in thread
From: Christophe TROESTLER @ 2005-05-06 20:14 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: O'Caml Mailing List, ocamlnet-devel

[-- Attachment #1: Type: Text/Plain, Size: 7346 bytes --]

Hi,

To start with, sorry to reply at such a slow pace but I am quite busy
with my main job...

On Tue, 26 Apr 2005, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> 
> - The model is controlled by the web server. [...] establish_server:
> I think this is not applicable here

I agree.  This is the easier case as, from the point of view of the
app writer, it basically amounts to a single process (that may serve
several requests sequentially).  [This is also the model for
mod_caml.]  I would suggest that a default function [run] takes care
of this case for every connector.  Resources "opened" (e.g. a DB
connection) before [run f] can be reused by each call of [f].  For
convenience, I would add an optional argument [?sockaddr] to [run] that
can turn it into a remote app server (still sequential).

No other connectors are needed for CGI and Test.
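
To make the sequential model concrete, here is a minimal sketch of how
a resource opened before [run f] is shared by every call of [f].  The
[request] type, this toy [run] and the hash table standing in for a DB
connection are assumptions made for illustration; they are not the
ocamlnet API.

```ocaml
(* Hypothetical stand-ins: [request] and [run] are NOT the ocamlnet API. *)
type request = { path : string }

(* [run f] calls [f] once per request, sequentially, in a single process.
   A list of requests stands in for what the web server would send. *)
let run ?(requests = []) (f : request -> unit) : unit =
  List.iter f requests

(* A resource "opened" before [run f]; a hash table stands in for a DB
   connection and [opened] counts how many times it was opened. *)
let opened = ref 0
let open_db () = incr opened; Hashtbl.create 16

(* Paths served so far, most recent first. *)
let served = ref []

let () =
  let db : (string, int) Hashtbl.t = open_db () in
  let handler req =
    (* Every call of [handler] reuses the same [db]. *)
    let n = try Hashtbl.find db req.path with Not_found -> 0 in
    Hashtbl.replace db req.path (n + 1);
    served := req.path :: !served
  in
  run ~requests:[ { path = "/a" }; { path = "/b" }; { path = "/a" } ] handler;
  Printf.printf "opened %d time(s), served %d request(s)\n"
    !opened (List.length !served)
```

The point of the sketch is only that the expensive setup happens once,
outside the handler, whatever the connector is.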

> - Multi-processing controlled by the application itself. [...]
> - Multi-threading controlled by the application itself. [...]
>
> - Whether only one URL is served by the application, or several [...]
> - [...] persistent connections to external services [...]
> - [...] instances of the application [...] communicate with each other [...]
>
> My point is that it is not easy to find a common description of all
> that [...] really want to say for every CGI that you don't have all
> that features that are only available for server models?  [...]

As said above, sequential processing should just be done through a
[run] function.  My idea is NOT to implement all the above models,
just to have the minimal set of primitives handling their respective
protocols and factored in such a way that ANY kind of concurrency one
wants CAN be programmed on top of them.  After all, as you point out,
one may not be able to describe all possibilities that the user may
want.

The two points where launching a new thread/process makes sense are:
- to accept connections on a given (list of) socket(s);
- to handle a request (when its input has been completely provided).

As far as I understand, for AJP, the second possibility is not very
interesting (requests are sequential).  It is, however, for FCGI since
requests may be multiplexed (thus one may want to continue reading the
other requests while processing some).

Currently the primitives on which multi-processes/threads apps can be
built are:
- Netcgi_jserv.server_init
- Netcgi_jserv.server_loop
- Netcgi_jserv_ajp12.serve_connection

IMO they are a bit cumbersome to use (looking at their implementation
is necessary to grasp them fully) but they are a good example of what
I mean when I'm talking about a "minimal set of primitives".  I'll
have to think more to see if I can come up with something that I like
better (and hopefully that you will too! :).  (BTW, the AJP protocol
includes its version, so I think it would be good to have a
[serve_connection] that adapts to it -- this makes
[Netcgi_jserv_app.protocol_type] useless.)


A related note on connectors is how they should handle exceptions
raised by the script.  My feeling is that they should catch them and
log them and then get ready for the next request (obviously the last
part only makes sense when several requests are handled -- maybe
sequentially -- by the script).  That seems better than simply
crashing the app.  IMO the exception [Exit] should be treated
specially and be accepted as an appropriate way of ending the script
prematurely (it is really useful to finish early after error
reporting).
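
The policy described above can be sketched as a small wrapper that a
connector would put around the user's script.  The name [protect] and
the [log] callback are hypothetical; in a real connector [log] would
write to the web server log.

```ocaml
(* Hypothetical wrapper: [log] stands in for the connector's log channel. *)
let protect ~(log : string -> unit) (script : 'a -> unit) (req : 'a) : unit =
  try script req with
  | Exit -> ()  (* accepted as a normal, early end of the script *)
  | e -> log ("uncaught exception: " ^ Printexc.to_string e)
         (* logged; the caller then moves on to the next request *)

(* Messages logged so far, most recent first. *)
let logged : string list ref = ref []
let log s = logged := s :: !logged

(* A toy script whose behaviour depends on the "request". *)
let script = function
  | "exit" -> raise Exit
  | "boom" -> failwith "boom"
  | _ -> ()

(* All four requests are processed; only "boom" leaves a log entry. *)
let () = List.iter (protect ~log script) [ "ok"; "exit"; "boom"; "ok" ]
```

A real implementation would probably want to let a few exceptions
through (out of memory, for instance); the sketch catches everything
for brevity.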

> > I do not understand why [Netcgi_jserv.server_init] is not just
> > included in [server_loop].
> 
> This is not possible for multi-processing models: server_init runs in
> the master process, and server_loop in every child process.

I am wondering: why have concurrent, possibly mutex-protected, accepts
on the socket instead of having one process listening on it and
dispatching (to processes or threads) on each accept?

> > * all connectors are treated equally
>
> As explained, this isn't as simple as you think. The connectors
> aren't equal, and the user must know that, and merging the specific
> differences into a common standard is far from being trivial.

The purpose is not to hide *all* differences between the connectors
but, as much as possible, to do the same things the same way.  In
other words: have them share a common "philosophy".

> I am currently thinking about a system of configuration objects. [...]

Yes, they are a good idea and may help designing something elegant.

> > > > About arguments: is the mutability of arguments useful?
> 
> [...] the arguments are often considered as session state, passed
> from one web page to the next [...] mutability is quite natural

I would agree if there was a way to "return" them to the next page.
But of course, there can't be at the level of the "connector".  So
having mutability at this level is in fact misleading.  IMO, session
preservation belongs to a "model" of web applications -- that is to a
library built on top of connectors [1].  It is the role of that
library to provide pages-as-functions, session management through
arguments/databases/continuations/session-key/cookies,... [2].  The
bottom line is that I believe that connectors should not "push"
towards a given model -- as moreover they can only provide a half
baked solution [3].

---
[1] It is important I believe that the community who can develop such
    "higher level" libraries is offered a simple and standard
    interface to connectors (including mod_caml if you ask me).

[2] I've been dreaming of a session module that can dialog with fully
    typed templates (a la Kartz, but typed) which lets you define the
    session variables you need and manages the state transparently...

[3] The arguments being strings or files, they are very much mutable by
    their very own nature.  However, the flexibility to directly
    modify them is barely matched by the interface (e.g. there is no
    need to "open" the argument for reading AND writing).

    Also, the session management may want to have "hidden" variables
    (like a session id that is automatically generated and not user's
    business) and it is not nice that there is a way to modify those
    "behind the back" of the library.

> > [std_environment] and [test_environment]
> > [custom_environment] is fine
> 
> Ok, this exception exposes internals. But, as already pointed out, I
> don't see why this is so bad.

Well, that will not give me nightmares either.  Nonetheless, I am a
strong believer that interfaces should be minimal and hide irrelevant
details unless there is a strong case about it ("it's there but ignore
it" is not neat IMO).  Note that my question (now stripped) was
broader: what is the modularity gained by providing [std_environment]
and [test_environment] ? -- seems to me [custom_environment] is all we
need.

To speak on something concrete I attach a piece of the interface as I
see it.  It does not include the "extension" interface.  If we agree
that it is only useful for defining new connectors, it should either
be an inner module (hidden when -pack'ing) or a separate module.
Normal comments explain the intent or raise questions.  OCamldoc
comments document new functions.

Regards,
ChriS


---
P.S. tmp_directory in make_message_board (in file
netcgi_jserv_app.ml) should take its value from the config record.

[-- Attachment #2: Interface proposal --]
[-- Type: Text/Plain, Size: 5817 bytes --]

(*
 * Types and functions shared by all connectors
 ***********************************************************************)

module Random :
sig
  val init : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) -> string -> unit
  val init_from_file : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) ->
    ?length:int -> string -> unit
  val byte : unit -> int
  val sessionid : int -> string
    (** [sessionid length] generates a [8 * length] bit ([length] hex
        digits) random string which can be used for session IDs,
        random cookies and so on.  The string returned will be very
        random and hard to predict *)
end


class type argument = (* I do not see the point of the cgi prefix *)
object
  (* ... methods to be discussed ... *)

  method storage : [`Memory | `File of string]
    (* No need to define [store] as it is only used here -- saying it
       here makes it easier to figure out what the method is. *)
  method representation : [ `Simple of Netmime.mime_body
                          | `MIME of Netmime.mime_message ]
    (* Same justification as above: having a single point of entry to
       understand a method is easier. *)
end


type cookie = ... (* I do not see the point of the cgi prefix *)


class type environment =
object
  (* Only changed methods are listed *)

  method cookie : string -> cookie
  method cookies : (string * cookie) list
    (* The first one is convenient.  Moreover, both should use the
       cookie type. *)

  (* Since [set_input_state] and [set_output_state] are not supposed
     to be for the final user, it would be nicer if they did not
     appear here.

     In the same vein, I would not include [input_ch] and
     [input_state] in the *public* interface: they are only useful for
     the [cgi] to initialize itself. *)
end

class type cgi = (* formerly cgi_activation *)
object
  (* I believe short names are better so long as they are as readable *)
  method arg : string -> argument
  method arg_val : ?default:string -> string -> string (*was [argument_value]*)
  method arg_true : string -> bool
    (** This method returns false if the named parameter is missing,
        is an empty string, or is the string ["0"]. Otherwise it
        returns true. Thus the intent of this is to return true in the
        Perl sense of the word.  If a parameter appears multiple
        times, then this uses the first definition and ignores the
        others. *)
  method arg_all : string -> argument list (* formerly [multiple_argument] *)
  method args : (string * argument) list

  method url : ?protocol:Netcgi.protocol -> ... -> unit -> string

  method set_header : ?status:status -> ... -> unit -> unit
  method set_redirection_header : string -> unit
  method output : Netchannels.trans_out_obj_channel

  method log : Netchannels.out_obj_channel
    (** A channel whose data is appended to the webserver log. *)

  method finalize  : unit -> unit

  method environment : Netcgi.environment
  method request_method : [`GET | `HEAD | `POST | `DELETE | `PUT of argument]
    (* Single point of doc. *)
end



type config = {
  tmp_directory : string;
  tmp_prefix : string;
  permitted_http_methods : [`GET | `HEAD | `POST | `DELETE | `PUT] list;
                                                             (* Uniformity *)
  permitted_input_content_types : string list;
  input_content_length_limit : int;
  workarounds : [ `MSIE_Content_type_bug | `backslash_bug ] list;
    (* Single point of documentation. *)
}

(* These names better convey the intent I think *)
type output_type = [`Direct | `Transactional]
type arg_storage = [`Memory | `File | `Automatic]

(*
 * Specific connectors
 ***********************************************************************)
module CGI :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    (cgi -> unit) -> unit
end

module FCGI :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit

  (* Some flexible functions that allow any concurrency model.  Here is
     a possibility. *)
  val socket : ?backlog:int -> ?reuseaddr:bool ->
    ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr
    (** [socket ?backlog ?reuseaddr ?addr ?port] sets up a FCGI socket
        listening for incoming requests.

        @param backlog Length of the backlog queue (connections not yet
        accepted by the FCGI server)
        @param reuseaddr Whether to reuse the port
        @param addr defaults to localhost.
        @param port if not present, uses stdin (which is a socket on
        Unix or -- contrary to the spec -- a pipe on Windows).
    *)

  (* More thinking is needed on "accept" and "request-handler". *)
end

module AJP :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit

(* Some flexible functions that allow any concurrency model.  Similar
   to the ones for FCGI. *)
end

module Test :
sig
  val simple_arg : string -> string -> argument
  val mime_argument : ?work_around_backslash_bug:bool ->
    string -> Netmime.mime_message -> argument

  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?args:argument list ->
    ?meth:request_method ->
    (cgi -> unit) -> unit
    (* More flexibility is definitely required here -- along the lines
       of [custom_environment].  I am thinking one could e.g. be
       general enough to allow the output to be set into a frame, and
       another frame being used for control, logging,... -- a live
       debugger if you like! *)
end

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Common CGI interface
  2005-04-26 16:24               ` [Caml-list] " Eric Stokes
@ 2005-05-06 20:14                 ` Christophe TROESTLER
  0 siblings, 0 replies; 37+ messages in thread
From: Christophe TROESTLER @ 2005-05-06 20:14 UTC (permalink / raw)
  To: Eric Stokes; +Cc: caml-list, info, ocamlnet-devel

Hi!

First of all, let me say that the code for the FCGI connector will
need to be changed!  Indeed, Netcgi prides itself on not having the
same limitations as the other libraries, in particular the
Sys.max_string_length limit.  But the FCGI module very much has it as
the entire input for a request is accumulated into a string!

I also believe the FCGI connector should support multiplexing of
requests since it is part of the spec.

Also, as a minor remark, the object [fcgi_out_channel] does not
protect against [#output ""] unwittingly closing the channel.
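
The reason an empty write is dangerous is that, in the FastCGI
protocol, a stream is terminated by a record whose content length is
zero.  Below is a minimal sketch of a guard, assuming the channel
turns each write into one FCGI_STDOUT record.  The functions [record],
[output] and [close] are hypothetical and are not the ocamlnet
implementation.

```ocaml
(* FastCGI record header: version, type, request id (2 bytes), content
   length (2 bytes), padding length, reserved.  FCGI_STDOUT is type 6. *)
let fcgi_stdout = 6

let record ~request_id (data : string) : string =
  let len = String.length data in
  if len > 0xFFFF then invalid_arg "record: content too long";
  let b = Buffer.create (8 + len) in
  List.iter (fun i -> Buffer.add_char b (Char.chr (i land 0xFF)))
    [ 1; fcgi_stdout; request_id lsr 8; request_id; len lsr 8; len; 0; 0 ];
  Buffer.add_string b data;
  Buffer.contents b

(* Hypothetical guarded output: an empty string produces no record, so it
   cannot be mistaken for the end-of-stream marker. *)
let output ~request_id (emit : string -> unit) (data : string) : unit =
  if data <> "" then emit (record ~request_id data)

(* Closing is the only place that emits the empty record. *)
let close ~request_id (emit : string -> unit) : unit =
  emit (record ~request_id "")

(* Records emitted so far, most recent first. *)
let sent : string list ref = ref []

let () =
  let emit r = sent := r :: !sent in
  output ~request_id:1 emit "hello";
  output ~request_id:1 emit "";      (* silently ignored *)
  close ~request_id:1 emit
```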

On Tue, 26 Apr 2005, Eric Stokes <gremlin@itkinetix.com> wrote:
> 
> As far as fastcgi's process model is concerned [...] managed by the
> web server [...] stand alone.  The common case is the first group
> [...]  A slight variation [...] create multiple threads within the
> same process, with each thread running a serv loop [...]

Ok -- several threads can share a given socket, even without a mutex.

> second group [...] My view is that if you are writing an application
> for which the concurrency models provided by the web server are not
> sufficient then you are more than likely working on a very big
> project, 

Let me strongly disagree with that.

For example, where I work, they would not be very keen to let me run a
script on the web server (because of security concerns and space
reasons).  Moreover, OCaml support for AIX has been dropped recently
IIRC so that would not be possible anyway.

> and would most certainly reject out of hand any kind of silly canned
> attempt at a server construction kit I could provide in ocamlnet.

Is this how you would characterize what JSERV offers? ;)

In any case, if the way JSERV handles concurrency models is good
enough, then I see no reason why FCGI should not support that as well...
The ideal situation would be that a concurrency module be built on top
of a set of functions (through, say, a functor) for both FCGI and AJP.
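
As a rough illustration of the functor idea, the sketch below writes
one concurrency policy (here the trivial sequential one) once and
applies it to two toy protocols.  The signature [PROTOCOL] and the
modules [Toy_fcgi] and [Toy_ajp] are invented for the example; they
only mimic the shape of the real connectors.

```ocaml
(* Hypothetical signature: what a concurrency module would need from a
   protocol.  Neither the names nor the types come from ocamlnet. *)
module type PROTOCOL = sig
  val name : string
  type conn
  val handle : conn -> (string -> unit) -> unit  (* serve one connection *)
end

module Sequential (P : PROTOCOL) = struct
  (* Serve connections one after the other; returns how many were served. *)
  let serve (conns : P.conn list) (f : string -> unit) : int =
    List.iter (fun c -> P.handle c f) conns;
    List.length conns
end

(* Two toy protocols standing in for FCGI and AJP. *)
module Toy_fcgi = struct
  let name = "fcgi"
  type conn = string list          (* possibly several requests *)
  let handle reqs f = List.iter f reqs
end

module Toy_ajp = struct
  let name = "ajp"
  type conn = string               (* one request per connection *)
  let handle req f = f req
end

module F = Sequential (Toy_fcgi)
module A = Sequential (Toy_ajp)

(* Requests seen by the handler, most recent first. *)
let seen = ref []
let record r = seen := r :: !seen

let nf = F.serve [ [ "a"; "b" ]; [ "c" ] ] record
let na = A.serve [ "d" ] record
```

A forking or threaded policy would be another functor over the same
signature, which is what keeps the concurrency code independent of
the protocol.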

> there is a lot of good documentation on the workings of fastcgi, and
> on building fastcgi applications in general. We try to implement a
> fastcgi connector that is close enough to the standard that that
> documentation is useful.

Yes and no.  The interface is fairly different from the ones in
e.g. C++ or Perl, so some doc on how to use it for multiprocesses /
multithreads is welcome.

Regards,
ChriS


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Common CGI interface
  2005-05-06 20:14                 ` Christophe TROESTLER
@ 2005-05-10  0:07                   ` Christophe TROESTLER
  2005-05-10  0:10                   ` Christophe TROESTLER
  1 sibling, 0 replies; 37+ messages in thread
From: Christophe TROESTLER @ 2005-05-10  0:07 UTC (permalink / raw)
  To: info; +Cc: caml-list, ocamlnet-devel

[-- Attachment #1: Type: Text/Plain, Size: 5444 bytes --]

Hi,

Let me continue to develop some ideas about the socket-accept-handle
triplet.

- For [socket] (and [run]) of the AJP connector, I would add an
  optional parameter [?props] to be able to pass an optional property
  list -- the values set in this way supersede the values set by other
  optional arguments in order to be able to set defaults.  To be more
  flexible, I would also replace [jvm_emu_main] by a function parsing
  the arguments (allowing to set more arguments if needed).  That
  yields the following functions:

  val arg_parse : ?anon_fun:(string -> unit) -> ?usage_msg:string ->
    (Arg.key * Arg.spec * Arg.doc) list -> (string * string) list
  val props_of_file : string -> (string * string) list

  val run :
    ?props:(string * string) list ->
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit

  val socket : ?props:(string * string) list -> ?backlog:int ->
    ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr

  The equivalent to [jvm_emu_main(fun props auth addr port -> server f
  auth addr port)] now reads:

  AJP.run ~props:(AJP.arg_parse []) f
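
The precedence rule for [?props] (a property wins over the matching
optional argument, which thus acts as a default) can be sketched as
follows.  The helper [setting] and the property names "port",
"backlog" and "timeout" are assumptions made for the example, as is
the choice to ignore a malformed property.

```ocaml
(* Hypothetical helper: a property, when present and well formed, wins over
   the optional argument, which in turn wins over the built-in default. *)
let setting ~props ~key ~of_string ?arg ~default () =
  let from_props =
    match List.assoc_opt key props with
    | Some s -> (try Some (of_string s) with Failure _ -> None)
    | None -> None
  in
  match from_props, arg with
  | Some v, _ -> v
  | None, Some v -> v
  | None, None -> default

(* "backlog" is deliberately malformed to show the fallback. *)
let props = [ ("port", "8010"); ("backlog", "many") ]

(* The property supersedes the optional argument. *)
let port =
  setting ~props ~key:"port" ~of_string:int_of_string ~arg:8007 ~default:8009 ()

(* Malformed property: the optional argument is used. *)
let backlog =
  setting ~props ~key:"backlog" ~of_string:int_of_string ~arg:5 ~default:10 ()

(* Neither a property nor an argument: the built-in default is used. *)
let timeout =
  setting ~props ~key:"timeout" ~of_string:int_of_string ~default:30 ()
```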


- The [handle_connection] will not be in a separate module but instead
  adapts itself to the version present in the protocol (convenient but
  also useful e.g. if the app machine must handle several web servers
  with different versions of the protocol).

  How will [handle_connection] report that the app must shut down or
  restart?  So far, it was raising an exception.  However, I believe
  it is better to "force" the user to handle them, so they should be
  return values instead:

  type connection_handler =
      Unix.file_descr -> Unix.file_descr -> (cgi -> unit) -> connection_status
  and connection_status = Ok | Shutdown | Restart

  (The [Ok] is in case the server shuts down the connection after a
  request which I think it is entitled to do.)

  In case of threads, that also makes it easy to transmit the value to
  the "main thread" through a ['a Event.event] -- this will generally
  be preferred to pipes in a multi-threaded app, I guess.

  The [connection_handler] takes two file descriptors, one for input
  and the other one for output, for flexibility (e.g. building a
  chain,...).  Generally, both will be the same socket.  (Maybe the
  distinction is overkill.)

  The connection handler will execute the function [cgi -> unit] for
  each (well formed) request coming in and catch (and log) all
  exceptions (it will raise no exception itself).  As you see, I do
  not require the user to create a cgi object -- it is done
  automatically as most of the time one will want to do so.

  I think it does not make much sense to execute the function [cgi ->
  unit] in a separate thread or process for AJP as all requests are
  presented sequentially.  For FCGI with multiplexed requests it does,
  however, so some flexibility is required in that case (such
  flexibility is not provided by the current FCGI interface either).
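
To show what returning a status (rather than raising an exception)
buys, here is a sketch of an accept loop driven by
[connection_status].  The type is the one proposed above; [serve],
[outcome] and [toy_handle] are hypothetical, and a list of strings
stands in for the connections that accept would deliver.

```ocaml
type connection_status = Ok | Shutdown | Restart

(* What the accept loop decided to do once it stopped. *)
type outcome = Exhausted | Stopped | Restarted

(* [serve handle conns] runs [handle] on each connection in turn until one
   of them reports [Shutdown] or [Restart].  Returns the outcome and the
   number of connections handled.  A list stands in for accept(). *)
let serve (handle : 'c -> connection_status) (conns : 'c list) =
  let rec loop n = function
    | [] -> (Exhausted, n)
    | c :: rest ->
      (match handle c with
       | Ok -> loop (n + 1) rest
       | Shutdown -> (Stopped, n + 1)
       | Restart -> (Restarted, n + 1))
  in
  loop 0 conns

(* A toy handler: the "connection" itself says what to report. *)
let toy_handle = function
  | "shutdown" -> Shutdown
  | "restart" -> Restart
  | _ -> Ok

let r1 = serve toy_handle [ "a"; "b" ]
let r2 = serve toy_handle [ "a"; "shutdown"; "b" ]
```

Because the match on the status is exhaustive, the compiler warns if a
caller forgets one of the cases, which an exception would not do.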


- The "accept" part is the most difficult as it is where the
  peculiarities of the concurrency model express themselves the most.
  On the other hand, it is fairly generic in the sense it depends very
  little on the protocol one must handle.

  In fact, I do not think it is possible to define a function that
  will handle all cases -- or its interface will be prohibitively
  complicated, thus defeating its purpose.  Therefore, I believe the best
  is to handle common cases and leave it to the user to define their own
  specialized cases (plus I guess functors can be defined on top of
  [socket] and [handle_connection] as accept does not depend on the
  protocol).

  The first case is a new process per connection.  Since we need a
  pipe to communicate to the father the return value of the child, the
  signature is:

  val accept_fork  : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?fork:((Unix.file_descr -> unit) -> int * Unix.file_descr) ->
    connection_handler -> Unix.inet_addr -> unit

  where the first [Unix.file_descr] is in the child to write the
  return value and the second one is in the father to read it.  A
  typical fork function is

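  let fork f =
    (* [Unix.pipe] returns (read end, write end). *)
    let (infd, outfd) = Unix.pipe () in
    match Unix.fork() with
    | 0 -> Unix.close infd; f outfd; exit 0  (* child; or the double fork trick *)
    | n -> Unix.close outfd; n, infd         (* father reads from [infd] *)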

  As for threads, I guess the best is to use an event -- created by
  accept itself -- to transmit the return value, so the following
  interface should be enough:

  val accept_threads : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?thread:((unit -> unit) -> unit) ->
    connection_handler -> Unix.inet_addr -> unit

  (It should be possible to start a new thread every time or to use a
  thread pool with this.)

  Maybe I forgot some important cases and the above design should be
  refined but at least it conveys the idea.
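
For the forking case, the child has to send its return value to the
father through the pipe.  One possible wire format, purely as an
assumption, is a single character per status; the sketch below shows
the encoding and decoding only, with a string standing in for what the
father would read from the pipe.

```ocaml
type connection_status = Ok | Shutdown | Restart

(* Hypothetical one-byte encoding of the status written by the child. *)
let char_of_status = function
  | Ok -> 'o'
  | Shutdown -> 's'
  | Restart -> 'r'

let status_of_char = function
  | 'o' -> Some Ok
  | 's' -> Some Shutdown
  | 'r' -> Some Restart
  | _ -> None

(* What the father would do with the bytes read from the pipe: an empty
   read (child gone without writing) or an unknown byte gives [None]. *)
let read_status (s : string) : connection_status option =
  if String.length s = 0 then None else status_of_char s.[0]

(* Every status survives the encode/decode round trip. *)
let round_trip_ok =
  List.for_all
    (fun st -> read_status (String.make 1 (char_of_status st)) = Some st)
    [ Ok; Shutdown; Restart ]
```

How the father should react to [None] (treat it as [Ok], or as a
failure worth logging) is left open here, as it is in the proposal.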


The interface modified with the previous ideas is attached.

Regards,
ChriS



---
P.S. If we ever go in the direction I suggest, a possibility is to
develop it in a netcgi directory.  Not only will that make it possible
for the two versions to coexist, but it is also more natural as the
library name is netcgi (think of -I +netcgi).

[-- Attachment #2: netcgi.mli --]
[-- Type: Text/Plain, Size: 7339 bytes --]

(*
 * Types and functions shared by all connectors
 ***********************************************************************)

module Random :
sig
  val init : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) -> string -> unit
  val init_from_file : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) ->
    ?length:int -> string -> unit
  val byte : unit -> int
  val sessionid : int -> string
    (** [sessionid length] generates a [8 * length] bit ([length] hex
        digits) random string which can be used for session IDs,
        random cookies and so on.  The string returned will be very
        random and hard to predict *)
end


class type argument = (* I do not see the point of the cgi prefix *)
object
  (* ... methods to be discussed ... *)

  method storage : [`Memory | `File of string]
    (* No need to define [store] as it is only used here -- saying it
       here makes it easier to figure out what the method is. *)
  method representation : [ `Simple of Netmime.mime_body
                          | `MIME of Netmime.mime_message ]
    (* Same justification as above: having a single point of entry to
       understand a method is easier. *)
end


type cookie = ... (* I do not see the point of the cgi prefix *)


class type environment =
object
  (* Only changed methods are listed *)

  method cookie : string -> cookie
  method cookies : (string * cookie) list
    (* The first one is convenient.  Moreover, both should use the
       cookie type. *)

  (* Since [set_input_state] and [set_output_state] are not supposed
     to be for the final user, it would be nicer if they did not
     appear here.

     In the same vein, I would not include [input_ch] and
     [input_state] in the *public* interface: they are only useful for
     the [cgi] to initialize itself. *)
end

class type cgi = (* formerly cgi_activation *)
object
  (* I believe short names are better so long as they are as readable *)
  method arg : string -> argument
  method arg_val : ?default:string -> string -> string (*was [argument_value]*)
  method arg_true : string -> bool
    (** This method returns false if the named parameter is missing,
        is an empty string, or is the string ["0"]. Otherwise it
        returns true. Thus the intent of this is to return true in the
        Perl sense of the word.  If a parameter appears multiple
        times, then this uses the first definition and ignores the
        others. *)
  method arg_all : string -> argument list (* formerly [multiple_argument] *)
  method args : (string * argument) list

  method url : ?protocol:Netcgi.protocol -> ... -> unit -> string

  method set_header : ?status:status -> ... -> unit -> unit
  method set_redirection_header : string -> unit
  method output : Netchannels.trans_out_obj_channel

  method log : Netchannels.out_obj_channel
    (** A channel whose data is appended to the webserver log. *)

  method finalize  : unit -> unit

  method environment : Netcgi.environment
  method request_method : [`GET | `HEAD | `POST | `DELETE | `PUT of argument]
    (* Single point of doc. *)
end



type config = {
  tmp_directory : string;
  tmp_prefix : string;
  permitted_http_methods : [`GET | `HEAD | `POST | `DELETE | `PUT] list;
                                                             (* Uniformity *)
  permitted_input_content_types : string list;
  input_content_length_limit : int;
  workarounds : [ `MSIE_Content_type_bug | `backslash_bug ] list;
    (* Single point of documentation. *)
}


(*
 * Connectors
 ***********************************************************************)

(* These names better convey the intent I think *)
type output_type = [`Direct | `Transactional]
type arg_storage = [`Memory | `File | `Automatic]

type connection_handler =
    Unix.file_descr -> Unix.file_descr -> (cgi -> unit) -> connection_status
and connection_status = Ok | Shutdown | Restart

module CGI :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    (cgi -> unit) -> unit
end

module FCGI :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit
    (* [run] must handle the fact that on windows, apache communicates
       with FCGI scripts through named pipes.  *)

  (* Some flexible functions that allow any concurrency model.  Here is
     a possibility. *)
  val socket : ?backlog:int -> ?reuseaddr:bool ->
    ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr
    (** [socket ?backlog ?reuseaddr ?addr ?port] sets up a FCGI socket
        listening for incoming requests.

        @param backlog Length of the backlog queue (connections not yet
        accepted by the FCGI server)
        @param reuseaddr Whether to reuse the port
        @param addr defaults to localhost.
        @param port if not present, uses stdin (which is a socket on
        Unix or -- contrary to the spec -- a pipe on Windows).
    *)

(* Functions analogous to the ones of AJP *)
end

module AJP :
sig
  val arg_parse : ?anon_fun:(string -> unit) -> ?usage_msg:string ->
    (Arg.key * Arg.spec * Arg.doc) list -> (string * string) list
  val props_of_file : string -> (string * string) list

  val run :
    ?props:(string * string) list ->
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit


  val socket : ?props:(string * string) list -> ?backlog:int ->
    ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr
    (** [socket ?props ?backlog ?reuseaddr ?addr ?port] sets up an AJP
	socket listening for incoming requests.

	@param backlog Length of the backlog queue (connections not yet
	accepted by the AJP server)
	@param reuseaddr Whether to reuse the port
	@param addr defaults to localhost.
	@param port if not present, assume the program is launched
	by the web server.
    *)

  val accept_fork  : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?fork:((Unix.file_descr -> unit) -> int * Unix.file_descr) ->
    connection_handler -> Unix.inet_addr -> unit

  val accept_threads : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?thread:((unit -> unit) -> unit) ->
    connection_handler -> Unix.inet_addr -> unit

  val handle_connection : ?props:(string * string) list ->
    ?config:config -> ?auth:(int * string) -> connection_handler
end

module Test :
sig
  val simple_arg : string -> string -> argument
  val mime_arg : ?work_around_backslash_bug:bool ->
    string -> Netmime.mime_message -> argument

  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?args:cgi_argument list ->
    ?meth:request_method ->
    (cgi -> unit) -> unit
    (* More flexibility is definitely required here -- along the lines
       of [custom_environment].  I am thinking one could e.g. be
       general enough to allow the output to be set into a frame, and
       another frame being used for control, logging,... -- a live
       debugger if you like! *)
end


* Re: [Caml-list] Common CGI interface
  2005-05-06 20:14                 ` Christophe TROESTLER
  2005-05-10  0:07                   ` [Caml-list] " Christophe TROESTLER
@ 2005-05-10  0:10                   ` Christophe TROESTLER
  1 sibling, 0 replies; 37+ messages in thread
From: Christophe TROESTLER @ 2005-05-10  0:10 UTC
  To: info; +Cc: caml-list, ocamlnet-devel

[-- Attachment #1: Type: Text/Plain, Size: 5444 bytes --]

Hi,

Let me continue to develop some ideas about the socket-accept-handle
triplet.

- For [socket] (and [run]) of the AJP connector, I would add an
  optional parameter [?props] to be able to pass an optional property
  list -- the values set in this way supersede the values set by other
  optional arguments in order to be able to set defaults.  To be more
  flexible, I would also replace [jvm_emu_main] by a function parsing
  the arguments (allowing more arguments to be set if needed).  That
  yields the following functions:

  val arg_parse : ?anon_fun:(string -> unit) -> ?usage_msg:string ->
    (Arg.key * Arg.spec * Arg.doc) list -> (string * string) list
  val props_of_file : string -> (string * string) list

  val run :
    ?props:(string * string) list ->
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit

  val socket : ?props:(string * string) list -> ?backlog:int ->
    ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr

  The equivalent to [jvm_emu_main(fun props auth addr port -> server f
  auth addr port)] now reads:

  AJP.run ~props:(AJP.arg_parse []) f
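
  To illustrate the precedence rule above (a property, when present,
  wins over the corresponding optional argument), here is a minimal
  sketch.  The helper [resolve], the key ["bindport"] and the fallback
  port 8007 are assumptions made up for the example; they are not part
  of the proposed interface.

```ocaml
(* Hypothetical helper: resolve one setting, letting the property list
   win over the optional argument, which in turn wins over the default. *)
let resolve ?(props = []) ~key ~of_string ~default opt_arg =
  match List.assoc_opt key props with
  | Some v -> of_string v            (* the property supersedes the rest *)
  | None ->
    (match opt_arg with
     | Some v -> v                   (* explicit optional argument *)
     | None -> default)              (* built-in fallback *)

(* Example use for a port setting; key name and default are assumed. *)
let port_of ?props ?port () =
  resolve ?props ~key:"bindport" ~of_string:int_of_string ~default:8007 port
```

  With this, [port_of ~props:["bindport", "8100"] ~port:9000 ()] picks
  the value from the property list, while [port_of ~port:9000 ()] falls
  back to the optional argument.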


- The [handle_connection] will not be in a separate module but instead
  adapts itself to the version present in the protocol (convenient but
  also useful e.g. if the app machine must handle several web servers
  with different versions of the protocol).

  How will [handle_connection] report that the app must shut down or
  restart?  So far, it was raising an exception.  However, I believe
  it is better to "force" the user to handle them, so they should be
  return values instead:

  type connection_handler =
      Unix.file_descr -> Unix.file_descr -> (cgi -> unit) -> connection_status
  and connection_status = Ok | Shutdown | Restart

  (The [Ok] is in case the server shuts down the connection after a
  request which I think it is entitled to do.)

  In case of threads, that also makes it easy to transmit the value to
  the "main thread" through a ['a Event.event] -- this will generally
  be preferred to pipes in a multi-threaded app, I guess.

  The [connection_handler] takes two file descriptors, one for input
  and the other one for output, for flexibility (e.g. building a
  chain,...).  Generally, both will be the same socket.  (Maybe the
  distinction is overkill.)

  The connection handler will execute the function [cgi -> unit] for
  each (well formed) request coming in and catch (and log) all
  exceptions (it will raise no exception itself).  As you see, I do
  not require the user to create a cgi object -- it is done
  automatically as most of the time one will want to do so.

  I think it does not make much sense to execute the function [cgi ->
  unit] in a separate thread or process for AJP as all requests are
  presented sequentially.  For FCGI with multiplexed requests it does
  however, so some flexibility is required in that case (such
  flexibility is not provided by the current FCGI interface either).
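
  As a rough sketch of what returning a [connection_status] buys (the
  caller has to match on it), a sequential driver could look as
  follows.  [serve], [next_conn] and [handle] are hypothetical names:
  [next_conn] stands in for accepting a connection and [handle] for a
  [connection_handler] already applied to its descriptors and to the
  [cgi -> unit] function.

```ocaml
type connection_status = Ok | Shutdown | Restart

(* Hypothetical sequential driver: keep serving connections until a
   handler asks for a shutdown or a restart, then run the callback. *)
let rec serve ~on_shutdown ~on_restart ~next_conn ~handle =
  match handle (next_conn ()) with
  | Ok -> serve ~on_shutdown ~on_restart ~next_conn ~handle
  | Shutdown -> on_shutdown ()
  | Restart -> on_restart ()
```

  Since the match is exhaustive, the compiler warns if one of the
  three cases is forgotten, which an exception would not do.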


- The "accept" part is the most difficult as it is where the
  peculiarities of the concurrency model express themselves the most.
  On the other hand, it is fairly generic in the sense it depends very
  little on the protocol one must handle.

  In fact, I do not think it is possible to define a function that
  will handle all cases -- or its interface will be prohibitively
  complicated, thus defeating its purpose.  Therefore, I believe the
  best is to handle common cases and leave it to the users to define
  their specialized cases (plus I guess functors can be defined on top
  of [socket] and [handle_connection] as accept does not depend on the
  protocol).

  The first case is a new process per connection.  Since we need a
  pipe to communicate to the father the return value of the child, the
  signature is:

  val accept_fork  : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?fork:((Unix.file_descr -> unit) -> int * Unix.file_descr) ->
    connection_handler -> Unix.inet_addr -> unit

  where the first [Unix.file_descr] is in the child to write the
  return value and the second one is in the father to read it.  A
  typical fork function is

  let fork f =
    let (infd, outfd) = Unix.pipe () in
    match Unix.fork () with
    | 0 ->
      Unix.close infd;        (* the child only writes *)
      f outfd; exit 0         (* or the double fork trick *)
    | n ->
      Unix.close outfd;       (* the father only reads *)
      (n, infd)
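
  How the return value travels through the pipe is left open above.
  One possibility, given only as an assumption, is a single byte per
  status.  The sketch below works on channels so that it runs without
  forking; with a real pipe one would wrap the descriptors using
  [Unix.out_channel_of_descr] and [Unix.in_channel_of_descr].  The
  names [report] and [collect] are made up for the example.

```ocaml
type connection_status = Ok | Shutdown | Restart

(* Assumed wire format (not specified in the proposal): one byte. *)
let char_of_status = function Ok -> 'O' | Shutdown -> 'S' | Restart -> 'R'

let status_of_char = function
  | 'O' -> Some Ok
  | 'S' -> Some Shutdown
  | 'R' -> Some Restart
  | _ -> None

(* Child side: write the handler's result to its end of the pipe. *)
let report oc status =
  output_char oc (char_of_status status);
  flush oc

(* Father side: [None] means the child ended without reporting
   (end of file) or sent an unknown byte. *)
let collect ic =
  match input_char ic with
  | c -> status_of_char c
  | exception End_of_file -> None
```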

  As for threads, I guess the best is to use an event -- created by
  accept itself -- to transmit the return value, so the following
  interface should be enough:

  val accept_threads : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?thread:((unit -> unit) -> unit) ->
    connection_handler -> Unix.inet_addr -> unit

  (It should be possible to start a new thread every time or to use a
  thread pool with this.)

  Maybe I forgot some important cases and the above design should be
  refined but at least it conveys the idea.


The interface modified with the previous ideas is attached.

Regards,
ChriS



---
P.S. If we ever go in the direction I suggest, a possibility is to
develop it in a netcgi directory.  Not only will that make it possible
for the two versions to coexist, but it is also more natural as the
library name is netcgi (think of -I +netcgi).

[-- Attachment #2: netcgi.mli --]
[-- Type: Text/Plain, Size: 7339 bytes --]

(*
 * Types and functions shared by all connectors
 ***********************************************************************)

module Random :
sig
  val init : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) -> string -> unit
  val init_from_file : ?lock:(unit -> unit) -> ?unlock:(unit -> unit) ->
    ?length:int -> string -> unit
  val byte : unit -> int
  val sessionid : int -> string
    (** [sessionid length] generates a [8 * length] bit ([length] hex
        digits) random string which can be used for session IDs,
        random cookies and so on.  The string returned will be very
        random and hard to predict *)
end


class type argument = (* I do not see the point of the cgi prefix *)
object
  (* ... methods to be discussed ... *)

  method storage : [`Memory | `File of string]
    (* No need to define [store] as it is only used here -- saying it
       here makes it easier to figure out what the method is. *)
  method representation : [ `Simple of Netmime.mime_body
                          | `MIME of Netmime.mime_message ]
    (* Same justification as above: having a single point of entry to
       understand a method is easier. *)
end


type cookie = ... (* I do not see the point of the cgi prefix *)


class type environment =
object
  (* Only changed methods are listed *)

  method cookie : string -> cookie
  method cookies : (string * cookie) list
    (* The first one is convenient.  Moreover, both should use the
       cookie type. *)

  (* Since [set_input_state] and [set_output_state] are not supposed
     to be for the final user, it would be nicer if they did not
     appear here.

     In the same vein, I would not include [input_ch] and
     [input_state] in the *public* interface: they are only useful for
     the [cgi] to initialize itself. *)
end

class type cgi = (* formerly cgi_activation *)
object
  (* I believe short names are better as long as they are as readable *)
  method arg : string -> argument
  method arg_val : ?default:string -> string -> string (*was [argument_value]*)
  method arg_true : string -> bool
    (** This method returns false if the named parameter is missing,
        is an empty string, or is the string ["0"]. Otherwise it
        returns true. Thus the intent of this is to return true in the
        Perl sense of the word.  If a parameter appears multiple
        times, then this uses the first definition and ignores the
        others. *)
  method arg_all : string -> argument list (* formerly [multiple_argument] *)
  method args : (string * argument) list

  method url : ?protocol:Netcgi.protocol -> ... -> unit -> string

  method set_header : ?status:status -> ... -> unit -> unit
  method set_redirection_header : string -> unit
  method output : Netchannels.trans_out_obj_channel

  method log : Netchannels.out_obj_channel
    (** A channel whose data is appended to the webserver log. *)

  method finalize  : unit -> unit

  method environment : Netcgi.environment
  method request_method : [`GET | `HEAD | `POST | `DELETE | `PUT of argument]
    (* Single point of doc. *)
end



type config = {
  tmp_directory : string;
  tmp_prefix : string;
  permitted_http_methods : [`GET | `HEAD | `POST | `DELETE | `PUT] list;
                                                             (* Uniformity *)
  permitted_input_content_types : string list;
  input_content_length_limit : int;
  workarounds : [ `MSIE_Content_type_bug | `backslash_bug ] list;
    (* Single point of documentation. *)
}


(*
 * Connectors
 ***********************************************************************)

(* These names better convey the intent I think *)
type output_type = [`Direct | `Transactional]
type arg_storage = [`Memory | `File | `Automatic]

type connection_handler =
    Unix.file_descr -> Unix.file_descr -> (cgi -> unit) -> connection_status
and connection_status = Ok | Shutdown | Restart

module CGI :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    (cgi -> unit) -> unit
end

module FCGI :
sig
  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit
    (* [run] must handle the fact that on windows, apache communicates
       with FCGI scripts through named pipes.  *)

  (* Some flexible functions that allow any concurrency model.  Here is
     a possibility. *)
  val socket : ?backlog:int -> ?reuseaddr:bool ->
    ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr
    (** [socket ?backlog ?reuseaddr ?addr ?port] sets up an FCGI socket
	listening for incoming requests.

	@param backlog Length of the backlog queue (connections not yet
	accepted by the FCGI server)
	@param reuseaddr Whether to reuse the port
	@param addr defaults to localhost.
	@param port if not present, uses stdin (which is a socket on
	Unix or -- contrary to the spec -- a pipe on Windows).
    *)

(* Functions analogous to the ones of AJP *)
end

module AJP :
sig
  val arg_parse : ?anon_fun:(string -> unit) -> ?usage_msg:string ->
    (Arg.key * Arg.spec * Arg.doc) list -> (string * string) list
  val props_of_file : string -> (string * string) list

  val run :
    ?props:(string * string) list ->
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?sockaddr:Unix.sockaddr ->
    (cgi -> unit) -> unit


  val socket : ?props:(string * string) list -> ?backlog:int ->
    ?reuseaddr:bool -> ?addr:Unix.inet_addr -> ?port:int -> Unix.file_descr
    (** [socket ?props ?backlog ?reuseaddr ?addr ?port] sets up an AJP
	socket listening for incoming requests.

	@param backlog Length of the backlog queue (connections not yet
	accepted by the AJP server)
	@param reuseaddr Whether to reuse the port
	@param addr defaults to localhost.
	@param port if not present, assume the program is launched
	by the web server.
    *)

  val accept_fork  : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?fork:((Unix.file_descr -> unit) -> int * Unix.file_descr) ->
    connection_handler -> Unix.inet_addr -> unit

  val accept_threads : ?props:(string * string) list ->
    ?onrestart:(unit -> unit) -> ?onshutdown:(unit -> unit) ->
    ?allow_hosts:Unix.inet_addr list ->
    ?thread:((unit -> unit) -> unit) ->
    connection_handler -> Unix.inet_addr -> unit

  val handle_connection : ?props:(string * string) list ->
    ?config:config -> ?auth:(int * string) -> connection_handler
end

module Test :
sig
  val simple_arg : string -> string -> argument
  val mime_arg : ?work_around_backslash_bug:bool ->
    string -> Netmime.mime_message -> argument

  val run :
    ?config:config ->
    ?output_type:output_type ->
    ?arg_storage:(string -> Netmime.mime_header -> arg_storage) ->
    ?args:cgi_argument list ->
    ?meth:request_method ->
    (cgi -> unit) -> unit
    (* More flexibility is definitely required here -- along the lines
       of [custom_environment].  I am thinking one could e.g. be
       general enough to allow the output to be set into a frame, and
       another frame being used for control, logging,... -- a live
       debugger if you like! *)
end
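
The Perl-style truth test documented for [arg_true] in the interface
above can be sketched as a plain function.  This is only an
illustration: [args] is an association list standing in for the
request parameters, whereas the real method would consult the [cgi]
object.

```ocaml
(* Sketch of the documented [arg_true] semantics over an association
   list (an assumption standing in for the request's parameters). *)
let arg_true args name =
  match List.assoc_opt name args with   (* first definition wins *)
  | None | Some "" | Some "0" -> false  (* missing, empty, or "0" *)
  | Some _ -> true
```

For example, [arg_true ["debug", "0"; "debug", "1"] "debug"] is false
because only the first definition is considered.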


end of thread, other threads:[~2005-05-10  0:10 UTC | newest]

Thread overview: 37+ messages
2005-04-18  6:15 CamlGI question Mike Hamburg
2005-04-18  7:29 ` [Caml-list] " Robert Roessler
2005-04-18 13:49   ` Alex Baretta
2005-04-18 14:31     ` Gerd Stolpmann
2005-04-18 16:04       ` Michael Alexander Hamburg
2005-04-18 16:28         ` Alex Baretta
2005-04-19  3:23           ` Mike Hamburg
2005-04-19  3:26             ` [Caml-list] CamlGI question [doh] Mike Hamburg
2005-04-19  9:18               ` Gerd Stolpmann
2005-04-19 15:28                 ` Mike Hamburg
     [not found]                   ` <1113933973.6248.76.camel@localhost.localdomain>
2005-04-19 18:44                     ` Eric Stokes
2005-04-19 19:18                       ` Christophe TROESTLER
2005-04-19 21:11                     ` Eric Stokes
2005-04-19  9:31               ` Alex Baretta
2005-04-19 11:33 ` [Caml-list] CamlGI question Christophe TROESTLER
2005-04-19 12:51   ` Christopher Alexander Stein
2005-04-19 19:03     ` Common CGI interface (was: [Caml-list] CamlGI question) Christophe TROESTLER
2005-04-19 19:54       ` Gerd Stolpmann
2005-04-20  6:55         ` Jean-Christophe Filliatre
2005-04-20  7:22         ` Common XML interface (was: Common CGI interface) Alain Frisch
2005-04-20 11:15           ` [Caml-list] " Gerd Stolpmann
2005-04-20 11:38             ` Nicolas Cannasse
2005-04-20 13:23           ` Stefano Zacchiroli
2005-04-21  6:59             ` [Caml-list] Common XML interface Alain Frisch
2005-04-21 11:34               ` Gerd Stolpmann
2005-04-20 20:00         ` Common CGI interface Christophe TROESTLER
2005-04-20 21:06           ` [Caml-list] " Gerd Stolpmann
2005-04-21  7:36             ` [Ocamlnet-devel] " Florian Hars
2005-04-21 10:41               ` Gerd Stolpmann
2005-04-25 10:38             ` Christophe TROESTLER
2005-04-26 11:08               ` Gerd Stolpmann
2005-05-06 20:14                 ` Christophe TROESTLER
2005-05-10  0:07                   ` [Caml-list] " Christophe TROESTLER
2005-05-10  0:10                   ` Christophe TROESTLER
2005-04-26 16:24               ` [Caml-list] " Eric Stokes
2005-05-06 20:14                 ` Christophe TROESTLER
2005-04-19 20:13   ` [Caml-list] CamlGI question Michael Alexander Hamburg
