From mboxrd@z Thu Jan 1 00:00:00 1970
In-Reply-To: <1113933973.6248.76.camel@localhost.localdomain>
References: <810e04dc0f6bb601fb828db8d18def6c@fas.harvard.edu>
 <426361DB.10704@rftp.com> <4263BAD0.5020901@barettadeit.com>
 <1113834691.6248.42.camel@localhost.localdomain>
 <4263E01D.5090705@barettadeit.com> <426479A2.1020404@fas.harvard.edu>
 <42647A4B.3050507@fas.harvard.edu>
 <1113902296.6248.47.camel@localhost.localdomain>
 <42652387.4030006@fas.harvard.edu>
 <1113933973.6248.76.camel@localhost.localdomain>
Mime-Version: 1.0 (Apple Message framework v622)
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
Cc: caml-list@inria.fr, Mike Hamburg
From: Eric Stokes
Subject: Re: [Caml-list] CamlGI question [doh]
Date: Tue, 19 Apr 2005 11:44:56 -0700
To: Gerd Stolpmann
X-Mailer: Apple Mail (2.622)

I have several multi-threaded applications written with Netcgi_fastcgi,
and I have seen problems like this, though not to this degree. However,
I have only tested on Apache 1.3.x and 2.0.x with mod_fastcgi 2.4.2.

That said, printing to stdout should not cause a problem: because
stdout is a listening socket, the data would essentially go into a
buffer which will never be read.

I think the problem may actually be single_write vs. write. When I
originally wrote the fastcgi code for Ocamlnet I was new to OCaml, and
I did not realize that Unix.write was different from the libc
equivalent. Under load, or in some situations, using Unix.write will
definitely cause a protocol error, and that could explain some strange
intermittent behavior that I have seen. Therefore I will commit a patch
to the fastcgi library to use single_write instead of write, and we'll
see what happens. I'll also try to determine the effect of an error,
and get back to you.

Mike, thanks for sticking with us and providing data; we appreciate it
a lot.

On Apr 19, 2005, at 11:06 AM, Gerd Stolpmann wrote:

> On Tuesday, 19.04.2005, at 11:28 -0400, Mike Hamburg wrote:
>> I'll try to help work out this bug in either Apache/FastCGI or Netcgi,
>> and if that doesn't work, I'll go to AJP. I compiled mod_fastcgi-2.4.0
>> on Apache 2.0.53. It successfully serves FastCGI pages written with
>> CamlGI, except that it sometimes pauses after 8k of data, and
>> sometimes breaks the pipe to the CGI.
>
> Just to make sure Netcgi also works for multi-threaded programs, I ran
> a test myself. I added a second thread to the example add_fastcgi.ml
> (of Ocamlnet), so that it now reads
>
>   let a = ref ["Z"; "X"; "A"; "P"; "B" ];;
>
>   Thread.create
>     (fun () ->
>        while true do
>          a := List.sort compare !a
>        done
>     )
>     ()
>   ;;
>
>   (* start the fastcgi server *)
>   serv process buffered_transactional_optype;;
>
> at the end. So there is now a thread sorting this list over and over
> again.
>
> I tried it with Apache 2.0.50 and mod_fastcgi 2.4.2 (the versions
> that come with Ubuntu Warty). The first test failed; I had to comment
> out the FastCgiWrapper which is part of the default configuration.
> After that, the program just worked.
>
> I also looked at the sources. There might be problems because the
> O'Caml thread implementation uses signals to wake up threads in some
> situations. This may cause Unix.read/write (one should rather use
> Unix.single_write, btw.) to fail with EINTR. This is not caught.
>
>> Under Netcgi 1.0, the CGI version works but the FastCGI version does
>> not work at all. The only difference between the CGI and FastCGI
>> versions is:
>>
>> let () = Netcgi_fcgi.serv main (`Direct "


");
>> (* let cgi = new Netcgi.std_activation ()
>>    let () = main cgi *)
>>
>> The application reads files from the outside world, and may launch
>> processes with fork+execv to thumbnail images. It is multithreaded,
>> with at least 4 threads at all times: the main thread, a thumbnail
>> cache manager, a thread which handles signals (this cannot be done
>> in a signal handler, as it may block, so I wake up the handler
>> thread instead), and a thread which rereads the configuration files
>> periodically, so that the user does not have to send HUP.
>>
>> The application usually raises Unix.Unix_error (EPIPE, "write", "") on
>> the line Netcgi_fcgi.serv main (`Direct "


") or rather,
>> inside Netcgi_fcgi.serv at some point; I believe this occurs before
>> main is called.
>
> You can figure this out exactly by byte-compiling the program with
> -g and setting OCAMLRUNPARAM=b=1, e.g. using a wrapper script and
> redirecting stderr to a file.
>
> The EPIPE error is quite surprising, because Netcgi wraps all its
> exceptions into FCGI_Error.
>
>> The webserver also receives EPIPE and returns error 500.
>> Sometimes, I get other errors: sometimes main gets called, and I get
>> Failure("send_output_header").
>
> This is very strange, too. Failure("send_output_header") can only
> happen when the HTTP header is printed at the wrong moment (i.e. too
> late).
>
>> If this happens, then rollback_output
>> successfully prints "


" but no more (i.e. no further
>> diagnostics). Sometimes, the webserver times out waiting for the
>> first read, in which case the application still raises EPIPE.
>>
>> Is there any more information that would help to debug the problem?
>
> I suspect a part of the program prints something to stdout at the wrong
> moment, maybe because of a race condition. This is only my intuition,
> but it is the first thing I would look for. It would explain why the
> fcgi protocol is violated, and EPIPE is the logical consequence
> (the web server sees wrong data and shuts down the socket).
>
> Maybe an strace of the process can show what is really going on.
>
> Gerd
>
>>
>> Mike
>>
>> Alex Baretta wrote:
>>
>>> If at all possible, rather than AJP I'd stick to FastCGI. The code
>>> in Ocamlnet reportedly works fine, but I don't have experience with
>>> it. Are you sure that the broken pipe is caused by a bug in Ocamlnet
>>> and not in the web server?
>>>
>>> Alex
>>
>> Gerd Stolpmann wrote:
>>
>>> On Monday, 18.04.2005, at 23:26 -0400, Mike Hamburg wrote:
>>>
>>>> I'm obviously too tired to be programming. That should, of course,
>>>> read
>>>> (1) fix a broken pipe error in Netcgi_fast or
>>>
>>> Well, we would need more details to help you.
>>>
>>>> (2) port the application to (and configure the webserver for) AJP
>>>
>>> Porting the application is quite simple, just follow the examples
>>> coming with Ocamlnet (in examples/jserv). For the webserver, you
>>> need mod_jk (jk=Jakarta). The configuration reference is here:
>>>
>>> http://jakarta.apache.org/tomcat/connectors-doc/config/apache.html
>>>
>>> Note that Ocamlnet only supports AJP-1.2, not 1.3 which is the
>>> current default for Tomcat.
>>>
>>> Gerd
>
> --
> ------------------------------------------------------------
> Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
> gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de
> ------------------------------------------------------------
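
P.S. To make the single_write vs. write point at the top of this message concrete, here is a minimal sketch of a write loop. It is not the actual Ocamlnet patch: really_write is a hypothetical name, and the sketch assumes a current OCaml where the Unix functions take bytes buffers. Unix.write repeats the write until all bytes are written or an error occurs, so after an error the caller cannot tell how much went out; Unix.single_write attempts one write only and reports how many bytes were accepted, which lets the caller retry on EINTR (the case Gerd notes is not caught) without losing its place in the record.

```ocaml
(* Hypothetical helper, not part of Ocamlnet: write the slice
   buf[ofs .. ofs+len-1] to fd using one Unix.single_write per step. *)
let rec really_write fd buf ofs len =
  if len > 0 then begin
    let n =
      (* EINTR means the call was interrupted before any byte was
         written, so count 0 bytes and retry the same slice. *)
      (try Unix.single_write fd buf ofs len
       with Unix.Unix_error (Unix.EINTR, _, _) -> 0)
    in
    really_write fd buf (ofs + n) (len - n)
  end

(* Small demonstration over a pipe; the payload is arbitrary. *)
let () =
  let rd, wr = Unix.pipe () in
  let msg = Bytes.of_string "example record payload" in
  really_write wr msg 0 (Bytes.length msg);
  Unix.close wr;
  Unix.close rd
```

Any other Unix_error (EPIPE included) still propagates to the caller, so this only changes the handling of interrupted and partial writes. The program needs to be linked against the unix library.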