From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.9 required=5.0 tests=AWL,SPF_SOFTFAIL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 55871BC37 for ; Wed, 29 Apr 2009 08:27:09 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AmcCAE+Q90mp5TxXe2dsb2JhbACWbwEBFiIFpmyHbIhNg3QF X-IronPort-AV: E=Sophos;i="4.40,265,1238968800"; d="scan'208";a="26954284" Received: from gateway0.eecs.berkeley.edu ([169.229.60.87]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 29 Apr 2009 08:27:08 +0200 Received: from [10.0.1.6] (adsl-70-137-145-160.dsl.snfc21.sbcglobal.net [70.137.145.160]) (authenticated bits=0) by gateway0.EECS.Berkeley.EDU (8.14.3/8.13.5) with ESMTP id n3T6R2Mm004397 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT); Tue, 28 Apr 2009 23:27:03 -0700 (PDT) Cc: Markus Mottl , OCaml List Message-Id: <08C6A0FC-90F2-43A4-AB78-4A3B68291FAA@cs.berkeley.edu> From: Brighten Godfrey To: Alain Frisch In-Reply-To: <49F7F135.5080504@gmail.com> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v930.3) Subject: Re: [Caml-list] Strange performance bug Date: Tue, 28 Apr 2009 23:27:02 -0700 References: <324B24CA-9671-42C0-B722-C7710C0C45C7@cs.berkeley.edu> <59BD1B3A-B449-4963-9910-ED5E755D00E6@cs.berkeley.edu> <49F7F135.5080504@gmail.com> X-Mailer: Apple Mail (2.930.3) X-Spam: no; 0.00; bug:01 frisch:01 regexp:01 bug:01 pcre:01 regexps:01 ocaml:01 pcre:01 regexps:01 ocaml:01 28,:98 parsed:01 wrote:01 wrote:01 compile:01 On Apr 28, 2009, at 11:18 PM, Alain Frisch wrote: > Brighten Godfrey wrote: >> (Changing to the precompiled regexp does make this bug go away -- >> but so do many other small changes, like commenting out the last >> line of the code, *after* the parsing is complete.) > > This last line (List.length first) + ...) forces the values first, > second and third to remain alive. How big are these values in memory? > > I would suggest to look closely at the memory usage and GC behavior > of your program. I did not investigate your problem, so here is a > random speculation: If the parsed file is big enough, PCRE will > compile a lot of regexps; maybe the OCaml PCRE binding does not > evaluate properly the memory usage of these regexps and so OCaml > does not trigger a GC soon enough to release the compiled regexps; > the process memory grows and the OS starts to swap your process. That occurred to me too, but there is no swapping. The process uses less than 40 MB of memory. Also, this wouldn't explain why it suddenly becomes slow exactly when it starts parsing the file the second time. ~Brighten