From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=1.0 required=5.0 tests=AWL,SPF_SOFTFAIL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id 5B23FBC37 for ; Wed, 29 Apr 2009 10:29:17 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AmcCAOqs90mp5TxXe2dsb2JhbACWbwEBFiIFpiWHeohNg3QF X-IronPort-AV: E=Sophos;i="4.40,265,1238968800"; d="scan'208";a="39197216" Received: from gateway0.eecs.berkeley.edu ([169.229.60.87]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 29 Apr 2009 10:29:15 +0200 Received: from [10.0.1.6] (adsl-70-137-145-160.dsl.snfc21.sbcglobal.net [70.137.145.160]) (authenticated bits=0) by gateway0.EECS.Berkeley.EDU (8.14.3/8.13.5) with ESMTP id n3T8T7Jx005306 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT); Wed, 29 Apr 2009 01:29:09 -0700 (PDT) Cc: Markus Mottl , OCaml List Message-Id: <6D9C5A68-1874-4BBC-AE3D-9CCC3614AF7C@cs.berkeley.edu> From: Brighten Godfrey To: Alain Frisch In-Reply-To: <49F7F59B.7070204@frisch.fr> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v930.3) Subject: Re: [Caml-list] Strange performance bug Date: Wed, 29 Apr 2009 01:29:07 -0700 References: <324B24CA-9671-42C0-B722-C7710C0C45C7@cs.berkeley.edu> <59BD1B3A-B449-4963-9910-ED5E755D00E6@cs.berkeley.edu> <49F7F135.5080504@gmail.com> <08C6A0FC-90F2-43A4-AB78-4A3B68291FAA@cs.berkeley.edu> <49F7F59B.7070204@frisch.fr> X-Mailer: Apple Mail (2.930.3) X-Spam: no; 0.00; bug:01 frisch:01 pcre:01 regexps:01 ocaml:01 regexp:01 ocaml:01 pcre:01 regexp:01 28,:98 continuously:98 wrote:01 wrote:01 parsing:01 heap:01 On Apr 28, 2009, at 11:37 PM, Alain Frisch wrote: > Brighten Godfrey wrote: >> That occurred to me too, but there is no swapping. The process >> uses less than 40 MB of memory. Also, this wouldn't explain why it >> suddenly becomes slow exactly when it starts parsing the file the >> second time. > > This point is interesting: it is where PCRE starts compiling a lot > of regexps while the OCaml heap is already full of non-garbage value > (the value bound to first). For each allocated regexp, the OCaml > PCRE binding pushes some pressure on the OCaml GC; maybe, then, the > GC runs too often for the amount of memory it can really release. I know nothing about the internals of these libraries. But, the program is continuously reading lines from the file. Thus, isn't there about the same amount of memory on the heap just before the problem starts and just after the problem starts? I guess it is plausible that somehow, closing the file and re-opening it triggers a bad interaction with the GC... But in comparison, using Str in the same way (i.e., compiling the regexp every time it is used) works fine. Thanks, ~Brighten