From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.1 required=5.0 tests=AWL,SPF_SOFTFAIL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id D4163BC37; Wed, 29 Apr 2009 21:19:09 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ah8BAMdE+Emp5TxXkWdsb2JhbACWaAEBAQEJCwoHEQWoJogoiE2DdQU X-IronPort-AV: E=Sophos;i="4.40,267,1238968800"; d="scan'208";a="26997537" Received: from gateway0.eecs.berkeley.edu ([169.229.60.87]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 29 Apr 2009 21:19:08 +0200 Received: from [10.0.1.6] (adsl-70-137-145-160.dsl.snfc21.sbcglobal.net [70.137.145.160]) (authenticated bits=0) by gateway0.EECS.Berkeley.EDU (8.14.3/8.13.5) with ESMTP id n3TJJ2FL015701 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT); Wed, 29 Apr 2009 12:19:03 -0700 (PDT) Cc: OCaml List Message-Id: <25CD1992-3F33-4AEA-97A3-EC54729EC4FE@cs.berkeley.edu> From: Brighten Godfrey To: Markus Mottl , Damien Doligez In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v930.3) Subject: Re: [Caml-list] Strange performance bug Date: Wed, 29 Apr 2009 12:19:01 -0700 References: <324B24CA-9671-42C0-B722-C7710C0C45C7@cs.berkeley.edu> <59BD1B3A-B449-4963-9910-ED5E755D00E6@cs.berkeley.edu> <49F7F135.5080504@gmail.com> <08C6A0FC-90F2-43A4-AB78-4A3B68291FAA@cs.berkeley.edu> <49F7F59B.7070204@frisch.fr> <6D9C5A68-1874-4BBC-AE3D-9CCC3614AF7C@cs.berkeley.edu> <0FDD87FB-2461-4BBC-BDB1-5316A5AE4D23@inria.fr> X-Mailer: Apple Mail (2.930.3) X-Spam: no; 0.00; bug:01 markus:01 mottl:01 damien:01 damien:01 markus:01 pcre:01 pcre:01 regexp:01 allocations:01 regexps:01 2009:98 48,:98 steadily:98 steadily:98 On Apr 29, 2009, at 9:03 AM, Markus Mottl wrote: > On Wed, Apr 29, 2009 at 10:48, Damien Doligez > wrote: >> Markus, you put your finger right on the problem. That program >> doesn't >> suddenly start to get slow, it gets steadily slower as it runs. The >> heap also gets steadily bigger, and the major GC does way too much >> work. > > Right, the combination of forcing the GC very often through PCRE and > keeping an ever increasing number of values around obviously leads to > this slowdown. Makes sense. (Though, i'd point out that there *are* discrete moments of sudden slowdown -- after each parse completes. Maybe the GC doesn't scan through the results until they are returned by the parse function...?) >> Maybe PCRE should change its settings to trigger GCs less often but, >> as Markus said, this doesn't look really important. > > I've never seen a case in practice where correctly done regexp > allocations happen so often that excessive full major collections are > triggered. There is no optimal "general" setting for how aggressive > the GC should be with regexps, but the current value should work fine > for just about anybody. A still reasonable higher setting would > probably not solve performance bugs of the sort above anyway. You seem to have solved my problem. But out of curiosity: Do you know how the GC settings in the Str library differ from PCRE's, and why? Why does PCRE need to explicitly trigger GCs at all? Thanks for your replies, ~Brighten