From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.9 required=5.0 tests=AWL,DNS_FROM_RFC_POST, SPF_NEUTRAL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id F39F5BC37 for ; Wed, 29 Apr 2009 21:39:00 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgUIAO9J+EnRVdypdWdsb2JhbACDAJMpPwEMCgsHEaZpgQiRHwEDAQODcgWHdA X-IronPort-AV: E=Sophos;i="4.40,267,1238968800"; d="scan'208";a="39233504" Received: from mail-fx0-f169.google.com ([209.85.220.169]) by mail4-smtp-sop.national.inria.fr with ESMTP; 29 Apr 2009 21:38:54 +0200 Received: by fxm17 with SMTP id 17so1763849fxm.27 for ; Wed, 29 Apr 2009 12:38:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=dePs6DzuOJ6XBxVIxmPf+QKerRhItckQKpWV9/3kUVc=; b=MJMHZq/FAh+d9FmSIAcumBf4nKlv3m9gzaQDBNZn7km69Id/m9RfZokTkD2ZNIL+Qm wAXlCBJy1NpyLxNpFOna6A/SjHt0tQzgZBsIybml9aNbpR0Qn9GT9OsSs/NVfoAD5laf J1892OLt6raLuXenPlb9Cbm9DeA2LtFFgs0t0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=QFRBU2jZDLRCCZNy/gtEC9VucI+w6QD/ZVXDDOPrf5ffycrkJD+ajJc0nnMQv27PGO FZJWKCI93lvuxC9WYfNmpt2NpupHTyeYg3N1Gh51lx5/3O2QyfE2Ezew23KyvH5hXH3a KnxkIsXmG3bB5K8sT9SogtIOPqcDqxGLGmaAc= MIME-Version: 1.0 Received: by 10.86.49.13 with SMTP id w13mr1042843fgw.38.1241033933661; Wed, 29 Apr 2009 12:38:53 -0700 (PDT) In-Reply-To: <25CD1992-3F33-4AEA-97A3-EC54729EC4FE@cs.berkeley.edu> References: <324B24CA-9671-42C0-B722-C7710C0C45C7@cs.berkeley.edu> <59BD1B3A-B449-4963-9910-ED5E755D00E6@cs.berkeley.edu> <49F7F135.5080504@gmail.com> <08C6A0FC-90F2-43A4-AB78-4A3B68291FAA@cs.berkeley.edu> <49F7F59B.7070204@frisch.fr> <6D9C5A68-1874-4BBC-AE3D-9CCC3614AF7C@cs.berkeley.edu> <0FDD87FB-2461-4BBC-BDB1-5316A5AE4D23@inria.fr> <25CD1992-3F33-4AEA-97A3-EC54729EC4FE@cs.berkeley.edu> Date: Wed, 29 Apr 2009 15:38:53 -0400 Message-ID: Subject: Re: [Caml-list] Strange performance bug From: Markus Mottl To: Brighten Godfrey Cc: Damien Doligez , OCaml List Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Spam: no; 0.00; bug:01 markus:01 mottl:01 markus:01 mottl:01 pcre:01 ocaml-values:01 pcre:01 regexps:01 deallocated:01 finalizer:01 ocaml-value:01 ocaml-heap:01 pcre-library:01 ocaml-heap:01 On Wed, Apr 29, 2009 at 15:19, Brighten Godfrey wrote= : > You seem to have solved my problem. =A0But out of curiosity: =A0Do you kn= ow how > the GC settings in the Str library differ from PCRE's, and why? =A0Why do= es > PCRE need to explicitly trigger GCs at all? Str regular expressions are pure OCaml-values, whereas PCRE regexps are allocated on the C-heap. Since C-heap values need to be deallocated explicitly, a finalizer has to be installed that does the job if the OCaml-value is not reachable anymore. The OCaml-runtime needs hints about how large these C-values are. Otherwise it may not scan the OCaml-heap aggressively enough, leaving a lot of those C-values floating around and eating up your memory. Right now the PCRE-library guarantees that not more than 500 regular expressions will float around at any one time. This can lead to considerable slowdowns if your OCaml-heap is large and you allocate regular expressions at very high rates. Regards, Markus --=20 Markus Mottl http://www.ocaml.info markus.mottl@gmail.com