Date: Fri, 16 Jan 2004 20:20:09 +0100
From: Markus Mottl
To: Yutaka OIWA
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] ANNOUNCE: mod_caml 1.0.6 - includes security patch
Message-ID: <20040116192009.GC26828@fichte.ai.univie.ac.at>

On Sat, 17 Jan 2004, Yutaka OIWA wrote:
> The code generated by current Regexp/OCaml is something similar to the
> above, (however, pattern compilations are performed only once per
> execution per each pattern.) but if the backend regexp engine
> (currently Regexp/OCaml uses PCRE/OCaml) supports optimization for
> multiple regular expression matching, Regexp/OCaml can easily
> utilize it. Analysis for patterns may be performed at compilation
> (camlp4-translation) phase, if required.

As mentioned in a previous post, this could be done using the callout
features of PCRE-OCaml. The only problem: the string to be matched is
internally copied to the C-heap (once), because the OCaml-GC could
theoretically move the string to another memory location in the
OCaml-heap during callouts. Thus, it may not be as efficient as you
expect, and may only pay off if the patterns match the same, long
string prefixes.

Unfortunately, there is no workaround for this: you'd either have to
rewrite PCRE so that you can return pointers to new string locations
after each callout (no, thanks ;) or somehow be able to temporarily
protect strings from being moved by the GC (not feasible either, I
suppose; it would, however, work with character strings in char
Bigarrays if I am not mistaken).

Regards,
Markus

-- 
Markus Mottl          http://www.oefai.at/~markus          markus@oefai.at
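
To illustrate the remark about char Bigarrays, here is a minimal sketch,
assuming only the standard Bigarray module. The names bigstring,
bigstring_of_string and string_of_bigstring are hypothetical helpers made
up for this example; they are not part of PCRE-OCaml, and the sketch does
not call PCRE at all. It only shows the one-time copy of a subject string
into storage that lives outside the OCaml heap.

```ocaml
(* Sketch only. The element data of a Bigarray is allocated outside the
   OCaml heap, so the GC does not relocate it. All names defined here
   are hypothetical helpers, not part of PCRE-OCaml. *)
open Bigarray

type bigstring = (char, int8_unsigned_elt, c_layout) Array1.t

(* Copy an ordinary OCaml string into a freshly allocated char Bigarray. *)
let bigstring_of_string (s : string) : bigstring =
  let n = String.length s in
  let b = Array1.create char c_layout n in
  for i = 0 to n - 1 do
    Array1.set b i (String.get s i)
  done;
  b

(* Copy the contents back into an ordinary (GC-managed) string. *)
let string_of_bigstring (b : bigstring) : string =
  String.init (Array1.dim b) (fun i -> Array1.get b i)

let () =
  let subject = bigstring_of_string "foobar" in
  (* Force a compaction; the Bigarray contents are unaffected. *)
  Gc.compact ();
  assert (string_of_bigstring subject = "foobar")
```

A binding that accepted such a Bigarray as the subject could, under these
assumptions, hand its data pointer to C across callouts without making a
further copy; whether that beats copying the string once depends on how
the subject is produced in the first place.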