From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id WAA01691; Sun, 4 Aug 2002 22:45:47 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id WAA01570 for ; Sun, 4 Aug 2002 22:45:46 +0200 (MET DST) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.171]) by nez-perce.inria.fr (8.11.1/8.11.1) with ESMTP id g74Kjjf21343 for ; Sun, 4 Aug 2002 22:45:45 +0200 (MET DST) Received: from [212.227.126.160] (helo=mrelayng0.kundenserver.de) by moutng1.kundenserver.de with esmtp (Exim 3.35 #2) id 17bSG1-0003kB-00; Sun, 04 Aug 2002 22:45:45 +0200 Received: from [80.129.96.136] (helo=gate.gerd-stolpmann.de) by mrelayng0.kundenserver.de with asmtp (Exim 3.35 #2) id 17bSG1-0004V8-00; Sun, 04 Aug 2002 22:45:45 +0200 Received: from ice.gerd-stolpmann.de (ice.gerd-stolpmann.de [192.168.0.13]) by gate.gerd-stolpmann.de (Postfix) with ESMTP id 62B15CCDD; Sun, 4 Aug 2002 22:45:33 +0200 (CEST) Received: from ice (localhost [127.0.0.1]) by ice.gerd-stolpmann.de (Postfix) with ESMTP id 778641AD82; Sun, 4 Aug 2002 22:45:32 +0200 (MEST) Date: Sun, 4 Aug 2002 22:45:32 +0200 From: Gerd Stolpmann To: "Alexander V. Voinov" Cc: caml-list@inria.fr Subject: Re: [Caml-list] ocaml-3.05: a performance experience Message-ID: <20020804204532.GA9405@ice.gerd-stolpmann.de> References: <3D49FD72.68388864@quasar.ipa.nw.ru> <20020803123311.GA631@ice.gerd-stolpmann.de> <3D4C965D.775F23DD@quasar.ipa.nw.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <3D4C965D.775F23DD@quasar.ipa.nw.ru>; from avv@quasar.ipa.nw.ru on Sun, Aug 04, 2002 at 04:50:05 +0200 X-Mailer: Balsa 1.3.6 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On 2002.08.04 04:50 Alexander V. Voinov wrote: > Hi Gerd, > > Gerd Stolpmann wrote: > > If XML validation is not needed, you could also rewrite your program > > to use the new event-based parsing in PXP-1.1.90. That would completely > > avoid to represent the XML tree in memory (and increase the speed, because > > GC of large memory footprints is expensive). > > thanks again, but it's not yet officially announced, is it? I managed to > download it, but I didn't find any direct link. Also, it this parsing > mode mentioned in the manual? It is experimental code, but event-based parsing will definitely remain in the parser until the next stable release. Details of the interface may change, however. (I call a release "stable" when the interface has matured, and when all regression tests have passed. The experimental releases usually work, but it is more likely that there is some "overlooked case" in the code.) The manual is not yet updated, there is only a description in the mli file, and a small example. In particular, there is type type event = | E_start_doc of (string * bool * dtd) | E_end_doc | E_start_tag of (string * (string * string) list * Pxp_lexer_types.entity_id) | E_end_tag of (string * Pxp_lexer_types.entity_id) | E_char_data of string | E_pinstr of (string * string) | E_comment of string | E_position of (string * int * int) | E_error of exn | E_end_of_stream and a function is called back for every of these events. For example, for QRS you would get the events E_start_doc("1.0",false,dtd) E_start_tag("A", ["x", "1"], ent_a) E_char_data "Q" E_start_tag("B", [], ent_b) E_char_data "R" E_end_tag("B", ent_b) E_char_data "S" E_end_tag("A", ent_a) It is already checked that the document is well-formed, so for end E_end_tag there is always a matching E_start_tag. Because the parser "pushes" the events to the application, this is a so-called "push parser". There are plans for a "pull parser", too (the application calls a next_event function to get the events), as this would allow to create streams of XML events. Gerd -- ---------------------------------------------------------------------------- Gerd Stolpmann Telefon: +49 6151 997705 (privat) Viktoriastr. 45 64293 Darmstadt EMail: gerd@gerd-stolpmann.de Germany ---------------------------------------------------------------------------- ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners