From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=AWL,SPF_FAIL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id BE6B9BC37 for ; Wed, 30 Sep 2009 12:16:48 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ao8DAEHNwkpQRFuwWWdsb2JhbACadQEWFb1LhCcE X-IronPort-AV: E=Sophos;i="4.44,480,1249250400"; d="scan'208";a="35300098" Received: from furbychan.cocan.org ([80.68.91.176]) by mail3-smtp-sop.national.inria.fr with ESMTP; 30 Sep 2009 12:16:26 +0200 Received: from rich by furbychan.cocan.org with local (Exim 4.63) (envelope-from ) id 1MswEQ-0007EN-BD; Wed, 30 Sep 2009 11:16:22 +0100 Date: Wed, 30 Sep 2009 11:16:22 +0100 To: Mikkel =?iso-8859-1?Q?Fahn=F8e_J=F8rgensen?= Cc: Till Varoquaux , Yaron Minsky , "caml-list@inria.fr" Subject: Re: [Caml-list] xpath or alternatives Message-ID: <20090930101622.GA15517@annexia.org> References: <20090928121745.GA19803@annexia.org> <9d3ec8300909280806me88dccbm2a6e4d98c9de191@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.13 (2006-08-11) From: Richard Jones X-Spam: no; 0.00; 0200,:01 mikkel:01 rgensen:01 yaron:01 combinator:01 parser:01 parser:01 syntax:01 abstraction:01 one-liner:01 ocaml:01 ocaml:01 2009:98 git:98 git:98 On Wed, Sep 30, 2009 at 01:00:15AM +0200, Mikkel Fahnøe Jørgensen wrote: > In line with what Yaron suggests, you can use a combinator parser. > > I do this to parse json, and this parser could be adapted to xml by > focusing on basic syntax and ignoring the details, or you could > prefilter xml and use the json parser directly. > > See the Fleece parser embedded here: > > There is also the object abstraction that dives into an object > hierarchy after parsing, see the Objects module. The combination of > these two makes it quite easy to work on structured data, but 3 lines > only come after some xml adaptation work - but you can see many > one-liner json access in the last part of the file. > > http://git.dvide.com/pub/symbiosis/tree/myocamlbuild_config.ml > > Otherwise there is xmlm which is self-contained in single xml file, > and as I recall, has some sort of zipper navigator. (I initially > intended to use it before deciding on the json format): > > http://erratique.ch/software/xmlm It's interesting you mention xmlm, because I couldn't write the code using xmlm at all. The discussion here has got quite theoretical, but it's not helping me to write the original 3 lines of Perl in OCaml. my $p = XML::XPath->new (xml => $xml); my @disks = $p->findnodes ('//devices/disk/source/@dev'); push (@disks, $p->findnodes ('//devices/disk/source/@file')); My best effort, using xml-light, is around 40 lines: http://git.et.redhat.com/?p=libguestfs.git;a=blob;f=ocaml/examples/viewer.ml;h=ef6627b1b92a4fff7d4fa1fa4aca63eeffc05ece;hb=HEAD#l322 Rich. -- Richard Jones Red Hat