From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=AWL,SPF_FAIL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id 7E2E5BC37 for ; Wed, 30 Sep 2009 17:09:19 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ao8DAHARw0pQRFuwWWdsb2JhbACadQEUF74ShCcEgU0 X-IronPort-AV: E=Sophos;i="4.44,480,1249250400"; d="scan'208";a="47609935" Received: from furbychan.cocan.org ([80.68.91.176]) by mail4-smtp-sop.national.inria.fr with ESMTP; 30 Sep 2009 17:09:19 +0200 Received: from rich by furbychan.cocan.org with local (Exim 4.63) (envelope-from ) id 1Mt0nu-0003O7-BT; Wed, 30 Sep 2009 16:09:18 +0100 Date: Wed, 30 Sep 2009 16:09:18 +0100 To: Alain Frisch Cc: caml-list@inria.fr Subject: Re: [Caml-list] xpath or alternatives Message-ID: <20090930150918.GC5126@annexia.org> References: <20090930101622.GA15517@annexia.org> <245077.49646.qm@web111516.mail.gq1.yahoo.com> <20090930115723.GL1104@annexia.org> <20090930125904.GA5126@annexia.org> <9d3ec8300909300633t51db9ed9jf3ad15e6b0551c1e@mail.gmail.com> <20090930140107.GB5126@annexia.org> <4AC37055.70503@frisch.fr> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4AC37055.70503@frisch.fr> User-Agent: Mutt/1.5.13 (2006-08-11) From: Richard Jones X-Spam: no; 0.00; 0200,:01 frisch:01 interfacing:01 cduce:01 syntax:01 ocamlduce:01 subtyping:01 subtyping:01 interfacing:01 frisch:01 pxp:01 ocamlduce:01 2009:98 expat:98 char:01 On Wed, Sep 30, 2009 at 04:51:01PM +0200, Alain Frisch wrote: > Richard Jones wrote: > > let devs = {{ map [xml] with > > | [[[_]]] > > | [[[_]]] -> > > [s] > > | _ -> [] }} in > > The following should work: > > let l = {{ [xml] }} in > let l = {{ map l with l -> l | _ -> [] }} in > let l = {{ map l with l -> l | _ -> [] }} in > let l = {{ map l with l -> l | _ -> [] }} in > let l = {{ map l with _ > | _-> s > | _ -> [] }} in > ... > > let () = > let l = {{ [xml] }} in > let l = {{ (((l.(_)) / .(_)) / .(_)) / }} in > let l = {{ map l with _ > | _ -> s > | _ -> [] }} in > .. Thanks Alain. My latest attempt was similar to your version 1 above, and it works :-) Now my code looks like your version 2: let xml = from_string xml in let xs = {{ [xml] }} in let xs = {{ (((xs.(_)) / .(_)) / .(_)) / }} in let xs = {{ map xs with | _ | _ -> [s] | _ -> [] }} in {: xs :} (plus the boilerplate for interfacing xml-light and CDuce). We're getting close to the xpath/perl solution (8 lines vs 3 lines), with some added type safety and the possibility of validating the XML. On the other hand, the code is hard to understand. It's not clear to me what the .( ) syntax means, nor why there is an apparently trailing / character. > This uses the constructions e/ and e.(t) as described in the manual. > > That said, using OCamlDuce for this kind of XML data-extraction seems > just crazy to me. I have some comments: (A) "Subtyping failed" is a very common error, but is only mentioned briefly in the manual. I have no idea what these errors mean, so they should have more explanation. Here is a simple one which was caused by me using a value instead of a list (but that is not at all obvious from the error message): Error: Subtyping failed Latin1 <= [ Latin1* ] Sample: [ Latin1Char ] (B) I think the interfacing code here: http://yquem.inria.fr/~frisch/ocamlcduce/samples/expat/ http://yquem.inria.fr/~frisch/ocamlcduce/samples/pxp/ http://yquem.inria.fr/~frisch/ocamlcduce/samples/xmllight/ should be distributed along with ocamlduce. Rich. -- Richard Jones Red Hat