caml-list - the Caml user's mailing list
From: "Mikkel Fahnøe Jørgensen" <mikkel@dvide.com>
To: Richard Jones <rich@annexia.org>
Cc: Till Varoquaux <till@pps.jussieu.fr>,
	Yaron Minsky <yminsky@gmail.com>,
	"caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: [Caml-list] xpath or alternatives
Date: Wed, 30 Sep 2009 12:49:12 +0200
Message-ID: <caee5ad80909300349r103957ffs69a33949c4ae265e@mail.gmail.com>
In-Reply-To: <20090930101622.GA15517@annexia.org>

2009/9/30 Richard Jones <rich@annexia.org>:
> On Wed, Sep 30, 2009 at 01:00:15AM +0200, Mikkel Fahnøe Jørgensen wrote:
>> In line with what Yaron suggests, you can use a combinator parser.
> It's interesting you mention xmlm, because I couldn't write
> the code using xmlm at all.

If you can convert an XML document into a JSON-like tagged tree
structure, then a simple solution like

  module Value = struct
    type value_type =
      Object of (string * value_type) list
    | Array of value_type list
    | String of string
    | Int of int
    | Float of float
    | Bool of bool
    | Null
  end

  ..
  (* fail and the Found exception are defined in the elided code *)
  let get_object v = match v with Object x -> x
    | _ -> fail "json object expected"
  ..
  (* follow names down from value; "*" matches any key *)
  let pattern_path value names =
    let rec again value = function
      | "*" :: names  -> List.iter (fun (_, v) -> try again v names
          with Invalid_argument _ | Not_found -> ()) (get_object value)
      | name :: names -> again (List.assoc name (get_object value)) names
      | [] -> raise (Found value)
    in try again value names; raise Not_found with Found value -> value

combined with a path split function

  (* split s on the separator c, scanning from the right *)
  let split c s =
    let n = String.length s in
    let rec again i lst =
      begin try let k = String.rindex_from s i c in
        again (k - 1) ((if i = k then "" else (String.sub s (k + 1) (i - k))) :: lst)
        with Not_found -> (String.sub s 0 (i + 1)) :: lst
      end
    in again (n - 1) []

will do almost exactly what you are asking for. Note that "*" matches
any key at that level, so the search tries every subtree below it in
turn. You can add your own XPath-like functions as you discover a need
for them.
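
To see the two pieces working together, here is a minimal self-contained
sketch. The definitions of Found and fail are not shown in the excerpts
above, so the ones below are assumptions (fail is assumed to raise
Invalid_argument, since that is what the "*" branch catches). The find
helper and the sample document are hypothetical, for illustration only.

```ocaml
(* JSON-like tagged tree, as in the message above. *)
type value_type =
  | Object of (string * value_type) list
  | Array of value_type list
  | String of string
  | Int of int
  | Float of float
  | Bool of bool
  | Null

(* Assumption: carries the first match out of the search. *)
exception Found of value_type

(* Assumption: fail raises Invalid_argument, which the "*" branch catches. *)
let fail msg = invalid_arg msg

let get_object = function
  | Object fields -> fields
  | _ -> fail "json object expected"

(* Return the first value reached by following names; "*" matches any key. *)
let pattern_path value names =
  let rec again value = function
    | "*" :: names ->
      List.iter
        (fun (_, v) ->
           try again v names with Invalid_argument _ | Not_found -> ())
        (get_object value)
    | name :: names -> again (List.assoc name (get_object value)) names
    | [] -> raise (Found value)
  in
  try again value names; raise Not_found with Found v -> v

(* Split s on the separator c, scanning from the right. *)
let split c s =
  let rec again i acc =
    match String.rindex_from s i c with
    | k -> again (k - 1) (String.sub s (k + 1) (i - k) :: acc)
    | exception Not_found -> String.sub s 0 (i + 1) :: acc
  in
  again (String.length s - 1) []

(* Hypothetical helper: look up a '/'-separated path. *)
let find doc path = pattern_path doc (split '/' path)

(* Hypothetical sample document. *)
let doc =
  Object
    [ "library",
      Object
        [ "fiction", Object [ "title", String "Dune"; "year", Int 1965 ];
          "science", Object [ "title", String "Cosmos" ] ] ]

let () =
  assert (find doc "library/science/title" = String "Cosmos");
  assert (find doc "library/*/year" = Int 1965);
  assert (find doc "*/*/title" = String "Dune")
```

A path with no match raises Not_found. On OCaml 4.04 or later,
String.split_on_char '/' path should give the same list as split.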

I believe the xmlm examples include a tree transformation that could
easily be adapted to produce a JSON-like tree, with a little
modification.

let out_tree o t =
  let frag = function
  | E (tag, childs) -> `El (tag, childs)
  | D d -> `Data d
  in
  Xmlm.output_doc_tree frag o t
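
One possible shape for that adaptation is sketched below. It is not
taken from xmlm: the tree type is a simplified stand-in for the E/D
tree used above (plain string tags, no namespaces or attributes), and
to_value and of_root are hypothetical names. Repeated sibling tags
simply become repeated keys, so a List.assoc lookup would only find the
first of them.

```ocaml
(* Same JSON-like type as in the message, repeated so the sketch stands alone. *)
type value_type =
  | Object of (string * value_type) list
  | Array of value_type list
  | String of string
  | Int of int
  | Float of float
  | Bool of bool
  | Null

(* Assumption: simplified stand-in for the E/D tree of the xmlm examples,
   with a plain string tag instead of a qualified name plus attributes. *)
type tree = E of string * tree list | D of string

(* Hypothetical conversion: an element with child elements becomes an
   Object keyed by child tag; an element holding only text becomes a String. *)
let rec to_value = function
  | D text -> String text
  | E (_, []) -> Null
  | E (_, [ D text ]) -> String text
  | E (_, children) ->
    let add child fields =
      match child with
      | E (tag, _) -> (tag, to_value child) :: fields
      | D _ -> fields (* mixed-content text is dropped in this sketch *)
    in
    Object (List.fold_right add children [])

(* Keep the root tag as the single key of the top-level object. *)
let of_root root =
  match root with
  | E (tag, _) -> Object [ (tag, to_value root) ]
  | D _ -> to_value root

let xml =
  E ("library",
     [ E ("book", [ E ("title", [ D "Dune" ]); E ("year", [ D "1965" ]) ]) ])

let () =
  assert (of_root xml
          = Object
              [ "library",
                Object
                  [ "book",
                    Object [ "title", String "Dune"; "year", String "1965" ] ] ])
```

All leaf text stays a String here; turning "1965" into an Int would be
a separate decision.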


> My best effort, using xml-light, is around 40 lines:

If you spend those 40 lines on a layer on top of a lightweight xml
parser, you might get away with 3 lines the next time.

