caml-list - the Caml user's mailing list
From: "Bünzli Daniel" <daniel.buenzli@erratique.ch>
To: caml-list caml-list <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] [OSR] Suggested topic - XML processing API
Date: Tue, 5 Feb 2008 09:36:02 +0100
Message-ID: <36788A32-5BAC-4D57-9D0A-B9A20A49536F@erratique.ch>
In-Reply-To: <47A7EDFE.1050805@frisch.fr>

On 5 Feb 2008, at 06:02, Alain Frisch wrote:

> As suggested before, you really need to say something, at least,  
> about:
[...]
- Whether character references and predefined entity references must
be resolved. Hint: yes.
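
To make "resolved" concrete, here is a minimal sketch of what a parser
would do with the text found between '&' and ';'. The names resolve_ref,
utf8_of_code and all_chars are hypothetical and not taken from xmlm or
any other library. The sketch assumes UTF-8 output and OCaml >= 4.08; it
only rejects code points that are not Unicode scalar values and does not
check the stricter XML Char production.

```ocaml
(* Hypothetical helpers for illustration only. *)

(* [true] iff [s] is non-empty and every character satisfies [p]. *)
let all_chars p s =
  let rec go i = i >= String.length s || (p s.[i] && go (i + 1)) in
  s <> "" && go 0

let is_dec c = '0' <= c && c <= '9'
let is_hex c = is_dec c || ('a' <= c && c <= 'f') || ('A' <= c && c <= 'F')

(* UTF-8 encoding of code point [u], or [None] if [u] is not a Unicode
   scalar value. *)
let utf8_of_code u =
  if not (Uchar.is_valid u) then None
  else begin
    let b = Buffer.create 4 in
    Buffer.add_utf_8_uchar b (Uchar.of_int u);
    Some (Buffer.contents b)
  end

(* [resolve_ref name] resolves the reference whose text between '&' and
   ';' is [name]: one of the five predefined entities, or a decimal
   ("#65") or hexadecimal ("#x41") character reference. Anything else,
   e.g. an entity declared in a DTD, yields [None]. *)
let resolve_ref name =
  match name with
  | "lt" -> Some "<"
  | "gt" -> Some ">"
  | "amp" -> Some "&"
  | "apos" -> Some "'"
  | "quot" -> Some "\""
  | _ ->
    let n = String.length name in
    if n >= 3 && name.[0] = '#' && name.[1] = 'x' then
      let d = String.sub name 2 (n - 2) in
      if all_chars is_hex d
      then Option.bind (int_of_string_opt ("0x" ^ d)) utf8_of_code
      else None
    else if n >= 2 && name.[0] = '#' then
      let d = String.sub name 1 (n - 1) in
      if all_chars is_dec d
      then Option.bind (int_of_string_opt d) utf8_of_code
      else None
    else None
```

With this, resolve_ref "amp" and resolve_ref "#x41" both return a
replacement string, while resolve_ref "nbsp" returns None and leaves the
decision to the caller.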


On 5 Feb 2008, at 06:02, Alain Frisch wrote:

> - having a common spec for several libs makes more sense if they can  
> share common types; maybe you should use polymorphic variants  
> instead of regular ones?

Agreed. In xmlm these variants become polymorphic in the next version.
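
A small sketch of why polymorphic variants help here: two libraries can
each declare the same tree type without depending on a common module,
and values still flow from one to the other because the types are
structural. The modules Lib_a and Lib_b and the tags `Element and
`Pcdata are made up for the example; they are not the types of xmlm or
of the proposal.

```ocaml
(* Two hypothetical libraries. Neither refers to the other, nor to a
   shared module defining the tree type. *)
module Lib_a = struct
  type tree =
    [ `Element of string * (string * string) list * tree list
    | `Pcdata of string ]

  (* Stand-in for a parser result. *)
  let demo () : tree =
    `Element ("p", [], [ `Pcdata "a"; `Element ("b", [], [ `Pcdata "b" ]) ])
end

module Lib_b = struct
  type tree =
    [ `Element of string * (string * string) list * tree list
    | `Pcdata of string ]

  (* Concatenate all character data found in a tree. *)
  let rec text : tree -> string = function
    | `Pcdata s -> s
    | `Element (_, _, children) -> String.concat "" (List.map text children)
end

(* Lib_a.tree and Lib_b.tree are the same type, so this type-checks
   without any conversion function. *)
let s = Lib_b.text (Lib_a.demo ())
```

With regular variants the two declarations would be distinct types, and
one library would have to depend on the other or on a third package
holding the definition.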

Other comments.

* IMHO, do not use camel casing. Underscores are more Caml-like, e.g.
xml_node.
* Regarding naming, I would call xmlNode xml_tree and, in general, drop
the xml prefix from the cases.
* Regarding the "combine" argument: in my opinion the parser should
always combine adjacent pcdata nodes.
* As others may know by now, I don't like raising exceptions; the next
version of xmlm doesn't raise any (but given recent discussions it
seems others do like exceptions).
* I don't like the way the parser is invoked:

(1) The function "parse" can only be used with channels, which is too
restrictive. (2) A convenience function like parse_file is always
useless to me, since it is hard to know exactly what kind of error
handling such functions perform without looking at their source.
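
On the "combine" point in the list above, this is the normalisation I
have in mind, sketched as a post-processing pass over a tree. The type
tree and the function merge_pcdata are hypothetical; a real parser would
more likely merge while it builds the nodes, and would use a Buffer
rather than repeated string concatenation.

```ocaml
(* Hypothetical, simplified tree type (attributes omitted). *)
type tree = [ `Element of string * tree list | `Pcdata of string ]

(* Merge every run of adjacent `Pcdata nodes into a single node, at all
   depths of the tree. *)
let rec merge_pcdata : tree list -> tree list = function
  | `Pcdata a :: `Pcdata b :: rest ->
    (* Re-examine the merged node: the run may be longer than two. *)
    merge_pcdata (`Pcdata (a ^ b) :: rest)
  | `Element (name, children) :: rest ->
    `Element (name, merge_pcdata children) :: merge_pcdata rest
  | (`Pcdata _ as d) :: rest -> d :: merge_pcdata rest
  | [] -> []
```

If the parser always does this, clients never have to handle the case of
two consecutive character data nodes, and the "combine" argument is not
needed.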

The way I do this kind of thing is to define an input abstraction
type. First you create an input abstraction from a data source
(e.g. an in_channel, a string, or a callback) and then you invoke
the parser with the input abstraction (actually I started an OSR on
devising IO modules with non-object-oriented IO sources and
destinations reflecting this view, but I'm reluctant to publish it).
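
A minimal sketch of such an input abstraction, assuming the simplest
possible representation: a function that returns the next byte, or None
at the end of input. All the names (input, input_of_string,
input_of_channel, input_of_fun, count_lt) are hypothetical, and count_lt
merely stands in for a parser.

```ocaml
(* Hypothetical input abstraction: each call yields the next byte, or
   [None] once the source is exhausted. *)
type input = unit -> char option

let input_of_string (s : string) : input =
  let pos = ref 0 in
  fun () ->
    if !pos >= String.length s then None
    else begin
      let c = s.[!pos] in
      incr pos;
      Some c
    end

(* The channel is neither opened nor closed here; that stays under the
   control of the client. *)
let input_of_channel (ic : in_channel) : input =
  fun () -> try Some (input_char ic) with End_of_file -> None

(* A callback source is already of the right shape. *)
let input_of_fun (f : unit -> char option) : input = f

(* Stand-in for a parser: it only sees the abstraction and never knows
   where the bytes come from. Here it just counts '<' characters. *)
let count_lt (i : input) : int =
  let rec loop n =
    match i () with
    | None -> n
    | Some '<' -> loop (n + 1)
    | Some _ -> loop n
  in
  loop 0
```

For example, count_lt (input_of_string "<a><b/></a>") and count_lt
(input_of_channel stdin) go through exactly the same code path, and
error handling around opening files is left to the client.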

In general I'd like to say that I'm a little bit dubious about this
effort. I would refrain from formalizing the way the parser is invoked;
clients can also do their bit of work. I would concentrate on defining:

1) Parsing _result_ types and a precise definition of the actual
_form_ of the data they contain. More than one form may be defined.
This is the most important thing if you would like to be able to
switch implementations; the actual input procedure can easily be
isolated from the rest of your source.

2) A minimal list of input sources (e.g. in_channel and string) from
which the parser should be able to read, without going into further
detail on how the actual input procedure should be performed. Just
specify the state in which sources are accepted for input and left
after output.

Best,

Daniel
