caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Alain Frisch <alain@frisch.fr>
To: "Bünzli Daniel" <daniel.buenzli@erratique.ch>
Cc: caml-list List <caml-list@inria.fr>
Subject: Re: [Caml-list] xmlm and names(paces)
Date: Wed, 06 Feb 2008 22:52:51 +0100	[thread overview]
Message-ID: <47AA2C33.70902@frisch.fr> (raw)
In-Reply-To: <998006DB-6778-42F1-B2F4-31CB5E9AA6A2@erratique.ch>

Bünzli Daniel wrote:
> As I previously said on this list I'm adding better namespace support to 
> xmlm. Up to now xmlm just parsed qualified names into their prefix and 
> local part (prefix, local). Now I'd like to provide the client with 
> expanded names (uri, local).
> 
> Initially I planned to give the client choice between getting qualified 
> names or expanded names. However the prefix of qualified names is really 
> meaningless (it can be alpha converted) and thus cannot be used to 
> recognize anything in a document. One of the aim of xmlm is simplicity, 
> as such \x13I think xmlm should only provide expanded names.

The problem with expanded names is that it makes it quite tedious to 
pattern-match on element/attribute names (uri are long!). Of course, it 
is a trivial exercise in Camlp4 to create a nice syntax for that.

Another option is to let the client provide a mapping from uri to fixed 
prefixes. (PXP can do that kind of prefix normalization.)

It is also a good idea to be able to parse XML documents that conform
to the XML spec but not the XML Namespaces spec.

What about something like that:

type name = string * [`N of string * string|`U of string * string|`X]

The first component of name gives the full unparsed name from the XML 
document. The second component gives the qname decomposition; it can be 
either a known normalized prefix (relative to a dictionnary provided by 
the client) or an unknown URI. Or it can be an error (the document does 
not conform to the XML Namespaces spec). If the client does not provide
any dictionnary of known prefixes, there will be no `N node. If the 
parser is run a non-namespace mode, there will be only `X nodes.

examples:
   ("html:p", `N ("xhtml", "p"))
         the prefix html refers to the known xhtml namespace

   ("foo:x", `U ("http://unknownnamespaceuri", "x"))
         the prefix foo refers to a unknown uri

   ("x:y", `X)
         the prefix x is not bound to any namespace

   ("x::z", `X)
         name is ill-formed w.r.t. the XML Namespaces spec.

Also, it is necessary to give the client a way to know the namespace 
bindings in scope at any node. Some XML languages like XML-Schema need 
this information. A possible way to do it is just to keep the xmlns 
declarations as regular attributes.

As a minor alternative, in order to reduce the syntactic overhead, it is 
possible to use a single string to encode the three possible cases. E.g.:

   "p:xhtml"  Known uri
   "x::http://unknownnamespaceuri" Unknown uri
   "" Ill-formed qname

-- Alain


  parent reply	other threads:[~2008-02-06 21:58 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-06 20:44 Bünzli Daniel
2008-02-06 20:59 ` [Caml-list] " David Teller
2008-02-06 21:26   ` Bünzli Daniel
2008-02-06 21:52 ` Alain Frisch [this message]
2008-02-06 22:56   ` Bünzli Daniel
2008-02-06 23:51     ` Bünzli Daniel
2008-02-07 22:03     ` Alain Frisch
2008-02-07  8:13 oleg
2008-02-07  8:59 ` [Caml-list] " Bünzli Daniel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47AA2C33.70902@frisch.fr \
    --to=alain@frisch.fr \
    --cc=caml-list@inria.fr \
    --cc=daniel.buenzli@erratique.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).