caml-list - the Caml user's mailing list
* xmlm and names(paces)
@ 2008-02-07  8:13 oleg
  2008-02-07  8:59 ` [Caml-list] " Bünzli Daniel
From: oleg @ 2008-02-07  8:13 UTC
  To: caml-list


Buenzli Daniel wrote:
> As I previously said on this list I'm adding better namespace support to
> xmlm. Up to now xmlm just parsed qualified names into their prefix and
> local part (prefix, local). Now I'd like to provide the client with
> expanded names (uri, local).
>
> Initially I planned to give the client choice between getting qualified
> names or expanded names. However the prefix of qualified names is really
> meaningless (it can be alpha converted) and thus cannot be used to
> recognize anything in a document. One of the aims of xmlm is simplicity,
> as such I think xmlm should only provide expanded names.

It should be mentioned that the prefixes of qualified names cannot
just be alpha-converted. It is quite common to see the following,
quoted from http://www.w3.org/TR/xmlschema-0/

	<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
         ...
	  <xsd:element name="comment" type="xsd:string"/>

One can plainly see that the prefix, xsd, appears inside a quoted
string! If one wishes to rename the prefix xsd to just 's', one has
to look inside quoted strings. Of course, not every occurrence of xsd
inside a quoted string is the prefix: the content of an attribute may
just as well be an opaque string.
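
To make the problem concrete, here is a minimal OCaml sketch of what a
client has to do with a value such as type="xsd:string": split it as a
qualified name and look the prefix up in the bindings that were in scope
on the element. The names bindings, split_qname and resolve_value are
hypothetical helpers for illustration, not part of xmlm or of any XML
Schema library, and the sketch assumes the client already knows that the
attribute is meant to hold a qualified name.

```ocaml
(* Hypothetical representation: in-scope namespace bindings as an
   association list from prefix to namespace URI. *)
type bindings = (string * string) list

(* Split "xsd:string" into (Some "xsd", "string");
   a value without a colon gives (None, value). *)
let split_qname (s : string) : string option * string =
  match String.index_opt s ':' with
  | None -> (None, s)
  | Some i ->
      (Some (String.sub s 0 i),
       String.sub s (i + 1) (String.length s - i - 1))

(* Resolve a qualified name found in attribute content to (uri, local).
   Returns None when there is no prefix or the prefix is unbound; in
   that case the value is treated as an opaque string. *)
let resolve_value (env : bindings) (v : string) : (string * string) option =
  match split_qname v with
  | (None, _) -> None
  | (Some prefix, local) ->
      (match List.assoc_opt prefix env with
       | Some uri -> Some (uri, local)
       | None -> None)
```

This only works if the prefix bindings are still available when the
attribute value is examined, which is exactly the information that is
lost once names are expanded and prefixes are dropped.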

One may really wonder what kind of people wrote all those voluminous
XML recommendations. 

So, ideally one may wish to keep the original prefix (in addition to
its corresponding URL). It is also reasonable for a user to specify a
`shortcut'. Unlike the prefix, which is chosen by the author of the
document, a shortcut is chosen by the person who invokes a parser. In
the SSAX parser, the user specifies the association of URI with
shortcuts. The parser, having resolved the QName prefix to a URI, maps
that URI to the user-specified shortcut, if present. The shortcuts are
extensively discussed in
	http://okmij.org/ftp/Scheme/SXML.html#Namespaces
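
The shortcut step can be sketched in OCaml as follows. This is not
SSAX's code (SSAX is written in Scheme); apply_shortcut and the
association-list representation are hypothetical, and falling back to
the URI itself when no shortcut is registered is an assumption of this
sketch.

```ocaml
(* Hypothetical: user-chosen shortcuts as an association list from
   namespace URI to shortcut. *)
let apply_shortcut (shortcuts : (string * string) list)
    ((uri, local) : string * string) : string * string =
  (* Assumption: keep the URI when the user registered no shortcut. *)
  let ns =
    match List.assoc_opt uri shortcuts with
    | Some shortcut -> shortcut
    | None -> uri
  in
  (ns, local)
```

With the shortcut list [("http://www.w3.org/2001/XMLSchema", "s")], the
expanded name ("http://www.w3.org/2001/XMLSchema", "element") becomes
("s", "element") whatever prefix the document author used.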

Incidentally, some of the design decisions of SSAX (despite being
produced by an enemy) might be pertinent to this discussion. SSAX is
actually a SAX parser, or a big macro that builds a parser out of
user-provided callbacks and reasonable defaults. One can use SSAX to
parse XML on the fly or to convert XML to anything one chooses. There
is also an instantiation of SSAX with reasonable callbacks that make
SSAX a DOM parser, converting XML into one particular output format,
SXML. Experience shows that this particular instantiation satisfies
most of the users. Still I have come across several users who needed
the full SSAX (e.g., for streaming conversion of XML into something
else).
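
The idea of a parser assembled from user callbacks, with a tree builder
as just one choice of callbacks, can be sketched in OCaml. This is a
much simplified illustration and not SSAX's interface: the event type,
the handlers record and all function names are hypothetical, and the
"parser" here merely folds over an already tokenized event list.

```ocaml
(* Hypothetical, simplified event stream. *)
type event = Start of string | Data of string | End

(* User-provided callbacks threading a seed through the parse. *)
type 'seed handlers = {
  on_start : string -> 'seed -> 'seed;
  on_data  : string -> 'seed -> 'seed;
  on_end   : 'seed -> 'seed;
}

(* The generic driver: fold the callbacks over the events. *)
let run (h : 'seed handlers) (seed : 'seed) (events : event list) : 'seed =
  List.fold_left
    (fun seed ev ->
       match ev with
       | Start tag -> h.on_start tag seed
       | Data s -> h.on_data s seed
       | End -> h.on_end seed)
    seed events

(* Streaming use: count elements without building any tree. *)
let count_elements (events : event list) : int =
  run { on_start = (fun _ n -> n + 1);
        on_data = (fun _ n -> n);
        on_end = (fun n -> n) }
    0 events

(* Tree-building use: the seed is a stack of open elements, each with
   its children accumulated in reverse order. *)
type tree = Elem of string * tree list | Txt of string

let tree_handlers : (string * tree list) list handlers = {
  on_start = (fun tag stack -> (tag, []) :: stack);
  on_data = (fun s stack ->
    match stack with
    | (tag, kids) :: rest -> (tag, Txt s :: kids) :: rest
    | [] -> stack);
  on_end = (fun stack ->
    match stack with
    | (tag, kids) :: (ptag, pkids) :: rest ->
        (ptag, Elem (tag, List.rev kids) :: pkids) :: rest
    | _ -> stack);
}

(* Start from a sentinel frame that collects the root element. *)
let to_tree (events : event list) : tree option =
  match run tree_handlers [("", [])] events with
  | [(_, [t])] -> Some t
  | _ -> None
```

The same driver serves both uses; only the callbacks and the seed
change.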



* Re: [Caml-list] xmlm and names(paces)
  2008-02-07  8:13 xmlm and names(paces) oleg
@ 2008-02-07  8:59 ` Bünzli Daniel
From: Bünzli Daniel @ 2008-02-07  8:59 UTC
  To: caml-list List


On 7 Feb 08, at 09:13, oleg@okmij.org wrote:

> It should be mentioned that the prefixes of qualified names cannot
> just be alpha-converted. It is quite common to see the following,
> quoted from http://www.w3.org/TR/xmlschema-0/
>
> 	<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
>         ...
> 	  <xsd:element name="comment" type="xsd:string"/>

Argh! I now understand Alain's comment about the need to keep that
information. This is a complete misuse of the namespace recommendation,
which says nothing about binding prefixes in attribute and character
data. The W3C is really hopeless.

> One may really wonder what kind of people wrote all those voluminous
> XML recommendations.

You tell me.

> So, ideally one may wish to keep the original prefix (in addition to
> its corresponding URL). It is also reasonable for a user to specify a
> `shortcut'. Unlike the prefix, which is chosen by the author of the
> document, a shortcut is chosen by the person who invokes a parser.

As mentioned in my previous email, with xmlm you can do that
yourself, since all the information is there and you have full control
over the parsing result. However, for the aforementioned case this may
mean a lot of work. On the other hand, XML Schema seems to be regarded
as a broken technology (even the XML spec editor says so, iirc). So the
question is: is it worth complicating the interface to ease the parsing
of this obviously broken and marginal (is it?) case?

> Incidentally, some of the design decisions of SSAX (despite being
> produced by an enemy) might be pertinent to this discussion.

Thanks for the link, I will have a look at it (functional programming  
languages are no enemies).

Best,

Daniel



* xmlm and names(paces)
@ 2008-02-06 20:44 Bünzli Daniel
From: Bünzli Daniel @ 2008-02-06 20:44 UTC
  To: caml-list List

Hello,

As I previously said on this list I'm adding better namespace support  
to xmlm. Up to now xmlm just parsed qualified names into their prefix  
and local part (prefix, local). Now I'd like to provide the client  
with expanded names (uri, local).
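
As an illustration of the step from (prefix, local) to (uri, local),
here is a minimal OCaml sketch of name expansion for element names. It
is not xmlm's interface: the types and the functions extend and
expand_element are hypothetical, the environment is a plain association
list, and attribute names (to which the default namespace does not
apply) are left out.

```ocaml
(* Hypothetical types: "" stands for "no prefix" and "no namespace". *)
type qname = string * string   (* (prefix, local) *)
type ename = string * string   (* (uri, local) *)

(* Extend the environment with the xmlns declarations found among an
   element's attributes, given as ((prefix, local), value) pairs. *)
let extend (env : (string * string) list)
    (atts : (qname * string) list) : (string * string) list =
  List.fold_left
    (fun env ((prefix, local), value) ->
       if prefix = "xmlns" then (local, value) :: env        (* xmlns:p="uri" *)
       else if prefix = "" && local = "xmlns" then ("", value) :: env
                                                             (* xmlns="uri" *)
       else env)
    env atts

(* Expand an element name. An unprefixed name with no default namespace
   declared has no namespace; an unbound prefix is an error (None). *)
let expand_element (env : (string * string) list)
    ((prefix, local) : qname) : ename option =
  match List.assoc_opt prefix env with
  | Some uri -> Some (uri, local)
  | None -> if prefix = "" then Some ("", local) else None
```

Two documents that bind the same URI to different prefixes yield the
same expanded names, which is why the expanded name is what a client
should match on.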

Initially I planned to give the client choice between getting  
qualified names or expanded names. However the prefix of qualified  
names is really meaningless (it can be alpha converted) and thus  
cannot be used to recognize anything in a document. One of the aims of
xmlm is simplicity; as such, I think xmlm should only provide expanded
names.

However, maybe I'm missing something, so I'd like to ask the list
whether anyone thinks there is any use for clients to get qualified
names. If you do, please tell me.

Best,

Daniel

P.S. There is no distinction between qualified and expanded names if
you parse documents that have no prefixes and no default namespace  
declarations.

P.P.S. Name expansion has a performance cost but if I support only  
expanded names I can better reduce it.


