caml-list - the Caml user's mailing list
* [OSR] Suggested topic - XML processing API
@ 2008-01-30  0:54 Jim Miller
  2008-01-30  2:37 ` [Caml-list] " Bünzli Daniel
  0 siblings, 1 reply; 23+ messages in thread
From: Jim Miller @ 2008-01-30  0:54 UTC (permalink / raw)
  To: caml-list


Inspired by the existing recommendation on the cocan wiki for I/O, I'd like
to recommend the development of a standard interface for XML processing.
Currently there are many different implementations of XML parsers but their
interfaces are very different and don't allow for easy swapping of
implementations.

On a side note, the development of these recommendations seems to be very
much in the vein of an SRFI in the scheme world, which I think is an
excellent model to follow for these recommendations.  (which doesn't include
the actual mechanics of distribution or package management which seems to be
the bulk of OSR discussion).

I'm willing to start the topic on the wiki but I'm simply going to take an
existing XML processor and start from there.



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-01-30  0:54 [OSR] Suggested topic - XML processing API Jim Miller
@ 2008-01-30  2:37 ` Bünzli Daniel
  2008-01-30  3:26   ` Jim Miller
  2008-01-30 15:55   ` Vincent Hanquez
  0 siblings, 2 replies; 23+ messages in thread
From: Bünzli Daniel @ 2008-01-30  2:37 UTC (permalink / raw)
  To: caml-list List


Le 30 janv. 08 à 01:54, Jim Miller a écrit :

> Inspired by the existing recommendation on the cocan wiki for I/O,  
> I'd like to recommend the development of a standard interface for  
> XML processing.  Currently there are many different implementations  
> of XML parsers but their interfaces are very different and don't  
> allow for easy swapping of implementations.

There are many approaches to xml parsing (partial implementations,  
leniency, well-formedness, validity, etc.), to parsing results (tree,  
custom data structure, stream, namespace support etc.) and to  
processing (mainly dependent on the parsing result). Xml processing  
cannot be seen as an abstract datatype with different implementations,  
there are different ways.

As such I'm not sure such an interface is really feasible. Now if you  
see a common pattern or concrete type signatures that could be changed  
to make parsers more compatible do not hesitate to communicate them.  
If it benefits the users of my parser and remains in its philosophy  
I'll happily implement them. But _you_ have to make concrete proposal,  
I'm not going to research this. Please do not just initiate a  
discussion because you like the abstract idea of being able to swap  
xml parser implementations, make proposals.

Best,

Daniel



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-01-30  2:37 ` [Caml-list] " Bünzli Daniel
@ 2008-01-30  3:26   ` Jim Miller
  2008-01-30  7:35     ` Alain Frisch
  2008-01-30 10:35     ` Jon Harrop
  2008-01-30 15:55   ` Vincent Hanquez
  1 sibling, 2 replies; 23+ messages in thread
From: Jim Miller @ 2008-01-30  3:26 UTC (permalink / raw)
  To: caml-list List

> As such I'm not sure such an interface is really feasible. Now if you
> see a common pattern or concrete type signatures that could be changed
> to make parsers more compatible do not hesitate to communicate them.
> If it benefits the users of my parser and remains in its philosophy
> I'll happily implement them. But _you_ have to make concrete proposal,
> I'm not going to research this. Please do not just initiate a
> discussion because you like the abstract idea of being able to swap
> xml parser implementations, make proposals.
>
> Best,
>
>

Fair enough, I'll start with a proposal on the topic, though as it is
late I'm not going to go too deep.  If it gets through a first round
of discussion, I'll start a node on the wiki and be happy to take
point on maintaining a document based on feedback.

My interest in this is based on my experience in dealing with XML.
90% of what I need to do is parse simple documents defining a known
structure that are coming from either files, strings, or the network.
It's also based on the responses I've received when attempting to
evangelize OCaml to a crowd whose first task is typically to try and
connect to the network, read some XML, do some processing on the XML,
and generate a response.

The purpose of this minimum implementation is to provide a common API
to perform the following tasks:

- Define a simple type that can be used to construct a tree
representing XML data.
- Parse an existing XML document into a simple data structure allowing
access to the data
- Manipulate the result of parsing the XML document
- Construct simple XML documents

XML parser implementations are free to expand beyond this
implementation, this is merely a recommendation for a minimum
implementation.

type xmlNode =
 | XmlElement of string * string * (string * string) list * xmlNode list
     (* namespace, tagName, attributes, children *)
 | XmlPCData of string
     (* text *)

with the following functions to parse data from different types of
sources.  The parsing, by default, should be non-validating but will
ensure well-formedness

val parse_file : string -> xmlNode
val parse_string: string -> xmlNode
val parse_channel: Pervasives.in_channel -> xmlNode

val to_string : xmlNode -> string

val iter : (xmlNode -> unit) -> xmlNode -> unit
val map : (xmlNode -> 'a) -> xmlNode -> 'a list
val fold : ('a -> xmlNode -> 'a) -> 'a -> xmlNode -> 'a
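
As a rough sketch, the traversal signatures above could be realized as follows. The type is repeated so the block is self-contained, written without labels on the constructor arguments since OCaml does not accept them there. The pre-order traversal and the sample document are assumptions; the proposal does not fix a visiting order.

```ocaml
(* Sketch only: argument order follows the proposal; traversal order is assumed. *)
type xmlNode =
  | XmlElement of string * string * (string * string) list * xmlNode list
      (* namespace, tagName, attributes, children *)
  | XmlPCData of string

(* Pre-order fold: visit a node, then its children from left to right. *)
let rec fold f acc node =
  let acc = f acc node in
  match node with
  | XmlElement (_, _, _, children) -> List.fold_left (fold f) acc children
  | XmlPCData _ -> acc

(* iter and map are derived from fold, so all three share one visiting order. *)
let iter f node = fold (fun () n -> f n) () node
let map f node = List.rev (fold (fun acc n -> f n :: acc) [] node)

let doc =
  XmlElement ("", "p", [],
    [ XmlPCData "hello "; XmlElement ("", "b", [], [ XmlPCData "world" ]) ])

(* Collect all character data in document order. *)
let text =
  String.concat ""
    (map (function XmlPCData s -> s | XmlElement _ -> "") doc)

(* Count every node, elements and text alike. *)
let node_count = fold (fun n _ -> n + 1) 0 doc
```

Deriving iter and map from fold keeps the three functions consistent, at the cost of building an intermediate list in map.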

Additional sections/areas that would have to be defined:

o Handling errors while parsing

o Validation.

I personally prefer having a different set of methods that perform the
parsing with validation so that it's obvious to me what is being
performed when I invoke a function.  I would be content with optional
arguments to the parse_ functions that are defined above but with the
default being to not validate.

o Callback/SAX style API

This is where I believe significant differences exist between XML
implementations.  I'm sure that the most that can be done here will be
to standardize the names of the functions or types that are used.



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-01-30  3:26   ` Jim Miller
@ 2008-01-30  7:35     ` Alain Frisch
  2008-01-30 10:32       ` Bünzli Daniel
  2008-01-30 10:35     ` Jon Harrop
  1 sibling, 1 reply; 23+ messages in thread
From: Alain Frisch @ 2008-01-30  7:35 UTC (permalink / raw)
  To: Jim Miller; +Cc: caml-list

Jim Miller wrote:
> type xmlNode =
>  | XmlElement of (namespace: string * tagName: string * attributes:
> (string * string) list * (children:xmlNode list) )
>  | XmlPCData of (text:string)

There have been some discussions here a while ago about standardizing XML
types across OCaml libraries. You might want to look up the archives.

Here are some random remarks.

First, you need to specify several things in the type above.

- the encoding of strings; if the parser cannot be configured, I guess 
that normalizing everything to utf-8 is the most natural choice.

- the handling of namespaces; does the first argument to XmlElement
refer to the namespace prefix as used in the document (it'd make
matching impossible because the document can use arbitrary prefixes), a
normalized version (you'd need to provide the parser with more info), or
the namespace URI (which makes pattern matching quite tedious). Also, it
is sometimes necessary to keep the [prefix->uri] dictionary available
at every node (e.g. to deal with XML Schema documents, where prefixes
can be used in attribute values). Moreover, some XML documents may be
valid w.r.t. the XML spec without conforming to the XML Namespaces one.

- whether adjacent XmlPCData nodes are allowed or not.

- whether the parser performs whitespace normalization (and how).


Also, in many cases, the client of the parser might want to get more 
information, like locations in the source document.

If you intend to use the same type to produce XML documents from an 
internal representation, I think you might want to add an extra constructor:

   | XmlMany of xmlNode list

This makes it much easier to build and compose XML fragments in a 
modular way.
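
A minimal sketch of how the extra constructor helps when producing output, assuming the four-argument XmlElement from the proposal. The printer here simply ignores the namespace argument, and the escape helper is illustrative rather than a complete serializer.

```ocaml
(* Sketch only: the namespace argument is ignored by this printer, and
   escaping covers just &, <, > and the double quote. *)
type xmlNode =
  | XmlElement of string * string * (string * string) list * xmlNode list
  | XmlPCData of string
  | XmlMany of xmlNode list

let escape s =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '&' -> Buffer.add_string b "&amp;"
      | '<' -> Buffer.add_string b "&lt;"
      | '>' -> Buffer.add_string b "&gt;"
      | '"' -> Buffer.add_string b "&quot;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* XmlMany is spliced into its parent: it produces no markup of its own. *)
let rec to_string = function
  | XmlPCData s -> escape s
  | XmlMany nodes -> String.concat "" (List.map to_string nodes)
  | XmlElement (_ns, tag, attrs, children) ->
      let attr (k, v) = Printf.sprintf " %s=\"%s\"" k (escape v) in
      Printf.sprintf "<%s%s>%s</%s>" tag
        (String.concat "" (List.map attr attrs))
        (String.concat "" (List.map to_string children))
        tag

(* A fragment producer returns several siblings without knowing its parent. *)
let items names =
  XmlMany (List.map (fun n -> XmlElement ("", "li", [], [ XmlPCData n ])) names)

let page = XmlElement ("", "ul", [ ("class", "x") ], [ items [ "a"; "b<c" ] ])
let rendered = to_string page
```

Without XmlMany, items would have to return a list and every caller would need to concatenate lists by hand.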

Also, you need to specify how the XML printer is supposed to deal with 
namespaces.



-- Alain



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-01-30  7:35     ` Alain Frisch
@ 2008-01-30 10:32       ` Bünzli Daniel
  0 siblings, 0 replies; 23+ messages in thread
From: Bünzli Daniel @ 2008-01-30 10:32 UTC (permalink / raw)
  To: caml-list caml-list


> Jim Miller wrote:
>> type xmlNode =
>> | XmlElement of (namespace: string * tagName: string * attributes:
>> (string * string) list * (children:xmlNode list) )
>> | XmlPCData of (text:string)

Attributes can have their own namespace, have a look at the spec [1]. I
see it more that way (but I'm biased).

type name = string * string
type attribute = name * string
type tag = name * attribute list

etc.
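
To illustrate, here is a small sketch built on these three definitions. It assumes a name is a (namespace URI, local name) pair; the tree type, the attribute_value helper and the sample link are illustrative and not taken from any existing library.

```ocaml
(* Sketch only: a name is assumed to be (namespace URI, local name). *)
type name = string * string
type attribute = name * string
type tag = name * attribute list

(* One possible continuation of the "etc.": a tree built from tags. *)
type tree =
  | Element of tag * tree list
  | Data of string

let xlink = "http://www.w3.org/1999/xlink"

(* Look an attribute up by its expanded name, independent of any prefix. *)
let attribute_value (name : name) ((_, attrs) : tag) : string option =
  List.assoc_opt name attrs

let link : tag = (("", "a"), [ ((xlink, "href"), "http://example.org/") ])
let doc = Element (link, [ Data "example" ])

let href = attribute_value (xlink, "href") link
let missing = attribute_value ("", "href") link
```

Because the attribute name carries its namespace, an unqualified href and an xlink href are distinct keys, which a plain (string * string) list cannot express.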


Adding to Alain's list, other things that need to be specified.

- what do you do with processing instructions and comments

- whether character references and predefined entities are resolved.

- how do you deal with external entity references.

- where does the parsing end (I don't do it according to the xml spec  
because from the words of the spec editor himself [2] the spec is  
broken).

I did document many of these issues for my own parser. You may want to
check that out [3] it may show you some of the specification details  
that are needed (note that the tree and the cursor representations are  
going away in the next version).

Best,

Daniel

[1] http://www.w3.org/TR/REC-xml-names/
[2] http://www.xml.com/axml/notes/TrailingMisc.html
[3] http://erratique.ch/software/xmlm/doc/Xmlm#io



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-01-30  3:26   ` Jim Miller
  2008-01-30  7:35     ` Alain Frisch
@ 2008-01-30 10:35     ` Jon Harrop
  2008-01-30 17:25       ` Jim Miller
  1 sibling, 1 reply; 23+ messages in thread
From: Jon Harrop @ 2008-01-30 10:35 UTC (permalink / raw)
  To: caml-list

On Wednesday 30 January 2008 03:26:00 Jim Miller wrote:
> type xmlNode =
>
>  | XmlElement of (namespace: string * tagName: string * attributes:
>
> (string * string) list * (children:xmlNode list) )
>
>  | XmlPCData of (text:string)

Just a minor quibble but might I suggest removing the redundant Xml/xml from 
all of the identifiers:

type element =
  { namespace: string;
    tagName: string;
    attributes: (string * string) list;
    children: node list }

and node =
 | Element of element
 | Text of string;;

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-01-30  2:37 ` [Caml-list] " Bünzli Daniel
  2008-01-30  3:26   ` Jim Miller
@ 2008-01-30 15:55   ` Vincent Hanquez
  1 sibling, 0 replies; 23+ messages in thread
From: Vincent Hanquez @ 2008-01-30 15:55 UTC (permalink / raw)
  To: Bünzli Daniel; +Cc: caml-list List

On Wed, Jan 30, 2008 at 03:37:59AM +0100, Bünzli Daniel wrote:
>
> Le 30 janv. 08 à 01:54, Jim Miller a écrit :
>
>> Inspired by the existing recommendation on the cocan wiki for I/O, I'd 
>> like to recommend the development of a standard interface for XML 
>> processing.  Currently there are many different implementations of XML 
>> parsers but their interfaces are very different and don't allow for easy 
>> swapping of implementations.
>
> There are many approaches to xml parsing (partial implementations, 
> leniency, well-formedness, validity, etc.), to parsing results (tree, 
> custom data structure, stream, namespace support etc.) and to processing 
> (mainly dependent on the parsing result). Xml processing cannot be seen as 
> an abstract datatype with different implementations, there are different 
> ways.
>
> As such I'm not sure such an interface is really feasible. Now if you see a 
> common pattern or concrete type signatures that could be changed to make 
> parsers more compatible do not hesitate to communicate them. If it benefits 
> the users of my parser and remains in its philosophy I'll happily implement 
> them. But _you_ have to make concrete proposal, I'm not going to research 
> this. Please do not just initiate a discussion because you like the 
> abstract idea of being able to swap xml parser implementations, make 
> proposals.

A do-everything interface is absolutely impossible, but providing a "simple"
library that parses DOM and SAX style to fill _common_ needs is
relatively "easy".

For people that have specific needs, there's nothing preventing them from
implementing/using a side library without using the common distributed library.

-- 
Vincent Hanquez



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-01-30 10:35     ` Jon Harrop
@ 2008-01-30 17:25       ` Jim Miller
  2008-02-05  3:23         ` Jim Miller
  0 siblings, 1 reply; 23+ messages in thread
From: Jim Miller @ 2008-01-30 17:25 UTC (permalink / raw)
  To: caml-list

Thanks for the comments.  I will start looking at all of the
suggestions tonight and make a second draft of the document that I'll
hope to have by tomorrow evening (I want to make sure that I research
the existing APIs as suggested).  At that point I'll make a page on
the Wiki where the document can evolve.

On Jan 30, 2008 5:35 AM, Jon Harrop <jon@ffconsultancy.com> wrote:
> On Wednesday 30 January 2008 03:26:00 Jim Miller wrote:
> > type xmlNode =
> >
> >  | XmlElement of (namespace: string * tagName: string * attributes:
> >
> > (string * string) list * (children:xmlNode list) )
> >
> >  | XmlPCData of (text:string)
>
> Just a minor quibble but might I suggest removing the redundant Xml/xml from
> all of the identifiers:
>
> type element =
>   { namespace: string;
>     tagName: string;
>     attributes: (string * string) list;
>     children: node list }
>
> and node =
>  | Element of element
>  | Text of string;;
>
> --
> Dr Jon D Harrop, Flying Frog Consultancy Ltd.
> http://www.ffconsultancy.com/products/?e
>
>



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-01-30 17:25       ` Jim Miller
@ 2008-02-05  3:23         ` Jim Miller
  2008-02-05  5:02           ` Alain Frisch
  2008-02-05  8:15           ` Vincent Hanquez
  0 siblings, 2 replies; 23+ messages in thread
From: Jim Miller @ 2008-02-05  3:23 UTC (permalink / raw)
  To: caml-list

I have created a page on the Cocan wiki for the discussion and
development of a common API for  XML processing:

http://www.cocan.org/osr/standard_api_for_xml_processors

The draft of the recommendation can be found here:

http://alastor.bittwiddlers.com/~gmiller/XmlApi.html

If you have any problems accessing this server please let me know and
I can make it more generally available.  I'm happy to post this to
another server if desired.  I didn't write this into the Wiki
primarily because my connection periods to the Internet do not
coincide with my periods of peak productivity.  I'm happy now to move
it to the Wiki if truly collaborative development is desired but
that'll have to take place another night.

I await suggestions/flames.



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05  3:23         ` Jim Miller
@ 2008-02-05  5:02           ` Alain Frisch
  2008-02-05  8:36             ` Bünzli Daniel
  2008-02-05  8:15           ` Vincent Hanquez
  1 sibling, 1 reply; 23+ messages in thread
From: Alain Frisch @ 2008-02-05  5:02 UTC (permalink / raw)
  To: Jim Miller; +Cc: caml-list

Jim Miller wrote:
> I await suggestions/flames.

As suggested before, you really need to say something, at least, about:
- namespaces (including for attributes): the most acceptable solution is 
probably to do nothing about them and remove the first argument to the 
XmlElement constructor;
- encoding of strings in the internal representation: probably always utf-8;
- whether the parser may create XmlMany nodes: probably no.

Other remarks:
- a 'pull' API (where the client asks the parser for more parsing 
events/tokens) is more convenient than the SAX-like 'push' API. You 
might want to add such an API to your spec;
- having a common spec for several libs makes more sense if they can 
share common types; maybe you should use polymorphic variants instead of 
regular ones?
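
A minimal sketch of the 'pull' style mentioned above, using polymorphic variants for the events. The signal type, the list-backed source and max_depth are hypothetical names chosen for illustration; a real parser would produce the signals from its input.

```ocaml
(* Sketch only: the signal type and the list-backed source are hypothetical. *)
type signal =
  [ `El_start of string * (string * string) list
  | `El_end
  | `Data of string ]

(* A pull source hands out the next signal on request, or None at the end. *)
type source = unit -> signal option

let source_of_list (signals : signal list) : source =
  let rest = ref signals in
  fun () ->
    match !rest with
    | [] -> None
    | s :: tl ->
        rest := tl;
        Some s

(* The client drives the loop and keeps its own state, here a depth counter. *)
let max_depth (next : source) : int =
  let rec loop depth deepest =
    match next () with
    | None -> deepest
    | Some (`El_start _) -> loop (depth + 1) (max deepest (depth + 1))
    | Some `El_end -> loop (depth - 1) deepest
    | Some (`Data _) -> loop depth deepest
  in
  loop 0 0

let depth =
  max_depth
    (source_of_list
       [ `El_start ("a", []); `El_start ("b", []); `Data "x"; `El_end; `El_end ])
```

With a push (SAX-like) API the same computation would need the depth stored in mutable state shared between callbacks; here it is an ordinary loop argument.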

-- Alain



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05  3:23         ` Jim Miller
  2008-02-05  5:02           ` Alain Frisch
@ 2008-02-05  8:15           ` Vincent Hanquez
  2008-02-05 11:16             ` Stefano Zacchiroli
  1 sibling, 1 reply; 23+ messages in thread
From: Vincent Hanquez @ 2008-02-05  8:15 UTC (permalink / raw)
  To: Jim Miller; +Cc: caml-list

On Mon, Feb 04, 2008 at 10:23:28PM -0500, Jim Miller wrote:
> I have created a page on the Cocan wiki for the discussion and
> development of a common API for  XML processing:
> 
> http://www.cocan.org/osr/standard_api_for_xml_processors
> 
> The draft of the recommendation can be found here:
> 
> http://alastor.bittwiddlers.com/~gmiller/XmlApi.html
> 
> If you have any problems accessing this server please let me know and
> I can make it more generally available.  I'm happy to post this to
> another server if desired.  I didn't write this into the Wiki
> primarily because my connection periods to the Internet do not
> coincide with my periods of peak productivity.  I'm happy now to move
> it to the Wiki if truly collaborative development is desired but
> that'll have to take place another night.

Please rename XmlElement, XmlPCData into Element and PCData.
You're supposed to be in the Xml namespace, so everything is already
prefixed by "Xml." when used.

The same holds for xmlError.

-- 
Vincent Hanquez



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05  5:02           ` Alain Frisch
@ 2008-02-05  8:36             ` Bünzli Daniel
  2008-02-05  9:51               ` Vincent Hanquez
  0 siblings, 1 reply; 23+ messages in thread
From: Bünzli Daniel @ 2008-02-05  8:36 UTC (permalink / raw)
  To: caml-list caml-list

Le 5 févr. 08 à 06:02, Alain Frisch a écrit :

> As suggested before, you really need to say something, at least,  
> about:
[...]
- Whether character references and predefined entity references must  
be resolved. Hint : yes.


Le 5 févr. 08 à 06:02, Alain Frisch a écrit :

> - having a common spec for several libs makes more sense if they can  
> share common types; maybe you should use polymorphic variants  
> instead of regular ones?

Agreed. In xmlm these variants become polymorphic in the next version.

Other comments.

* IMHO, do not use camel casing. Underscores are more caml like, i.e.  
xml_node, etc.
* Regarding naming I would call xmlNode xml_tree and in general drop  
the xml prefix from the cases.
* "combine" argument, in my opinion the parser should always combine
adjacent pcdata nodes.
* As others may now know I don't like to raise exceptions, the next
version of xmlm doesn't raise exceptions (but given recent discussions
it seems others do like exceptions).
* Regarding the way the parser is invoked, I don't like the way it is
done:

(1) The function "parse" can only be used with channels, which is not
good. (2) A convenience function like parse_file is always useless to me
since it is hard to know the exact kind of error handling performed by
such functions without looking at their source.

The way I do this kind of things is to define an input abstraction  
type. First you create an input abstraction from a data source
(e.g. in_channel, strings, and a callback source) and then you invoke  
the parser with the input abstraction (actually I started an OSR on  
devising IO modules with non object-oriented IO sources and  
destination reflecting this view, but I'm reluctant to publish it).
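
The input abstraction described above could be sketched as follows. The type name input, the two constructor functions and count_open_angles are hypothetical; the last one merely stands in for a parser to show that it never sees the underlying source.

```ocaml
(* Sketch only: [input] and the constructor functions are hypothetical names. *)
(* An input is a function returning the next byte, or None at end of input.
   Any closure of this type also serves as a callback source. *)
type input = unit -> char option

let input_of_string (s : string) : input =
  let pos = ref 0 in
  fun () ->
    if !pos >= String.length s then None
    else begin
      let c = s.[!pos] in
      incr pos;
      Some c
    end

(* The channel is neither opened nor closed here; that stays with the caller. *)
let input_of_channel (ic : in_channel) : input =
  fun () -> try Some (input_char ic) with End_of_file -> None

(* Stand-in for a parser: it only sees the abstraction, never the source. *)
let count_open_angles (next : input) : int =
  let rec loop n =
    match next () with
    | None -> n
    | Some '<' -> loop (n + 1)
    | Some _ -> loop n
  in
  loop 0

let n = count_open_angles (input_of_string "<a><b/></a>")
```

Leaving file opening and closing to the caller is what keeps the error handling visible, which is the objection raised against parse_file.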

In general I'd like to say that I'm a little bit dubious about this  
effort. Actually I would refrain from formalizing the actual way the  
parser is invoked, clients can also perform their bit of work. I would  
concentrate on defining :

1) Parsing _result_ types and a precise definition of the actual  
_form_ of the data they contain. More than one form may be defined.  
This is the most important thing if you would like to be able to  
switch implementation, the actual input procedure can easily be  
isolated from the rest of your source.

2) A minimal list of input sources (e.g. in_channel and string) from  
which the parser should be able to read without going in further  
details on how the actual input procedure should be performed. Just  
specify the state in which sources are accepted for input and left  
after output.

Best,

Daniel


* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05  8:36             ` Bünzli Daniel
@ 2008-02-05  9:51               ` Vincent Hanquez
  2008-02-05 10:13                 ` Jacques Garrigue
  2008-02-05 10:31                 ` Bünzli Daniel
  0 siblings, 2 replies; 23+ messages in thread
From: Vincent Hanquez @ 2008-02-05  9:51 UTC (permalink / raw)
  To: Bünzli Daniel; +Cc: caml-list caml-list

On Tue, Feb 05, 2008 at 09:36:02AM +0100, Bünzli Daniel wrote:
>> - having a common spec for several libs makes more sense if they can share 
>> common types; maybe you should use polymorphic variants instead of regular 
>> ones?
>
> Agreed. In xmlm these variants become polymorphic in the next version.

That's really a bad idea; as a user of xmlm, I hope you're going to
reconsider. The polymorphic variant namespace is so easily polluted by
random "values" that libraries should never use them, or at least should
not advertise them in their public interface.

-- 
Vincent Hanquez



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05  9:51               ` Vincent Hanquez
@ 2008-02-05 10:13                 ` Jacques Garrigue
  2008-02-05 11:14                   ` Vincent Hanquez
  2008-02-05 10:31                 ` Bünzli Daniel
  1 sibling, 1 reply; 23+ messages in thread
From: Jacques Garrigue @ 2008-02-05 10:13 UTC (permalink / raw)
  To: tab; +Cc: daniel.buenzli, caml-list

From: tab@snarc.org (Vincent Hanquez)
> On Tue, Feb 05, 2008 at 09:36:02AM +0100, Bünzli Daniel wrote:
> >> - having a common spec for several libs makes more sense if they can share 
> >> common types; maybe you should use polymorphic variants instead of regular 
> >> ones?
> >
> > Agreed. In xmlm these variants become polymorphic in the next version.
> 
> that's really a bad idea; As a user of xmlm, I hope you're going to
> re-consider. the polymorphic variant namespace is so easily polluted by
> random "value" that library should never use them or at least doesn't
> advertise them as public interface.

I have no particular opinion on this particular case (if you want to
allow changing the library, you can also functorize your code, which
would work with normal variants too), but could you explain how
polymorphic variant namespace can get polluted?
The point of polymorphic variants is precisely that pollution does
not exist (i.e. only constructors that appear in the same type
matter). This is what makes them so nice in libraries.

Jacques Garrigue



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05  9:51               ` Vincent Hanquez
  2008-02-05 10:13                 ` Jacques Garrigue
@ 2008-02-05 10:31                 ` Bünzli Daniel
  2008-02-05 10:43                   ` Nicolas Pouillard
  2008-02-05 11:21                   ` Vincent Hanquez
  1 sibling, 2 replies; 23+ messages in thread
From: Bünzli Daniel @ 2008-02-05 10:31 UTC (permalink / raw)
  To: Vincent Hanquez; +Cc: caml-list caml-list


Le 5 févr. 08 à 10:51, Vincent Hanquez a écrit :

> On Tue, Feb 05, 2008 at 09:36:02AM +0100, Bünzli Daniel wrote:
>>> - having a common spec for several libs makes more sense if they  
>>> can share
>>> common types; maybe you should use polymorphic variants instead of  
>>> regular
>>> ones?
>>
>> Agreed. In xmlm these variants become polymorphic in the next  
>> version.
>
> that's really a bad idea; As a user of xmlm, I hope you're going to
> re-consider. the polymorphic variant namespace is so easily polluted  
> by
> random "value"

What people seem to fail to understand is that with polymorphic  
variants if you close them and write mlis you get exactly the same  
typechecking as with regular variants but without being tied to a  
particular module. For example if you define

type encoding = [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE |
`UTF_16LE | `UTF_8 ]

and then ask for this type exactly in a function type, e.g.

val encoding_to_string : encoding -> string

then you get exactly the same typechecking as with regular variants
on applications of encoding_to_string.

Using variants allows you to have a better decoupling between say your
own modules that handle encodings and xmlm. As Jacques mentions this
actually may prevent pollution from xmlm to your own modules.
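
A small sketch of the decoupling described here. Parser_a and Parser_b are hypothetical, independent modules that each declare the same closed encoding type; neither refers to the other, yet both accept the same value.

```ocaml
(* Sketch only: Parser_a and Parser_b are hypothetical, independent libraries. *)
module Parser_a : sig
  type encoding =
    [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE | `UTF_16LE | `UTF_8 ]
  val encoding_to_string : encoding -> string
end = struct
  type encoding =
    [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE | `UTF_16LE | `UTF_8 ]
  let encoding_to_string = function
    | `ISO_8859_1 -> "ISO-8859-1"
    | `US_ASCII -> "US-ASCII"
    | `UTF_16 -> "UTF-16"
    | `UTF_16BE -> "UTF-16BE"
    | `UTF_16LE -> "UTF-16LE"
    | `UTF_8 -> "UTF-8"
end

module Parser_b : sig
  type encoding =
    [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE | `UTF_16LE | `UTF_8 ]
  val is_unicode : encoding -> bool
end = struct
  type encoding =
    [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE | `UTF_16LE | `UTF_8 ]
  let is_unicode = function
    | `UTF_16 | `UTF_16BE | `UTF_16LE | `UTF_8 -> true
    | `ISO_8859_1 | `US_ASCII -> false
end

(* The same value is accepted by both libraries; neither depends on the other. *)
let enc = `UTF_8
let name = Parser_a.encoding_to_string enc
let unicode = Parser_b.is_unicode enc

(* Because the type is closed in the signature, a misspelled tag is rejected
   at compile time:
   let _ = Parser_a.encoding_to_string `ISO_8850_1 *)
```

With regular variants, Parser_b would have to either depend on Parser_a's type or provide a conversion function between two otherwise identical types.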
Best,
Daniel



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05 10:31                 ` Bünzli Daniel
@ 2008-02-05 10:43                   ` Nicolas Pouillard
  2008-02-05 13:29                     ` Jon Harrop
  2008-02-05 11:21                   ` Vincent Hanquez
  1 sibling, 1 reply; 23+ messages in thread
From: Nicolas Pouillard @ 2008-02-05 10:43 UTC (permalink / raw)
  To: daniel.buenzli; +Cc: tab, caml-list


Excerpts from daniel.buenzli's message of Tue Feb 05 11:31:22 +0100 2008:
> 
> Le 5 févr. 08 à 10:51, Vincent Hanquez a écrit :
> 
> > On Tue, Feb 05, 2008 at 09:36:02AM +0100, Bünzli Daniel wrote:
> >>> - having a common spec for several libs makes more sense if they  
> >>> can share
> >>> common types; maybe you should use polymorphic variants instead of  
> >>> regular
> >>> ones?
> >>
> >> Agreed. In xmlm these variants become polymorphic in the next  
> >> version.
> >
> > that's really a bad idea; As a user of xmlm, I hope you're going to
> > re-consider. the polymorphic variant namespace is so easily polluted  
> > by
> > random "value"
> 
> What people seem to fail to understand is that with polymorphic  
> variants if you close them and write mlis you get exactly the same  
> typechecking as with regular variants but without being tied to a  
> particular module. For example if define
> 
> type encoding = [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE |  
> `UTF_16LE | `UTF_8 ]
> and then ask for this type exactly in a function type, e.g.
> val encoding_to_string : encoding -> string
> then you get exactly the same typechecking as with a regular variants  
> on applications of encoding_to_string.
> Using variants allows you to have a better decoupling between say your  
> own modules that hande encodings and xmlm. As Jacques mentions this  
> actually may prevents pollution from xmlm to your own modules.

I completely agree with this type of usage of polymorphic variants.

However I think that for error handling option and either are simpler
solutions. Then going to polymorphic variants because OCaml doesn't have
"either" in Pervasives is sad (in fact I think that OCaml deserves an
"either" type, even more: an "Either" module).
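
For illustration, a sketch of error handling with a locally defined either type. The type, the error record and the toy parse function are all assumptions made for this example; parse only checks the first character and is not a real parser.

```ocaml
(* Sketch only: a locally defined either type; all names are illustrative. *)
type ('a, 'b) either = Left of 'a | Right of 'b

type error = { line : int; message : string }

(* Toy check standing in for a parser: the input must start with '<'. *)
let parse (s : string) : (error, string) either =
  if String.length s > 0 && s.[0] = '<' then Right s
  else Left { line = 1; message = "expected '<'" }

(* The caller must handle both cases; no exception can escape unnoticed. *)
let describe = function
  | Right doc -> "ok: " ^ doc
  | Left e -> Printf.sprintf "line %d: %s" e.line e.message

let good = describe (parse "<a/>")
let bad = describe (parse "oops")
```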

-- 
Nicolas Pouillard aka Ertai



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05 10:13                 ` Jacques Garrigue
@ 2008-02-05 11:14                   ` Vincent Hanquez
  0 siblings, 0 replies; 23+ messages in thread
From: Vincent Hanquez @ 2008-02-05 11:14 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: daniel.buenzli, caml-list

On Tue, Feb 05, 2008 at 07:13:40PM +0900, Jacques Garrigue wrote:
> > that's really a bad idea; As a user of xmlm, I hope you're going to
> > re-consider. the polymorphic variant namespace is so easily polluted by
> > random "value" that library should never use them or at least doesn't
> > advertise them as public interface.
> 
> I have no particular opinion on this particular case (if you want to
> allow chaning the library, you can also functorize your code, which
> would work with normal variants too), but could you explain how
> polymorphic variant namespace can get polluted?

I consider them pollution since they don't have a "namespace", but that
might not be the right word here.

What I want to express is the fact that when I use "`Node", I can't tell
where it's coming from:
is that an Xml node, a binary tree node, a Red/Black tree node, etc.? Dunno.

When I have an "Xml.Node", I automatically know that Node is coming from
the Xml namespace, and that it's an xml node.

> The point of polymorphic variants is precisely that pollution does
> not exist (i.e. only constructors that appear in the same type
> matter). This is what makes them so nice in libraries.

I beg to differ. I'm happy to know where my symbols come from,
since that's automatic code documentation for me. Using polymorphic
variants just puts everything in the same flat namespace.

-- 
Vincent Hanquez



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05  8:15           ` Vincent Hanquez
@ 2008-02-05 11:16             ` Stefano Zacchiroli
  0 siblings, 0 replies; 23+ messages in thread
From: Stefano Zacchiroli @ 2008-02-05 11:16 UTC (permalink / raw)
  To: caml-list

On Tue, Feb 05, 2008 at 09:15:51AM +0100, Vincent Hanquez wrote:
> please rename XmlElement, XmlPCData into Element and PCData.
> you're suppose to be in the Xml namespace, so everything is already
> prefixed by "Xml." when used.

Full ack, no matter whether you end up with camel case or _-separated
tokens: please drop the unneeded prefix.

Cheers.

-- 
Stefano Zacchiroli -*- PhD in Computer Science ............... now what?
zack@{upsilon.cc,cs.unibo.it,debian.org}  -<%>-  http://upsilon.cc/zack/
(15:56:48)  Zack: e la demo dema ?    /\    All one has to do is hit the
(15:57:15)  Bac: no, la demo scema    \/    right keys at the right time



* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05 10:31                 ` Bünzli Daniel
  2008-02-05 10:43                   ` Nicolas Pouillard
@ 2008-02-05 11:21                   ` Vincent Hanquez
  1 sibling, 0 replies; 23+ messages in thread
From: Vincent Hanquez @ 2008-02-05 11:21 UTC (permalink / raw)
  To: Bünzli Daniel; +Cc: caml-list caml-list

On Tue, Feb 05, 2008 at 11:31:22AM +0100, Bünzli Daniel wrote:
>> That's really a bad idea; as a user of xmlm, I hope you're going to
>> reconsider. The polymorphic variant namespace is so easily polluted by
>> random "value"
>
> What people seem to fail to understand is that with polymorphic variants if 
> you close them and write mlis you get exactly the same typechecking as with 
> regular variants but without being tied to a particular module. For example 
> if you define

You might have the same typechecking, but consider my example in
the other thread. If I type `ISO_8850_1 somewhere by mistake, the error
reporting is going to be extremely crappy.

> type encoding = [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE | `UTF_16LE 
> | `UTF_8 ]
> and then ask for this type exactly in a function type, e.g.
> val encoding_to_string : encoding -> string
> then you get exactly the same typechecking as with regular variants on 
> applications of encoding_to_string.
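
The quoted definitions can be put together as the following sketch. The
.mli is simulated with a module signature, and the module name Enc and
the body of encoding_to_string are assumptions made for illustration. The
commented-out line shows the kind of typo mentioned above: with the
closed type it is rejected at compile time, although the message is
phrased in terms of variant tags rather than an unbound constructor.

```ocaml
(* Hypothetical module; the signature plays the role of the .mli. *)
module Enc : sig
  type encoding =
    [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE | `UTF_16LE | `UTF_8 ]
  val encoding_to_string : encoding -> string
end = struct
  type encoding =
    [ `ISO_8859_1 | `US_ASCII | `UTF_16 | `UTF_16BE | `UTF_16LE | `UTF_8 ]

  (* Assumed implementation: map each tag to a label. *)
  let encoding_to_string = function
    | `ISO_8859_1 -> "ISO-8859-1"
    | `US_ASCII -> "US-ASCII"
    | `UTF_16 -> "UTF-16"
    | `UTF_16BE -> "UTF-16BE"
    | `UTF_16LE -> "UTF-16LE"
    | `UTF_8 -> "UTF-8"
end

let () = print_endline (Enc.encoding_to_string `UTF_8)

(* Does not typecheck: the tag `ISO_8850_1 is not part of Enc.encoding.
   let _ = Enc.encoding_to_string `ISO_8850_1 *)
```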

> Using variants allows you to have a better decoupling between say your own 
> modules that handle encodings and xmlm. As Jacques mentions this actually 
> may prevent pollution from xmlm to your own modules.

That's true, you could do that if you want to provide an interface, but
you have functors to achieve the same result, without using polymorphic
variants and without polluting your own modules, while keeping the same
strong typing that people using OCaml are used to.
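
A minimal sketch of the functor approach, with every name below
(ENCODING, Make_writer, My_encoding) hypothetical: the library is
parameterised over the user's encoding module, so the user keeps regular
variants in their own namespace.

```ocaml
(* What the library needs to know about an encoding type. *)
module type ENCODING = sig
  type t
  val to_string : t -> string
end

(* Library side: a functor that works with any encoding module. *)
module Make_writer (E : ENCODING) = struct
  let declaration enc =
    "<?xml version=\"1.0\" encoding=\"" ^ E.to_string enc ^ "\"?>"
end

(* User side: regular variants, defined in the user's own module. *)
module My_encoding = struct
  type t = Utf_8 | Iso_8859_1
  let to_string = function
    | Utf_8 -> "UTF-8"
    | Iso_8859_1 -> "ISO-8859-1"
end

module Writer = Make_writer (My_encoding)

let () = print_endline (Writer.declaration My_encoding.Utf_8)
```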

-- 
Vincent Hanquez


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05 10:43                   ` Nicolas Pouillard
@ 2008-02-05 13:29                     ` Jon Harrop
  2008-02-05 14:53                       ` micha
  2008-02-05 14:57                       ` David Teller
  0 siblings, 2 replies; 23+ messages in thread
From: Jon Harrop @ 2008-02-05 13:29 UTC (permalink / raw)
  To: caml-list

On Tuesday 05 February 2008 10:43:51 Nicolas Pouillard wrote:
> However I think that for error handling option and either are
> simpler solutions. Then going to polymorphic variants because OCaml
> doesn't have "either" in Pervasives is sad (in fact I think that OCaml
> deserves an "either" type, even more: an "Either" module).

... and an "Option" module for the "option" type. And "Int" and "Float" 
modules ...
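
Such modules could look roughly like the sketch below. The function names
(either, map, default, parse_int) are assumptions chosen for illustration,
not a proposed standard interface.

```ocaml
(* Hypothetical "Either" module. *)
module Either = struct
  type ('a, 'b) t = Left of 'a | Right of 'b

  (* Eliminate an either value by handling both cases. *)
  let either f g = function
    | Left a -> f a
    | Right b -> g b
end

(* Hypothetical "Option" module over the built-in option type. *)
module Option = struct
  type 'a t = 'a option

  let map f = function
    | None -> None
    | Some x -> Some (f x)

  let default d = function
    | None -> d
    | Some x -> x
end

(* Error handling with either instead of polymorphic variants. *)
let parse_int s =
  try Either.Right (int_of_string s)
  with Failure _ -> Either.Left ("not an integer: " ^ s)

let () =
  print_endline (Either.either (fun e -> e) string_of_int (parse_int "42"));
  print_endline (string_of_int (Option.default 0 (Option.map succ (Some 1))))
```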

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05 14:53                       ` micha
@ 2008-02-05 14:53                         ` Jon Harrop
  0 siblings, 0 replies; 23+ messages in thread
From: Jon Harrop @ 2008-02-05 14:53 UTC (permalink / raw)
  To: caml-list

On Tuesday 05 February 2008 14:53:47 micha wrote:
> On Tuesday, 5 February 2008 14:29:01, Jon Harrop wrote:
> > ... and an "Option" module for the "option" type. And "Int" and "Float"
> > modules ...
>
> Make a port of the SML Basis Library structures as an additional OCaml lib.

And F#'s stdlib...

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05 13:29                     ` Jon Harrop
@ 2008-02-05 14:53                       ` micha
  2008-02-05 14:53                         ` Jon Harrop
  2008-02-05 14:57                       ` David Teller
  1 sibling, 1 reply; 23+ messages in thread
From: micha @ 2008-02-05 14:53 UTC (permalink / raw)
  To: caml-list

On Tuesday, 5 February 2008 14:29:01, Jon Harrop wrote:
>
> ... and an "Option" module for the "option" type. And "Int" and "Float"
> modules ...

Make a port of the SML Basis Library structures as an additional OCaml lib.

 Michael


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] [OSR] Suggested topic - XML processing API
  2008-02-05 13:29                     ` Jon Harrop
  2008-02-05 14:53                       ` micha
@ 2008-02-05 14:57                       ` David Teller
  1 sibling, 0 replies; 23+ messages in thread
From: David Teller @ 2008-02-05 14:57 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

We can do that. It's only a few days' work, after all. But it's another
day's work for the OSR :)

Cheers,
 David

On Tue, 2008-02-05 at 13:29 +0000, Jon Harrop wrote:
> On Tuesday 05 February 2008 10:43:51 Nicolas Pouillard wrote:
> > However I think that for error handling option and either are
> > simpler solutions. Then going to polymorphic variants because OCaml
> > doesn't have "either" in Pervasives is sad (in fact I think that OCaml
> > deserves an "either" type, even more: an "Either" module).
> 
> ... and an "Option" module for the "option" type. And "Int" and "Float" 
> modules ...
> 
-- 
David Teller
 Security of Distributed Systems
  http://www.univ-orleans.fr/lifo/Members/David.Teller
 Angry researcher: French Universities need reforms, but the LRU act brings liquidations. 


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2008-02-05 14:58 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-30  0:54 [OSR] Suggested topic - XML processing API Jim Miller
2008-01-30  2:37 ` [Caml-list] " Bünzli Daniel
2008-01-30  3:26   ` Jim Miller
2008-01-30  7:35     ` Alain Frisch
2008-01-30 10:32       ` Bünzli Daniel
2008-01-30 10:35     ` Jon Harrop
2008-01-30 17:25       ` Jim Miller
2008-02-05  3:23         ` Jim Miller
2008-02-05  5:02           ` Alain Frisch
2008-02-05  8:36             ` Bünzli Daniel
2008-02-05  9:51               ` Vincent Hanquez
2008-02-05 10:13                 ` Jacques Garrigue
2008-02-05 11:14                   ` Vincent Hanquez
2008-02-05 10:31                 ` Bünzli Daniel
2008-02-05 10:43                   ` Nicolas Pouillard
2008-02-05 13:29                     ` Jon Harrop
2008-02-05 14:53                       ` micha
2008-02-05 14:53                         ` Jon Harrop
2008-02-05 14:57                       ` David Teller
2008-02-05 11:21                   ` Vincent Hanquez
2008-02-05  8:15           ` Vincent Hanquez
2008-02-05 11:16             ` Stefano Zacchiroli
2008-01-30 15:55   ` Vincent Hanquez
