Re: [Caml-list] Extensible graphs - Diego Olivier Fernandez Pons

From: Diego Olivier Fernandez Pons <Diego.FERNANDEZ_PONS@etu.upmc.fr>
To: Jon Harrop <jdh30@cam.ac.uk>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Extensible graphs
Date: Wed, 31 Mar 2004 11:11:35 +0200 (DST)
Message-ID: <Pine.A41.4.44.0403310910020.1585400-100000@ibm1>
In-Reply-To: <200403282300.57264.jdh30@cam.ac.uk>

    Bonjour,

> I'm writing code which I wish to sell in object form and I'd like it
> to contain a basic representation of a graph which can be extended.
> This basic graph might be something like:
>
> type leaf = A | B
> type node = Leaf of leaf | Group of node list

[...]

> People who use this code are likely to want to make a slightly more
> complicated graph which contains, say, an extra leaf type, an extra
> node type and more functions which act on the new type of graph,
> equivalent to this:

(Please forgive my approximate English and feel free to correct it
whenever needed. Moreover, if anything seems unclear, do not
hesitate to ask for more explanation.)

The problem is the extensibility of a graph data structure distributed
in compiled form. My answer has two parts:
- generic advice based on my Caml programming experience
- specific advice based on my experience implementing graph data
structures

I have read a few of your web pages and you seem to be an "imperative
programmer", more used to languages such as C++ or Java than to
functional ones like ML or Haskell.

In "Objective Caml" there is of course "Objective", which clearly
states that the language has an object layer, but there is still
"Caml" and its functional core. The features relevant to data
structure implementation are:
- parametric polymorphism
- functors
- polymorphic variants
- private constructors

> Adding new functions which use the existing data types is easy, but
> I can't see any way to allow them to add new node types without
> requiring them to reimplement everything, or at least explicitly
> call the old routines from any new ones when they are used with the
> old data types.

What do you mean by "new node types" ?

If what you need is to allow any type to be a node, then you should
try a polymorphic data structure :

'a graph (where 'a stands for the type of the node)

Then you could have:
int graph : a graph in which every node carries an integer
(int * char) graph : a graph in which every node carries an integer
and a char
(int graph) graph : a graph in which every node carries an
int graph
my_type graph : a graph in which every node carries a value of your
own type

You will find parametric graph data structures in Baire (see the Hump,
data structures section) and you can easily build your own
(e.g. on top of a parametric map data structure).
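
As a minimal sketch of the "build your own with a parametric map"
idea (this is not the Baire API; the record fields, the use of int
node identifiers, and every function name below are assumptions made
for illustration):

```ocaml
(* Hypothetical sketch: a graph whose nodes carry a payload of type 'a.
   Nodes are identified by int keys; all names are illustrative. *)
module IntMap = Map.Make (struct
  type t = int
  let compare = compare
end)

type 'a graph = {
  labels : 'a IntMap.t;        (* node id -> payload *)
  succs : int list IntMap.t;   (* node id -> successor ids *)
}

let empty = { labels = IntMap.empty; succs = IntMap.empty }

(* Register a node; an existing successor list is preserved. *)
let add_node id label g =
  { labels = IntMap.add id label g.labels;
    succs =
      (if IntMap.mem id g.succs then g.succs
       else IntMap.add id [] g.succs) }

(* Add a directed edge src -> dst. *)
let add_edge src dst g =
  let old = try IntMap.find src g.succs with Not_found -> [] in
  { g with succs = IntMap.add src (dst :: old) g.succs }

let successors id g =
  try IntMap.find id g.succs with Not_found -> []

let label id g = IntMap.find_opt id g.labels

(* The same code serves any payload type, here (int * char). *)
let g : (int * char) graph =
  empty |> add_node 1 (10, 'a') |> add_node 2 (20, 'b') |> add_edge 1 2
```

None of the functions inspect the payload, which is why the library
can be compiled once and reused with any node type.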

If the "node type" requires specific accessors (i.e. if it is a
module) then you should try functorial graphs.

The user code should look like:

 module MyNode = struct ... end

 module MyGraph = Graph.Make (MyNode)

You will find an example of a functorial graph library in OCamlGraph
(see the Hump, data structures section), although in that case it is
the whole graph data structure which is abstracted from the
(functorial) graph algorithms.
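
A minimal sketch of what such a functor could look like. This is not
the OCamlGraph API: the NODE signature, the contents of Graph.Make and
all function names are assumptions chosen only to make the two lines
of user code above concrete.

```ocaml
module Graph = struct
  (* Assumed requirement on the node module: a type and an ordering. *)
  module type NODE = sig
    type t
    val compare : t -> t -> int
  end

  module Make (N : NODE) = struct
    module M = Map.Make (N)

    type t = N.t list M.t   (* node -> list of successors *)

    let empty : t = M.empty

    let add_node n g = if M.mem n g then g else M.add n [] g

    (* Both endpoints are registered before the edge is recorded. *)
    let add_edge a b g =
      let g = add_node b (add_node a g) in
      M.add a (b :: M.find a g) g

    let succ n g = try M.find n g with Not_found -> []

    let nb_nodes g = M.cardinal g
  end
end

(* User side: supply the node module, get a specialised graph. *)
module MyNode = struct
  type t = string
  let compare = String.compare
end

module MyGraph = Graph.Make (MyNode)

let g = MyGraph.(empty |> add_edge "a" "b" |> add_edge "a" "c")
```

The library only relies on what NODE exports, so it can be shipped
compiled while users plug in their own node modules.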

You may also want to try "private constructors". They are a kind of
intermediate between completely open types (e.g. int * int) and
"closed" functors. It is a rather new feature and I am not yet totally
comfortable with it, therefore I won't say much more.
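
For readers who have not met the feature, here is a small sketch of a
private variant type. The module name Node, the smart constructors
leaf and group, and the "no empty group" invariant are all assumptions
invented for the example.

```ocaml
module Node : sig
  type leaf = A | B
  (* private: clients may pattern-match on t but not construct it *)
  type t = private Leaf of leaf | Group of t list
  val leaf : leaf -> t
  val group : t list -> t   (* hypothetical smart constructor *)
  val count_leaves : t -> int
end = struct
  type leaf = A | B
  type t = Leaf of leaf | Group of t list

  let leaf l = Leaf l

  (* Assumed invariant, enforced here: a group is never empty. *)
  let group = function
    | [] -> invalid_arg "Node.group: empty group"
    | ns -> Group ns

  let rec count_leaves = function
    | Leaf _ -> 1
    | Group ns -> List.fold_left (fun acc n -> acc + count_leaves n) 0 ns
end

(* Client code can still deconstruct values by pattern matching ... *)
let is_leaf = function Node.Leaf _ -> true | Node.Group _ -> false

(* ... but must build them through the exported functions; writing
   Node.Group [] directly is rejected by the type checker. *)
let example = Node.(group [leaf A; group [leaf B; leaf A]])
```

Pattern matching stays available to users, while construction goes
through functions the library controls.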

> type leaf = A | B | C
> type node =
>     | Leaf of leaf
>     | FunkyGroup of node list
>     | Group of node list

In the example you give, the "node" is not a node of the graph but
a node of the underlying tree that represents the graph : are you
really sure you need that ?

The main problem here is the pattern-matching since the predefined
functions (like count_leaves) based on it do not work any more.

Possible workarounds are:

i) pattern-matching simulation via functors

I tried that once for binary trees

type 'a tree = E | N of 'a tree * 'a * 'a tree
type 'a tree2 = E | N of 'a tree2 * 'a * 'a tree2 * int

I didn't want to rewrite all the functions like insert, fold, etc.
which do not depend on the extra int information, for example:

let rec height = function
  | E -> 0
  | N (l, _, r) -> 1 + max (height l) (height r)

I defined a module TreePatternMatcher

type 'a t
val is_empty : 'a t -> bool
val left_tree : 'a t -> 'a t
val right_tree : 'a t -> 'a t
val value : 'a t -> 'a
val partition : 'a t -> 'a t * 'a * 'a t

then I wrapped all functions in a functor using this interface

let rec height tree =
  if is_empty tree then 0
  else
    let (l, _, r) = partition tree in
    1 + max (height l) (height r)
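
Put together, the approach could be sketched as follows. The module
type keeps only the part of the interface that height needs, and the
names TREE_PATTERN_MATCHER, Generic, Plain and Annotated are
assumptions introduced for the example.

```ocaml
(* The accessor interface, packaged as a module type. *)
module type TREE_PATTERN_MATCHER = sig
  type 'a t
  val is_empty : 'a t -> bool
  val partition : 'a t -> 'a t * 'a * 'a t
end

(* Functions that ignore any extra per-node data are written once. *)
module Generic (T : TREE_PATTERN_MATCHER) = struct
  let rec height tree =
    if T.is_empty tree then 0
    else
      let (l, _, r) = T.partition tree in
      1 + max (height l) (height r)
end

(* First representation: plain binary tree. *)
module Plain = struct
  type 'a t = E | N of 'a t * 'a * 'a t
  let is_empty = function E -> true | N _ -> false
  let partition = function
    | E -> invalid_arg "partition: empty tree"
    | N (l, v, r) -> (l, v, r)
end

(* Second representation: an extra int stored in every node. *)
module Annotated = struct
  type 'a t = E | N of 'a t * 'a * 'a t * int
  let is_empty = function E -> true | N _ -> false
  let partition = function
    | E -> invalid_arg "partition: empty tree"
    | N (l, v, r, _) -> (l, v, r)
end

module PlainOps = Generic (Plain)
module AnnotatedOps = Generic (Annotated)

let h1 = PlainOps.height Plain.(N (N (E, 1, E), 2, E))
let h2 = AnnotatedOps.height Annotated.(N (E, 'x', E, 0))
```

The cost is that generic code can no longer use native pattern
matching: every deconstruction goes through partition.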

ii) polymorphic variants

In the previous case, the problem was "inside a constructor"
  N (l, v, r) against N (l, v, r, _)

If you only need to add patterns, then polymorphic variants could be
what you are looking for. The Caml manual gives a few examples of the
use of variants.
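
Applied to the example from the question (one extra leaf, one extra
kind of group), a sketch with polymorphic variants could look like
this. The type and function names, and the choice to leave the leaf
type and the recursive positions open as type parameters, are
assumptions made for the illustration.

```ocaml
(* Library side: leaf type 'l and recursive positions 'n stay open. *)
type base_leaf = [ `A | `B ]
type ('l, 'n) base_node = [ `Leaf of 'l | `Group of 'n list ]

(* Counts leaves; recursion on children is delegated to [self]. *)
let count_leaves_base self : ('l, 'n) base_node -> int = function
  | `Leaf _ -> 1
  | `Group ns -> List.fold_left (fun acc n -> acc + self n) 0 ns

(* The basic closed graph: tie the recursive knot. *)
type node = [ (base_leaf, node) base_node ]

let rec count_leaves (n : node) = count_leaves_base count_leaves n

(* User side: add a leaf `C and a node constructor `FunkyGroup. *)
type ext_leaf = [ base_leaf | `C ]
type ext_node =
  [ (ext_leaf, ext_node) base_node | `FunkyGroup of ext_node list ]

(* Old cases are forwarded to the library; only the new case is
   written by the user. *)
let rec count_leaves_ext : ext_node -> int = function
  | #base_node as n -> count_leaves_base count_leaves_ext n
  | `FunkyGroup ns ->
      List.fold_left (fun acc n -> acc + count_leaves_ext n) 0 ns

let small : node = `Group [ `Leaf `A; `Group [] ]
let big : ext_node =
  `FunkyGroup [ `Leaf `C; `Group [ `Leaf `A; `Leaf `B ] ]
```

The user still writes one forwarding line per function, but the body
of the old cases is not duplicated.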

> I've also tried using inheritance by deriving everything from an ABC
> "node". But this just replaces this problem with another problem. If
> the types of node are all derived from a "node" ABC then you can
> easily add new types but you can't easily add new (method) functions
> to all types.

The object layer of Caml is in my opinion rather subtle, and the only
case where I have needed it is adaptive programming (when a data
structure silently changes its representation).

Instead of writing

type tree =
  | TreeRepresentationOne ...
  | TreeRepresentationTwo ...

let insert x = function
  | TreeRepresentationOne t -> TR1.add x t
  | TreeRepresentationTwo t -> TR2.add x t

you use the object layer to coerce both representations to a common
supertype.
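
A small sketch of that idea: a set of integers that starts as a list
and silently becomes a balanced tree once it grows. The class type
int_set, the repr method (present only so the switch can be observed),
and the threshold value are all assumptions made for the example.

```ocaml
module IntSet = Set.Make (struct
  type t = int
  let compare = compare
end)

(* The common object type that clients see, whatever the
   representation behind it. *)
class type int_set = object
  method add : int -> int_set
  method mem : int -> bool
  method repr : string   (* illustrative only *)
end

(* Representation two: balanced tree from the standard library. *)
let rec tree_set (s : IntSet.t) : int_set = object
  method add x = tree_set (IntSet.add x s)
  method mem x = IntSet.mem x s
  method repr = "tree"
end

let threshold = 4   (* arbitrary switch point *)

(* Representation one: unsorted list, converted when it gets large. *)
let rec list_set (l : int list) : int_set = object
  method add x =
    if List.mem x l then list_set l
    else if List.length l >= threshold then
      tree_set (IntSet.add x (IntSet.of_list l))
    else list_set (x :: l)
  method mem x = List.mem x l
  method repr = "list"
end

let empty : int_set = list_set []

let s4 = List.fold_left (fun (s : int_set) x -> s#add x) empty [1; 2; 3; 4]
let s5 = s4#add 5
```

Client code only ever calls add and mem, so no pattern matching on
the representation is needed and new representations can be added
without touching it.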

> Is factoring out as much code as possible the best I can do, or is
> there a better way to approach this problem ?

It would be easier if you gave us a more detailed example. Anyway, in
my opinion you should first try the simple solutions (polymorphic data
structures).

        Diego Olivier
