caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Extracting common information
@ 2007-05-24 11:04 Niko Matsakis
  2007-05-24 11:31 ` [Caml-list] " Romain
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Niko Matsakis @ 2007-05-24 11:04 UTC (permalink / raw)
  To: caml-list

Hello,

Perhaps this is a beginner question.  If it's not appropriate for the  
list, I apologize in advance.  I am working on a simple compiler in  
Ocaml, and having some difficulty settling on the best design for my  
AST.

My initial attempt was to use a lot of variant types, like so:

> type type_tree = Primitive of string | Pointer of type_tree
> and  expr_tree = Unary of string * expr_tree |
>                  Binary of string * expr_tree * expr_tree
> (etc)

This seemed like it should work reasonably well, but then I realized  
I would want to thread along some annotation to store the types, line  
numbers, and things like that.  Since these annotations will change  
between compiler passes, I thought I would use a variant type like so:

> type 'ud type_tree = Primitive of 'ud * string |
>                      Pointer of 'ud * 'ud type_tree
> and 'ud expr_tree = Unary of 'ud * string * 'ud expr_tree |
>                     Binary of 'ud * string * 'ud expr_tree * 'ud  
> expr_tree
> etc

but then I ran into the problem that I sometimes want to be able to  
extract the user data without reference to the type of tree involved,  
and I can see no easy way to do that beyond:

> let ud_type type =
>   match type with
>     Primitive(ud,_) ud |
>     Pointer(ud,_) ud
> etc

Obviously, that is unappealing.

So, I am not sure what the right approach is here.  I could use  
objects, and then use subtyping (or structural subtyping?) to have a  
common base type with the userdata, but this seems significantly more  
verbose.  Also, I'm trying to explore alternatives to OOP (though if  
there is no compelling alternative, .

Various examples that I have looked at don't seem to really solve  
this problem: I guess they just try to structure things so that they  
don't ever need to extract the user data by itself.  This is an  
alternate approach that I could take, perhaps my problem is merely a  
failure of the imagination.  Another option, obviously, is not to  
embed the userdata in the variant type, but instead outside, by  
replacing any reference (for example) to type_tree with ('ud *  
type_tree).  This would be ok.

One example of when I want this ability is shown in the "Pointer"  
type constructor above: in this case, when making a pointer type I  
initially want the user data to be the same (as it contains the line  
number, let's say), but later, the user data may be different.   
Since, in my parser, I just have a "type" reduction that yields some  
kind of type_tree, it is hard to extract the user data.


thanks for any tips!

Niko


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Extracting common information
  2007-05-24 11:04 Extracting common information Niko Matsakis
@ 2007-05-24 11:31 ` Romain
  2007-05-24 11:44 ` Bünzli Daniel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Romain @ 2007-05-24 11:31 UTC (permalink / raw)
  To: Niko Matsakis; +Cc: caml-list

Hi,

Note that you can replace your pattern-matching with something like:

let ud_type type =
   match type with
     Primitive(ud,_)
   | Pointer(ud,_) -> ud

But although it is more factorized (which is good) it doesn't solve your 
problem.

One solution which I like better is something like:

type userdata = ... whatever you want (code location...) ...
type 'a node = {
   user_data: userdata;
   node: 'a;
}
type type_tree =
   Primitive of string
| Pointer of type_tree node
type expr_tree =
   Unary of string * expr_tree node
| Binary of string * expr_tree node * expr_tree node

Now instead of matching on a type_tree x, you'll match on x.node. And if 
you want the user data you just use x.user_data.

Hope it helps.

-- 
Romain Bardou


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Extracting common information
  2007-05-24 11:04 Extracting common information Niko Matsakis
  2007-05-24 11:31 ` [Caml-list] " Romain
@ 2007-05-24 11:44 ` Bünzli Daniel
  2007-05-24 14:09 ` skaller
  2007-05-24 14:23 ` Stephen Weeks
  3 siblings, 0 replies; 7+ messages in thread
From: Bünzli Daniel @ 2007-05-24 11:44 UTC (permalink / raw)
  To: Niko Matsakis; +Cc: caml-list

Another solution is to use polymorphic variants [1] :

type 'ud type_tree = [ `Primitive of 'ud * string |
                      `Pointer of 'ud * 'ud type_tree ]
type 'ud expr_tree = [ `Unary of 'ud * string * 'ud expr_tree |
                     `Binary of 'ud * string * 'ud expr_tree * 'ud  
expr_tree ]

type 'ud any_tree = [ 'ud type_tree | 'ud expr_tree ]

let ud = function
| `Primitive (ud, _) | `Pointer (ud, _) | `Unary (ud, _, _) | `Binary  
(ud, _, _, _) -> ud

Daniel

[1] http://caml.inria.fr/pub/docs/manual-ocaml/manual006.html#htoc41


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Extracting common information
  2007-05-24 11:04 Extracting common information Niko Matsakis
  2007-05-24 11:31 ` [Caml-list] " Romain
  2007-05-24 11:44 ` Bünzli Daniel
@ 2007-05-24 14:09 ` skaller
  2007-05-24 14:23 ` Stephen Weeks
  3 siblings, 0 replies; 7+ messages in thread
From: skaller @ 2007-05-24 14:09 UTC (permalink / raw)
  To: Niko Matsakis; +Cc: caml-list

On Thu, 2007-05-24 at 13:04 +0200, Niko Matsakis wrote:
> Hello,

> > type type_tree = Primitive of string | Pointer of type_tree
> > and  expr_tree = Unary of string * expr_tree |
> >                  Binary of string * expr_tree * expr_tree

> This seemed like it should work reasonably well, but then I realized  
> I would want to thread along some annotation to store the types, line  
> numbers, and things like that.  Since these annotations will change  
> between compiler passes, I thought I would use a variant type like so:
> 
> > type 'ud type_tree = Primitive of 'ud * string |
> >                      Pointer of 'ud * 'ud type_tree
> > and 'ud expr_tree = Unary of 'ud * string * 'ud expr_tree |
> >                     Binary of 'ud * string * 'ud expr_tree * 'ud  
> > expr_tree
> > etc
> 
> but then I ran into the problem that I sometimes want to be able to  
> extract the user data without reference to the type of tree involved,  
> and I can see no easy way to do that beyond:
> 
> > let ud_type type =
> >   match type with
> >     Primitive(ud,_) ud |
> >     Pointer(ud,_) ud
> > etc
> 
> Obviously, that is unappealing.

That doesn't matter: what you have done is the best way, except
the type for source references should be concrete, not a parameter.

Although you could factor this by separating the source
location from the term .. that's a bad idea because you will
be tempted to throw out the source location when analysing
the terms .. which prevents you emitting good errors.

Putting the source location directly in the terms FORCES
you to calculate a reasonable location for every term you
synthesise by a rewriting rule (via the type system), 
which is a good thing.

There is another way though .. just add a term like:

	`Source of src_loc * tree

which says 'my child was produced from this source location'.
I don't recommend this though, precisely because it can be used
wherever you like .. or not used at all.


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Extracting common information
  2007-05-24 11:04 Extracting common information Niko Matsakis
                   ` (2 preceding siblings ...)
  2007-05-24 14:09 ` skaller
@ 2007-05-24 14:23 ` Stephen Weeks
  2007-05-24 18:00   ` Peter Ilberg
  3 siblings, 1 reply; 7+ messages in thread
From: Stephen Weeks @ 2007-05-24 14:23 UTC (permalink / raw)
  To: Niko Matsakis; +Cc: caml-list

> I am working on a simple compiler in Ocaml, and having some difficulty
> settling on the best design for my AST.
...
> then I realized I would want to thread along some annotation to store the
> types, line numbers, and things like that.  Since these annotations will
> change between compiler passes,

I ran in to the same design problem when writing MLton.  My solution was to use
lisp-style property lists, but with appropriate ML static typing.  See

  http://mlton.org/PropertyList

The essence of the solution is to use a dynamically extensible sum type.  Each
tree node holds a list ref of values in the sum type.  Each compiler pass builds
a new variant of the sum type and stores whatever it needs in the list.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Extracting common information
  2007-05-24 14:23 ` Stephen Weeks
@ 2007-05-24 18:00   ` Peter Ilberg
  2007-05-28 19:00     ` Niko Matsakis
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Ilberg @ 2007-05-24 18:00 UTC (permalink / raw)
  To: Niko Matsakis; +Cc: caml-list

On Thu, 24 May 2007, Niko Matsakis <niko@alum.mit.edu> wrote:
> I am working on a simple compiler in Ocaml, and having some difficulty
> settling on the best design for my AST.
[...]
> then I realized I would want to thread along some annotation to store  
> the types, line numbers, and things like that.

Simon Peyton Jones and David Lester discuss some options for an AST  
datatype
with annotations in section 6.2 (page 225) of "Implementing functional
languages: a tutorial". (You might have to read chapter 1 for background  
info.)
They use Haskell, but their solution should be relatively easy to port to
Ocaml, if you decide to use it.

You can find the book at:

http://research.microsoft.com/~simonpj/Papers/pj-lester-book/

--- Peter


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Extracting common information
  2007-05-24 18:00   ` Peter Ilberg
@ 2007-05-28 19:00     ` Niko Matsakis
  0 siblings, 0 replies; 7+ messages in thread
From: Niko Matsakis @ 2007-05-28 19:00 UTC (permalink / raw)
  To: Peter Ilberg; +Cc: caml-list

Thanks all for your help.  I'll check out the various references.


Niko


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2007-05-28 19:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-05-24 11:04 Extracting common information Niko Matsakis
2007-05-24 11:31 ` [Caml-list] " Romain
2007-05-24 11:44 ` Bünzli Daniel
2007-05-24 14:09 ` skaller
2007-05-24 14:23 ` Stephen Weeks
2007-05-24 18:00   ` Peter Ilberg
2007-05-28 19:00     ` Niko Matsakis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).