caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* RE: Compiler translation of array indexing
@ 2000-01-19 17:33 Simon Peyton-Jones
  2000-01-19 22:50 ` Pierre Weis
  0 siblings, 1 reply; 7+ messages in thread
From: Simon Peyton-Jones @ 2000-01-19 17:33 UTC (permalink / raw)
  To: 'Pierre Weis', dmcclain; +Cc: caml-list

| > Instead of translating as in OCAML to Array.get and
| > Array.set, they could be translated into some equivalent to
| > 
| > x.(n)  ->  module_of(type_of x).get(x, n)
| 
| There are strong theoretical difficulties to obtain a type_of function
| embedded into a statically & strongly type checked language such as
| Caml. 

It's perhaps worth mentioning that one way of achieving
this effect is to use type-driven overloading, as Haskell
does.  One could say

	class Array a where
	  (.) :: a elt -> Int -> elt
	  ... more operations...

(Here 'a' is a type variable ranging over type constructors
of kind *->*.)

Now for each type (constructor) T that you want to make an
instance of Array, you would say

	instance Array T where
	  (.) = ...your implemention of indexing for T...


Such type classes are, however, fully implemented in Haskell.  
However, type classes (not to be confused with object oriented
classes as in OCaml) form quite a large design space, and its
interaction with OCaml classes and modules are quite unknown (to me
anyway).  


There's a paper about the type-class design space at
	http://research.microsoft.com/~simonpj/papers/multi.ps.gz

Simon




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Compiler translation of array indexing
  2000-01-19 17:33 Compiler translation of array indexing Simon Peyton-Jones
@ 2000-01-19 22:50 ` Pierre Weis
  2000-01-24 17:06   ` Andreas Rossberg
  0 siblings, 1 reply; 7+ messages in thread
From: Pierre Weis @ 2000-01-19 22:50 UTC (permalink / raw)
  To: Simon Peyton-Jones; +Cc: caml-list

> It's perhaps worth mentioning that one way of achieving
> this effect is to use type-driven overloading, as Haskell
> does.  One could say
> 
> 	class Array a where
> 	  (.) :: a elt -> Int -> elt
> 	  ... more operations...
> 
> (Here 'a' is a type variable ranging over type constructors
> of kind *->*.)

Thank you for the interesting information about Haskell.

As you suggested by your sketch program, the Haskell type classes are
undoubtedly a powerful and complex mechanism. 
However, it has some strange limitations that always seems extremely
``bizarre'' to me: for instance you may think you have a polymorphic
equality but this polymorphic equality cannot compare two polymorphic
objects (e.g. [] == [] is ill-typed). Strange, isn't it ?

For the example we are examining, I'm afraid that once more this
cannot be handled as desired by the Haskell type classes. In effect,
what we want in the first place (at least in Caml) is to get rid of
the two different syntactic notations for string and array accesses
(When we can do so, we can start to dream to an extension of the
mechanism to arbitrary vectors like data types).

Unfortunately, to the best of my knowledge, even if Haskell classes
can afford a zillion of instances of (.) for fancy vectors data types,
they just fail to handle the overloading of v.(0) and s.[0]: you
cannot declare string to belong to the class ``Array'' (since as you
carefully told us 'a' must have kind *->*, which is not the kind of
the string data type).

In the same vein, the next overloading I would want for .() would
certainly be for associative arrays (or hash tables in the Caml and
usual terminology). In this case, I want to declare that the (binary)
data type constructor ('a, 'b) hash_table as a member of the ``Array''
class. I don't think this is well-kinded.

Then, the following step would be to use the notation to access in lists.

To summerize, I really would like to write
 v.(1) instead of Array.get v 1
 s.(1) instead of String.get s 1

I would consider as an additional benefit of the new type system, if I
could use:
 t.("ok") instead of Hashtbl.find t "ok",
 l.("ok") instead of List.assoc "ok" l.

Unfortunately, I don't think that Haskell classes can handle the first
group, neither the second one; it just allows a strange mixture of the
two, since only the first and the last example can be declared as
member of the same class (and this would define list as a member of
``Array''!). Strange, isn't it ?

However, I just read the inference rules of the Haskell class system,
to figure out how overloading was resolved (tricky isn't it?). So I
never got an intuition of the class abstraction. So please, correct me
if I made false inferences about the class system.

In addition, since you are a true specialist of Haskell type classes,
may be you could tell us if there is some secret feature that you know
of in the Haskell type classes system that could elegantly deal with
these simple, useful, and natural examples ?

Sincerely,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Compiler translation of array indexing
  2000-01-19 22:50 ` Pierre Weis
@ 2000-01-24 17:06   ` Andreas Rossberg
  0 siblings, 0 replies; 7+ messages in thread
From: Andreas Rossberg @ 2000-01-24 17:06 UTC (permalink / raw)
  To: caml-list

Pierre Weis wrote:
> 
> To summerize, I really would like to write
>  v.(1) instead of Array.get v 1
>  s.(1) instead of String.get s 1
> 
> I would consider as an additional benefit of the new type system, if I
> could use:
>  t.("ok") instead of Hashtbl.find t "ok",
>  l.("ok") instead of List.assoc "ok" l.

Recent work by Mark Jones tries to make this work in Haskell/Hugs by
extending multi parameter type classes with dependency annotations:

	http://www.cse.ogi.edu/~mpj/fds.html

Using dependency annotations, one can declare a class like (in some
mixed Haskell/Caml pseudo syntax):

        class Index 'c 'i 'e | 'c -> 'i 'e where
                .() : 'c -> 'i -> 'e

(note that all parameters have kind *) and instances

        instance Index ('a array) int 'a
        instance Index string int char
        instance Index (('a,'b) Hashtbl.t) 'a 'b
        instance Index (('a*'b) list) 'a 'b

These should provide the desired type-driven behaviour for indexing. The
dependency annotation ('c -> 'i 'e) in the class ensures that index and
element type are determined by the collection type. For example, the
compiler could infer the following:

	[|1;2;3|].(0)         : int
	"hello".(3)           : char
	[(2,39);(4,0)].(2)    : int
	fun i -> "hello".(i)  : int -> char

Note however, that overloading can quickly become ambiguous, even with
dependencies. While

	let f x = x.(0)

would be possible and have qualified type like

	(Index 'c int 'e) => 'c -> 'e,

the only slightly more involved

	let g x = x.(0).(0)

would not be allowed without additional type annotations since

	(Index 'c1 int 'c2, Index 'c2 int 'e) => 'c1 -> 'e

is an ambiguous type.

Best regards,

        - Andreas



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Compiler translation of array indexing
  2000-01-19 17:08   ` Markus Mottl
@ 2000-01-19 21:17     ` Pierre Weis
  0 siblings, 0 replies; 7+ messages in thread
From: Pierre Weis @ 2000-01-19 21:17 UTC (permalink / raw)
  To: Markus Mottl; +Cc: caml-list

> The article says that the name of types and constructors is dropped for the
> conversion to the interchange format. But what about types like:
> 
>   type t = Foo | Bar
> 
> How are such cases treated? Another program might define this as
> 
>   type t = Bar | Foo
> 
> and possibly means the same thing with the same constructors, however, the
> internal representation might be completely different (maybe due to order
> of constructors).

1) Internal representations are indeed absolutely similar and
compatible, and this is carefully stated as a fundamental hypothesis
about the value representation algorithm of the compiler, in order to
prove the safety of our value I/O mechanism.

2) One can imagine that the Caml compiler sorts the names of
constructors (and labels) when a type is defined, to have a more
canonical representation of the values of data types.

3) Anyway, wih the current Caml compilers, since the representations
of Foo and Bar are indeed valid values of each of your t type, they
can safely be read as a value of the other type.

4) If you want to ensure a strong semantic property on the data you
are exchanging, you must use an abstract data type: in this
case a unique footprint is carried along with the values that are
exchanged, and this footprint can ensure some invariant properties of
those values.

> Are there primitives for disambiguating such cases? E.g. my program reads
> in data from another program, I see that it gets the constructors the wrong
> way round so I just use some primitive to get it right again?

This is not implemented. I don't know if this can really be done
polymorphically. This is further research :)

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Compiler translation of array indexing
  2000-01-19 14:25 ` Pierre Weis
@ 2000-01-19 17:08   ` Markus Mottl
  2000-01-19 21:17     ` Pierre Weis
  0 siblings, 1 reply; 7+ messages in thread
From: Markus Mottl @ 2000-01-19 17:08 UTC (permalink / raw)
  To: Pierre Weis; +Cc: OCAML

> This work had also produced a safe value I/O system for Objective
> Caml, that is a fully typechecked and safe polymorphic input/output
> set of primitives for Objective Caml.
> The design and implementation is described into the following
> (forcoming) article : http://pauillac.inria.fr/~weis/articles/jfla2000.ps.Z

Wow! This is really interesting stuff! I don't know much French, but if I
get the main ideas of the article, the techniques presented in it give
quite some new perspectives to programming in statically typed languages in
general and in OCaml specifically.

So far, one of the valid arguments of people using dynamically typed
languages is that I/O in statically typed languages is very inflexible.
The implementation of the ideas presented in this article would, however,
provide a strong counter argument: then we would not only have flexible,
but also *safe* I/O!

Since it is very time consuming for me to read the French version, I'd like
to ask a short question here:

The article says that the name of types and constructors is dropped for the
conversion to the interchange format. But what about types like:

  type t = Foo | Bar

How are such cases treated? Another program might define this as

  type t = Bar | Foo

and possibly means the same thing with the same constructors, however, the
internal representation might be completely different (maybe due to order
of constructors).

Of course, as long as all constructors take parameters of different types
like in

  type t = String of string | Int of int | ...

there is always a unique translation.

Are there primitives for disambiguating such cases? E.g. my program reads
in data from another program, I see that it gets the constructors the wrong
way round so I just use some primitive to get it right again?

> We also plan to distribute Jun's implementation in the near future, to
> let you play with it.

This would be great! I am looking forward to this!

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Compiler translation of array indexing
  2000-01-18 16:34 David McClain
@ 2000-01-19 14:25 ` Pierre Weis
  2000-01-19 17:08   ` Markus Mottl
  0 siblings, 1 reply; 7+ messages in thread
From: Pierre Weis @ 2000-01-19 14:25 UTC (permalink / raw)
  To: David McClain; +Cc: caml-list

> Instead of translating as in OCAML to Array.get and
> Array.set, they could be translated into some equivalent to
> 
> x.(n)  ->  module_of(type_of x).get(x, n)

There are strong theoretical difficulties to obtain a type_of function
embedded into a statically & strongly type checked language such as
Caml. A (limited) form of this type_of facility is the entire subject of
the following reserch paper:

ftp://ftp.inria.fr/INRIA/Projects/cristal/Pierre.Weis/generics.dvi.Z

This is also the subject of Jun Furuse's PHD: he is working since 1996
to implement this type-checking discipline into a working O'Caml
compiler. We also made some progress on the theoretical side, compared
to the above reference, and all these results will be reported in Jun's
thesis (strongly expected before this summer).

This work had also produced a safe value I/O system for Objective
Caml, that is a fully typechecked and safe polymorphic input/output
set of primitives for Objective Caml.
The design and implementation is described into the following
(forcoming) article : http://pauillac.inria.fr/~weis/articles/jfla2000.ps.Z
(Limited to those of you who can read french, since it has not yet
been translated into english, sorry for that.)

We also plan to distribute Jun's implementation in the near future, to
let you play with it.

Best regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Compiler translation of array indexing
@ 2000-01-18 16:34 David McClain
  2000-01-19 14:25 ` Pierre Weis
  0 siblings, 1 reply; 7+ messages in thread
From: David McClain @ 2000-01-18 16:34 UTC (permalink / raw)
  To: caml-list

I would like to suggest a slightly more expressive form for the translation
of array indexing operators x.(..) and x.[..]. In Dylan these get translated
into message sends to generic function "element" (or "aref" in the case of
multiple indices). Instead of translating as in OCAML to Array.get and
Array.set, they could be translated into some equivalent to

x.(n)  ->  module_of(type_of x).get(x, n)

This would allow us to develop modules containing refined versions of get,
set, unsafe_get, and unsafe_set to do whatever would be appropriate for the
object being indexed. Currently, I don't know of any operators named
"module_of" and "type_of" and these might in fact not make much sense given
the vast optimizations performed in the compiler -- there are not unique
tags for disparate data types since the language is type checked at compile
time.

Just some thoughts and wishes...

David McClain, Sr. Scientist
Raytheon Systems Co.
Tucson, AZ





^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2000-01-25  8:39 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-01-19 17:33 Compiler translation of array indexing Simon Peyton-Jones
2000-01-19 22:50 ` Pierre Weis
2000-01-24 17:06   ` Andreas Rossberg
  -- strict thread matches above, loose matches on Subject: below --
2000-01-18 16:34 David McClain
2000-01-19 14:25 ` Pierre Weis
2000-01-19 17:08   ` Markus Mottl
2000-01-19 21:17     ` Pierre Weis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).