caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* RE: reference initialization
@ 2000-05-11 14:28 Stephanie Weirich
  2000-05-12 20:38 ` Hongwei Xi
  0 siblings, 1 reply; 26+ messages in thread
From: Stephanie Weirich @ 2000-05-11 14:28 UTC (permalink / raw)
  To: 'Hongwei  Xi', 'caml-list@inria.fr'



> -----Original Message-----
> From: Hongwei Xi [mailto:hwxi@ececs.uc.edu]
> Sent: Wednesday, May 10, 2000 12:50 AM
> To: caml-list@inria.fr
> Subject: reference initialization
> 
> 
> > Wrong. You have references, which are quite better than pointers
> > (they are typed, and necessarily initialized)
> 
> I have given some thoughts on this one.
> 
[... parts elided ...]
> 
> Can you still say that the ML strategy is better than the Java
> strategy? I thus argue that it is better using dynamic checking
> to detect reading from uninitialized reference than simply
> assigning a value to every reference upon its creation.
> 
> To summarize, my bottom line question is: what is really achieved
> by assigning a reference a (wrong) initial value? Isn't this just
> like an ostrich solving its problem by burying its head in sand?
> 
> Of course, another problem with the ML strategy is efficiency loss
> (which, though, is often negligible as discussed here before)

The real difference between ML references and Java pointers is that ML
separates "reference" from "possibly unitialized", while Java combines the
two into one construct. It's not to difficult to implement Java-style
pointers in ML, just use an option ref. For example:

type 'a ptr = a' option ref
exception NullPointer
let new () = ref None
let get x = match !x with Some y -> y | None -> raise NullPointer
let set x y = x := Some y 

ML, of course, lacks the syntactic support to use these pointers as
gracefully as Java can. On the other hand, the problem with _Java_ is
efficiency loss, as the programmer cannot syntactically enforce that the
reference is initialized -- requiring a null check at every use.

-Stephanie




^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: reference initialization
@ 2000-05-20 20:13 Simon Peyton-Jones
  2000-05-22 16:10 ` Pierre Weis
  0 siblings, 1 reply; 26+ messages in thread
From: Simon Peyton-Jones @ 2000-05-20 20:13 UTC (permalink / raw)
  To: Pierre Weis, hwxi; +Cc: caml-list

| val initialize :
|  int -> 'a -> ((int -> 'a) -> (int -> 'a -> unit) -> unit) -> 'a array
|  [initialize n x f] returns a fresh array of length [n],
|  with elements initialized by function [f].
|  All the elements of the new array must be assigned once and only
|  once by the function [f]. [f] received two functions as arguments,
|  one to access elements of the new array, and the other to set the
|  elements of the new array. [f] can access to element [i] of the new
|  array provided [f] has already properly initialized element [i].

This is all good stuff.  It might be worth mentioning Haskell's approach
in this context.  The main array former is

	array :: Ix ix => (ix,ix) -> [(ix,elt)] -> Array ix elt

'array' takes an upper and lower bound, both of type 'ix'.  ix is a type
variable that must be bound to a type in class Ix, which supports indexing
operations.  So that I can avoid discussion of type classes, let's
specialise
array much as 'initialize' is above, to vectors indexed by Ints:

	array :: (Int,Int) -> [(Int,elt)] -> Array Int elt

So array takes an uppper and lower bound, and a list of (index,value) pairs;
it returns an array with each element filled in.  The list should mention
each element just once, but that is not statically checked.

In conjunction with list comprehensions, this gives rise to quite nice code.
For example, to tabulate a function we might write

	array (1,n) [(i, f i) | i <- [1..n]]

Referring to existing elements is easy via recursion:

	fibs = arrray (1,n) ([(1,1),(2,1)] ++
			       [(i, fibs!(i-1) + fibs!(i-2)) | i <- [3..n]])

Notice how easily the boundary cases are specified.

Laziness means that the compiler does not need to figure out which
order to do the initialisation. In a strict language it would be a little
harder -- or perhaps one could say "it's done in the order the index,value
pairs are given".


You may wonder about the efficiency issue.  After all, it doesn't seem
great to build an intermediate list just before constructing an array.
But a bit of short-cut deforestation does the trick, and eliminates the
intermediate list.  I have to admit that a lot of things have to Work Right
in the compiler to really get the for-loop you intended, but that's what
compilers are for.


None of this is specific to Haskell or to a lazy language.  Caml could
well use it.  I mention it on this list becuase it's an aspect of the
Haskell design that I think has worked particularly well, and which 
might be of use to Camlers.

A lot of the benefit comes from the re-use (for arrays) of the 
list comprehension notation.  I forget whether Caml has such notation,
but again, it's nothing to do with laziness and it's a notation which
is really useful.

Simon





^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: reference initialization
@ 2000-05-11 13:48 Dave Berry
  0 siblings, 0 replies; 26+ messages in thread
From: Dave Berry @ 2000-05-11 13:48 UTC (permalink / raw)
  To: Hongwei Xi, caml-list

IMO, the "ML strategy" is to use an option type.  SML has 
	datatype 'a option = NONE | SOME of 'a
although I prefer 
	datatype 'a option = NULL | VALUE of 'a.
I don't know if Caml provides such a type as standard, although
it's easy to define your own.

Then your code checks whether the value has been assigned (dynamically,
as you require), and can take whatever action is appropriate.  If you want
to throw an exception, you can.

With this approach, the ML type system tells you which values may be null.
This has an advantage over the Java approach, where any value may be
null or uninitialised.

Dave.

-----Original Message-----
From: Hongwei Xi [mailto:hwxi@ececs.uc.edu]
Sent: Wednesday, May 10, 2000 5:50 AM
To: caml-list@inria.fr
Subject: reference initialization


> Wrong. You have references, which are quite better than pointers
> (they are typed, and necessarily initialized)

Suppose I use a reference 'x'. If I know what the initial value
of 'x' should be, I'll of course prefer to initialize it with that
value. Now suppose I don't, that is, I intend to assign a value
to 'v' later (maybe in a loop or in a conditional branch)

(1) ML strategy: initialize x with any value 'v' of an appropriate
type (sometimes, such a 'v' is difficult to find or takes time to
construct (consider 'x' to be a large array)).




^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Caml wish list
@ 2000-04-25 18:16 Pierre Weis
  2000-05-10  4:50 ` reference initialization Hongwei Xi
  0 siblings, 1 reply; 26+ messages in thread
From: Pierre Weis @ 2000-04-25 18:16 UTC (permalink / raw)
  To: William Chesters; +Cc: caml-list

> Well, I can't remember offhand how SMJ/NJ handles overloading, but it
> seems to work reasonably well.  On the other hand, I am nowadays a
> convert to ocaml's hard line on overloading---it's an absolute curse
> of many, many C++ programs by coders whose enthusiasm outruns their
> judgement.

We think many Caml users feel that using different addition function
(+) and (+.) for integers and floats is a safe way of programming, but
really frustrating at the same time.

So do we. Therefore we are currently in the effort of adding some kind
of limited overloading to Caml. This is not a trivial problem, if you
want something really good both at the expressiveness and the
efficiency levels.

      *  *  * 

As you mentioned, SML provides a set of overloaded functions such as
(+), but it is quite limited. In addition, for the sake of efficiency
and simplicity of the typing and compilation schemes, the overloaded
functions in SML have a restricted ``functional'' status: you cannot
abstract pieces of code containing overloaded operators. Hence, you
cannot define derived overloaded functions using already existing
overloaded functions.

A simple example that novice SML users often encounter is the double
function using overloaded (+):

# let double = fun x -> x + x;;

In SML, this double function is not an overloaded function for
integers and floats, because the type of (+) must always be statically
deduced from the context, which is impossible here. This definition is
just either rejected (SML '90), or resolved using a default type
assignment for overloaded symbols (SML '97): then (+) is typed as its
default type int -> int -> int, and consequently double gets type int
-> int.

This way, the compiler always has the precise type of use for each
occurrence of (+). Therefore occurrences of (+) can be replaced by 
the corresponding primitives, for example:

        # 1 + 2            ====>   add_int 1 2
        # 1.0 + 2.0        ====>   add_float 1.0 2.0
        # fun x -> x + x   ====>   fun x -> add_int x x

Therefore overloaded functions are just "typeful macros" in SML.

We think this is one reasonable solution, since the users will feel
much less frustration than using the separated functions (+) and (+.),
and the compilation for these overloaded functions is straightforward
and has no run time overhead.

      *  *  * 

However, we are not promoting this simple overloading facility,
because we think the ``abstraction restriction'' to be unnatural in a
functional setting: even if the definition of new overloaded symbols
is prohibited, the abstraction over overloaded operators (as in double
above) should be possible in higher-order functional languages like
SML or Caml!

Our "extensional polymorphism" framework is a bit more complicated
than the ``default assigment'' static scheme, but it provides
abstraction and "real" overloaded functions. (By the way, we call
those "generic" functions.) Using extensional polymorphism, the double
function is polymorphic, being defined for ``any argument whose type
is suitable for (+)''. Thus, you can use it for integers and floats!

# let double = fun x -> x + x;;
val double : $a -> $a = <fun>

# double 1;;
- : int = 2 

# double 3.14;;
- : float = 6.28

We use special type variables $a, $b, ..., called dynamic type
variables, to denote type parameters abstracted into type schemes for
polymorphic generic functions. In the definition of double, the
context of (+) is left unresolved, and double becomes polymorphic. The
compilation of double abstracts the type context for the internal (+),
and dynamically passes it to (+) to select the appropriate addition
primitive.

You may worry about efficient compilation of this feature, since, as
examplify by the double generic function, some information from the
typing context has to be passed at run time, and hence double involves
an extra argument (a type) to be provided to (+) at run time, and also
an additional pattern matching selection to apply the proper addition
primitive; undoubtedly, the generic double function is a bit slower
than its direct non-overloaded counterpart definitions for integers
and floats, like

# let double_int x = x + x;;
# let double_float x = x +. x;;

However, writing those two definitions means twice more code, hence
twice more possibilities of bugs, twice more maintenance, and twice
more identifiers to remember. This simplification is worth a slight
efficiency loss ...

Note also that usual simple overloading of operators is still as
efficient as it can be, since static type annotations allow the
inlining of the corresponding primitives, at least for the predefined
set of usual arithmetic operators (+, -, *, ...).

Polymorphic uses of overloaded operators in generic functions need an
extra type book keeping for dynamic resolution, but there is no
possible efficiency comparison with other compilation schemes, since
these functions are completely new in ML and cannot be expressed
neither in SML nor in Caml.

As a side effect of the extensional polymorphism discipline, we
manipulate types at run time, and this provides various benefits and
new features to the lasnguage:

        * safe value I/O (persistent value I/O between different
          programs, or safe usage of input_value/output_value),
        * dynamic values,
        * some limited polymorphic print facility,
        * a new kind of computation mixing types and values ...

The current prototype is an extension of the 2.04 version of Objective
Caml; we are planning to release it when it will be fully stable, like
the label extension in 2.99. We don't know yet which version of
Objective Caml it will be 3.0, 3.x or 3.99...

Hope this helps,

Jun Furuse & Pierre Weis





^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2000-05-24  8:05 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-05-11 14:28 reference initialization Stephanie Weirich
2000-05-12 20:38 ` Hongwei Xi
2000-05-15  8:49   ` Xavier Leroy
2000-05-15 17:47     ` Hongwei Xi
2000-05-15 21:33       ` Pierre Weis
2000-05-16  2:53         ` Hongwei Xi
2000-05-18 16:16           ` Pierre Weis
2000-05-19  6:54             ` Max Skaller
2000-05-22 15:28               ` Pierre Weis
2000-05-22 22:29                 ` Max Skaller
2000-05-15 22:20       ` Dave Mason
2000-05-15  9:36   ` Eijiro Sumii
  -- strict thread matches above, loose matches on Subject: below --
2000-05-20 20:13 Simon Peyton-Jones
2000-05-22 16:10 ` Pierre Weis
2000-05-11 13:48 Dave Berry
2000-04-25 18:16 Caml wish list Pierre Weis
2000-05-10  4:50 ` reference initialization Hongwei Xi
2000-05-11 13:58   ` Pierre Weis
2000-05-11 18:59     ` Hongwei Xi
2000-05-12 17:07       ` Pierre Weis
2000-05-12 19:59         ` Hongwei Xi
2000-05-15  6:58           ` Max Skaller
2000-05-15 17:56             ` Hongwei Xi
2000-05-14 14:37         ` John Max Skaller
2000-05-13  7:07       ` Daniel de Rauglaudre
2000-05-13  7:09       ` Daniel de Rauglaudre
2000-05-11 16:02   ` John Prevost

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).