From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: weis
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id WAA12183 for caml-redistribution; Sun, 10 Oct 1999 22:23:23 +0200 (MET DST)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id SAA11452 for <caml-list@pauillac.inria.fr>; Sun, 10 Oct 1999 18:11:13 +0200 (MET DST)
Received: from igw3.watson.ibm.com (igw3.watson.ibm.com [198.81.209.18])
	by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id SAA23159
	for <caml-list@inria.fr>; Sun, 10 Oct 1999 18:11:10 +0200 (MET DST)
From: chet@watson.ibm.com
Received: from mailhub.watson.ibm.com (mailhub.watson.ibm.com [9.2.250.97])
	by igw3.watson.ibm.com (8.9.3/8.9.3/05-14-1999) with ESMTP id MAA11790;
	Sun, 10 Oct 1999 12:11:10 -0400
Received: from nautilus.chet.org (lig32-226-139-164.us.lig-dial.ibm.com [32.226.139.164]) by mailhub.watson.ibm.com (8.8.7/Feb-20-98) with ESMTP id MAA21160; Sun, 10 Oct 1999 12:11:08 -0400
Received: from bismarck.chet.org (root@bismarck.chet.org [192.168.10.15])
	by nautilus.chet.org (8.8.8/8.8.8/Debian/GNU) with ESMTP id MAA09728;
	Sun, 10 Oct 1999 12:12:51 -0400
Received: from bismarck.chet.org (chet@localhost [127.0.0.1])
	by bismarck.chet.org (8.8.8/8.8.8/Debian/GNU) with ESMTP id MAA16130;
	Sun, 10 Oct 1999 12:10:18 -0400
Message-Id: <199910101610.MAA16130@bismarck.chet.org>
To: John Prevost <prevost@maya.com>
cc: chet@watson.ibm.com
cc: skaller <skaller@maxtal.com.au>, caml-list@inria.fr
Subject: Re: Proposal for study: Add a categorical Initial type to ocaml 
In-reply-to: Your message of 09 Oct 1999 18:43:35 EDT.
             <ya2n1tsb1ew.fsf@po.maya.com> 
Date: Sun, 10 Oct 1999 12:10:18 -0400
Sender: weis


After some thought, it occurred to me that, given ZINC's
representation of objects, it should be possible to add a new
type-constructor,

type 'a unboxed_option =
  ubSome of 'a
| ubNone

wherein ubSome is represented by the identitty function in the code,
and ubNone by NULL -- zero.

If I remember right, there is currently no use for NULL pointers in
CAML.  So all we need to do is to write down the proper low-level
lambda-language expressions for each of the constructor and destructor
operations, as well as for equality-checking, and we're done (I
think?).

[[ubSome]] == [lam v]v

[This is really equal to "if v == NULL then NULL else v", as explained
below.]

[[ubNone]] == 0

[[match e with
  ubSome x -> B1
| ubNone -> B2]]

 ==

  let v1 = [[e]]
  in if v1 == NULL then [[B2]]
  else let x = v1
       in [[B1]]


Equality is taken care of by just doing what we do today, but making
sure that NULL == NULL.

I haven't thought this thru enough to believe that I could *prove*
that well-typed programs don't go wrong, but I do *believe* it.  We
can think of the GC maintenance bit as being an implicit discriminator
used to distinguish between the two cases -- ubSome or ubNone -- for
*unboxed* values.  Likewise, for *boxed* values, since no boxed value
can be *null* (all boxed values are really represented in the heap,
even -- especially -- nullary constructors) the null-ness of the
ubNone value is used to discriminate.

It is a happy coincidence that you don't have to strip off that
discriminator "tag" when you destructure, and that you don't have to
add the "tag" when you construct.

Going further, things like ('a unboxed_option) unboxed_option are
somewhat flawed, but they do work, (as they must, if this is going to
work) -- of three possibilities,

(a)  ubSome(ubNone)
(b)  ubSome(ubSome x)
(c)  ubNone

at construction time, (a) and (c) collapse, together, but everything
is still well-typed.

So there is a behavioural difference in programs -- they can detect
that unboxed_option is actually unboxed, by using the fact that these
two cases collapse.  But seems like a small price to pay for something
which gives you a rather useful feature of systems-programming
languages.

Comments,
--chet--