From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: weis Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA32030 for caml-redistribution; Mon, 11 Oct 1999 19:32:15 +0200 (MET DST) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id OAA19072 for ; Mon, 11 Oct 1999 14:34:15 +0200 (MET DST) Received: from isil.maya.com (isil.maya.com [192.70.254.5]) by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id OAA20208 for ; Mon, 11 Oct 1999 14:34:11 +0200 (MET DST) Received: (from prevost@localhost) by isil.maya.com (8.9.3/8.9.3) id IAA03210; Mon, 11 Oct 1999 08:40:08 -0400 Sender: weis To: skaller Cc: caml-list@inria.fr Subject: Re: Proposal for study: Add a categorical Initial type to ocaml References: <37FC6572.719AD12D@tsc.uc3m.es> <37FE1E1F.4790FE85@maxtal.com.au> <38002EBF.DF0EF779@maxtal.com.au> From: John Prevost Date: 11 Oct 1999 08:40:08 -0400 In-Reply-To: skaller's message of "Sun, 10 Oct 1999 16:14:23 +1000" Message-ID: User-Agent: Gnus/5.070096 (Pterodactyl Gnus v0.96) Emacs/20.4 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii skaller writes: > > I have a question--how is this different (except for the efficiency of > > using null) from using a 'a option array? The array still must be > > initialized--in this case to all null, or in the unsafe case, well, > > it's unsafe. > > It isn't, in theory, it is exactly the same (and therefore sound). > The difference lies in the way it is represented: using the initial > the 'a option wrapping is done by the system, not the client programmer > (sort of like a Haskell monad) and therefore is more robust, more > efficient, and leaves the client code cleaner. This is much the > same argument as for exceptions. Hiding implementation details from the programmer does not always make code cleaner. You can have the monad comparison, though. I'll use it in a second. > > If anything, I would argue that this points at representing 'a option > > specially for already boxed values by using a null pointer for None, > > rather than having a special strange value that comes out of arrays > > and magically creates exceptions. > > That would make 'a option faster to use, but still require the > overhead of writing code that needless does matches where it is > known (dynamically) that only one of the cases can be used (such as > in a variable length array). Okay--so here's what I would do: "Inefficient" safe array implementation: type 'a exparray = { mutable ea_size : int; mutable ea_length : int; mutable ea_array : 'a array } let length a = a.ea_length let get a i = if i < a.ea_length then a.ea_array.(i) else raise (Invalid_argument "Earray.get") let set a i v = if i < a.ea_length then a.ea_array.(i) <- v else raise (Invalid_argument "Earray.set") let expand a n v = begin (if a.ea_length + n >= a.ea_size then let new_size = if n > a.ea_size then a.ea_size + n else a.ea_size * 2 in let new_arr = unsafe_create new_size in begin Array.blit a.ea_array 0 new_arr 0 a.ea_length; a.ea_array <- new_arr; a.ea_size <- new_size; end); Array.fill a.ea_array a.ea_length n v; a.ea_length <- a.ea_length + n end let create i = { ea_size = i; ea_length = 0; ea_array = unsafe_create i } let make i v = { ea_size = i; ea_length = i; ea_array = Array.make i v } Something like that. We can probably define unsafe_create like this: let unsafe_create i = (Obj.magic (Array.make i 0) : 'a array) (The above works, I've tested it. I highly recommend not trying to print out values of type 'a earray from the toplevel, though.) So this is the pretty unsafe version. There can't be a really unsafe version because not initializing the memory at all would play havoc with the GC. Now. What can we gather from this? First: by abusing Obj.magic, we can do what we want. Obviously, it's not strictly a nice thing to do, but if we can prove to ourselves that the code works right, then from the point of view of the API's client, everything's great. What else can we gather? Well, what would be different if this were implemented with options? There'd be the boxing inefficiency, of course. Anything else? Not really. We'd be guaranteeing something we can show to be true anyway, since that's why we were okay with using Obj.magic up there. So, is there a problem with my solution of working around the type system under the API level? > > Again, rather than introduce new features into the language, add a new > > unsafe optimization and make the option type more efficient for boxed > > values. > > The initial is not 'strange', > rather it is fundamental, the dual of the 'unit' type. No--it's strange. The reason it's strange is that it's something that can happen to a user that they don't expect. Like null in Java. There's nothing unsound about null (well, unless you consider that trying to call a method on null fails to be unsound, which I guess I probably do), but that doesn't mean it's not a pain in the ass. Why is it a pain in the ass? There's no way to turn it off. Like I've been saying, with option you can turn it off, with Obj.magic, the implementor ought to be damned sure he's doing things right. But with these special "uninitialized value" sorts of things, people like me who've gotten used to good type systems keep looking over their shoulders because they're afraid it might turn around and bite them in the butt. It's appropriate for exceptions to be thrown when out of bounds array accesses happen--although I'd rather it weren't. It's reasonable for the programmer to check and make sure this doesn't happen. It's not reasonable for the programmer to check that index i of an array that was passed in has actually been assigned before now. Much like it's not reasonable for the programmer to have to check that the argument is not null on a function that doesn't claim to accept null as an argument (my Java pet peeve). I really think adding anything that can cause more runtime errors is likely to be a wart, and no language needs more warts. Hrm. That last bit is too rambly, but I'll send this anyway. John.