From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 23234BB81 for ; Fri, 22 Jul 2005 16:27:00 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j6MEQxii024297 for ; Fri, 22 Jul 2005 16:26:59 +0200 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id QAA31052 for ; Fri, 22 Jul 2005 16:26:59 +0200 (MET DST) Received: from mail.physik.uni-muenchen.de (mail.physik.uni-muenchen.de [192.54.42.129]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j6MEQwwU024294 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Fri, 22 Jul 2005 16:26:58 +0200 Received: from localhost (unknown [127.0.0.1]) by mail.physik.uni-muenchen.de (Postfix) with ESMTP id 3E4502005A for ; Fri, 22 Jul 2005 16:26:58 +0200 (CEST) Received: from mail.physik.uni-muenchen.de ([127.0.0.1]) by localhost (mail.physik.uni-muenchen.de [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 04440-01-69 for ; Fri, 22 Jul 2005 16:26:56 +0200 (CEST) Received: from mailhost.cip.physik.uni-muenchen.de (kaiser.cip.physik.uni-muenchen.de [141.84.136.1]) by mail.physik.uni-muenchen.de (Postfix) with ESMTP id 1616F20008 for ; Fri, 22 Jul 2005 16:26:56 +0200 (CEST) Received: from katrin.cip.physik.uni-muenchen.de (katrin.cip.physik.uni-muenchen.de [141.84.136.12]) by mailhost.cip.physik.uni-muenchen.de (Postfix) with ESMTP id DB4BE26EDA for ; Fri, 22 Jul 2005 16:26:55 +0200 (CEST) Received: by katrin.cip.physik.uni-muenchen.de (Postfix, from userid 3092) id 777EE16BBA; Fri, 22 Jul 2005 16:26:55 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by katrin.cip.physik.uni-muenchen.de (Postfix) with ESMTP id 720AB22135 for ; Fri, 22 Jul 2005 16:26:55 +0200 (CEST) Date: Fri, 22 Jul 2005 16:26:55 +0200 (CEST) From: Thomas Fischbacher To: caml-list@inria.fr Subject: How to do this properly with OCaml? Message-ID: X-BOFH: Daemons did it MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: amavisd-new at physik.uni-muenchen.de X-Miltered: at concorde with ID 42E10233.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at concorde with ID 42E10232.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; ocaml:01 ocaml:01 binary:01 heap:01 heap:01 indirection:01 bool:01 bool:01 toplevel:01 indirection:01 arrays:01 arrays:01 rec:01 terminate:01 rec:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 I repeatedly stumble across problems where I would like to use a programming style which seems quite out of line with how one is supposed to use OCaml (to say it otherwise, it looks outright wrong and horrendously ugly), but which nevertheless might be just appropriate for the situation in question. So, I would like to hear the opinion of the OCaml community about this. Imagine constructing a binary heap of tuples. It is somewhat neat to be able to just use an array to store the entries of the heap which is pre-allocated to some fixed size and replaced by a larger array whenever more space is needed. The only thing known about the heap's entries at compile time is that they are bound to be tuples, hence boxed objects, but nothing else is known about their size or even type. As one does not have a prototype of such a tuple at the time the array is created, it seems to me as if the best thing one could do in OCaml would be: (1) Make an array of option and initially fill it with None. (2) Make an optional array of tuples which is None until the first entry is made. One drawback of approach (2) is that when we "remove" entries from the heap, the underlying array will keep unnecessary references to values which by chance might be quite big, and which we might want to be reclaimed by GC, so that's not too beautiful for a general-purpose application. The major drawback of (1), however, is that there is a further level of indirection for array entries, which means some unnecessary consing, as well as more work for the machine to get at a given entry. If I just define a function let whatever () = Obj.magic (1,2); then let a = Array.make 10 (whatever());; would give me more or less just what I want. An array of boxed things that I then would like to use as in: # let () = a.(2) <- (1,2,3,4) in a.(2);; - : int * int * int * int = (1, 2, 3, 4) # let () = a.(3) <- (5,8,2,1) in a.(2);; - : int * int * int * int = (1, 2, 3, 4) # a.(3);; - : int * int * int * int = (5, 8, 2, 1) Mind you - in this case, I will only assign int*int*int*int tuples to that array, or the result of the evaluation of whatever() when I want to kill an entry. Plus, I guarantee never to look at any entry that is set to whatever(). In some other situation, I might be inclined to use the same code, but with string*bool instead of int*int*int*int. But again, I will promise never to put anything else but string*bool into the underlying array, and never look at entries which are not set properly. (Which, of course, includes never printing such an array at the toplevel unless it is fully occupied.) Please don't tell me that "well, OCaml is not Lisp, after all". This I already know. But is there no proper way to get around that (technically speaking) unnecessary extra level of indirection that is forced upon me due to static type checking? The very same problem appears in a different guise when one tries to write a function that flattens 'a array array -> 'a array. The best I could come up with here is: let join_arrays aa = let len_total = Array.fold_left (fun sf a -> sf+Array.length a) 0 aa and nr_arrays = Array.length aa in if len_total=0 then if nr_arrays = 0 then [||] else aa.(0) (* ^ Take a close look! *) else (* Here, we face the problem that we have to get some value of the proper type to init the joined array with. *) let first_entry = (let rec seek pos = let array_here = aa.(pos) in if Array.length array_here = 0 then seek (pos+1) else array_here.(0) (* This is guaranteed to terminate, as len_total>0! *) in seek 0) in let result = Array.make len_total first_entry in let transfer_to_result arr offset = let nr = Array.length arr in for i = 0 to nr-1 do result.(i+offset) <- arr.(i) done in let rec populate nr_array_now offset = if nr_array_now = nr_arrays then result else let array_now = aa.(nr_array_now) in let () = transfer_to_result array_now offset in populate (1+nr_array_now) (offset+Array.length array_now) in populate 0 0 ;; Of course, it is pretty awkward having to scan for the first element in the first nonempty array in an array of arrays, especially as that element really does not play a that special role. Could anyone please tell me how to do all this in a more appropriate way? -- regards, tf@cip.physik.uni-muenchen.de (o_ Thomas Fischbacher - http://www.cip.physik.uni-muenchen.de/~tf //\ (lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y) V_/_ (if (= x 0) y (g g (- x 1) (* x y)))) n 1)) (Debian GNU)