From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 9D70DD45F for ; Fri, 4 Nov 2005 01:16:41 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jA40GfYa005727 for ; Fri, 4 Nov 2005 01:16:41 +0100 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id BAA23028 for ; Fri, 4 Nov 2005 01:16:40 +0100 (MET) Received: from conn.mc.mpls.visi.com (conn.mc.mpls.visi.com [208.42.156.2]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jA40Gdrr005724 for ; Fri, 4 Nov 2005 01:16:40 +0100 Received: from [192.168.42.2] (bhurt.dsl.visi.com [208.42.141.66]) by conn.mc.mpls.visi.com (Postfix) with ESMTP id C18B48120; Thu, 3 Nov 2005 18:16:34 -0600 (CST) Date: Thu, 3 Nov 2005 18:20:08 -0600 (CST) From: Brian Hurt X-X-Sender: bhurt@localhost.localdomain To: "Seth J. Fogarty" Cc: caml-list Subject: Re: [Caml-list] References, compact bollean values (and other questions) In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Miltered: at nez-perce with ID 436AA869.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 436AA867.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 mutable:01 mutable:01 pointers:01 integers:01 integers:01 unboxed:01 usefull:01 ocamlopt:01 booleans:01 bitfields:01 ocamlc:01 booleans:01 ocamlc:01 pervasives:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 On Thu, 3 Nov 2005, Seth J. Fogarty wrote: > First question: I notice references are implemented using mutable > records. Does this imply tuples of references are slower than mutable > records? > I.E. > type a = int ref * int ref * int ref > vs > type a = {mutable a : int; mutable b : int; mutable c : int} The mutable structure will almost certainly be less memory and faster. The tuple of structures will be a tuple of three pointers to three different mutable integers. The structure will simply be three integers, stored unboxed in the structure. A usefull trick for answering these sorts of questions is to use ocamlopt -S and look at the generated assembly code. > > Second: I need a non-varient data type. > It needs to store three integers and six mutable boolean. > I would like this to be done as compactly as possible, ideally in 128 > contiguious bits. The best way to store multiple booleans compactly and mutable is as bitfields in an int. As such, a structure like: type t = { x: int; y: int; z: int; mutable bitfield: int } is probably your best bet. Note that the above structure takes up only 40 bytes of memory on 64-bit systems, 20 bytes on 32-bit systems. > > Will either records or tuples store these contiguously? > If I use tuples and references, will ocamlc 'compact' booleans, or > give them 32 bits each? I think this is highly unlikely. > If I use records, will ocamlc 'compact' booleans, or give them 32 bits each? > If I use an integer to represent all six booleans, is there an easy > way to gain access to/flip a single bit? I can only find shifting > operations in pervasives Ocaml will generall give a full word to every individual variable, even booleans and chars. The pervasives module is automatically opened, so anything in it is automatically available to you. So you can write code like: let get_bool3 s = ((s.bitfield lsl 3) land 1) != 0;; Note that Ocaml will rather aggressively inline this function, so it's not that expensive. > > Third question: How aggressive/intelligent is the ocamlc inlining? > Should I worry about inlining functions that I expect to be used > millions of times? The general rule here is that you don't need to worry about it. One thing to remember is that function calls aren't that expensive. On modern hardware, I've timed function calls as costing about 1.5 clock cycles. This is compared to 100-350+ clock cycles for a cache miss. I wouldn't worry about it. Look up the post I made a couple of days ago, my list for what order you should optimize in. Frankly, it's sounding a lot like you're prematurely optimizing. Brian