From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.7 required=5.0 tests=AWL,MSGID_MULTIPLE_AT autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id 38408BBAF for ; Mon, 1 Jun 2009 16:21:45 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: An8CACSBI0pRZ90vkWdsb2JhbACXDIEDAQEBAQkLCgcRA7NLhAwF X-IronPort-AV: E=Sophos;i="4.41,284,1241388000"; d="scan'208";a="40888370" Received: from mtaout01-winn.ispmail.ntl.com ([81.103.221.47]) by mail4-smtp-sop.national.inria.fr with ESMTP; 01 Jun 2009 16:21:44 +0200 Received: from aamtaout03-winn.ispmail.ntl.com ([81.103.221.35]) by mtaout01-winn.ispmail.ntl.com (InterMail vM.7.08.04.00 201-2186-134-20080326) with ESMTP id <20090601142143.ZGSU6742.mtaout01-winn.ispmail.ntl.com@aamtaout03-winn.ispmail.ntl.com>; Mon, 1 Jun 2009 15:21:43 +0100 Received: from romulus.metastack.com ([81.102.132.77]) by aamtaout03-winn.ispmail.ntl.com (InterMail vG.2.02.00.01 201-2161-120-102-20060912) with ESMTP id <20090601142141.LHFL2093.aamtaout03-winn.ispmail.ntl.com@romulus.metastack.com>; Mon, 1 Jun 2009 15:21:41 +0100 Received: from COUNTERTENOR ([172.16.0.18]) (authenticated bits=0) by romulus.metastack.com (8.14.2/8.14.2) with ESMTP id n51ELcYB000730 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Mon, 1 Jun 2009 15:21:39 +0100 From: "David Allsopp" To: "'Dario Teixeira'" , Cc: References: <820784.66094.qm@web111511.mail.gq1.yahoo.com> In-Reply-To: <820784.66094.qm@web111511.mail.gq1.yahoo.com> Subject: RE: [Caml-list] Width subtyping Date: Mon, 1 Jun 2009 15:21:36 +0100 Message-ID: <008901c9e2c4$46975500$d3c5ff00$@allsopp@metastack.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 12.0 Thread-Index: AcniwLoQJv0Ec9iQRSSHfv8kH5B5hwAAXOyA Content-Language: en-gb Organization: MetaStack Solutions Ltd. X-Scanned-By: MIMEDefang 2.65 on 81.102.132.77 X-Cloudmark-Analysis: v=1.0 c=1 a=HfWiYCSjhmBpXBUCpGgA:9 a=G21yP0jLHg5A7ncdrXlnlVvSHRYA:4 a=Tx-4PAczoSwSQ5QU:21 a=VtQ0k0HsBM0yLv26:21 X-Spam: no; 0.00; subtyping:01 bytecode:01 ocamlopt:01 optimise:01 caching:01 compiler:01 bytecode:01 paging:98 wrote:01 caml-list:01 int:01 snip:02 width:97 binary:02 slower:02 Dario Teixeira wrote: > Thanks -- that is also an interesting solution. I'm guessing it will > be faster, though it might consume more memory in cases where only one > field is actually used. I'll have to try it side-by-side with the > object based solution to see how they compare in the real world with my actual > problem.. So as I wouldn't immediately be told "the overhead is irrelevant", I ran a "quick" benchmark before my last post. I compared summing 7 million 3-int-field records and 7 million 3-int-field objects using names a, b, c. On my machine, averaged out and with enough RAM that neither paging nor the GC get in the way, objects were nearly 3 times slower. I then tried with 10-int-field records and objects - in this case, objects were just over 4 times slower (read on). This was in bytecode - I didn't bother with native code. > > Which might turn out to be not that big a deal. After all, the object > based solution also adds some default overhead per object created, > right? The overhead of an object is 2 words (one contains tracking information about the actual class of the object and the other is a unique identifier) + (I think) 1 extra word for every field (because each int is boxed in a 1-tuple so that its tag can be recorded). Importantly, accessing fields in a record is an O(1) operation but for objects it's O(log n) for the number of fields because a binary search is done to locate the field with the correct tag. ocamlopt may of course optimise this search by caching the absolute index of the field in a "recognised" object layout; I didn't look in the compiler. My test between 3 and 10 fields would suggest that this optimisation does not apply in bytecode. David