From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id F3224BB9C for ; Fri, 2 Dec 2005 16:07:24 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jB2F7OdW004496 for ; Fri, 2 Dec 2005 16:07:24 +0100 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id QAA27500 for ; Fri, 2 Dec 2005 16:07:23 +0100 (MET) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.198]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jB2F7M70004483 for ; Fri, 2 Dec 2005 16:07:23 +0100 Received: by wproxy.gmail.com with SMTP id i30so139647wra for ; Fri, 02 Dec 2005 07:07:22 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=E43xW6Ns9rm23WWVWw7WVHq6zlRpr/c+ztBi0j+2W0NAxZ3csBpkZjw642FqDh6WKJ9JqEX+pOk0vDdtIGMlo3WT03QRVlfiUmCHXgRiNt/wsBkW8WDrrr9Yu7UQqaFwPzUzIM74dRxUNQFs5mUGFMJ16Wp6Umq43ODirHa8ti0= Received: by 10.65.11.4 with SMTP id o4mr1509506qbi; Fri, 02 Dec 2005 07:07:21 -0800 (PST) Received: by 10.65.192.4 with HTTP; Fri, 2 Dec 2005 07:07:21 -0800 (PST) Message-ID: Date: Fri, 2 Dec 2005 10:07:21 -0500 From: "Michael D. Adams" To: caml-list@inria.fr Subject: Re: [Caml-list] Efficency of varient types In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline References: <4387ACC9.2040107@motion-twin.com> X-Miltered: at nez-perce with ID 4390632C.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 4390632A.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 bool:01 char:01 pointers:01 pointer:01 segfault:01 ocaml:01 pointer:01 char:01 pointers:01 constructors:01 argued:01 ocaml:01 constructors:01 minor:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=RCVD_BY_IP autolearn=disabled version=3.0.3 On 11/27/05, Brian Hurt wrote: > On Sun, 27 Nov 2005, Michael D. Adams wrote: > > I should clarify what I am proposing because it also only requires one > > word. The bottom one or two bits of that word would say which variant > > it is. The remaining bits representing any data with that variant. > > (Obviously this would only work if the associated data is small > > enough, but fortunately most of the primitives are small enough (e.g. > > int (31 bits), bool (1 bit), char (8 bits).)) > > Except that this runs into the problem of knowing the subtle details of > the GC- not only how it works now, but what (if any) gaurentees are being > given for future development. For example, is it gaurenteed that the GC > will recognize words with the low two bits being 10 as not pointers? Or > does the low bit being 0 means the GC may (or may not) assume it's a > pointer, and weird word values would cause the GC to segfault? Even if i= t > works now, will it continue to work with Ocaml 3.10, 3.20, 4.0, etc. I > don't know- I don't work at INRIA. This is exactly what I was warning > about earlier. A brief glance at the code seems to confirm that the GC is only checking the last bit even though it could check the last two bits.=20 That means I'll have to go with a 30 bit representation of ints with the pointer tags like '01' for ints, '011' for bools, '111' for char, '0' for pointers to boxed constructors. I use '01' for int instead of '11' b/c it requires fewer operations to add and multiply ints. The consideration of pointers doesn't matter in this case. Some have argued, though, that even when pointers do matter, ints should still be '0' and pointers '1', b/c always accessing pointers with an offset is cheap (there's an addressing mode for it) and it makes int operations cheaper. Yes, this means I'm digging around the internal representation. I would rather OCaml were smart enough to figure out from a type declaration that it could use pointer tags and unbox some of the constructors. Oh well, maybe in the next version of OCaml. (/me nudges Inria) > The other thing I'll throw in here is a warning against premature > optimization. OK, so method A is 15x slower than method B. But how big > of an effect will this have on the overall program? If method B is one > clock cycle, and method A is 15 clock cycles, probably not much. 15 cloc= k > cycles is about the cost of one mispredicted branch, and much, much > cheaper than a cache miss. If you're doing anything at all interesting, > the most likely case is that the 14 clock difference will be minor > compared to the costs of the rest of the program. I would have absolutely agreed with you until a couple months ago when a friend pointed out to me the difference between optimizing an application and optimizing a language. Applications are usually dominated by some bottleneck that is slower than the rest. Most applications just need to be fast 'enough'.=20 Having 'ls' take 10 seconds is to slow but the difference between 10 ms and 1 ms usually doesn't matter. These two facts mean that you shouldn't prematurely optimize your application. Compare this with a language implementation. If some part of the implementation that is commonly used, then every application that uses that implementation will have to pay for it and in cases like the cost of primative operations on a value (which is tied to the cost of tags), those applications will have to pay for it almost everywhere.=20 Furthermore, "fast enough" doesn't apply as much with languages, because if I make my implementation generate twice as fast code, then that means someone optimizing their application with my implementation only has to do half as much optimization to reach what ever their "fast enough" level is. So the rules for optimizing language implementation are a little different than for applications. Ok, there is a "fast enough" for my language implementation. My original objective was simply to show that if writing a compiler for a language (e.g. scheme, python), it is a lot smarter to compile down to OCaml than to compile down to C like a lot of implementations do. The results should be easier to write and will probably run faster because you can use an already fast runtime (OCaml's) instead of writing your own in C which will likely be slow unless you spend a lot of time on it. Thus I'm fast enough if I'm significantly faster than python.=20 However, that 15x slow down for using tags hurts b/c python is only about 50x slower than C. I know there will be other slow downs I will have to pay for and 15x blows the budget. As far as doing type inference, I will do that eventually, but I want to try to see how fast I can make it without type inference, then add inference later. A friend of mine that wrote one of the worlds fastest Scheme implementations (i.e. on par or better than equivalent C), claims the cost of tagging is low if done 'right' (i.e. pointer tags instead of block tags) and that type inference wont gain much once that is done. So to test his claim I need to first write a very fast implementation that uses pointer tags, then see how much it speeds up when I add type inference. Michael D. Adams mdmkolbe@gmail.com