From: Brian Hurt
Date: Sun, 6 May 2007 17:11:54 -0400 (EDT)
To: skaller
Cc: "caml-list@inria.fr"
Subject: Re: [Caml-list] 246 constructors?
In-Reply-To: <1178480953.4443.4.camel@rosella.wigram>

On Mon, 7 May 2007, skaller wrote:

> I just got this message:
>
> File "dyp_parse.ml", line 492, characters 5-19461:
> Too many non-constant constructors -- maximum is 246 non-constant
> constructors
>
> Really???
>
> (The file is generated by dypgen, it has a constructor for
> every terminal and non-terminal in the Felix grammar)
>
> At least on a 64 bit machine this is a sad restriction:
> any chance of a two byte code here?
The problem is that this is coming out of the tag word. The current structure is:

    8 bits for the tag information
    2 bits for the GC color information
    n-10 bits for the size of the block

where n is the number of bits in a word. The 246 limitation comes from the 256 possible tag values minus 10 "special" tag values for various objects that need to be handled specially by the GC (objects that don't need to be scanned for pointers, like floats and strings, as well as closures, objects, lazy thunks, etc.).

Unfortunately, this bumps right into the other favorite thread: the 2M limitation on arrays. The maximum size of an array is the maximum size of a block, and the more bits that are given over to the tag, the fewer bits remain for the block size, so the smaller arrays can be.

I think a better solution, although I have no idea how difficult it'd be to implement, would be "large tag blocks". For blocks with large tag values (say, >245), the tag value in the tag word would be the special "large tag" tag (say, value 246), and the last word of the block would hold the actual "large" tag value. I say last, as this means you can still get the nth word of a block without needing to special-case large-tag blocks. This would allow a full word to be used for tag values (giving way more than enough tag values), and wouldn't change either memory usage or code for blocks with small tag values.

Brian
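
A rough sketch of the arithmetic above, in OCaml, in case it helps. It assumes the tag sits in the low bits of the header word (the layout above does not say which end), and every name in it (make_header, large_tag, make_block, effective_tag, and so on) is made up for illustration. It models blocks as ordinary OCaml values and does not touch the real runtime.

```ocaml
(* Header layout as described: 8 tag bits, 2 GC color bits, and the
   remaining n-10 bits for the block size in words. *)
let tag_bits = 8
let color_bits = 2
let special_tags = 10

(* Tags left over for non-constant constructors: 256 - 10. *)
let usable_tags = (1 lsl tag_bits) - special_tags

(* Bits available for the size field on a machine with n-bit words. *)
let size_bits word_bits = word_bits - tag_bits - color_bits

(* Largest block size, in words, that the size field can express. *)
let max_block_words word_bits = (1 lsl size_bits word_bits) - 1

(* Pack and unpack a header. ASSUMPTION: tag in the low bits, then
   color, then size in the high bits. *)
let make_header ~size ~color ~tag =
  (size lsl (tag_bits + color_bits)) lor (color lsl tag_bits) lor tag

let tag_of h = h land ((1 lsl tag_bits) - 1)
let color_of h = (h lsr tag_bits) land ((1 lsl color_bits) - 1)
let size_of h = h lsr (tag_bits + color_bits)

(* Hypothetical "large tag" scheme. A block is modelled as a small
   header tag plus an array of fields. *)
let large_tag = 246

type block = { tag : int; fields : int array }

(* Small tags go in the header as today. Large tags put the marker in
   the header and append the real tag as the last word of the block. *)
let make_block t fields =
  if t < large_tag then { tag = t; fields }
  else { tag = large_tag; fields = Array.append fields [| t |] }

(* Only code that inspects the tag has to know about the marker. *)
let effective_tag b =
  if b.tag = large_tag then b.fields.(Array.length b.fields - 1)
  else b.tag

(* Field access is the same for both kinds of block, which is the
   reason for storing the large tag last rather than first. *)
let nth_field b n = b.fields.(n)

let () =
  Printf.printf "usable tags: %d\n" usable_tags;
  Printf.printf "size bits on 32-bit: %d\n" (size_bits 32);
  (* A two-byte tag would leave only 32 - 16 - 2 size bits on 32-bit. *)
  Printf.printf "size bits on 32-bit with a 16-bit tag: %d\n"
    (32 - 16 - color_bits);
  let b = make_block 1000 [| 7; 8 |] in
  Printf.printf "header tag %d, effective tag %d, field 1 = %d\n"
    b.tag (effective_tag b) (nth_field b 1)
```

The cost in this model is one extra word per large-tag block; blocks with small tags are built and read exactly as before.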