From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: weis
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id SAA15040 for caml-redistribution; Tue, 14 Dec 1999 18:27:52 +0100 (MET)
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA32503 for <caml-list@pauillac.inria.fr>; Tue, 14 Dec 1999 13:31:19 +0100 (MET)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35])
	by concorde.inria.fr (8.8.7/8.8.7) with ESMTP id NAA22054;
	Tue, 14 Dec 1999 13:31:15 +0100 (MET)
Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA17447; Tue, 14 Dec 1999 13:31:14 +0100 (MET)
Message-ID: <19991214133114.35181@pauillac.inria.fr>
Date: Tue, 14 Dec 1999 13:31:14 +0100
From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: John Carol Langford <jcl@gs176.sp.cs.cmu.edu>, caml-list@inria.fr
Subject: Re: ocaml limitations
References: <199912130434.XAA12728@gs176.sp.cs.cmu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.89.1
In-Reply-To: <199912130434.XAA12728@gs176.sp.cs.cmu.edu>; from John Carol Langford on Sun, Dec 12, 1999 at 11:34:02PM -0500
Sender: weis

> The performance problems from two limitations.  The first is in the
> compiler and runtime - it's the limitation on the array size on 32 bit
> machines - I only have linux PC's available to work on.  It appears
> that the garbage collector needs some type information to work with
> arrays and enough bits are set aside for type information that not
> enough bits are allowed to specify a large array size.  You can get
> around this large array size problem by simulating a large array with
> an array of arrays, but there is a significant performance penalty.

This is correct.  We plan to address this by adding "large arrays"
that would be allocatd outside the Caml heap, and handled via an extra
indirection.

> The second problem is a language failure - there is no 'short int'
> type in ocaml.  Due to the combinatorics of my problem it would be
> very convenient to use 16 bit integers.  Using 32 bit integers instead
> doubles the footprint of the program - which is unacceptable in this
> case.  Consequently, I simulate 16 bit integers using masking games -
> which again incurs a performance penalty.  

Right.  Notice that the code you sent only simulates 15 bit integers
(because there are only 31 bits in a value of type "int").  Arrays of
16 bit integers can be expressed using strings and byte accesses, but
the performance isn't going to be much better than with your code.

> Consequently, I'm wondering if there are plans to
> remove either (or both) of these limitations in the near future and
> lacking that if there are better workarounds.

We have plans to support large integer and float arrays with various
sizes of elements at some point in the future, but the design isn't
complete yet.  In particular, there is a tension between flexible but
slow accesses (e.g. multidimensionnal arrays with automatic conversion
between the C and Fortran layouts) on the one hand, and basic but fast
primitives.

- Xavier Leroy