Message-Id: <199912130434.XAA12728@gs176.sp.cs.cmu.edu>
To: caml-list@inria.fr
Subject: ocaml limitations
Date: Sun, 12 Dec 1999 23:34:02 -0500
From: John Carol Langford

I have been encountering some fundamental limitations of the OCaml language and compiler that are killing my performance, to the tune of a factor of 10 off equivalent C code. This is a serious problem because the program I'm working on is both RAM and CPU intensive.

The performance problems stem from two limitations. The first is in the compiler and runtime: the limit on array size on 32-bit machines (I only have Linux PCs available to work on). It appears that the garbage collector needs some type information to work with arrays, and enough bits are set aside for that type information that too few remain to specify a large array size. You can get around this problem by simulating a large array with an array of arrays, but there is a significant performance penalty.

The second problem is a language failure: there is no 'short int' type in OCaml. Due to the combinatorics of my problem it would be very convenient to use 16-bit integers. Using 32-bit integers instead doubles the footprint of the program, which is unacceptable in this case.
Consequently, I simulate 16-bit integers using masking games, which again incurs a performance penalty. These two problems together add up to using a function considerably more complicated than an array dereference in the inner loop:

  let get_first i = i land (num_array - 1)
  let get_second i = i lsr (log_num_array + 1)
  let get_third i = (i lsr log_num_array) land 1

  let get_split split i =
    let first_index = get_first i
    and second_index = get_second i
    and third_index = get_third i in
    let ret = split.(first_index).(second_index) in
    let real_ret = if third_index = 1 then ret else ret lsr 16 in
    real_ret land 65535

Naturally, the performance (relative to what the hardware is capable of) is terrible. Consequently, I'm wondering if there are plans to remove either (or both) of these limitations in the near future and, lacking that, if there are better workarounds. Any suggestions?

-John
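For reference, the packed lookup described in this message can be written out as a self-contained sketch. The value of log_num_array, the allocator make_split, and the setter set_split are assumptions made only for illustration; none of them appear in the message. The sketch also assumes a platform where an OCaml int has at least 32 usable bits. On a 32-bit system an int carries 31 bits, so the upper half of each word could hold only 15 bits.

```ocaml
(* Assumed sizes, chosen only for illustration. *)
let log_num_array = 2                      (* hypothetical: 2^2 = 4 sub-arrays *)
let num_array = 1 lsl log_num_array

(* Index decomposition, as in the message. *)
let get_first i = i land (num_array - 1)        (* which sub-array *)
let get_second i = i lsr (log_num_array + 1)    (* which word inside it *)
let get_third i = (i lsr log_num_array) land 1  (* which 16-bit half of the word *)

(* Read the 16-bit value stored at logical index i. *)
let get_split split i =
  let ret = split.(get_first i).(get_second i) in
  let shifted = if get_third i = 1 then ret else ret lsr 16 in
  shifted land 65535

(* Hypothetical setter (not in the message): store the low 16 bits of v. *)
let set_split split i v =
  let a = split.(get_first i) and j = get_second i in
  let v = v land 65535 in
  a.(j) <-
    (if get_third i = 1
     then (a.(j) land (lnot 65535)) lor v     (* replace the low half *)
     else (a.(j) land 65535) lor (v lsl 16))  (* replace the high half *)

(* Hypothetical allocator: room for n logical 16-bit slots, two per word. *)
let make_split n =
  let per_row = 2 * num_array in
  let words = (n + per_row - 1) / per_row in
  Array.init num_array (fun _ -> Array.make words 0)
```

Each access costs three shift-and-mask operations, two array dereferences, and a branch, which is the overhead the message complains about compared with a single dereference of a flat array of 16-bit values.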