From: John Prevost
To: Xavier Leroy
Cc: caml-list@inria.fr
Subject: Re: Internals details for cmmgen.ml
Date: 17 Dec 1999 19:51:02 -0500
Message-ID: <87d7s5rs09.fsf@isil.maya.com>
In-Reply-To: Xavier Leroy's message of "Sat, 11 Dec 1999 19:09:08 +0100"
References: <87emcv5i5d.fsf@isil.maya.com> <19991211190908.14090@pauillac.inria.fr>

Xavier Leroy writes:

> Using a "checkbound" is perhaps the simplest solution. Otherwise,
> some system-wide exceptions such as Invalid_argument are assigned
> global symbols and you don't need to guess their integer index inside
> their defining module: just emit the C-- code corresponding to
>
>   (raise (symbol "Invalid_argument") (string "my message"))

I'll try this.

> If you're really into high-performance stuff, you could fold the
> permission check and the bounds check in one "checkbound" instruction.
> Just arrange the "write enable" flag to be (the Caml integer) 0 if
> write is allowed, and -1 if it is not. Then, generate something like
>
>   (checkbound (or index (write_enable_flag region) (size region)))

I don't think this is necessary--especially since both reading and
writing are things that could fail (so I need more than one bit).
When you're going for extreme speed, you'll probably use the unsafe
versions. (How does -unsafe work, by the way? Does it make the C--
"checkbound" stuff work differently?)

> Although hacking cmmgen.ml is fun, you could get a more portable
> implementation by writing it in ML using unsafe string accesses.
> Those will happily work on any char *, not necessarily on well-formed
> Caml strings. Something like:
>
>   external mmap : ... -> string
>   type t = { data: string; length: int }
>
>   let read_char reg idx =
>     if idx < 0 || idx >= reg.length
>     then raise (Invalid_argument "Region.read_char")
>     else String.unsafe_get reg.data idx
>
> It will be a bit slower, but maybe not too much.

Hmm. What do you mean by "a more portable implementation"? One which
doesn't require compiler modifications, or one which works with
bytecode? I believe that with bytecode, the C functions are
sufficient.

As for unsafe string access: but doesn't the pointer point to an
O'Caml block, which includes a tag and length information?

John.
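
For reference, here is a minimal sketch of the two checks discussed in the quoted text, written in plain ML. It is only an illustration under stated assumptions: an ordinary string stands in for the mmap'ed region (the mmap external is elided in the original and is not filled in here), and the names of_string and access_ok are hypothetical helpers, not part of any library or of cmmgen.ml. The folded check imitates the "or index flag" idea with a signed two-sided test, since the unsigned comparison done by checkbound is not directly expressible here.

```ocaml
(* Sketch only: an ordinary string stands in for the mapped region. *)
type t = { data : string; length : int }

(* Hypothetical helper: wrap a plain string as a region. *)
let of_string s = { data = s; length = String.length s }

(* Checked read, following the quoted suggestion: test the index
   explicitly, then use the unchecked accessor. *)
let read_char reg idx =
  if idx < 0 || idx >= reg.length
  then raise (Invalid_argument "Region.read_char")
  else String.unsafe_get reg.data idx

(* Hypothetical folded check: flag is 0 when access is allowed and -1
   when it is denied. idx lor 0 = idx, while idx lor (-1) = -1, which
   always fails the range test, so one test covers both conditions. *)
let access_ok ~flag ~size idx =
  let x = idx lor flag in
  x >= 0 && x < size
```

With flag = 0 the test reduces to an ordinary bounds check; with flag = -1 it rejects every index. As noted in the reply, a single flag of this kind only covers one permission, so separate read and write permissions would need more than this.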