From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id WAA22895; Sat, 28 Aug 2004 22:46:26 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id WAA22234 for ; Sat, 28 Aug 2004 22:46:25 +0200 (MET DST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id i7SKkPqF011082 for ; Sat, 28 Aug 2004 22:46:25 +0200 Received: from bourg.inria.fr (bourg.inria.fr [128.93.11.100]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id WAA17999 for ; Sat, 28 Aug 2004 22:46:24 +0200 (MET DST) Received: from basile by bourg.inria.fr with local (Exim 4.34) id 1C1A5C-0002WO-Mq for caml-list@inria.fr; Sat, 28 Aug 2004 22:45:54 +0200 Date: Sat, 28 Aug 2004 22:45:54 +0200 To: caml-list@inria.fr Subject: [Caml-list] Re: (GC issues) Alternative Bytecodes for OCaml Message-ID: <20040828204554.GA9252@bourg.inria.fr> References: <004f01c48c19$9950d8e0$0100a8c0@warp> <412F9059.2080808@orcaware.com> <20040827221801.GB1545@annexia.org> <20040828.083835.21913928.yoriyuki@mbg.ocn.ne.jp> <20040828164012.GA6420@bourg.inria.fr> <003f01c48d20$fa1497e0$0100a8c0@warp> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <003f01c48d20$fa1497e0$0100a8c0@warp> User-Agent: Mutt/1.5.6+20040803i From: "Basile Starynkevitch [local]" X-Miltered: at nez-perce with ID 4130EF21.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; bytecodes:01 basile:01 basile:01 2004:99 cannasse:01 citing:01 opaque:01 chunks:01 vtables:01 setter:01 runtime:01 hooks:01 implemented:01 vtables:01 runtime:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Sat, Aug 28, 2004 at 07:03:44PM +0200, Nicolas Cannasse wrote: Citing me (Basile S.) > > Unfortunately, Parrot http://www.parrotcode.org/docs/intro.html > > implement all structured objects as PMCs, which are opaque data chunks > > on which all operations goes thru a table of functions (similar to C++ > > vtables). So, using PMCs for tuple implementations would require an > > indirect call (even with Parrot JIT technology) for every basic tuple > > operations (ie allocation, tag fetching, field fetching). Also, I am > > not sure that Parrot has terminal (tail recursive) calls which are > > essential to Ocaml (and other functional languages). > > That's the kind of things I was thinking when talking about object oriented > VM . In the OO world, everything is an object and people tends to abstract > everything - even fields, so pure efficient data structures are not > available anymore, only through expensive getter/setter and/or field lookup. > A functionnal VM typicaly need both : efficient direct access data > structures and objects with runtime lookup. I'm not sure a functionnal VM need objects, except for making more easy hooks to the external world (like the Custom block of Ocaml provides). BTW, Ocaml objects (in the Ocaml source sense) are not implemented as objects with C vtables in the runtime (the equivalent of the vtable is for Ocaml only -it is not useful to the GC- and contains Ocaml closures, not C function pointers). > [...] > > Alos, note that Ocaml (and other functional languages) need a > > performant garbage collector (because immutable tuples and closures > > are allocated and accessed a lot), and that conservative GCs (à la > > Boehm) have lots of interesting features (in particular, > > conservativeness which makes then C-friendly, and multithreading), but > > are slower than naive generational copying precise GC (see > > http://starynkevitch.net/Basile/qishintro.html for some benches). > > I would like to get your thoughs about the following, since you already > wrote a GC as well as OCamlJIT. > IMHO, having an any way to interface with C is a key feature for any > language. Because the OCaml GC is not scanning the C stack, it is needed to > use a lot of macros to take care that newly allocated blocks are not > garbaged. > > What would be the cost of extending the OCaml GC with a C stack scanning > pass ? [Damien Doligez would probably answer better than I do] Ocaml does scan *explicitly* the C stack. Telling *explicitly* what are the local GC roots of C routines is the task of CAML_param & CAML_local macros (FWIW, my Qish GC use very similar tricks,and every exact GC has to do similarily.). The difference between Ocaml GC's (or Qish) and Boehm's GC is that Boehm's GC is conservative while Ocaml GC is exact (or precise): it uses a non-copying scheme (some kind of incremental mark&sweep IIRC) and scan every word of the C call stack, assuming it could be a GC-ed pointer (so when you have a bit pattern - say a floating point constant like pi - which happens to be a valid address it is followed by Boehm's GC - which may leak this way). Some variants (eg Bartlett' mostly copying GC) of conservative GC scan conservatively the C stack and handle exactly the heap. The Ocaml GC has to work quickly for young GC-ed values (which, notably in functional programs, tend to die quickly, because a functional programming style -with no mutable state- favors quick allocation of many short-lived temporary values). Hence, a copying scheme is preferable for the young generation. But a GC which work by copying should move (ie update) pointers, so cannot be conservative: it has to know which words are GC-ed pointers, and which are not (otherwise, your floating point pi constant on the stack would change mysteriously to a bitpattern of a newly forwared pointer). > Would it be costly/unsafe to move the pointers on the C stack ? It would be usafe, because a copying GC scheme is preferable (it is faster for young objects), hence pointers have to be moved on the C stack > What kind of speed factor can you loose when using for example Boehm > GC in OCaml VM instead of OCaml GC ? I don't know, but I would expect something like several programs running 2 or 3 times slower at least with a bytecode interpreter using Boehm's GC instead of the current Ocaml bytecode VM & GC. In my simple experiments (with Qish), a Boehm allocation is about 10 times slower than a C malloc, which is at least 10 times slower than the usual pointer increment of a copying generational GC (like in Ocaml or Qish). So I would guess (but I didn't test) that a Boehm allocation might be nearly 100 times slower than an Ocaml tuple allocation. > Which would be better of theses two ? > * a conservative GC everywhere (Boehm GC for example) I believe it would slow down significantly the OCaml VM. > * OCaml GC but with all C allocations made directly into the old > generation (for conservativness) You still has to tell the (exact) minor collector about these allocations (ie something like putting them on the store list) - so you won't gain much in C coding style, and you'll lose in performance. The result of C primitives is usually a temporary value (e.g. because C primitives tend to be wrapped by Ocaml code doing checking and changing C errors into Ocaml exceptions), so you'll probably better put then in the young generation. > I'm also of course interested in Damien thoughts about theses questions. Of course, Damien Doligez is much more expert than I am. I'll be delighted to read his thoughts on this. All my thoughts above are just thoughts, I did not take the time to experiment it (and putting Boehm's GC inside Ocaml VM is not a trivial task), so I may be grossly wrong. This interesting discussion triggers another interesting question: would Ocaml coders still use Ocaml if its implementation was (say) 3 or 10 times slower than the current implementation? Alternatively, do people use my OcamlJitRun program which could provide (on several programs) a speedup of nearly 2 on their bytecode program (which don't have to be changed neither in their C primitives nor in the byteocde, hence the Ocaml source code). -- Basile STARYNKEVITCH -- basile dot starynkevitch at inria dot fr other email : basile at starynkevitch dot net Project cristal.inria.fr - temporarily http://cristal.inria.fr/~starynke --- all opinions are only mine ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners