From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id E73C1BC58 for ; Mon, 13 Sep 2010 23:26:09 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArEBAAI0jkxKfVI0kGdsb2JhbACVKowkCBUBAQEBCQkMBxEDH6s/i0UBBY8iAQSFQA X-IronPort-AV: E=Sophos;i="4.56,361,1280700000"; d="scan'208";a="69454888" Received: from mail-ww0-f52.google.com ([74.125.82.52]) by mail4-smtp-sop.national.inria.fr with ESMTP; 13 Sep 2010 23:26:09 +0200 Received: by wwb17 with SMTP id 17so6277510wwb.9 for ; Mon, 13 Sep 2010 14:26:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:references :in-reply-to:subject:date:organization:message-id:mime-version :content-type:content-transfer-encoding:x-mailer:thread-index :content-language; bh=/Ksw/UF23aBA9kDNUioslR+drBDPUMoN6WHCyP008Js=; b=BFnMkRa5U35cOPSfq93gJZcKn65jrCrv0UPtofQTKC5a1C7O6szO7iJKZMK4WqFSCC zt1wu57y1UKtRm+4ltAhL1fz2v1OxqEOVTT68FPIsuf+S6CJ78bibeyINHJDA6VAFYSU MduI3Gg2bgXEGfZQRzlazHQnacpajZ5o7X34Y= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=from:to:cc:references:in-reply-to:subject:date:organization :message-id:mime-version:content-type:content-transfer-encoding :x-mailer:thread-index:content-language; b=TKaXxt4NOZAWip8pmtiTvQJY4iGW0khN+/IuGZrXNreAgvPrHvnY0S4lsV4DxDJyO6 Y1X0Qu5KdUxoMFHIvLEL8IdfAQLYUCP7/ZHTQSuoVatL+ClnKcTNDEEOzCKGSfwEMbl5 Gs3WmweoAGPfnfqSv/xafX+9X4lTCUShgq4Bs= Received: by 10.227.138.141 with SMTP id a13mr1357726wbu.208.1284413169070; Mon, 13 Sep 2010 14:26:09 -0700 (PDT) Received: from WinEight ([87.115.93.151]) by mx.google.com with ESMTPS id r18sm4259868weo.0.2010.09.13.14.26.05 (version=SSLv3 cipher=RC4-MD5); Mon, 13 Sep 2010 14:26:07 -0700 (PDT) From: Jon Harrop To: "'Eray Ozkural'" , "'Sylvain Le Gall'" Cc: References: In-Reply-To: Subject: RE: [Caml-list] Re: How to re-implement the GC? Date: Mon, 13 Sep 2010 22:25:37 +0100 Organization: Flying Frog Consultancy Message-ID: <009601cb538a$3be05160$b3a0f420$@com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 12.0 Thread-Index: ActTTiAFKWfx+Qx8Rv2bUwcXzoUw4wANbvkw Content-Language: en-gb X-Spam: no; 0.00; re-implement:01 eray:01 ocaml:01 ocaml:01 compiler:01 recursion:01 run-time:01 ocaml's:01 bytecode:01 compilation:01 speedup:01 pointer:01 cheers:01 polymorphic:01 garbage:01 Hi Eray, Retrofitting a new multicore-friendly GC onto OCaml is difficult for two main reasons: 1. You want plug-and-play GCs but the OCaml compiler is tightly coupled to the old GC (although OC4MC has decoupled them!). 2. Recovering similar performance whilst reusing the same data representation is extremely difficult because the current design relies so heavily on lightweight allocation. You really want to change the data representation to avoid unnecessary boxing (e.g. never box or tag int, floats or tuples) in order to reduce the allocation rate and, consequently, the stress on the garbage collector but OCaml cannot express value types and its ability to represent polymorphic recursion makes this extremely difficult to implement. As Sylvain has said, OC4MC is your best bet if you want to try to write a new GC for OCaml. However, making more extensive changes has the potential to address many more problems (e.g. convey run-time type information for generic printing) so you might consider alternatives like trying to compile OCaml's bytecode into HLVM's IR for JIT compilation because HLVM already has a multicore friendly GC and it is much easier to develop. > Ah, that's interesting. I wonder if it provides any real speedup on new > architectures compared to storing the pointer in RAM. For a multicore GC, using a register to refer to thread-local data is a huge win because accessing thread-local data is very slow. I made that mistake in HLVM's first multicore-capable GC and it was basically useless as a consequence because all of the time was spent looking up thread-local data. When I started passing the thread-local data around as an extra argument to every function (not necessarily consuming a register all the time because LLVM is free to spill it), performance improved enormously. Cheers, Jon.