From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <yotambarnoy@gmail.com>
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by sympa.inria.fr (Postfix) with ESMTPS id D2CEE7EE49
	for <caml-list@sympa.inria.fr>; Fri, 20 Sep 2013 04:18:52 +0200 (CEST)
Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  yotambarnoy@gmail.com) identity=pra;
  client-ip=209.85.220.182;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="yotambarnoy@gmail.com";
  x-sender="yotambarnoy@gmail.com";
  x-conformance=sidf_compatible
Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of
  yotambarnoy@gmail.com designates 209.85.220.182 as permitted
  sender) identity=mailfrom; client-ip=209.85.220.182;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="yotambarnoy@gmail.com";
  x-sender="yotambarnoy@gmail.com";
  x-conformance=sidf_compatible; x-record-type="v=spf1"
Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  postmaster@mail-vc0-f182.google.com) identity=helo;
  client-ip=209.85.220.182;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="yotambarnoy@gmail.com";
  x-sender="postmaster@mail-vc0-f182.google.com";
  x-conformance=sidf_compatible
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AmICAKyvO1LRVdy2lGdsb2JhbABbhBHBMYEYCBYOAQEBAQcLCwkSKoIlAQEEAUABGx0BAwELBgUEBzsiAREBBQEcBi6HVQEDCQacBIxRgweEIQoZJw1kiQABBQyPWweEHgOXfJAMGCmEaCA
X-IPAS-Result: AmICAKyvO1LRVdy2lGdsb2JhbABbhBHBMYEYCBYOAQEBAQcLCwkSKoIlAQEEAUABGx0BAwELBgUEBzsiAREBBQEcBi6HVQEDCQacBIxRgweEIQoZJw1kiQABBQyPWweEHgOXfJAMGCmEaCA
X-IronPort-AV: E=Sophos;i="4.90,941,1371074400"; 
   d="scan'208";a="33608712"
Received: from mail-vc0-f182.google.com ([209.85.220.182])
  by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 20 Sep 2013 04:18:51 +0200
Received: by mail-vc0-f182.google.com with SMTP id hf12so7023573vcb.41
        for <caml-list@inria.fr>; Thu, 19 Sep 2013 19:18:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=mime-version:in-reply-to:references:from:date:message-id:subject:to
         :cc:content-type;
        bh=ky7N00Xlsk4y7H5T1rihmvvL0v1LYHpf+y6pYi1S5+A=;
        b=S6pt60M2k0hfxc4edvdfVmz9CUQJAYYH4QROUWexYc2GDzOu/o5gITrXmdHC15Ev+t
         LC24VYQ3lXmhf/H3kb1KyC2wyIKK6TYVNvbmSzCglWI5ECHM+3gygG4k2jDS525cyPFW
         nK7xW1ph530deUbKFZF7y7pXE2GZ0SipnWT3V1YZBJzDzWAiYxKwGQx7yomPQBKPxtGQ
         YTimXybIOS7WYohm8+XQEElEQfVax1BS7FVwEVkTH+mRJcD+I9wSfetPgZwXGKkhlf7S
         EkacbToDO+i2/2DGUu4L/0t/XRE/gBsRgCS43q6+oiblY6vfCM3sMkPicoMkqCLEFk01
         xsDw==
X-Received: by 10.220.164.70 with SMTP id d6mr3881706vcy.19.1379643530854;
 Thu, 19 Sep 2013 19:18:50 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.220.160.70 with HTTP; Thu, 19 Sep 2013 19:18:30 -0700 (PDT)
In-Reply-To: <20130919101020.GE25801@frosties>
References: <CAN6ygOnK+xut5W0poyzrZcC770kwZ4VgKY1du=bUsPhOeCP7sg@mail.gmail.com>
 <1379410360.10274.156.camel@zotac> <CAN6ygOniyQEo421humUzA4eAeMjjJvfXLKePeJounYzANhZ-Kw@mail.gmail.com>
 <20130919101020.GE25801@frosties>
From: Yotam Barnoy <yotambarnoy@gmail.com>
Date: Thu, 19 Sep 2013 22:18:30 -0400
Message-ID: <CAN6ygOk3Fhk8p2b2bQRCbKOVJPN1TS610dgknaw6swkTbj0Z+w@mail.gmail.com>
To: Goswin von Brederlow <goswin-v-b@web.de>
Cc: Ocaml Mailing List <caml-list@inria.fr>
Content-Type: multipart/alternative; boundary=047d7b414aa492c79404e6c74a6f
Subject: Re: [Caml-list] Expanding the Float Array tag


--047d7b414aa492c79404e6c74a6f
Content-Type: text/plain; charset=ISO-8859-1

> 1) unboxed floats
>
> The double_array_tag saves a lot (50%) of ram because the floats are
> not individually boxed. For mixed blocks a bitfield could also allow
> unboxed floats.
>
> Inspecting the contents of such a block would be more complex then
> because, on 32bit, a is-float bit means the float is stored in 2
> fields and the inspection has to combine the current field with the
> next and skip over the next field.
>
> You want to use a 16-bit bitfield to indicate which members are float.
> This would work for record and constructors with up to 16 members. I
> would modify this a bit. The lower 15 bit indicate which of the first
> 15 members are float while the 16th bit indicates if all remaining
> members are floats. If you declare the 16-bit value as signed and use
> arithmetic shift right to iterate through the bits you get this
> naturally.
>
> This sounds good. On 64 bit platforms, of course, you could use the second
double word to indicate many more unboxed floats. And let's not forget that
this impacts performance as well as memory usage.


> 2) values the GC doesn't need to scan
>
> This would probably be far more often usefull. There are tons of
> tuples and records that contain only primitive types (int, bool, unit,
> ...). This is also true for a lot of variant types. So a simple
> flat_tag would only cover half the cases. A flat bit would be better.
> On the other hand a bitfield for values the GC doesn't have to inspect
> seems pointless. Each value already has a bit for that.
>

Yeah. As soon as I start going down this path, I start thinking of
haskell's heap solution, which is so elegant: a size field indicates how
many pointers there are in the data structure, and all pointers are moved
to the beginning of the structure. This allows them to have full size
integers as well as unboxed floats. The cost is a less intuitive object
layout. Of course, they don't have to worry about polymorphic compare or
marshaling, since typeclasses cover all of these things in the form of
attached dictionaries.

Think of how much work they're saving the garbage collector though: they
never have to read non-pointer values. If only haskell wasn't lazy  :) ...
BTW can anyone think of a downside to haskell's approach (other than the
fact that ocaml needs to keep track of which values are floats for
polymorphic reasons)?

Yotam

--047d7b414aa492c79404e6c74a6f
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><br><div class=3D"gmail_extra"><br>=A0<div class=3D"gmail_=
quote"><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-=
left:1px #ccc solid;padding-left:1ex">1) unboxed floats<br>
<br>
The double_array_tag saves a lot (50%) of ram because the floats are<br>
not individually boxed. For mixed blocks a bitfield could also allow<br>
unboxed floats.<br>
<br>
Inspecting the contents of such a block would be more complex then<br>
because, on 32bit, a is-float bit means the float is stored in 2<br>
fields and the inspection has to combine the current field with the<br>
next and skip over the next field.<br>
<br>
You want to use a 16-bit bitfield to indicate which members are float.<br>
This would work for record and constructors with up to 16 members. I<br>
would modify this a bit. The lower 15 bit indicate which of the first<br>
15 members are float while the 16th bit indicates if all remaining<br>
members are floats. If you declare the 16-bit value as signed and use<br>
arithmetic shift right to iterate through the bits you get this<br>
naturally.<br>
<br></blockquote><div>This sounds good. On 64 bit platforms, of course, you=
 could use the second double word to indicate many more unboxed floats. And=
 let&#39;s not forget that this impacts performance as well as memory usage=
.<br>

<br>=A0</div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;b=
order-left:1px #ccc solid;padding-left:1ex">
2) values the GC doesn&#39;t need to scan<br>
<br>
This would probably be far more often usefull. There are tons of<br>
tuples and records that contain only primitive types (int, bool, unit,<br>
...). This is also true for a lot of variant types. So a simple<br>
flat_tag would only cover half the cases. A flat bit would be better.<br>
On the other hand a bitfield for values the GC doesn&#39;t have to inspect<=
br>
seems pointless. Each value already has a bit for that.<br></blockquote><di=
v><br></div><div>Yeah. As soon as I start going down this path, I start thi=
nking of haskell&#39;s heap solution, which is so elegant: a size field ind=
icates how many pointers there are in the data structure, and all pointers =
are moved to the beginning of the structure. This allows them to have full =
size integers as well as unboxed floats. The cost is a less intuitive objec=
t layout. Of course, they don&#39;t have to worry about polymorphic compare=
 or marshaling, since typeclasses cover all of these things in the form of =
attached dictionaries. <br>

<br>Think of how much work they&#39;re saving the garbage collector though:=
 they never have to read non-pointer values. If only haskell wasn&#39;t laz=
y=A0 :) ... BTW can anyone think of a downside to haskell&#39;s approach (o=
ther than the fact that ocaml needs to keep track of which values are float=
s for polymorphic reasons)? <br>

</div><div>=A0<br></div><div>Yotam<br></div><br></div></div></div>

--047d7b414aa492c79404e6c74a6f--