caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Markus Mottl <markus.mottl@gmail.com>
To: Gabriel Scherer <gabriel.scherer@gmail.com>
Cc: Lukasz Stafiniak <lukstafi@gmail.com>, OCaml List <caml-list@inria.fr>
Subject: Re: [Caml-list] Covariant GADTs
Date: Wed, 21 Sep 2016 20:39:12 -0400	[thread overview]
Message-ID: <CAP_800oy7Ug9PO7YajRxwH+ZsYthkOSefXEOKYh55eUsfEa-Zw@mail.gmail.com> (raw)
In-Reply-To: <CAPFanBGDV6YJsijYrf8pXDzNwEEyQN32CX0+k2sbvrc56cCsUg@mail.gmail.com>

The direct comparison with the Jane Street implementation showed a 40%
speed increase for some random things I tried, but that's not a fair
comparison.  If I improve the JS code, e.g. to avoid allocations, the
performance improvement due to the GADT + inlined records drops to
only about 10%.

In terms of memory, a freshly created set costs 7 machine words in the
original code vs. 5 for the GADT.  Adding one rank costs 4 machine
words in the standard implementation vs. only 2 for GADTs.  That's a
pretty significant size reduction.  The GADT representation would
surely help in programs that allocate a lot of these values, but the
values don't tend to grow much internally due to the tree compression
algorithm.  I'm sure there are better examples where a program would
typically allocate GADT-based data structures of more significant
size.

Regards,
Markus

On Wed, Sep 21, 2016 at 5:40 PM, Gabriel Scherer
<gabriel.scherer@gmail.com> wrote:
> Very nice. Would you have more precise numbers for the "considerably more
> efficient" part? It's not always easy to find clear benefits to inline
> records on representative macro-benchmarks.
>
> On Thu, Sep 22, 2016 at 2:04 AM, Markus Mottl <markus.mottl@gmail.com>
> wrote:
>>
>> Here is a complete working example of the advantages of using GADTs
>> with inline records.  It also uses the [@@unboxed] feature now
>> available with OCaml 4.04 as discussed before here, though it required
>> a little workaround due to an apparent bug in the current beta.
>>
>> The below implementation of the union-find algorithm is considerably
>> more efficient (with the 4.04 beta only) than the Union_find
>> implementation in the Jane Street Core kernel.  The problem admittedly
>> lends itself to the GADT + inline record trick.
>>
>> There is actually one advantage to using an intermediate, unboxed GADT
>> tag compared to records with existentially quantified fields (if they
>> were available): functions matching the tag don't require those
>> horrible type annotations for locally abstract types, because the
>> match automatically sets up the scope for you.  Having to write "Node
>> foo" instead of just "foo" in some places isn't too bad.  Not sure
>> it's possible to have the best of both worlds.
>>
>> ----------
>> module Union_find = struct
>>   (* This does not work yet due to an OCaml 4.04 beta bug
>>   type ('a, 'kind) tree =
>>     | Root : { mutable value : 'a; mutable rank : int } -> ('a, [ `root ])
>> tree
>>     | Inner : { mutable parent : 'a node } -> ('a, [ `inner ]) tree
>>
>>   and 'a node = Node : ('a, _) tree -> 'a node  [@@ocaml.unboxed]
>>
>>   type 'a t = ('a, [ `inner ]) tree
>>   *)
>>
>>   type ('a, 'kind, 'parent) tree =
>>     | Root : { mutable value : 'a; mutable rank : int } ->
>>       ('a, [ `root ], 'parent) tree
>>     | Inner : { mutable parent : 'parent } -> ('a, [ `inner ], 'parent)
>> tree
>>
>>   type 'a node = Node : ('a, _, 'a node) tree -> 'a node
>> [@@ocaml.unboxed]
>>
>>   type 'a t = ('a, [ `inner ], 'a node) tree
>>
>>   let create v = Inner { parent = Node (Root { value = v; rank = 0 }) }
>>
>>   let rec compress ~repr:(Inner inner as repr) = function
>>     | Node (Root _ as root) -> repr, root
>>     | Node (Inner next_inner as repr) ->
>>         let repr, _ as res = compress ~repr next_inner.parent in
>>         inner.parent <- Node repr;
>>         res
>>
>>   let compress_inner (Inner inner as repr) = compress ~repr inner.parent
>>
>>   let get_root (Inner inner) =
>>     match inner.parent with
>>     | Node (Root _ as root) -> root  (* Avoids compression call *)
>>     | Node (Inner _ as repr) ->
>>         let repr, root = compress_inner repr in
>>         inner.parent <- Node repr;
>>         root
>>
>>   let get t = let Root r = get_root t in r.value
>>
>>   let set t x = let Root r = get_root t in r.value <- x
>>
>>   let same_class t1 t2 = get_root t1 == get_root t2
>>
>>   let union t1 t2 =
>>     let Inner inner1 as repr1, (Root r1 as root1) = compress_inner t1 in
>>     let Inner inner2 as repr2, (Root r2 as root2) = compress_inner t2 in
>>     if root1 == root2 then ()
>>     else
>>       let n1 = r1.rank in
>>       let n2 = r2.rank in
>>       if n1 < n2 then inner1.parent <- Node repr2
>>       else begin
>>         inner2.parent <- Node repr1;
>>         if n1 = n2 then r1.rank <- r1.rank + 1
>>       end
>> end  (* Union_find *)
>> ----------
>>
>> Regards,
>> Markus
>>
>> On Wed, Sep 21, 2016 at 6:14 AM, Lukasz Stafiniak <lukstafi@gmail.com>
>> wrote:
>> > On Wed, Sep 21, 2016 at 12:11 PM, Lukasz Stafiniak <lukstafi@gmail.com>
>> > wrote:
>> >>
>> >> A simple solution would be to "A-transform" (IIRC the term) accesses
>> >
>> > Sorry, I forgot to define this. I mean rewrite rules like:
>> > [f r.x] ==> [let x = r.x in f x]
>> > where subsequently the existential variable is introduced (unpacked)
>> > at the let-binding level. This corresponds to a single-variant GADT
>> > pattern match.
>> >
>> >> to fields with existential type variables. This would give a more
>> >> narrow scope on the expression level than you suggest, but a
>> >> well-defined one prior to type inference. To broaden the scope you
>> >> would need to let-bind the field access yourself at the appropriate
>> >> level.
>>
>>
>>
>> --
>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
>



-- 
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com

  reply	other threads:[~2016-09-22  0:39 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-17 17:38 Markus Mottl
2016-09-18  8:17 ` Petter A. Urkedal
2016-09-19  1:52   ` Markus Mottl
2016-09-19  8:58     ` octachron
2016-09-19 10:18       ` Mikhail Mandrykin
2016-09-19 13:37         ` Mikhail Mandrykin
2016-09-19 14:46         ` Markus Mottl
2016-09-19 14:53           ` Mikhail Mandrykin
2016-09-19 15:03             ` Markus Mottl
2016-09-20 21:07               ` Markus Mottl
2016-09-21 10:11                 ` Lukasz Stafiniak
2016-09-21 10:14                   ` Lukasz Stafiniak
2016-09-21 17:04                     ` Markus Mottl
2016-09-21 21:40                       ` Gabriel Scherer
2016-09-22  0:39                         ` Markus Mottl [this message]
2016-09-24  5:09                           ` Yaron Minsky
2016-10-04 10:33                 ` Jacques Garrigue
2016-09-19 14:39       ` Markus Mottl
2016-09-19 10:05     ` Goswin von Brederlow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAP_800oy7Ug9PO7YajRxwH+ZsYthkOSefXEOKYh55eUsfEa-Zw@mail.gmail.com \
    --to=markus.mottl@gmail.com \
    --cc=caml-list@inria.fr \
    --cc=gabriel.scherer@gmail.com \
    --cc=lukstafi@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).