caml-list - the Caml user's mailing list
From: Yaron Minsky <yminsky@janestreet.com>
To: Markus Mottl <markus.mottl@gmail.com>
Cc: Gabriel Scherer <gabriel.scherer@gmail.com>,
	Lukasz Stafiniak <lukstafi@gmail.com>,
	OCaml List <caml-list@inria.fr>
Subject: Re: [Caml-list] Covariant GADTs
Date: Sat, 24 Sep 2016 14:09:45 +0900
Message-ID: <CACLX4jQK1Y9LAKto1WxKEh4RP8xyqEvhnjy=Lo1x8ijeKUudWw@mail.gmail.com>
In-Reply-To: <CAP_800oy7Ug9PO7YajRxwH+ZsYthkOSefXEOKYh55eUsfEa-Zw@mail.gmail.com>

This looks like a nice improvement. A PR would be very welcome...

y

On Thu, Sep 22, 2016 at 9:39 AM, Markus Mottl <markus.mottl@gmail.com> wrote:
> A direct comparison with the Jane Street implementation showed a 40%
> speed increase in a few ad hoc tests, but that is not a fair
> comparison.  If I improve the Jane Street code, e.g. to avoid
> allocations, the performance improvement due to GADTs + inline records
> drops to only about 10%.
>
> In terms of memory, a freshly created set costs 7 machine words in the
> original code vs. 5 for the GADT.  Adding one rank costs 4 machine
> words in the standard implementation vs. only 2 for GADTs.  That's a
> pretty significant size reduction.  The GADT representation would
> surely help in programs that allocate a lot of these values, but the
> values don't tend to grow much internally due to the tree compression
> algorithm.  I'm sure there are better examples where a program would
> typically allocate GADT-based data structures of more significant
> size.
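
The 5-word figure for the GADT layout can be checked with
Obj.reachable_words (in the standard library since OCaml 4.04).  This is
only a rough sketch, not the benchmark used in the message above: the two
type definitions and create are copied from the Union_find module quoted
further down, fresh_set_words is a hypothetical helper name, and the count
assumes an immediate (int) payload and a runtime that allocates both
blocks on the heap.

```ocaml
(* Type definitions and [create] copied from the Union_find module
   posted in this thread. *)
type ('a, 'kind, 'parent) tree =
  | Root : { mutable value : 'a; mutable rank : int } ->
      ('a, [ `root ], 'parent) tree
  | Inner : { mutable parent : 'parent } -> ('a, [ `inner ], 'parent) tree

type 'a node = Node : ('a, _, 'a node) tree -> 'a node [@@ocaml.unboxed]

let create v = Inner { parent = Node (Root { value = v; rank = 0 }) }

(* Hypothetical helper: number of heap words (headers included)
   reachable from a freshly created set holding an int.
   [Sys.opaque_identity] keeps the optimizer from specializing on [v]. *)
let fresh_set_words (v : int) : int =
  Obj.reachable_words (Obj.repr (create (Sys.opaque_identity v)))

let () = Printf.printf "fresh set: %d words\n" (fresh_set_words 0)
```

Counting by hand, the Inner block is a header plus one field (2 words) and
the Root block is a header plus two fields (3 words); the Node wrapper
adds nothing when it is really unboxed, which is where the reported 5
words come from.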
>
> Regards,
> Markus
>
> On Wed, Sep 21, 2016 at 5:40 PM, Gabriel Scherer
> <gabriel.scherer@gmail.com> wrote:
>> Very nice. Would you have more precise numbers for the "considerably more
>> efficient" part? It's not always easy to find clear benefits to inline
>> records on representative macro-benchmarks.
>>
>> On Thu, Sep 22, 2016 at 2:04 AM, Markus Mottl <markus.mottl@gmail.com>
>> wrote:
>>>
>>> Here is a complete working example of the advantages of using GADTs
>>> with inline records.  It also uses the [@@unboxed] feature now
>>> available with OCaml 4.04 as discussed before here, though it required
>>> a little workaround due to an apparent bug in the current beta.
>>>
>>> The implementation of the union-find algorithm below is considerably
>>> more efficient (with the 4.04 beta only) than the Union_find
>>> implementation in Jane Street's Core_kernel.  The problem admittedly
>>> lends itself to the GADT + inline record trick.
>>>
>>> There is actually one advantage to using an intermediate, unboxed GADT
>>> tag compared to records with existentially quantified fields (if they
>>> were available): functions matching the tag don't require those
>>> horrible type annotations for locally abstract types, because the
>>> match automatically sets up the scope for you.  Having to write "Node
>>> foo" instead of just "foo" in some places isn't too bad.  Not sure
>>> it's possible to have the best of both worlds.
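
The scoping point can be illustrated on a much smaller type.  The sketch
below is not taken from the thread; ty, show, packed and show_packed are
hypothetical names.  A function over an indexed GADT needs an explicit
locally abstract type, whereas a function that matches on an existential
wrapper gets the fresh type from the pattern match and needs no
annotation.

```ocaml
(* A small indexed GADT: the index says which payload type is expected. *)
type _ ty =
  | Int : int ty
  | Str : string ty

(* Refining [a] per branch requires the "type a." annotation. *)
let show : type a. a ty -> a -> string =
 fun ty x ->
  match ty with
  | Int -> string_of_int x
  | Str -> x

(* Existential wrapper, playing the role of the tag in the message above. *)
type packed = Packed : 'a ty * 'a -> packed

(* No annotation needed: matching [Packed] introduces the hidden type
   for the body of the function. *)
let show_packed (Packed (ty, x)) = show ty x

let () =
  assert (show_packed (Packed (Int, 7)) = "7");
  assert (show_packed (Packed (Str, "seven")) = "seven")
```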
>>>
>>> ----------
>>> module Union_find = struct
>>>   (* This does not work yet due to an OCaml 4.04 beta bug
>>>   type ('a, 'kind) tree =
>>>     | Root : { mutable value : 'a; mutable rank : int } -> ('a, [ `root ]) tree
>>>     | Inner : { mutable parent : 'a node } -> ('a, [ `inner ]) tree
>>>
>>>   and 'a node = Node : ('a, _) tree -> 'a node  [@@ocaml.unboxed]
>>>
>>>   type 'a t = ('a, [ `inner ]) tree
>>>   *)
>>>
>>>   type ('a, 'kind, 'parent) tree =
>>>     | Root : { mutable value : 'a; mutable rank : int } ->
>>>       ('a, [ `root ], 'parent) tree
>>>     | Inner : { mutable parent : 'parent } -> ('a, [ `inner ], 'parent) tree
>>>
>>>   type 'a node = Node : ('a, _, 'a node) tree -> 'a node [@@ocaml.unboxed]
>>>
>>>   type 'a t = ('a, [ `inner ], 'a node) tree
>>>
>>>   let create v = Inner { parent = Node (Root { value = v; rank = 0 }) }
>>>
>>>   let rec compress ~repr:(Inner inner as repr) = function
>>>     | Node (Root _ as root) -> repr, root
>>>     | Node (Inner next_inner as repr) ->
>>>         let repr, _ as res = compress ~repr next_inner.parent in
>>>         inner.parent <- Node repr;
>>>         res
>>>
>>>   let compress_inner (Inner inner as repr) = compress ~repr inner.parent
>>>
>>>   let get_root (Inner inner) =
>>>     match inner.parent with
>>>     | Node (Root _ as root) -> root  (* Avoids compression call *)
>>>     | Node (Inner _ as repr) ->
>>>         let repr, root = compress_inner repr in
>>>         inner.parent <- Node repr;
>>>         root
>>>
>>>   let get t = let Root r = get_root t in r.value
>>>
>>>   let set t x = let Root r = get_root t in r.value <- x
>>>
>>>   let same_class t1 t2 = get_root t1 == get_root t2
>>>
>>>   let union t1 t2 =
>>>     let Inner inner1 as repr1, (Root r1 as root1) = compress_inner t1 in
>>>     let Inner inner2 as repr2, (Root r2 as root2) = compress_inner t2 in
>>>     if root1 == root2 then ()
>>>     else
>>>       let n1 = r1.rank in
>>>       let n2 = r2.rank in
>>>       if n1 < n2 then inner1.parent <- Node repr2
>>>       else begin
>>>         inner2.parent <- Node repr1;
>>>         if n1 = n2 then r1.rank <- r1.rank + 1
>>>       end
>>> end  (* Union_find *)
>>> ----------
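
For comparison, here is a minimal sketch of the same algorithm (union by
rank with path compression) written with an ordinary variant and no GADT
indices.  It is not the Core_kernel implementation and was not posted in
this thread; representative and the other names are hypothetical.  Note
the two "assert false" branches: without the `root / `inner index, the
type checker cannot know that a representative always points at a Root,
which is what the GADT version above encodes in the types.

```ocaml
(* Plain (non-GADT) union-find.  Every element is a mutable cell whose
   parent is either the class root or another element. *)
type 'a t = { mutable parent : 'a node }

and 'a node =
  | Root of { mutable value : 'a; mutable rank : int }
  | Inner of 'a t

let create v = { parent = Root { value = v; rank = 0 } }

(* Returns the unique element whose parent is the class root, and
   re-points every element visited on the way directly at it. *)
let rec representative t =
  match t.parent with
  | Root _ -> t
  | Inner next ->
      let repr = representative next in
      t.parent <- Inner repr;
      repr

let get t =
  match (representative t).parent with
  | Root r -> r.value
  | Inner _ -> assert false (* a representative always points at a Root *)

let same_class t1 t2 = representative t1 == representative t2

let union t1 t2 =
  let r1 = representative t1 and r2 = representative t2 in
  if r1 != r2 then
    match r1.parent, r2.parent with
    | Root a, Root b ->
        (* Attach the lower-rank class below the higher-rank one. *)
        if a.rank < b.rank then r1.parent <- Inner r2
        else begin
          r2.parent <- Inner r1;
          if a.rank = b.rank then a.rank <- a.rank + 1
        end
    | _ -> assert false (* both representatives point at a Root *)

let () =
  let a = create 1 and b = create 2 and c = create 3 in
  union a b;
  assert (same_class a b);
  assert (not (same_class a c))
```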
>>>
>>> Regards,
>>> Markus
>>>
>>> On Wed, Sep 21, 2016 at 6:14 AM, Lukasz Stafiniak <lukstafi@gmail.com>
>>> wrote:
>>> > On Wed, Sep 21, 2016 at 12:11 PM, Lukasz Stafiniak <lukstafi@gmail.com>
>>> > wrote:
>>> >>
>>> >> A simple solution would be to "A-transform" (IIRC the term) accesses
>>> >
>>> > Sorry, I forgot to define this. I mean rewrite rules like:
>>> > [f r.x] ==> [let x = r.x in f x]
>>> > where subsequently the existential variable is introduced (unpacked)
>>> > at the let-binding level. This corresponds to a single-variant GADT
>>> > pattern match.
>>> >
>>> >> to fields with existential type variables. This would give a more
>>> >> narrow scope on the expression level than you suggest, but a
>>> >> well-defined one prior to type inference. To broaden the scope you
>>> >> would need to let-bind the field access yourself at the appropriate
>>> >> level.
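
The correspondence with a single-variant GADT pattern match can be
sketched as follows.  This is an illustration only, with hypothetical
names (showable, Show, render); it uses today's GADT-with-inline-record
encoding rather than the proposed record syntax.  The let-binding on the
constructor unpacks the existential, and the field access is let-bound
inside that scope, as in the rewrite rule above.

```ocaml
(* An existential package: the type of [value] is hidden from the outside. *)
type showable =
  | Show : { value : 'a; to_string : 'a -> string } -> showable

let render (s : showable) : string =
  (* Single-variant match at the let-binding level: the hidden type is
     in scope for the rest of the body. *)
  let (Show r) = s in
  (* Counterpart of [f r.x] ==> [let x = r.x in f x]. *)
  let x = r.value in
  r.to_string x

let () =
  assert (render (Show { value = 42; to_string = string_of_int }) = "42")
```

To widen the scope of the unpacked type, the let on the constructor has
to be moved outward by hand, which matches the remark about let-binding
the field access at the appropriate level.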
>>>
>>>
>>>
>>> --
>>> Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
>>>
>>> --
>>> Caml-list mailing list.  Subscription management and archives:
>>> https://sympa.inria.fr/sympa/arc/caml-list
>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>>> Bug reports: http://caml.inria.fr/bin/caml-bugs

