From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id A3B3C7F008 for ; Wed, 25 May 2016 14:29:52 +0200 (CEST) IronPort-PHdr: 9a23:lrU3TBxZMchN/VfXCy+O+j09IxM/srCxBDY+r6Qd0O4RIJqq85mqBkHD//Il1AaPBtWKra4fwLaL+4nbGkU+or+5+EgYd5JNUxJXwe43pCcHRPC/NEvgMfTxZDY7FskRHHVs/nW8LFQHUJ2mPw6anHS+4HYoFwnlMkItf6KuSt+U05/8i7360qaQSjsLrQL1Wal1IhSyoFeZnegtqqwmFJwMzADUqGBDYeVcyDAgD1uSmxHh+pX4p8Y7oGx48sgs/M9YUKj8Y79wDfkBVGxnYFYO+dbzuBLfYQyK73oaGiVKw1sbSzTCuVvcU4rxuybnrfFwwm3SGMTsUbEyE3z26q5xVB7uziEKKj4w/XzMkcFqpLlarResoQc5yInRNtKvOeJ6b5/aKNkdX2tadtdYVCtLBZynbIpJCPAObshCqIyojVwUrAD2NACvA+no1SRLhzei16og0v8JAAjF2wYnAMkNsTLTttqjZ/Raavy80KSdlWaLVPhRwzqordWRfw== Authentication-Results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=boris@yakobowski.org; spf=Pass smtp.mailfrom=boris@yakobowski.org; spf=None smtp.helo=postmaster@11.mo1.mail-out.ovh.net Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of boris@yakobowski.org) identity=pra; client-ip=188.165.48.29; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="boris@yakobowski.org"; x-sender="boris@yakobowski.org"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of boris@yakobowski.org designates 188.165.48.29 as permitted sender) identity=mailfrom; client-ip=188.165.48.29; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="boris@yakobowski.org"; x-sender="boris@yakobowski.org"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@11.mo1.mail-out.ovh.net) identity=helo; client-ip=188.165.48.29; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="boris@yakobowski.org"; x-sender="postmaster@11.mo1.mail-out.ovh.net"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0BWAgCumUVXgB0wpbxbFoN3fQaodowChQeBdiSFMzoCgTUHOhIBAQEBAQEBAREBAQsLCQkhL4IughYBAQMBEhFJAgsFCwsLGh0CAiISAQUBChIGEwgKEIdzAw8IBAqkdoExPjGLO4YjhxoDhDMBAQEBBgEBAQEBIolwgQOEIgEBgxyCWQEEhkEMgTSHFokghgCCeIUoZoFRjGWODRIegQ8PGAWCOx2BTWwBAYhQgTUBAQE X-IPAS-Result: A0BWAgCumUVXgB0wpbxbFoN3fQaodowChQeBdiSFMzoCgTUHOhIBAQEBAQEBAREBAQsLCQkhL4IughYBAQMBEhFJAgsFCwsLGh0CAiISAQUBChIGEwgKEIdzAw8IBAqkdoExPjGLO4YjhxoDhDMBAQEBBgEBAQEBIolwgQOEIgEBgxyCWQEEhkEMgTSHFokghgCCeIUoZoFRjGWODRIegQ8PGAWCOx2BTWwBAYhQgTUBAQE X-IronPort-AV: E=Sophos;i="5.26,364,1459807200"; d="scan'208,217";a="219690112" Received: from 11.mo1.mail-out.ovh.net ([188.165.48.29]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 25 May 2016 14:29:51 +0200 Received: from player729.ha.ovh.net (b9.ovh.net [213.186.33.59]) by mo1.mail-out.ovh.net (Postfix) with ESMTP id 5605DFFB91A for ; Wed, 25 May 2016 14:29:51 +0200 (CEST) Received: from mail-wm0-f46.google.com (mail-wm0-f46.google.com [74.125.82.46]) (Authenticated sender: boris@yakobowski.org) by player729.ha.ovh.net (Postfix) with ESMTPSA id 455C15E006F for ; Wed, 25 May 2016 14:29:51 +0200 (CEST) Received: by mail-wm0-f46.google.com with SMTP id z87so60580092wmh.0 for ; Wed, 25 May 2016 05:29:51 -0700 (PDT) X-Gm-Message-State: ALyK8tKaHIshCI78ck6MtTssjenbK/seGCE7bwRv9J+jyOgYJBtJTvFH3VACZjjZ5Nmgqz4Gj5HBbEqt42gh/w== X-Received: by 10.194.22.8 with SMTP id z8mr3547206wje.102.1464179391012; Wed, 25 May 2016 05:29:51 -0700 (PDT) MIME-Version: 1.0 Reply-To: boris@yakobowski.org Received: by 10.194.137.84 with HTTP; Wed, 25 May 2016 05:29:31 -0700 (PDT) In-Reply-To: <965631B03C670145ABB9F693E51622530D1279D7@DENBGAT9EK5MSX.ww902.siemens.net> References: <965631B03C670145ABB9F693E51622530D1279D7@DENBGAT9EK5MSX.ww902.siemens.net> From: Boris Yakobowski Date: Wed, 25 May 2016 14:29:31 +0200 X-Gmail-Original-Message-ID: Message-ID: To: "Neuhaeusser, Martin" Cc: "caml-list@inria.fr" Content-Type: multipart/alternative; boundary=047d7b5d350a7d3aaf0533a9d46a X-Ovh-Tracer-Id: 11150631204441536543 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -51 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrfeekledrgeefgdeglecutefuodetggdotefrodftvfcurfhrohhfihhlvgemucfqggfjnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenogfuuhhsphgvtghtffhomhgrihhnucdlgeelmd X-Validation-by: boris@yakobowski.org Subject: Re: [Caml-list] Hash consed Patricia trees --047d7b5d350a7d3aaf0533a9d46a Content-Type: text/plain; charset=UTF-8 Hi, The Value Analysis plugin of Frama-C uses hash-consing of Patricial trees extensively. In fact, some analyses would not run without it at all. See Section 9 of cristal.inria.fr/~doligez/publications/cuoq-doligez-mlw-2008.ps for more details. Unfortunately, as mentioned there, no figures exist for with hash-consing vs. without hash-consing -- but most of the examples would have failed without it. Although I'm not sure what was implemented exactly at the time, one important feature when using hash-consed Patricia trees is the possibility of using caches. Alain mentioned this in this mail: > Also, you get a nice unique integer for each tree. This allow you to > memoize efficiently set operations (like union, intersection, for which > you can use memoization in the inner loop, not only at toplevel), and to > build sets of sets (and so on). I should stress that the possibility of memoizing *in the inner loop*, is crucial. When performing e.g. unions or map2 operations, it is possible to return a result in constant time when either - the two trees is equivalent (because e.g. union s s == s) - the two trees have already been merged, and the result is in the cache. In practice, most operations become O(D ln D), where D is the number of differences between the two trees, or even O(1) if the cache is big enough and the operations repetitive enough. If this kind of caching may be useful to you, the files hptmap*.ml* of Frama-C provides very nice iterators and abstractions. HTH, On Mon, May 23, 2016 at 4:33 PM, Neuhaeusser, Martin < martin.neuhaeusser@siemens.com> wrote: > Dear all, > > during some experiments with integer set implementations, I came across a > discussion on that list that proposed to use Patricia trees and hash > consing on the tree nodes' constructors to achieve maximal sharing: > > http://caml.inria.fr/pub/ml-archives/caml-list/2008/03/5be97d51e2e8aab16b9e7e369a5a5533.en.html > > Is anyone aware of a corresponding implementation that also has a > performance benefit (or, at least, no negative performance impact) compared > to standard sets or to non-hash consed Patricia trees? Or is anyone aware > of a paper on that matter? > > Sadly, in all my experiments, the combination of Patricia trees with hash > consing applied to the terms representing the tree has a horrible impact on > performance (a slowdown by an order of magnitude). After spending some > thoughts, this seems to be reasonable given the structure of a Patricia > tree. In particular, we found no way to make significand use of the > reflexivity properties obtained by hash consing in set operations like > subset or union. In our benchmarks, the time for constructing hash-consed > subtrees during set operations outweighs any gains obtained by the > "physical equality = set equality" property. Or is the whole point in the > earlier discussion the possibility to use hash consing tags for memoization > of set operations? > > Any hints and comments are highly appreciated. It would really be great if > some of the participants from the 2008 discussion could perhaps share their > experience. > > Best regards, > Martin > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs -- Boris --047d7b5d350a7d3aaf0533a9d46a Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi,

The Value Analys= is plugin of Frama-C uses hash-consing of Patricial trees extensively. In f= act, some analyses would not run without it at all. See Section 9 of cristal.inria.fr/~doligez/publications/cuoq-doligez-mlw-= 2008.ps for more details. Unfortunately, as mentioned there, no = figures exist for with hash-consing vs. without hash-consing -- but most of= the examples would have failed without it.

Although I'm n= ot sure what was implemented exactly at the time, one important feature whe= n using hash-consed Patricia trees is the possibility of using caches. Alai= n mentioned this in this mail:
> Also, you get a nice=
 unique integer for each tree. This allow you to=20
> memoize efficiently set operations (like union, intersection, for whic=
h=20
> you can use memoization in the inner loop, not only at toplevel), and =
to=20
> build sets of sets (and so on).
I should stress that the possibil= ity of memoizing *in the inner loop*, is crucial. When performing e.g. unio= ns or map2 operations, it is possible to return a result in constant time w= hen either
- the two trees is equivalent (because e.g. union s s = =3D=3D s)
- the two trees have already been merged, and the result= is in the cache.
In practice, most operations become O(D ln = D), where D is the number of differences between the two trees, or even O(1= ) if the cache is big enough and the operations repetitive enough.

If this kind of caching may be useful to you, the files hpt= map*.ml* of Frama-C provides very nice iterators and abstractions.
=

HTH,


On Mon, M= ay 23, 2016 at 4:33 PM, Neuhaeusser, Martin <martin.neuhaeu= sser@siemens.com> wrote:
De= ar all,

during some experiments with integer set implementations, I came across a d= iscussion on that list that proposed to use Patricia trees and hash consing= on the tree nodes' constructors to achieve maximal sharing:
http:= //caml.inria.fr/pub/ml-archives/caml-list/2008/03/5be97d51e2e8aab16b9e7e369= a5a5533.en.html

Is anyone aware of a corresponding implementation that also has a performan= ce benefit (or, at least, no negative performance impact) compared to stand= ard sets or to non-hash consed Patricia trees? Or is anyone aware of a pape= r on that matter?

Sadly, in all my experiments, the combination of Patricia trees with hash c= onsing applied to the terms representing the tree has a horrible impact on = performance (a slowdown by an order of magnitude). After spending some thou= ghts, this seems to be reasonable given the structure of a Patricia tree. I= n particular, we found no way to make significand use of the reflexivity pr= operties obtained by hash consing in set operations like subset or union. I= n our benchmarks, the time for constructing hash-consed subtrees during set= operations outweighs any gains obtained by the "physical equality =3D= set equality" property. Or is the whole point in the earlier discussi= on the possibility to use hash consing tags for memoization of set operatio= ns?

Any hints and comments are highly appreciated. It would really be great if = some of the participants from the 2008 discussion could perhaps share their= experience.

Best regards,
Martin

--
Caml-list mailing list.=C2=A0 Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocam= l_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
<= /blockquote>



--
Boris
--047d7b5d350a7d3aaf0533a9d46a--