From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id 81065BBC1 for ; Fri, 7 Mar 2008 11:09:16 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AsoBAIKj0EfAXQImh2dsb2JhbACQfgEBAQgKKYENmXU X-IronPort-AV: E=Sophos;i="4.25,461,1199660400"; d="scan'208";a="8112537" Received: from discorde.inria.fr ([192.93.2.38]) by mail2-smtp-roc.national.inria.fr with ESMTP; 07 Mar 2008 11:09:10 +0100 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id m27A992I019295 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Fri, 7 Mar 2008 11:09:10 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AlkBAAqj0EfBL1AZk2dsb2JhbACQfgEBAQEHBAYJIIENmXQ X-IronPort-AV: E=Sophos;i="4.25,461,1199660400"; d="scan'208";a="9986882" Received: from gw.exalead.com (HELO exalead.com) ([193.47.80.25]) by mail3-smtp-sop.national.inria.fr with ESMTP; 07 Mar 2008 11:09:09 +0100 Received: from [192.168.204.148] (madpc064.exalead.com [192.168.204.148]) (authenticated bits=0) by exalead.com (8.14.0/8.14.0) with ESMTP id m27A965g018244; Fri, 7 Mar 2008 11:09:08 +0100 Message-ID: <47D11442.6090409@exalead.com> Date: Fri, 07 Mar 2008 11:09:06 +0100 From: Berke Durak User-Agent: Thunderbird 1.5.0.10 (X11/20070221) MIME-Version: 1.0 To: "Harrison, John R" Cc: Caml-list List Subject: Re: [Caml-list] Canonical Set/Map datastructure? References: <47CECF23.1020508@exalead.com> <47CFBF04.9030703@exalead.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Miltered: at discorde with ID 47D11445.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; berke:01 durak:01 berke:01 durak:01 hash-consing:01 hashtable:01 worst-case:01 hashes:01 hash:01 worst-case:01 hash:01 hashtbl:01 proble:01 bounded:01 nodes:01 Harrison, John R a écrit : > Hi Berke, > > | However, the idea of combining hash-consing and Patricia trees, > | however elegant, does not suit my problem. Basically, you are > | cheating by using an auxiliary data structure, the hashtable (which is > | also O(n^2) worst-case). > > The code I pointed you to uses hashes, but not hash tables. As for the > worst-case efficiency, what you say is true, but it's unlikely to > matter in practice. And I suspect it may be unavoidable in principle, > which is the interesting thing I learned last time this was discussed. > See for example the following paper and its references: Hi John, Thanks for your code. I'm sorry I thought you maintained a separate hash table. It's very interesting code and I'll give it a try. But it seems to have two properties which make it unsuitable for the particular application I had in mind: - The theoretical worst-case performance of hash-based data structures can be attained if the hash function has bad dispersal. As the code relies on Hashtbl.hash, which does not hash deeply, this could potentially be a proble, in particular if your keys have long "common prefixes" that are not distinguished by the hash function. It might work well in practice but I feel a little uncomfortable. - Also, it does not preserve the natural order for keys, and that is particularly bad, because I often use, for instance, float-indexed maps or sets as a substitute for heaps. The paper with the inverse cubic lower bound is *very* interesting; we don't see plausible lower bounds often in complexity theory, especially with such large assumptions (just bounded out-degree for the graph nodes). So it seems there is little hope to have a drop-in replacement for Set or Map that is compatible with the natural order of things, a.k.a., Pervasives.compare. -- Berke DURAK