From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 4659F7EE51 for ; Fri, 12 Apr 2013 17:01:58 +0200 (CEST) Received-SPF: Neutral (mail2-smtp-roc.national.inria.fr: domain of simon.cruanes.2007@m4x.org does not assert whether or not 129.104.30.34 is permitted sender) identity=pra; client-ip=129.104.30.34; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="SRS0=/W0k=N7=m4x.org=simon.cruanes.2007@bounces.m4x.org"; x-sender="simon.cruanes.2007@m4x.org"; x-conformance=sidf_compatible; x-record-type="spf2.0" Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of SRS0=/W0k=N7=m4x.org=simon.cruanes.2007@bounces.m4x.org designates 129.104.30.34 as permitted sender) identity=mailfrom; client-ip=129.104.30.34; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="SRS0=/W0k=N7=m4x.org=simon.cruanes.2007@bounces.m4x.org"; x-sender="SRS0=/W0k=N7=m4x.org=simon.cruanes.2007@bounces.m4x.org"; x-conformance=sidf_compatible; x-record-type="spf2.0" Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of postmaster@mx1.polytechnique.org designates 129.104.30.34 as permitted sender) identity=helo; client-ip=129.104.30.34; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="SRS0=/W0k=N7=m4x.org=simon.cruanes.2007@bounces.m4x.org"; x-sender="postmaster@mx1.polytechnique.org"; x-conformance=sidf_compatible; x-record-type="v=spf1" X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ar4BAJQgaFGBaB4inGdsb2JhbABQgzyDLr4ygQwWDgEBAQEBCAsJCRQogh8BAQUjBAsBVgkCGAICBRYLAgIJAwIBAgFFEwgBAYgQDI9GmwSSUoEjjUFQghiBEwOTNolRjGGBOA X-IPAS-Result: Ar4BAJQgaFGBaB4inGdsb2JhbABQgzyDLr4ygQwWDgEBAQEBCAsJCRQogh8BAQUjBAsBVgkCGAICBRYLAgIJAwIBAgFFEwgBAYgQDI9GmwSSUoEjjUFQghiBEwOTNolRjGGBOA X-IronPort-AV: E=Sophos;i="4.87,462,1363129200"; d="scan'208";a="13027887" Received: from mx1.polytechnique.org ([129.104.30.34]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/ADH-AES256-SHA; 12 Apr 2013 17:01:57 +0200 Received: from [128.93.135.248] (guest-rocq-135248.inria.fr [128.93.135.248]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ssl.polytechnique.org (Postfix) with ESMTPSA id 3610A140C2969 for ; Fri, 12 Apr 2013 17:01:57 +0200 (CEST) Message-ID: <516821DB.7060600@m4x.org> Date: Fri, 12 Apr 2013 17:01:47 +0200 From: simon cruanes User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130403 Thunderbird/17.0.5 MIME-Version: 1.0 To: caml-list@inria.fr References: <58e60ce1.4599.13dfeacfdb3.Coremail.syshen@nudt.edu.cn> In-Reply-To: <58e60ce1.4599.13dfeacfdb3.Coremail.syshen@nudt.edu.cn> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-AV-Checked: ClamAV using ClamSMTP at svoboda.polytechnique.org (Fri Apr 12 17:01:57 2013 +0200 (CEST)) X-Spam-Flag: No, tests=bogofilter, spamicity=0.119905, queueID=55E62140C2977 X-Org-Mail: simon.cruanes.2007@polytechnique.org Subject: Re: [Caml-list] [CAML]:: efficient data structure for storing and searching int list list If the order in the lists does not matter, I would suggest some kind of Trie (http://en.wikipedia.org/wiki/Trie) to store the *sorted* int lists; the algorithm for search would recursively explore all the branches of the trie that can be a superlist of the input list. Here is a code snippet (not thoroughly tested): (* ------------------- %< ------ >% ----------------- *) type trie = | Node of bool * (* end of a list? *) (int * trie) list (* subtries, indexed by their first element *) let empty = Node (false, []) (* add [l] to [trie], assuming [l] is sorted *) let rec add trie l = match trie, l with | Node (_, subtries), [] -> Node (true, subtries) | Node (b, subtries), x::l' -> let subtrie = try List.assoc x subtries with Not_found -> Node (false, []) in (* recursive add *) let subtrie = add subtrie l' in let subtries = List.remove_assoc x subtries in Node (b, (x,subtrie) :: subtries) (* find whether [l] is a sublist of some list of [trie] *) let rec find trie l = match trie, l with | _, [] -> true | Node (_, subtries), (x::l') -> find_list x subtries l' and find_list x subtries l' = match subtries with | [] -> false | (y,subtrie)::subtries' -> (if y < x then find subtrie (x::l') else if y = x then find subtrie l' else false) || find_list x subtries' l' (* ------------------- %< ------ >% ----------------- *) Cheers! Simon On 12/04/2013 16:36, 沈胜宇 wrote: > Dear all: > > I have an int list list, whose name is LL > > and I need to frequently decide whether a particular int list, whose > name is L, is a sublist of an element of LL. > > Is there any efficent data structure to do this? > > At the mean time, I store LL as (int, bool) Hashtbl.t list, that is, > each element of LL is stored as a hash table. > > So searching L in LL is reduce to decide whether there exist an element > of LL, such every element of L hit in this element. > > At the mean time, the space is not a big problem, but the run time > overhead is major concern, > > So if there exist any more faster data structure? > > Thank you > > Shen > >