From: simon cruanes <simon.cruanes.2007@m4x.org>
To: caml-list@inria.fr
Subject: Re: [Caml-list] [CAML]:: efficient data structure for storing and searching int list list
Date: Fri, 12 Apr 2013 17:01:47 +0200 [thread overview]
Message-ID: <516821DB.7060600@m4x.org> (raw)
In-Reply-To: <58e60ce1.4599.13dfeacfdb3.Coremail.syshen@nudt.edu.cn>
If the order in the lists does not matter, I would suggest some kind of
Trie (http://en.wikipedia.org/wiki/Trie) to store the *sorted* int
lists; the algorithm for search would recursively explore all the
branches of the trie that can be a superlist of the input list.
Here is a code snippet (not thoroughly tested):
(* ------------------- %< ------ >% ----------------- *)
type trie =
| Node of bool * (* end of a list? *)
(int * trie) list (* subtries, indexed by their first
element *)
let empty = Node (false, [])
(* add [l] to [trie], assuming [l] is sorted *)
let rec add trie l =
match trie, l with
| Node (_, subtries), [] -> Node (true, subtries)
| Node (b, subtries), x::l' ->
let subtrie =
try List.assoc x subtries
with Not_found -> Node (false, []) in
(* recursive add *)
let subtrie = add subtrie l' in
let subtries = List.remove_assoc x subtries in
Node (b, (x,subtrie) :: subtries)
(* find whether [l] is a sublist of some list of [trie] *)
let rec find trie l =
match trie, l with
| _, [] -> true
| Node (_, subtries), (x::l') ->
find_list x subtries l'
and find_list x subtries l' = match subtries with
| [] -> false
| (y,subtrie)::subtries' ->
(if y < x then find subtrie (x::l')
else if y = x then find subtrie l'
else false) || find_list x subtries' l'
(* ------------------- %< ------ >% ----------------- *)
Cheers!
Simon
On 12/04/2013 16:36, 沈胜宇 wrote:
> Dear all:
>
> I have an int list list, whose name is LL
>
> and I need to frequently decide whether a particular int list, whose
> name is L, is a sublist of an element of LL.
>
> Is there any efficent data structure to do this?
>
> At the mean time, I store LL as (int, bool) Hashtbl.t list, that is,
> each element of LL is stored as a hash table.
>
> So searching L in LL is reduce to decide whether there exist an element
> of LL, such every element of L hit in this element.
>
> At the mean time, the space is not a big problem, but the run time
> overhead is major concern,
>
> So if there exist any more faster data structure?
>
> Thank you
>
> Shen
>
>
next prev parent reply other threads:[~2013-04-12 15:01 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-12 14:36 沈胜宇
2013-04-12 15:01 ` simon cruanes [this message]
2013-04-12 15:48 ` Jean-Francois Monin
2013-04-13 6:58 ` 沈胜宇
2013-04-13 7:56 ` Gabriel Scherer
2013-04-12 22:15 ` Toby Kelsey
2013-04-13 6:57 ` 沈胜宇
2013-04-23 9:05 ` Goswin von Brederlow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=516821DB.7060600@m4x.org \
--to=simon.cruanes.2007@m4x.org \
--cc=caml-list@inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).