caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Francois BERENGER <francois.c.berenger@vanderbilt.edu>
To: OCaml Mailing List <caml-list@inria.fr>
Subject: [Caml-list] [ANN] first release of minivpt: a minimalist vantage-point tree implementation in OCaml
Date: Mon, 27 Mar 2017 09:53:14 -0500	[thread overview]
Message-ID: <703b6bb1-7a0d-27d6-4d10-cd52fdf03500@vanderbilt.edu> (raw)

Dear list,

A vantage point tree allows to do fast (but exact) nearest neighbor 
searches in a space of any dimension provided that you have a distance 
function (to measure the distance between any two points in that space).

The code is here:
https://github.com/UnixJunkie/vp-tree

The interface looks like this:
---
type 'a t

val create: ('a -> 'a -> float) -> 'a list -> 'a t

val nearest_neighbor: ('a -> 'a -> float) -> 'a -> 'a t -> float * 'a

val to_list: 'a t -> 'a list

val is_empty: 'a t -> bool
---

The creation of a vp-tree can take some time.
Queries are supposed to be fast.
In my tests, queries can run several times faster than a brute
force search once you have enough points indexed by your vpt.

The implementation creates an optimal vp-tree; don't use it
on large points set (> 10,000 points).

You can install it via:
$ opam install minivpt

Contributions are very welcome to support large point sets, queries
with a tolerance parameter (as in "only points within distance to query
less than the tolerance"), and returning all points (instead of just 
one) if there are several points within the same distance to your query 
point.

A usage example might be:
---
let vpt = Vp_tree.create distance_fun points in
let dist_to_query, nearest_point = \
Vp_tree.nearest_neighbor distance_fun query_point vpt in
[...]
---

My implementation follows the paper:
"Data Structures and Algorithms for Nearest Neighbor Search in General 
Metric Spaces" by Peter N. Yianilos.

Regards,
Francois.

             reply	other threads:[~2017-03-27 14:53 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-27 14:53 Francois BERENGER [this message]
2017-03-28  8:47 ` François Bobot
2017-03-28 13:22   ` Francois BERENGER

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=703b6bb1-7a0d-27d6-4d10-cd52fdf03500@vanderbilt.edu \
    --to=francois.c.berenger@vanderbilt.edu \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).