caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Ivan Gotovchits <ivg@ieee.org>
To: jean-marc.alliot@irit.fr
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] Choosing a random element in a Map or a Set
Date: Thu, 18 Mar 2021 08:34:21 -0400	[thread overview]
Message-ID: <CALdWJ+yp0FS2Qmbb54+xNjeiXjRbnp1P0pDRZcnMC66CBQKpWA@mail.gmail.com> (raw)
In-Reply-To: <3e61c49d-8b20-378b-0c24-75b8da14b4f7@alliot.org>

[-- Attachment #1: Type: text/plain, Size: 2455 bytes --]

On Thu, Mar 18, 2021 at 5:41 AM <jean-marc.alliot@irit.fr> wrote:

> Dear list,
>
> I am trying to take a random element from a Map or a Set.
>
> Currently, I generate one random int between 1 et Card(map), and I iter
> until I reach this element. The solution is in O(n) which is not great...
>
> I was about to code the following solution: take the current Map code
> and add a function which follows the balanced binary tree from the root
> and takes randomly a left or right turn at each node until we reach a
> leaf (we only need to generate one random int and use its binary
> representation to choose the left or right direction at each node). This
> should be in O(log(n))
>
> Before I start coding like an idiot:
> 1) Is there another, more intelligent, solution?
>

I would create an array of keys and then randomly choose a key from that
array. If you want to be on the functional side, you can shuffle the array
and turn it into a list.
This solution involves some duplication but is easy to implement and is
especially useful if you do not what repetitions in your selection.

An alternative solution that doesn't suffer from an extra memory overhead
would be writing a function that generates a random key (instead of a
random integer). It is not always possible,
especially when the domain of keys is infinite. However, if your set/map is
total (i.e., maps all keys in the domain to values) then it is the perfect
solution.

Finally, as an amalgamation of the above approaches, you can hash-cons your
keys (essentially turning keys into integers). It usually involves an extra
data structure to maintain your index,
but gives you the additional benefits of hash-consing, e.g., faster and
tighter maps and sets.



> 2) Is my solution biased? I think it is not, as long as the tree is
> properly balanced but...
>

OCaml trees are relaxed AVL trees with the maximum difference between trees
height equal to 2 (not 1 as in classical AVL). For small trees, it is quite
a substantial difference, e.g., imagine a tree of 9 elements with the left
tree having a height of 3 and the right of height 1.
The right tree will have only one element and the left will have 7
elements. Therefore, the probability of selecting the right element will be
1/2 vs 1/9. Even for larger trees, it remains the problem as a large tree
is made of small subtrees :) Therefore your probability distribution
will be far from uniform.



>
> Thanks
>
> Jean-Marc
>
>
>

[-- Attachment #2: Type: text/html, Size: 3348 bytes --]

  parent reply	other threads:[~2021-03-18 12:35 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-18  9:44 jean-marc.alliot
2021-03-18 10:06 ` François Pottier
2021-03-18 14:20   ` Simon Cruanes
2021-03-18 17:06     ` jean-marc.alliot
2021-03-18 11:10 ` Nicolas Barnier
2021-03-18 11:32   ` François Pottier
2021-03-19  1:32   ` Francois Berenger
2021-03-18 12:34 ` Ivan Gotovchits [this message]
2021-03-18 12:48 ` Ivan Gotovchits

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALdWJ+yp0FS2Qmbb54+xNjeiXjRbnp1P0pDRZcnMC66CBQKpWA@mail.gmail.com \
    --to=ivg@ieee.org \
    --cc=caml-list@inria.fr \
    --cc=jean-marc.alliot@irit.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).