From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by c5ff346549e7 (Postfix) with ESMTPS id 91C8E5D4 for ; Thu, 18 Mar 2021 12:35:25 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.81,258,1610406000"; d="scan'208,217";a="498663248" Received: from prod-listesu18.inria.fr (HELO sympa.inria.fr) ([128.93.162.160]) by mail2-relais-roc.national.inria.fr with ESMTP; 18 Mar 2021 13:35:22 +0100 Received: by sympa.inria.fr (Postfix, from userid 20132) id BA908E0148; Thu, 18 Mar 2021 13:35:22 +0100 (CET) Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id C326AE0131 for ; Thu, 18 Mar 2021 13:35:19 +0100 (CET) Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=ivg@ieee.org; spf=Pass smtp.mailfrom=ivg@ieee.org; spf=None smtp.helo=postmaster@mail-oo1-f52.google.com IronPort-PHdr: =?us-ascii?q?A9a23=3Av4aZ1x1dfdz2oX4wsmDOSgQyDhhPgJ3EezUN459?= =?us-ascii?q?isYplN5qZl7zcNUDSrc9gkEXOFd2Cra4d2qyP6P6rAj1IyK3CmUhKSIZLWR4Bh?= =?us-ascii?q?JdetC0bK+nBN3fGKuX3ZTcxBsVIWQwt1Xi6NU9IBJS2PAWK8TW94jEIBxrwKxd?= =?us-ascii?q?+KPjrFY7OlcS30P2594HObwlSizexfLd/IA+roQjft8QajoVvJ6IswRbVv3VEf?= =?us-ascii?q?Phby3l1LlyJhRb84cmw/J9n8ytOvv8q6tBNX6bncakmVLJUFDspPXw7683trhn?= =?us-ascii?q?DUBCA5mAAXWUMkxpHGBbK4RfnVZrsqCT6t+592C6HPc3qSL0/RDqv47t3RBLul?= =?us-ascii?q?SwKMSMy/mPKhcxqlK9UrxyhqB5/zYDaY4+bKeRwcb/GcNwAWWZMRNxcWzBdDo6?= =?us-ascii?q?+aYYEEuoPPfxfr4n4v1YArgW+ChOqBOjyyzFIgWP23aok0+s9EQHG3RAgH8kTu?= =?us-ascii?q?3nTrdX1KqgSXPu0zKbW0zrMcela2TDn6IjHax0sp+yHUr1sf8TL00YvCx/FgUu?= =?us-ascii?q?KqYzjJz6b1usAvWab4ud8Se6jlm0qpx9wrzWuyMkhiYrEipwLx17E6Cl0zps5K?= =?us-ascii?q?NKkRUNmb9CoDJVeuiCaOoZoXs4sQ25ltSAnwbMFoZ62ZDYGxIgjyhLFaPGKc5K?= =?us-ascii?q?E7g/iWeqMOzt1hGxpdbSijBio60eg0PfzVsys3VZKsCVFlt7Mu2gI1xPJ68iHT?= =?us-ascii?q?uJx/kCm2TqSzgzT5OFJLV4umarULJ4hxbEwlp4NvkjZAiD2n0D2gLeXdkUi5Oe?= =?us-ascii?q?o9/zqbqv6qpKYLYN5iQHzPr4zlsG+HOg0KAgDU3aD9eS5zrLj/En5QLtQjv0xl?= =?us-ascii?q?6nUqJHaJdoUpqOiAg9azJgs5AilAzehytQYkmELLEhZdxKfk4jpJ1bOLejkAvi?= =?us-ascii?q?lhlSslC5nx/THPr36HpXANWPDkbfkfbZl8UFQ0gszzdZF55JVEL4NOvzzWlWi/?= =?us-ascii?q?ODfWx00Ogrxxu/9A5N00ocfXn6nA7WYLOXcqwym/OUqdsiSbYldlzHhLOYu5//?= =?us-ascii?q?yljdtmEESVaik0JZRb2q3SKc1a36FaGbh149SWVwBuRAzGbSCoG3HaiZaYjOJZ?= =?us-ascii?q?4x55jw/D+qOCI7CQsWqh+XE0nrhWJJRYW9CBxaHFnK6L+2sa7I3cCuXZ/RZvHk?= =?us-ascii?q?cT7HJY44s2BzosxX1meIPBtqRwTURsNfY7PYw4uTSkR8o8jkcJ8WQ3mzLSHt7z?= =?us-ascii?q?Dpgeg=3D=3D?= IronPort-HdrOrdr: =?us-ascii?q?A9a23=3AolLaL66m9mBeqFY1cwPXwd+BI+orLtY04lQ7?= =?us-ascii?q?vn1ZYxpTb8CeioSSjO0WvCWZtB89elEF3eqBNq6JXG/G+fdOirU5EL++UGDd11?= =?us-ascii?q?eAA41v4IDryT+lOwCWzIBg/Ih6dawWMrzNJH17l9u/wATQKbYd6fyG6r3tueDF?= =?us-ascii?q?03x2RxprYK0I1XYbNi+/EldqADVAH4YzDpCG5sFK4wOnY2l/VLXYOlAgf8zu4+?= =?us-ascii?q?LGj4jnZxluPW9D1CCrgSmz4LD3VziUty1uNw9n+Kwv+2TJnwvy6syYwpaG4yTR?= =?us-ascii?q?3WPS8Jha8eGJoudrP8CUj9hQFzOEsHfPWK1aR7aAsDopydvD1H8WlrD3ySsIDo?= =?us-ascii?q?BY8XuUVGewuB7k2w78yl8Vmgbf4G7dpXPipMDjLQhKc/ZptMZ8ehvd601lltl5?= =?us-ascii?q?yapK0WXxjestMTrw2AD0593JUFVBk0q5pmdKq59os1VvFZEfY/tQoOUkjT5oLK?= =?us-ascii?q?s=3D?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0BrAwBYSFNgfzShVdFaHgEBCxIMghKBd?= =?us-ascii?q?IIBOjGEQnEtkCwDijeGFINthjeBaAsBAwENNAQBAYRQAoF5Ah0HAQQ1BQ0CEAE?= =?us-ascii?q?BBQEBAQIBAwMEARMBAQ0LCwgnhWoNgjgig2sBAQMBEhEdAQEjFAEECwsLNwICI?= =?us-ascii?q?hIBBQEcGRsHhTAFIZ8IgQQ9iy6BMoMEAQEGhX2BPAkSgSeGfQEBhkQmHIFKQoF?= =?us-ascii?q?HgWdQLj6EB4NQgmCBWGVLI4E3Z6lkkVeDDpxMIqQnt3MQI4FJgXl9CDsxBoIyU?= =?us-ascii?q?BkNjh8LF4hhhWEoLzgCBgEJAQEDCZA1AQE?= X-IPAS-Result: =?us-ascii?q?A0BrAwBYSFNgfzShVdFaHgEBCxIMghKBdIIBOjGEQnEtkCw?= =?us-ascii?q?DijeGFINthjeBaAsBAwENNAQBAYRQAoF5Ah0HAQQ1BQ0CEAEBBQEBAQIBAwMEA?= =?us-ascii?q?RMBAQ0LCwgnhWoNgjgig2sBAQMBEhEdAQEjFAEECwsLNwICIhIBBQEcGRsHhTA?= =?us-ascii?q?FIZ8IgQQ9iy6BMoMEAQEGhX2BPAkSgSeGfQEBhkQmHIFKQoFHgWdQLj6EB4NQg?= =?us-ascii?q?mCBWGVLI4E3Z6lkkVeDDpxMIqQnt3MQI4FJgXl9CDsxBoIyUBkNjh8LF4hhhWE?= =?us-ascii?q?oLzgCBgEJAQEDCZA1AQE?= X-IronPort-AV: E=Sophos;i="5.81,258,1610406000"; d="scan'208,217";a="376128561" X-MGA-submission: =?us-ascii?q?MDEpUgi3wbvMYGmQnlnNcHo+6I3I8kNFunNdFQ?= =?us-ascii?q?o9R03qyt6tFYThsx1YQZ9WGqd3AVd/v5HiVJEm5ELOpf8/+86a4iE8nG?= =?us-ascii?q?tZNNd9VLAxzSNP16j6SENjDZxf/MqwNgzg4vdMnVVi9OLmNCh0oPXVHo?= =?us-ascii?q?20iAKb6IAtTIbOOG00M3//Ww=3D=3D?= Received: from mail-oo1-f52.google.com ([209.85.161.52]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/AES256-GCM-SHA384; 18 Mar 2021 13:35:06 +0100 Received: by mail-oo1-f52.google.com with SMTP id c12-20020a4ae24c0000b02901bad05f40e4so1389895oot.4 for ; Thu, 18 Mar 2021 05:35:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ieee.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=1ee3yYRms3UjbUZncL2YVFxJIOPBAFmjb69dYH8oqSY=; b=CvRgC24hgk9dVaGSGnWWmpzufc9nDuHZjlOrzsUjmpaXCyhpqlos+w1MF2yoZphrjW ZgerNHcwyPaykfL2GhPSeVZJ7LQTYVkTxevLrbndZYFvLqUhaOuPbgnMto+/Qi9Mcc6y 0vctJjQ3Nnkyb5Motzj9fTkmF6c91IYw0ET9E= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=1ee3yYRms3UjbUZncL2YVFxJIOPBAFmjb69dYH8oqSY=; b=NUOjpRdtBwZox2Jrgo6v5VHqebTQkeT0XOxwyLFXxeinjpXxHxjraTTXQmEoSOvbm3 kgrYDnA/rRbSRx0wKkTELYT/kaCPlIGDsQQO18mj3I/Jq87eA4LlGko/SND/2pPfaN1F i3F7/1bJM73cokz/kbnW67lLMAuMeueaaq44U6D1dmUkEi08LRjQ28SMlQkU/XQyNzHY HruMbA0NZsX//nN9RXen+EQgaX4X28RAXLCg5w3tY1OFBHtKeHxDQwJqiZnvzh9dn6eY 2LasHdggIvOZuuf4TVxa/F1oujzqACyeaEVkwbFOc1T3rq5LLdStPle4Wk1MspB/gKg8 +1BA== X-Gm-Message-State: AOAM530gUwpmZihvpR5L//Zkq6d9jivCL8tbP4XTeOjz0SkT2fCmGJJl b7aNRYKcoN6ZK92DAVQdn8AMgxlFpzhohTDqrK0ummZRsQ1WVQ== X-Google-Smtp-Source: ABdhPJz7tfnaLLuhRVMuwpXaEEitqmtQDeuoAPmSsL/DyUYGCX+lpAgcH/zidlfb30B9WUwZm1puq+QLGK37UrujC50= X-Received: by 2002:a4a:37c5:: with SMTP id r188mr7370236oor.77.1616070904727; Thu, 18 Mar 2021 05:35:04 -0700 (PDT) MIME-Version: 1.0 References: <3e61c49d-8b20-378b-0c24-75b8da14b4f7@alliot.org> In-Reply-To: <3e61c49d-8b20-378b-0c24-75b8da14b4f7@alliot.org> From: Ivan Gotovchits Date: Thu, 18 Mar 2021 08:34:21 -0400 Message-ID: To: jean-marc.alliot@irit.fr Cc: caml-list Content-Type: multipart/alternative; boundary="00000000000035938e05bdced498" Subject: Re: [Caml-list] Choosing a random element in a Map or a Set Reply-To: Ivan Gotovchits X-Loop: caml-list@inria.fr X-Sequence: 18433 Errors-To: caml-list-owner@inria.fr Precedence: list Precedence: bulk Sender: caml-list-request@inria.fr X-no-archive: yes List-Id: List-Help: List-Subscribe: List-Unsubscribe: List-Post: List-Owner: List-Archive: Archived-At: --00000000000035938e05bdced498 Content-Type: text/plain; charset="UTF-8" On Thu, Mar 18, 2021 at 5:41 AM wrote: > Dear list, > > I am trying to take a random element from a Map or a Set. > > Currently, I generate one random int between 1 et Card(map), and I iter > until I reach this element. The solution is in O(n) which is not great... > > I was about to code the following solution: take the current Map code > and add a function which follows the balanced binary tree from the root > and takes randomly a left or right turn at each node until we reach a > leaf (we only need to generate one random int and use its binary > representation to choose the left or right direction at each node). This > should be in O(log(n)) > > Before I start coding like an idiot: > 1) Is there another, more intelligent, solution? > I would create an array of keys and then randomly choose a key from that array. If you want to be on the functional side, you can shuffle the array and turn it into a list. This solution involves some duplication but is easy to implement and is especially useful if you do not what repetitions in your selection. An alternative solution that doesn't suffer from an extra memory overhead would be writing a function that generates a random key (instead of a random integer). It is not always possible, especially when the domain of keys is infinite. However, if your set/map is total (i.e., maps all keys in the domain to values) then it is the perfect solution. Finally, as an amalgamation of the above approaches, you can hash-cons your keys (essentially turning keys into integers). It usually involves an extra data structure to maintain your index, but gives you the additional benefits of hash-consing, e.g., faster and tighter maps and sets. > 2) Is my solution biased? I think it is not, as long as the tree is > properly balanced but... > OCaml trees are relaxed AVL trees with the maximum difference between trees height equal to 2 (not 1 as in classical AVL). For small trees, it is quite a substantial difference, e.g., imagine a tree of 9 elements with the left tree having a height of 3 and the right of height 1. The right tree will have only one element and the left will have 7 elements. Therefore, the probability of selecting the right element will be 1/2 vs 1/9. Even for larger trees, it remains the problem as a large tree is made of small subtrees :) Therefore your probability distribution will be far from uniform. > > Thanks > > Jean-Marc > > > --00000000000035938e05bdced498 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Thu, Mar 18, 2021 at 5:41 AM <<= a href=3D"mailto:jean-marc.alliot@irit.fr">jean-marc.alliot@irit.fr>= wrote:
Dear lis= t,

I am trying to take a random element from a Map or a Set.

Currently, I generate one random int between 1 et Card(map), and I iter
until I reach this element. The solution is in O(n) which is not great...
I was about to code the following solution: take the current Map code
and add a function which follows the balanced binary tree from the root
and takes randomly a left or right turn at each node until we reach a
leaf (we only need to generate one random int and use its binary
representation to choose the left or right direction at each node). This should be in O(log(n))

Before I start coding like an idiot:
1) Is there another, more intelligent, solution?

<= /div>
I would create an array of keys and then randomly choose a key fr= om that array. If you want to be on the functional side, you can shuffle th= e array and turn it into a list.
This solution involves some= duplication but is easy to implement and is especially useful if you do no= t what repetitions in your selection.

An alte= rnative solution that doesn't suffer from an extra memory overhead woul= d be writing a function that generates a random key (instead of a random in= teger). It is not always possible,
especially when the domai= n of keys is infinite. However, if your set/map is total (i.e., maps all ke= ys in the domain to values) then it is the perfect solution.
=
Finally, as an amalgamation of the above approaches, you can= hash-cons your keys (essentially turning keys into integers). It usually i= nvolves an extra data structure to maintain your index,
but gives= you the additional benefits of hash-consing, e.g., faster and tighter maps= and sets.

=C2=A0
2) Is my solution biased? I think it is not, as long as the tree is
properly balanced but...

OCaml trees ar= e relaxed AVL trees with the maximum difference between trees height equal = to 2 (not 1 as in classical AVL). For small trees, it is quite a substantia= l difference, e.g., imagine a tree of 9 elements with the left tree having = a height of 3 and the right of height 1.
The right tree will have= only one element and the left will have 7 elements. Therefore, the probabi= lity of selecting the right element will be 1/2 vs 1/9. Even for larger tre= es, it remains the problem as a large tree is made of small subtrees :) The= refore your probability distribution
will be far from unifor= m.

=C2=A0

Thanks

Jean-Marc


--00000000000035938e05bdced498--