caml-list - the Caml user's mailing list
From: tim@fungible.com (Tim Freeman)
To: georg.g@home.se
Cc: garrigue@kurims.kyoto-u.ac.jp, caml-list@inria.fr
Subject: Hashing research (was Re: [Caml-list] Big executables ...)
Date: Tue, 19 Mar 2002 19:46:24 -0700
Message-ID: <1191-Tue19Mar2002204259-0800-tim@fungible.com>
In-Reply-To: <00d901c1cf94$c5fa7700$f58c72d5@invariant.se> (message from Johan Georg Granström on Tue, 19 Mar 2002 23:21:02 +0100)

From: georg.g@home.se
>IMHO this is a perfect research problem:
>
>Find a mapping H:S->B where S is the set of module signatures and
>B is the set of binary (arbitrary length) strings. Such that if and only if
>s_1 is a subset of s_2 then there is some relation between H(s_1) and
>H(s_2), thus  s_1<s_2 iff H(s_1) R H(s_2).
>
>Perhaps you could drop "and only if" and let H(s_1) R H(s_2) imply
>s_1 < s_2 with 99.9...% certainty.

I don't think you can do it with constant-sized hashes.  For instance,
if s_2 has 100 elements, then it has 2 ** 100 subsets.  Since R has to
behave correctly on most of those 2 ** 100 subsets, those subsets need
to have almost 2 ** 100 different hashes, so your hash can't be
shorter than about 100 bits.
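
The counting argument above can be checked by brute force on a small
set. This is only a sketch: the helper names are hypothetical, and
truncated MD5 stands in for "some hash of a fixed size". It counts how
many subsets are forced to share a hash value when the hash has fewer
bits than the set has elements.

```python
import hashlib
from itertools import combinations


def subsets(elements):
    """Yield every subset of `elements` as a frozenset (2 ** n in total)."""
    for size in range(len(elements) + 1):
        for combo in combinations(elements, size):
            yield frozenset(combo)


def truncated_hash(subset, bits):
    """Hash a subset to `bits` bits (1..128) by truncating its MD5 digest.

    Assumption: a subset is serialized as its sorted, comma-joined names.
    """
    data = ",".join(sorted(subset)).encode()
    digest = hashlib.md5(data).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def count_collisions(elements, bits):
    """Count subsets that land on a hash value some earlier subset already used."""
    buckets = {}
    for s in subsets(elements):
        buckets.setdefault(truncated_hash(s, bits), []).append(s)
    return sum(len(group) - 1 for group in buckets.values())
```

With four elements and a 3-bit hash, `count_collisions(list("abcd"), 3)`
is at least 8, because 16 subsets have to share at most 8 hash values.
The same pigeonhole bound applies to any fixed-size hash, not just MD5.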

You have to know the name of each entry point into the library anyway
so you can do the linking.  We could just have one hash of the type
per entry point.  Hmm; MD5 is only 16 bytes, or 32 bytes of hex, or 22
bytes of base 62 (digits plus upper and lower case letters), so maybe
we just append the MD5 checksum of the type to the end of the symbol.
If that's too much and we're willing to have less-than-cryptographic
security, we could truncate the added checksum to whatever number of
bits is small enough and still have a very good chance of getting the
right answer.
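
A rough sketch of that symbol scheme follows. Everything here is an
assumption made for illustration: the `mangle` name, the `_` separator,
and the idea that each type is available as a canonical string. It
appends a base-62 encoding of the (optionally truncated) MD5 of the
type to the symbol name.

```python
import hashlib
import string

# Digits plus upper and lower case letters: 62 symbols in total.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_width(bits):
    """Smallest number of base-62 characters that can hold `bits` bits."""
    width = 0
    while 62 ** width < 2 ** bits:
        width += 1
    return width


def to_base62(value, width):
    """Encode a non-negative integer as exactly `width` base-62 characters."""
    chars = []
    for _ in range(width):
        value, remainder = divmod(value, 62)
        chars.append(BASE62[remainder])
    return "".join(reversed(chars))


def mangle(symbol, type_signature, bits=128):
    """Append a checksum of the type to the symbol name.

    `bits` (1..128) keeps only the leading bits of the MD5 digest,
    trading collision resistance for shorter symbols.
    """
    digest = hashlib.md5(type_signature.encode()).digest()
    value = int.from_bytes(digest, "big") >> (128 - bits)
    return symbol + "_" + to_base62(value, base62_width(bits))
```

For example, `mangle("map", "('a -> 'b) -> 'a list -> 'b list")` adds a
22-character suffix, while passing `bits=32` shortens it to 6
characters. A linker that compares whole symbol names would then refuse
to resolve a reference whose type checksum differs from the
definition's.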

-- 
Tim Freeman       
tim@fungible.com

