To me, the best compromise is to leave the default behaviour of Hashtbl unchanged, but to document this aspect properly in the manual (with a succinct explanation and one relevant pointer). That way:
- you don't break compatibility
- you keep default reproducibility (which is a real feature)
- you teach beginners like myself about the tough aspects of using a data structure in some frequent use cases.

Randomisation by default seems much too intrusive to me (notwithstanding the importance of web programming for OCaml's future ;o)), and for changes that modify the semantics of a program, you have to be in control. So I second Paolo's opinion.

However, assuming that OCaml users will be "aware of attacks", at least more so than users of other languages, is not only debatable but also irrelevant: there are beginners in OCaml too (who may also be beginners in programming), and they should be taken care of. I think a small note in the docs on the DoS vulnerability that may arise from using Hashtbl in a web application is enough to protect users, and it is of interest to the casual reader of the Hashtbl documentation. This note could appear in a different font/color from the main description of [Hashtbl.create], to preserve the readability of the docs.
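
To make the trade-off concrete, here is a minimal sketch of what such a note could show. It assumes the seeding support mentioned in this thread is exposed as an optional ~random argument to Hashtbl.create (which is how OCaml 4.00 ended up exposing it); keys_in_order and request_params are names I made up for the illustration.

```ocaml
(* Sketch only: assumes the optional ~random argument of Hashtbl.create
   (as shipped in OCaml 4.00). Helper names are hypothetical. *)

(* Insert the same bindings into a table and return its iteration order. *)
let keys_in_order (tbl : (string, int) Hashtbl.t) : string list =
  List.iteri (fun i k -> Hashtbl.replace tbl k i)
    ["alpha"; "beta"; "gamma"; "delta"];
  Hashtbl.fold (fun k _ acc -> k :: acc) tbl []

(* Unseeded tables: the same insertions give the same iteration order. *)
let reproducible_a = keys_in_order (Hashtbl.create ~random:false 16)
let reproducible_b = keys_in_order (Hashtbl.create ~random:false 16)

(* Opt-in seeded table, e.g. for keys taken from untrusted HTTP requests.
   Lookups behave as usual; only the internal layout and the iteration
   order depend on the random seed. *)
let request_params : (string, string) Hashtbl.t =
  Hashtbl.create ~random:true 16

let () =
  Hashtbl.replace request_params "user" "ph";
  Printf.printf "same order: %b\n" (reproducible_a = reproducible_b);
  Printf.printf "user = %s\n" (Hashtbl.find request_params "user")
```

The point is that the programmer chooses, table by table: reproducibility stays the default, and only tables keyed by attacker-controlled data pay the price of randomisation.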

my 2 cents
ph.

2012/3/13 Paolo Donadeo <p.donadeo@gmail.com>
In my humble opinion, here we have two different visions of what
computer programming is, or should be. Your statement "maybe it's
better to assume that the programmer will not be aware of attacks" may
be true for the average Java programmer (please, no flame, no insult
intended to Java programmers reading this list!) but not for an OCaml
programmer. I want to be perfectly aware of attacks, and I want to be
in control of the data structure I use, and not "be unaware"...

In Python, the other language I use every day, dictionaries are
implemented as hash tables and not having reproducibility is a PITA.


--
Paolo


On Tue, Mar 13, 2012 at 10:54, Romain Bardou <bardou@lsv.ens-cachan.fr> wrote:
> Hi,
>
>
>> As you and Gerd said, the new Hashtbl implementation in the upcoming
>> major release has everything needed to randomize hash tables by
>> seeding.  The question at this point is whether randomization should
>> be the default or not: some of our big users who don't do Web stuff
>> value reproducibility highly...  We (OCaml core developers) will take
>> a decision soon.
>
>
> FWIW, as a developer I do not expect reproducibility from Hash tables (nor
> from the Random module actually) but I do expect some way to control
> reproducibility (i.e. read the current seed, give my own seed). Maybe it's
> better to assume that the programmer will not be aware of attacks, and
> provide him with a safer environment.
>
> On the other hand, when you find a bug and need reproducibility, it's too
> late if you have used a random seed without recording it. And could it break
> some existing applications?
>
> I guess you('re) already had(having) this discussion though.
>
> Cheers,
>
> --
> Romain


--
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs