The best compromise, to me, is to keep the current default for Hashtbl but
document this aspect properly in the manual (with a succinct explanation and
one relevant pointer). That way:
- you don't break compatibility
- you keep default reproducibility (which is a real feature)
- you teach beginners like me about the tricky aspects of using a
data structure in some common use cases.
Randomisation by default seems much too intrusive to me
(notwithstanding the importance of web programming for ocaml's future ;o)),
and for changes that modify the semantics of a program, you have to be
in control. So I second Paolo's opinion.
However, assuming ocaml users will be "aware of attacks", at least more
than users of other languages, is not only controversial but also
irrelevant: there are beginners in ocaml too (who may also be beginners in
programming), and these should be taken care of. A small note in the doc on
the DoS vulnerability that may arise from the use of Hashtbl in a web
application context is enough to protect the users, I think, and is of
interest to the casual reader of the Hashtbl documentation. This note could
appear in a different font/color than the main description of
[Hashtbl.create], to preserve the readability of the docs.
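
To make the trade-off concrete, here is a minimal sketch of what such a note
could show. It assumes OCaml 4.00 or later, where [Hashtbl.create] accepts an
optional [~random] argument; the table names, helper functions and sample
bindings are made up for illustration.

```ocaml
(* Default behaviour: deterministic hashing, so the internal layout and
   the iteration order are reproducible from one run to the next. *)
let deterministic : (string, int) Hashtbl.t = Hashtbl.create 16

(* Opt-in behaviour: this table is seeded randomly when it is created,
   which makes collision-based DoS attacks harder to mount. *)
let randomized : (string, int) Hashtbl.t = Hashtbl.create ~random:true 16

(* Hypothetical helper: insert the same sample bindings into a table. *)
let fill tbl =
  List.iter
    (fun (k, v) -> Hashtbl.replace tbl k v)
    [ ("alpha", 1); ("beta", 2); ("gamma", 3) ]

(* Hypothetical helper: list the bindings in an order that does not
   depend on how the table is laid out internally. *)
let sorted_bindings tbl =
  List.sort compare (Hashtbl.fold (fun k v acc -> (k, v) :: acc) tbl [])

let () =
  fill deterministic;
  fill randomized;
  (* Both tables hold the same bindings; only the iteration order of the
     randomized one may change between runs. *)
  assert (sorted_bindings deterministic = sorted_bindings randomized)
```

A program that wants randomisation everywhere can instead call
[Hashtbl.randomize ()] once at startup, which affects the tables created
afterwards; everybody else keeps the reproducible default.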
My 2 cents,
ph.
2012/3/13 Paolo Donadeo
> In my humble opinion, here we have two different visions of what
> computer programming is, or should be. Your statement "maybe it's
> better to assume that the programmer will not be aware of attacks" may
> be true for the average Java programmer (please, no flame, no insult
> intended to Java programmers reading this list!) but not for an OCaml
> programmer. I want to be perfectly aware of attacks, and I want to be
> in control of the data structure I use, and not "be unaware"...
>
> In Python, the other language I use every day, dictionaries are
> implemented as hash tables and not having reproducibility is a PITA.
>
>
> --
> Paolo
>
>
> On Tue, Mar 13, 2012 at 10:54, Romain Bardou wrote:
> > Hi,
> >
> >
> >> As you and Gerd said, the new Hashtbl implementation in the upcoming
> >> major release has everything needed to randomize hash tables by
> >> seeding. The question at this point is whether randomization should
> >> be the default or not: some of our big users who don't do Web stuff
> >> value reproducibility highly... We (OCaml core developers) will take
> >> a decision soon.
> >
> >
> > FWIW, as a developer I do not expect reproducibility from Hash tables (nor
> > from the Random module actually) but I do expect some way to control
> > reproducibility (i.e. read the current seed, give my own seed). Maybe it's
> > better to assume that the programmer will not be aware of attacks, and
> > provide him with a safer environment.
> >
> > On the other hand, when you find a bug and need reproducibility, it's too
> > late if you have used a random seed without recording it. And could it break
> > some existing applications?
> >
> > I guess you('re) already had(having) this discussion though.
> >
> > Cheers,
> >
> > --
> > Romain