caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Hashtbl and shrinking
@ 2018-03-29 20:49 Armaël Guéneau
  2018-03-30  9:40 ` Christoph Bauer
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Armaël Guéneau @ 2018-03-29 20:49 UTC (permalink / raw)
  To: caml-list; +Cc: Arthur Charguéraud

Dear caml-list,

It happens Arthur Charguéraud and myself were looking this afternoon at
the implementation of the Hashtbl module, provided in our beloved OCaml
standard library.

We noticed there is no shrinking implemented, that would typically
happen when the number of elements goes under a certain threshold
compared to the size of the underlying array.

This yields a space complexity that we think may not be what people
might expect: an almost empty hashtable will consume as much memory as
when it was full. The time complexity might also be worse than expected,
because the GC will still spend time scanning the whole array even when
there are only a few elements.

What do people here think about this?
If we submitted a patch implementing shrinking for Hashtbl, could it be
detrimental for some specific workloads?

— Armaël

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] Hashtbl and shrinking
  2018-03-29 20:49 [Caml-list] Hashtbl and shrinking Armaël Guéneau
@ 2018-03-30  9:40 ` Christoph Bauer
  2018-03-30 19:10 ` Cedric Cellier
  2018-04-02  0:34 ` Francois BERENGER
  2 siblings, 0 replies; 4+ messages in thread
From: Christoph Bauer @ 2018-03-30  9:40 UTC (permalink / raw)
  To: caml-list

On 29.03.2018 22:49, Armaël Guéneau wrote:
> We noticed there is no shrinking implemented, that would typically
> happen when the number of elements goes under a certain threshold
> compared to the size of the underlying array.

Hi Armaël,
 
FYI there was a report about this on mantis: 
https://caml.inria.fr/mantis/view.php?id=5555

Kind regards
Christoph Bauer

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] Hashtbl and shrinking
  2018-03-29 20:49 [Caml-list] Hashtbl and shrinking Armaël Guéneau
  2018-03-30  9:40 ` Christoph Bauer
@ 2018-03-30 19:10 ` Cedric Cellier
  2018-04-02  0:34 ` Francois BERENGER
  2 siblings, 0 replies; 4+ messages in thread
From: Cedric Cellier @ 2018-03-30 19:10 UTC (permalink / raw)
  To: caml-list

For the record, the batteries implementation of Hashtbl had resizing,
until 1073b283f removed it for reasons not detailed in the commit
message:

https://github.com/ocaml-batteries-team/batteries-included/commit/1073b283f#diff-e7f93b9409d8530370753934c701a692


-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] Hashtbl and shrinking
  2018-03-29 20:49 [Caml-list] Hashtbl and shrinking Armaël Guéneau
  2018-03-30  9:40 ` Christoph Bauer
  2018-03-30 19:10 ` Cedric Cellier
@ 2018-04-02  0:34 ` Francois BERENGER
  2 siblings, 0 replies; 4+ messages in thread
From: Francois BERENGER @ 2018-04-02  0:34 UTC (permalink / raw)
  To: caml-list

On 03/30/2018 05:49 AM, Armaël Guéneau wrote:
> Dear caml-list,
> 
> It happens Arthur Charguéraud and myself were looking this afternoon at
> the implementation of the Hashtbl module, provided in our beloved OCaml
> standard library.
> 
> We noticed there is no shrinking implemented, that would typically
> happen when the number of elements goes under a certain threshold
> compared to the size of the underlying array.
> 
> This yields a space complexity that we think may not be what people
> might expect: an almost empty hashtable will consume as much memory as
> when it was full. The time complexity might also be worse than expected,
> because the GC will still spend time scanning the whole array even when
> there are only a few elements.
> 
> What do people here think about this?

That your change would introduce some time complexity that may not be
what people expect for some operations.

That being said, if a resize operations was added to the module, appart
from changing the current interface, it would not harm people much I guess.

We could accept this in batteries, or even revive the old code that was
doing this previously.

Also, if you are so much concerned about the size of your hashtbl, why
don't you use a map?

Regards,
Francois.

> If we submitted a patch implementing shrinking for Hashtbl, could it be
> detrimental for some specific workloads?
> 
> — Armaël
> 

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-04-02  0:34 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-29 20:49 [Caml-list] Hashtbl and shrinking Armaël Guéneau
2018-03-30  9:40 ` Christoph Bauer
2018-03-30 19:10 ` Cedric Cellier
2018-04-02  0:34 ` Francois BERENGER

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).