caml-list - the Caml user's mailing list
* Memory statistics tool
@ 2008-07-23 10:54 Dr. Thomas Fischbacher
  2008-07-23 11:47 ` [Caml-list] " Daniel Bünzli
  2008-07-23 12:44 ` dmitry grebeniuk
  0 siblings, 2 replies; 10+ messages in thread
From: Dr. Thomas Fischbacher @ 2008-07-23 10:54 UTC (permalink / raw)
  To: Caml-list List


Dear OCaml folks,

when building large applications that work on complicated and highly
networked data, one issue that easily comes up is to get some idea
about what chunks of data eat all your memory. Now, it would be
marvellous for data structure optimization purposes if there were a
function

memory_footprint: 'a -> int64 (or maybe float),

which takes as argument a root
(e.g. Obj.magic [|Obj.magic firstthingy; Obj.magic secondthingy;
       Obj.magic thirdthingy|])
and tells me how many cells are occupied by those ML data structures
reachable from that root. Basically, this would correspond to using
the GC's traversal mechanism and doing some internal statistics at the
same time. My guess would be that the Marshal module "almost" has such
a function already, to determine the amount of memory required to hold
a string-serialized value. But as these values get compacted, the length
of the string does not correspond to the number of words occupied by the
in-memory data.

Is there already something like that? Has anyone already built such
a tool?
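
Editorial note for readers on a newer OCaml: to my knowledge the standard library later gained Obj.reachable_words (documented as available from 4.04), which reports the number of heap words reachable from a value, headers included. The sketch below compares that figure with the length of the marshalled string for one small value, to illustrate the point above that the serialized size is not the in-memory size. The sample value and all names here are illustrative assumptions, not anything from the original post.

```ocaml
(* Assumptions: OCaml >= 4.06 (List.init) and Obj.reachable_words (>= 4.04). *)

(* Built at run time so that the list lives on the OCaml heap. *)
let sample : int list = List.init 100 (fun i -> i)

(* Heap words reachable from [sample], block headers included. *)
let words_in_memory : int = Obj.reachable_words (Obj.repr sample)

(* The same figure expressed in bytes for the current platform. *)
let bytes_in_memory : int = words_in_memory * (Sys.word_size / 8)

(* Size in bytes of the marshalled (compacted) representation. *)
let bytes_marshalled : int = String.length (Marshal.to_string sample [])

let () =
  Printf.printf "in memory: %d words (%d bytes); marshalled: %d bytes\n"
    words_in_memory bytes_in_memory bytes_marshalled
```

Each cons cell takes three words (one header plus two fields), whereas the marshalled form stores small integers and block headers in a byte or two each, so the two sizes differ considerably.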

-- 
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk





* Re: [Caml-list] Memory statistics tool
  2008-07-23 10:54 Memory statistics tool Dr. Thomas Fischbacher
@ 2008-07-23 11:47 ` Daniel Bünzli
  2008-07-23 12:40   ` Jan Kybic
  2008-07-23 12:44 ` dmitry grebeniuk
  1 sibling, 1 reply; 10+ messages in thread
From: Daniel Bünzli @ 2008-07-23 11:47 UTC (permalink / raw)
  To: Caml-list List


On 23 Jul 2008, at 12:54, Dr. Thomas Fischbacher wrote:

> Is there already something like that? Has anyone already built such
> a tool?

Also had this wish the other day, I found objsize [1] but didn't use  
it -- did a rough approximation by traversing the datastructure. A  
generic implementation using only the Obj module and a lookup table to  
track visited nodes would be nice but I forgot too much about all the  
cases in the representation of caml values to implement it quickly and  
correctly.

Daniel

[1] http://caml.inria.fr/cgi-bin/hump.fr.cgi?contrib=614
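
Editorial note: the generic traversal described in this message (only the Obj module plus a lookup table of visited nodes) might look roughly like the sketch below. It is not the objsize or Size implementation; the names Visited and footprint_words are made up for illustration. It assumes that skipping the contents of closures and of blocks tagged no_scan_tag or above is acceptable, so anything reachable only through a closure is not counted, and it does not attempt to handle every case of the value representation.

```ocaml
(* Visited set keyed on physical identity; hashing is structural, so the
   table stays valid even if the GC moves blocks during the traversal. *)
module Visited = Hashtbl.Make (struct
  type t = Obj.t
  let hash = Hashtbl.hash
  let equal = ( == )
end)

(* Count the words (headers included) of all blocks reachable from [root].
   An explicit stack is used so long lists do not exhaust the call stack. *)
let footprint_words (root : 'a) : int =
  let seen = Visited.create 64 in
  let todo = Stack.create () in
  let total = ref 0 in
  Stack.push (Obj.repr root) todo;
  while not (Stack.is_empty todo) do
    let v = Stack.pop todo in
    (* Immediate values (ints, constant constructors) occupy no block. *)
    if Obj.is_block v && not (Visited.mem seen v) then begin
      Visited.add seen v ();
      let tag = Obj.tag v in
      total := !total + Obj.size v + 1;
      (* Only scan fields known to hold OCaml values: strings, floats and
         custom blocks are opaque, and closures contain code pointers. *)
      if tag < Obj.no_scan_tag
         && tag <> Obj.closure_tag
         && tag <> Obj.infix_tag
      then
        for i = 0 to Obj.size v - 1 do
          Stack.push (Obj.field v i) todo
        done
    end
  done;
  !total
```

Shared substructures are counted once: for a three-element list l, footprint_words (l, l) adds only the three words of the pair to the nine words of the list.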


* Re: [Caml-list] Memory statistics tool
  2008-07-23 11:47 ` [Caml-list] " Daniel Bünzli
@ 2008-07-23 12:40   ` Jan Kybic
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Kybic @ 2008-07-23 12:40 UTC (permalink / raw)
  To: Caml-list List

>> Is there already something like that? Has anyone already built such
>> a tool?
>
> Also had this wish the other day, I found objsize [1] but didn't use
> it -- did a rough approximation by traversing the datastructure. A

I have been using Size by Jean-Christophe Filliatre. It worked fine
for me.

http://www.lri.fr/~filliatr/ftp/ocaml/ds/

Jan

-- 
-------------------------------------------------------------------------
Jan Kybic <kybic@fel.cvut.cz>                       tel. +420 2 2435 5721
http://cmp.felk.cvut.cz/~kybic                      ICQ 200569450



* Re: [Caml-list] Memory statistics tool
  2008-07-23 10:54 Memory statistics tool Dr. Thomas Fischbacher
  2008-07-23 11:47 ` [Caml-list] " Daniel Bünzli
@ 2008-07-23 12:44 ` dmitry grebeniuk
  2008-07-23 13:09   ` Dr. Thomas Fischbacher
  1 sibling, 1 reply; 10+ messages in thread
From: dmitry grebeniuk @ 2008-07-23 12:44 UTC (permalink / raw)
  To: caml-list

Hello.

DTF> memory_footprint: 'a -> int64 (or maybe float),

  objsize, now hosted on OCaml forge:
http://forge.ocamlcore.org/projects/objsize/

-- 
WBR,
 dmitry                          mailto:gds-mlsts@moldavcable.com



* Re: [Caml-list] Memory statistics tool
  2008-07-23 12:44 ` dmitry grebeniuk
@ 2008-07-23 13:09   ` Dr. Thomas Fischbacher
  2008-07-23 13:16     ` Alain Frisch
  0 siblings, 1 reply; 10+ messages in thread
From: Dr. Thomas Fischbacher @ 2008-07-23 13:09 UTC (permalink / raw)
  To: dmitry grebeniuk; +Cc: caml-list


dmitry grebeniuk wrote:

> DTF> memory_footprint: 'a -> int64 (or maybe float),
> 
>   objsize, now hosted on OCaml forge:
> http://forge.ocamlcore.org/projects/objsize/

Many thanks! I just had a glance at it, but it seems to be just how one
would have to approach such a problem. (The issue with hash-based
approaches to find previously visited substructures is that during
traversal, a GC may occur. Now I just assume that this may involve
relocation and heap compaction in OCaml. The problem then is that
OCaml does not properly support what would be known as eq hash tables
in Lisp.)

-- 
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk




* Re: [Caml-list] Memory statistics tool
  2008-07-23 13:09   ` Dr. Thomas Fischbacher
@ 2008-07-23 13:16     ` Alain Frisch
  2008-07-24 12:48       ` Dr. Thomas Fischbacher
  0 siblings, 1 reply; 10+ messages in thread
From: Alain Frisch @ 2008-07-23 13:16 UTC (permalink / raw)
  To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list

> Many thanks! I just had a glance at it, but it seems to be just how one
> would have to approach such a problem. (The issue with hash-based
> approaches to find previously visited substructures is that during
> traversal, a GC may occur. Now I just assume that this may involve
> relocation and heap compaction in OCaml. The problem then is that
> OCaml does not properly support what would be known as eq hash tables
> in Lisp.)

As long as the data structure supports the polymorphic hash function, it 
should work to simply use a regular hash table with the polymorphic hash 
function and physical equality, as in:

module S = Hashtbl.Make(struct
   type t = Obj.t
   let hash = Hashtbl.hash
   let equal = (==)
end);;


(Of course, this might be quite slow.)
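
Editorial note: a small usage sketch of the table defined above, under the assumption that keys are not mutated while they sit in the table (Hashtbl.hash depends on a key's contents). The values a and b and the other top-level names are illustrative only. Two structurally equal but physically distinct keys get the same hash, hence the same bucket, yet only the one that was inserted is found; long bucket chains of look-alike values are one reason this can be slow.

```ocaml
module S = Hashtbl.Make (struct
  type t = Obj.t
  let hash = Hashtbl.hash   (* structural: independent of the address *)
  let equal = ( == )        (* physical: identity of the block *)
end)

let tbl : string S.t = S.create 16

(* Two distinct blocks with identical contents. *)
let a = ref 0
let b = ref 0

let () = S.add tbl (Obj.repr a) "a"

(* [a] was inserted, [b] was not, although both hash to the same value. *)
let found_a : bool = S.mem tbl (Obj.repr a)
let found_b : bool = S.mem tbl (Obj.repr b)
let same_hash : bool = Hashtbl.hash a = Hashtbl.hash b
```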

-- Alain



* Re: [Caml-list] Memory statistics tool
  2008-07-23 13:16     ` Alain Frisch
@ 2008-07-24 12:48       ` Dr. Thomas Fischbacher
  2008-07-24 15:14         ` Alain Frisch
  0 siblings, 1 reply; 10+ messages in thread
From: Dr. Thomas Fischbacher @ 2008-07-24 12:48 UTC (permalink / raw)
  To: Alain Frisch; +Cc: dmitry grebeniuk, caml-list

Alain Frisch wrote:

>>Many thanks! I just had a glance at it, but it seems to be just how one
>>would have to approach such a problem. (The issue with hash-based
>>approaches to find previously visited substructures is that during
>>traversal, a GC may occur. Now I just assume that this may involve
>>relocation and heap compaction in OCaml. The problem then is that
>>OCaml does not properly support what would be known as eq hash tables
>>in Lisp.)
> 
> 
> As long as the data structure supports the polymorphic hash function, it
> should work to simply use a regular hash table with the polymorphic hash
> function and physical equality, as in:
> 
> module S = Hashtbl.Make(struct
>    type t = Obj.t
>    let hash = Hashtbl.hash
>    let equal = (==)
> end);;

Why? (I.e. I'm not convinced yet.)

-- 
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk





* Re: [Caml-list] Memory statistics tool
  2008-07-24 12:48       ` Dr. Thomas Fischbacher
@ 2008-07-24 15:14         ` Alain Frisch
  2008-07-24 15:44           ` Dr. Thomas Fischbacher
  0 siblings, 1 reply; 10+ messages in thread
From: Alain Frisch @ 2008-07-24 15:14 UTC (permalink / raw)
  To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list

Dr. Thomas Fischbacher wrote:
> Alain Frisch wrote:
>> As long as the data structure supports the polymorphic hash function, it
>> should work to simply use a regular hash table with the polymorphic hash
>> function and physical equality, as in:
>>
>> module S = Hashtbl.Make(struct
>>    type t = Obj.t
>>    let hash = Hashtbl.hash
>>    let equal = (==)
>> end);;
> 
> Why? (I.e. I'm not convinced yet.)

The two functions (hash and equal) are invariant w.r.t. changes of 
physical memory location of their arguments.

-- Alain



* Re: [Caml-list] Memory statistics tool
  2008-07-24 15:14         ` Alain Frisch
@ 2008-07-24 15:44           ` Dr. Thomas Fischbacher
  2008-07-24 16:12             ` Alain Frisch
  0 siblings, 1 reply; 10+ messages in thread
From: Dr. Thomas Fischbacher @ 2008-07-24 15:44 UTC (permalink / raw)
  To: Alain Frisch; +Cc: dmitry grebeniuk, caml-list

Alain Frisch wrote:

>>>As long as the data structure supports the polymorphic hash function, it
>>>should work to simply use a regular hash table with the polymorphic hash
>>>function and physical equality, as in:
>>>
>>>module S = Hashtbl.Make(struct
>>>   type t = Obj.t
>>>   let hash = Hashtbl.hash
>>>   let equal = (==)
>>>end);;
>>
>>Why? (I.e. I'm not convinced yet.)
> 
> 
> The two functions (hash and equal) are invariant w.r.t. changes of
> physical memory location of their arguments.

The OCaml manual gives no guarantee that Hashtbl.hash does not cons, so
I cannot assume this. Now, without that guarantee, there is a nasty race
condition in which the determination of the hash bucket causes objects
to move in memory. But still, we are safe, as we are just testing for
equality, and the hash bucket does not depend on the memory address,
but on the substructure of the hashed entity.

So, ok, you convinced me.


Anyway, it works now -- thanks to Dmitry's code, I can now do
things like...:

tf@alpha:~/ocaml$ nsim_i

In [1]: ocaml.memory_footprint(ocaml.make_element("E",[3],3,1))
Out[1]: (154.0, 49.0, 5.0)

In [2]:

...and use the interactive Python toplevel of our micromagnetic
simulator "nmag" to find out how much memory is used by the OCaml
data structures under the hood. Excellent. Thanks, Dmitry!

-- 
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk




* Re: [Caml-list] Memory statistics tool
  2008-07-24 15:44           ` Dr. Thomas Fischbacher
@ 2008-07-24 16:12             ` Alain Frisch
  0 siblings, 0 replies; 10+ messages in thread
From: Alain Frisch @ 2008-07-24 16:12 UTC (permalink / raw)
  To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list

Dr. Thomas Fischbacher wrote:
> The OCaml manual gives no guarantee that Hashtbl.hash does not cons, so
> I cannot assume this.

Indeed, Hashtbl.hash can cons, but this does not contradict my point: 
its result does not depend on the physical location of objects in memory
(if it did, it would be impossible to use this function at all).

> Now, without that guarantee, there is a nasty race
> condition in which the determination of the hash bucket causes objects
> to move in memory.

Yes, objects can move in memory, but what is wrong with that? Their new 
hash value will remain the same.
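
Editorial note: the argument can be checked with a small experiment, sketched below. It inserts a key, forces a compaction with Gc.compact, and then looks the key up again. Whether the block actually moves depends on the runtime version and heap state, so this illustrates the invariance rather than proving it; the function name survives_compaction is an assumption made for this example.

```ocaml
module S = Hashtbl.Make (struct
  type t = Obj.t
  let hash = Hashtbl.hash
  let equal = ( == )
end)

(* Returns true if a key inserted before a compaction has the same hash
   afterwards and is still found in the table. *)
let survives_compaction () : bool =
  let tbl = S.create 16 in
  (* Allocated at run time, so the block is eligible to be moved. *)
  let key = Array.init 1000 string_of_int in
  let h_before = Hashtbl.hash key in
  S.add tbl (Obj.repr key) ();
  Gc.compact ();
  let h_after = Hashtbl.hash key in
  h_before = h_after && S.mem tbl (Obj.repr key)
```

The lookup succeeds because the bucket is chosen from the key's contents, and the GC updates the references held inside the table when it relocates a block, so physical equality still holds after the move.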

-- Alain

