caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Re: [Caml-list] Question about string ref and Hashtbl.
       [not found] <20040320.075814.105434271.yoriyuki@mbg.ocn.ne.jp>
@ 2004-03-20  0:05 ` Bob Bailey
  2004-03-20  3:13   ` Remi Vanicat
  2004-03-22  8:32   ` Jean-Christophe Filliatre
  0 siblings, 2 replies; 6+ messages in thread
From: Bob Bailey @ 2004-03-20  0:05 UTC (permalink / raw)
  To: Caml List

That's an interesting idea.  SO the key string and the value string will really point to the same
location in memory.  

So if I did (string,string ref) Hashtbl.t,
then the string ref would be a pointer to the key string.
Would that allow me to compare two string ref variables together?  Would the comparason of the
locations of the strings mean I wouldn't have to do a full string compare?

so in my module
type index = string ref
type dict = (string,index) Hashtbl.t

So in the case of
let a = Hashtbl.find dict "a"
let b = Hashtbl.find dict "a"

would a==b be the same as !a == !b?  
Also, would a==b be a faster comparason in this case?

thanx
bob




--- Yamagata Yoriyuki <yoriyuki@mbg.ocn.ne.jp> wrote:
> From: Bob Bailey <bobbaileyjr@yahoo.com>
> Subject: [Caml-list] Question about string ref and Hashtbl.
> Date: Fri, 19 Mar 2004 14:03:23 -0800 (PST)
> 
> > type t = 
> >   {
> >    mutable last:int;
> >    names: (string,int) Hashtbl.t;
> >    indexes:(int,string ref) Hashtbl.t;
> >   }
> 
> If I were you, I would define the dictonary as
> 
> type t = (string, string) Hashtbl.t
> 
> and add function as
> 
> let add dict str =
>     try
> 	Hashtbl.find dict str
>     with  Not_found ->
> 	  Hashtbl.add dict str str
> 
> In addition, if some of your strings become unused during exectuion,
> I would recommend useing WeakHash.
> 
> module Dict = Weak.Make (struct 
>        type t = string 
>        let equal = (=) 
>        let hash = Hashtbl.hash
> end)
> 
> and use Dict as Hasthtbl.  In this way, if a string becomes unused,
> it is GC-ed and removed from the dict.
> 
> --
> Yamagata Yoriyuki


__________________________________
Do you Yahoo!?
Yahoo! Mail - More reliable, more storage, less spam
http://mail.yahoo.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] Question about string ref and Hashtbl.
  2004-03-20  0:05 ` [Caml-list] Question about string ref and Hashtbl Bob Bailey
@ 2004-03-20  3:13   ` Remi Vanicat
  2004-03-22  8:32   ` Jean-Christophe Filliatre
  1 sibling, 0 replies; 6+ messages in thread
From: Remi Vanicat @ 2004-03-20  3:13 UTC (permalink / raw)
  To: caml-list

Bob Bailey <bobbaileyjr@yahoo.com> writes:

> That's an interesting idea.  SO the key string and the value string
> will really point to the same location in memory.
>
> So if I did (string,string ref) Hashtbl.t, then the string ref would
> be a pointer to the key string. 

It is a strange thing that the key and value of a location of the
Hashtbl are really the same thing. 

Mmm, i believe that I've understood what you want to do, and if you
don't want to modify  the value associated with a key, then you don't
need to use the type ref. the type ref have one and only one interest:
you can modify it.

> Would that allow me to compare two
> string ref variables together?  Would the comparason of the locations
> of the strings mean I wouldn't have to do a full string compare?
>
> so in my module type index = string ref type dict = (string,index)
> Hashtbl.t
>
> So in the case of let a = Hashtbl.find dict "a" let b = Hashtbl.find
> dict "a"
>
> would a==b be the same as !a == !b?  

Well if 

let a = Hashtbl.find dict "a"

and

let b = Hashtbl.find dict "a"

and if the Hashtbl have not been modified between the two call, then
a==b. and if a==b then !a == !b. but beware :

# let st = "test";;
val st : string = "test"
# let a = ref st;;
val a : string ref = {contents = "test"}
# let b = ref st;;
val b : string ref = {contents = "test"}
# a == b;;
- : bool = false
# !a == !b;;
- : bool = true

> Also, would a==b be a faster comparason in this case?

No, == test the equality of the pointer to the value, so it take the
same time for all type. By the way :

# "test" == "test";;
- : bool = false

May be you want to use =, but with (=) then (a = b) is equal to 
(!a = !b) for all reference a and b, and the evaluation time are more
or less the same (may be (!a = !b) is faster, but I won't bet on it).
-- 
Rémi Vanicat

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] Question about string ref and Hashtbl.
  2004-03-20  0:05 ` [Caml-list] Question about string ref and Hashtbl Bob Bailey
  2004-03-20  3:13   ` Remi Vanicat
@ 2004-03-22  8:32   ` Jean-Christophe Filliatre
  1 sibling, 0 replies; 6+ messages in thread
From: Jean-Christophe Filliatre @ 2004-03-22  8:32 UTC (permalink / raw)
  To: Bob Bailey; +Cc: Caml List


Bob Bailey writes:
 > That's an interesting idea.  SO the key string and the value string will really point to the same
 > location in memory.  
 > 
 > So if I did (string,string ref) Hashtbl.t,
 > then the string ref would be a pointer to the key string.
 > Would that allow me to compare two string ref variables together?  Would the comparason of the
 > locations of the strings mean I wouldn't have to do a full string compare?

What you are trying to achieve  is called hash-consing and comes up to
Lisp a long time ago. It is indeed implemented using a hash table, but
refs are not needed:

	let (hash_cons : string -> string) = 
	  let h = Hashtbl.create 97 in
	  fun s -> try Hashtbl.find h s with Not_found -> Hashtbl.add h s s; s

A string is mapped to itself  whenever it is encountered for the first
time.

If  you achieve  full hash  consing  (i.e. any  string passes  through
function hash_cons) then you can  safely substitute == for = (they are
now equivalent).

If you need  a total order over strings  (e.g. to instantiate Set.Make
or Map.Make) you need to associate unique integer keys to strings. You
may have a look at module Hashcons available here:
http://www.lri.fr/~filliatr/software.en.html 
which already  does this, and at  the companion modules  Hset and Hmap
(sets and  map for  hash-consed values). There  is also a  short paper
describing the hash-consing technique in ocaml.

Hops this helps,
-- 
Jean-Christophe Filliâtre

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] Question about string ref and Hashtbl.
  2004-03-19 22:48 ` Karl Zilles
@ 2004-03-19 23:58   ` Bob Bailey
  0 siblings, 0 replies; 6+ messages in thread
From: Bob Bailey @ 2004-03-19 23:58 UTC (permalink / raw)
  To: Caml List

that's what I was doing before I started down this road.  I am actually concatinating the strings
to make a larger hash of string words... (representing a hierarchical name type thing) 
SO I am actually copying strings all over the place... I am also reading a new set of data from a
file that has information that I need to cross reference...

So that's why I am trying to centralize strings.  That way I can create integer indexes into a
dictionary and save memory by not carrying around large numbers of strings that are really
duplicates.

thanks though
bob

--- Karl Zilles <zilles@1969.ws> wrote:
> Bob Bailey wrote:
>  > I am trying to only store the string once in memory.  This will allow 
> me to store the
>  > index of the string in the dictionary instead of copying the string 
> all over
>  > memory.
> 
> You can just pass a string around.. it doesn't actually make a copy of 
> the data.. what you're passing is just a pointer.
> 
> -------------------
> To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
> Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


__________________________________
Do you Yahoo!?
Yahoo! Mail - More reliable, more storage, less spam
http://mail.yahoo.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] Question about string ref and Hashtbl.
  2004-03-19 22:03 Bob Bailey
@ 2004-03-19 22:48 ` Karl Zilles
  2004-03-19 23:58   ` Bob Bailey
  0 siblings, 1 reply; 6+ messages in thread
From: Karl Zilles @ 2004-03-19 22:48 UTC (permalink / raw)
  To: Bob Bailey; +Cc: Caml List

Bob Bailey wrote:
 > I am trying to only store the string once in memory.  This will allow 
me to store the
 > index of the string in the dictionary instead of copying the string 
all over
 > memory.

You can just pass a string around.. it doesn't actually make a copy of 
the data.. what you're passing is just a pointer.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Caml-list] Question about string ref and Hashtbl.
@ 2004-03-19 22:03 Bob Bailey
  2004-03-19 22:48 ` Karl Zilles
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Bailey @ 2004-03-19 22:03 UTC (permalink / raw)
  To: Caml List

Sorry for the last mistake... sent by accident.

I have created the following structure

type t = 
  {
   mutable last:int;
   names: (string,int) Hashtbl.t;
   indexes:(int,string ref) Hashtbl.t;
  }

I am trying to only store the string once in memory.  This will allow me to store the
index of the string in the dictionary instead of copying the string all over
memory.

The add function is
let add dict str =
  try
   Hashtbl.find dict.names str
  with Not_found ->
    (
      Hashtbl.add dict.names str dict.last;
      Hashtbl.add dict.indexes dict.last (ref str);
      dict.last <- dict.last +1;
      dict.last - 1
    )

So with this code can I expect the correct behavior.

BTW, I have used ocamldebug to try to examine the addresses of the strings to see if the reference
points to the same string in the hashtbl key.
Also, I know that I can specialize the t.indexes with an IntHashtbl.  I have actually done that
but to simplify the example I have left it out.

Any advice would be appreciated.
bob


__________________________________
Do you Yahoo!?
Yahoo! Mail - More reliable, more storage, less spam
http://mail.yahoo.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2004-03-22  8:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20040320.075814.105434271.yoriyuki@mbg.ocn.ne.jp>
2004-03-20  0:05 ` [Caml-list] Question about string ref and Hashtbl Bob Bailey
2004-03-20  3:13   ` Remi Vanicat
2004-03-22  8:32   ` Jean-Christophe Filliatre
2004-03-19 22:03 Bob Bailey
2004-03-19 22:48 ` Karl Zilles
2004-03-19 23:58   ` Bob Bailey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).