caml-list - the Caml user's mailing list
* generic data type -> int function
@ 2005-03-24 16:38 Hal Daume III
  2005-03-25 11:37 ` [Caml-list] " Jean-Christophe Filliatre
  2005-03-25 19:15 ` Kim Nguyen
  0 siblings, 2 replies; 11+ messages in thread
From: Hal Daume III @ 2005-03-24 16:38 UTC (permalink / raw)
  To: Caml Mailing List

Hi all --

Is there a straightforward way (or a built-in function, or...) to 
automatically map an enumerated data type to integers (and back, if 
possible, but that's not strictly necessary)?  In particular, I need 
something like:


type a = A | B | C | D
type b = E | F | G 
type c = H of a | I of b

and I would like to automatically have three functions:

let a_to_int = function
  A -> 0 | B -> 1 | C -> 2 | D -> 3

let b_to_int = function
  E -> 0 | F -> 1 | G -> 2

let c_to_int = function
   H a -> a_to_int a
 | I b -> 4 + b_to_int b

Obviously in this case it's simple, but if I now add a "D2" to 'a', 
everything gets screwed up.

Is there a built-in magic function that will do this, or some other way to 
do it (with camlp4 or something)?
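
To make the fragility concrete, here is a minimal sketch in which the offset
used by c_to_int is derived from a named constant instead of a literal. The
constants a_card and b_card are hypothetical names introduced only for this
illustration, and they still have to be kept in sync with the types by hand.

```ocaml
type a = A | B | C | D
type b = E | F | G
type c = H of a | I of b

(* Hypothetical constants: the number of constructors in each type.
   They must be updated by hand whenever a type changes. *)
let a_card = 4
let b_card = 3

let a_to_int = function A -> 0 | B -> 1 | C -> 2 | D -> 3
let b_to_int = function E -> 0 | F -> 1 | G -> 2

(* The offset for I comes from a_card rather than a hard-coded number. *)
let c_to_int = function
  | H a -> a_to_int a
  | I b -> a_card + b_to_int b

(* Size of the combined range: codes run from 0 to c_card - 1. *)
let c_card = a_card + b_card
```

Adding "D2" to 'a' then means touching a_to_int and a_card only; c_to_int
follows automatically.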

Thanks in advance -- 

 - Hal


-- 
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume



* Re: [Caml-list] generic data type -> int function
  2005-03-24 16:38 generic data type -> int function Hal Daume III
@ 2005-03-25 11:37 ` Jean-Christophe Filliatre
  2005-03-30  3:26   ` Hal Daume III
  2005-03-25 19:15 ` Kim Nguyen
  1 sibling, 1 reply; 11+ messages in thread
From: Jean-Christophe Filliatre @ 2005-03-25 11:37 UTC (permalink / raw)
  To: Hal Daume III; +Cc: Caml Mailing List


Hi,

 > Is there a straightforward way (or a built in function, or...) to 
 > automatically map an enumerated data type to integers (and back, if 
 > possible, but that's not strictly necessary).  

I don't think there is such a built-in function.  But using Obj.magic to
convert  constant  constructors  to  integers is  safe  (the  constant
constructors of a type are represented by integers starting from 0):

======================================================================
# type t = A|B|C|D;;
type t = A | B | C | D
# (Obj.magic A : int);;
- : int = 0
# (Obj.magic D : int);;
- : int = 3
======================================================================

Going back the other way obviously requires a dynamic check (the integer
needs to be within the right bounds).
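
A minimal sketch of that dynamic check, assuming a type made only of constant
constructors. The names cardinal, to_int and of_int are made up for the
example, and cardinal has to be maintained by hand.

```ocaml
type t = A | B | C | D

(* Hypothetical constant: number of constant constructors in t. *)
let cardinal = 4

(* Relies on constant constructors being represented as 0, 1, 2, ... *)
let to_int (x : t) : int = Obj.magic x

(* The bounds check is what keeps the reverse direction from producing
   an ill-formed value of type t. *)
let of_int (n : int) : t option =
  if n >= 0 && n < cardinal then Some (Obj.magic n : t) else None
```

Without the check, of_int 7 would yield a value that no pattern match on t
expects.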

Note that I  do not encourage the use of Obj.magic.  I even think that
writing your own function to  convert constructors to integers will be
equally fast (since pattern-matching is compiled using a constant time
lookup table in this case); and you can macro-generate such functions.
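
For comparison, a sketch of the hand-written version without Obj.magic. The
function names are illustrative only.

```ocaml
type t = A | B | C | D

(* Exhaustive match: if a constructor is added to t, the compiler's
   non-exhaustive-match warning points at this function. *)
let int_of_t = function A -> 0 | B -> 1 | C -> 2 | D -> 3

(* The reverse direction returns None for out-of-range integers. *)
let t_of_int = function
  | 0 -> Some A
  | 1 -> Some B
  | 2 -> Some C
  | 3 -> Some D
  | _ -> None
```

The reverse function gets no such warning when the type grows, so it is the
half that is easiest to forget.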

Hope this helps,
-- 
Jean-Christophe



* Re: [Caml-list] generic data type -> int function
  2005-03-24 16:38 generic data type -> int function Hal Daume III
  2005-03-25 11:37 ` [Caml-list] " Jean-Christophe Filliatre
@ 2005-03-25 19:15 ` Kim Nguyen
  2005-03-29  7:29   ` Oliver Bandel
  1 sibling, 1 reply; 11+ messages in thread
From: Kim Nguyen @ 2005-03-25 19:15 UTC (permalink / raw)
  To: Hal Daume III; +Cc: Caml Mailing List

On Thursday 24 March 2005 at 08:38 -0800, Hal Daume III wrote:
> Hi all --
> 
> Is there a straightforward way (or a built in function, or...) to 
> automatically map an enumerated data type to integers (and back, if 
> possible, but that's not strictly necessary).  In particular, I need 
> something like:

Hi,
	you can use the polymorphic function Hashtbl.hash_param,
	which happens to map constructors to their internal tag.
	You should be aware that this is only a (cool) side effect of
	the current implementation and could change in the future.
	This is a bit of a hack, but it saves you from using Obj.magic
	or automatically generating pattern matching (which you would
	have to regenerate every time you change the type).


# type t = A | B | C | D of int | E of string | G;;
type t = A | B | C | D of int | E of string | G

# let to_int x = Hashtbl.hash_param 1 1 x;;
val to_int : 'a -> int = <fun>

# to_int A;; (* first constant constructor *)
- : int = 0

# to_int B;;
- : int = 1

# to_int C;;
- : int = 2

# to_int (D(42));; (* first non-constant constructor *)
- : int = 0

# to_int (E("foo"));;
- : int = 1

# to_int G;;
- : int = 3	
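
The two separate numberings in the session above (A, B, C, G versus D, E) can
be made explicit with the Obj module. This is only a sketch that relies on
the runtime representation of variants; the type index and the function
index_of are hypothetical names introduced for the illustration.

```ocaml
type t = A | B | C | D of int | E of string | G

(* Hypothetical helper type: which numbering a value belongs to. *)
type index = Constant of int | Block of int

let index_of (x : t) : index =
  let r = Obj.repr x in
  if Obj.is_int r then
    (* Constant constructors are immediate integers, counted from 0. *)
    Constant (Obj.obj r : int)
  else
    (* Constructors with arguments are heap blocks with their own tags,
       also counted from 0. *)
    Block (Obj.tag r)
```

Here A and D 42 both carry the number 0, but in different numberings, which
is why a single integer cannot tell them apart.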

Regards,
-- 
Kim



* Re: [Caml-list] generic data type -> int function
  2005-03-25 19:15 ` Kim Nguyen
@ 2005-03-29  7:29   ` Oliver Bandel
  0 siblings, 0 replies; 11+ messages in thread
From: Oliver Bandel @ 2005-03-29  7:29 UTC (permalink / raw)
  To: caml-list

On Fri, Mar 25, 2005 at 08:15:59PM +0100, Kim Nguyen wrote:
> On Thursday 24 March 2005 at 08:38 -0800, Hal Daume III wrote:
> > Hi all --
> > 
> > Is there a straightforward way (or a built in function, or...) to 
> > automatically map an enumerated data type to integers (and back, if 
> > possible, but that's not strictly necessary).  In particular, I need 
> > something like:
> 
> Hi,
> 	you can use the polymorphic function : Hashtbl.hash_param
> 	which happens to map constructors to their internal tag.
> 	You should be aware that this is only a (cool) side-effect of 	
> 	the current implementation and could change in the future.
> 	This is a bit of a hack but prevents you from using Obj.magic
> 	or automatically generating pattern matching  (which you should
> 	regenerate every time you change the type).
> 
> 
> # type t = A | B | C | D of int | E of string | G;;
> type t = A | B | C | D of int | E of string | G
> 
> # let to_int x = Hashtbl.hash_param 1 1 x;;
> val to_int : 'a -> int = <fun>

Well, I'm not clear on what this function computes.
I have never used Hashtbl.hash_param and I do not understand its
description... but:

> 
> # to_int A;; (* first constant constructor *)
> - : int = 0
> 
> # to_int B;;
> - : int = 1
> 
> # to_int C;;
> - : int = 2
> 
> # to_int (D(42));; (* first non-constant constructor *)
> - : int = 0
> 
> # to_int (E("foo"));;
> - : int = 1
> 
> # to_int G;;
> - : int = 3	


It seems to me that your solution does not produce the result that was
asked for.

The values 0 and 1 in your example above are each returned more than
once by the function to_int.

But as I understand it, no integer should ever be output more than
once!

IMHO what is wanted is a function that creates something like a
coding table, as sociologists often do when coding their questions/answers
into integers (because of a lack of good software that does this automatically;
maybe today the software is better, but it seems many sociologists
still build such integer-coding tables out of habit ;-)).

But when the same integer is output more than once,
you run into big trouble...


So, "to_int A;;" and "to_int (D(42));;" must not have the same output value!!!
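
A sketch of what a collision-free numbering of the constructors of this type
could look like, written by hand. The name constructor_id is made up; it
numbers constructors only and ignores their arguments.

```ocaml
type t = A | B | C | D of int | E of string | G

(* One distinct integer per constructor, arguments ignored:
   constant constructors first, then the ones carrying data. *)
let constructor_id = function
  | A -> 0
  | B -> 1
  | C -> 2
  | G -> 3
  | D _ -> 4
  | E _ -> 5
```

Distinguishing D 42 from D 43 as well would need the argument folded into
the result, which is only possible when the argument type is itself finite.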

Ciao,
   Oliver



* Re: [Caml-list] generic data type -> int function
  2005-03-25 11:37 ` [Caml-list] " Jean-Christophe Filliatre
@ 2005-03-30  3:26   ` Hal Daume III
  2005-03-30 22:27     ` Oliver Bandel
  2005-03-30 22:29     ` Oliver Bandel
  0 siblings, 2 replies; 11+ messages in thread
From: Hal Daume III @ 2005-03-30  3:26 UTC (permalink / raw)
  To: Caml Mailing List

Unsatisfied with any of the solutions offered to me, I threw together a 
quick perl script to do this for me.  For anyone who wants it, you can get 
it at:

  http://www.isi.edu/~hdaume/type_to_enum.pl

It's very limited in that it has no knowledge of built-in types, and type 
specs must all be on one line per type, but for my purposes it works 
keenly.

Example input:

type etype = GPE | LOC | ORG | PER | NAE_e | BOS_e
type mtype = BAR | NAM | NOM | PRE | PRO | OTHER | NAE_m | BOS_m
type pairs = EM of etype*mtype | EE of etype*etype | MM of mtype*mtype
type pairs2 = EP of etype * pairs | MP of mtype * pairs


Corresponding output:

let int_of_etype = function | GPE -> 0 | LOC -> 1 | ORG -> 2 | PER -> 3 | NAE_e -> 4 | BOS_e -> 5
let int_of_mtype = function | BAR -> 0 | NAM -> 1 | NOM -> 2 | PRE -> 3 | PRO -> 4 | OTHER -> 5 | NAE_m -> 6 | BOS_m -> 7
let int_of_pairs = function | EM (etype_0, mtype_1) -> 0 + 1 * (int_of_etype etype_0 + 6 * (int_of_mtype mtype_1)) | EE (etype_0, etype_1) -> 48 + 1 * (int_of_etype etype_0 + 6 * (int_of_etype etype_1)) | MM (mtype_0, mtype_1) -> 84 + 1 * (int_of_mtype mtype_0 + 8 * (int_of_mtype mtype_1))
let int_of_pairs2 = function | EP (etype_0, pairs_1) -> 0 + 1 * (int_of_etype etype_0 + 6 * (int_of_pairs pairs_1)) | MP (mtype_0, pairs_1) -> 888 + 1 * (int_of_mtype mtype_0 + 8 * (int_of_pairs pairs_1))


I've stress tested it a bit and it seems to be all in working order.
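
The literals in the generated code follow a mixed-radix scheme: 48 = 6 * 8 is
the number of EM values, 84 = 48 + 6 * 6 adds the EE values, and the whole
type has 84 + 8 * 8 = 148 values. Below is a sketch of the same encoding with
the sizes named, plus an exhaustive check that every index in 0..147 is hit
exactly once. The names card_etype, card_mtype, card_pairs, product and
is_bijection are not part of the script's output; they are introduced here
for illustration.

```ocaml
type etype = GPE | LOC | ORG | PER | NAE_e | BOS_e
type mtype = BAR | NAM | NOM | PRE | PRO | OTHER | NAE_m | BOS_m
type pairs = EM of etype * mtype | EE of etype * etype | MM of mtype * mtype

let int_of_etype = function
  | GPE -> 0 | LOC -> 1 | ORG -> 2 | PER -> 3 | NAE_e -> 4 | BOS_e -> 5
let int_of_mtype = function
  | BAR -> 0 | NAM -> 1 | NOM -> 2 | PRE -> 3
  | PRO -> 4 | OTHER -> 5 | NAE_m -> 6 | BOS_m -> 7

(* Hypothetical named sizes replacing the literals 6, 8, 48 and 84. *)
let card_etype = 6
let card_mtype = 8
let off_ee = card_etype * card_mtype            (* 48 *)
let off_mm = off_ee + card_etype * card_etype   (* 84 *)
let card_pairs = off_mm + card_mtype * card_mtype

let int_of_pairs = function
  | EM (e, m) -> int_of_etype e + card_etype * int_of_mtype m
  | EE (e1, e2) -> off_ee + int_of_etype e1 + card_etype * int_of_etype e2
  | MM (m1, m2) -> off_mm + int_of_mtype m1 + card_mtype * int_of_mtype m2

(* Enumerate every value of pairs. *)
let all_etypes = [GPE; LOC; ORG; PER; NAE_e; BOS_e]
let all_mtypes = [BAR; NAM; NOM; PRE; PRO; OTHER; NAE_m; BOS_m]
let product xs ys f =
  List.concat (List.map (fun x -> List.map (fun y -> f x y) ys) xs)
let all_pairs =
  product all_etypes all_mtypes (fun e m -> EM (e, m))
  @ product all_etypes all_etypes (fun a b -> EE (a, b))
  @ product all_mtypes all_mtypes (fun a b -> MM (a, b))

(* True when the number of values equals the range size and every slot
   in the range is reached, i.e. no two values share an index. *)
let is_bijection =
  let seen = Array.make card_pairs false in
  List.iter (fun p -> seen.(int_of_pairs p) <- true) all_pairs;
  List.length all_pairs = card_pairs
  && Array.fold_left (fun acc b -> acc && b) true seen
```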

On Fri, 25 Mar 2005, Jean-Christophe Filliatre wrote:

> 
> Hi,
> 
>  > Is there a straightforward way (or a built in function, or...) to 
>  > automatically map an enumerated data type to integers (and back, if 
>  > possible, but that's not strictly necessary).  
> 
> I don't think there is such a built-in function.  But using Obj.magic to
> convert  constant  constructors  to  integers is  safe  (the  constant
> constructors of a type are represented by integers starting from 0):
> 
> ======================================================================
> # type t = A|B|C|D;;
> type t = A | B | C | D
> # (Obj.magic A : int);;
> - : int = 0
> # (Obj.magic D : int);;
> - : int = 3
> ======================================================================
> 
> Going back the other way obviously requires a dynamic check (the integer
> needs to be within the right bounds).
> 
> Note that I  do not encourage the use of Obj.magic.  I even think that
> writing your own function to  convert constructors to integers will be
> equally fast (since pattern-matching is compiled using a constant time
> lookup table in this case); and you can macro-generate such functions.
> 
> Hope this helps,
> 

-- 
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume



* Re: [Caml-list] generic data type -> int function
  2005-03-30  3:26   ` Hal Daume III
@ 2005-03-30 22:27     ` Oliver Bandel
  2005-03-31 14:33       ` Hal Daume III
  2005-03-30 22:29     ` Oliver Bandel
  1 sibling, 1 reply; 11+ messages in thread
From: Oliver Bandel @ 2005-03-30 22:27 UTC (permalink / raw)
  To: caml-list

On Tue, Mar 29, 2005 at 07:26:02PM -0800, Hal Daume III wrote:
[...]
> type etype = GPE | LOC | ORG | PER | NAE_e | BOS_e
> type mtype = BAR | NAM | NOM | PRE | PRO | OTHER | NAE_m | BOS_m
> type pairs = EM of etype*mtype | EE of etype*etype | MM of mtype*mtype
> type pairs2 = EP of etype * pairs | MP of mtype * pairs
> 
> 
> Corresponding output:
> 
> let int_of_etype = function | GPE -> 0 | LOC -> 1 | ORG -> 2 | PER -> 3 | NAE_e -> 4 | BOS_e -> 5
> let int_of_mtype = function | BAR -> 0 | NAM -> 1 | NOM -> 2 | PRE -> 3 | PRO -> 4 | OTHER -> 5 | NAE_m -> 6 | BOS_m -> 7
> let int_of_pairs = function | EM (etype_0, mtype_1) -> 0 + 1 * (int_of_etype etype_0 + 6 * (int_of_mtype mtype_1)) | EE (etype_0, etype_1) -> 48 + 1 * (int_of_etype etype_0 + 6 * (int_of_etype etype_1)) | MM (mtype_0, mtype_1) -> 84 + 1 * (int_of_mtype mtype_0 + 8 * (int_of_mtype mtype_1))
> let int_of_pairs2 = function | EP (etype_0, pairs_1) -> 0 + 1 * (int_of_etype etype_0 + 6 * (int_of_pairs pairs_1)) | MP (mtype_0, pairs_1) -> 888 + 1 * (int_of_mtype mtype_0 + 8 * (int_of_pairs pairs_1))
[...]

That's nice, but what do you need this for?

Maybe there is no need to go this way at all.

So, what is this good for?!

Ciao,
   Oliver



* Re: [Caml-list] generic data type -> int function
  2005-03-30  3:26   ` Hal Daume III
  2005-03-30 22:27     ` Oliver Bandel
@ 2005-03-30 22:29     ` Oliver Bandel
  2005-03-31 14:33       ` Hal Daume III
  1 sibling, 1 reply; 11+ messages in thread
From: Oliver Bandel @ 2005-03-30 22:29 UTC (permalink / raw)
  To: caml-list

On Tue, Mar 29, 2005 at 07:26:02PM -0800, Hal Daume III wrote:
> Unsatisfied with any of the solutions offered to me, I threw together a 
> quick perl script to do this for me.  For anyone who wants it, you can get 
> it at:
> 
>   http://www.isi.edu/~hdaume/type_to_enum.pl
> 
> It's very limited in that it has no knowledge of built in types, and type 
> specs must all be on one line per type, but for my purposes it works 
> keenly.
> 
> Example input:
> 
> type etype = GPE | LOC | ORG | PER | NAE_e | BOS_e
> type mtype = BAR | NAM | NOM | PRE | PRO | OTHER | NAE_m | BOS_m
> type pairs = EM of etype*mtype | EE of etype*etype | MM of mtype*mtype
> type pairs2 = EP of etype * pairs | MP of mtype * pairs
> 
> 
> Corresponding output:
> 
> let int_of_etype = function | GPE -> 0 | LOC -> 1 | ORG -> 2 | PER -> 3 | NAE_e -> 4 | BOS_e -> 5
> let int_of_mtype = function | BAR -> 0 | NAM -> 1 | NOM -> 2 | PRE -> 3 | PRO -> 4 | OTHER -> 5 | NAE_m -> 6 | BOS_m -> 7


...ooops...  GPE and BAR both have the same output value...
...so... if that kind of overlap between integers is not a problem,
why don't you use Kim's solution?!
It seemed to me that it solved your problem, at least in this respect.

Ciao,
   Oliver



* Re: [Caml-list] generic data type -> int function
  2005-03-30 22:27     ` Oliver Bandel
@ 2005-03-31 14:33       ` Hal Daume III
  2005-03-31 17:01         ` Richard Jones
  0 siblings, 1 reply; 11+ messages in thread
From: Hal Daume III @ 2005-03-31 14:33 UTC (permalink / raw)
  To: Oliver Bandel; +Cc: caml-list

> On Tue, Mar 29, 2005 at 07:26:02PM -0800, Hal Daume III wrote:
> [...]
> > type etype = GPE | LOC | ORG | PER | NAE_e | BOS_e
> > type mtype = BAR | NAM | NOM | PRE | PRO | OTHER | NAE_m | BOS_m
> > type pairs = EM of etype*mtype | EE of etype*etype | MM of mtype*mtype
> > type pairs2 = EP of etype * pairs | MP of mtype * pairs
> > 
> > 
> > Corresponding output:
> > 
> > let int_of_etype = function | GPE -> 0 | LOC -> 1 | ORG -> 2 | PER -> 3 | NAE_e -> 4 | BOS_e -> 5
> > let int_of_mtype = function | BAR -> 0 | NAM -> 1 | NOM -> 2 | PRE -> 3 | PRO -> 4 | OTHER -> 5 | NAE_m -> 6 | BOS_m -> 7
> > let int_of_pairs = function | EM (etype_0, mtype_1) -> 0 + 1 * (int_of_etype etype_0 + 6 * (int_of_mtype mtype_1)) | EE (etype_0, etype_1) -> 48 + 1 * (int_of_etype etype_0 + 6 * (int_of_etype etype_1)) | MM (mtype_0, mtype_1) -> 84 + 1 * (int_of_mtype mtype_0 + 8 * (int_of_mtype mtype_1))
> > let int_of_pairs2 = function | EP (etype_0, pairs_1) -> 0 + 1 * (int_of_etype etype_0 + 6 * (int_of_pairs pairs_1)) | MP (mtype_0, pairs_1) -> 888 + 1 * (int_of_mtype mtype_0 + 8 * (int_of_pairs pairs_1))
> [...]
> 
> That's nice, but what do you need this for?
> 
> Maybe there is no need to go this way at all.
> 
> So, what is this good for?!

I want to use complex data types as indices in arrays, essentially.
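
A minimal sketch of that use, assuming the generated int_of_etype from the
earlier message. The names card_etype, counts, bump and count are
hypothetical.

```ocaml
type etype = GPE | LOC | ORG | PER | NAE_e | BOS_e

let int_of_etype = function
  | GPE -> 0 | LOC -> 1 | ORG -> 2 | PER -> 3 | NAE_e -> 4 | BOS_e -> 5

(* Hypothetical constant: number of constructors in etype. *)
let card_etype = 6

(* One array slot per constructor. *)
let counts = Array.make card_etype 0

(* Increment the counter stored at the constructor's index. *)
let bump e =
  let i = int_of_etype e in
  counts.(i) <- counts.(i) + 1

let count e = counts.(int_of_etype e)
```

The same pattern extends to the pair types: allocate an array whose length is
the number of values of the type and index it with int_of_pairs.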

-- 
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume



* Re: [Caml-list] generic data type -> int function
  2005-03-30 22:29     ` Oliver Bandel
@ 2005-03-31 14:33       ` Hal Daume III
  0 siblings, 0 replies; 11+ messages in thread
From: Hal Daume III @ 2005-03-31 14:33 UTC (permalink / raw)
  To: Oliver Bandel; +Cc: caml-list

> > type etype = GPE | LOC | ORG | PER | NAE_e | BOS_e
> > type mtype = BAR | NAM | NOM | PRE | PRO | OTHER | NAE_m | BOS_m
> > type pairs = EM of etype*mtype | EE of etype*etype | MM of mtype*mtype
> > type pairs2 = EP of etype * pairs | MP of mtype * pairs
> > 
> > 
> > Corresponding output:
> > 
> > let int_of_etype = function | GPE -> 0 | LOC -> 1 | ORG -> 2 | PER -> 3 | NAE_e -> 4 | BOS_e -> 5
> > let int_of_mtype = function | BAR -> 0 | NAM -> 1 | NOM -> 2 | PRE -> 3 | PRO -> 4 | OTHER -> 5 | NAE_m -> 6 | BOS_m -> 7
> 
> 
> ...ooops...  GPE and BAR both have the same output value...
> ...so... if that kind of overlap between integers is not a problem,
> why don't you use Kim's solution?!
> It seemed to me that it solved your problem, at least in this respect.

Because it cannot handle the 'type pairs' or 'type pairs2' above.  The 
enumerations are unique per type.

-- 
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume



* Re: [Caml-list] generic data type -> int function
  2005-03-31 14:33       ` Hal Daume III
@ 2005-03-31 17:01         ` Richard Jones
  2005-03-31 18:04           ` Hal Daume III
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Jones @ 2005-03-31 17:01 UTC (permalink / raw)
  To: caml-list

On Thu, Mar 31, 2005 at 06:33:12AM -0800, Hal Daume III wrote:
> I want to use complex data types as indices in arrays, essentially.

How about using Hashtbl or Map with the key being the complex data
type?  Of course it won't be as fast as an integer index into an Array,
but it might be easier to understand.
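
Something along these lines, as a rough sketch. The table name scores and the
helpers set and get are made up for the example, and the float payload is
just a placeholder.

```ocaml
type etype = GPE | LOC | ORG | PER | NAE_e | BOS_e
type mtype = BAR | NAM | NOM | PRE | PRO | OTHER | NAE_m | BOS_m
type pairs = EM of etype * mtype | EE of etype * etype | MM of mtype * mtype

(* Hypothetical table keyed directly by the variant value; the
   polymorphic hash function handles the structured key. *)
let scores : (pairs, float) Hashtbl.t = Hashtbl.create 64

(* Store or overwrite the value for a key. *)
let set k v = Hashtbl.replace scores k v

(* Look a key up, returning None when it is absent. *)
let get k = Hashtbl.find_opt scores k
```

No int_of_* functions are needed, at the cost of hashing the key on every
access.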

Rich.

-- 
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com



* Re: [Caml-list] generic data type -> int function
  2005-03-31 17:01         ` Richard Jones
@ 2005-03-31 18:04           ` Hal Daume III
  0 siblings, 0 replies; 11+ messages in thread
From: Hal Daume III @ 2005-03-31 18:04 UTC (permalink / raw)
  To: caml-list

On Thu, 31 Mar 2005, Richard Jones wrote:

> On Thu, Mar 31, 2005 at 06:33:12AM -0800, Hal Daume III wrote:
> > I want to use complex data types as indices in arrays, essentially.
> 
> How about using Hashtbl or Map with the key being the complex data
> type?  Of course it won't be as fast as an integer index into an Array,
> but it might be easier to understand.

Way way way too slow.  As it is, using arrays, my program takes about 5 
hours to run.  A first version using a Hashtbl took well over two days.

-- 
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume

