caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Efficency of varient types
@ 2005-11-25 23:53 Michael D. Adams
  2005-11-26  0:31 ` [Caml-list] " Nicolas Cannasse
                   ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Michael D. Adams @ 2005-11-25 23:53 UTC (permalink / raw)
  To: caml-list

I have recently learned about OCaml and have been impressed by how
fast it is in the benchmarks.  However I have discovered that variant
types can slow down a program quite a bit.  For example, consider the
Ackermann function implemented for int (see program 1) and the same
Ackermann function implemented for a "type value = Int of int | String
of int" (program 2).  The second one is ten times slower!  (Using
ocamlopt.)

I am working on a program that translates code from scheme into ocaml.
 (Later I will try python into ocaml.)  Because it is a dynamicly
typed language, the most natural translation would make everything a
function of a large variant type like:

type value = Int of int
  | Char of char
  | String of string
  (* etc. *)

However the point of my project is to translate into *efficient* ocaml
code so I would like to avoid the ten times slow down that I've seen
with variant types.

Is there is a way to make program 2 as fast as program 1?  (Still
keeping the variant type of course.)

Why is program 2 so slow?  Is it the cost of heap allocation?  The GC?
 Having to access memory instead of registers?  Constructor matching?

One trick that I've thought of would be to push some of the tag
information into the pointer.  For example an Int would be represented
exactly like an int (i.e. an odd address), but a String would be
represented as a pointer to a block (i.e. an even address).  I could
hack around with the Obj module (indeed some experiments with that
show very good performance), but it would be nice if there were a
cleaner way such as a way to annotate constructors so OCaml tries to
avoid using blocks for them.  (Stuffing tags into pointers is very
common in the Scheme world.)

Along those lines, what does the OCaml GC do when it finds an even but
not multiple of 4 pointer?  Somewhere in the documentation it noted
that everything is allocated on 4 byte boundaries so in theory the GC
could ignore everything that is not 4 byte aligned, not just odd
pointers.  This would leave more bits available for stuffing the tag
into the pointer.  For example, XX1 could be an Int constructor, 110 a
char constructor, 010 a bool constructor and X00 be a pointer to any
of the constructors that are represented with blocks.

Michael D. Adams
mdmkolbe@gmail.com

(* Program 1 *)
(*************)
let rec ack m n =
  if m = 0 then n + 1
  else if n = 0 then ack (m - 1) 1
  else ack (m - 1) (ack m (n - 1))

let _ = ack 3 9

(* Program 2 *)
(*************)
type value = Int of int | String of string

let zero = Int 0
let one = Int 1
let plus x y = match x,y with
  | Int x,Int y -> Int (x + y)
let minus x y = match x,y with
  | Int x,Int y -> Int (x - y)

let rec ack m n =
  if m = zero then plus n one
  else if n = zero then ack (minus m one) one
  else ack (minus m one) (ack m (minus n one))

let _ = ack (Int 3) (Int 9)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-25 23:53 Efficency of varient types Michael D. Adams
@ 2005-11-26  0:31 ` Nicolas Cannasse
  2005-11-26  1:22   ` Brian Hurt
                     ` (2 more replies)
  2005-11-26  1:18 ` Jon Harrop
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 27+ messages in thread
From: Nicolas Cannasse @ 2005-11-26  0:31 UTC (permalink / raw)
  To: Michael D. Adams; +Cc: caml-list

Michael D. Adams wrote:
> I have recently learned about OCaml and have been impressed by how
> fast it is in the benchmarks.  However I have discovered that variant
> types can slow down a program quite a bit.  For example, consider the
> Ackermann function implemented for int (see program 1) and the same
> Ackermann function implemented for a "type value = Int of int | String
> of int" (program 2).  The second one is ten times slower!  (Using
> ocamlopt.)

In order to understand what there is such difference, it's useful to 
learn the ocaml memory model at runtime :

- int are 31 bits unboxed value with last bit set to 1 in order to 
differenciate them with GC allocated pointers.
- tagged variants are GC allocated blocks with a discriminating "tag" in 
the header.
- chars and booleans are integers at runtime

The second bit is used to mark an exception but it's only internal and 
temporary when dealing with callbacks.

If you have a tagged variant where all constructors have a parameter, 
you can use Obj module to unbox the Int variant but the code is a lot 
less readable.

Nicolas


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-25 23:53 Efficency of varient types Michael D. Adams
  2005-11-26  0:31 ` [Caml-list] " Nicolas Cannasse
@ 2005-11-26  1:18 ` Jon Harrop
  2005-11-27 14:57 ` Lukasz Stafiniak
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Jon Harrop @ 2005-11-26  1:18 UTC (permalink / raw)
  To: caml-list

On Friday 25 November 2005 23:53, Michael D. Adams wrote:
> I am working on a program that translates code from scheme into ocaml.
>  (Later I will try python into ocaml.)  Because it is a dynamicly
> typed language, the most natural translation would make everything a
> function of a large variant type like:
>
> type value = Int of int
>
>   | Char of char
>   | String of string

Yes, exactly.

> However the point of my project is to translate into *efficient* ocaml
> code so I would like to avoid the ten times slow down that I've seen
> with variant types.

This is the main reason why dynamically typed languages are typically much 
slower than statically typed languages.

Look at my ray tracer comparison:

  http://www.ffconsultancy.com/free/ray_tracer/languages.html

Adding type declarations makes the Lisp implementation much faster because, 
without them and without static typing, the Lisp compiler (SBCL) resorts to 
boxing almost everything as you have done.

The Stalin-compiled Scheme implementation is vastly faster than the 
SBCL-compiled Lisp implementation for many reasons. One of the most important 
reasons is that Stalin is a whole-program optimising compiler and spends a 
significant part of its 10 minute compile time removing as much boxing as 
possible.

> Is there is a way to make program 2 as fast as program 1?  (Still
> keeping the variant type of course.)

No.

Transforming from version 2 to version 1 (unboxing) is exactly what the 
optimisers in the compilers of dynamically typed languages are doing.

> Why is program 2 so slow?  Is it the cost of heap allocation?  The GC?
> Having to access memory instead of registers?  Constructor matching?

All of the above.

> One trick that I've thought of would be to push some of the tag
> information into the pointer.  For example an Int would be represented
> exactly like an int (i.e. an odd address), but a String would be
> represented as a pointer to a block (i.e. an even address).  I could
> hack around with the Obj module (indeed some experiments with that
> show very good performance), but it would be nice if there were a
> cleaner way such as a way to annotate constructors so OCaml tries to
> avoid using blocks for them.  (Stuffing tags into pointers is very
> common in the Scheme world.)

You would probably be better off trying to apply the unboxing optimisations 
(e.g. that Stalin does).

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-26  0:31 ` [Caml-list] " Nicolas Cannasse
@ 2005-11-26  1:22   ` Brian Hurt
  2005-11-26  9:39     ` Nicolas Cannasse
  2005-11-28  0:17     ` Obj or not Obj Christophe Raffalli
  2005-11-26  2:54   ` [Caml-list] Efficency of varient types skaller
  2005-11-27  5:06   ` Michael D. Adams
  2 siblings, 2 replies; 27+ messages in thread
From: Brian Hurt @ 2005-11-26  1:22 UTC (permalink / raw)
  To: Nicolas Cannasse; +Cc: Michael D. Adams, caml-list



On Sat, 26 Nov 2005, Nicolas Cannasse wrote:

> If you have a tagged variant where all constructors have a parameter, you can 
> use Obj module to unbox the Int variant but the code is a lot less readable.

As a general rule, unless you work at INRIA, you should never, ever use 
the Obj module.  Not only will the code be signifigantly less readable, 
it'll also be signifigantly more error prone, especially obscure, hard to 
debug unless you are intimately familiar with the inners of the garbage 
collector, segfault you program at random times, boy doesn't this feel 
like C++ bugs.

Brian


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-26  0:31 ` [Caml-list] " Nicolas Cannasse
  2005-11-26  1:22   ` Brian Hurt
@ 2005-11-26  2:54   ` skaller
  2005-11-27  5:06   ` Michael D. Adams
  2 siblings, 0 replies; 27+ messages in thread
From: skaller @ 2005-11-26  2:54 UTC (permalink / raw)
  To: Nicolas Cannasse; +Cc: Michael D. Adams, caml-list

On Sat, 2005-11-26 at 01:31 +0100, Nicolas Cannasse wrote:
> Michael D. Adams wrote:
> > I have recently learned about OCaml and have been impressed by how
> > fast it is in the benchmarks.  However I have discovered that variant
> > types can slow down a program quite a bit.  

> In order to understand what there is such difference, it's useful to 
> learn the ocaml memory model at runtime :

As everyone else says .. but I'll add my piece since I have studied
this particular function in great detail:

http://felix.sourceforge.net/current/speed/en_flx_perf_0005.html

The performance is *heavily* affected by stack space per
function call (in fact, directly proportional to it with
the limits of cache size). Boxing isn't an issue -- any
translator boxing anything in this test is out of the race.

This is a terrible function to choose to compare general
performance, precisely because it is so heavily dominated by
optimisation of stack use. More general benchmarks including
for example Jon Harrop's ray tracer show Ocaml does quite
well, despite boxing. So except in very particular cases,
I wouldn't worry too much about Ocaml being slow.

My own translator Felix (written in Ocaml) wins this test easily 
for unknown reasons -- trouncing even gcc, despite the fact
it actually generates C++ code compiled with gcc using
the same options.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-26  1:22   ` Brian Hurt
@ 2005-11-26  9:39     ` Nicolas Cannasse
  2005-11-28  0:17     ` Obj or not Obj Christophe Raffalli
  1 sibling, 0 replies; 27+ messages in thread
From: Nicolas Cannasse @ 2005-11-26  9:39 UTC (permalink / raw)
  To: Brian Hurt; +Cc: Michael D. Adams, caml-list

Brian Hurt wrote:
> 
> 
> On Sat, 26 Nov 2005, Nicolas Cannasse wrote:
> 
>> If you have a tagged variant where all constructors have a parameter, 
>> you can use Obj module to unbox the Int variant but the code is a lot 
>> less readable.
> 
> 
> As a general rule, unless you work at INRIA, you should never, ever use 
> the Obj module.  Not only will the code be signifigantly less readable, 
> it'll also be signifigantly more error prone, especially obscure, hard 
> to debug unless you are intimately familiar with the inners of the 
> garbage collector, segfault you program at random times, boy doesn't 
> this feel like C++ bugs.
> 
> Brian
> 

Still, you have sometimes to write C libraries, even if they can 
compromise the safety of the program. Using Obj in a OCaml library with 
an abstract interface in order to get additional optimizations can be 
useful, but only if you know what you're doing.

The original author was talking about some code generator that was 
targeting OCaml so I guess in that case especialy using Obj makes sense.

Nicolas


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-26  0:31 ` [Caml-list] " Nicolas Cannasse
  2005-11-26  1:22   ` Brian Hurt
  2005-11-26  2:54   ` [Caml-list] Efficency of varient types skaller
@ 2005-11-27  5:06   ` Michael D. Adams
  2005-11-27  5:45     ` Brian Hurt
  2 siblings, 1 reply; 27+ messages in thread
From: Michael D. Adams @ 2005-11-27  5:06 UTC (permalink / raw)
  To: caml-list

On 11/25/05, Nicolas Cannasse <ncannasse@motion-twin.com> wrote:
> Michael D. Adams wrote:
> > I have recently learned about OCaml and have been impressed by how
> > fast it is in the benchmarks.  However I have discovered that variant
> > types can slow down a program quite a bit.  For example, consider the
> > Ackermann function implemented for int (see program 1) and the same
> > Ackermann function implemented for a "type value = Int of int | String
> > of int" (program 2).  The second one is ten times slower!  (Using
> > ocamlopt.)
>
> In order to understand what there is such difference, it's useful to
> learn the ocaml memory model at runtime :
>
> - int are 31 bits unboxed value with last bit set to 1 in order to
> differenciate them with GC allocated pointers.
> - tagged variants are GC allocated blocks with a discriminating "tag" in
> the header.
> - chars and booleans are integers at runtime

That all makes sense.

> The second bit is used to mark an exception but it's only internal and
> temporary when dealing with callbacks.

Could to elaborate on this "second bit"?  (I assume you mean the bit
in the two's position.)  Or is there a document that might describe
it?

I am very interested in how this bit is used and whether the GC will
ignore values ending with the bits 10.

> If you have a tagged variant where all constructors have a parameter,
> you can use Obj module to unbox the Int variant but the code is a lot
> less readable.

I agree, which is why it was my hope that OCaml might do some of that
for me.  Consider a home brew bool type, "type mybool = Mytrue |
Myfalse".  If the compiler were smart enough, it could represent that
as an unboxed type.  From there it might be a small step to
semi-unboxed types such as the one I started this discussion with,
"type value = Int of int | Bool of bool | String of string".

It sounds like that is not possible, so I have to settle for the Obj module.

Michael D. Adams
mdmkolbe@gmail.com

P.S. I should note that experiments using the Obj module to manually
do semi-boxing show very good performance.  The following code
performs only 50% slower than a completely unboxed version.  Compare
that with 900% slower with boxed, variant types.

let plus x y =
  if Obj.is_int (Obj.repr x) && Obj.is_int (Obj.repr y)
    then Obj.magic ((Obj.magic x) + (Obj.magic y))
    else Obj.magic (0)

let minus x y =
  if Obj.is_int (Obj.repr x) && Obj.is_int (Obj.repr y)
    then Obj.magic ((Obj.magic x) - (Obj.magic y))
    else Obj.magic (0)

let zero = 0
let one = 1

let rec ack m n =
  if m = zero then plus n one
  else if n = zero then ack (minus m one) one
  else ack (minus m one) (ack m (minus n one))

let _ = ack (3) (9)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-27  5:06   ` Michael D. Adams
@ 2005-11-27  5:45     ` Brian Hurt
  2005-11-27 10:02       ` Nicolas Cannasse
  2005-11-27 15:35       ` Michael D. Adams
  0 siblings, 2 replies; 27+ messages in thread
From: Brian Hurt @ 2005-11-27  5:45 UTC (permalink / raw)
  To: Michael D. Adams; +Cc: caml-list



On Sun, 27 Nov 2005, Michael D. Adams wrote:

> I agree, which is why it was my hope that OCaml might do some of that
> for me.  Consider a home brew bool type, "type mybool = Mytrue |
> Myfalse".  If the compiler were smart enough, it could represent that
> as an unboxed type.

The compiler is smart enough to do that already- variants without 
associated data are represented as ints.

> From there it might be a small step to
> semi-unboxed types such as the one I started this discussion with,
> "type value = Int of int | Bool of bool | String of string".

The problem here is that the variants take two words- one word which says 
which variant it is, and the second word which is the unboxed int, unboxed 
bool, or pointer to the boxed string.

The problem with unboxing multi-word structures is that this would break 
other things.  For example, consider List.rev:

let rev lst =
     let rec loop accum = function
         | h :: t -> loop (h :: accum) t
         | [] -> accum
     in
     loop [] lst
;;

This function has a type of 'a list -> 'a list.  Unlike C++ templates, 
Ocaml only generates one copy of this function that works on all types. 
This is because all the list elements fit into a single word- either their 
unboxed types that fit into a single word (int, boolean, char), or the 
members of the list are just pointers to the real (boxed) data.

Brian


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-27  5:45     ` Brian Hurt
@ 2005-11-27 10:02       ` Nicolas Cannasse
  2005-11-27 15:35       ` Michael D. Adams
  1 sibling, 0 replies; 27+ messages in thread
From: Nicolas Cannasse @ 2005-11-27 10:02 UTC (permalink / raw)
  To: Brian Hurt; +Cc: Michael D. Adams, caml-list

Brian Hurt wrote:
 >
 >
 > On Sun, 27 Nov 2005, Michael D. Adams wrote:
 >
 >> I agree, which is why it was my hope that OCaml might do some of that
 >> for me.  Consider a home brew bool type, "type mybool = Mytrue |
 >> Myfalse".  If the compiler were smart enough, it could represent that
 >> as an unboxed type.
 >
 >
 > The compiler is smart enough to do that already- variants without
 > associated data are represented as ints.

Exactly.
So actually the custom bool can be "safely" casted to the corresponding 
real bool since they have same runtime representation.

 >> From there it might be a small step to
 >> semi-unboxed types such as the one I started this discussion with,
 >> "type value = Int of int | Bool of bool | String of string".
 >
 >
 > The problem here is that the variants take two words- one word which
 > says which variant it is, and the second word which is the unboxed int,
 > unboxed bool, or pointer to the boxed string.

It's a bit more complex.
A variant have a header storing :
  - it's tag
  - some GC bits
  - it's size (number of fields)
Then the data is followed by N fields which are the variant parameters.
That's why there is a semantical difference between

   Tag of int * int

and

   Tag of (int * int)

In the first case it's a 2 fields block, in the second it's a on-field 
block storing a 2 fields tuple.

I think that the compiler is pretty good at unboxing floats and maybe 
int32 when doing some numerical calculus. Other optimizations are more 
tricky.

Nicolas


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-25 23:53 Efficency of varient types Michael D. Adams
  2005-11-26  0:31 ` [Caml-list] " Nicolas Cannasse
  2005-11-26  1:18 ` Jon Harrop
@ 2005-11-27 14:57 ` Lukasz Stafiniak
  2005-11-27 15:47   ` Lukasz Stafiniak
  2005-11-28  8:14   ` Christophe Raffalli
  2005-11-28  7:24 ` David Baelde
  2005-12-01 17:05 ` Stefan Monnier
  4 siblings, 2 replies; 27+ messages in thread
From: Lukasz Stafiniak @ 2005-11-27 14:57 UTC (permalink / raw)
  To: Michael D. Adams; +Cc: caml-list

2005/11/26, Michael D. Adams <mdmkolbe@gmail.com>:
> I have recently learned about OCaml and have been impressed by how
> fast it is in the benchmarks.  However I have discovered that variant
> types can slow down a program quite a bit.
[...]
> I am working on a program that translates code from scheme into ocaml.
>  (Later I will try python into ocaml.)  Because it is a dynamicly
> typed language, the most natural translation would make everything a
> function of a large variant type like:
>
> type value = Int of int
>  | Char of char
>  | String of string
>  (* etc. *)
>

Use MetaOCaml with native-code compilation and untagging.
MetaOCaml retypechecks and outputs the untagged version if it can.

Best Regards,
Lukasz


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-27  5:45     ` Brian Hurt
  2005-11-27 10:02       ` Nicolas Cannasse
@ 2005-11-27 15:35       ` Michael D. Adams
  2005-11-27 18:08         ` Brian Hurt
  1 sibling, 1 reply; 27+ messages in thread
From: Michael D. Adams @ 2005-11-27 15:35 UTC (permalink / raw)
  To: caml-list

On 11/27/05, Brian Hurt <bhurt@spnz.org> wrote:
> On Sun, 27 Nov 2005, Michael D. Adams wrote:
> > I agree, which is why it was my hope that OCaml might do some of that
> > for me.  Consider a home brew bool type, "type mybool = Mytrue |
> > Myfalse".  If the compiler were smart enough, it could represent that
> > as an unboxed type.
>
> The compiler is smart enough to do that already- variants without
> associated data are represented as ints.

I am glad to hear that.

> > From there it might be a small step to
> > semi-unboxed types such as the one I started this discussion with,
> > "type value = Int of int | Bool of bool | String of string".
>
> The problem here is that the variants take two words- one word which says
> which variant it is, and the second word which is the unboxed int, unboxed
> bool, or pointer to the boxed string.

I should clarify what I am proposing because it also only requires one
word.  The bottom one or two bits of that word would say which variant
it is.  The remaining bits representing any data with that variant. 
(Obviously this would only work if the associated data is small
enough, but fortunately most of the primitives are small enough (e.g.
int (31 bits), bool (1 bit), char (8 bits).))

(Please forgive me if the following is verbose, I want to make sure my
idea is clear.  In the scheme world I think it's called "pointer
tagging" or "tagged pointers".)

Consider "type value = Int of int | Bool of bool | None | String of
string".  For the moment I will assume that the GC only assumes 4 byte
aligned values are pointers.

Since int is 31 bits that leaves one bit to say what type it is.  We
don't want the GC grabbing it, so we will represent it with a single
'1' bit appended.  A 6 would be represented by 1101 (110 for the 6 and
1 for the "tag").  This means that "Int 6" only takes one word.

Let's look at "Bool of bool".  We only need one bit of representation
for that, leaving 31 bits free for tagging.  We can represent it by
appending a three bit tag of '010' to the regular bool value.  So true
is '1010' and false is '0010'.  (NB: this depends on the GC ignoring
non-4-byte-aligned values.) Again "Bool true" and "Bool false" would
only be one word.

The "None" case can use any free bit pattern.  We can give it the
three bit tag '110'.  Since there is no data, "None" would be '0110',
only one word.

A string can not fit in one word, so it will have to be boxed. We
would store a pointer to the block structure storing the data that
Nicolas mentioned (tag, GC bits, size).  Since all pointers allocated
by OCaml are four byte aligned, it would be something like '10001100'.
 Because the last two bits are '00' that forms as a sort of 'tag' that
signals "Hey, this thing is a pointer to find out what variant it is
you will have to actually dereference it and find the tag in the block
structure".  Again this pointer would be one word long.

As you mentioned functions like List.rev don't care what their
arguments are so long as they are a single word.  The above
representation satisfies that.

The only other interesting part is pattern matching.  Matching would
require examination of the final bits of a value.  The final bits
would determine the constructor, unless they were '00' (i.e. word
aligned) in which case the value is a pointer to a block that contains
the tag that determines the constructor.  In portable assembly (a.k.a.
C) a match operation would look something like the following.

if (x & 0x03) {
  /* It's an unboxed constructor */
  if (x & 0x01) { /* It's an Int */ }
  else if (x & 0x07 == 0x02) { /* It's a Bool */ }
  else if (x & 0x07 == 0x06) { /* It's a None */ }
  else { /* Malformed value */ }
} else {
  /* It's word aligned so it's a boxed constructor */
  switch (Tag_val(x)) {
    case STRING_TAG: /* It's a String */ break;
    default: /* Malformed value */
  }
}


Michael D. Adams
mdmkolbe@gmail.com


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-27 14:57 ` Lukasz Stafiniak
@ 2005-11-27 15:47   ` Lukasz Stafiniak
  2005-11-28  8:14   ` Christophe Raffalli
  1 sibling, 0 replies; 27+ messages in thread
From: Lukasz Stafiniak @ 2005-11-27 15:47 UTC (permalink / raw)
  To: Michael D. Adams; +Cc: caml-list

2005/11/27, Lukasz Stafiniak <lukstafi@gmail.com>:
> 2005/11/26, Michael D. Adams <mdmkolbe@gmail.com>:
> > I am working on a program that translates code from scheme into ocaml.
> Use MetaOCaml with native-code compilation and untagging.

Err,  in your case untagging and printed representation. Not much of
MetaOCaml functionality, and only works when program is not actually
dynamically typed. But MetaOCaml is always worth giving a try :)

> Best Regards,
> Lukasz
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-27 15:35       ` Michael D. Adams
@ 2005-11-27 18:08         ` Brian Hurt
  2005-12-02 15:07           ` Michael D. Adams
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Hurt @ 2005-11-27 18:08 UTC (permalink / raw)
  To: Michael D. Adams; +Cc: caml-list



On Sun, 27 Nov 2005, Michael D. Adams wrote:

> I should clarify what I am proposing because it also only requires one
> word.  The bottom one or two bits of that word would say which variant
> it is.  The remaining bits representing any data with that variant.
> (Obviously this would only work if the associated data is small
> enough, but fortunately most of the primitives are small enough (e.g.
> int (31 bits), bool (1 bit), char (8 bits).))

Except that this runs into the problem of knowing the subtle details of 
the GC- not only how it works now, but what (if any) gaurentees are being 
given for future development.  For example, is it gaurenteed that the GC 
will recognize words with the low two bits being 10 as not pointers?  Or 
does the low bit being 0 means the GC may (or may not) assume it's a 
pointer, and weird word values would cause the GC to segfault?  Even if it 
works now, will it continue to work with Ocaml 3.10, 3.20, 4.0, etc.  I 
don't know- I don't work at INRIA.  This is exactly what I was warning 
about earlier.

The other thing I'll throw in here is a warning against premature 
optimization.  OK, so method A is 15x slower than method B.  But how big 
of an effect will this have on the overall program?  If method B is one 
clock cycle, and method A is 15 clock cycles, probably not much.  15 clock 
cycles is about the cost of one mispredicted branch, and much, much 
cheaper than a cache miss.  If you're doing anything at all interesting, 
the most likely case is that the 14 clock difference will be minor 
compared to the costs of the rest of the program.

Brian


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Obj or not Obj
  2005-11-26  1:22   ` Brian Hurt
  2005-11-26  9:39     ` Nicolas Cannasse
@ 2005-11-28  0:17     ` Christophe Raffalli
  2005-11-28  8:41       ` [Caml-list] " Nicolas Cannasse
  2005-11-28  8:43       ` Alain Frisch
  1 sibling, 2 replies; 27+ messages in thread
From: Christophe Raffalli @ 2005-11-28  0:17 UTC (permalink / raw)
  To: Brian Hurt, caml-list


>
> As a general rule, unless you work at INRIA, you should never, ever 
> use the Obj module.  Not only will the code be signifigantly less 
> readable, it'll also be signifigantly more error prone, especially 
> obscure, hard to debug unless you are intimately familiar with the 
> inners of the garbage collector, segfault you program at random times, 
> boy doesn't this feel like C++ bugs.
>

As nicolas Canasse already said this is a too strong statement, that 
should/could be commented by somone else at INRIA
(otherwise, make install should not install obj at all ;-). Here are a 
few reason

- using Obj requires the same knowledge as writing C interface ... and C 
is more obscure that OCaml's code with Obj.
For instance, my bindlib library for bound variables would be much more 
error prone in C .. and by the way what do you think of
software distributed as patch for Ocaml like FreshOcaml ...

- then Obj is sometimes necessary, typically
  - when trying to have OCaml accept programs that it thinks wrongly to 
be not typable (like programs extracted from Coq proof)
  - when you have an "environment" (list, array, etc ...) that needs to 
contain variables of various AND unknown type (unkown because you are
    writing libraries or general tools, this is the case of bindlib, 
code genarated by ocamlyacc and  many others). Moreover, all these programs
    would be  typable using dependent types ...
  - for all the examples above using C would be stupid ! especially for 
ocamlyacc ;-)

Nevertheless, I almost agree, you should just make sure before using Obj 
that
  - you read and learn a lot about the runtime
  - you can not find another way round, even if it cost a little time

Just a small word about one of my bad experiment with Obj about tail rec 
map ... it is easy to get a tail
rec map using Obj to make ocaml's list mutable ... it is just slower 
that the original map and using roughly the same
amount of memory, because every cons cell  you gain ... is now in the 
list of grey val as soon as your list does not
fit in the minor heap anymore !! So even writing you own mutable_list 
type would be as bad !

Explanation: the list of grey val is a list of pointers from the major 
heap to the minor heap which
are created when using mutable data structure and which would break 
incremental GC if each minor GC did not scan
the list of grey val) ...

This also does mean that you should know a lot about the runtime when 
using mutable data structures and caring about performance !!
 
So Obj or not Obj (or mutable or not mutable) is always a hard question 
that needs time ...
but definitely, I think the above sentence is too strong.




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-25 23:53 Efficency of varient types Michael D. Adams
                   ` (2 preceding siblings ...)
  2005-11-27 14:57 ` Lukasz Stafiniak
@ 2005-11-28  7:24 ` David Baelde
  2005-11-28  7:49   ` Jacques Garrigue
  2005-11-28  7:53   ` Ville-Pertti Keinonen
  2005-12-01 17:05 ` Stefan Monnier
  4 siblings, 2 replies; 27+ messages in thread
From: David Baelde @ 2005-11-28  7:24 UTC (permalink / raw)
  To: Michael D. Adams; +Cc: caml-list

Hi,

First, I'd like to point that you're talking about sum types and not
variants. For short, variants are the backquoted labels `Int of int,
and I think they cost more at runtime, for they cannot be encoded as
integers.

My solution for your problem would be to write a simple type inference
in order to check your scheme programs before translating. It's not so
hard if you use tools like lambda-prolog for example. Of course you
won't be able to translate untypable programs, but I think that's no
problem. The only difficulty which I foresee is having polymorphism
infered, at least it is not obvious to do with lambda-prolog, and
you'll probably miss it.

Cheers.
--
David


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-28  7:24 ` David Baelde
@ 2005-11-28  7:49   ` Jacques Garrigue
  2005-11-28 10:01     ` Jon Harrop
  2005-11-28  7:53   ` Ville-Pertti Keinonen
  1 sibling, 1 reply; 27+ messages in thread
From: Jacques Garrigue @ 2005-11-28  7:49 UTC (permalink / raw)
  To: david.baelde; +Cc: caml-list

From: David Baelde <david.baelde@gmail.com>

> First, I'd like to point that you're talking about sum types and not
> variants. For short, variants are the backquoted labels `Int of int,
> and I think they cost more at runtime, for they cannot be encoded as
> integers.

Unfortunately there is no such clear distinction: sum types are also
often called variant types (inside the compiler too), and polymorphic
variants can be abbreviated as "variants". Some people even call the
latter "open sum types", to make things even muddier.

More importantly, polymorphic variants are internally encoded as
integers, so there is no representation cost for constant tags.
Polymorphic variants with value arguments take extra space, as they
put the variant tag in an extra field, and cannot be optimized for the
multi-argument case, but we are then talking of cases where allocation
is required anyway.

Jacques Garrigue


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-28  7:24 ` David Baelde
  2005-11-28  7:49   ` Jacques Garrigue
@ 2005-11-28  7:53   ` Ville-Pertti Keinonen
  1 sibling, 0 replies; 27+ messages in thread
From: Ville-Pertti Keinonen @ 2005-11-28  7:53 UTC (permalink / raw)
  To: david.baelde; +Cc: Michael D. Adams, caml-list

On Mon, 2005-11-28 at 08:24 +0100, David Baelde wrote:

> First, I'd like to point that you're talking about sum types and not
> variants. For short, variants are the backquoted labels `Int of int,
> and I think they cost more at runtime, for they cannot be encoded as
> integers.

Sum types are also known as variant types.  This terminology is even
used in the OCaml documentation.

What you're talking about are *polymorphic* variants, and they are in
fact encoded as integers, but they don't fit in the tag-portion of the
GC header so they take up more space than regular variants if they have
any parameters.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-27 14:57 ` Lukasz Stafiniak
  2005-11-27 15:47   ` Lukasz Stafiniak
@ 2005-11-28  8:14   ` Christophe Raffalli
  1 sibling, 0 replies; 27+ messages in thread
From: Christophe Raffalli @ 2005-11-28  8:14 UTC (permalink / raw)
  To: Lukasz Stafiniak; +Cc: Michael D. Adams, caml-list

Lukasz Stafiniak a écrit :

>2005/11/26, Michael D. Adams <mdmkolbe@gmail.com>:
>  
>
>>I have recently learned about OCaml and have been impressed by how
>>fast it is in the benchmarks.  However I have discovered that variant
>>types can slow down a program quite a bit.
>>    
>>
>[...]
>  
>
>>I am working on a program that translates code from scheme into ocaml.
>> (Later I will try python into ocaml.)  Because it is a dynamicly
>>typed language, the most natural translation would make everything a
>>function of a large variant type like:
>>
>>type value = Int of int
>> | Char of char
>> | String of string
>> (* etc. *)
>>
>>    
>>

I would suggest that you run a typing algorithm on your scheme program 
with an extra type "top" for
polymorphic function where object are boxed as proposed above.

Then, when typing fails, you insert in your code the proper conversion 
which are defined by induction over types:


(from int to top) : fun x -> Int x
(from char to top) : fun x -> Char x
...

(from top to int) : function Int x -> x | _ -> failwith "bad scheme program"
...

(from a to a) : identity (needed for optimization bellow)

(from a -> b to a' -> b') : fun f x -> (from b to b') f ((from a' to a) x)

(from a list to b list) : fun l -> List.map (from a to b) l

The pb here is to try to avoid as much as possible the composed function 
(the two latest, especially the latest, the one for
function is not that bad, it does not allocate a lot of memory at 
runtime) ... a possible way to do this is the following:

in the typing algorithm, do not solve the unification constraint 
immediately, just collect them as triple

(a,b,p) where a and b are the type that should be equal and p is the 
position in the code where to insert the (from to ...) function.

Then, consider the smallest relation on type variable with a < b if 
something like (b, list a, p) or (list a, b) appear (idem for arrow and
type constructor ...

there may be cycle in this relation, but do your best to propagate 
unification constraint from the "maximum" type for this relation ...
the solution being not unique, this is not a trivial task.

Moreover, if you compile a library, you may get a very nice solution for 
the library ... that require a lot of translation when you
finally use the library. But then, the only solution is probably to ask 
a little help from the programmer ...

This is the hardness of this problem that makes statically typed 
language better than dynamically typed language.

For lisp, there is one more specific pb, you use "cons" both for pairs 
and lists so for each "cons" there is a choice to translate it as (,) or 
(::)
... exploring all the possibility will probably increase a lot the 
complexity of the algorithm... but feaseable solution may be possible by 
trying
to delay the problem again chosing the order in which you solve 
unification constraint (you try to make "pertinent" information 
propagate before
doing compromising choices) ...

But I am sure this is an active field of reasearch in the lisp compiler 
community ...

I hope this helps.

Christophe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Obj or not Obj
  2005-11-28  0:17     ` Obj or not Obj Christophe Raffalli
@ 2005-11-28  8:41       ` Nicolas Cannasse
  2005-11-28  9:27         ` Christophe Raffalli
  2005-11-28  9:33         ` skaller
  2005-11-28  8:43       ` Alain Frisch
  1 sibling, 2 replies; 27+ messages in thread
From: Nicolas Cannasse @ 2005-11-28  8:41 UTC (permalink / raw)
  To: Christophe Raffalli; +Cc: Brian Hurt, caml-list

 > Just a small word about one of my bad experiment with Obj about tail rec
 > map ... it is easy to get a tail
 > rec map using Obj to make ocaml's list mutable ... it is just slower
 > that the original map and using roughly the same
 > amount of memory, because every cons cell  you gain ... is now in the
 > list of grey val as soon as your list does not
 > fit in the minor heap anymore !! So even writing you own mutable_list
 > type would be as bad !
 >
 > Explanation: the list of grey val is a list of pointers from the major
 > heap to the minor heap which
 > are created when using mutable data structure and which would break
 > incremental GC if each minor GC did not scan
 > the list of grey val) ...

This is not correct.
When you're building a mutable list you're allocating first in the minor 
heap, then all cons go into the major heap at once, so only the "last" 
cons before a GC cycle is greyed. It is indeed true that for short 
lists, using the stack is better than using mutable structures.

But using the stack is getting speed against correctness, since it will 
overflow for big lists.

Nicolas


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Obj or not Obj
  2005-11-28  0:17     ` Obj or not Obj Christophe Raffalli
  2005-11-28  8:41       ` [Caml-list] " Nicolas Cannasse
@ 2005-11-28  8:43       ` Alain Frisch
  1 sibling, 0 replies; 27+ messages in thread
From: Alain Frisch @ 2005-11-28  8:43 UTC (permalink / raw)
  To: Christophe Raffalli; +Cc: Brian Hurt, caml-list

Christophe Raffalli wrote:
> - using Obj requires the same knowledge as writing C interface ...

It requires more knowledge: you must know how GC roots in the stack are
detected in native code. In this respect, (Obj.magic 0) and (Obj.magic
()) are not the same, contrary to what you might believe if you only
know how to write a C interface. (Obj.magic 0) makes the code generator
believe that the result can only be an integer; it will thus not traced
by the GC. Now, the same happens for the result of an expression like
(if ... then Obj.magic 0 else ...) because the compiler knows, thanks to
the type system, that if the then branch is an integer, then so is the
else branch. You can imagine the consequences. (I've made this mistake -
good thing Xavier was not too far.)

-- Alain


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Obj or not Obj
  2005-11-28  8:41       ` [Caml-list] " Nicolas Cannasse
@ 2005-11-28  9:27         ` Christophe Raffalli
  2005-11-28  9:33         ` skaller
  1 sibling, 0 replies; 27+ messages in thread
From: Christophe Raffalli @ 2005-11-28  9:27 UTC (permalink / raw)
  To: Nicolas Cannasse; +Cc: Christophe Raffalli, Brian Hurt, caml-list

[-- Attachment #1: Type: text/plain, Size: 1860 bytes --]

Nicolas Cannasse a écrit :
>> Just a small word about one of my bad experiment with Obj about tail rec
>> map ... it is easy to get a tail
>> rec map using Obj to make ocaml's list mutable ... it is just slower
>> that the original map and using roughly the same
>> amount of memory, because every cons cell  you gain ... is now in the
>> list of grey val as soon as your list does not
>> fit in the minor heap anymore !! So even writing you own mutable_list
>> type would be as bad !
>>
>> Explanation: the list of grey val is a list of pointers from the major
>> heap to the minor heap which
>> are created when using mutable data structure and which would break
>> incremental GC if each minor GC did not scan
>> the list of grey val) ...
> 
> This is not correct.
> When you're building a mutable list you're allocating first in the minor
> heap, then all cons go into the major heap at once, so only the "last"
> cons before a GC cycle is greyed. It is indeed true that for short
> lists, using the stack is better than using mutable structures.
> 
> But using the stack is getting speed against correctness, since it will
> overflow for big lists.
> 

I did timing for big lists with mutation using object and rev +  rev_map
and the later was faster ...

If the list is big, and there is allocating computation for the car of
each list in map, then I think a lot of cons cells can be greyed once
... even all if each computation in the car triggers at least one minor
collection ... but then the complexity of map does not matter ... so my
explanation for the observed timing was wrong ...

may be this is just the test to know if one have to grey the cons-cell
that's costly ?

any way I now think rev + rev_map is better than dirty Obj for that pb.

remark: I only tested on my athlon 32 ...



> Nicolas


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 894 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Obj or not Obj
  2005-11-28  8:41       ` [Caml-list] " Nicolas Cannasse
  2005-11-28  9:27         ` Christophe Raffalli
@ 2005-11-28  9:33         ` skaller
  1 sibling, 0 replies; 27+ messages in thread
From: skaller @ 2005-11-28  9:33 UTC (permalink / raw)
  To: Nicolas Cannasse; +Cc: Christophe Raffalli, Brian Hurt, caml-list

On Mon, 2005-11-28 at 09:41 +0100, Nicolas Cannasse wrote:

> But using the stack is getting speed against correctness, since it will 
> overflow for big lists.

But that is an artefact of Unix implementations
designed for flat C code, rather than FPLs. In theory
both stack and heap are bounded only by the smaller of
artificial (that is, deliberately imposed) limits and
total address space.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-28  7:49   ` Jacques Garrigue
@ 2005-11-28 10:01     ` Jon Harrop
  2005-11-28 10:26       ` Luc Maranget
  0 siblings, 1 reply; 27+ messages in thread
From: Jon Harrop @ 2005-11-28 10:01 UTC (permalink / raw)
  To: caml-list

On Monday 28 November 2005 07:49, Jacques Garrigue wrote:
> More importantly, polymorphic variants are internally encoded as
> integers, so there is no representation cost for constant tags.

Is pattern matching over constant ordinary variants not more efficient than 
pattern matching over constant polymorphic variants when compiled to native 
code?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-28 10:01     ` Jon Harrop
@ 2005-11-28 10:26       ` Luc Maranget
  0 siblings, 0 replies; 27+ messages in thread
From: Luc Maranget @ 2005-11-28 10:26 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

> On Monday 28 November 2005 07:49, Jacques Garrigue wrote:
> > More importantly, polymorphic variants are internally encoded as
> > integers, so there is no representation cost for constant tags.
> 
> Is pattern matching over constant ordinary variants not more efficient than 
> pattern matching over constant polymorphic variants when compiled to native 
> code?
> 
> -- 

As far as I understandteh question, constant constructors both in
ordinary variants and polymorphic variants are encoded as integers.

Hence matching over those are basically the same. However there is
one (small) difference.

Ordinary variants are translated into consecutive integers, starting
from zero, while polymorphic variant are more arbitrary integers.
As a result the final code uses some tricks more often in the first
case.

While obvious in particular cases, the resulting efficiency gain is,
well, difficult to assess in general.

Consider for instance :

let rec fib x = match x with
| 0|1 -> 1
| n -> fib (n-1) + fib (n-2)
  Compiled code for matching will consist in one (unsigned) test.

and

let rec fob x = match x with
| `ZeroIMean|`OneISay -> 1
| n -> ...
  Compiled code for matching will consist in two (equality) tests.

Your mileage may vary, of course.


-- Luc


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Efficency of varient types
  2005-11-25 23:53 Efficency of varient types Michael D. Adams
                   ` (3 preceding siblings ...)
  2005-11-28  7:24 ` David Baelde
@ 2005-12-01 17:05 ` Stefan Monnier
  2005-12-02 15:07   ` [Caml-list] " Michael D. Adams
  4 siblings, 1 reply; 27+ messages in thread
From: Stefan Monnier @ 2005-12-01 17:05 UTC (permalink / raw)
  To: caml-list

> I am working on a program that translates code from scheme into ocaml.
>  (Later I will try python into ocaml.)  Because it is a dynamicly
> typed language, the most natural translation would make everything a
> function of a large variant type like:

Google for "soft typing".  You can start without soft typing and add it
later on to get better performance.


        Stefan


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Efficency of varient types
  2005-11-27 18:08         ` Brian Hurt
@ 2005-12-02 15:07           ` Michael D. Adams
  0 siblings, 0 replies; 27+ messages in thread
From: Michael D. Adams @ 2005-12-02 15:07 UTC (permalink / raw)
  To: caml-list

On 11/27/05, Brian Hurt <bhurt@spnz.org> wrote:
> On Sun, 27 Nov 2005, Michael D. Adams wrote:
> > I should clarify what I am proposing because it also only requires one
> > word.  The bottom one or two bits of that word would say which variant
> > it is.  The remaining bits representing any data with that variant.
> > (Obviously this would only work if the associated data is small
> > enough, but fortunately most of the primitives are small enough (e.g.
> > int (31 bits), bool (1 bit), char (8 bits).))
>
> Except that this runs into the problem of knowing the subtle details of
> the GC- not only how it works now, but what (if any) gaurentees are being
> given for future development.  For example, is it gaurenteed that the GC
> will recognize words with the low two bits being 10 as not pointers?  Or
> does the low bit being 0 means the GC may (or may not) assume it's a
> pointer, and weird word values would cause the GC to segfault?  Even if it
> works now, will it continue to work with Ocaml 3.10, 3.20, 4.0, etc.  I
> don't know- I don't work at INRIA.  This is exactly what I was warning
> about earlier.

A brief glance at the code seems to confirm that the GC is only
checking the last bit even though it could check the last two bits. 
That means I'll have to go with a 30 bit representation of ints with
the pointer tags like '01' for ints, '011' for bools, '111' for char,
'0' for pointers to boxed constructors.

<Tangent>
I use '01' for int instead of '11' b/c it requires fewer operations to
add and multiply ints.  The consideration of pointers doesn't matter
in this case.  Some have argued, though, that even when pointers do
matter, ints should still be '0' and pointers '1', b/c always
accessing pointers with an offset is cheap (there's an addressing mode
for it) and it makes int operations cheaper.
</Tangent>

Yes, this means I'm digging around the internal representation.  I
would rather OCaml were smart enough to figure out from a type
declaration that it could use pointer tags and unbox some of the
constructors.  Oh well, maybe in the next version of OCaml.  (/me
nudges Inria)

> The other thing I'll throw in here is a warning against premature
> optimization.  OK, so method A is 15x slower than method B.  But how big
> of an effect will this have on the overall program?  If method B is one
> clock cycle, and method A is 15 clock cycles, probably not much.  15 clock
> cycles is about the cost of one mispredicted branch, and much, much
> cheaper than a cache miss.  If you're doing anything at all interesting,
> the most likely case is that the 14 clock difference will be minor
> compared to the costs of the rest of the program.

I would have absolutely agreed with you until a couple months ago when
a friend pointed out to me the difference between optimizing an
application and optimizing a language.

Applications are usually dominated by some bottleneck that is slower
than the rest.  Most applications just need to be fast 'enough'. 
Having 'ls' take 10 seconds is to slow but the difference between 10
ms and 1 ms usually doesn't matter.  These two facts mean that you
shouldn't prematurely optimize your application.

Compare this with a language implementation.  If some part of the
implementation that is commonly used, then every application that uses
that implementation will have to pay for it and in cases like the cost
of primative operations on a value (which is tied to the cost of
tags), those applications will have to pay for it almost everywhere. 
Furthermore, "fast enough" doesn't apply as much with languages,
because if I make my implementation generate twice as fast code, then
that means someone optimizing their application with my implementation
only has to do half as much optimization to reach what ever their
"fast enough" level is.  So the rules for optimizing language
implementation are a little different than for applications.

Ok, there is a "fast enough" for my language implementation.  My
original objective was simply to show that if writing a compiler for a
language (e.g. scheme, python), it is a lot smarter to compile down to
OCaml than to compile down to C like a lot of implementations do.  The
results should be easier to write and will probably run faster because
you can use an already fast runtime (OCaml's) instead of writing your
own in C which will likely be slow unless you spend a lot of time on
it.  Thus I'm fast enough if I'm significantly faster than python. 
However, that 15x slow down for using tags hurts b/c python is only
about 50x slower than C.  I know there will be other slow downs I will
have to pay for and 15x blows the budget.

As far as doing type inference, I will do that eventually, but I want
to try to see how fast I can make it without type inference, then add
inference later.  A friend of mine that wrote one of the worlds
fastest Scheme implementations (i.e. on par or better than equivalent
C), claims the cost of tagging is low if done 'right' (i.e. pointer
tags instead of block tags) and that type inference wont gain much
once that is done.  So to test his claim I need to first write a very
fast implementation that uses pointer tags, then see how much it
speeds up when I add type inference.

Michael D. Adams
mdmkolbe@gmail.com


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] Re: Efficency of varient types
  2005-12-01 17:05 ` Stefan Monnier
@ 2005-12-02 15:07   ` Michael D. Adams
  0 siblings, 0 replies; 27+ messages in thread
From: Michael D. Adams @ 2005-12-02 15:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: caml-list

On 12/1/05, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> > I am working on a program that translates code from scheme into ocaml.
> >  (Later I will try python into ocaml.)  Because it is a dynamicly
> > typed language, the most natural translation would make everything a
> > function of a large variant type like:
>
> Google for "soft typing".  You can start without soft typing and add it
> later on to get better performance.

Thanks, it looks very interesting.  I haven't into it yet, and I will
be interested to see whether it matches my ideas about how it could be
done (i.e. use subsumption to find the least general generalization
instead of unification to find the least specific substitution like
you would with normal ("hard"?) type inference).

Michael D. Adams
mdmkolbe@gmail.com


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2005-12-02 15:07 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-11-25 23:53 Efficency of varient types Michael D. Adams
2005-11-26  0:31 ` [Caml-list] " Nicolas Cannasse
2005-11-26  1:22   ` Brian Hurt
2005-11-26  9:39     ` Nicolas Cannasse
2005-11-28  0:17     ` Obj or not Obj Christophe Raffalli
2005-11-28  8:41       ` [Caml-list] " Nicolas Cannasse
2005-11-28  9:27         ` Christophe Raffalli
2005-11-28  9:33         ` skaller
2005-11-28  8:43       ` Alain Frisch
2005-11-26  2:54   ` [Caml-list] Efficency of varient types skaller
2005-11-27  5:06   ` Michael D. Adams
2005-11-27  5:45     ` Brian Hurt
2005-11-27 10:02       ` Nicolas Cannasse
2005-11-27 15:35       ` Michael D. Adams
2005-11-27 18:08         ` Brian Hurt
2005-12-02 15:07           ` Michael D. Adams
2005-11-26  1:18 ` Jon Harrop
2005-11-27 14:57 ` Lukasz Stafiniak
2005-11-27 15:47   ` Lukasz Stafiniak
2005-11-28  8:14   ` Christophe Raffalli
2005-11-28  7:24 ` David Baelde
2005-11-28  7:49   ` Jacques Garrigue
2005-11-28 10:01     ` Jon Harrop
2005-11-28 10:26       ` Luc Maranget
2005-11-28  7:53   ` Ville-Pertti Keinonen
2005-12-01 17:05 ` Stefan Monnier
2005-12-02 15:07   ` [Caml-list] " Michael D. Adams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).