caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* compiling large file hogs RAM and takes a long time.
@ 2007-06-06 16:30 Sam Steingold
  2007-06-06 16:51 ` [Caml-list] " skaller
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Sam Steingold @ 2007-06-06 16:30 UTC (permalink / raw)
  To: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I wrote a parser generator for a tick data file.
The generated OCaml file is ~1Mb and contains ~120 variant type
definitions (each with 2 to ~100 variants) plus one polymorphic variant
type.
When I compiled it with ocamlopt (3.09.3), it took almost 10 minutes and
consumed ~500MB RAM (Firefox and Thunderbird were killed by the kernel
to make space for ocamlopt).
I run Linux 2.6.18.8 on a dual CPU Pentium D 2.80GHz with 1GB of RAM.
(in 32-bit mode)

So, is this something to be expected? 10 min / 500MB seems like a little
bit excessive to me, given that the file is really very simple conceptually.

Thanks!
Sam.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGZuEfPp1Qsf2qnMcRAjSYAKCFD4QBQdxk7qCGRb2LyHPHz5QrSQCgmjLl
yxHaXIU7wK6j+gWhG+zbZvA=
=nfep
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] compiling large file hogs RAM and takes a long time.
  2007-06-06 16:30 compiling large file hogs RAM and takes a long time Sam Steingold
@ 2007-06-06 16:51 ` skaller
  2007-06-06 17:05   ` Sam Steingold
  2007-06-07 15:52 ` Sam Steingold
  2007-07-09 20:22 ` large parametrized polymorphic variant type combinations take forever to compile Sam Steingold
  2 siblings, 1 reply; 29+ messages in thread
From: skaller @ 2007-06-06 16:51 UTC (permalink / raw)
  To: Sam Steingold; +Cc: caml-list

On Wed, 2007-06-06 at 12:30 -0400, Sam Steingold wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> I wrote a parser generator for a tick data file.
> The generated OCaml file is ~1Mb and contains ~120 variant type
> definitions (each with 2 to ~100 variants) plus one polymorphic variant
> type.
> When I compiled it with ocamlopt (3.09.3), it took almost 10 minutes and
> consumed ~500MB RAM (Firefox and Thunderbird were killed by the kernel
> to make space for ocamlopt).

Use ocamlopt.opt .. ocamlopt is actually a bytecode program ..

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: compiling large file hogs RAM and takes a long time.
  2007-06-06 16:51 ` [Caml-list] " skaller
@ 2007-06-06 17:05   ` Sam Steingold
  0 siblings, 0 replies; 29+ messages in thread
From: Sam Steingold @ 2007-06-06 17:05 UTC (permalink / raw)
  To: skaller; +Cc: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

skaller wrote:
> On Wed, 2007-06-06 at 12:30 -0400, Sam Steingold wrote:
>>
>> I wrote a parser generator for a tick data file.
>> The generated OCaml file is ~1Mb and contains ~120 variant type
>> definitions (each with 2 to ~100 variants) plus one polymorphic variant
>> type.
>> When I compiled it with ocamlopt (3.09.3), it took almost 10 minutes and
>> consumed ~500MB RAM (Firefox and Thunderbird were killed by the kernel
>> to make space for ocamlopt).
> 
> Use ocamlopt.opt .. ocamlopt is actually a bytecode program ..

actually, I use omake and what I see is "ocamlfind ocamlc ...".

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGZulTPp1Qsf2qnMcRAiICAJ47QKCgtDnr8eBD904qDBY79zhc/wCeLpQ+
zUgkpYY9u5oOEgZypvi3FYY=
=wf5J
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: compiling large file hogs RAM and takes a long time.
  2007-06-06 16:30 compiling large file hogs RAM and takes a long time Sam Steingold
  2007-06-06 16:51 ` [Caml-list] " skaller
@ 2007-06-07 15:52 ` Sam Steingold
  2007-06-08  1:02   ` [Caml-list] " Jacques Garrigue
  2007-07-09 20:22 ` large parametrized polymorphic variant type combinations take forever to compile Sam Steingold
  2 siblings, 1 reply; 29+ messages in thread
From: Sam Steingold @ 2007-06-07 15:52 UTC (permalink / raw)
  To: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sam Steingold wrote:
> I wrote a parser generator for a tick data file.
> The generated OCaml file is ~1Mb and contains ~120 variant type
> definitions (each with 2 to ~100 variants) plus one polymorphic variant
> type.
> When I tried to compile it (3.09.3), it took almost 10 minutes and
> consumed ~500MB RAM (Firefox and Thunderbird were killed by the kernel
> to make space for ocaml).
> I run Linux 2.6.18.8 on a dual CPU Pentium D 2.80GHz with 1GB of RAM.
> (in 32-bit mode)
> 
> So, is this something to be expected? 10 min / 500MB seems like a little
> bit excessive to me, given that the file is really very simple conceptually.

update: the main problem is a polymorphic variant type with 3050
variants and a function that returns its instances.
chunking the type, i.e., replacing
type foo=[|`A|`B|`C|`D]
with
type foo_1=[|`A|`B]
type foo_2=[|`C|`D]
type foo=[|foo_1|foo_2]
(but with ~55 variants in ~55 types) did not help.
Removing the type definition did solve the RAM problem - according to
top(1), ocamlopt.opt was taking ~80MB, and it compiled the file in about
10 minutes.

Any chance there is some quadratic code in polymorphic variant type
processing?!

Sam.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGaCmjPp1Qsf2qnMcRAn52AJ4vKgALmumuKvzgo3aiG4LaavmKJACeKfWn
glb7ZDcjbMt/lIWYWMmPF5M=
=xr5y
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Re: compiling large file hogs RAM and takes a long time.
  2007-06-07 15:52 ` Sam Steingold
@ 2007-06-08  1:02   ` Jacques Garrigue
  2007-06-08  1:51     ` skaller
  2007-06-08 12:30     ` [Caml-list] " Jacques Garrigue
  0 siblings, 2 replies; 29+ messages in thread
From: Jacques Garrigue @ 2007-06-08  1:02 UTC (permalink / raw)
  To: sds; +Cc: caml-list

From: Sam Steingold <sds@gnu.org>
> update: the main problem is a polymorphic variant type with 3050
> variants and a function that returns its instances.
> chunking the type, i.e., replacing
> type foo=[|`A|`B|`C|`D]
> with
> type foo_1=[|`A|`B]
> type foo_2=[|`C|`D]
> type foo=[|foo_1|foo_2]
> (but with ~55 variants in ~55 types) did not help.
> Removing the type definition did solve the RAM problem - according to
> top(1), ocamlopt.opt was taking ~80MB, and it compiled the file in about
> 10 minutes.
> 
> Any chance there is some quadratic code in polymorphic variant type
> processing?!

There is, and this is a known problem:
  http://caml.inria.fr/mantis/view.php?id=4053

I'm sorry, but I don't see any easy way out.
At least on the basic time complexity.
However, the space complexity could be improved, if I find what is
gobbling memory so fast.

         Jacques


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Re: compiling large file hogs RAM and takes a long time.
  2007-06-08  1:02   ` [Caml-list] " Jacques Garrigue
@ 2007-06-08  1:51     ` skaller
  2007-06-08  2:26       ` Yaron Minsky
                         ` (2 more replies)
  2007-06-08 12:30     ` [Caml-list] " Jacques Garrigue
  1 sibling, 3 replies; 29+ messages in thread
From: skaller @ 2007-06-08  1:51 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: sds, caml-list

On Fri, 2007-06-08 at 10:02 +0900, Jacques Garrigue wrote:
> > 
> > Any chance there is some quadratic code in polymorphic variant type
> > processing?!
> 
> There is, and this is a known problem:
>   http://caml.inria.fr/mantis/view.php?id=4053
> 
> I'm sorry, but I don't see any easy way out.
> At least on the basic time complexity.

You mention in the ticket there is a hard way out .. using
binary trees; hard because it would require changes everywhere
in the compiler. Is this actually enough? Seems to reduce

	O(n * n * log n)

to 

	O( n * log n * log n)

which is still pretty bad.. is that right?

How about a one level digital lookup, that is, use the first
8 bits of the key to choose one of 256 binary trees?

In commercial code O() performance isn't usually that relevant.
Hybrid data structure is probably best, even if the O() performance
is worse: if a one level digital lookup reduces quadratic time
by 256 that reduces a 4 hour compilation to 1 minute.. :)

Then 1,000-10,000 type constructors would be handled easily, although
around 100,000 you'd be trouble again .. but I think you'd be in
trouble anyhow if you had that many!

Also .. I suspect the OP is only using polymorphic variants to 
work around the 246 constructor limitation on non-polymorphic variants.
In this case, they could just use factored non-polymorphic variants.

the point being: polymorphic variants should be used when you need
polymorphism and are willing to pay the price. This is typically
in a compiler or some other system where there are only a moderate
number of constructors, but many ways to combine them.

So maybe the correct fix isn't to modify polymorphic variants ..
but to repair the 246 limitation on non-polymorphic variants
by automatically factoring them .. it is, after all, entirely
syntactic sugar. Of course there's be a run time penalty from
double indirection .. but that's the price you pay for extremely
fast single indirected non-polymorphic variants when the number
of constructors is small.

Hmm .. and this could 'almost' be done with camlp4.. :)

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Re: compiling large file hogs RAM and takes a long time.
  2007-06-08  1:51     ` skaller
@ 2007-06-08  2:26       ` Yaron Minsky
  2007-06-08  9:05       ` Thomas Fischbacher
  2007-06-08 13:39       ` Sam Steingold
  2 siblings, 0 replies; 29+ messages in thread
From: Yaron Minsky @ 2007-06-08  2:26 UTC (permalink / raw)
  To: skaller; +Cc: Jacques Garrigue, sds, caml-list

[-- Attachment #1: Type: text/plain, Size: 429 bytes --]

On 6/7/07, skaller <skaller@users.sourceforge.net> wrote:
>
>
> Also .. I suspect the OP is only using polymorphic variants to
> work around the 246 constructor limitation on non-polymorphic variants.
> In this case, they could just use factored non-polymorphic variants.


Actually the OP is using polymorphic variants so that the resulting data can
be stored in a polymap.  So regular variants are not such a good
replacement.

[-- Attachment #2: Type: text/html, Size: 734 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Re: compiling large file hogs RAM and takes a long time.
  2007-06-08  1:51     ` skaller
  2007-06-08  2:26       ` Yaron Minsky
@ 2007-06-08  9:05       ` Thomas Fischbacher
  2007-06-08  9:35         ` skaller
  2007-06-08 13:39       ` Sam Steingold
  2 siblings, 1 reply; 29+ messages in thread
From: Thomas Fischbacher @ 2007-06-08  9:05 UTC (permalink / raw)
  To: skaller; +Cc: Jacques Garrigue, sds, caml-list

skaller wrote:

> You mention in the ticket there is a hard way out .. using
> binary trees; hard because it would require changes everywhere
> in the compiler. Is this actually enough? Seems to reduce
> 
> 	O(n * n * log n)
> 
> to 
> 
> 	O( n * log n * log n)
 >
> which is still pretty bad.. is that right?

What would be so bad about O(n log n log n)?
After all, in our relevant computing universe, there
always is an upper bound to log n which is quite
a small constant (like 25 or so).

-- 
best regards,
Thomas Fischbacher
tf@functionality.de


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Re: compiling large file hogs RAM and takes a long time.
  2007-06-08  9:05       ` Thomas Fischbacher
@ 2007-06-08  9:35         ` skaller
  2007-06-08  9:55           ` Thomas Fischbacher
  0 siblings, 1 reply; 29+ messages in thread
From: skaller @ 2007-06-08  9:35 UTC (permalink / raw)
  To: Thomas Fischbacher; +Cc: Jacques Garrigue, sds, caml-list

On Fri, 2007-06-08 at 10:05 +0100, Thomas Fischbacher wrote:
> skaller wrote:
> 
> > You mention in the ticket there is a hard way out .. using
> > binary trees; hard because it would require changes everywhere
> > in the compiler. Is this actually enough? Seems to reduce
> > 
> > 	O(n * n * log n)
> > 
> > to 
> > 
> > 	O( n * log n * log n)
>  >
> > which is still pretty bad.. is that right?
> 
> What would be so bad about O(n log n log n)?
> After all, in our relevant computing universe, there
> always is an upper bound to log n which is quite
> a small constant (like 25 or so).

Perhaps you're right .. still, there's a big difference
between a 5 minute compile (enough time to make a coffee!)
and a 2 hour compile (enough time to have lunch!)

I just went through this: Felix normal build time is
about 20 minutes including all tests. With an early version
of Dypgen parser, this was changed to around 4 hours...
luckily, back to 20 minutes now. Even a small % increase
in times can be quite significant.

I usually illustrate this by asking if you'd be happy
to give up your 2 / 52 week annual holiday .. that's
under 4% of the year.. I'm sure you'd be horrified ..
4% is a LOT!

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Re: compiling large file hogs RAM and takes a long time.
  2007-06-08  9:35         ` skaller
@ 2007-06-08  9:55           ` Thomas Fischbacher
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Fischbacher @ 2007-06-08  9:55 UTC (permalink / raw)
  To: skaller; +Cc: Jacques Garrigue, sds, caml-list

skaller wrote:

> Perhaps you're right .. still, there's a big difference
> between a 5 minute compile (enough time to make a coffee!)
> and a 2 hour compile (enough time to have lunch!)

The point I wanted to make here is: when it comes to the evaluation
whether extra logarithmic factors in time complexity do matter or
not, it is very important to also pay attention to the constants
involved.

For example, take a set of N magnetic needles distributed randomly
over space. In order to find the magnetic field experienced by every
single needle that is created by all the others, you can do a naive
O(N^2) algorithm that just sums contributions. There is a more
clever - and technically very sophisticated - technique that uses
multipole expansions to get a good approximation (the
"fast multipole method"), which can be computed in O(N log N),
but the constants are such that the break even point is not
reached before N=1000, and depending how well your code is
written, may lie above N=5000.

If such a situation arouse, I would, for example,
not have any objections against trading an O(N log N) algorithm
for an O(N log N log N) algorithm that is conceptually so much
simpler that the constant factor is smaller by two orders
of magnitude...

> I usually illustrate this by asking if you'd be happy
> to give up your 2 / 52 week annual holiday .. that's
> under 4% of the year.. I'm sure you'd be horrified ..
> 4% is a LOT!

I am always amazed when I hear this, thinking that we
Europeans typically get at least 5 weeks of holiday
per year. :-)

Admittedly, according to the CIA world factbook, GDP
per capita is about 1/4 higher in the US than in Germany
(say), but I still prefer the more relaxed European model.

-- 
best regards,
Thomas Fischbacher
tf@functionality.de


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] compiling large file hogs RAM and takes a long time.
  2007-06-08  1:02   ` [Caml-list] " Jacques Garrigue
  2007-06-08  1:51     ` skaller
@ 2007-06-08 12:30     ` Jacques Garrigue
  2007-06-15 15:41       ` Sam Steingold
  1 sibling, 1 reply; 29+ messages in thread
From: Jacques Garrigue @ 2007-06-08 12:30 UTC (permalink / raw)
  To: sds; +Cc: caml-list

> > Any chance there is some quadratic code in polymorphic variant type
> > processing?!
> 
> There is, and this is a known problem:
>   http://caml.inria.fr/mantis/view.php?id=4053
> 
> I'm sorry, but I don't see any easy way out.
> At least on the basic time complexity.
> However, the space complexity could be improved, if I find what is
> gobbling memory so fast.

I found the reason, and this is now fixed in CVS 3.10.
I can now compile a pattern matching with 3000 cases in 10s and just
10MB, which is better than 40s and 250MB before :-)

This doesn't change the quadratic time complexity, but as other
mentioned, constants do matter.

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: compiling large file hogs RAM and takes a long   time.
  2007-06-08  1:51     ` skaller
  2007-06-08  2:26       ` Yaron Minsky
  2007-06-08  9:05       ` Thomas Fischbacher
@ 2007-06-08 13:39       ` Sam Steingold
  2 siblings, 0 replies; 29+ messages in thread
From: Sam Steingold @ 2007-06-08 13:39 UTC (permalink / raw)
  To: skaller; +Cc: Jacques Garrigue, caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

skaller wrote:
> On Fri, 2007-06-08 at 10:02 +0900, Jacques Garrigue wrote:
>>> Any chance there is some quadratic code in polymorphic variant type
>>> processing?!
>> There is, and this is a known problem:
>>   http://caml.inria.fr/mantis/view.php?id=4053
>>
>> I'm sorry, but I don't see any easy way out.
>> At least on the basic time complexity.
> 
> You mention in the ticket there is a hard way out .. using
> binary trees; hard because it would require changes everywhere
> in the compiler. Is this actually enough? Seems to reduce
> 
> 	O(n * n * log n)
> 
> to 
> 
> 	O( n * log n * log n)
> 
> which is still pretty bad.. is that right?

for n=16,000 you have log n~14, i.e., a factor of 1,000.
even allowing for an unfavorable constant, the factor of 100 would be a
huge win.
I think going from n*n*log to n*log*log would be a worthy project.

Sam.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGaVv+Pp1Qsf2qnMcRAkKzAJ4lWTSha/aF/3Uhr9+2E6C3nIGwpACgio0q
Z0Yb0Aru1bbV0vzwt/1eMXM=
=8UNH
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: compiling large file hogs RAM and takes a long time.
  2007-06-08 12:30     ` [Caml-list] " Jacques Garrigue
@ 2007-06-15 15:41       ` Sam Steingold
  2007-06-15 18:56         ` [Caml-list] " Jon Harrop
  0 siblings, 1 reply; 29+ messages in thread
From: Sam Steingold @ 2007-06-15 15:41 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jacques Garrigue wrote:
>>> Any chance there is some quadratic code in polymorphic variant type
>>> processing?!
>> There is, and this is a known problem:
>>   http://caml.inria.fr/mantis/view.php?id=4053
>>
>> I'm sorry, but I don't see any easy way out.
>> At least on the basic time complexity.
>> However, the space complexity could be improved, if I find what is
>> gobbling memory so fast.
> 
> I found the reason, and this is now fixed in CVS 3.10.
> I can now compile a pattern matching with 3000 cases in 10s and just
> 10MB, which is better than 40s and 250MB before :-)

I observe a time reduction of 40% (from 10 minutes down to 6 minutes)
and RAM usage reduction of 60% (from 450MB down to 160MB)
for my (generated) file.
Very nice, thanks!

> This doesn't change the quadratic time complexity, but as other
> mentioned, constants do matter.

Indeed, but it appears that in this case improving constants does not
solve the problem completely.

Sam.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGcrMoPp1Qsf2qnMcRAu6EAJ449kDgvpfKPSRoI1vlIiLqjNcILwCeMvi9
6EWbtykwW5kD62QvxW/hU6Y=
=XRSH
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Re: compiling large file hogs RAM and takes a long time.
  2007-06-15 15:41       ` Sam Steingold
@ 2007-06-15 18:56         ` Jon Harrop
  2007-06-15 20:06           ` Sam Steingold
  0 siblings, 1 reply; 29+ messages in thread
From: Jon Harrop @ 2007-06-15 18:56 UTC (permalink / raw)
  To: caml-list

On Friday 15 June 2007 16:41:28 Sam Steingold wrote:
> Indeed, but it appears that in this case improving constants does not
> solve the problem completely.

I had a problem with a non tail recursion between a pair of functions in 
ocamlopt that made it stack overflow on some of my autogenerated code. I'll 
try to track it down again if anyone is interested in fixing it.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: compiling large file hogs RAM and takes a long   time.
  2007-06-15 18:56         ` [Caml-list] " Jon Harrop
@ 2007-06-15 20:06           ` Sam Steingold
  0 siblings, 0 replies; 29+ messages in thread
From: Sam Steingold @ 2007-06-15 20:06 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jon Harrop wrote:
> On Friday 15 June 2007 16:41:28 Sam Steingold wrote:
>> Indeed, but it appears that in this case improving constants does not
>> solve the problem completely.
> 
> I had a problem with a non tail recursion between a pair of functions in 
> ocamlopt that made it stack overflow on some of my autogenerated code. I'll 
> try to track it down again if anyone is interested in fixing it.

I am not sure how this is relevant to the thread.
there is no recursion (tail or otherwise) in my file.
all it has is

type foo = [`A | `B ... ]
let bar = function
| "A" -> `A
| "B" -> `B
...




-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGcvFBPp1Qsf2qnMcRAiqLAJ40x2ItY4N9kjOI9fXsKNqrxwpa/ACcCJoC
AxGJRTfe7b89b1+l3vL5k9A=
=PaNJ
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* large parametrized polymorphic variant type combinations take forever to compile
  2007-06-06 16:30 compiling large file hogs RAM and takes a long time Sam Steingold
  2007-06-06 16:51 ` [Caml-list] " skaller
  2007-06-07 15:52 ` Sam Steingold
@ 2007-07-09 20:22 ` Sam Steingold
  2007-07-09 22:45   ` Sam Steingold
                     ` (2 more replies)
  2 siblings, 3 replies; 29+ messages in thread
From: Sam Steingold @ 2007-07-09 20:22 UTC (permalink / raw)
  To: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,
continuing the saga started in
http://permalink.gmane.org/gmane.comp.lang.caml.inria/37496
I have 3 files: f1.ml, f2.ml, and f.ml:
=== f1.ml ===
type ('a,'b) t1 = [
| `t1_a of 'a
| `t1_b of 'b
]

let f ~fa ~fb = function
  | `t1_a a -> fa a
  | `t1_b b -> fb b
=== f1.ml ===

=== f2.ml ===
type ('a,'b,'c) t2 = [
| `t2_a of 'a
| `t2_b of 'b
| `t2_c of 'c
]

let f ~fa ~fb ~fc = function
  | `t2_a a -> fa a
  | `t2_b b -> fb b
  | `t2_c c -> fc c
=== f2.ml ===

=== f.ml ===
type ('a,'b,'c) t3 = [
| ('a,'b) F1.t1
| ('a,'b,'c) F2.t2
]

let f ~fa ~fb ~fc = function
  | # F1.t1 as x -> F1.f ~fa ~fb x
  | # F2.t2 as x -> F2.f ~fa ~fb ~fc x
=== f.ml ===

except that the number of variants is not 2-3 but 800 for f1 and 3000
for f2 (the number of parameter types ('a,'b,'c) is 3).
compiling f1.ml and f1.ml takes 10-30 minutes (of 40% less if I use 3.10).
compiling f.ml takes forever. literally. it has been running for 5+
hours now and there is no sign of hope (all it has produces so far is an
empty f.cmo):
27820 sds       25   0 1343m 1.3g 1576 R   98 37.2 306:01.80 ocamlc.opt
(the 3 sample files above compile reasonably fast, but my files are larger).
the command line is
ocamlfind ocamlc -dtypes -thread -w Ae -warn-error Ae -for-pack
Tickslots -g -I . -c generic_tick.ml

is this a known problem?

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGkpkePp1Qsf2qnMcRAr5mAKCnK+E472stG/892ZTHbXUOifdIGwCgkeHs
xrdhiriBPFyim9n8pHj1Cnk=
=S1Pe
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: large parametrized polymorphic variant type   combinations take forever to compile
  2007-07-09 20:22 ` large parametrized polymorphic variant type combinations take forever to compile Sam Steingold
@ 2007-07-09 22:45   ` Sam Steingold
  2007-07-09 23:37   ` [Caml-list] " Jacques Garrigue
  2007-07-10  3:34   ` [Caml-list] " skaller
  2 siblings, 0 replies; 29+ messages in thread
From: Sam Steingold @ 2007-07-09 22:45 UTC (permalink / raw)
  To: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sam Steingold wrote:
> Hi,
> continuing the saga started in
> http://permalink.gmane.org/gmane.comp.lang.caml.inria/37496
> I have 3 files: f1.ml, f2.ml, and f.ml:
> === f1.ml ===
> type ('a,'b) t1 = [
> | `t1_a of 'a
> | `t1_b of 'b
> ]
> 
> let f ~fa ~fb = function
>   | `t1_a a -> fa a
>   | `t1_b b -> fb b
> === f1.ml ===
> 
> === f2.ml ===
> type ('a,'b,'c) t2 = [
> | `t2_a of 'a
> | `t2_b of 'b
> | `t2_c of 'c
> ]
> 
> let f ~fa ~fb ~fc = function
>   | `t2_a a -> fa a
>   | `t2_b b -> fb b
>   | `t2_c c -> fc c
> === f2.ml ===
> 
> === f.ml ===
> type ('a,'b,'c) t3 = [
> | ('a,'b) F1.t1
> | ('a,'b,'c) F2.t2
> ]
> 
> let f ~fa ~fb ~fc = function
>   | # F1.t1 as x -> F1.f ~fa ~fb x
>   | # F2.t2 as x -> F2.f ~fa ~fb ~fc x
> === f.ml ===
> 
> except that the number of variants is not 2-3 but 800 for f1 and 3000
> for f2 (the number of parameter types ('a,'b,'c) is 3).
> compiling f1.ml and f1.ml takes 10-30 minutes (of 40% less if I use 3.10).
> compiling f.ml takes forever. literally. it has been running for 5+
> hours now and there is no sign of hope (all it has produces so far is an
> empty f.cmo):
> 27820 sds       25   0 1343m 1.3g 1576 R   98 37.2 306:01.80 ocamlc.opt
> (the 3 sample files above compile reasonably fast, but my files are larger).
> the command line is
> ocamlfind ocamlc -dtypes -thread -w Ae -warn-error Ae -for-pack
> Tickslots -g -I . -c generic_tick.ml
> 
> is this a known problem?

more information: when t1 and t2 contain ~1,000 variants, both f1.ml and
f2.ml compile in a few seconds, but f.ml takes forever (10+ minutes).
splitting f.ml into 2 files indicates that its the function f, not type
t3 that is taking all the time.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGkrqkPp1Qsf2qnMcRAsjiAJ9uEH172bR+OWZZQ3naNcfOo4fS9ACeJD79
ACmrszDW34kO4OzyISu6H8I=
=N/CR
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] large parametrized polymorphic variant type combinations take forever to compile
  2007-07-09 20:22 ` large parametrized polymorphic variant type combinations take forever to compile Sam Steingold
  2007-07-09 22:45   ` Sam Steingold
@ 2007-07-09 23:37   ` Jacques Garrigue
  2007-07-10  7:09     ` Christophe Raffalli
                       ` (3 more replies)
  2007-07-10  3:34   ` [Caml-list] " skaller
  2 siblings, 4 replies; 29+ messages in thread
From: Jacques Garrigue @ 2007-07-09 23:37 UTC (permalink / raw)
  To: sds; +Cc: caml-list

From: Sam Steingold <sds@gnu.org>
> Hi,
> continuing the saga started in
> http://permalink.gmane.org/gmane.comp.lang.caml.inria/37496
> I have 3 files: f1.ml, f2.ml, and f.ml:
[...]
> === f.ml ===
> type ('a,'b,'c) t3 = [
> | ('a,'b) F1.t1
> | ('a,'b,'c) F2.t2
> ]
> 
> let f ~fa ~fb ~fc = function
>   | # F1.t1 as x -> F1.f ~fa ~fb x
>   | # F2.t2 as x -> F2.f ~fa ~fb ~fc x
> === f.ml ===
> 
> except that the number of variants is not 2-3 but 800 for f1 and 3000
> for f2 (the number of parameter types ('a,'b,'c) is 3).
> compiling f1.ml and f1.ml takes 10-30 minutes (of 40% less if I use 3.10).
> compiling f.ml takes forever. literally. it has been running for 5+
> hours now and there is no sign of hope (all it has produces so far is an
> empty f.cmo):
> 27820 sds       25   0 1343m 1.3g 1576 R   98 37.2 306:01.80 ocamlc.opt
> (the 3 sample files above compile reasonably fast, but my files are larger).
> the command line is
> ocamlfind ocamlc -dtypes -thread -w Ae -warn-error Ae -for-pack
> Tickslots -g -I . -c generic_tick.ml
> 
> Is this a known problem?

Are you using the CVS 3.10 version?
Performance has been been improved, but as I already answerred you,
complexity for polymorphic variant pattern-matching is still O(n*n),
which is going to create problems with huge variant types.
In particular, you may not be aware that #F1.t1 and #F2.t2 are
actually expanded internally into huge or-patterns, which have to be
type-checked. This could be improved, at least at the type-check
level, but again polymorphic variants were not implemented with
thousands of constructors in mind.

Could you send me your real code, so that I can see whether something
unexpected is happening?

Jacques


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] large parametrized polymorphic variant type combinations take forever to compile
  2007-07-09 20:22 ` large parametrized polymorphic variant type combinations take forever to compile Sam Steingold
  2007-07-09 22:45   ` Sam Steingold
  2007-07-09 23:37   ` [Caml-list] " Jacques Garrigue
@ 2007-07-10  3:34   ` skaller
  2007-07-10 13:27     ` Sam Steingold
  2 siblings, 1 reply; 29+ messages in thread
From: skaller @ 2007-07-10  3:34 UTC (permalink / raw)
  To: Sam Steingold; +Cc: caml-list

On Mon, 2007-07-09 at 16:22 -0400, Sam Steingold wrote:
> -----BEGIN PGP SIGNED MESSAGE-----

> compiling f.ml takes forever. literally. it has been running for 5+
> hours now and there is no sign of hope (all it has produces so far is an
> empty f.cmo):
> 27820 sds       25   0 1343m 1.3g 1576 R   98 37.2 306:01.80 ocamlc.opt


1G ram used on 1G box? Is it thrashing the disk?
Disk is 1000 times slower than RAM .. if you get into random
paging, just reboot your box or get more RAM: the time taken
is no longer relevant. This happened to me: Mlton builds on
my 2G box in 15 minutes. On my 1G box it takes for ever ..
I think the disk would pass its MTBF before the job would
finish :)

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] large parametrized polymorphic variant type combinations take forever to compile
  2007-07-09 23:37   ` [Caml-list] " Jacques Garrigue
@ 2007-07-10  7:09     ` Christophe Raffalli
  2007-07-10  7:31       ` Jacques Garrigue
  2007-07-10 14:16     ` Sam Steingold
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Christophe Raffalli @ 2007-07-10  7:09 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: sds, caml-list


> 
> Are you using the CVS 3.10 version?
> Performance has been been improved, but as I already answerred you,
> complexity for polymorphic variant pattern-matching is still O(n*n),

This looks strange: OCaml compute useless cases and incomplete pattern
and since you can encode SAT in that, I would think that OCaml
pattern matching (with monomorphic or polymorphic variant) would be
exponential ...  OK, this is for large patterns, not just case analysis.

As an example, the following function f encodes the following SAT problem:
not a, b ; a, not c ; not b, c ; a, b, c
and f is incomplete iff the above set of clauses is satisfiable

type r = { a : bool; b : bool; c : bool }
let f = function
| { a = true; b = false } -> ()
| { a = false; c = true } -> ()
| { b = true; c = false } -> ()
| { a = false; b = false; c = false } -> ()


-- 
Christophe Raffalli
Université de Savoie
Batiment Le Chablais, bureau 21
73376 Le Bourget-du-Lac Cedex

tél: (33) 4 79 75 81 03
fax: (33) 4 79 75 87 42
mail: Christophe.Raffalli@univ-savoie.fr
www: http://www.lama.univ-savoie.fr/~RAFFALLI
---------------------------------------------
IMPORTANT: this mail is signed using PGP/MIME
At least Enigmail/Mozilla, mutt or evolution
can check this signature. The public key is
stored on www.keyserver.net
---------------------------------------------


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] large parametrized polymorphic variant type combinations take forever to compile
  2007-07-10  7:09     ` Christophe Raffalli
@ 2007-07-10  7:31       ` Jacques Garrigue
  0 siblings, 0 replies; 29+ messages in thread
From: Jacques Garrigue @ 2007-07-10  7:31 UTC (permalink / raw)
  To: christophe.raffalli; +Cc: caml-list

From: Christophe Raffalli <christophe.raffalli@univ-savoie.fr>
> > Are you using the CVS 3.10 version?
> > Performance has been been improved, but as I already answerred you,
> > complexity for polymorphic variant pattern-matching is still O(n*n),
> 
> This looks strange: OCaml compute useless cases and incomplete pattern
> and since you can encode SAT in that, I would think that OCaml
> pattern matching (with monomorphic or polymorphic variant) would be
> exponential ...  OK, this is for large patterns, not just case analysis.

I didn't express myself correctly. I was talking about the complexity
of _typing_ a pattern matching. The completeness check can be of
course much more expensive, if you check for a combination of
patterns, like in you example.
Something like:
let f =
  function
      `A, `A -> ()
    | #Foo.t as x, `A -> Foo.conv x
    | `A, (#Foo.t as x) -> Foo.conv x
    | #Foo.t as x, (#Foo.t as y) -> Foo.conv x; Foo.conv y
where Foo.t is composed of 600 constant cases,
takes 90s in the already improved CVS version,
while
let f =
  function #Foo.t as x -> Foo.conv x | #Foo.t as x -> Foo.conv x
just takes 1.6s

Fortunately it seems that people who use huge types do not check for
combinations of patterns, but just discriminate on a single variant
type...

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] large parametrized polymorphic variant type combinations take forever to compile
  2007-07-10  3:34   ` [Caml-list] " skaller
@ 2007-07-10 13:27     ` Sam Steingold
  0 siblings, 0 replies; 29+ messages in thread
From: Sam Steingold @ 2007-07-10 13:27 UTC (permalink / raw)
  To: skaller; +Cc: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

skaller wrote:
> On Mon, 2007-07-09 at 16:22 -0400, Sam Steingold wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
> 
>> compiling f.ml takes forever. literally. it has been running for 5+
>> hours now and there is no sign of hope (all it has produces so far is an
>> empty f.cmo):
>> 27820 sds       25   0 1343m 1.3g 1576 R   98 37.2 306:01.80 ocamlc.opt
> 
> 
> 1G ram used on 1G box? Is it thrashing the disk?

1.3G is 37.2% of 3.6G


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGk4k3Pp1Qsf2qnMcRAscJAJ9uE7b3PeU2E4q0+xCZbPq6y2okPQCghRyb
mw0Jirxv+pBAjijoBCWyOCU=
=S6mX
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: large parametrized polymorphic variant type   combinations take forever to compile
  2007-07-09 23:37   ` [Caml-list] " Jacques Garrigue
  2007-07-10  7:09     ` Christophe Raffalli
@ 2007-07-10 14:16     ` Sam Steingold
  2007-07-10 16:49     ` Sam Steingold
       [not found]     ` <46938BDA.1090605@podval.org>
  3 siblings, 0 replies; 29+ messages in thread
From: Sam Steingold @ 2007-07-10 14:16 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: Caml List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jacques Garrigue wrote:
> From: Sam Steingold <sds@gnu.org>
>> === f.ml ===
>> type ('a,'b,'c) t3 = [
>> | ('a,'b) F1.t1
>> | ('a,'b,'c) F2.t2
>> ]
>>
>> let f ~fa ~fb ~fc = function
>>   | # F1.t1 as x -> F1.f ~fa ~fb x
>>   | # F2.t2 as x -> F2.f ~fa ~fb ~fc x
>> === f.ml ===
> In particular, you may not be aware that #F1.t1 and #F2.t2 are
> actually expanded internally into huge or-patterns, which have to be
> type-checked. 

are there any type declarations that I could add to improve performance?


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGk5S0Pp1Qsf2qnMcRApD3AJ0RJvmUcPxFZd7B6GJNb+rmSQi32QCfaCHZ
l0ta9Bm7mv1gE8D1/yhSdwU=
=jOxd
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: large parametrized polymorphic variant type   combinations take forever to compile
  2007-07-09 23:37   ` [Caml-list] " Jacques Garrigue
  2007-07-10  7:09     ` Christophe Raffalli
  2007-07-10 14:16     ` Sam Steingold
@ 2007-07-10 16:49     ` Sam Steingold
       [not found]     ` <46938BDA.1090605@podval.org>
  3 siblings, 0 replies; 29+ messages in thread
From: Sam Steingold @ 2007-07-10 16:49 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jacques Garrigue wrote:
> 
> Could you send me your real code, so that I can see whether something
> unexpected is happening?

here is some more information:
file http://sds.podval.org/data/small.zip includes 6 files:

small1.ml small2.ml : define large (~1,000 variants) polymorphic variant
types (non-parametric).
compilation time: ~20 sec each.
small.ml : use small1.ml & small2.ml:
================ small.ml ====================
type t3 = [
| Small1.t1
| Small2.t2
]

let f ~fa ~fb ~fc = function
  | # Small1.t1 as x -> Small1.f ~fa ~fb x
  | # Small2.t2 as x -> Small2.f ~fa ~fb ~fc x
================ small.ml ====================
compilation time:
real    1m28.063s
user    1m8.628s
sys     0m0.828s

smallp1.ml smallp2.ml : define large (~1,000 variants) parametric
polymorphic variant types.
compilation time: ~20 sec each (same as non-parametric)
small_p.ml: use smallp1.ml smallp2.ml:
================ small_p.ml ====================
type ('a,'b,'c) t3 = [
| ('a,'b) Smallp1.t1
| ('a,'b,'c) Smallp2.t2
]

let f ~fa ~fb ~fc = function
  | # Smallp1.t1 as x -> Smallp1.f ~fa ~fb x
  | # Smallp2.t2 as x -> Smallp2.f ~fa ~fb ~fc x
================ small_p.ml ====================
compilation time:
real    29m31.639s
user    25m49.929s
sys     0m10.937s

note that the use of type parameters increased compilation time of "|#"
by a factor of 20!

$ ocamlc -version
3.09.3

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGk7iOPp1Qsf2qnMcRAvVaAKCDfgmyWfvQ4wQA5T6n8PfcIdxWJwCgi6i+
ds8yKZ7tkgFoe4LqXaumV9w=
=+jXg
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] large parametrized polymorphic variant type combinations take forever to compile
       [not found]     ` <46938BDA.1090605@podval.org>
@ 2007-07-11  0:10       ` Jacques Garrigue
  2007-07-11  1:19         ` Jon Harrop
  2007-07-11 13:12         ` Sam Steingold
  0 siblings, 2 replies; 29+ messages in thread
From: Jacques Garrigue @ 2007-07-11  0:10 UTC (permalink / raw)
  To: sds; +Cc: caml-list

From: Sam Steingold <sds@podval.org>

> actually these horrors are observed with 3.09.3.
> we are planning to upgrade to 3.10 soon.

Note that it should be the 3.10 CVS version, not 3.10.0.
Or you can wait for 3.10.1 (but I have no idea when it will be.)

> > This could be improved, at least at the type-check
> > level, but again polymorphic variants were not implemented with
> > thousands of constructors in mind.
> 
> too bad. any chance this could be fixed?

That the original implementation hadn't huge types in mind?
Too late: it would require a very different approach.
Also, as Christophe Raffalli pointed out, there are some features of
pattern-matching that are NP-complete by design.
But we can try to remove flagrant innefficiencies when we understand
them.

> > Could you send me your real code, so that I can see whether something
> > unexpected is happening?
> 
> attached

Seems you're lucky. The fix I did yesterday after reading your first
mail, combined with the fixes following the previous discussion,
solved the problem. Compilation times are now about 7s using less than
7MB, for all of the 3 files, using ocamlc (bytecode). Of course, you
can still expect quadratic behaviour if your types grow more...

By the way, you're also lucky for the generated code: due to the
naming of your constructors, the code for small0.ml is very compact
and efficient. Of course, this is not going to be true in general.
(You can see the result of pattern-matching compilation with
'ocamlc -c -dlambda')

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] large parametrized polymorphic variant type combinations take forever to compile
  2007-07-11  0:10       ` [Caml-list] " Jacques Garrigue
@ 2007-07-11  1:19         ` Jon Harrop
  2007-07-11  2:23           ` Jacques GARRIGUE
  2007-07-11 13:12         ` Sam Steingold
  1 sibling, 1 reply; 29+ messages in thread
From: Jon Harrop @ 2007-07-11  1:19 UTC (permalink / raw)
  To: caml-list

On Wednesday 11 July 2007 01:10:02 Jacques Garrigue wrote:
> Seems you're lucky. The fix I did yesterday after reading your first
> mail, combined with the fixes following the previous discussion,
> solved the problem. Compilation times are now about 7s using less than
> 7MB, for all of the 3 files, using ocamlc (bytecode). Of course, you
> can still expect quadratic behaviour if your types grow more...

Isn't it a really bad idea to autogenerate polymorphic variant type 
constructors anyway because, sooner or later, you'll get a hash clash? In 
fact, wouldn't that be platform specific?

I would recommend factoring the sum type by the argument types of the 
contructors in this case. Looking at Sam's files, this is trivial. Just 
replace this:

type ('a,'b) t1 = [
| `t1_a of 'a option
| `t1_b of 'b list
| `t1_a0 of 'a option
...
| `t1_a9 of 'a option
| `t1_a10 of 'a option
...
| `t1_a1000 of 'a option

with this:

type t1 = [
| `t1_int of [
  | `a
  | `a0
  | `a10000 ] * int
| `t1_float of [`b] * float
]

and this:

let f ~fa ~fb = function
  | `t1_a a -> fa a
  | `t1_b b -> fb b
  | `t1_a0 a -> fa a
...
  | `t1_a1000 a -> fa a

with this:

let f ~fa ~fb = function
| `t1_int(_, a) -> fa a
| `t1_float(_, b) -> fb b

The following code autogenerates an equivalent to small1.ml and it compiles 
100x faster:

open Printf

let () =
  printf "type t1 = [\n";
  printf "| `t1_int of [\n| `a\n";
  for i=0 to 1000 do
    printf "  | `a%d\n" i
  done;
  printf "] * int\n";
  printf "| `t1_float of [`b] * float\n";
  printf "]\n";
  printf "let f ~fa ~fb = function\n";
  printf "| `t1_int(_, a) -> fa a\n";
  printf "| `t1_float(_, b) -> fb b\n";

Doing the same for small2.ml and small.ml, the final compilation time is 
0.09s.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] large parametrized polymorphic variant type combinations take forever to compile
  2007-07-11  1:19         ` Jon Harrop
@ 2007-07-11  2:23           ` Jacques GARRIGUE
  0 siblings, 0 replies; 29+ messages in thread
From: Jacques GARRIGUE @ 2007-07-11  2:23 UTC (permalink / raw)
  To: jon; +Cc: caml-list

From: Jon Harrop <jon@ffconsultancy.com>

> Isn't it a really bad idea to autogenerate polymorphic variant type 
> constructors anyway because, sooner or later, you'll get a hash clash? In 
> fact, wouldn't that be platform specific?

Indeed, a hash clash can happen. It will be detected at compile time,
but in a generated setting it may be harder to fix. Fortunately, at
least this is not platform specific: the hash function for variants is
defined only once and for all. Also, it is explicitly defined so that
names of up to 4 characters will all get different hash values.

On the other hand, as the intent of polymorphic variants is to be able
to use understandable constructor names, rather than complex
imbrications of datatypes, a large part of their interest is lost here.

> I would recommend factoring the sum type by the argument types of the 
> contructors in this case. Looking at Sam's files, this is trivial. Just 
> replace this:
[..]
> with this:
> 
> type t1 = [
> | `t1_int of [
>   | `a
>   | `a0
>   | `a10000 ] * int
> | `t1_float of [`b] * float
> ]

This is mostly a matter of taste, but this might help producing better
code, independently of the compilation time. Of course, one can also
think of types that you cannot decompose this way, if the type of the
argument is always different for instance.

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: large parametrized polymorphic variant type   combinations take forever to compile
  2007-07-11  0:10       ` [Caml-list] " Jacques Garrigue
  2007-07-11  1:19         ` Jon Harrop
@ 2007-07-11 13:12         ` Sam Steingold
  2007-07-11 19:17           ` [Caml-list] " Jon Harrop
  1 sibling, 1 reply; 29+ messages in thread
From: Sam Steingold @ 2007-07-11 13:12 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jacques Garrigue wrote:
>>> Could you send me your real code, so that I can see whether something
>>> unexpected is happening?
>> attached
> 
> Seems you're lucky. The fix I did yesterday after reading your first
> mail, combined with the fixes following the previous discussion,
> solved the problem. 

thanks.
does this also fix the "|#" compilation time?
right now it takes 8+ hours and does NOT finish in that time.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGlNcyPp1Qsf2qnMcRAh9KAKCDedfrErrM6V4GHpi42KKQWaroNACffz2K
g1esgfFg8vOmPh+NZ7C4fIA=
=Y4ab
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Re: large parametrized polymorphic variant type combinations take forever to compile
  2007-07-11 13:12         ` Sam Steingold
@ 2007-07-11 19:17           ` Jon Harrop
  0 siblings, 0 replies; 29+ messages in thread
From: Jon Harrop @ 2007-07-11 19:17 UTC (permalink / raw)
  To: caml-list

On Wednesday 11 July 2007 14:12:18 Sam Steingold wrote:
> thanks.
> does this also fix the "|#" compilation time?
> right now it takes 8+ hours and does NOT finish in that time.

Is there a reason you don't just factor by argument types?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2007-07-11 19:22 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-06-06 16:30 compiling large file hogs RAM and takes a long time Sam Steingold
2007-06-06 16:51 ` [Caml-list] " skaller
2007-06-06 17:05   ` Sam Steingold
2007-06-07 15:52 ` Sam Steingold
2007-06-08  1:02   ` [Caml-list] " Jacques Garrigue
2007-06-08  1:51     ` skaller
2007-06-08  2:26       ` Yaron Minsky
2007-06-08  9:05       ` Thomas Fischbacher
2007-06-08  9:35         ` skaller
2007-06-08  9:55           ` Thomas Fischbacher
2007-06-08 13:39       ` Sam Steingold
2007-06-08 12:30     ` [Caml-list] " Jacques Garrigue
2007-06-15 15:41       ` Sam Steingold
2007-06-15 18:56         ` [Caml-list] " Jon Harrop
2007-06-15 20:06           ` Sam Steingold
2007-07-09 20:22 ` large parametrized polymorphic variant type combinations take forever to compile Sam Steingold
2007-07-09 22:45   ` Sam Steingold
2007-07-09 23:37   ` [Caml-list] " Jacques Garrigue
2007-07-10  7:09     ` Christophe Raffalli
2007-07-10  7:31       ` Jacques Garrigue
2007-07-10 14:16     ` Sam Steingold
2007-07-10 16:49     ` Sam Steingold
     [not found]     ` <46938BDA.1090605@podval.org>
2007-07-11  0:10       ` [Caml-list] " Jacques Garrigue
2007-07-11  1:19         ` Jon Harrop
2007-07-11  2:23           ` Jacques GARRIGUE
2007-07-11 13:12         ` Sam Steingold
2007-07-11 19:17           ` [Caml-list] " Jon Harrop
2007-07-10  3:34   ` [Caml-list] " skaller
2007-07-10 13:27     ` Sam Steingold

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).