From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: weis
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id SAA07045 for caml-redistribution@pauillac.inria.fr; Mon, 21 Feb 2000 18:13:22 +0100 (MET)
Resent-Message-Id: <200002211713.SAA07045@pauillac.inria.fr>
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA31854 for <caml-list@pauillac.inria.fr>; Sun, 20 Feb 2000 13:54:20 +0100 (MET)
Received: from enst.enst.fr (enst.enst.fr [137.194.2.16])
	by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id NAA05333
	for <caml-list@inria.fr>; Sun, 20 Feb 2000 13:54:18 +0100 (MET)
Received: from email.enst.fr (muse.enst.fr [137.194.2.33])
	by enst.enst.fr (8.9.1a/8.9.1) with ESMTP id NAA06807
	for <caml-list@inria.fr>; Sun, 20 Feb 2000 13:54:18 +0100 (MET)
Received: from young.enst.fr (young.enst.fr [137.194.34.13])
	by email.enst.fr (8.9.3/8.9.3) with ESMTP id NAA18376
	for <caml-list@inria.fr>; Sun, 20 Feb 2000 13:54:17 +0100 (MET)
Received: from localhost (debourse@localhost)
	by young.enst.fr (8.8.8+Sun/8.8.8) with SMTP id NAA13107
	for <caml-list@inria.fr>; Sun, 20 Feb 2000 13:54:17 +0100 (MET)
Date: Sun, 20 Feb 2000 13:54:16 +0100 (MET)
From: Benoit de Boursetty <debourse@email.enst.fr>
To: caml-list@inria.fr
Subject: Lazy evaluation & performance
Message-ID: <Pine.GSO.4.02.10002201143320.12460-100000@young.enst.fr>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=iso-8859-1
Resent-From: weis@pauillac.inria.fr
Resent-Date: Mon, 21 Feb 2000 18:13:22 +0100
Resent-To: caml-redistribution@pauillac.inria.fr

Hello,

Has anybody done benchmarks to eval the cost of lazy computation
encapsulation, in terms of time, memory, garbage collection? I have no
idea of how this is implemented...

Here's my personal case :

There is a function f which I want to compute for several arguments
x_1,...x_n.

let f x =
  [beginning]
  let intermediate_value = ... in
  [end]

only that I want to compute it thoroughly just for the x_i that has the
highest intermediate_value among x_i's. This intermediate value is used
anyway for the [end] part.

Naive design (design a):

let f x =
  [beginning]
  let intermediate_value = ... in
  (intermediate_value, lazy [end])
 
the lazy computation of the [end] being forced only for the x_i that has
the highest intermediate_value

Another possible design (less elegant) (design b):

let intermediate_value x =
  [beginning]
  let intermediate_value = ... in
  intermediate_value

let f x =
  [beginning]
  [end]

I compute the intermediate value and then recompute all over again
for the x I want to compute.

Comparison of time costs:

if
  B is the cost for [beginning]
  E is the cost for [end]
  L is the cost for encapsulating the lazy computation of [end]
then
design a costs n*(B+L) + E
design b costs (n+1)*B + E
(very roughly I suppose)

Clearly, deciding which design to adopt is a trade-off depending on n, B,
L. I suppose L also depends on the number of results from [beginning] that
the computer will need to "remember" for [end]? Also, encapsulating lazy
computations means more memory allocation, means more garbage collecting,
doesn't it?

In my case the efficiency bottleneck is E not B, and n is about 10 (i.e.
high) so I'm not expecting a wonderful overall time gain. I'm just
wondering if it's costly to implement it in the way that corresponds best
to reality (design a). "B" is only a dozen flops.

Could anybody give me a hint about the order of magnitude of L?

Thanks very much in advance for your answers.

Benoit de Boursetty.