caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Re: curried fns
@ 1995-11-20 18:51 Laurent CHENO
  0 siblings, 0 replies; 6+ messages in thread
From: Laurent CHENO @ 1995-11-20 18:51 UTC (permalink / raw)
  To: caml-list


Jocelyn Serot wrote :

>>Hello,
>>
>>Could someone please explain the difference(s) between:
>>
>>        let f x = function y -> y + x;;
>>
>>and
>>
>>        let f x y = y + x;;
>>
>>Both have the same type (int -> int -> int) but they seem to behave
>>distinctly wrt evaluation strategy.
>>
>>For instance, if i use the 1st form and write:
>>
>>        let h x = let z = fact x in fun y -> y + z;;
>>        map (h 30) [1;2;3];; (* note 1 *)
>>
>>fact 30 gets evaluated only once (partial evaluation), while
>>the use of the 2nd form for the h function:
>>
>>        let h x y = let z = fact x in y + z;;
>>        map (h 30) [1;2;3];;
>>
>>causes fact 30 to be evaluated _for each_ element of the list.
>>
>>Is this normal or do i misunderstand sth about curryfied fns ?..
>>
>>Thanks for any help


1. excuse my poor english, thank's

2. I think this can help (I hope so !)

For my example :

#let fact x = print_int x ; print_newline() ; x ;;
fact : int -> int = <fun>

first version

#let h1 x =
   let z = fact x
   in
   function y -> y + z ;;
h1 : int -> int -> int = <fun>

NB : the result of h1 30 is a function, which has been evaluated in a
environment where
z yields for the result of fact 30

So, we can see the result of print_int in *this* call !

#let a1 = h1 30 ;;
30
a1 : int -> int = <fun>


Now, a1 is all defined : no more print ... the function a1 don't print anything

#map a1 [1 ; 2 ; 3] ;;
- : int list = [31; 32; 33]


second version

#let h2 x y =
   let z = fact x
   in
   y + z ;;
h2 : int -> int -> int = <fun>

there is no call for fact : you have specialized the first (and not last)
argument of h2

#let a2 = h2 30 ;;
a2 : int -> int = <fun>

now the evaluation is complete, and there is calls for print_int

#map a2 [1 ; 2 ; 3] ;;
30
30
30
- : int list = [31; 32; 33]
#

Laurent Chéno

-------------------------------------------------------------------
Laurent CHENO teaching at / enseignant au
        Lycée Louis-le-Grand - 123 rue Saint-Jacques
        75231 PARIS CEDEX 05 - FRANCE
personal phone (33) 1 48 05 16 04 - fax  (33) 1 48 07 80 18
-------------------------------------------------------------------






^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: curried fns
  1995-11-20 18:06 ` John Harrison
@ 1995-11-21 10:48   ` Pierre Weis
  0 siblings, 0 replies; 6+ messages in thread
From: Pierre Weis @ 1995-11-21 10:48 UTC (permalink / raw)
  To: John Harrison; +Cc: Jocelyn.Serot, caml-list, jharriso


> Something I sometimes wonder about is which of the following is more
> efficient. Perhaps a CAML Light expert can comment.
> 
>   let map f =
>     let rec mapf l =
>       match l with
>         [] -> []
>       | (x::t) -> let y = f x in y::(mapf t) in
>     mapf;;
> 
> or
> 
>   let rec map f l =
>     match l with
>       [] -> []
>     | (x::t) -> let y = f x in y::(map f t);;

The answer is clearly: this is compiler dependent!

The first definition is a strange view of map as a (non-recursive)
function that compute a looping function. The second definition is
more natural and properly exposes that map gets two arguments. If there
are no efficiency reasons to prefer the first form I definitively
prefer the second ``natural'' definition.

The problem is more generally ``how to define functions with 2
arguments'': there is no syntactic way to define functions with
multiple arguments in Caml. Thus the user has to encode it, either via
curryfication or via tuples (let f x y = ... or let f (x,y) = ...).
Naive compilation of these two encodings is not efficient, let's
explain why.

Suppose f defined as

   let f x y = body
or equivalently
   let f = function x -> function y -> body.

As is explicit on this expanded form, f can be partially applied: if
e1 evaluates to v1, then the expression f e1 evaluates to a closure
with code part (function y -> body) and environment part (x, v1).
Thus the naive compilation of ``f e1 e2'' (or equivalently (f e1) e2)
starts by calling f with argument v1, then f builds and returns the
intermediate closure, and this closure is then  applied to (the value
of) e2. This is inefficient, since we perform 2 function calls and
vaste memory to build this intermediate closure which is immediately
applied and becomes garbage.

Now suppose f be defined as

  let f (x, y) = body

the naive evaluation of f (e1, e2) is now:
1) evaluate e1 and e2 leading to v1 and v2
2) build the tuple (e1, e2)
3) call f with this tuple.

Here we perform only one function call, but once again we build an
intermediate data structure to call the function, and this data
structure becomes garbage as soon as entering the function.

A clever way of compiling one of this two forms is necessary since
functions with multiple arguments are mandatory and heavily used. This
is a bit difficult but the idea is mainly the following:
 - for curried functions directly call their body when they are
applied with all their arguments (complete applications).
 - for functions with tuples directly call their body without
constructing the tuple (these functions are always completely applied).

Since curried functions are a bit more powerful, since they can be
partially applied, we may suppose that users will adopt this way of
encoding multiple arguments. Thus, generally speaking Caml compilers
optimize the curried form of encoding. These optimizations are not
always as efficient as the scheme described above, but compilation of
currification is definitively better than tupling for Caml Light,
Caml Special Light, Bigloo and Camlot. Caml compilers generally do not
optimize tupling.

We may now discuss the efficiency of your example.
The naive form:
   let rec map f l =
     match l with
       [] -> []
     | (x::t) -> let y = f x in y::(map f t);;
exposes that map has two arguments. Thus a compiler that optimizes
complete application of curried functions will efficiently compile the
recursive call. On the other hand a naive compiler will build the
partial intermediate closure for each recursive call.

On the other hand, with the first encoding:
   let map f =
     let rec mapf l =
       match l with
         [] -> []
       | (x::t) -> let y = f x in y::(mapf t) in
     mapf;;
the optimizing compiler may fail to recognize that map may be
optimized as a curried functions with two arguments: it will instead
compute a closure for mapf, and generally speaking a closure is less
efficiently applied.
An interesting case: the Bigloo compiler performs a so-called
$\eta$-optimization that automatically rewrite this code to (an
equivalent of) the first natural one!

The naive compiler will compile naively as usual, but now your code
effectively require to compute the intermediate closure only once,
when partially applying map to f. Hence recursive calls to mapf will
save this extra closure construction, and the second form will be a
bit more efficient.

In conclusion: prefer the natural encoding, it is clearer and
compilers are designed to optimize natural code.

Pierre Weis
----------------------------------------------------------------------------
WWW Home Page: http://pauillac.inria.fr/~weis
Projet Cristal
INRIA, BP 105, F-78153 Le Chesnay Cedex (France)
E-mail: Pierre.Weis@inria.fr
Telephone: +33 1 39 63 55 98
Fax: +33 1 39 63 53 30
----------------------------------------------------------------------------




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: curried fns
@ 1995-11-21  5:17 Tarizzo Martial
  0 siblings, 0 replies; 6+ messages in thread
From: Tarizzo Martial @ 1995-11-21  5:17 UTC (permalink / raw)
  To: caml-list


>For instance, if i use the 1st form and write:
>
>	let h x = let z = fact x in fun y -> y + z;;
>	map (h 30) [1;2;3];; (* note 1 *)
>
>fact 30 gets evaluated only once (partial evaluation), while
>the use of the 2nd form for the h function:
>
>	let h x y = let z = fact x in y + z;;
>	map (h 30) [1;2;3];;
> 
>causes fact 30 to be evaluated _for each_ element of the list.
>
>Is this normal or do i misunderstand sth about curryfied fns ?..

I think it's perfectly normal. One can rewrite the two definitions of h
without the syntactic sugar provided by 'in' :
* for the first one :
let h = 
  fun x ->
    (fun z ->
      (fun y -> y+z))
    (fact x);;
* for the second one :
let h = 
  fun x ->
    fun y ->
      (fun z -> y+z)
      (fact x);;

It's clear that the value returned by h x;; is a function of y.

In the first def, fact x is evaluated when we compute h x;; and this value
is 'captured' in the definition of fun y 

In the second def, fact x is computed in the body of the function of y, i.e
each time this funciotn is called.

It's very easy to see this behavior : try to evaluate 
h 30;; 
with a side effect in the fact function (print its argument for example)
The first definition gives :
let h x = let z = fact x in fun y -> y + z;;
h : int -> int -> int = <fun>
#h 3;;
3
- : int -> int = <fun>

while the second one gives :
let h x y = let z = fact x in y + z;;
h : int -> int -> int = <fun>
#h 3;;
- : int -> int = <fun>
*********************************
 Tarizzo Martial
 Prof. Sc Physiques
 Classes preparatoires
 Lycee J MOULIN
 57600 FORBACH

 Email: tarizzo@world-net.sct.fr
        74014.3307@compuserve.com
 Compuserve : 74014,3307
*********************************





^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re:  curried fns
@ 1995-11-20 20:32 Fauque UPS
  0 siblings, 0 replies; 6+ messages in thread
From: Fauque UPS @ 1995-11-20 20:32 UTC (permalink / raw)
  To: Jocelyn.Serot, caml-list



I am not a caml expert and the gurus will say if I am wrong, but this
is what I understand:

> Could someone please explain the difference(s) between:
>
>	let f x = function y -> y + x;;
>
> and
>
>	let f x y = y + x;;

there is absolutely no differences between these two expressions: y+x will
be evaluated only when two arguments x and y are given to f.

> Both have the same type (int -> int -> int) but they seem to behave
> distinctly wrt evaluation strategy.
> 
> For instance, if i use the 1st form and write:
> 
> 	let h x = let z = fact x in fun y -> y + z;;
> 	map (h 30) [1;2;3];; (* note 1 *)
> 
> fact 30 gets evaluated only once (partial evaluation), while
> the use of the 2nd form for the h function:
> 
> 	let h x y = let z = fact x in y + z;;
> 	map (h 30) [1;2;3];;
>  
> causes fact 30 to be evaluated _for each_ element of the list.

this is not the same case: the first form should have been:

   let h x = fun y -> let z = fact x in y + z;;

and here fact 30 will be evaluated for each element of the list;
you have done in fact an optimization by putting the fact x before
the fun y ->   and it lets caml compute the fact x without the need
to know the actual value of y.

Hubert Fauque
hubert.fauque@enst-bretagne.fr




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: curried fns
  1995-11-20 15:54 Jocelyn Serot
@ 1995-11-20 18:06 ` John Harrison
  1995-11-21 10:48   ` Pierre Weis
  0 siblings, 1 reply; 6+ messages in thread
From: John Harrison @ 1995-11-20 18:06 UTC (permalink / raw)
  To: Jocelyn Serot; +Cc: caml-list, John Harrison

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 3015 bytes --]



| Could someone please explain the difference(s) between:
|
|         let f x = function y -> y + x;;
|
| and
|
|         let f x y = y + x;;

Those should behave identically, even w.r.t evaluation. The two phrases
"let f x = t[x]" and "let f = fun x -> t[x]" are essentially identical.
However in the example you give below, you do additional computation
after getting the first argument:

| let h x = let z = fact x in fun y -> y + z;;

The evaluation strategy of ML is as follows: to evaluate "f x", first
evaluate "f", then evaluate "x" (to "x'"), then if "f" evaluates to a
lambda expression "\v. t[v]", recursively evaluate "t[x']"; otherwise
stop. Evaluation of "f x y" is, of course, just evaluation of "(f x) y",
so the above scheme means "f x" gets evaluated first. However ML will
*not* evaluate inside lambdas, so applying "\u. \v. t[u,v]" to a single
argument "x" just gives "\v. t[x,v]" and evaluation then stops, even if
the body is independent of v. However your example "h" then permits
further evaluation. (In the above I use "\x. t[x]" for "fun x -> t[x]".)

Of course, in a pure functional language one can safely evaluate inside
lambdas, but in a language like ML where there may be imperative bits
floating around, the distinction is important. Doing the kind of "partial
evaluation" you cite is a very important technique in writing efficient
code. (Having said that, I'm surprised compilers don't try to identify
"non-imperative" parts and evaluate them -- of course one has to be
careful with exceptions -- it's not really any different from constant
folding in C compilers.) Conversely, it can occasionally be useful to
delay evaluation by introducing eta-redexes (i.e. including additional
arguments), in particular when defining higher order functions
recursively. For example given:

  let I x = x;;

  let THEN f g x = g(f x);;
  #infix "THEN";;

  let ORELSE f g x = try f x with _ -> g x;;  (* Sorry, Xavier! *)
  #infix "ORELSE";;

the following will loop when applied:

  let rec REPEAT f = (f THEN (REPEAT f)) ORELSE I;;

whereas this works:

  let rec REPEAT f x = ((f THEN (REPEAT f)) ORELSE I) x;;

Something I sometimes wonder about is which of the following is more
efficient. Perhaps a CAML Light expert can comment.

  let map f =
    let rec mapf l =
      match l with
        [] -> []
      | (x::t) -> let y = f x in y::(mapf t) in
    mapf;;

or

  let rec map f l =
    match l with
      [] -> []
    | (x::t) -> let y = f x in y::(map f t);;

John.

=========================================================================
John Harrison                   | email: jharriso@ra.abo.fi
Åbo Akademi University          | web:   http://www.abo.fi/~jharriso/
Department of Computer Science  | phone: +358 (9)21 265-4049
Lemminkäisenkatu 14a            | fax:   +358 (9)21 265-4732
20520 Turku                     | home:  +358 (9)21 2316132
FINLAND                         | time:  UTC+2:00
=========================================================================




^ permalink raw reply	[flat|nested] 6+ messages in thread

* curried fns
@ 1995-11-20 15:54 Jocelyn Serot
  1995-11-20 18:06 ` John Harrison
  0 siblings, 1 reply; 6+ messages in thread
From: Jocelyn Serot @ 1995-11-20 15:54 UTC (permalink / raw)
  To: caml-list


Hello,

Could someone please explain the difference(s) between:

	let f x = function y -> y + x;;

and

	let f x y = y + x;;

Both have the same type (int -> int -> int) but they seem to behave
distinctly wrt evaluation strategy.

For instance, if i use the 1st form and write:

	let h x = let z = fact x in fun y -> y + z;;
	map (h 30) [1;2;3];; (* note 1 *)

fact 30 gets evaluated only once (partial evaluation), while
the use of the 2nd form for the h function:

	let h x y = let z = fact x in y + z;;
	map (h 30) [1;2;3];;
 
causes fact 30 to be evaluated _for each_ element of the list.

Is this normal or do i misunderstand sth about curryfied fns ?..

Thanks for any help

	Jocelyn S.

(* note 1: this example is inspired from a a similar one given by
   X. Leroy in his Research Report INRIA/117 about the Zinc experiment *)

--
E-mail: Jocelyn.Serot@lasmea.univ-bpclermont.fr
S-mail: LASMEA - URA 1793 CNRS, Universite Blaise Pascal, 63177 Aubiere cedex
Tel: (33) 73.40.73.30 - Fax: (33) 73.40.72.62




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~1995-11-21 10:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1995-11-20 18:51 curried fns Laurent CHENO
  -- strict thread matches above, loose matches on Subject: below --
1995-11-21  5:17 Tarizzo Martial
1995-11-20 20:32 Fauque UPS
1995-11-20 15:54 Jocelyn Serot
1995-11-20 18:06 ` John Harrison
1995-11-21 10:48   ` Pierre Weis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).