caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Generators/iterators and lazy evaluation?
@ 2007-04-04 16:33 Raj B
  2007-04-04 20:10 ` [Caml-list] " Mathias Kende
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Raj B @ 2007-04-04 16:33 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 1723 bytes --]

Hi all

I've been using OCaml for a year or so building an interpreter/ 
compiler for the Python programming language as a research project.

It's worked pretty well for most part, but I'm trying to implement  
some of the features of Python involving 'lazy' features, such as  
generator functions and list comprehensions.

A generator in Python can be thought of as an arbitrarily generated  
'lazy list'. As an example
the following is a generator capable of generating powers of 2 upto a  
max value.

def pow2(max):
    start = 0
    while (start .lt. max):
       yield 2^start
       start += 1

The 'yield' statement is the point where the function returns the  
next value and suspends itself until the next time it is 'forced'. At  
that time it resumes execution where it left off.

OCaml makes this particularly hard to implement this due to lack of  
'control flow' features, including even a call/cc. The only way I can  
see right now is breaking this code down into little functions,  
saving the current execution environment during the suspend and  
keeping track of the right function to call the next time.

I've been thinking about whether I can express this in some elegant  
way in some lazy construct in OCaml, since this is basically a form  
of 'lazy' evaluation. However, I come from the C world and am not  
quite used to 'lazy' thinking :) Any ideas?

Thanks in advance
Raj

--

The Python charmer

           _                   .-=-.          .-==-.
        { }      __        .' O o '.       /  -<' )
        { }    .' O'.     / o .-. O \     /  .--v`
        { }   / .-. o\   /O  /   \  o\   /O /
         \ `-` /   \ O`-'o  /     \  O`-`o /
     jgs  `-.-`     '.____.'       `.____.'





[-- Attachment #2: Type: text/html, Size: 13792 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 16:33 Generators/iterators and lazy evaluation? Raj B
@ 2007-04-04 20:10 ` Mathias Kende
  2007-04-04 20:46 ` Alain Frisch
  2007-04-04 23:53 ` Jon Harrop
  2 siblings, 0 replies; 12+ messages in thread
From: Mathias Kende @ 2007-04-04 20:10 UTC (permalink / raw)
  To: OCaml List


Le mercredi 04 avril 2007 à 11:33 -0500, Raj B a écrit :
>  see right now is breaking this code down into little functions,
> saving the current execution environment during the suspend and
> keeping track of the right function to call the next time.

Are you not just looking for the "lazy" keyword, and the Lazy module ?
http://caml.inria.fr/pub/docs/manual-ocaml/libref/Lazy.html

Mathias



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 16:33 Generators/iterators and lazy evaluation? Raj B
  2007-04-04 20:10 ` [Caml-list] " Mathias Kende
@ 2007-04-04 20:46 ` Alain Frisch
  2007-04-04 21:16   ` Erik de Castro Lopo
  2007-04-04 23:53 ` Jon Harrop
  2 siblings, 1 reply; 12+ messages in thread
From: Alain Frisch @ 2007-04-04 20:46 UTC (permalink / raw)
  To: Raj B; +Cc: caml-list

Raj B wrote:
> The 'yield' statement is the point where the function returns the next
> value and suspends itself until the next time it is 'forced'. At that
> time it resumes execution where it left off.
> 
> OCaml makes this particularly hard to implement this due to lack of
> 'control flow' features, including even a call/cc. The only way I can
> see right now is breaking this code down into little functions, saving
> the current execution environment during the suspend and keeping track
> of the right function to call the next time.
> 
> I've been thinking about whether I can express this in some elegant way
> in some lazy construct in OCaml, since this is basically a form of
> 'lazy' evaluation.

I think you cannot implement this notion of iterator without some kind
of control flow operator (at runtime) or some code rearrangement (at
compile time). The point is that you must mainting several active
"threads" of computations, including their own stack.

A natural solution, as you suggest, is to write your interpreter in
continuation-passing style. This should not be too hard, and will
actually make your source-level interpreter look more like a bytecode
interpreter (especially if you additionnaly defunctionalize the
interpreter, see:
http://www.brics.dk/RS/03/Abs/BRICS-RS-03-Abs/BRICS-RS-03-Abs.html#BRICS-RS-03-14
). Btw, a related solution is to push the idea and write a real bytecode
interpreter.

Another approach would be to use an implementation of control
operators for OCaml. AFAIK, they exist only in bytecode (
http://caml.inria.fr/pub/ml-archives/caml-list/2006/02/8fc9c1a56c497b9743515a5e3432d704.en.html
).

Yet another solution would be to rely on OCaml threads (but that will
probably be inefficient and not very pretty).

-- Alain


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 20:46 ` Alain Frisch
@ 2007-04-04 21:16   ` Erik de Castro Lopo
  2007-04-04 21:37     ` Erik de Castro Lopo
                       ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Erik de Castro Lopo @ 2007-04-04 21:16 UTC (permalink / raw)
  To: caml-list

Alain Frisch wrote:

> I think you cannot implement this notion of iterator without some kind
> of control flow operator (at runtime) or some code rearrangement (at
> compile time). The point is that you must mainting several active
> "threads" of computations, including their own stack.

Alain, I think you may have misunderstood Python's yeild operator.
Typically the original poster's pow2 generators would be used like:

    for p = pow2 (10):
        print p

but I really don't see how laziness Python's generators actually
provide any benefit for this problem over the more obvious Python 
solution to this which is:

    for i = range (10):
        p = 2 ^ i
        print p

The problem with the above is that generators (and laziness in Ocaml)
really only make sense for sequences which are effecitvely infinite.
In the Python case, that would be:

    def pow2 ():
        start = 0
        yield 2^start
        start += 1

One ocaml solution to this is an class/object:

    class pow2gen =
        object
            val mutable current = 0
    
            method next () =
                current <-
                    if current == 0 then 1
                    else current * 2 ;
                current
        end

which can be used like this:

    let _ =
    	let pow2 = new pow2gen in
    	for k = 0 to 10 do
    		Printf.printf "%d : %d\n" k (pow2#next ()) ;
    		done

There would also be a lazy solution to this problem but I don't think
it would be a better solution that the object soultion above.

I've got a more realistic example of lazy lists on my blog:

    http://www.mega-nerd.com/erikd/Blog/CodeHacking/Ocaml/lazy_lists.html


Hope this helps,
Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
"Every time microshaft's stock price drops again, I rejoice. I
want to see that bunch of criminals brought to their knees.
Preferably at the chopping block."
-- rixt in http://linuxtoday.com/stories/20659_flat.html


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 21:16   ` Erik de Castro Lopo
@ 2007-04-04 21:37     ` Erik de Castro Lopo
  2007-04-04 22:55     ` Bill Wood
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Erik de Castro Lopo @ 2007-04-04 21:37 UTC (permalink / raw)
  To: caml-list

Erik de Castro Lopo wrote:

> Alain Frisch wrote:
> 
> > I think you cannot implement this notion of iterator without some kind
> > of control flow operator (at runtime) or some code rearrangement (at
> > compile time). The point is that you must mainting several active
> > "threads" of computations, including their own stack.
> 
> Alain, I think you may have misunderstood Python's yeild operator.
> Typically the original poster's pow2 generators would be used like:
> 
>     for p = pow2 (10):
>         print p

I should add, that when the yeild operator is used in Python, there is no
separate thread of execution like there would be for say pthead_yeild()
in POSIX threads. Python's yeild is very much closer to Lazy evaluation
in Ocaml.

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
"The RIAA is obsessed to the point of comedy with the frustration
of having its rules broken, without considering whether such rules
might be standing in the way of increased revenues. Indeed,
Napster and Gnutella may turn out to be the two best music-marketing
gimmicks yet devised, if only the RIAA would take its head out of
its ass long enough to realise it."
-- Thomas C Greene on www.theregister.co.uk


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 21:16   ` Erik de Castro Lopo
  2007-04-04 21:37     ` Erik de Castro Lopo
@ 2007-04-04 22:55     ` Bill Wood
  2007-04-05  2:49     ` skaller
  2007-04-05 16:41     ` Richard Jones
  3 siblings, 0 replies; 12+ messages in thread
From: Bill Wood @ 2007-04-04 22:55 UTC (permalink / raw)
  To: Erik de Castro Lopo; +Cc: caml-list

On Thu, 2007-04-05 at 07:16 +1000, Erik de Castro Lopo wrote:
> Alain Frisch wrote:
> 
> > I think you cannot implement this notion of iterator without some kind
> > of control flow operator (at runtime) or some code rearrangement (at
> > compile time). The point is that you must mainting several active
> > "threads" of computations, including their own stack.
> 
> Alain, I think you may have misunderstood Python's yeild operator.
> Typically the original poster's pow2 generators would be used like:
> 
>     for p = pow2 (10):
>         print p
> 
> but I really don't see how laziness Python's generators actually
> provide any benefit for this problem over the more obvious Python 
> solution to this which is:
> 
>     for i = range (10):
>         p = 2 ^ i
>         print p
> 

I think the motivation was to treat the memory issues with the "range"
function, which is most often used to implement "for" loops (range(N)
returns the list [0, ..., N-1]).  People complained about a control
structure allocating memory linear in the depth of the loop, so "xrange"
was created; it generates the items of the list one at a time rather
than all at once.  Further releases of Python generalized to iterators
and generators.  It does make a difference too.  Somebody wrote a Sieve
of Eratosthenes using Python's high-level list operators; it blew up on
fairly small arguments.  I wrote a straight-forward Sieve using indexing
into a static array, with reasonable time performance.  I then rewrote
it using an iterator and got analogous time performance (roughly a
factor of 2 penalty), without statically allocating the entire sieve
first.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 16:33 Generators/iterators and lazy evaluation? Raj B
  2007-04-04 20:10 ` [Caml-list] " Mathias Kende
  2007-04-04 20:46 ` Alain Frisch
@ 2007-04-04 23:53 ` Jon Harrop
  2007-04-05  3:13   ` Erik de Castro Lopo
  2007-04-05  5:58   ` Alain Frisch
  2 siblings, 2 replies; 12+ messages in thread
From: Jon Harrop @ 2007-04-04 23:53 UTC (permalink / raw)
  To: caml-list

On Wednesday 04 April 2007 17:33, Raj B wrote:
> A generator in Python can be thought of as an arbitrarily generated
> 'lazy list'. As an example
> the following is a generator capable of generating powers of 2 upto a
> max value.
>
> def pow2(max):
>     start = 0
>     while (start .lt. max):
>        yield 2^start
>        start += 1
>
> The 'yield' statement is the point where the function returns the
> next value and suspends itself until the next time it is 'forced'. At
> that time it resumes execution where it left off.
>
> OCaml makes this particularly hard to implement this due to lack of
> 'control flow' features

Might I suggest deferring judgement until you have seen some OCaml solutions.

$ ocaml -rectypes camlp4o.cma
        Objective Caml version 3.10.0+beta

        Camlp4 Parsing version 3.10.0+beta

#

Use a lazy list sum type:

# type 'a llist = Nil | Cons of 'a * 'a llist lazy_t;;
type 'a llist = Nil | Cons of 'a * 'a llist lazy_t

# let rec pow2 ?(i=1) n =
    if n>0 then Cons(i, lazy(pow2 ~i:(2*i) (n-1))) else Nil;;
val pow2 : ?i:int -> int -> int llist = <fun>

For example:

# let rec list_of = function
    | Nil -> []
    | Cons(h, t) -> h :: list_of(Lazy.force t);;
val list_of : 'a llist -> 'a list = <fun>

# list_of(pow2 10);;
- : int list = [1; 2; 4; 8; 16; 32; 64; 128; 256; 512]

Use an inferred lazy list type (polymorphic variant):

# let rec pow2 ?(i=1) n =
    if n>0 then `Cons(i, lazy(pow2 ~i:(2*i) (n-1))) else `Nil;;
val pow2 : ?i:int -> int -> ([> `Cons of int * 'a lazy_t | `Nil ] as 'a) =
  <fun>

Return a lazy tail (using -rectypes):

# let rec pow2 ?(i=1) n =
    if n>0 then Some(i, lazy(pow2 ~i:(2*i) (n-1))) else None;;
val pow2 : ?i:int -> int -> ((int * 'a lazy_t) option as 'a) = <fun>

Return a functional tail (using -rectypes):

# let rec pow2 ?(i=1) j n () =
    if n>0 then Some(i, pow2 ~i:(2*i) (n-1)) else None;;
val pow2 : ?i:int -> int -> (int -> unit -> (int * 'a) option as 'a) = <fun>

Return a lazy stream (using camlp4 stream parsing syntax extension):

# let rec pow2 ?(i=1) n =
    if i<n then [<'i; pow2 ~i:(2*i) (n-1)>] else [<>];;
val pow2 : ?i:int -> int -> int Stream.t = <fun>

The moral is: don't try to write idiomatic Python in OCaml.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 21:16   ` Erik de Castro Lopo
  2007-04-04 21:37     ` Erik de Castro Lopo
  2007-04-04 22:55     ` Bill Wood
@ 2007-04-05  2:49     ` skaller
  2007-04-05 16:41     ` Richard Jones
  3 siblings, 0 replies; 12+ messages in thread
From: skaller @ 2007-04-05  2:49 UTC (permalink / raw)
  To: Erik de Castro Lopo; +Cc: caml-list

On Thu, 2007-04-05 at 07:16 +1000, Erik de Castro Lopo wrote:

> One ocaml solution to this is an class/object:
> 
>     class pow2gen =
>         object
>             val mutable current = 0
>     
>             method next () =
>                 current <-
>                     if current == 0 then 1
>                     else current * 2 ;
>                 current
>         end

Yep, and one solution would be for an Ocaml programmer to modify
Felix so it can generate Ocaml instead of C++.

Felix mechanically translates threaded code into event driven
code, that is, it control inverts procedures so that I/O and
calls become returns which remember their return point.

The general structure of a translated procedure ( in Ocaml notation ):

	mutable pc : int
	mutable parent: ptype
	method resume() = match pc with
	| 1 -> ...
	| 2 -> ...
	..
	| 6 -> ... pc <- 7; new thing (self)         (* call thing *)
	| 7 -> ... let tmp = parent in parent <- NULL; tmp (* return *)
	| 8 -> ... pc <- 9; self                  (* yield *)



which is used by 

	while(tr != NULL) tr <- tr#resume()


I call this procedural continuation passing, Reynolds calls it
the method of resumptions. You will note the object here
is a stack frame on the heap, and they're linked together
into a spaghetti stack by parent pointers.

To make this work for functions you add a pointer saying
where to put the result, and translate the function
to a procedure accepting the additional pointer.

This requires unravelling applications to SSA (single 
assignment) form so an application:

	x := f a

is modelled by

	f (x,a)
	

There are more details, the above is only a sketch: you can
have a look at the precise model used by Felix (which is written
in Ocaml), noting Felix generators are simply functions translated
to procedures and stored in variables so when they're resume()d
they just keep going (whereas ordinary procedures invoked via
variables are clones of a prototype to ensure they start in
a fresh state).

The thing to note is that this is a two step model:

A. Translate Python to control inverted Ocaml code
B. Compile and execute the generate Ocaml code

and hence produces high performance native code. The problem
is that Python is dynamically typed .. so most of the performance
will be lost doing type dispatches.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 23:53 ` Jon Harrop
@ 2007-04-05  3:13   ` Erik de Castro Lopo
  2007-04-05  5:58   ` Alain Frisch
  1 sibling, 0 replies; 12+ messages in thread
From: Erik de Castro Lopo @ 2007-04-05  3:13 UTC (permalink / raw)
  To: caml-list

Jon Harrop wrote:

> The moral is: don't try to write idiomatic Python in OCaml.

Actually, thats not the moral itself, just a corollaray of the
moral. The moral is actually : try to write code that is idiomatic
for the language you are currently writing in [0].

Erik

[0] Currently learning Erlang while being very careful not to 
    write Ocaml, or Python or C++ or C or any of the other
    languages I know.
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
"TLC declared bankruptcy after they received less than 2 percent of the $175
million earned by their CD sales. That was about 40 times less than the
profit that was divided among their management, production and record
companies." -- Courtney Love on the REAL piracy


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 23:53 ` Jon Harrop
  2007-04-05  3:13   ` Erik de Castro Lopo
@ 2007-04-05  5:58   ` Alain Frisch
  2007-04-05 20:35     ` Jon Harrop
  1 sibling, 1 reply; 12+ messages in thread
From: Alain Frisch @ 2007-04-05  5:58 UTC (permalink / raw)
  To: caml-list

Jon Harrop wrote:
> The moral is: don't try to write idiomatic Python in OCaml.

I think the moral is rather: read the OP's email more carefully. He
doesn't want to translate some Python examples into OCaml by hand, he
wants to implement a Python interpreter in OCaml.

Erik de Castro Lopo wrote:
> Typically the original poster's pow2 generators would be used like:
>
>     for p = pow2 (10):
>         print p
>
> but I really don't see how laziness Python's generators actually
> provide any benefit for this problem over the more obvious Python
> solution to this which is:

We're not discussing (I think) a specific use case for generators, but a
generic way to support them in an interpreter.

> The problem with the above is that generators (and laziness in Ocaml)
> really only make sense for sequences which are effecitvely infinite.

I don't think so: generators provide a simple control operator which is
sometimes useful, even for finite structures, and easier to grasp than
call/cc-like operators. Typically, generators allow their client to be
written in direct style. E.g. using a push XML event parser (SAX-style)
to produce a tree requires you to maintain your own stack accross
invocations of the callbacks. Instead, a pull parser
(generator-style) lets you write a very simple direct code. You can turn
a push parser into a pull parser either with some kind of control
inversion (CPS, generators, ...), or, in this case, by collecting all
the results in a list. But: (1) this works only because the callback
can't have side effects that would change the future events;  (2) you
need to keep all the events in memory, and you cannot stop the parsing
early.

Moreover, I don't really see the connection with the notion of laziness
in OCaml. If you want to turn the generator definition into something
that produces a lazy sequence, you'll also need some kind of control
inversion.


-- Alain


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-04 21:16   ` Erik de Castro Lopo
                       ` (2 preceding siblings ...)
  2007-04-05  2:49     ` skaller
@ 2007-04-05 16:41     ` Richard Jones
  3 siblings, 0 replies; 12+ messages in thread
From: Richard Jones @ 2007-04-05 16:41 UTC (permalink / raw)
  To: caml-list

On Thu, Apr 05, 2007 at 07:16:52AM +1000, Erik de Castro Lopo wrote:
> One ocaml solution to this is an class/object:
> 
>     class pow2gen =
>         object
>             val mutable current = 0
>     
>             method next () =
>                 current <-
>                     if current == 0 then 1
>                     else current * 2 ;
>                 current
>         end

Potentially anything which keeps state can be used.  I much prefer:

  let pow2gen =
    let current = ref 0 in
    fun () ->
      current := if !current = 0 then 1 else !current * 2;
      !current

Mainly because it can be inserted anywhere within a function.

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Generators/iterators and lazy evaluation?
  2007-04-05  5:58   ` Alain Frisch
@ 2007-04-05 20:35     ` Jon Harrop
  0 siblings, 0 replies; 12+ messages in thread
From: Jon Harrop @ 2007-04-05 20:35 UTC (permalink / raw)
  To: caml-list

On Thursday 05 April 2007 06:58, Alain Frisch wrote:
> Jon Harrop wrote:
> > The moral is: don't try to write idiomatic Python in OCaml.
>
> I think the moral is rather: read the OP's email more carefully. He
> doesn't want to translate some Python examples into OCaml by hand, he
> wants to implement a Python interpreter in OCaml.

My mistake. In that case, I agree that CPS is the obvious solution. Perhaps it 
is possible to do something equivalent by explicitly storing the state 
and "program counter" in the function when yield is called?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2007-04-05 20:40 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-04-04 16:33 Generators/iterators and lazy evaluation? Raj B
2007-04-04 20:10 ` [Caml-list] " Mathias Kende
2007-04-04 20:46 ` Alain Frisch
2007-04-04 21:16   ` Erik de Castro Lopo
2007-04-04 21:37     ` Erik de Castro Lopo
2007-04-04 22:55     ` Bill Wood
2007-04-05  2:49     ` skaller
2007-04-05 16:41     ` Richard Jones
2007-04-04 23:53 ` Jon Harrop
2007-04-05  3:13   ` Erik de Castro Lopo
2007-04-05  5:58   ` Alain Frisch
2007-04-05 20:35     ` Jon Harrop

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).