caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* tip for tail recursive map
@ 2009-10-23 19:55 pikatchou pokemon
  2009-11-02 10:33 ` [Caml-list] " Damien Doligez
  0 siblings, 1 reply; 7+ messages in thread
From: pikatchou pokemon @ 2009-10-23 19:55 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 1390 bytes --]

hi everyone,

I know this topic has been discussed several times, but I don't think I have
seen the solution I use for functions of the List module which are not tail
recursive.
I thought sharing the tip could be nice.
I will take an example, List.map.
When rewritten in CPS map becomes:

let rec map k f = function
  | []    -> k []
  | x :: rl -> map (fun res -> k ((f x) :: res)) f rl

The trick consists in "unfolding" this function manually, in order to
allocate less closures:

let rec map k f = function
  | [] -> k []
  | x :: rl ->
      let x = f x in
      match rl with
      | [] -> k [x]
      | y :: rl ->
            let y = f y in
            match rl with
            | [] -> k [x ; y]
            | z :: rl ->
                  let z = f z in
                    match rl with
                    | [] -> k [x ; y ; z]
                    | t :: rl ->
            let t = f t in
            map (fun res -> k (x :: y :: z :: t :: res)) f rl

Performances are roughly equivalent to List.map on short and medium size
lists (at least on my machine), but as
it's tail recursive it doesn't blow the stack on long lists.
Please note that performances are not as good as the "Obj.magic" version of
map on very long lists.
However, this does the job for me, I have a fast map on short and medium
size lists (< 10000 elements) but
still working on larger ones.

Hope this helps !

[-- Attachment #2: Type: text/html, Size: 1589 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] tip for tail recursive map
  2009-10-23 19:55 tip for tail recursive map pikatchou pokemon
@ 2009-11-02 10:33 ` Damien Doligez
  2009-11-02 19:56   ` Julien Verlaguet
  2009-11-02 20:00   ` Christophe Raffalli
  0 siblings, 2 replies; 7+ messages in thread
From: Damien Doligez @ 2009-11-02 10:33 UTC (permalink / raw)
  To: OCaml List


On 2009-10-23, at 21:55, pikatchou pokemon wrote:

> I know this topic has been discussed several times, but I don't  
> think I have seen the solution I use for functions of the List  
> module which are not tail recursive.
> I thought sharing the tip could be nice.
> I will take an example, List.map.
> When rewritten in CPS map becomes:
>
> let rec map k f = function
>   | []    -> k []
>   | x :: rl -> map (fun res -> k ((f x) :: res)) f rl

You can do better with an ad-hoc encoding of the continuation
instead of using closures:

let rec map k f = function
   | [] -> List.rev k
   | x :: rl -> map (f x :: k) f rl
;;

The memory footprint is smaller, and you spend much less time
invoking closures.

Note that I haven't bothered benchmarking these two functions.

-- Damien


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] tip for tail recursive map
  2009-11-02 10:33 ` [Caml-list] " Damien Doligez
@ 2009-11-02 19:56   ` Julien Verlaguet
  2009-11-02 20:04     ` Christophe Raffalli
  2009-11-02 20:00   ` Christophe Raffalli
  1 sibling, 1 reply; 7+ messages in thread
From: Julien Verlaguet @ 2009-11-02 19:56 UTC (permalink / raw)
  To: Damien Doligez; +Cc: OCaml List

[-- Attachment #1: Type: text/plain, Size: 1475 bytes --]

Thanks for the tip !

I used this trick on every function of the List module and didn't try to go
a step further in specific cases.
Main problem being List.fold_right ... I couldn't figure out a way to encode
a more efficient continuation encoding
than the good old CPS, any ideas ?

Thanks

2009/11/2 Damien Doligez <damien.doligez@inria.fr>

>
> On 2009-10-23, at 21:55, pikatchou pokemon wrote:
>
>  I know this topic has been discussed several times, but I don't think I
>> have seen the solution I use for functions of the List module which are not
>> tail recursive.
>> I thought sharing the tip could be nice.
>> I will take an example, List.map.
>> When rewritten in CPS map becomes:
>>
>> let rec map k f = function
>>  | []    -> k []
>>  | x :: rl -> map (fun res -> k ((f x) :: res)) f rl
>>
>
> You can do better with an ad-hoc encoding of the continuation
> instead of using closures:
>
>
> let rec map k f = function
>  | [] -> List.rev k
>  | x :: rl -> map (f x :: k) f rl
> ;;
>
> The memory footprint is smaller, and you spend much less time
> invoking closures.
>
> Note that I haven't bothered benchmarking these two functions.
>
> -- Damien
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

[-- Attachment #2: Type: text/html, Size: 2353 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] tip for tail recursive map
  2009-11-02 10:33 ` [Caml-list] " Damien Doligez
  2009-11-02 19:56   ` Julien Verlaguet
@ 2009-11-02 20:00   ` Christophe Raffalli
  2009-11-02 23:30     ` Jon Harrop
  1 sibling, 1 reply; 7+ messages in thread
From: Christophe Raffalli @ 2009-11-02 20:00 UTC (permalink / raw)
  To: Damien Doligez; +Cc: OCaml List


[-- Attachment #1.1: Type: text/plain, Size: 3955 bytes --]


> 
> You can do better with an ad-hoc encoding of the continuation
> instead of using closures:
> 
> let rec map k f = function
>   | [] -> List.rev k
>   | x :: rl -> map (f x :: k) f rl
> ;;
> 
> The memory footprint is smaller, and you spend much less time
> invoking closures.
> 
> Note that I haven't bothered benchmarking these two functions.

I did the benchmark with four version of map (below):

List of size 10, 10000000 times with standard map : 0.948059s
List of size 10, 10000000 times with map with rev : 1.800112s
List of size 10, 10000000 times with map with prelist : 3.060192s
List of size 10, 10000000 times with map with obj : 1.704105s

List of size 100, 1000000 times with standard map : 1.068068s
List of size 100, 1000000 times with map with rev : 1.448090s
List of size 100, 1000000 times with map with prelist : 2.668166s
List of size 100, 1000000 times with map with obj : 1.652104s

List of size 1000, 100000 times with standard map : 1.792112s
List of size 1000, 100000 times with map with rev : 2.912182s
List of size 1000, 100000 times with map with prelist : 3.520220s
List of size 1000, 100000 times with map with obj : 2.460154s

List of size 10000, 10000 times with standard map : 7.564473s
List of size 10000, 10000 times with map with rev : 15.452965s
List of size 10000, 10000 times with map with prelist : 12.672792s
List of size 10000, 10000 times with map with obj : 11.572724s

List of size 100000, 1000 times with standard map : 33.018063s
List of size 100000, 1000 times with map with rev : 42.142634s
List of size 100000, 1000 times with map with prelist : 22.161385s
List of size 100000, 1000 times with map with obj : 20.801299s

standard map with size 1000000 segfaults on my machine
List of size 1000000, 100 times with map with rev : 55.211450s
List of size 1000000, 100 times with map with prelist : 23.549472s
List of size 1000000, 100 times with map with obj : 21.777361s

standard map = List.map

map with rev = the above code given by Damien Doligez

map with prelist = a dirty map using Obj but through a
  not too dirty, but not completely safe interface prelist (attached) :

let pmap f l =
  let pl = start () in
  let rec fn = function
    [] -> Prelist.extract pl
  | a::l -> Prelist.cons (f a) pl; fn l
  in
  fn l

map with obj : a directly dirty obj map :

let objmap f l =
  match l with
    [] -> []
  | x::l' ->
      let start = [f x] in
      let rec fn current = function
	  [] -> start
	| x::l' ->
	    let l'' = [f x] in
	    Obj.set_field (Obj.repr current) 1 (Obj.repr l'');
	    fn l'' l'
      in
      fn start l'

Conclusion : dirty map wins for large lists, Standard map wins for small lists,
but if you map a non trivial function (here I map the trivial succ function on int),
there should be no difference so I would suggest to use map with reverse.

The conclusion is that I would replace Xavier's suggestion in another thread :
 << Repeat after me: "Obj.magic is not part of the OCaml language". >>
By
 << Try very very very hard not to use object. If it fails try very hard to use Obj but no C. >>

I still think Obj is safer than C (I do not speak for interfacing with external library where C is
mandatory), but you should be aware that Obj needs as much knowledge about the runtime than
C interface without using the provided C macro ...

Cheers,
Christophe

-- 
Christophe Raffalli
Universite de Savoie
Batiment Le Chablais, bureau 21
73376 Le Bourget-du-Lac Cedex

tel: (33) 4 79 75 81 03
fax: (33) 4 79 75 87 42
mail: Christophe.Raffalli@univ-savoie.fr
www: http://www.lama.univ-savoie.fr/~RAFFALLI
---------------------------------------------
IMPORTANT: this mail is signed using PGP/MIME
At least Enigmail/Mozilla, mutt or evolution
can check this signature. The public key is
stored on www.keyserver.net
---------------------------------------------

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: prelist.ml --]
[-- Type: text/x-ocaml; name="prelist.ml", Size: 486 bytes --]

type 'a prelist = { mutable start : 'a list; mutable current : 'a list}

let start () = { start = []; current = []}

let cons a pl = match pl.current with
  [] -> let l = [a] in pl.current <- l; pl.start <- l
| l -> 
	let l' = [a] in Obj.set_field (Obj.repr l) 1 (Obj.repr l');
        pl.current <- l'

let extract pl = 
  let r = pl.start in 
  pl.current <- []; pl.start <- []; 
  (* to guaranty that we can not mute the list once it has been     
  extracted *)
  r

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.3: prelist.mli --]
[-- Type: text/x-ocaml; name="prelist.mli", Size: 175 bytes --]


type 'a prelist (* mutable type of a list under construction *)

val start : unit -> 'a prelist
val extract : 'a prelist -> 'a list
val cons : 'a -> 'a prelist -> unit

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] tip for tail recursive map
  2009-11-02 19:56   ` Julien Verlaguet
@ 2009-11-02 20:04     ` Christophe Raffalli
  2009-11-03  1:29       ` Yaron Minsky
  0 siblings, 1 reply; 7+ messages in thread
From: Christophe Raffalli @ 2009-11-02 20:04 UTC (permalink / raw)
  To: Julien Verlaguet; +Cc: Damien Doligez, OCaml List

[-- Attachment #1: Type: text/plain, Size: 2752 bytes --]

Julien Verlaguet a écrit :
> Thanks for the tip !
> 
> I used this trick on every function of the List module and didn't try to
> go a step further in specific cases.
> Main problem being List.fold_right ... I couldn't figure out a way to
> encode a more efficient continuation encoding
> than the good old CPS, any ideas ?

reverse the list before and then to a fold_left ?

Cheers,
Christophe

> 
> Thanks
> 
> 2009/11/2 Damien Doligez <damien.doligez@inria.fr
> <mailto:damien.doligez@inria.fr>>
> 
> 
>     On 2009-10-23, at 21:55, pikatchou pokemon wrote:
> 
>         I know this topic has been discussed several times, but I don't
>         think I have seen the solution I use for functions of the List
>         module which are not tail recursive.
>         I thought sharing the tip could be nice.
>         I will take an example, List.map.
>         When rewritten in CPS map becomes:
> 
>         let rec map k f = function
>          | []    -> k []
>          | x :: rl -> map (fun res -> k ((f x) :: res)) f rl
> 
> 
>     You can do better with an ad-hoc encoding of the continuation
>     instead of using closures:
> 
> 
>     let rec map k f = function
>      | [] -> List.rev k
>      | x :: rl -> map (f x :: k) f rl
>     ;;
> 
>     The memory footprint is smaller, and you spend much less time
>     invoking closures.
> 
>     Note that I haven't bothered benchmarking these two functions.
> 
>     -- Damien
> 
>     _______________________________________________
>     Caml-list mailing list. Subscription management:
>     http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
>     Archives: http://caml.inria.fr
>     Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>     Bug reports: http://caml.inria.fr/bin/caml-bugs
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs

-- 
Christophe Raffalli
Universite de Savoie
Batiment Le Chablais, bureau 21
73376 Le Bourget-du-Lac Cedex

tel: (33) 4 79 75 81 03
fax: (33) 4 79 75 87 42
mail: Christophe.Raffalli@univ-savoie.fr
www: http://www.lama.univ-savoie.fr/~RAFFALLI
---------------------------------------------
IMPORTANT: this mail is signed using PGP/MIME
At least Enigmail/Mozilla, mutt or evolution
can check this signature. The public key is
stored on www.keyserver.net
---------------------------------------------


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] tip for tail recursive map
  2009-11-02 20:00   ` Christophe Raffalli
@ 2009-11-02 23:30     ` Jon Harrop
  0 siblings, 0 replies; 7+ messages in thread
From: Jon Harrop @ 2009-11-02 23:30 UTC (permalink / raw)
  To: caml-list

On Monday 02 November 2009 20:00:31 Christophe Raffalli wrote:
> List of size 10000, 10000 times with standard map : 7.564473s
> List of size 10000, 10000 times with map with rev : 15.452965s
> List of size 10000, 10000 times with map with prelist : 12.672792s
> List of size 10000, 10000 times with map with obj : 11.572724s

Note that standard "map" is still very fast on this list length.

> List of size 100000, 1000 times with standard map : 33.018063s
> List of size 100000, 1000 times with map with rev : 42.142634s
> List of size 100000, 1000 times with map with prelist : 22.161385s
> List of size 100000, 1000 times with map with obj : 20.801299s

Standard map is now relatively slower but only because it is O(n^2). Look at 
page 152 figure 7.4 of OCaml for Scientists to see this effect clearly. It is 
caused by the periodic traversal of the O(n) deep stack by the GC and it 
slows everything down (you get a similar effect with hash tables because the 
GC traverses arrays of pointers, like the spine, atomically).

> standard map with size 1000000 segfaults on my machine
> List of size 1000000, 100 times with map with rev : 55.211450s
> List of size 1000000, 100 times with map with prelist : 23.549472s
> List of size 1000000, 100 times with map with obj : 21.777361s

You can use ulimit to get a bigger function call stack and keep running the 
ordinary "map" as far as you want.

> Conclusion : dirty map wins for large lists, Standard map wins for small
> lists...

I think you can do a lot better than this and I think Xavier's recommendation 
stands (Obj is a horiffically bad idea unless you wrote the compiler and 
run-time *and* have the memory of an elephant ;-). Specifically, you just 
need to get rid of the O(n^2) behaviour by bounding the stack depth, perhaps 
using a trampoline.

IIRC, this was discussed on this list many years ago. One notable observation 
was that adding a depth accumulator does not degrade performance. Another 
alternative is to convert the list into an array rather than reversing it and 
use the array as a kind of alternative to the function call stack (I think F# 
does this).

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] tip for tail recursive map
  2009-11-02 20:04     ` Christophe Raffalli
@ 2009-11-03  1:29       ` Yaron Minsky
  0 siblings, 0 replies; 7+ messages in thread
From: Yaron Minsky @ 2009-11-03  1:29 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 399 bytes --]

I put a post on our blog a little while back discussing this.

   http://ocaml.janestreet.com/?q=node/71

There are a number of tricks you can do, including loop unrolling, and using
a counter to keep track of the number of stack frames, to get code that
behaves well on small-to-medium lists, uses a bounded number of stack
frames, and is faster than the standard List.map even for small lists.

y

[-- Attachment #2: Type: text/html, Size: 479 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-11-03  1:29 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-23 19:55 tip for tail recursive map pikatchou pokemon
2009-11-02 10:33 ` [Caml-list] " Damien Doligez
2009-11-02 19:56   ` Julien Verlaguet
2009-11-02 20:04     ` Christophe Raffalli
2009-11-03  1:29       ` Yaron Minsky
2009-11-02 20:00   ` Christophe Raffalli
2009-11-02 23:30     ` Jon Harrop

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).