I put a post on our blog a little while back discussing this. 

   http://ocaml.janestreet.com/?q=node/71

There are a number of tricks you can use, including unrolling the loop and keeping a counter of how many stack frames have been used. With these you get code that behaves well on small-to-medium lists, uses a bounded number of stack frames, and is faster than the standard List.map even for small lists.
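
Roughly, the idea looks like the sketch below. This is my own minimal illustration rather than the code from the blog post; the names max_frames, map_slow and count_map and the limit of 1000 are assumptions picked for the example. Each call handles four elements (the unrolling), and once the frame counter passes the limit the rest of the list goes through a tail-recursive rev_map/rev fallback.

```ocaml
(* Hypothetical cap on non-tail-recursive depth; the value is an assumption. *)
let max_frames = 1000

(* Tail-recursive fallback: map into a reversed list, then reverse it. *)
let map_slow f l = List.rev (List.rev_map f l)

(* Handles four elements per stack frame (loop unrolling) and counts the
   frames used in [ctr]; past [max_frames] it switches to [map_slow]. *)
let rec count_map f l ctr =
  match l with
  | [] -> []
  | [x1] ->
    let y1 = f x1 in
    [y1]
  | [x1; x2] ->
    let y1 = f x1 in
    let y2 = f x2 in
    [y1; y2]
  | [x1; x2; x3] ->
    let y1 = f x1 in
    let y2 = f x2 in
    let y3 = f x3 in
    [y1; y2; y3]
  | x1 :: x2 :: x3 :: x4 :: tl ->
    (* Bind results first so [f] is applied left to right. *)
    let y1 = f x1 in
    let y2 = f x2 in
    let y3 = f x3 in
    let y4 = f x4 in
    y1 :: y2 :: y3 :: y4 ::
    (if ctr > max_frames
     then map_slow f tl
     else count_map f tl (ctr + 1))

let map f l = count_map f l 0
```

Short lists never touch the fallback, so they avoid the extra allocation of the reversed intermediate list, while long lists use at most about max_frames frames before switching over.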

y