caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] ASM code generated by OCAML
@ 2003-12-20  9:50 Francesco Abbate
  2003-12-20 16:50 ` Remi Vanicat
  0 siblings, 1 reply; 2+ messages in thread
From: Francesco Abbate @ 2003-12-20  9:50 UTC (permalink / raw)
  To: caml-list

Dear Ocamlers,

ok, I confess that I'm a little bit paranoid and I often
look to the assembler code generated by Ocaml to get an
idea of real efficience of the compiler.

While, generally speaking, the ASM code generated by ocaml
is pretty good, I wonder why the following function
is not decently assembled by ocaml :
-----------------------------------------
let rec conv n =
  let r = n mod 10
  and n' = n / 10 in
  if n' = 0 then r
  else r + 8 * (conv n')
-----------------------------------------
nor the C version is decently assembled by GCC
-----------------------------------------
int
conv (int n)
{
  int m = n / 10, r = n % 10;
  if (m > 0)
    return r + 8 * conv (m);
  return m;
}
-----------------------------------------
If fact the code generated by ocaml is
-----------------------------------------
Pr1__conv_56:
	subl	$4, %esp
.L101:
	movl	%eax, %ebx
	movl	$10, %ecx
	movl	%ebx, %eax
	sarl	$1, %eax
	cltd
	idivl	%ecx
	lea	1(, %edx, 2), %esi
	movl	$10, %ecx
	movl	%ebx, %eax
	sarl	$1, %eax
	cltd
	idivl	%ecx
	lea	1(, %eax, 2), %eax
	cmpl	$1, %eax
	jne	.L100
	movl	%esi, %eax
	addl	$4, %esp
	ret
	.align	16
.L100:
	movl	%esi, 0(%esp)
	call	Pr1__conv_56
.L102:
	sarl	$1, %eax
	movl	%eax, %ebx
	sall	$4, %ebx
	movl	0(%esp), %eax
	addl	%ebx, %eax
	addl	$4, %esp
	ret
-----------------------------------------
and the GCC generated code is (-O3 -fomit-frame-pointer -finline)
-----------------------------------------
conv:
	subl $24,%esp
	pushl %ebx
	movl 32(%esp),%ecx
	movl $1717986919,%edx
	movl %edx,%eax
	imull %ecx
	sarl $2,%edx
	movl %ecx,%eax
	sarl $31,%eax
	subl %eax,%edx
	leal (%edx,%edx,4),%eax
	addl %eax,%eax
	movl %ecx,%ebx
	subl %eax,%ebx
	testl %edx,%edx
	jle .L3
	addl $-12,%esp
	pushl %edx
	call conv
	leal (%ebx,%eax,8),%eax
	addl $16,%esp
	jmp .L2
	.p2align 4,,7
.L3:
	movl %edx,%eax
.L2:
	popl %ebx
	addl $24,%esp
	ret
-----------------------------------------
The code is inefficent because
- only one "idiv" instruction is needed to get bot the quotient
  both the remainder while the assembled code does use to "idiv"
  instruction;
- the function is called recursively while this is not really
  necessary (I know that in other cases both ocaml and GCC does
  transform a recursive function in one simple loop)

I've written the assembled code which implements the required
function by hand (which takes two argument the second of which
should be 10):
-----------------------------------------
	.file	"asmtest.s"
.text
	.align 4
.globl conv
	.type 	conv,@function
conv:
	pushl %ebp
	movl %esp,%ebp
	pushl %ebx
	pushl %ecx
	pushl %edx
	movl 8(%ebp),%eax
	movl 12(%ebp),%ebx
# in the following loop I use idiv to get the remainders
.L1:
	cltd
	idivl %ebx
	pushl %edx # we push the remainder into the stack
	testl %eax,%eax
	jne .L1 # end of the loop
	movl %ebp,%eax
	subl $16,%eax
	subl %esp,%eax
	sarl $2,%eax
	movl %eax,%ecx # now %ecx does contain the number of
#                      # remainders pushed into the stack
	popl %eax
# now there is a cycle that form the octal number
# using the formula : (...((r0 * 8 + r1) * 8 + r2) *8 + ...)*8 + rn
# where r0 is the most significant digit (remainder)
.L2:
	sall $3,%eax # multiply by 8
	popl %ebx
	addl %ebx,%eax
	decl %ecx
	jne .L2 # end of the loop
	popl %edx
	popl %ecx
	popl %ebx
	leave
	ret # in %eax we have the correct result
-----------------------------------------

So my answer is why nor Ocaml nor GCC does generate efficient
assembler code ?

I will attempt to give a tentative answer
- for some reason the compiler does not understands the (n mod 10)
  and (n /10) both can be avaluated with a simgle "idiv"
  instruction
- for some reason the compilers does not conceive to have a loop
  which push something on the stack at each cycle.

While the question of the two "idiv" instead of one is not of
practical importance the different implementation of the loop
is quite significative.
Anyway both problems are of academic importance (insn't it ?).

comments are welcome,
bye,
-- 
Francesco Abbate <segfault@email.it>

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [Caml-list] ASM code generated by OCAML
  2003-12-20  9:50 [Caml-list] ASM code generated by OCAML Francesco Abbate
@ 2003-12-20 16:50 ` Remi Vanicat
  0 siblings, 0 replies; 2+ messages in thread
From: Remi Vanicat @ 2003-12-20 16:50 UTC (permalink / raw)
  To: caml-list

Francesco Abbate <segfault@email.it> writes:

> Dear Ocamlers,
>
> ok, I confess that I'm a little bit paranoid and I often
> look to the assembler code generated by Ocaml to get an
> idea of real efficience of the compiler.
>
> While, generally speaking, the ASM code generated by ocaml
> is pretty good, I wonder why the following function
> is not decently assembled by ocaml :
> -----------------------------------------
> let rec conv n =
>   let r = n mod 10
>   and n' = n / 10 in
>   if n' = 0 then r
>   else r + 8 * (conv n')
> -----------------------------------------
> nor the C version is decently assembled by GCC
> -----------------------------------------
> int
> conv (int n)
> {
>   int m = n / 10, r = n % 10;
>   if (m > 0)
>     return r + 8 * conv (m);
>   return m;
> }
> -----------------------------------------

> So my answer is why nor Ocaml nor GCC does generate efficient
> assembler code ?
>
> I will attempt to give a tentative answer
> - for some reason the compiler does not understands the (n mod 10)
>   and (n /10) both can be avaluated with a simgle "idiv"
>   instruction

This require some analysis that isn't needed in general

> - for some reason the compilers does not conceive to have a loop
>   which push something on the stack at each cycle.

Ocaml (and I believe GCC) only optimize code witch is tail recursive,
that is the result of the function is the result of the recursive
case. 

You should transform your code into a tail rec function by hand :
let conv n =
   let rec helper n mult accu =
      if n = 0 then accu
      else
         let r = n mod 10
         and n' = n / 10 in
         helper n' (mult * 8) (accu + r * mult) in
   helper n 1 0

As you can see, the result of the recursive function helper is the
result of the recursive call.

-- 
Rémi Vanicat

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2003-12-20 16:50 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-12-20  9:50 [Caml-list] ASM code generated by OCAML Francesco Abbate
2003-12-20 16:50 ` Remi Vanicat

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).