caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] obj magic performance and other questions
@ 2011-03-15 16:05 Nicolas Ojeda Bar
  2011-03-15 17:17 ` Dario Teixeira
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Nicolas Ojeda Bar @ 2011-03-15 16:05 UTC (permalink / raw)
  To: caml-list

Hello,

I need to simulate, in Ocaml, the extensible records of Oberon. I am doing the
following:

Oberon type declaration:

TYPE

rec1 = RECORD a : INTEGER END;
rec2 = RECORD (rec1) b : CHAR END;
rec3 = RECORD (rec1) c : REAL END;

This means that variables of type rec2 or rec3 can be used
as variables of type rec1 and variables of type rec1 can be
downcasted at runtime to variables of type rec2 or rec3.

Ocaml code:

type rec2 = {
 b : char
}

type rec3 = {
 c : float
}

type rec123 =
| Rec1
| Rec2 of rec2
| Rec3 of rec3

type rec = {
 a : int;
 children : rec123
}

With this type declaration, upcasting from a rec2 or rec3 to a rec1 is simply
the identity function. Downcasting is obtained by a simple pattern match test:

let rec1_of_rec2 x = x

let rec2_of_rec1 x =
match x.children of
| Rec2 _ -> x | _ -> failwith "bad cast"

Now, if we are given a rec2 variable, and we want to access the field b,
we might have to do:

let rec2_b x =
match x.children with
| Rec2 r -> r.b
| _ -> assert false

This is not very efficient. I can use Obj.magic to access that same field as
follows:

let rec2_b x =
Obj.obj (Obj.field (Obj.field (Obj.repr x.children 0) 0))

My question is: is this use of Obj safe? and is this compiled to (two) simple
array accesses? Or is Obj.obj doing something behind the scenes?

Thanks!
N

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] obj magic performance and other questions
  2011-03-15 16:05 [Caml-list] obj magic performance and other questions Nicolas Ojeda Bar
@ 2011-03-15 17:17 ` Dario Teixeira
  2011-03-16  2:21 ` Guillaume Yziquel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Dario Teixeira @ 2011-03-15 17:17 UTC (permalink / raw)
  To: caml-list, Nicolas Ojeda Bar

Hi,

> I need to simulate, in Ocaml, the extensible records of
> Oberon. I am doing the
> following:
> 
> Oberon type declaration:
> 
> TYPE
> 
> rec1 = RECORD a : INTEGER END;
> rec2 = RECORD (rec1) b : CHAR END;
> rec3 = RECORD (rec1) c : REAL END;

Have you considered leveraging the structural subtyping features of the object system, or do you reckon the (small) performance penalty is too
great?

type rec1 = < a: int >
type rec2 = < a: int; b: char >
type rec3 = < a: int; c: float >

let make_rec1 a =
object
  method a = a
end

let make_rec2 a b =
object
  method a = a
  method b = b
end

...and so forth.

Best regards,
Dario Teixeira



      


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] obj magic performance and other questions
  2011-03-15 16:05 [Caml-list] obj magic performance and other questions Nicolas Ojeda Bar
  2011-03-15 17:17 ` Dario Teixeira
@ 2011-03-16  2:21 ` Guillaume Yziquel
       [not found] ` <325023989.96091.1300242147328.JavaMail.root@zmbs4.inria.fr>
  2011-03-17 16:36 ` Damien Doligez
  3 siblings, 0 replies; 5+ messages in thread
From: Guillaume Yziquel @ 2011-03-16  2:21 UTC (permalink / raw)
  To: Nicolas Ojeda Bar; +Cc: caml-list

Le Tuesday 15 Mar 2011 à 12:05:31 (-0400), Nicolas Ojeda Bar a écrit :
> Hello,
> 
> This is not very efficient. I can use Obj.magic to access that same field as
> follows:
> 
> let rec2_b x =
> Obj.obj (Obj.field (Obj.field (Obj.repr x.children 0) 0))

Seems that you messed up some parentheses.

> My question is: is this use of Obj safe? and is this compiled to (two) simple
> array accesses? Or is Obj.obj doing something behind the scenes?

yziquel@seldon:~/sandbox/ocaml$ cat a.ml 
let f x = Obj.obj (Obj.field (Obj.field (Obj.repr x) 0) 0)
yziquel@seldon:~/sandbox/ocaml$ ocamlopt -c -S -dlambda a.ml 
(seq
  (let
    (f/1030
       (function x/1031
         (id (array.unsafe_get (array.unsafe_get (id x/1031) 0) 0))))
    (setfield_imm 0 (global A!) f/1030))
  0a)

So it should be compiled to array accesses. However, they are not so
simple array accesses due to OCaml data layout. Moreover, there's this
'id' around. Let's see:

yziquel@seldon:~/sandbox/ocaml$ cat a.s

[...]

camlA__f_1030:
	subq	$8, %rsp
.L103:
	movq	%rax, %rdi
	movzbq	-8(%rdi), %rax

So the stack has been managed, and the %rax registers now contains the
tag of the 'x' argument to 'f'.

	cmpq	$254, %rax

This checks that the tag is or isn't the tag of an array of doubles.
(Obj.double_array_tag is 254). So we shouldn't jump on the ext 'je'.

	je	.L102
	movq	(%rdi), %rbx

The register %rbx now contains the first field of 'x'.

	jmp	.L101

[...]

.L101:
	movzbq	-8(%rbx), %rax

This looks at the tag of the first field of 'x', and checks below if it
is an array of doubles.

	cmpq	$254, %rax
	je	.L100
	movq	(%rbx), %rax

%rax now contains the first field of the first field of 'x'.

	addq	$8, %rsp
	ret

And 'f' returns with what you want.

So, aside from checking if tag is 254, it's essentially two direct array
accesses, and the 'id' function of Obj.repr and Obj.obj is erased.

	external repr : 'a -> t = "%identity"
	external obj : t -> 'a = "%identity"
	external field : t -> int -> t = "%obj_field"

You could probably hack the compiler to avoid the tag checking if you
really wanted to avoid the 254 test. That'd require another "%..." to be
implemented I guess.

> Thanks!
> N

-- 
     Guillaume Yziquel


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] obj magic performance and other questions
       [not found] ` <325023989.96091.1300242147328.JavaMail.root@zmbs4.inria.fr>
@ 2011-03-16  8:46   ` Fabrice Le Fessant
  0 siblings, 0 replies; 5+ messages in thread
From: Fabrice Le Fessant @ 2011-03-16  8:46 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 2943 bytes --]

Well, actually, pattern matching is compiled to optimize access to
fields depending on their types, so the version without Obj is probably
actually faster than the version with Obj.

Also, the test with 254 should probably not be removed, as records can
also be represented using double arrays (actually, your record rec3 will
be represented as a double_array). That's the reason why using Obj to
optimize code is often not a very good idea, as you lose the type
information, and type information is important for faster access to data
structures.

--Fabrice

On 03/16/2011 03:22 AM, Guillaume Yziquel wrote:
> Le Tuesday 15 Mar 2011 à 12:05:31 (-0400), Nicolas Ojeda Bar a écrit :
>> Hello,
>>
>> This is not very efficient. I can use Obj.magic to access that same field as
>> follows:
>>
>> let rec2_b x =
>> Obj.obj (Obj.field (Obj.field (Obj.repr x.children 0) 0))
> 
> Seems that you messed up some parentheses.
> 
>> My question is: is this use of Obj safe? and is this compiled to (two) simple
>> array accesses? Or is Obj.obj doing something behind the scenes?
> 
> yziquel@seldon:~/sandbox/ocaml$ cat a.ml 
> let f x = Obj.obj (Obj.field (Obj.field (Obj.repr x) 0) 0)
> yziquel@seldon:~/sandbox/ocaml$ ocamlopt -c -S -dlambda a.ml 
> (seq
>   (let
>     (f/1030
>        (function x/1031
>          (id (array.unsafe_get (array.unsafe_get (id x/1031) 0) 0))))
>     (setfield_imm 0 (global A!) f/1030))
>   0a)
> 
> So it should be compiled to array accesses. However, they are not so
> simple array accesses due to OCaml data layout. Moreover, there's this
> 'id' around. Let's see:
> 
> yziquel@seldon:~/sandbox/ocaml$ cat a.s
> 
> [...]
> 
> camlA__f_1030:
> 	subq	$8, %rsp
> .L103:
> 	movq	%rax, %rdi
> 	movzbq	-8(%rdi), %rax
> 
> So the stack has been managed, and the %rax registers now contains the
> tag of the 'x' argument to 'f'.
> 
> 	cmpq	$254, %rax
> 
> This checks that the tag is or isn't the tag of an array of doubles.
> (Obj.double_array_tag is 254). So we shouldn't jump on the ext 'je'.
> 
> 	je	.L102
> 	movq	(%rdi), %rbx
> 
> The register %rbx now contains the first field of 'x'.
> 
> 	jmp	.L101
> 
> [...]
> 
> .L101:
> 	movzbq	-8(%rbx), %rax
> 
> This looks at the tag of the first field of 'x', and checks below if it
> is an array of doubles.
> 
> 	cmpq	$254, %rax
> 	je	.L100
> 	movq	(%rbx), %rax
> 
> %rax now contains the first field of the first field of 'x'.
> 
> 	addq	$8, %rsp
> 	ret
> 
> And 'f' returns with what you want.
> 
> So, aside from checking if tag is 254, it's essentially two direct array
> accesses, and the 'id' function of Obj.repr and Obj.obj is erased.
> 
> 	external repr : 'a -> t = "%identity"
> 	external obj : t -> 'a = "%identity"
> 	external field : t -> int -> t = "%obj_field"
> 
> You could probably hack the compiler to avoid the tag checking if you
> really wanted to avoid the 254 test. That'd require another "%..." to be
> implemented I guess.
> 
>> Thanks!
>> N
> 

[-- Attachment #2: fabrice_le_fessant.vcf --]
[-- Type: text/x-vcard, Size: 384 bytes --]

begin:vcard
fn:Fabrice LE FESSANT
n:LE FESSANT;Fabrice
org:INRIA Saclay -- Ile-de-France;Projet OCamlPro
adr;quoted-printable:;;Parc Orsay Universit=C3=A9 ;Orsay CEDEX;;91893;France
email;internet:fabrice.le_fessant@inria.fr
title;quoted-printable:Charg=C3=A9 de Recherche
tel;work:+33 1 74 85 42 14
tel;fax:+33 1 74 85 42 49 
url:http://fabrice.lefessant.net/
version:2.1
end:vcard


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] obj magic performance and other questions
  2011-03-15 16:05 [Caml-list] obj magic performance and other questions Nicolas Ojeda Bar
                   ` (2 preceding siblings ...)
       [not found] ` <325023989.96091.1300242147328.JavaMail.root@zmbs4.inria.fr>
@ 2011-03-17 16:36 ` Damien Doligez
  3 siblings, 0 replies; 5+ messages in thread
From: Damien Doligez @ 2011-03-17 16:36 UTC (permalink / raw)
  To: OCaml mailing list


On 2011-03-15, at 17:05, Nicolas Ojeda Bar wrote:

> let rec2_b x =
> match x.children with
> | Rec2 r -> r.b
> | _ -> assert false
> 
> This is not very efficient. I can use Obj.magic to access that same field as
> follows:
> 
> let rec2_b x =
> Obj.obj (Obj.field (Obj.field (Obj.repr x.children 0) 0))
> 
> My question is: is this use of Obj safe? and is this compiled to (two) simple
> array accesses? Or is Obj.obj doing something behind the scenes?


Not safe.

Your first version of rec2_b is safe: it throws an exception when the "downcast"
fails.  With your second version, anything can happen if the argument is not a
rec2 (program crash, wrong result, silent failure, strange side effects, correct
result, whatever).

Also, as others have noted, the Obj version might actually do more tests than the
proper one (I can't be bothered to check exactly).

-- Damien



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-03-17 16:36 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-03-15 16:05 [Caml-list] obj magic performance and other questions Nicolas Ojeda Bar
2011-03-15 17:17 ` Dario Teixeira
2011-03-16  2:21 ` Guillaume Yziquel
     [not found] ` <325023989.96091.1300242147328.JavaMail.root@zmbs4.inria.fr>
2011-03-16  8:46   ` Fabrice Le Fessant
2011-03-17 16:36 ` Damien Doligez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).