[Caml-list] The verdict on "%identity"

* [Caml-list] The verdict on "%identity"
@ 2012-11-19 17:49 Dario Teixeira
  2012-11-19 18:02 ` Török Edwin
  0 siblings, 1 reply; 9+ messages in thread
From: Dario Teixeira @ 2012-11-19 17:49 UTC (permalink / raw)
  To: OCaml mailing-list

Hi,

I've found conflicting information regarding the use of "%identity",
which I hope to see clarified.

Let's consider a typical example where a module defines an abstract
type t and provides (de)serialisation functions of_string/to_string.
Moreover, the actual implementation of t uses a string, and the
(de)serialisation functions are just identities:

  module Foo:
  sig
        type t

        val of_string: string -> t
        val to_string: t -> string
  end =
  struct
        type t = string

        let of_string x = x
        let to_string x = x
  end

In practice, it's not unusual for such code to be implemented using
the compiler's "%identity" builtin, all in the name of performance:

  module Foo:
  sig
        type t

        external of_string: string -> t = "%identity"
        external to_string: t -> string = "%identity"
  end =
  struct
        type t = string

        external of_string: string -> t = "%identity"
        external to_string: t -> string = "%identity"
  end

I realise that the use of "%identity" is dangerous.  This is, after all,
how Obj.magic is defined.  Moreover, it uglifies interface definitions
and makes a mockery of the abstraction.  However, on the assumption that
ocamlopt won't otherwise optimise away the no-op across module boundaries,
the use of "%identity" may well be justified for performance reasons.
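To make the danger concrete, here is a minimal sketch of how the compiler simply trusts the type written on an external declaration. The name code_of_char is made up for illustration; the sound case mirrors how the standard library declares Char.code, and the unsound declaration is left commented out on purpose.

```ocaml
(* "%identity" returns its argument unchanged; the type checker
   believes whatever type the declaration states. *)

(* Sound: a char is represented as an immediate integer at run time.
   The standard library declares Char.code in the same way. *)
external code_of_char : char -> int = "%identity"

(* Unsound, yet accepted by the type checker: a string is a pointer to a
   heap block, not a float. Deliberately left commented out.
   external float_of_string_bits : string -> float = "%identity" *)

let () = Printf.printf "%d\n" (code_of_char 'A')  (* prints 65 *)
```

Nothing in the declaration itself distinguishes the safe use from the unsafe one; only knowledge of the run-time representation does.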

With all the above in mind, I have two questions:

1) Is the assumption correct that today's ocamlopt won't optimise no-ops
   across module boundaries? (I know that ocamlopt does not generally engage
   in MLton-style whole-program optimisation, but is this also true for
   low-hanging fruit such as the first example above?)

2) Consider the code below.  For which modules can one expect of_string calls
   to be optimised across module boundaries?

  module type SIG1 = sig type t val of_string: string -> t end
  module type SIG2 = sig type t external of_string: string -> t = "%identity" end

  module Impl1 = struct type t = string let of_string x = x end
  module Impl2 = struct type t = string external of_string: string -> t = "%identity" end

  module A: SIG1 = Impl1
  module B: SIG1 = Impl2
  module C: SIG2 = Impl1
  module D: SIG2 = Impl2

Thank you in advance for your time!
Best regards,
Dario Teixeira


* Re: [Caml-list] The verdict on "%identity"
  2012-11-19 17:49 [Caml-list] The verdict on "%identity" Dario Teixeira
@ 2012-11-19 18:02 ` Török Edwin
  2012-11-19 18:18   ` Dario Teixeira
  0 siblings, 1 reply; 9+ messages in thread
From: Török Edwin @ 2012-11-19 18:02 UTC (permalink / raw)
  To: caml-list

On 11/19/2012 07:49 PM, Dario Teixeira wrote:
> Hi,
> 
> 
> I've found conflicting information regarding the use of "%identity",
> which I hope to see clarified.
> 
> Let's consider a typical example where a module defines an abstract
> type t and provides (de)serialisation functions of_string/to_string.
> Moreover, the actual implementation of t uses a string, and the
> (de)serialisation functions are just identities:
> 
>   module Foo:
>   sig
>         type t
> 
>         val of_string: string -> t
>         val to_string: t -> string
>   end =
>   struct
>         type t = string
> 
>         let of_string x = x
>         let to_string x = x
>   end
> 
> 
> In practice, it's not unusual for such code to be implemented using
> the compiler's "%identity" builtin, all in the name of performance:
> 
>   module Foo:
>   sig
>         type t

Wouldn't 'type t = private string' help the compiler optimize this?


* Re: [Caml-list] The verdict on "%identity"
  2012-11-19 18:02 ` Török Edwin
@ 2012-11-19 18:18   ` Dario Teixeira
  2012-11-19 18:28     ` David House
  0 siblings, 1 reply; 9+ messages in thread
From: Dario Teixeira @ 2012-11-19 18:18 UTC (permalink / raw)
  To: Török Edwin, caml-list

Hi,

> Wouldn't 'type t = private string' help the compiler optimize this?


Possibly, though the semantics would change: what was previously
an abstract type is now translucent (i.e., not quite transparent).
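As a small sketch of what "translucent" means here (reusing the Foo module from the original question, with the signature changed to a private type): clients can read a Foo.t as a string through an explicit coercion, but still cannot build one without going through of_string.

```ocaml
module Foo : sig
  type t = private string
  val of_string : string -> t
end = struct
  type t = string
  let of_string x = x
end

(* Reading is allowed: an explicit coercion exposes the underlying string. *)
let s : string = (Foo.of_string "hello" :> string)

(* Building is not: the following line would be rejected by the type checker.
   let bad : Foo.t = "hello" *)

let () = print_endline s
```

With a fully abstract type, the coercion on the first binding would be rejected as well, so a to_string function would still be needed.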

Regards,
Dario


* Re: [Caml-list] The verdict on "%identity"
  2012-11-19 18:18   ` Dario Teixeira
@ 2012-11-19 18:28     ` David House
  2012-11-20  9:53       ` Gabriel Scherer
  2012-11-20 10:25       ` Pierre Chambart
  0 siblings, 2 replies; 9+ messages in thread
From: David House @ 2012-11-19 18:28 UTC (permalink / raw)
  To: Dario Teixeira; +Cc: Török Edwin, caml-list

If you wanted to investigate this yourself, you could compile with -S
and look at the generated assembly. For such short functions, this is
generally not very hard.


* Re: [Caml-list] The verdict on "%identity"
  2012-11-19 18:28     ` David House
@ 2012-11-20  9:53       ` Gabriel Scherer
  2012-11-20 10:25       ` Pierre Chambart
  1 sibling, 0 replies; 9+ messages in thread
From: Gabriel Scherer @ 2012-11-20  9:53 UTC (permalink / raw)
  To: David House; +Cc: Dario Teixeira, Török Edwin, caml-list

On Mon, Nov 19, 2012 at 7:28 PM, David House <dhouse@janestreet.com> wrote:
> If you wanted to investigate this yourself, you could compile with -S
> and look at the generated assembly. For such short functions, this is
> generally not very hard.

Indeed:

cat test.ml
  module type SIG1 = sig type t val of_string: string -> t end
  module type SIG2 = sig type t external of_string: string -> t = "%identity" end

  module Impl1 = struct type t = string let of_string x = x end
  module Impl2 = struct type t = string external of_string: string -> t = "%identity" end

  module A: SIG1 = Impl1
  module B: SIG1 = Impl2
(*  module C: SIG2 = Impl1 *)
  module D: SIG2 = Impl2

  let testA = A.of_string "foo"
  let testB = B.of_string "bar"
  let testD = D.of_string "baz"

(I commented C out because it makes no sense to me, semantically, and
it's rejected by the compiler.)

ocamlopt -c -S test.ml
less test.s

The relevant part of the result on my machine, corresponding to the
compilation of testA, testB and testD:
camlTest__entry:
  [...]
        movl    $camlTest__3, %eax
        movl    %eax, camlTest + 20
        movl    $camlTest__2, %eax
        movl    %eax, camlTest + 24
        movl    $camlTest__1, %eax
        movl    %eax, camlTest + 28
  [...]

All three are compiled in the same way.

There may be a difference for calls across compilation units: in the
absence of the .cmx, no inlining would be performed. My guess would be
that in the presence of the .cmx we should get the same final result, but
I must say I don't really care about performance on this front.

Note that there is, however, an important difference with private
definitions: with private, the cast from t to string is not only
erased by the compiler, it is a *coercion* that can be lifted to casts
between larger datatypes. You can coerce a (t list) into a (string list)
and this is also a no-op in the dynamic semantics. That's much
stronger than what you get from %identity.
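A minimal sketch of that lifting, assuming a module Foo whose signature exposes type t = private string (the module and the names xs and ys are only for illustration):

```ocaml
module Foo : sig
  type t = private string
  val of_string : string -> t
end = struct
  type t = string
  let of_string x = x
end

let xs : Foo.t list = List.map Foo.of_string ["a"; "b"; "c"]

(* One coercion converts the whole list. No traversal takes place:
   the coerced value is physically the same list. *)
let ys : string list = (xs :> string list)

let () = List.iter print_endline ys
```

With an abstract t and a to_string function, even one declared with "%identity", the same conversion would need a List.map to_string, which allocates a new list.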

(This suggests that, with the explicit subtyping we have in OCaml,
there would be a case for inter-coercible types: a way to define a t
such that for example (t :> string) and (string :> t), but not (t =
string). This doesn't increase the type safety of arbitrary programs
but allows programmers to force abstraction-breaking to be explicit,
with no performance cost in either direction.)


* Re: [Caml-list] The verdict on "%identity"
  2012-11-19 18:28     ` David House
  2012-11-20  9:53       ` Gabriel Scherer
@ 2012-11-20 10:25       ` Pierre Chambart
  2012-11-20 16:19         ` Gabriel Scherer
  1 sibling, 1 reply; 9+ messages in thread
From: Pierre Chambart @ 2012-11-20 10:25 UTC (permalink / raw)
  To: caml-list

To see what will be generated after inlining, I prefer to use -dcmm,
which is closer to the original code than the assembly and still shows
the result of inlining (and if you are using the svn trunk, you can use
-dclambda, which shows higher-level code, without allocations and
boxing).

Inlining of OCaml functions works the same way inside a module or
across modules. The compiler looks at the function size and, if it is
smaller than a certain threshold, the function will be inlined. To know
whether this will happen to your function, look at the output of
ocamlobjinfo.

For instance:
identity.mli:

type t
external id_prim : string -> t = "%identity"
val id : string -> t
val f : int -> int
val g : int -> int

identity.ml:

type t = string
external id_prim : 'a -> 'a = "%identity"
let id x = x
let f x = x + x + x + x
let g x = x + x + x + x + x + x + x + x

ocamlobjinfo identity.cmx:

...
Approximation:
  (0: function camlIdentity__id_1010 arity 1 (closed) (inline) ->  _;
   1: function camlIdentity__f_1012 arity 1 (closed) (inline) ->  _;
   2: function camlIdentity__g_1014 arity 1 (closed) ->  _)
...

The functions id and f will be inlined whatever the context of the call
is, but g won't be.

If you want a function to be inlined, you can use the -inline option of
ocamlopt to increase the maximum size of inlined functions in the
module. Note that recursive functions can't be inlined.

Private types serve a different purpose.
When using generic comparison, equality, hashing, or storing into an
array, the compiler generates optimised code when the type is known to
be one of the fast cases:

module M1 : sig
  type t = private int
end = struct type t = int end
module M2 : sig
  type t
end = struct type t = int end

let a x y = x > y
let b (x:int) y = x > y
let c (x:M1.t) y = x > y
let d (x:M2.t) y = x > y

the result of ocamlopt -dcmm:

(function camlCompare__a_1014 (x/1015: addr y/1016: addr)
 (extcall "caml_greaterthan" x/1015 y/1016 addr))

(function camlCompare__b_1017 (x/1018: addr y/1019: addr)
 (+ (<< (> x/1018 y/1019) 1) 1))

(function camlCompare__c_1020 (x/1021: addr y/1022: addr)
 (+ (<< (> x/1021 y/1022) 1) 1))

(function camlCompare__d_1023 (x/1024: addr y/1025: addr)
 (extcall "caml_greaterthan" x/1024 y/1025 addr))

Here b and c will be a lot faster than a and d.
Using a private type preserves this type information across modules.
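As a usage sketch building on the M1 case above (the module Id and the function max_id are hypothetical names): a client can annotate its arguments with the private type, and, going by the -dcmm output for c, one would expect the comparison to be compiled as an integer comparison rather than a call to caml_greaterthan. This is worth confirming with -dcmm on your own compiler version.

```ocaml
module Id : sig
  type t = private int
  val make : int -> t
end = struct
  type t = int
  let make x = x
end

(* The annotation tells the compiler that a and b are (private) ints,
   so the polymorphic (>) can be specialised, as for c above. *)
let max_id (a : Id.t) (b : Id.t) : Id.t = if a > b then a else b

let () =
  let m = max_id (Id.make 3) (Id.make 7) in
  Printf.printf "%d\n" (m :> int)  (* prints 7 *)
```

If Id.t were abstract, the same function would fall into the slow case shown for d.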
-- 
Pierre


* Re: [Caml-list] The verdict on "%identity"
  2012-11-20 10:25       ` Pierre Chambart
@ 2012-11-20 16:19         ` Gabriel Scherer
  2012-11-20 19:03           ` Vincent HUGOT
  2012-11-20 20:43           ` Dario Teixeira
  0 siblings, 2 replies; 9+ messages in thread
From: Gabriel Scherer @ 2012-11-20 16:19 UTC (permalink / raw)
  To: Pierre Chambart; +Cc: caml-list

This is good advice in general, and using ocamlobjinfo to get inlining
information from the .cmx is indeed a very good idea.

Regarding -dcmm vs. -S, I generally use -dcmm myself (much more
readable), but it is not the right tool in this case. The cmm produced
for my test.ml code does make a difference between the three styles:
     (let testA/1046 (let x/1075 "camlTest__3" x/1075)
       (store (+a "camlTest" 20) testA/1046))
     (let testB/1047 (let prim/1076 "camlTest__2" prim/1076)
       (store (+a "camlTest" 24) testB/1047))
     (let testD/1048 "camlTest__1"
        (store (+a "camlTest" 28) testD/1048))

In fact, the removal of the trivial (let x = foo in x) does not happen
during the inlining passes in closure.ml, but much later, at the
register allocation phase, where there is indeed a strong preference
for e.g. testA/1046 and x/1075 to be given the same register, and the
useless move is erased. I make no claim as to how robust this behavior
will be in a different case (e.g. with higher register pressure), but
I'm not sure I really care.
I'd rather have people study the behavior of the compiler on real
performance-critical applications and suggest potential style changes
in the program (or optimization changes in the compiler) in cases
where this really makes a performance difference. Writing code in a
certain way because "the generated code is nicer" is usually not worth
the trouble.
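For anyone who would rather measure than read generated code, here is a rough timing sketch. It rests on assumptions: the module names By_let and By_prim, the iteration count and the use of Sys.time are all illustrative choices, and a loop this small mostly measures loop overhead, so the numbers only make sense when the two variants are compared on the same machine with the same flags.

```ocaml
module By_let : sig
  type t
  val of_string : string -> t
  val to_string : t -> string
end = struct
  type t = string
  let of_string x = x
  let to_string x = x
end

module By_prim : sig
  type t
  external of_string : string -> t = "%identity"
  external to_string : t -> string = "%identity"
end = struct
  type t = string
  external of_string : string -> t = "%identity"
  external to_string : t -> string = "%identity"
end

let iterations = 1_000_000

(* Run f once and report the processor time it took. *)
let time label f =
  let t0 = Sys.time () in
  let result = f () in
  Printf.printf "%s: %.3f s\n" label (Sys.time () -. t0);
  result

let run_let () =
  let total = ref 0 in
  for _i = 1 to iterations do
    total := !total + String.length (By_let.to_string (By_let.of_string "foo"))
  done;
  !total

let run_prim () =
  let total = ref 0 in
  for _i = 1 to iterations do
    total := !total + String.length (By_prim.to_string (By_prim.of_string "foo"))
  done;
  !total

let () =
  let a = time "let-bound identity" run_let in
  let b = time "%identity external" run_prim in
  assert (a = b)
```

A benchmark on the real application remains the better guide; this only shows how the two styles can be set side by side.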


* Re: [Caml-list] The verdict on "%identity"
  2012-11-20 16:19         ` Gabriel Scherer
@ 2012-11-20 19:03           ` Vincent HUGOT
  2012-11-20 20:43           ` Dario Teixeira
  1 sibling, 0 replies; 9+ messages in thread
From: Vincent HUGOT @ 2012-11-20 19:03 UTC (permalink / raw)
  To: caml-list

Is there some place where this fabulous -dcmm switch is documented?
ocamlopt's man and --help pages, as well as the manual, are utterly
uninformative (they either don't mention it or simply say "undocumented").

V.




* Re: [Caml-list] The verdict on "%identity"
  2012-11-20 16:19         ` Gabriel Scherer
  2012-11-20 19:03           ` Vincent HUGOT
@ 2012-11-20 20:43           ` Dario Teixeira
  1 sibling, 0 replies; 9+ messages in thread
From: Dario Teixeira @ 2012-11-20 20:43 UTC (permalink / raw)
  To: Gabriel Scherer, Pierre Chambart; +Cc: caml-list

Hi,

And thank you, Gabriel and Pierre, for your insights.

> I'd rather have people study the behavior of the compiler on real
> performance-critical applications and suggest potential style changes
> in the program (or optimization changes in the compiler) in cases
> where this really makes a performance difference. Writing code in a
> certain way because "the generated code is nicer" is usually not worth
> the trouble.

Mind you, I'm not especially fond of such low-level trickery myself,
particularly when the trick can cause a segfault if used carelessly,
as is the case of "%identity". On the other hand, it's always good
to have these tricks in the back of your mind -- you never know when
they might come in handy...

Best regards,
Dario Teixeira
