caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Type Encoding Format Control
@ 2015-08-18 17:06 Kenneth Adam Miller
  2015-08-18 18:57 ` Hendrik Boom
  2015-08-25 12:09 ` [Caml-list] <DKIM> " Pierre Chambart
  0 siblings, 2 replies; 12+ messages in thread
From: Kenneth Adam Miller @ 2015-08-18 17:06 UTC (permalink / raw)
  To: caml users

[-- Attachment #1: Type: text/plain, Size: 783 bytes --]

I was wondering if cases where format control is possible in typing
constructs can allow things like restricting the implementation size after
compilation of a specific variant type. Say, for instance that I wanted to
have a malloc implementation instead of returning a Some 'a | None type
that compiles down to a boxed case of first a word and then the subsequent
'a instance, down to the 'a instance, where in the values of the word enum
(or tag) are not present in the possibilities of the 'a instance.

Maybe it sounds silly, but in really tight loops you want to squeeze for
efficiency. So I was wondering if maybe the same actual code be used with
the same sanity of type checking, but some annotation provided at the type
declaration to allow such optimization to take place.

[-- Attachment #2: Type: text/html, Size: 845 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-18 17:06 [Caml-list] Type Encoding Format Control Kenneth Adam Miller
@ 2015-08-18 18:57 ` Hendrik Boom
  2015-08-18 19:01   ` Kenneth Adam Miller
  2015-08-25 12:09 ` [Caml-list] <DKIM> " Pierre Chambart
  1 sibling, 1 reply; 12+ messages in thread
From: Hendrik Boom @ 2015-08-18 18:57 UTC (permalink / raw)
  To: Kenneth Adam Miller; +Cc: caml users

On Tue, Aug 18, 2015 at 01:06:55PM -0400, Kenneth Adam Miller wrote:
> I was wondering if cases where format control is possible in typing
> constructs can allow things like restricting the implementation size after
> compilation of a specific variant type. Say, for instance that I wanted to
> have a malloc implementation instead of returning a Some 'a | None type
> that compiles down to a boxed case of first a word and then the subsequent
> 'a instance, down to the 'a instance, where in the values of the word enum
> (or tag) are not present in the possibilities of the 'a instance.
> 
> Maybe it sounds silly, but in really tight loops you want to squeeze for
> efficiency. So I was wondering if maybe the same actual code be used with
> the same sanity of type checking, but some annotation provided at the type
> declaration to allow such optimization to take place.

Let's see.  OCaml steals a bit to indicate whether a valus is a pointer 
or not, right?  Could that bit see duual usage for the option type?  So 
that if it's an optional pointer type, the bit is left off, and if it's 
an optional nonpointer type, it's turned on (and set to point to 
location zero, which the GC couls check for)?

THe proble I see with this is if the 'a is passed to a generic function 
where iti isn't statically known where it's a pinter or not.  The 
conpiler will not know whether to test for absence or presence of the 
bit.

-- hendrik

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-18 18:57 ` Hendrik Boom
@ 2015-08-18 19:01   ` Kenneth Adam Miller
  2015-08-18 19:44     ` Gabriel Scherer
  0 siblings, 1 reply; 12+ messages in thread
From: Kenneth Adam Miller @ 2015-08-18 19:01 UTC (permalink / raw)
  To: Hendrik Boom; +Cc: caml users

[-- Attachment #1: Type: text/plain, Size: 2110 bytes --]

Well, it's not restricted to pointers - In general I would think that the
type annotation for Some | None would be left alone. I just used pointer as
an example because pointers exclude a value, 0x0, from the valid set. In
which case None is encoded as 0x0.

Thanks for the bit about polymorphism in the context of what a compiler
would see - clients that do not see the hypothetical additional annotation
for that specific type to allow a format wouldn't have the augmented
operational needs to work on such an instance correctly. Got it!

On Tue, Aug 18, 2015 at 2:57 PM, Hendrik Boom <hendrik@topoi.pooq.com>
wrote:

> On Tue, Aug 18, 2015 at 01:06:55PM -0400, Kenneth Adam Miller wrote:
> > I was wondering if cases where format control is possible in typing
> > constructs can allow things like restricting the implementation size
> after
> > compilation of a specific variant type. Say, for instance that I wanted
> to
> > have a malloc implementation instead of returning a Some 'a | None type
> > that compiles down to a boxed case of first a word and then the
> subsequent
> > 'a instance, down to the 'a instance, where in the values of the word
> enum
> > (or tag) are not present in the possibilities of the 'a instance.
> >
> > Maybe it sounds silly, but in really tight loops you want to squeeze for
> > efficiency. So I was wondering if maybe the same actual code be used with
> > the same sanity of type checking, but some annotation provided at the
> type
> > declaration to allow such optimization to take place.
>
> Let's see.  OCaml steals a bit to indicate whether a valus is a pointer
> or not, right?  Could that bit see duual usage for the option type?  So
> that if it's an optional pointer type, the bit is left off, and if it's
> an optional nonpointer type, it's turned on (and set to point to
> location zero, which the GC couls check for)?
>
> THe proble I see with this is if the 'a is passed to a generic function
> where iti isn't statically known where it's a pinter or not.  The
> conpiler will not know whether to test for absence or presence of the
> bit.
>
> -- hendrik
>

[-- Attachment #2: Type: text/html, Size: 2701 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-18 19:01   ` Kenneth Adam Miller
@ 2015-08-18 19:44     ` Gabriel Scherer
  2015-08-18 19:55       ` Kenneth Adam Miller
  2015-08-20  9:10       ` Goswin von Brederlow
  0 siblings, 2 replies; 12+ messages in thread
From: Gabriel Scherer @ 2015-08-18 19:44 UTC (permalink / raw)
  To: Kenneth Adam Miller; +Cc: Hendrik Boom, caml users

We've discussed optimisations of ('a option) in the past. Certainly
some things could be done, but it's unclear to me how much value there is in
optimizing ('a option) specifically: what if, for example, we later
understand that ('a, exn) result is the more general abstraction that
we should have used instead, and rely on it heavily in libraries, will
we de-optimize options and work on optimizing results?

Note that your idea of "either a failure of a value" can be achieved,
in some monomorphic cases (specifically when you know 'a and it has a
product structure) by using a specific type declaration:

  type my_struct =
    | None
    | Some of int * int array * string

This will be represented as efficiently as the tuple (int * int array
* string), yet it has a default case (or two, or another case with
exceptions, whatever -- this is more flexible than just options). With
inline records in 4.03 -- not yet released -- you will even be able to
have some of the product structure mutable:

  type my_struct =
    | None
    | Some of { mutable count : int; values : int array; name : string }

On Tue, Aug 18, 2015 at 9:01 PM, Kenneth Adam Miller
<kennethadammiller@gmail.com> wrote:
> Well, it's not restricted to pointers - In general I would think that the
> type annotation for Some | None would be left alone. I just used pointer as
> an example because pointers exclude a value, 0x0, from the valid set. In
> which case None is encoded as 0x0.
>
> Thanks for the bit about polymorphism in the context of what a compiler
> would see - clients that do not see the hypothetical additional annotation
> for that specific type to allow a format wouldn't have the augmented
> operational needs to work on such an instance correctly. Got it!
>
> On Tue, Aug 18, 2015 at 2:57 PM, Hendrik Boom <hendrik@topoi.pooq.com>
> wrote:
>>
>> On Tue, Aug 18, 2015 at 01:06:55PM -0400, Kenneth Adam Miller wrote:
>> > I was wondering if cases where format control is possible in typing
>> > constructs can allow things like restricting the implementation size
>> > after
>> > compilation of a specific variant type. Say, for instance that I wanted
>> > to
>> > have a malloc implementation instead of returning a Some 'a | None type
>> > that compiles down to a boxed case of first a word and then the
>> > subsequent
>> > 'a instance, down to the 'a instance, where in the values of the word
>> > enum
>> > (or tag) are not present in the possibilities of the 'a instance.
>> >
>> > Maybe it sounds silly, but in really tight loops you want to squeeze for
>> > efficiency. So I was wondering if maybe the same actual code be used
>> > with
>> > the same sanity of type checking, but some annotation provided at the
>> > type
>> > declaration to allow such optimization to take place.
>>
>> Let's see.  OCaml steals a bit to indicate whether a valus is a pointer
>> or not, right?  Could that bit see duual usage for the option type?  So
>> that if it's an optional pointer type, the bit is left off, and if it's
>> an optional nonpointer type, it's turned on (and set to point to
>> location zero, which the GC couls check for)?
>>
>> THe proble I see with this is if the 'a is passed to a generic function
>> where iti isn't statically known where it's a pinter or not.  The
>> conpiler will not know whether to test for absence or presence of the
>> bit.
>>
>> -- hendrik
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-18 19:44     ` Gabriel Scherer
@ 2015-08-18 19:55       ` Kenneth Adam Miller
  2015-08-18 19:58         ` Gabriel Scherer
  2015-08-20  9:10       ` Goswin von Brederlow
  1 sibling, 1 reply; 12+ messages in thread
From: Kenneth Adam Miller @ 2015-08-18 19:55 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: Hendrik Boom, caml users

[-- Attachment #1: Type: text/plain, Size: 4646 bytes --]

I wouldn't know future route in regards to the question about preferring
exceptions instead later. I wouldn't say go back and undo anything
necessarily, my idea about how this would be implemented would be an extra
annotation that allows the compiler to see that there are potential value
instances of such a type that allow excision of the tag.

In the case of my_struct, you would have instead maybe some extension to
the vernacular like:

type my_struct =
  | None
  | Some of int * int array * string where_instance 0 * [||] * "" is None

In this case, you could continue your annotation of the type my_struct with
more expressive capabilities for situations where there is more than one
hole that represents None.

I don't really believe this constitutes so much value in a language that's
already fast. It would be line noise in almost all applications, and users
might not have incentives to draw from such a language feature unless they
were pushing for improvements in some very tight loop. It was just a
curiosity.

On Tue, Aug 18, 2015 at 3:44 PM, Gabriel Scherer <gabriel.scherer@gmail.com>
wrote:

> We've discussed optimisations of ('a option) in the past. Certainly
> some things could be done, but it's unclear to me how much value there is
> in
> optimizing ('a option) specifically: what if, for example, we later
> understand that ('a, exn) result is the more general abstraction that
> we should have used instead, and rely on it heavily in libraries, will
> we de-optimize options and work on optimizing results?
>
> Note that your idea of "either a failure of a value" can be achieved,
> in some monomorphic cases (specifically when you know 'a and it has a
> product structure) by using a specific type declaration:
>
>   type my_struct =
>     | None
>     | Some of int * int array * string
>
> This will be represented as efficiently as the tuple (int * int array
> * string), yet it has a default case (or two, or another case with
> exceptions, whatever -- this is more flexible than just options). With
> inline records in 4.03 -- not yet released -- you will even be able to
> have some of the product structure mutable:
>
>   type my_struct =
>     | None
>     | Some of { mutable count : int; values : int array; name : string }
>
> On Tue, Aug 18, 2015 at 9:01 PM, Kenneth Adam Miller
> <kennethadammiller@gmail.com> wrote:
> > Well, it's not restricted to pointers - In general I would think that the
> > type annotation for Some | None would be left alone. I just used pointer
> as
> > an example because pointers exclude a value, 0x0, from the valid set. In
> > which case None is encoded as 0x0.
> >
> > Thanks for the bit about polymorphism in the context of what a compiler
> > would see - clients that do not see the hypothetical additional
> annotation
> > for that specific type to allow a format wouldn't have the augmented
> > operational needs to work on such an instance correctly. Got it!
> >
> > On Tue, Aug 18, 2015 at 2:57 PM, Hendrik Boom <hendrik@topoi.pooq.com>
> > wrote:
> >>
> >> On Tue, Aug 18, 2015 at 01:06:55PM -0400, Kenneth Adam Miller wrote:
> >> > I was wondering if cases where format control is possible in typing
> >> > constructs can allow things like restricting the implementation size
> >> > after
> >> > compilation of a specific variant type. Say, for instance that I
> wanted
> >> > to
> >> > have a malloc implementation instead of returning a Some 'a | None
> type
> >> > that compiles down to a boxed case of first a word and then the
> >> > subsequent
> >> > 'a instance, down to the 'a instance, where in the values of the word
> >> > enum
> >> > (or tag) are not present in the possibilities of the 'a instance.
> >> >
> >> > Maybe it sounds silly, but in really tight loops you want to squeeze
> for
> >> > efficiency. So I was wondering if maybe the same actual code be used
> >> > with
> >> > the same sanity of type checking, but some annotation provided at the
> >> > type
> >> > declaration to allow such optimization to take place.
> >>
> >> Let's see.  OCaml steals a bit to indicate whether a valus is a pointer
> >> or not, right?  Could that bit see duual usage for the option type?  So
> >> that if it's an optional pointer type, the bit is left off, and if it's
> >> an optional nonpointer type, it's turned on (and set to point to
> >> location zero, which the GC couls check for)?
> >>
> >> THe proble I see with this is if the 'a is passed to a generic function
> >> where iti isn't statically known where it's a pinter or not.  The
> >> conpiler will not know whether to test for absence or presence of the
> >> bit.
> >>
> >> -- hendrik
> >
> >
>

[-- Attachment #2: Type: text/html, Size: 5852 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-18 19:55       ` Kenneth Adam Miller
@ 2015-08-18 19:58         ` Gabriel Scherer
  0 siblings, 0 replies; 12+ messages in thread
From: Gabriel Scherer @ 2015-08-18 19:58 UTC (permalink / raw)
  To: Kenneth Adam Miller; +Cc: Hendrik Boom, caml users

My point is that in this case (the information you want to nullify is
a product type), you don't need to define a magic null value or try to
avoid that, because adding an extra None constructor will not change
the in-memory representation of the other case (it will not make it
less compact than the original tuple). Accessing the elements of the
tuple will be done by pattern-matching, which implies an extra check
(just like checking that a value is not NULL), but in fact you can use
GADTs to make even that cost go away when you statically know in which
case you are.

On Tue, Aug 18, 2015 at 9:55 PM, Kenneth Adam Miller
<kennethadammiller@gmail.com> wrote:
> I wouldn't know future route in regards to the question about preferring
> exceptions instead later. I wouldn't say go back and undo anything
> necessarily, my idea about how this would be implemented would be an extra
> annotation that allows the compiler to see that there are potential value
> instances of such a type that allow excision of the tag.
>
> In the case of my_struct, you would have instead maybe some extension to the
> vernacular like:
>
> type my_struct =
>   | None
>   | Some of int * int array * string where_instance 0 * [||] * "" is None
>
> In this case, you could continue your annotation of the type my_struct with
> more expressive capabilities for situations where there is more than one
> hole that represents None.
>
> I don't really believe this constitutes so much value in a language that's
> already fast. It would be line noise in almost all applications, and users
> might not have incentives to draw from such a language feature unless they
> were pushing for improvements in some very tight loop. It was just a
> curiosity.
>
> On Tue, Aug 18, 2015 at 3:44 PM, Gabriel Scherer <gabriel.scherer@gmail.com>
> wrote:
>>
>> We've discussed optimisations of ('a option) in the past. Certainly
>> some things could be done, but it's unclear to me how much value there is
>> in
>> optimizing ('a option) specifically: what if, for example, we later
>> understand that ('a, exn) result is the more general abstraction that
>> we should have used instead, and rely on it heavily in libraries, will
>> we de-optimize options and work on optimizing results?
>>
>> Note that your idea of "either a failure of a value" can be achieved,
>> in some monomorphic cases (specifically when you know 'a and it has a
>> product structure) by using a specific type declaration:
>>
>>   type my_struct =
>>     | None
>>     | Some of int * int array * string
>>
>> This will be represented as efficiently as the tuple (int * int array
>> * string), yet it has a default case (or two, or another case with
>> exceptions, whatever -- this is more flexible than just options). With
>> inline records in 4.03 -- not yet released -- you will even be able to
>> have some of the product structure mutable:
>>
>>   type my_struct =
>>     | None
>>     | Some of { mutable count : int; values : int array; name : string }
>>
>> On Tue, Aug 18, 2015 at 9:01 PM, Kenneth Adam Miller
>> <kennethadammiller@gmail.com> wrote:
>> > Well, it's not restricted to pointers - In general I would think that
>> > the
>> > type annotation for Some | None would be left alone. I just used pointer
>> > as
>> > an example because pointers exclude a value, 0x0, from the valid set. In
>> > which case None is encoded as 0x0.
>> >
>> > Thanks for the bit about polymorphism in the context of what a compiler
>> > would see - clients that do not see the hypothetical additional
>> > annotation
>> > for that specific type to allow a format wouldn't have the augmented
>> > operational needs to work on such an instance correctly. Got it!
>> >
>> > On Tue, Aug 18, 2015 at 2:57 PM, Hendrik Boom <hendrik@topoi.pooq.com>
>> > wrote:
>> >>
>> >> On Tue, Aug 18, 2015 at 01:06:55PM -0400, Kenneth Adam Miller wrote:
>> >> > I was wondering if cases where format control is possible in typing
>> >> > constructs can allow things like restricting the implementation size
>> >> > after
>> >> > compilation of a specific variant type. Say, for instance that I
>> >> > wanted
>> >> > to
>> >> > have a malloc implementation instead of returning a Some 'a | None
>> >> > type
>> >> > that compiles down to a boxed case of first a word and then the
>> >> > subsequent
>> >> > 'a instance, down to the 'a instance, where in the values of the word
>> >> > enum
>> >> > (or tag) are not present in the possibilities of the 'a instance.
>> >> >
>> >> > Maybe it sounds silly, but in really tight loops you want to squeeze
>> >> > for
>> >> > efficiency. So I was wondering if maybe the same actual code be used
>> >> > with
>> >> > the same sanity of type checking, but some annotation provided at the
>> >> > type
>> >> > declaration to allow such optimization to take place.
>> >>
>> >> Let's see.  OCaml steals a bit to indicate whether a valus is a pointer
>> >> or not, right?  Could that bit see duual usage for the option type?  So
>> >> that if it's an optional pointer type, the bit is left off, and if it's
>> >> an optional nonpointer type, it's turned on (and set to point to
>> >> location zero, which the GC couls check for)?
>> >>
>> >> THe proble I see with this is if the 'a is passed to a generic function
>> >> where iti isn't statically known where it's a pinter or not.  The
>> >> conpiler will not know whether to test for absence or presence of the
>> >> bit.
>> >>
>> >> -- hendrik
>> >
>> >
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-18 19:44     ` Gabriel Scherer
  2015-08-18 19:55       ` Kenneth Adam Miller
@ 2015-08-20  9:10       ` Goswin von Brederlow
  2015-08-20 13:08         ` Kenneth Adam Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Goswin von Brederlow @ 2015-08-20  9:10 UTC (permalink / raw)
  To: caml-list

On Tue, Aug 18, 2015 at 09:44:04PM +0200, Gabriel Scherer wrote:
> We've discussed optimisations of ('a option) in the past. Certainly
> some things could be done, but it's unclear to me how much value there is in
> optimizing ('a option) specifically: what if, for example, we later
> understand that ('a, exn) result is the more general abstraction that
> we should have used instead, and rely on it heavily in libraries, will
> we de-optimize options and work on optimizing results?
> 
> Note that your idea of "either a failure of a value" can be achieved,
> in some monomorphic cases (specifically when you know 'a and it has a
> product structure) by using a specific type declaration:
> 
>   type my_struct =
>     | None
>     | Some of int * int array * string
> 
> This will be represented as efficiently as the tuple (int * int array
> * string), yet it has a default case (or two, or another case with
> exceptions, whatever -- this is more flexible than just options). With
> inline records in 4.03 -- not yet released -- you will even be able to
> have some of the product structure mutable:
> 
>   type my_struct =
>     | None
>     | Some of { mutable count : int; values : int array; name : string }

Except here you are talking about an already boxed value. The Some
part is encoded in the tag of the box you already have.

In the case of a pointer you have not box to put the Some tag on.
 
> On Tue, Aug 18, 2015 at 9:01 PM, Kenneth Adam Miller
> <kennethadammiller@gmail.com> wrote:
> > Well, it's not restricted to pointers - In general I would think that the
> > type annotation for Some | None would be left alone. I just used pointer as
> > an example because pointers exclude a value, 0x0, from the valid set. In
> > which case None is encoded as 0x0.
> >
> > Thanks for the bit about polymorphism in the context of what a compiler
> > would see - clients that do not see the hypothetical additional annotation
> > for that specific type to allow a format wouldn't have the augmented
> > operational needs to work on such an instance correctly. Got it!

There are 2 problems:

1) polymorphism

A function taking an 'a option could get 0, which it would easily
detect as 0, a pointer to some value or a pointer to box taged Some.
But there is no way to separate the last two cases. The pointer
doesn't even have to point to an ocaml value so dereferencing and
checking if it is a box with Some tag is not an option. And even if
you could any variant type with a boxed constructor will have a tag of
0, same as Some. Having different representation for pointer option
and other option doesn't work.

2) 'a option option

For 'a option None becomes 0x0 and Some pointer becomes pointer. But
what about 'a option option. Do you repeat the process? Then None and
Some None would both become 0x0 and can't be separated. And if you
don't repeat the process you break polymorphism (see 1).


What you can do is create a new type that behaves like

    type 'a ptr_option = NULL | 'a

but the type would have to be abstract and you would have to have a
function to convert it into an 'a option for pattern matching like

    match to_option ptr_opt with
    | None -> ...
    | Some ptr -> ...

and hope the optimizer eliminates the allocation and boxing.

Doesn't ocaml-ctypes already have such a type?

> > On Tue, Aug 18, 2015 at 2:57 PM, Hendrik Boom <hendrik@topoi.pooq.com>
> > wrote:
> >>
> >> On Tue, Aug 18, 2015 at 01:06:55PM -0400, Kenneth Adam Miller wrote:
> >> > I was wondering if cases where format control is possible in typing
> >> > constructs can allow things like restricting the implementation size
> >> > after
> >> > compilation of a specific variant type. Say, for instance that I wanted
> >> > to
> >> > have a malloc implementation instead of returning a Some 'a | None type
> >> > that compiles down to a boxed case of first a word and then the
> >> > subsequent
> >> > 'a instance, down to the 'a instance, where in the values of the word
> >> > enum
> >> > (or tag) are not present in the possibilities of the 'a instance.
> >> >
> >> > Maybe it sounds silly, but in really tight loops you want to squeeze for
> >> > efficiency. So I was wondering if maybe the same actual code be used
> >> > with
> >> > the same sanity of type checking, but some annotation provided at the
> >> > type
> >> > declaration to allow such optimization to take place.
> >>
> >> Let's see.  OCaml steals a bit to indicate whether a valus is a pointer
> >> or not, right?  Could that bit see duual usage for the option type?  So
> >> that if it's an optional pointer type, the bit is left off, and if it's
> >> an optional nonpointer type, it's turned on (and set to point to
> >> location zero, which the GC couls check for)?
> >>
> >> THe proble I see with this is if the 'a is passed to a generic function
> >> where iti isn't statically known where it's a pinter or not.  The
> >> conpiler will not know whether to test for absence or presence of the
> >> bit.
> >>
> >> -- hendrik

MfG
	Goswin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-20  9:10       ` Goswin von Brederlow
@ 2015-08-20 13:08         ` Kenneth Adam Miller
  2015-08-20 14:05           ` David Allsopp
  0 siblings, 1 reply; 12+ messages in thread
From: Kenneth Adam Miller @ 2015-08-20 13:08 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml users

[-- Attachment #1: Type: text/plain, Size: 6177 bytes --]

In the case of 2), that's interesting because such a type of Some None is
sort of antithetical to describing anything sensical. Not that it's not
pragmatic or doesn't occur-I'm sure some functions get combined in ways
that stuff like this crops up. But I think of the typing system as being
badly exercised if something like this arises-but I really see your point
in terms of implementation. Thanks for writing this.

I'm sure that ctypes would likely be able to answer exactly this problem
because it boils down to C. But it was a general question about extensions
to the typing system.

On Thu, Aug 20, 2015 at 5:10 AM, Goswin von Brederlow <goswin-v-b@web.de>
wrote:

> On Tue, Aug 18, 2015 at 09:44:04PM +0200, Gabriel Scherer wrote:
> > We've discussed optimisations of ('a option) in the past. Certainly
> > some things could be done, but it's unclear to me how much value there
> is in
> > optimizing ('a option) specifically: what if, for example, we later
> > understand that ('a, exn) result is the more general abstraction that
> > we should have used instead, and rely on it heavily in libraries, will
> > we de-optimize options and work on optimizing results?
> >
> > Note that your idea of "either a failure of a value" can be achieved,
> > in some monomorphic cases (specifically when you know 'a and it has a
> > product structure) by using a specific type declaration:
> >
> >   type my_struct =
> >     | None
> >     | Some of int * int array * string
> >
> > This will be represented as efficiently as the tuple (int * int array
> > * string), yet it has a default case (or two, or another case with
> > exceptions, whatever -- this is more flexible than just options). With
> > inline records in 4.03 -- not yet released -- you will even be able to
> > have some of the product structure mutable:
> >
> >   type my_struct =
> >     | None
> >     | Some of { mutable count : int; values : int array; name : string }
>
> Except here you are talking about an already boxed value. The Some
> part is encoded in the tag of the box you already have.
>
> In the case of a pointer you have not box to put the Some tag on.
>
> > On Tue, Aug 18, 2015 at 9:01 PM, Kenneth Adam Miller
> > <kennethadammiller@gmail.com> wrote:
> > > Well, it's not restricted to pointers - In general I would think that
> the
> > > type annotation for Some | None would be left alone. I just used
> pointer as
> > > an example because pointers exclude a value, 0x0, from the valid set.
> In
> > > which case None is encoded as 0x0.
> > >
> > > Thanks for the bit about polymorphism in the context of what a compiler
> > > would see - clients that do not see the hypothetical additional
> annotation
> > > for that specific type to allow a format wouldn't have the augmented
> > > operational needs to work on such an instance correctly. Got it!
>
> There are 2 problems:
>
> 1) polymorphism
>
> A function taking an 'a option could get 0, which it would easily
> detect as 0, a pointer to some value or a pointer to box taged Some.
> But there is no way to separate the last two cases. The pointer
> doesn't even have to point to an ocaml value so dereferencing and
> checking if it is a box with Some tag is not an option. And even if
> you could any variant type with a boxed constructor will have a tag of
> 0, same as Some. Having different representation for pointer option
> and other option doesn't work.
>
> 2) 'a option option
>
> For 'a option None becomes 0x0 and Some pointer becomes pointer. But
> what about 'a option option. Do you repeat the process? Then None and
> Some None would both become 0x0 and can't be separated. And if you
> don't repeat the process you break polymorphism (see 1).
>
>
> What you can do is create a new type that behaves like
>
>     type 'a ptr_option = NULL | 'a
>
> but the type would have to be abstract and you would have to have a
> function to convert it into an 'a option for pattern matching like
>
>     match to_option ptr_opt with
>     | None -> ...
>     | Some ptr -> ...
>
> and hope the optimizer eliminates the allocation and boxing.
>
> Doesn't ocaml-ctypes already have such a type?
>
> > > On Tue, Aug 18, 2015 at 2:57 PM, Hendrik Boom <hendrik@topoi.pooq.com>
> > > wrote:
> > >>
> > >> On Tue, Aug 18, 2015 at 01:06:55PM -0400, Kenneth Adam Miller wrote:
> > >> > I was wondering if cases where format control is possible in typing
> > >> > constructs can allow things like restricting the implementation size
> > >> > after
> > >> > compilation of a specific variant type. Say, for instance that I
> wanted
> > >> > to
> > >> > have a malloc implementation instead of returning a Some 'a | None
> type
> > >> > that compiles down to a boxed case of first a word and then the
> > >> > subsequent
> > >> > 'a instance, down to the 'a instance, where in the values of the
> word
> > >> > enum
> > >> > (or tag) are not present in the possibilities of the 'a instance.
> > >> >
> > >> > Maybe it sounds silly, but in really tight loops you want to
> squeeze for
> > >> > efficiency. So I was wondering if maybe the same actual code be used
> > >> > with
> > >> > the same sanity of type checking, but some annotation provided at
> the
> > >> > type
> > >> > declaration to allow such optimization to take place.
> > >>
> > >> Let's see.  OCaml steals a bit to indicate whether a valus is a
> pointer
> > >> or not, right?  Could that bit see duual usage for the option type?
> So
> > >> that if it's an optional pointer type, the bit is left off, and if
> it's
> > >> an optional nonpointer type, it's turned on (and set to point to
> > >> location zero, which the GC couls check for)?
> > >>
> > >> THe proble I see with this is if the 'a is passed to a generic
> function
> > >> where iti isn't statically known where it's a pinter or not.  The
> > >> conpiler will not know whether to test for absence or presence of the
> > >> bit.
> > >>
> > >> -- hendrik
>
> MfG
>         Goswin
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

[-- Attachment #2: Type: text/html, Size: 8047 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [Caml-list] Type Encoding Format Control
  2015-08-20 13:08         ` Kenneth Adam Miller
@ 2015-08-20 14:05           ` David Allsopp
  2015-08-20 14:09             ` Kenneth Adam Miller
  0 siblings, 1 reply; 12+ messages in thread
From: David Allsopp @ 2015-08-20 14:05 UTC (permalink / raw)
  To: Kenneth Adam Miller, Goswin von Brederlow; +Cc: caml users

Kenneth Adam Miller wrote:
> In the case of 2), that's interesting because such a type of
> Some None is sort of antithetical to describing anything 
> sensical. Not that it's not pragmatic or doesn't occur-I'm sure
> some functions get combined in ways that stuff like this crops 
> up. But I think of the typing system as being badly exercised 
> if something like this arises-

One example that springs immediately to mind: NULLable field in a database, so 'a option is a sensible type to represent it. Now consider a function intended to generate SQL UPDATE statements for that field:

val update_record : ?foo:int option -> ?bar:int option -> id:int -> baz:string -> bool

where omitting ~foo or ~bar means "don't change foo/bar in the UPDATE statement". Within the code for update_record:

foo = None => don't update that field
foo = Some None => set that field to NULL
foo = Some (Some i) => set that field's value to i

and all three cases will need different code.

See also https://github.com/ocaml/ocaml/blob/trunk/typing/env.ml#L391

What's (to you) badly exercised or nonsensical in either of those representations?


David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-20 14:05           ` David Allsopp
@ 2015-08-20 14:09             ` Kenneth Adam Miller
  2015-08-20 14:11               ` Kenneth Adam Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Kenneth Adam Miller @ 2015-08-20 14:09 UTC (permalink / raw)
  To: David Allsopp; +Cc: Goswin von Brederlow, caml users

[-- Attachment #1: Type: text/plain, Size: 1919 bytes --]

It expresses intuitively, "Something which is exactly a nothing", so
naturally, I would categorize that as a nothing directly of course. And
you've just done precisely that with your code; foo = Some None => set that
field to NULL could only represent saying that field is just exactly
nothing directly. So, it's just like I said-you have to deal with the
instance because it comes up in practice, and pragmatically we have to
represent such cases in machine code as has been discussed. But in
practicality almost never would an author sensibly keep the expanded form
of Some None directly, it shows up due to code combinations only to be
reduced.

On Thu, Aug 20, 2015 at 10:05 AM, David Allsopp <dra-news@metastack.com>
wrote:

> Kenneth Adam Miller wrote:
> > In the case of 2), that's interesting because such a type of
> > Some None is sort of antithetical to describing anything
> > sensical. Not that it's not pragmatic or doesn't occur-I'm sure
> > some functions get combined in ways that stuff like this crops
> > up. But I think of the typing system as being badly exercised
> > if something like this arises-
>
> One example that springs immediately to mind: NULLable field in a
> database, so 'a option is a sensible type to represent it. Now consider a
> function intended to generate SQL UPDATE statements for that field:
>
> val update_record : ?foo:int option -> ?bar:int option -> id:int ->
> baz:string -> bool
>
> where omitting ~foo or ~bar means "don't change foo/bar in the UPDATE
> statement". Within the code for update_record:
>
> foo = None => don't update that field
> foo = Some None => set that field to NULL
> foo = Some (Some i) => set that field's value to i
>
> and all three cases will need different code.
>
> See also https://github.com/ocaml/ocaml/blob/trunk/typing/env.ml#L391
>
> What's (to you) badly exercised or nonsensical in either of those
> representations?
>
>
> David
>

[-- Attachment #2: Type: text/html, Size: 2586 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] Type Encoding Format Control
  2015-08-20 14:09             ` Kenneth Adam Miller
@ 2015-08-20 14:11               ` Kenneth Adam Miller
  0 siblings, 0 replies; 12+ messages in thread
From: Kenneth Adam Miller @ 2015-08-20 14:11 UTC (permalink / raw)
  To: David Allsopp; +Cc: Goswin von Brederlow, caml users

[-- Attachment #1: Type: text/plain, Size: 2316 bytes --]

One of the things I had also thought about expressing also is about
automatically collapsing such a type instance as part of it's definition
with other annotations, but that's complicated and likely a different topic
so I didn't mention it.

On Thu, Aug 20, 2015 at 10:09 AM, Kenneth Adam Miller <
kennethadammiller@gmail.com> wrote:

> It expresses intuitively, "Something which is exactly a nothing", so
> naturally, I would categorize that as a nothing directly of course. And
> you've just done precisely that with your code; foo = Some None => set that
> field to NULL could only represent saying that field is just exactly
> nothing directly. So, it's just like I said-you have to deal with the
> instance because it comes up in practice, and pragmatically we have to
> represent such cases in machine code as has been discussed. But in
> practicality almost never would an author sensibly keep the expanded form
> of Some None directly, it shows up due to code combinations only to be
> reduced.
>
> On Thu, Aug 20, 2015 at 10:05 AM, David Allsopp <dra-news@metastack.com>
> wrote:
>
>> Kenneth Adam Miller wrote:
>> > In the case of 2), that's interesting because such a type of
>> > Some None is sort of antithetical to describing anything
>> > sensical. Not that it's not pragmatic or doesn't occur-I'm sure
>> > some functions get combined in ways that stuff like this crops
>> > up. But I think of the typing system as being badly exercised
>> > if something like this arises-
>>
>> One example that springs immediately to mind: NULLable field in a
>> database, so 'a option is a sensible type to represent it. Now consider a
>> function intended to generate SQL UPDATE statements for that field:
>>
>> val update_record : ?foo:int option -> ?bar:int option -> id:int ->
>> baz:string -> bool
>>
>> where omitting ~foo or ~bar means "don't change foo/bar in the UPDATE
>> statement". Within the code for update_record:
>>
>> foo = None => don't update that field
>> foo = Some None => set that field to NULL
>> foo = Some (Some i) => set that field's value to i
>>
>> and all three cases will need different code.
>>
>> See also https://github.com/ocaml/ocaml/blob/trunk/typing/env.ml#L391
>>
>> What's (to you) badly exercised or nonsensical in either of those
>> representations?
>>
>>
>> David
>>
>
>

[-- Attachment #2: Type: text/html, Size: 3258 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Caml-list] <DKIM> Type Encoding Format Control
  2015-08-18 17:06 [Caml-list] Type Encoding Format Control Kenneth Adam Miller
  2015-08-18 18:57 ` Hendrik Boom
@ 2015-08-25 12:09 ` Pierre Chambart
  1 sibling, 0 replies; 12+ messages in thread
From: Pierre Chambart @ 2015-08-25 12:09 UTC (permalink / raw)
  To: Kenneth Adam Miller, caml users

On 18/08/2015 19:06, Kenneth Adam Miller wrote:
> I was wondering if cases where format control is possible in typing
> constructs can allow things like restricting the implementation size
> after compilation of a specific variant type. Say, for instance that I
> wanted to have a malloc implementation instead of returning a Some 'a
> | None type that compiles down to a boxed case of first a word and
> then the subsequent 'a instance, down to the 'a instance, where in the
> values of the word enum (or tag) are not present in the possibilities
> of the 'a instance.
>
> Maybe it sounds silly, but in really tight loops you want to squeeze
> for efficiency. So I was wondering if maybe the same actual code be
> used with the same sanity of type checking, but some annotation
> provided at the type declaration to allow such optimization to take place.
Note that there are some ways to avoid allocating options in tight loops
using well chosen GADT and CPS style. With CPS style, it is possible to
'return' multiple values (as they are encoded as arguments to the
continuation) and GADT's it is possible to encode the tag of a value
outside of it, i.e. not needing to box to have different cases.

This is kind of a heavy style, but if it is only for the tight loop, it
may not matter.

For instance, encoding a while loop accumulating on an option can look like:

let f x =
  let r = ref None in
  while 'condition' do
    do_something ...
    if other_condition then
      r := Some (something)
    else
      r := None
  done

Transformed to:

type some_type = ... (the thing that will end up in the option)

type _ arg =
  | None_arg : unit arg
  | Some_arg : some_type arg

let f x =
  let rec loop : type t. t arg -> t -> unit = fun arg_kind r ->
    if condition then ()
    else begin
      do_something ...
      if other_condition then
        loop Some_arg (something)
      else
        loop None_arg ()
   in
   loop None_arg ()

Happy encoding !
-- 
Pierre

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-08-25 12:09 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-18 17:06 [Caml-list] Type Encoding Format Control Kenneth Adam Miller
2015-08-18 18:57 ` Hendrik Boom
2015-08-18 19:01   ` Kenneth Adam Miller
2015-08-18 19:44     ` Gabriel Scherer
2015-08-18 19:55       ` Kenneth Adam Miller
2015-08-18 19:58         ` Gabriel Scherer
2015-08-20  9:10       ` Goswin von Brederlow
2015-08-20 13:08         ` Kenneth Adam Miller
2015-08-20 14:05           ` David Allsopp
2015-08-20 14:09             ` Kenneth Adam Miller
2015-08-20 14:11               ` Kenneth Adam Miller
2015-08-25 12:09 ` [Caml-list] <DKIM> " Pierre Chambart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).