caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
@ 2012-03-22  9:51 Roberto Di Cosmo
  2012-03-22 12:05 ` Julien Signoles
  2012-03-22 22:12 ` Jacques Garrigue
  0 siblings, 2 replies; 11+ messages in thread
From: Roberto Di Cosmo @ 2012-03-22  9:51 UTC (permalink / raw)
  To: caml-list

I was playing around some more with GADTs, preparing my next course
on functional programming, and came across an example on
which the type checker outputs a type containing a funny name
for a type variable. 

I do not know if this qualifies as a bug, so before filing it, I'd
love some feedback.

Here we are: I am writing an example to show how one can code
a well typed length function for empty/nonempty list; we start
with the type definition

 type empty
 type nonempty

 type ('a, 'b) l =
     Nil : (empty, 'b) l
   | Cons : 'b * ('a, 'b) l -> (nonempty, 'b) l;;

then we move on to the length function

 let rec length : type a b. (a,b) l -> 'a = function 
   | Nil -> 0 
   | (Cons (_,r)) -> 1+ (length r);;

Hey, you say, the natural type to declare for length should be

   type a b. (a,b) l -> int

but as you see, for some mysterious reason I ended
up writing something silly (dont ask why, I am just
very well known to step on strange bugs as easily
as I breath)

   type a b. (a,b) l -> 'a

The type checker is happy anyway: 'a will be instantiated
to int, and the declaration of length is accepted... but now
look at the output type

   val length : ('a1, 'b) l -> int = <fun>

Why do we get 'a1, and not 'a, in the type?

Well, probably, since 'a is instantiated to int during
type checking, it may be the case that 'a, as type name, is
still marked as taken during the type output, so we get ('a1,'b)

The type is perfectly sound... it is just 'surprising' for
a regular user... do you think this should be considered a bug?

--Roberto
 
------------------------------------------------------------------
Professeur               En delegation a l'INRIA
PPS                      E-mail: roberto@dicosmo.org
Universite Paris Diderot WWW  : http://www.dicosmo.org
Case 7014                Tel  : ++33-(0)1-57 27 92 20
5, Rue Thomas Mann       
F-75205 Paris Cedex 13   Identica: http://identi.ca/rdicosmo
FRANCE.                  Twitter: http://twitter.com/rdicosmo
------------------------------------------------------------------
Attachments:
MIME accepted, Word deprecated
      http://www.gnu.org/philosophy/no-word-attachments.html
------------------------------------------------------------------
Office location:
 
Bureau 6C08 (6th floor)
175, rue du Chevaleret, XIII
Metro Chevaleret, ligne 6
------------------------------------------------------------------                                                 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-22  9:51 [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS? Roberto Di Cosmo
@ 2012-03-22 12:05 ` Julien Signoles
  2012-03-22 13:39   ` Jonathan Protzenko
  2012-03-22 22:12 ` Jacques Garrigue
  1 sibling, 1 reply; 11+ messages in thread
From: Julien Signoles @ 2012-03-22 12:05 UTC (permalink / raw)
  To: Roberto Di Cosmo; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 671 bytes --]

Hello,

Le 22 mars 2012 10:51, Roberto Di Cosmo <roberto@dicosmo.org> a écrit :

>   val length : ('a1, 'b) l -> int = <fun>
>
> Why do we get 'a1, and not 'a, in the type?
>
> Well, probably, since 'a is instantiated to int during
> type checking, it may be the case that 'a, as type name, is
> still marked as taken during the type output, so we get ('a1,'b)
>
> The type is perfectly sound... it is just 'surprising' for
> a regular user... do you think this should be considered a bug?
>

IMHO it is not a bug (as you said, the type is sound), but you could write
a feature request like "generate variable name as best as possible"...

--
Julien

[-- Attachment #2: Type: text/html, Size: 997 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-22 12:05 ` Julien Signoles
@ 2012-03-22 13:39   ` Jonathan Protzenko
  2012-03-22 14:40     ` Didier Remy
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Protzenko @ 2012-03-22 13:39 UTC (permalink / raw)
  To: Julien Signoles; +Cc: caml-list, Roberto Di Cosmo

Jacques Garrigue recently implemented a patch that tries to keep 
user-provided type names when printing out types, and also when giving 
error messages. In your example, there probably is an internal variable 
that's called "a" because of the type name you provided. Some wizardry 
happens with GADTs, so another type variable is generated. The other 
one ends up being printed, and because it's called "a" too, the 
type-checker has to find another suitable name.

Or something like that. Maybe change type a b. to type foo bar. will 
give different results :).

Cheers,

jonathan

On Thu 22 Mar 2012 01:05:08 PM CET, Julien Signoles wrote:
> Hello,
>
> Le 22 mars 2012 10:51, Roberto Di Cosmo <roberto@dicosmo.org
> <mailto:roberto@dicosmo.org>> a écrit :
>
>       val length : ('a1, 'b) l -> int = <fun>
>
>     Why do we get 'a1, and not 'a, in the type?
>
>     Well, probably, since 'a is instantiated to int during
>     type checking, it may be the case that 'a, as type name, is
>     still marked as taken during the type output, so we get ('a1,'b)
>
>     The type is perfectly sound... it is just 'surprising' for
>     a regular user... do you think this should be considered a bug?
>
>
> IMHO it is not a bug (as you said, the type is sound), but you could
> write a feature request like "generate variable name as best as
> possible"...
>
> --
> Julien

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-22 13:39   ` Jonathan Protzenko
@ 2012-03-22 14:40     ` Didier Remy
  0 siblings, 0 replies; 11+ messages in thread
From: Didier Remy @ 2012-03-22 14:40 UTC (permalink / raw)
  To: caml-list

> Jacques Garrigue recently implemented a patch that tries to keep user-provided
> type names when printing out types, and also when giving error
> messages. In your
> example, there probably is an internal variable that's called "a" because of the
> type name you provided. Some wizardry happens with GADTs, so another type
> variable is generated. The other one ends up being printed, and because it's
> called "a" too, the type-checker has to find another suitable name.
>
> Or something like that. Maybe change type a b. to type foo bar. will give
> different results :).

This is an explanation of why it is printed so.

The question of Roberto is rather whether this is intended or should be
fixed (call it a bug or anything else).

If the intention is that types should be printed with _best_ variable names
for some meaning of best (shortest, reuse source variable names whenever
meaningful, etc.), I do not see any reasonable definition of "best" that
would imply printing 'a1 in Roberto's example instead of 'a, while 'a is
printed in the two following examples (especially the second one) while
it appears in the source---and also unified with int in the second example.

         let id : 'a -> 'b = function x -> 1
         id : 'a -> 'a

         let foo : 'b -> 'a = function x -> fst x; 1;;
         val foo : 'a * 'b -> int = <fun>

Thus, I would tend to call this a bug (a something that should be fixed if
possible)...

         Didier



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-22  9:51 [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS? Roberto Di Cosmo
  2012-03-22 12:05 ` Julien Signoles
@ 2012-03-22 22:12 ` Jacques Garrigue
  2012-03-22 23:35   ` Roberto Di Cosmo
  2012-03-23  5:52   ` Gabriel Scherer
  1 sibling, 2 replies; 11+ messages in thread
From: Jacques Garrigue @ 2012-03-22 22:12 UTC (permalink / raw)
  To: Roberto Di Cosmo; +Cc: caml-list

On 2012/03/22, at 18:51, Roberto Di Cosmo wrote:

> I was playing around some more with GADTs, preparing my next course
> on functional programming, and came across an example on
> which the type checker outputs a type containing a funny name
> for a type variable. 

Dear Roberto,

Thanks for testing the new features.
There certainly are some rough edges left.

> I do not know if this qualifies as a bug, so before filing it, I'd
> love some feedback.
> 
> Here we are: I am writing an example to show how one can code
> a well typed length function for empty/nonempty list; we start
> with the type definition
> 
> type empty
> type nonempty
> 
> type ('a, 'b) l =
>     Nil : (empty, 'b) l
>   | Cons : 'b * ('a, 'b) l -> (nonempty, 'b) l;;
> 
> then we move on to the length function
> 
> let rec length : type a b. (a,b) l -> 'a = function 
>   | Nil -> 0 
>   | (Cons (_,r)) -> 1+ (length r);;
> 
> Hey, you say, the natural type to declare for length should be
> 
>   type a b. (a,b) l -> int
> 
> but as you see, for some mysterious reason I ended
> up writing something silly (dont ask why, I am just
> very well known to step on strange bugs as easily
> as I breath)
> 
>   type a b. (a,b) l -> 'a
> 
> The type checker is happy anyway: 'a will be instantiated
> to int, and the declaration of length is accepted... but now
> look at the output type
> 
>   val length : ('a1, 'b) l -> int = <fun>
> 
> Why do we get 'a1, and not 'a, in the type?
> 
> Well, probably, since 'a is instantiated to int during
> type checking, it may be the case that 'a, as type name, is
> still marked as taken during the type output, so we get ('a1,'b)
> 
> The type is perfectly sound... it is just 'surprising' for
> a regular user... do you think this should be considered a bug?

Well, since there is no specification whatsoever about type variable names
in the output, this is hard to call it a bug.
On the other hand, you were surprised, so something is probably wrong.

After thinking a bit more about it, actually the problem is not in the printer,
as everybody assumed, but in the parser.
Namely, 

	... : type a b. (a,b) l -> 'a = ...

is just syntactic sugar for

	... : 'a1 'b. ('a1,'b) l -> 'a = fun (type a) (type b) -> (... : (a,b) l -> 'a)

Now you see where 'a1 appears: since there is already an 'a in the type,
we cannot use 'a for the fresh abstract type a.

However, as you already mentioned, the original type itself was erroneous.
So a better solution would be to get an error at that point.
Namely, if the type annotation contains a type variable with the same name
as one of the quantified types.
Does it sound reasonable.

At times I wonder whether the "type a b. ...." syntax is the right approach.
Another approach would be to change the meaning of
  ... : 'a 'b. ('a,'b) l -> 'a = ...
to have 'a and 'b defined in the right hand side.
The trouble is that we still need the locally abstract types internally, so this
could be confusing.
Also this could break some existing programs using polymorphic methods.

	Jacques Garrigue

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-22 22:12 ` Jacques Garrigue
@ 2012-03-22 23:35   ` Roberto Di Cosmo
  2012-03-23  5:52   ` Gabriel Scherer
  1 sibling, 0 replies; 11+ messages in thread
From: Roberto Di Cosmo @ 2012-03-22 23:35 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

Dear Jacques,
     thanks for having looked into this issue and explaining its origin:
if I understand well, this comes from a combination of two mechanisms
 
 (1) the desugaring of type a b. (a,b) l -> 'a into 'a1 'b. ('a1,'b) l -> 'a 
     that introduces the 'a1
  
 (2) the fact that the output type strives to respect the type variable
     names used in the user code : this brings the 'a1 over to the
     output type

I think (2) is a useful feature: it may be hepful to see immediately
where a type variable comes from.

I would probably, as you did, point my finger at (1): the trouble seems
to come from the fact that we can intermix unquoted type variables and
quoted type variables in the same type.

One may have very legitimate reasons to want an 'a in the type 

 type a b. (a,b) l -> 'a 

but of course we do not expect a and 'a to be the same variable...

--Roberto



On Fri, Mar 23, 2012 at 07:12:52AM +0900, Jacques Garrigue wrote:
> On 2012/03/22, at 18:51, Roberto Di Cosmo wrote:
<snip>
> > type empty
> > type nonempty
> > 
> > type ('a, 'b) l =
> >     Nil : (empty, 'b) l
> >   | Cons : 'b * ('a, 'b) l -> (nonempty, 'b) l;;
> > 
<snip>
> > let rec length : type a b. (a,b) l -> 'a = function 
> >   | Nil -> 0 
> >   | (Cons (_,r)) -> 1+ (length r);;
> > 
<snip>
> > look at the output type
> > 
> >   val length : ('a1, 'b) l -> int = <fun>
> > 
<snip>
> > The type is perfectly sound... it is just 'surprising' for
> > a regular user... do you think this should be considered a bug?
> 
> Well, since there is no specification whatsoever about type variable names
> in the output, this is hard to call it a bug.
> On the other hand, you were surprised, so something is probably wrong.
> 
> After thinking a bit more about it, actually the problem is not in the printer,
> as everybody assumed, but in the parser.
> Namely, 
> 
> 	... : type a b. (a,b) l -> 'a = ...
> 
> is just syntactic sugar for
> 
> 	... : 'a1 'b. ('a1,'b) l -> 'a = fun (type a) (type b) -> (... : (a,b) l -> 'a)
> 
> Now you see where 'a1 appears: since there is already an 'a in the type,
> we cannot use 'a for the fresh abstract type a.
> 
> However, as you already mentioned, the original type itself was erroneous.
> So a better solution would be to get an error at that point.
> Namely, if the type annotation contains a type variable with the same name
> as one of the quantified types.
> Does it sound reasonable.
> 
> At times I wonder whether the "type a b. ...." syntax is the right approach.
> Another approach would be to change the meaning of
>   ... : 'a 'b. ('a,'b) l -> 'a = ...
> to have 'a and 'b defined in the right hand side.
> The trouble is that we still need the locally abstract types internally, so this
> could be confusing.
> Also this could break some existing programs using polymorphic methods.
> 
> 	Jacques Garrigue

-- 
--Roberto Di Cosmo
 
------------------------------------------------------------------
Professeur               En delegation a l'INRIA
PPS                      E-mail: roberto@dicosmo.org
Universite Paris Diderot WWW  : http://www.dicosmo.org
Case 7014                Tel  : ++33-(0)1-57 27 92 20
5, Rue Thomas Mann       
F-75205 Paris Cedex 13   Identica: http://identi.ca/rdicosmo
FRANCE.                  Twitter: http://twitter.com/rdicosmo
------------------------------------------------------------------
Attachments:
MIME accepted, Word deprecated
      http://www.gnu.org/philosophy/no-word-attachments.html
------------------------------------------------------------------
Office location:
 
Bureau 6C08 (6th floor)
175, rue du Chevaleret, XIII
Metro Chevaleret, ligne 6
------------------------------------------------------------------                                                 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-22 22:12 ` Jacques Garrigue
  2012-03-22 23:35   ` Roberto Di Cosmo
@ 2012-03-23  5:52   ` Gabriel Scherer
  2012-03-23  6:45     ` Jacques Garrigue
  1 sibling, 1 reply; 11+ messages in thread
From: Gabriel Scherer @ 2012-03-23  5:52 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: Roberto Di Cosmo, caml-list

> However, as you already mentioned, the original type itself was erroneous.
> So a better solution would be to get an error at that point.
> Namely, if the type annotation contains a type variable with the same name
> as one of the quantified types.
> Does it sound reasonable.

I'd rather use a warning here. There are enough subtleties with (type
a b . foo) already, I possible I would be glad not to have to explain
different scoping rules from the rest of the language.

> Another approach would be to change the meaning of
>  ... : 'a 'b. ('a,'b) l -> 'a = ...
> to have 'a and 'b defined in the right hand side.

That would be quite strange. I already find it dubious that (type a b
. ....) would escape its natural scope, but our nice old type
variables had not been affected by those changes yet, and I think it's
better that they keep their natural meaning of local quantification.

Could we get the type-checker to understand that
    let rec length (type a) (type b) : (a, b) l -> _ = function ...
or let rec length (type a) (type b) (li : (a, b) l) = ...
is polymorphic enough to be a case of polymorphic recursion? This way
we could get rid of the (type a b . ....) syntax altogether have
clearer scoping rules.

(On my wishlist:   ... (type a) (type b) ... could be written ...
(type a b) ...)

On Thu, Mar 22, 2012 at 11:12 PM, Jacques Garrigue
<garrigue@math.nagoya-u.ac.jp> wrote:
> On 2012/03/22, at 18:51, Roberto Di Cosmo wrote:
>
>> I was playing around some more with GADTs, preparing my next course
>> on functional programming, and came across an example on
>> which the type checker outputs a type containing a funny name
>> for a type variable.
>
> Dear Roberto,
>
> Thanks for testing the new features.
> There certainly are some rough edges left.
>
>> I do not know if this qualifies as a bug, so before filing it, I'd
>> love some feedback.
>>
>> Here we are: I am writing an example to show how one can code
>> a well typed length function for empty/nonempty list; we start
>> with the type definition
>>
>> type empty
>> type nonempty
>>
>> type ('a, 'b) l =
>>     Nil : (empty, 'b) l
>>   | Cons : 'b * ('a, 'b) l -> (nonempty, 'b) l;;
>>
>> then we move on to the length function
>>
>> let rec length : type a b. (a,b) l -> 'a = function
>>   | Nil -> 0
>>   | (Cons (_,r)) -> 1+ (length r);;
>>
>> Hey, you say, the natural type to declare for length should be
>>
>>   type a b. (a,b) l -> int
>>
>> but as you see, for some mysterious reason I ended
>> up writing something silly (dont ask why, I am just
>> very well known to step on strange bugs as easily
>> as I breath)
>>
>>   type a b. (a,b) l -> 'a
>>
>> The type checker is happy anyway: 'a will be instantiated
>> to int, and the declaration of length is accepted... but now
>> look at the output type
>>
>>   val length : ('a1, 'b) l -> int = <fun>
>>
>> Why do we get 'a1, and not 'a, in the type?
>>
>> Well, probably, since 'a is instantiated to int during
>> type checking, it may be the case that 'a, as type name, is
>> still marked as taken during the type output, so we get ('a1,'b)
>>
>> The type is perfectly sound... it is just 'surprising' for
>> a regular user... do you think this should be considered a bug?
>
> Well, since there is no specification whatsoever about type variable names
> in the output, this is hard to call it a bug.
> On the other hand, you were surprised, so something is probably wrong.
>
> After thinking a bit more about it, actually the problem is not in the printer,
> as everybody assumed, but in the parser.
> Namely,
>
>        ... : type a b. (a,b) l -> 'a = ...
>
> is just syntactic sugar for
>
>        ... : 'a1 'b. ('a1,'b) l -> 'a = fun (type a) (type b) -> (... : (a,b) l -> 'a)
>
> Now you see where 'a1 appears: since there is already an 'a in the type,
> we cannot use 'a for the fresh abstract type a.
>
> However, as you already mentioned, the original type itself was erroneous.
> So a better solution would be to get an error at that point.
> Namely, if the type annotation contains a type variable with the same name
> as one of the quantified types.
> Does it sound reasonable.
>
> At times I wonder whether the "type a b. ...." syntax is the right approach.
> Another approach would be to change the meaning of
>  ... : 'a 'b. ('a,'b) l -> 'a = ...
> to have 'a and 'b defined in the right hand side.
> The trouble is that we still need the locally abstract types internally, so this
> could be confusing.
> Also this could break some existing programs using polymorphic methods.
>
>        Jacques Garrigue
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa-roc.inria.fr/wws/info/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-23  5:52   ` Gabriel Scherer
@ 2012-03-23  6:45     ` Jacques Garrigue
  2012-03-23  8:13       ` Roberto Di Cosmo
  2012-03-23 10:25       ` Gabriel Scherer
  0 siblings, 2 replies; 11+ messages in thread
From: Jacques Garrigue @ 2012-03-23  6:45 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: Roberto Di Cosmo, caml-list

On 2012/03/23, at 14:52, Gabriel Scherer wrote:

>> However, as you already mentioned, the original type itself was erroneous.
>> So a better solution would be to get an error at that point.
>> Namely, if the type annotation contains a type variable with the same name
>> as one of the quantified types.
>> Does it sound reasonable.
> 
> I'd rather use a warning here. There are enough subtleties with (type
> a b . foo) already, I possible I would be glad not to have to explain
> different scoping rules from the rest of the language.

The point here is that the "type a b. ..." syntax is syntactic sugar.
Rather than putting efforts in making it "hygienic", at the cost of
mangling type variable names, isn't it simpler to define precisely
how it expands, and prohibit ambiguous types?
I would argue that this is actually simpler to explain it that way
(if you want to give an exact specification).

I have committed this in trunk and 4.00, and the output now is a syntax
error:

Error: In this scoped type, variable 'a is reserved for the local type a.

It is of course possible to make it a warning (by doing two passes and generating
fresh names), but I don't see the point: you can always choose your names so
that no conflict happens, so why risk the confusion?
A similar case occurs with value fields in classes: not only you cannot access them
from other value fields, but they actually hide external values.

>> Another approach would be to change the meaning of
>> ... : 'a 'b. ('a,'b) l -> 'a = ...
>> to have 'a and 'b defined in the right hand side.
> 
> That would be quite strange. I already find it dubious that (type a b
> . ....) would escape its natural scope, but our nice old type
> variables had not been affected by those changes yet, and I think it's
> better that they keep their natural meaning of local quantification.

Backward compatibility is of course a concern.
On the other hand, polymorphic methods are not used that often,
and problems only occur with code that is ambiguous to read.

For me the main problem is that it hides the fact those type variables
are really locally abstract types. And the idea behind using locally
abstract types was to avoid introducing yet another kind of type
variable.

> Could we get the type-checker to understand that
>    let rec length (type a) (type b) : (a, b) l -> _ = function ...
> or let rec length (type a) (type b) (li : (a, b) l) = ...
> is polymorphic enough to be a case of polymorphic recursion? This way
> we could get rid of the (type a b . ....) syntax altogether have
> clearer scoping rules.

This is certainly doable, but in my opinion it would not replace the
"type a b. ..." syntax, which is more readable.

> (On my wishlist:   ... (type a) (type b) ... could be written ...
> (type a b) ...)

Easy to do.
Actually I'm not sure why it wasn't defined that way to start with.
Not enough System F-ish?

Jacques Garrigue

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-23  6:45     ` Jacques Garrigue
@ 2012-03-23  8:13       ` Roberto Di Cosmo
  2012-03-23 10:25       ` Gabriel Scherer
  1 sibling, 0 replies; 11+ messages in thread
From: Roberto Di Cosmo @ 2012-03-23  8:13 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: Gabriel Scherer, caml-list

Dear Jacques,
     that seems a perfect fix, for me, as it also gives a hint
to the user about what is going on under the hood, thanks for
implementing it so fast.

And thanks to all for the feedback on this issue :-)

--Roberto

On Fri, Mar 23, 2012 at 03:45:51PM +0900, Jacques Garrigue wrote:
<snip>
> The point here is that the "type a b. ..." syntax is syntactic sugar.
> Rather than putting efforts in making it "hygienic", at the cost of
> mangling type variable names, isn't it simpler to define precisely
> how it expands, and prohibit ambiguous types?
> I would argue that this is actually simpler to explain it that way
> (if you want to give an exact specification).
> 
> I have committed this in trunk and 4.00, and the output now is a syntax
> error:
> 
> Error: In this scoped type, variable 'a is reserved for the local type a.
> 
<snip>
> 
> Jacques Garrigue


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-23  6:45     ` Jacques Garrigue
  2012-03-23  8:13       ` Roberto Di Cosmo
@ 2012-03-23 10:25       ` Gabriel Scherer
  2012-03-24  0:24         ` Jacques Garrigue
  1 sibling, 1 reply; 11+ messages in thread
From: Gabriel Scherer @ 2012-03-23 10:25 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: Roberto Di Cosmo, caml-list

> The point here is that the "type a b. ..." syntax is syntactic sugar.
> Rather than putting efforts in making it "hygienic", at the cost of
> mangling type variable names, isn't it simpler to define precisely
> how it expands, and prohibit ambiguous types?
> I would argue that this is actually simpler to explain it that way
> (if you want to give an exact specification).

I agree that not having to deal with name shadowing makes the
macro-expansion explanation simpler. Furthermore, treating this as an
error instead of a warning also makes the implementation simpler,
which is a point in favor of your change.

I'm not totally convinced by the idea that "type a b . ..." won't
cause trouble to users because it is syntactic sugar:

1. To explain it directly as syntactic sugar, you have to explain the
users both the polymorphic recursion notation and the rigid variable
abstraction (type a) (is this a good name? I like "System F
quantification" and prefer to avoid mumbles like "variables that are
type constructors rather than type variables because of implementation
reasons"); that's reasonable if you're giving a lecture where you have
the opportunity to introduce new concepts in your preferred order and
at your pace, but may not be practical in the random context of
someone asking "hey, what is this (type a b . ...) code doing?". In
practice I expect most non-expert users to consider it as a new,
rather than derived, concept.

2. The syntax still suggests a semantic that is different from the
actual one (ie. that the variables are bound locally to the type
annotation), which could confuse users that don't even ask about it
but simply assume it works somehow as they expect.

I understand there are compromises at play here and am not too
dissatisfied with this syntax, but still if it could be made
unecessary by making the (type a) one scale to polymorphic recursion,
I think that would be even better.

> It is of course possible to make it a warning (by doing two passes and generating
> fresh names), but I don't see the point: you can always choose your names so
> that no conflict happens, so why risk the confusion?

In my opinion, errors should be for semantic errors, and warnings for
"things that look like mistakes but are actually meaningful". In
particular, errors are part of the language specification, while
warnings are not. Making this cosmetic choice of "discouraging
shadowing" an error makes it part of the semantic of the construct,
while leaving it as a warning give it equivalent visibility, but with
less urge to make sure that everyone really understands the situation.
I also believe making part of the language semantics makes it even
more difficult to get a clear and exhaustive mental model of the
status of type variables: we already have 'a, '_a and a, maybe we
don't also need non-uniform rules about when name conflicts and when
they don't.
(However, there is the argument that making this an error now leaves
the option open to degrade it into a warning, while the reverse
direction would break compatibility. I can understand the benefit
keeping it that way for a coming release, even if those decisions
usually have the unfortunate effect of keeping those warts with us
much longer than originally hoped.)

> A similar case occurs with value fields in classes: not only you cannot access them
> from other value fields, but they actually hide external values.

Indeed, instance values syntax is horrible -- I would be very
frustrated if they were actually used.
That's certainly not a good argument to introduce similar things, but
I think you're pessimistic and the present situation is not as bad --
or maybe I haven't understood it in its full glory yet.

> This is certainly doable, but in my opinion it would not replace the
> "type a b. ..." syntax, which is more readable.

Frankly, I don't think the extremely meager readability benefits
outweigh the need to explain yet another kind of type annotation to
the code readers. Compare:

  let rec length : type a b. (a,b) l -> 'a = function
  let rec length (type a b) : (a,b) l -> 'a = function

And that's in the defavorable case of "function" which is friendly to
function types annotations. In other situations you could replace the
relatively awkward
  : a b . foo -> bar -> baz = fun x y -> ....
by something like
  (type a b) (x : foo) (y : bar) : baz = ...
in the cases where it is more readable -- you don't need to emphasize
the whole function type.
(Assuming, always, that the type-checker can detect the polymorphism
for polymorphic recursion.)

On Fri, Mar 23, 2012 at 7:45 AM, Jacques Garrigue
<garrigue@math.nagoya-u.ac.jp> wrote:
> On 2012/03/23, at 14:52, Gabriel Scherer wrote:
>
>>> However, as you already mentioned, the original type itself was erroneous.
>>> So a better solution would be to get an error at that point.
>>> Namely, if the type annotation contains a type variable with the same name
>>> as one of the quantified types.
>>> Does it sound reasonable.
>>
>> I'd rather use a warning here. There are enough subtleties with (type
>> a b . foo) already, I possible I would be glad not to have to explain
>> different scoping rules from the rest of the language.
>
> The point here is that the "type a b. ..." syntax is syntactic sugar.
> Rather than putting efforts in making it "hygienic", at the cost of
> mangling type variable names, isn't it simpler to define precisely
> how it expands, and prohibit ambiguous types?
> I would argue that this is actually simpler to explain it that way
> (if you want to give an exact specification).
>
> I have committed this in trunk and 4.00, and the output now is a syntax
> error:
>
> Error: In this scoped type, variable 'a is reserved for the local type a.
>
> It is of course possible to make it a warning (by doing two passes and generating
> fresh names), but I don't see the point: you can always choose your names so
> that no conflict happens, so why risk the confusion?
> A similar case occurs with value fields in classes: not only you cannot access them
> from other value fields, but they actually hide external values.
>
>>> Another approach would be to change the meaning of
>>> ... : 'a 'b. ('a,'b) l -> 'a = ...
>>> to have 'a and 'b defined in the right hand side.
>>
>> That would be quite strange. I already find it dubious that (type a b
>> . ....) would escape its natural scope, but our nice old type
>> variables had not been affected by those changes yet, and I think it's
>> better that they keep their natural meaning of local quantification.
>
> Backward compatibility is of course a concern.
> On the other hand, polymorphic methods are not used that often,
> and problems only occur with code that is ambiguous to read.
>
> For me the main problem is that it hides the fact those type variables
> are really locally abstract types. And the idea behind using locally
> abstract types was to avoid introducing yet another kind of type
> variable.
>
>> Could we get the type-checker to understand that
>>    let rec length (type a) (type b) : (a, b) l -> _ = function ...
>> or let rec length (type a) (type b) (li : (a, b) l) = ...
>> is polymorphic enough to be a case of polymorphic recursion? This way
>> we could get rid of the (type a b . ....) syntax altogether have
>> clearer scoping rules.
>
> This is certainly doable, but in my opinion it would not replace the
> "type a b. ..." syntax, which is more readable.
>
>> (On my wishlist:   ... (type a) (type b) ... could be written ...
>> (type a b) ...)
>
> Easy to do.
> Actually I'm not sure why it wasn't defined that way to start with.
> Not enough System F-ish?
>
> Jacques Garrigue


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS?
  2012-03-23 10:25       ` Gabriel Scherer
@ 2012-03-24  0:24         ` Jacques Garrigue
  0 siblings, 0 replies; 11+ messages in thread
From: Jacques Garrigue @ 2012-03-24  0:24 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: caml users

Dear Gabriel,

I agree with you on the basics.
But maybe not on the importance of syntax :-)

On 2012/03/23, at 19:25, Gabriel Scherer wrote:

>> The point here is that the "type a b. ..." syntax is syntactic sugar.
>> Rather than putting efforts in making it "hygienic", at the cost of
>> mangling type variable names, isn't it simpler to define precisely
>> how it expands, and prohibit ambiguous types?
>> I would argue that this is actually simpler to explain it that way
>> (if you want to give an exact specification).
> 
> I agree that not having to deal with name shadowing makes the
> macro-expansion explanation simpler. Furthermore, treating this as an
> error instead of a warning also makes the implementation simpler,
> which is a point in favor of your change.
> 
> I'm not totally convinced by the idea that "type a b . ..." won't
> cause trouble to users because it is syntactic sugar:
> 
> 1. To explain it directly as syntactic sugar, you have to explain the
> users both the polymorphic recursion notation and the rigid variable
> abstraction (type a) (is this a good name? I like "System F
> quantification" and prefer to avoid mumbles like "variables that are
> type constructors rather than type variables because of implementation
> reasons"); that's reasonable if you're giving a lecture where you have
> the opportunity to introduce new concepts in your preferred order and
> at your pace, but may not be practical in the random context of
> someone asking "hey, what is this (type a b . ...) code doing?". In
> practice I expect most non-expert users to consider it as a new,
> rather than derived, concept.

Indeed, this understanding is basically fine.
You only need to understand that this is syntactic sugar when you want
to get back to the definition.

> 2. The syntax still suggests a semantic that is different from the
> actual one (ie. that the variables are bound locally to the type
> annotation), which could confuse users that don't even ask about it
> but simply assume it works somehow as they expect.

I do not agree there.
There is a syntax to bind type variables to annotations, which is

   'a 'b. ....

We intentionally introduced a new syntax, close to the (type a) notation,
to make clear that the scope extends to the right.

> I understand there are compromises at play here and am not too
> dissatisfied with this syntax, but still if it could be made
> unnecessary by making the (type a) one scale to polymorphic recursion,
> I think that would be even better.

I agree with the desire for minimality. But honestly, the notation

   let rec f (type a b) : a t -> a = ...

doesn't strike me as the right solution.
This problem with type annotations is congenital to ML, but for GADTs it
is essential to be able to write function types in an intuitive way.

>> It is of course possible to make it a warning (by doing two passes and generating
>> fresh names), but I don't see the point: you can always choose your names so
>> that no conflict happens, so why risk the confusion?
> 
> In my opinion, errors should be for semantic errors, and warnings for
> "things that look like mistakes but are actually meaningful". In
> particular, errors are part of the language specification, while
> warnings are not. Making this cosmetic choice of "discouraging
> shadowing" an error makes it part of the semantic of the construct,
> while leaving it as a warning give it equivalent visibility, but with
> less urge to make sure that everyone really understands the situation.
> I also believe making part of the language semantics makes it even
> more difficult to get a clear and exhaustive mental model of the
> status of type variables: we already have 'a, '_a and a, maybe we
> don't also need non-uniform rules about when name conflicts and when
> they don't.

I'm afraid there is no clear distinction between warning and errors.
I see at least 2 criteria.
* whether this is a question of programming style (then it can be a warning)
* whether this can be detected in a complete (then it can be an error)

To me the above is not a question of programming style, and can be detected
exhaustively, so it is more natural to have an error. But you are also right
that the definition requires to understand the syntactic sugar, which is not so
nice. On the other, we would have exactly the same problem with a warning.

> (However, there is the argument that making this an error now leaves
> the option open to degrade it into a warning, while the reverse
> direction would break compatibility. I can understand the benefit
> keeping it that way for a coming release, even if those decisions
> usually have the unfortunate effect of keeping those warts with us
> much longer than originally hoped.)

This is indeed a strong reason. I am not really happy with the current
situation, so it is best to keep all avenues open.

>> This is certainly doable, but in my opinion it would not replace the
>> "type a b. ..." syntax, which is more readable.
> 
> Frankly, I don't think the extremely meager readability benefits
> outweigh the need to explain yet another kind of type annotation to
> the code readers. Compare:
> 
>  let rec length : type a b. (a,b) l -> a = function
>  let rec length (type a b) : (a,b) l -> a = function

Sorry, but I buy the first one anytime.
This makes clear that we are writing the type of length, while in the second
one we are writing the return type of some function. Probably fine for Coq
users, but not really user friendly.

> And that's in the defavorable case of "function" which is friendly to
> function types annotations. In other situations you could replace the
> relatively awkward
>  : a b . foo -> bar -> baz = fun x y -> ....
> by something like
>  (type a b) (x : foo) (y : bar) : baz = ...
> in the cases where it is more readable -- you don't need to emphasize
> the whole function type.

Two problems here:
* this is even worse than before: variables and types are mingled in an
  utterly unreadable way. (I don't say that this should be prohibited, just
  that I don't think users should have to write things this way, which is
  unfortunately the ML way)
* we then need a specification of how much of the function is analyzed
  to determine the recursive type. Imagine when you mix that with optional
  arguments with defaults.

OK, this is a difficult problem, and I have no ideal solution.
Maybe we could support indeed your first syntax too:
	let rec length (type a b) : (a,b) l -> a = function

	Jacques

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-03-24  0:24 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-22  9:51 [Caml-list] Funny name for a type variable in the output: should this be filed on the BTS? Roberto Di Cosmo
2012-03-22 12:05 ` Julien Signoles
2012-03-22 13:39   ` Jonathan Protzenko
2012-03-22 14:40     ` Didier Remy
2012-03-22 22:12 ` Jacques Garrigue
2012-03-22 23:35   ` Roberto Di Cosmo
2012-03-23  5:52   ` Gabriel Scherer
2012-03-23  6:45     ` Jacques Garrigue
2012-03-23  8:13       ` Roberto Di Cosmo
2012-03-23 10:25       ` Gabriel Scherer
2012-03-24  0:24         ` Jacques Garrigue

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).