caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Typing of patterns
@ 2000-06-01 13:23 Markus Mottl
  2000-06-05 13:56 ` Pierre Weis
  0 siblings, 1 reply; 8+ messages in thread
From: Markus Mottl @ 2000-06-01 13:23 UTC (permalink / raw)
  To: OCAML

Hello,

I would like to know something about the typing rules of identifiers that
are bound in pattern matching:

Let's consider this small example:

  module type FUNCTOR = sig
    type 'a t
    val map : ('a -> 'b) -> ('a t -> 'b t)
  end

  type 'a expr = Num of int | Add of 'a * 'a

  module ExprFun : FUNCTOR = struct
    type 'a t = 'a expr

    let map f = function
      | Num n as num -> num
      | Add (e, e') -> Add (f e, f e')
  end

This will lead to the (shortened) error message:

  Values do not match:
    val map : ('a -> 'a) -> 'a expr -> 'a expr
  is not included in
    val map : ('a -> 'b) -> 'a t -> 'b t

The problem arises in this line:

  | Num n as num -> num

This looks perfectly ok at first sight, but a closer look reveals that
"num" is not only bound to the value "Num n", it also has the exact type of
the left-hand side. Thus, the rhs will also be forced to have this type if
we use this pattern name.

To correct the problem, we can write:

  | Num n -> Num n

But isn't this a bit strange that an identifier cannot be used in a context
where replacing it syntactically with its bound value would be perfectly
ok?

Or has experience shown that using more general types in such cases leads
to more programming errors?

One can, of course, always explicitely restrict the type of an identifier,
but in the upper example we want to have the opposite, i.e. have it more
general. One side effect of this problem is that we cannot efficiently
return the value "as is": we have to construct it again, which may come
with a not insignficant performance penalty...

Are there any deeper insights behind this rationale? (The given code works
without problems if polymorphic variants are used instead).

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Typing of patterns
  2000-06-01 13:23 Typing of patterns Markus Mottl
@ 2000-06-05 13:56 ` Pierre Weis
  2000-06-05 15:29   ` Markus Mottl
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Pierre Weis @ 2000-06-05 13:56 UTC (permalink / raw)
  To: Markus Mottl; +Cc: caml-list

> I would like to know something about the typing rules of identifiers that
> are bound in pattern matching:

The typing rule specifies that identifiers bound in a pattern get
monomorphic types (or unifiable ``unknowns'', or type variables): it
means that all the type constraints found in the program for this
identifier accumulate on the type of the identifier.

[...]
> The problem arises in this line:
> 
>   | Num n as num -> num
[...]

> To correct the problem, we can write:
> 
>   | Num n -> Num n
> 
> But isn't this a bit strange that an identifier cannot be used in a context
> where replacing it syntactically with its bound value would be perfectly
> ok?

Yes it is strange, but this is exactly what ML polymorphism specifies:
polymorphism is introduced by the let-binding construct, and only
occurrences of let-bound identifiers can be used with different
types. Since your ``num'' above is not let-bound it is monomorphic and
constraints are accumulating.

In contrast, the Num constructor is considered let-bound (by the type
definition it is defined in), hence Num is assigned a fresh instance
of its type scheme, hence the more general typing of the function.

I first encountered an instance of your problem in a very simple
function, that roughly read as follows:

let ensure_nil = function
  | x :: l -> failwith "Not nil"
  | l -> l;;
val ensure_nil : 'a list -> 'a list = <fun>

Unfortunately here, the result has been assigned the same type as the
argument.  However, writing | [] -> [] instead of | l -> l lead us to
the more general 'a list -> 'b list typing:

let ensure_nil = function
  | x :: l -> failwith "Not nil"
  | [] -> [];;
val ensure_nil : 'a list -> 'b list = <fun>

As in your example, adding a ``as'' synonymous to the pattern argument
once more connects the input and the output type of the function:

let ensure_nil = function
  | x :: l -> failwith "Not nil"
  | [] as l -> l;;
val ensure_nil : 'a list -> 'a list = <fun>

> Or has experience shown that using more general types in such cases leads
> to more programming errors?

No problem in the exact situation you mentioned: num can be safely
generalized because there is nothing bound to its type variable
encapsulated into the value bound to num. More generally this should
be done for type variables of synonymous identifiers that are ``free''
in the type of the pattern that bind them: if the identifier id is
bound in the pattern pat and id has type t, then any type variable 'a
that appears in t can be generalized if it does not occur in the type
of any identifier of the pattern pat (intuitively 'a is free in the
sense that no constraints on identifiers appearing in pat can involve
'a).

For instance, in fun [] as l -> e, the type of l is 'a list, and it
can be generalized to all 'a. 'a list during the type-checking of e.
In contrast, in fun x :: l as mylist -> e,
the type 'a list of identifier mylist cannot be generalized, since 'a
is the type of x and also occurs in the type of l. 

> One can, of course, always explicitely restrict the type of an identifier,
> but in the upper example we want to have the opposite, i.e. have it more
> general. One side effect of this problem is that we cannot efficiently
> return the value "as is": we have to construct it again, which may come
> with a not insignficant performance penalty...

You're right, that's the main reason to try to solve this problem.

> Are there any deeper insights behind this rationale? (The given code works
> without problems if polymorphic variants are used instead).

We can observe the same kind of behaviour with polymorphic variants:

type 'a pliste = [`Nil | `Cons of 'a * 'a pliste];;

Connection between input and output:

let (ensure_nil : 'a pliste -> 'b pliste) = function
  | `Cons (x, l) -> failwith "Not nil"
  | l -> l;;
val ensure_nil : 'a pliste -> 'a pliste = <fun>

Deconnection with explicit copy:

let (ensure_nil : 'a pliste -> 'b pliste) = function
  | `Cons (x, l) -> failwith "Not nil"
  | `Nil -> `Nil;;
val ensure_nil : 'a bliste -> 'b bliste = <fun>

However, we get a strange difference in typing since, as you
mentioned, the explicit ``as clause'' does not set up a typing
connection between input and output :

let (ensure_nil : 'a pliste -> 'b pliste) = function
  | `Cons (x, l) -> failwith "Not nil"
  | `Nil as l -> l;;
val ensure_nil : 'a pliste -> 'b pliste = <fun>

Jacques may explain us if the above suggested generalization scheme is
used for identifiers bound in as clauses of patterns (and if not,
which scheme is used ?)...

Conclusion: this strongly suggests to generalize the typing of
synonymous identifiers in patterns, in order to obtain the same typing
assignments for pattern matching on ``closed'' polymorphic variants
and pattern matching on their (semantically equivalent) sum data types
counterparts.

Thank you for your interesting remark and question.

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Typing of patterns
  2000-06-05 13:56 ` Pierre Weis
@ 2000-06-05 15:29   ` Markus Mottl
  2000-06-06  0:55   ` Jacques Garrigue
  2000-06-06  1:10   ` Patrick M Doane
  2 siblings, 0 replies; 8+ messages in thread
From: Markus Mottl @ 2000-06-05 15:29 UTC (permalink / raw)
  To: Pierre Weis; +Cc: caml-list

On Mon, 05 Jun 2000, Pierre Weis wrote:
> > But isn't this a bit strange that an identifier cannot be used in a context
> > where replacing it syntactically with its bound value would be perfectly
> > ok?
> 
> Yes it is strange, but this is exactly what ML polymorphism specifies:
> polymorphism is introduced by the let-binding construct, and only
> occurrences of let-bound identifiers can be used with different
> types. Since your ``num'' above is not let-bound it is monomorphic and
> constraints are accumulating.

Ah - I think I see now: I was fooled by the syntactic "disguise" of the
problem! So this is similar to:

This does not work:

  fun id -> id 42, id "foo"

This works:

  let id x = x in id 42, id "foo"

> For instance, in fun [] as l -> e, the type of l is 'a list, and it
> can be generalized to all 'a. 'a list during the type-checking of e.
> In contrast, in fun x :: l as mylist -> e,
> the type 'a list of identifier mylist cannot be generalized, since 'a
> is the type of x and also occurs in the type of l. 

So the restriction is required to prevent problems with references, I
think, but those cannot occur if the type parameter is not used by any of
the parameters of the constructor, i.e. they are monomorphic (or there are
no parameters as in our case).

> Jacques may explain us if the above suggested generalization scheme is
> used for identifiers bound in as clauses of patterns (and if not,
> which scheme is used ?)...

I first thought it was some strange "special feature" of polymorphic
variants, but as it seems then, it is just that the corresponding typing
rule is obviously implemented differently...

> Conclusion: this strongly suggests to generalize the typing of
> synonymous identifiers in patterns, in order to obtain the same typing
> assignments for pattern matching on ``closed'' polymorphic variants
> and pattern matching on their (semantically equivalent) sum data types
> counterparts.
>
> Thank you for your interesting remark and question.

Thanks for your interesting explanations!

Since we are at it, there is another sometimes annoying type restriction,
this time with record updates, that comes to my mind, e.g.:

  type 'a t = { foo : 'a; bar : int };;

  let x = { foo = "foo"; bar = 3 };;

  let ok = { x with foo = "bla" };;
  let not_ok = { x with foo = 7 };;

Here the updated record "x" could have a less rigidly typed "foo"-field -
or should we rather say it has a "default" type if the field is not
updated? It can be a bit painful to do updates without such a
generalisation if there are many record names that one would have to
mention explicitely to create the wanted value as in:

  let now_ok = { foo = 7; bar = x.bar }

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Typing of patterns
  2000-06-05 13:56 ` Pierre Weis
  2000-06-05 15:29   ` Markus Mottl
@ 2000-06-06  0:55   ` Jacques Garrigue
  2000-06-06 15:32     ` Pierre Weis
  2000-06-06  1:10   ` Patrick M Doane
  2 siblings, 1 reply; 8+ messages in thread
From: Jacques Garrigue @ 2000-06-06  0:55 UTC (permalink / raw)
  To: Pierre.Weis; +Cc: mottl, caml-list

From: Pierre Weis <Pierre.Weis@inria.fr>

> We can observe the same kind of behaviour with polymorphic variants:
> 
> type 'a pliste = [`Nil | `Cons of 'a * 'a pliste];;
[...]
> However, we get a strange difference in typing since, as you
> mentioned, the explicit ``as clause'' does not set up a typing
> connection between input and output :
> 
> let (ensure_nil : 'a pliste -> 'b pliste) = function
>   | `Cons (x, l) -> failwith "Not nil"
>   | `Nil as l -> l;;
> val ensure_nil : 'a pliste -> 'b pliste = <fun>
> 
> Jacques may explain us if the above suggested generalization scheme is
> used for identifiers bound in as clauses of patterns (and if not,
> which scheme is used ?)...

Ok, with polymorphic variants the as clause has a special meaning:
typingwise it does the same thing as an explicit copy, but without the
operational cost involved.

However, the goal is a bit different, as you can can see if you don't
put type annotations on the output:

let (ensure_nil : 'a pliste -> 'b) = function
  | `Cons (x, l) -> failwith "Not nil"
  | `Nil as l -> l;;
val ensure_nil : 'a pliste -> [> `Nil] = <fun>

With polymorphic variants we know that the output does not contain
Cons! The disconnection between input and output variables is a mere
consequence of that: [> `Nil] knows nothing about the type parameter
of pliste.

This mechanism was intended as an implementation of type-based
dispatch (with polymorhic variants type and values coincide), and you
could say it is just an overloading of the as construct (which is OK
since it does not change its operational meaning).
>From the point of view of separation, it is also a bit ad-hoc, in that
it requires the variant pattern (or an or-pattern containing only
variants) to be immediately enclosed in the as clause, otherwise it
reverts to the usual type-sharing behaviour:

let f1 = function `A as x -> x;;
val f1 : [< `A] -> [> `A] = <fun>
let f2 = function (`A, `B) as x -> x;;
val f2 : ([< `A] as 'a) * ([< `B] as 'b) -> 'a * 'b = <fun>

> Conclusion: this strongly suggests to generalize the typing of
> synonymous identifiers in patterns, in order to obtain the same typing
> assignments for pattern matching on ``closed'' polymorphic variants
> and pattern matching on their (semantically equivalent) sum data types
> counterparts.

This is indeed possible. This just requires a bit of additional code in
the type checker. However sum data types do not allow type-based
dispatch, so this would really only be useful for separating type
parameters.
Also, this reminds me of another problem discussed last year in this
mailing list:
        # type 'a t = {key: int; data: 'a};;
        type 'a t = { key : int; data : 'a; } 
        # let v = {key = 1; data = 1};;    
        val v : int t = {key=1; data=1}
        # {v with data = true};;
        This expression has type int t but is here used with type bool t

If we separate type parameters in sums, then we will probably also
have to do it for products, isn't it ?
Symmetries can bring you a long way.

Cheers,

        Jacques
---------------------------------------------------------------------------
Jacques Garrigue      Kyoto University     garrigue at kurims.kyoto-u.ac.jp
		<A HREF=http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/>JG</A>




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Typing of patterns
  2000-06-05 13:56 ` Pierre Weis
  2000-06-05 15:29   ` Markus Mottl
  2000-06-06  0:55   ` Jacques Garrigue
@ 2000-06-06  1:10   ` Patrick M Doane
  2000-06-06  8:55     ` Jacques Garrigue
  2 siblings, 1 reply; 8+ messages in thread
From: Patrick M Doane @ 2000-06-06  1:10 UTC (permalink / raw)
  To: Pierre Weis; +Cc: Markus Mottl, caml-list

On Mon, 5 Jun 2000, Pierre Weis wrote:

> However, we get a strange difference in typing since, as you
> mentioned, the explicit ``as clause'' does not set up a typing
> connection between input and output :
> 
> let (ensure_nil : 'a pliste -> 'b pliste) = function
>   | `Cons (x, l) -> failwith "Not nil"
>   | `Nil as l -> l;;
> val ensure_nil : 'a pliste -> 'b pliste = <fun>
> 
> Jacques may explain us if the above suggested generalization scheme is
> used for identifiers bound in as clauses of patterns (and if not,
> which scheme is used ?)...

This discussion seems related to a recent typing difficulty I have had
with polymorphic variants.

I want to change the data associated to a particular constructor of a
variant and leave the others unmodified:

     # let f = function `A -> `A () | _ as t -> t;;
     Characters 41-42:
     This expression has type [< `A | ..] but is here used with type
       [> `A of unit]

Notice that the type for '_' includes `A when it obviously cannot. One
could include every possible expected type in place of '_' but that
removes one of the advantages of polymorphic variants: that the type can 
be refined and extended easily.


While further exploring this issue I managed to get an uncaught exception
from the type checker:

     # let f = function `A true -> () | `A 5 -> ();; 
     Uncaught exception: File "typing/parmmatch.ml", ...

Taking the conjuction of 'unit' and 'int' does seem to work however,

     # let f = function `A -> () | `A 5 -> ();;
     val f : [< `A of & int] -> unit = <fun>

but it prints the type without mentioning 'unit'.

Patrick Doane




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Typing of patterns
  2000-06-06  1:10   ` Patrick M Doane
@ 2000-06-06  8:55     ` Jacques Garrigue
  0 siblings, 0 replies; 8+ messages in thread
From: Jacques Garrigue @ 2000-06-06  8:55 UTC (permalink / raw)
  To: patrick; +Cc: caml-list

From: Patrick M Doane <patrick@watson.org>

> I want to change the data associated to a particular constructor of a
> variant and leave the others unmodified:
> 
>      # let f = function `A -> `A () | _ as t -> t;;
>      Characters 41-42:
>      This expression has type [< `A | ..] but is here used with type
>        [> `A of unit]
> 
> Notice that the type for '_' includes `A when it obviously cannot. One
> could include every possible expected type in place of '_' but that
> removes one of the advantages of polymorphic variants: that the type can 
> be refined and extended easily.

However, this is what you will have to do :-)
I know that it is theoretically possible to build a type system
where t is given a type excluding `A, but this would require exposing
the row variable, which we have been carefully avoiding until now.

As for the ease of definition, you can use type-patterns:

type others = [ you other cases ]
let f = function `A -> `A () | #others as t -> t

I believe this should be enough in general.

> While further exploring this issue I managed to get an uncaught exception
> from the type checker:
> 
>      # let f = function `A true -> () | `A 5 -> ();; 
>      Uncaught exception: File "typing/parmmatch.ml", ...

This is a known bug, the current version of the typechecker finds an
error here,

> Taking the conjuction of 'unit' and 'int' does seem to work however,
> 
>      # let f = function `A -> () | `A 5 -> ();;
>      val f : [< `A of & int] -> unit = <fun>

and also here (basically conflicting types are not allowed in the same
pattern-matching).

        Jacques




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Typing of patterns
  2000-06-06  0:55   ` Jacques Garrigue
@ 2000-06-06 15:32     ` Pierre Weis
  0 siblings, 0 replies; 8+ messages in thread
From: Pierre Weis @ 2000-06-06 15:32 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

[...]
> However, the goal is a bit different, as you can can see if you don't
> put type annotations on the output:
> 
> let (ensure_nil : 'a pliste -> 'b) = function
>   | `Cons (x, l) -> failwith "Not nil"
>   | `Nil as l -> l;;
> val ensure_nil : 'a pliste -> [> `Nil] = <fun>
> 
> With polymorphic variants we know that the output does not contain
> Cons! The disconnection between input and output variables is a mere
> consequence of that: [> `Nil] knows nothing about the type parameter
> of pliste.

Marvellous.

> This mechanism was intended as an implementation of type-based
> dispatch (with polymorhic variants type and values coincide), and you
> could say it is just an overloading of the as construct (which is OK
> since it does not change its operational meaning).
> >From the point of view of separation, it is also a bit ad-hoc, in that
> it requires the variant pattern (or an or-pattern containing only
> variants) to be immediately enclosed in the as clause, otherwise it
> reverts to the usual type-sharing behaviour:
> 
> let f1 = function `A as x -> x;;
> val f1 : [< `A] -> [> `A] = <fun>
> let f2 = function (`A, `B) as x -> x;;
> val f2 : ([< `A] as 'a) * ([< `B] as 'b) -> 'a * 'b = <fun>

Yes this is a bit strange. You may use the rule I gave in my previous
message : generalize type variables that does not appear in the type
of identifiers bound in sub-patterns.

> > Conclusion: this strongly suggests to generalize the typing of
> > synonymous identifiers in patterns, in order to obtain the same typing
> > assignments for pattern matching on ``closed'' polymorphic variants
> > and pattern matching on their (semantically equivalent) sum data types
> > counterparts.
> 
> This is indeed possible. This just requires a bit of additional code in
> the type checker. However sum data types do not allow type-based
> dispatch, so this would really only be useful for separating type
> parameters.

Yes and this is indeed desirable.

> Also, this reminds me of another problem discussed last year in this
> mailing list:
>         # type 'a t = {key: int; data: 'a};;
>         type 'a t = { key : int; data : 'a; } 
>         # let v = {key = 1; data = 1};;    
>         val v : int t = {key=1; data=1}
>         # {v with data = true};;
>         This expression has type int t but is here used with type bool t
> 
> If we separate type parameters in sums, then we will probably also
> have to do it for products, isn't it ?

Sure.

> Symmetries can bring you a long way.

Yes, and this is an easy way for the user of the language.

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Typing of patterns
@ 2000-06-06  7:24 Pierre Weis
  0 siblings, 0 replies; 8+ messages in thread
From: Pierre Weis @ 2000-06-06  7:24 UTC (permalink / raw)
  To: caml-list

[...]
> Ah - I think I see now: I was fooled by the syntactic "disguise" of the
> problem! So this is similar to:
> 
> This does not work:
> 
>   fun id -> id 42, id "foo"
> 
> This works:
> 
>   let id x = x in id 42, id "foo"

Exactly.

[...]
> So the restriction is required to prevent problems with references, I
> think, but those cannot occur if the type parameter is not used by any of
> the parameters of the constructor, i.e. they are monomorphic (or there are
> no parameters as in our case).

There is nothing special with references here: restriction for
references is just for generalization of types in let binding (roughly
speaking, when typing let x = e, x is monomorphic if e is an
application of the form f y). Here the restriction is just to be
correct: fun x -> x should have type 'a -> 'a, not 'a -> 'b; however
in case of complex pattern matching the rule is a bit too restrictive,
and can be relaxed in some cases as we saw.

> > Jacques may explain us if the above suggested generalization scheme is
> > used for identifiers bound in as clauses of patterns (and if not,
> > which scheme is used ?)...
> 
> I first thought it was some strange "special feature" of polymorphic
> variants, but as it seems then, it is just that the corresponding typing
> rule is obviously implemented differently...

I don't know if it is an implementation of the relaxed type-checking of
as identifiers or a special feature... Jacques will tell us...

[...]

> Since we are at it, there is another sometimes annoying type restriction,
> this time with record updates, that comes to my mind, e.g.:
> 
>   type 'a t = { foo : 'a; bar : int };;
> 
>   let x = { foo = "foo"; bar = 3 };;
> 
>   let ok = { x with foo = "bla" };;
>   let not_ok = { x with foo = 7 };;
> 
> Here the updated record "x" could have a less rigidly typed "foo"-field -
> or should we rather say it has a "default" type if the field is not
> updated? It can be a bit painful to do updates without such a
> generalisation if there are many record names that one would have to
> mention explicitely to create the wanted value as in:
> 
>   let now_ok = { foo = 7; bar = x.bar }

In some sense the problem is similar, since it is a problem of type
sharing. However it is a bit simpler in this case:

{ x with foo = "bla" } should be simply treated as a macro and typechecked
exactly as its equivalent unsuggared expression :

{foo = "bla"; bar = x.bar}

This would be more general and regular.

Best regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/




^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2000-06-06 15:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-06-01 13:23 Typing of patterns Markus Mottl
2000-06-05 13:56 ` Pierre Weis
2000-06-05 15:29   ` Markus Mottl
2000-06-06  0:55   ` Jacques Garrigue
2000-06-06 15:32     ` Pierre Weis
2000-06-06  1:10   ` Patrick M Doane
2000-06-06  8:55     ` Jacques Garrigue
2000-06-06  7:24 Pierre Weis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).