From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gabriel.scherer@gmail.com>
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82])
	by sympa.inria.fr (Postfix) with ESMTPS id D14E67F249
	for <caml-list@sympa.inria.fr>; Tue, 30 Oct 2012 22:26:24 +0100 (CET)
Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  gabriel.scherer@gmail.com) identity=pra;
  client-ip=209.85.210.182;
  receiver=mail1-smtp-roc.national.inria.fr;
  envelope-from="gabriel.scherer@gmail.com";
  x-sender="gabriel.scherer@gmail.com";
  x-conformance=sidf_compatible
Received-SPF: Pass (mail1-smtp-roc.national.inria.fr: domain of
  gabriel.scherer@gmail.com designates 209.85.210.182 as
  permitted sender) identity=mailfrom;
  client-ip=209.85.210.182;
  receiver=mail1-smtp-roc.national.inria.fr;
  envelope-from="gabriel.scherer@gmail.com";
  x-sender="gabriel.scherer@gmail.com";
  x-conformance=sidf_compatible; x-record-type="v=spf1"
Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  postmaster@mail-ia0-f182.google.com) identity=helo;
  client-ip=209.85.210.182;
  receiver=mail1-smtp-roc.national.inria.fr;
  envelope-from="gabriel.scherer@gmail.com";
  x-sender="postmaster@mail-ia0-f182.google.com";
  x-conformance=sidf_compatible
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: ArMBAAtFkFDRVdK2m2dsb2JhbAAqGrE5kiQIIwEBAQEBCAkLCRQngh4BAQEEEgITGQEbEgsBAwwGBQsDCg0hIQEBEQEFAQoSBhMSEIdRAQMPCy2cX2IJA4wwgWyBCoR0ChknAwpZiHUBBQyLBGeGXQOSQ4FdgVWBGooXgzAWKYQS
X-IronPort-AV: E=Sophos;i="4.80,682,1344204000"; 
   d="scan'208";a="179597046"
Received: from mail-ia0-f182.google.com ([209.85.210.182])
  by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 30 Oct 2012 22:26:23 +0100
Received: by mail-ia0-f182.google.com with SMTP id k10so919995iag.27
        for <caml-list@inria.fr>; Tue, 30 Oct 2012 14:26:22 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=mime-version:in-reply-to:references:from:date:message-id:subject:to
         :cc:content-type:content-transfer-encoding;
        bh=vT90U2SLx1vgoC+HfEbrHqabHTM+QjoPReAvG7kC8xY=;
        b=hQlU52gqbQGYiVgXJUO5CouC5rdcgfsA9pVgMy4I++WKyyi7a2iahwLBlh9OAvRUVv
         cHwaNKTraBZ7Gv44EhOHbjLYxRBUZ8EQhAAlL9SoH7ibsKyVl0BuAWlQi8FgtSKbzWBg
         gmOIfc3kXucDIa7U1vDy8hxvi+7pG7oDOKJabQcBJXsXPxJAnadZGQ/YdGgtzRuV5TFd
         bKG8gU8a10QG9v+w60rNASFQIPs41UrGEyMZBvz2pyHoMzC1iqpnC7OrHCEzxSz0tJah
         b0NUv+a6aMhY4FNf6avoggbxLvwhkJIeKZcouRNbcuMaCEvAu+/pyFDa3TUrTESSthqK
         OjJg==
Received: by 10.50.242.74 with SMTP id wo10mr2838054igc.51.1351632382427; Tue,
 30 Oct 2012 14:26:22 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.50.65.9 with HTTP; Tue, 30 Oct 2012 14:25:42 -0700 (PDT)
In-Reply-To: <50900466.2050000@gmail.com>
References: <508F22BD.7010103@riken.jp> <CACX4useWCnu4tg2jAkSx=K=38BaHimgZWRFPF1Jyj5gw+YHAbg@mail.gmail.com>
 <026F32A8-2790-4CDD-A839-58655A8074CA@first.in-berlin.de> <508FE869.3070603@inria.fr>
 <508FFB12.9030307@gmail.com> <508FFE82.2050409@inria.fr> <50900466.2050000@gmail.com>
From: Gabriel Scherer <gabriel.scherer@gmail.com>
Date: Tue, 30 Oct 2012 22:25:42 +0100
Message-ID: <CAPFanBHcmSy3vAS8G+CHey1YZ-6kyYxxgtHJdkt40fmr54Fw1g@mail.gmail.com>
To: Edgar Friendly <thelema314@gmail.com>
Cc: caml-list@inria.fr
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Caml-list] Why should I use .mli files?

Independently of this discussion, I have been discussed related issues
with a friend for the last two weeks. I think there would be a feature
proposal to make in this area that would be *relatively* natural and
help solve the following different problems in this area:
1. Repeating types, module types and class types declarations in the
.mli and the .ml is redundant and increases the maintenance burden
2. The semantics of type-checking the .ml as a whole, and then
checking that it matches the .mli signature is sometimes inconvenient:
checking values against their .mli counterpart could help detect
typing mistakes sooner, which is helpful to get error messages that
are easier to understand
3. Some users want to have annotations on values (at least at
toplevel) for readability and the expression language is not optimal
for this (in particular the fact that type variables do not enforce
polymorphism)

## Proposal

My suggestion would be to allow two features (the second depending on
the first):
1. allow to insert signature items anywhere in a module structure or
.ml file, giving a reasonable semantics to it
    (in particular allow inclusion of signature as a structure item,
with the derived semantics)
2. have a syntax to designate the mli signature from the ml file
(possible suggestion: "module type", used a signature expression)

By combining the two features you could have something like

a.mli
  type t =3D A | B of u
  type u
  val f : 'a -> 'a

a.ml
  include (module type with type u =3D int)
  let f x =3D x

But you could also write only a.ml:
  type t =3D A | B of u
  type u =3D int
  val f : 'a -> 'a
  let f x =3D x

## Corner cases and rejected extensions

Regarding feature 1., there is indeed something to be decided about
shadowing of values:
  val x : int
  let x =3D 0.
  let x =3D int_of_float x
or:
  val x : float
  let x =3D 0.
  val x : int
  let x =3D 0

My suggestion would be, as a conservative measure, to reject all
shadowing of an identifier that has been prototype with "val" -- as
Edgar suggested. This can always be extended with time (eg. to accept
the second example).

It is also natural, in particular if you come from the world of C
prototypes and predeclared types, to wish to accept something like
  type u
  type u =3D int (* refining semantics *)
I think we should reject those because they have a muddier semantic
(you could define what the least abstract "unifier" of two definitions
would be, but let's net get down that road). The "S with type u =3D ..."
(in particular when S is "module type") allows to refine type in a
principle way without shadowing/redeclaration.

Finally, the presence of these "val f : ..." ahead of the actual
definition made me think of the well-known regret that OCaml
compilation units are not very good at top-down programming: you have
to define everything before use which enforces a bottom-up style which
is sometimes not how you would like your code to be read. C prototypes
allow top-down programming, but picking a right semantics without
implicit mutual recursion is tricky. My feature idea willingly rejects
this ambitious direction to stay simple and true. I would still be
interested in thinking about this in the future.


## Known issues and competing proposals

The main problem with this proposal is that the presence of the
"module type" construction can only be explained in terms of the
"superficial" semantics of OCaml compilation units, rather than as a
"pure" feature of a typed lambda calculus. More precisely, I see no
way to explain this semantics but saying "we include the signature of
the .mli", which is admittedly an presentation detail of the OCaml
compiler rather than a deep aspect of the OCaml language.

"module type" could be explained in terms of the module sublanguage:
module M : sig
   type t =3D int
end =3D struct
    include (module type)
    let x : t =3D 1
end

But as you see the context-dependent semantics is not very nice and it
feels ad-hoc -- it is. I think this is the most serious problem with
the proposal and the reason why I'm unsure it could ever be accepted
by the language designers.

Another criticism is that most of the benefits of this proposal can
already be obtained with existing features:
1. one can define the relevant signatures in a "interfaces.ml" files
reused in both the current .ml and the .mli, to avoid duplication
2. there is a rich annotation=B9 language for OCaml expressions patterns
or modules that can express what would be independently expressed as a
signature item

=B9: while we're at it, I'm considering pushing for the syntax "fun pat
pat ... pat : ty -> expr" as a synonym for "fun pat pat ... pat ->
(expr : ty)", but the potential confusion between "fun x : int ->" and
"fun (x : int) ->" is disheartening.

There is another related proposal: Alain Frisch's suggestion to make
compilation units *recursive* modules implicitly, rather than
non-recursive modules.
  http://caml.inria.fr/mantis/print_bug_page.php?bug_id=3D5480
Using recursive modules induces a definitive amount of complexity, but
is also more expressive in terms of allowing forward references (the
top-to-bottom style aspect I mentioned earlier). I think a simpler,
non-recursive proposition is a more reasonable choice. Alain described
a compromise in use at Lexifi, where compilation units are
*type-checked* are recursive modules but *compiled* as non-recursive
modules. That's probably close in expressivity to this proposal (but
harder, I think, to explain to the newcomer).

On another level, the signature module of OCaml is not arbitrarily
expressive, and there are some things that cannot be presently handled
by signature manipulations and inclusion alone. For example, you
cannot easily hide/remove values or types from a signature
(destructive substitution "with t :=3D ..." partly does that for type).
Heavy use of the proposed feature could run into these limitations. I
think we should be ready to enrich the signature language in this
case.


On Tue, Oct 30, 2012 at 5:46 PM, Edgar Friendly <thelema314@gmail.com> wrot=
e:
> On 10/30/2012 12:21 PM, Romain Bardou wrote:
>>
>> Le 30/10/2012 17:06, Edgar Friendly a =E9crit :
>>>
>>> On 10/30/2012 10:47 AM, Romain Bardou wrote:
>>>>
>>>> Maybe ocaml should provide something like:
>>>>
>>>> include type t in sig
>>>>
>>> Is there any case where type declarations in the .mli file should not be
>>> included as part of the .ml file?
>>
>>
>> No, as it does not compile otherwise. However, maybe the user wants the
>> declarations to be put in some specific order.
>>
>>>> Which would mean "copy-paste the definition of t from the .mli".
>>>>
>>> Why not have this be the default? i.e. when compiling a .ml file, if the
>>> corresponding .mli file exists, its type declarations are in scope for
>>> the .ml file, and its value declarations are applied to the
>>> corresponding values in the .ml file. The only edge case I can think of
>>> is when an identifier is bound multiple times in the .ml file, the type
>>> from the .mli file currently only applies to the last binding, whereas
>>> with this strategy, the type would most easily be implemented as
>>> applying to all bindings of that identifier.
>>>
>>> E.
>>>
>>
>> I'm not sure I understand what you mean, but here are some examples I
>> worry about.
>>
>> 1) You have the following .mli:
>>         type t =3D A
>>         type u =3D A
>> with the following implementation:
>>         let x =3D A
>> Where should type t and type u be copied? In other words, should x be of
>> type t or of type u?
>>
> They could be copied to the top of the .ml file in the order specified in
> .mli file.
>
>> It may also happen that the user wants the order of t and u to be revers=
ed
>> in the implementation.
>>
>> 2) You have a declaration like this in the .mli:
>>         type t =3D private something
>> Should it be copied as:
>>         type t =3D something
>> or as:
>>         type t =3D private something
>> I don't think the latter makes much sense most of the time, but it is a
>> valid declaration.
>>
> Or the user could specify "type t =3D something" in the .ml file and over=
ride
> the .mli file's type.  To do this, we'd need the concept of these "copied"
> types being overrideable in the .ml file, but I think doing this would be
> necessary for backwards compatibility, as all existing code would break
> under the new .mli semantics.
>
> E.
>
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs