From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by sympa.inria.fr (Postfix) with ESMTPS id D14E67F249 for ; Tue, 30 Oct 2012 22:26:24 +0100 (CET) Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of gabriel.scherer@gmail.com) identity=pra; client-ip=209.85.210.182; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail1-smtp-roc.national.inria.fr: domain of gabriel.scherer@gmail.com designates 209.85.210.182 as permitted sender) identity=mailfrom; client-ip=209.85.210.182; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-ia0-f182.google.com) identity=helo; client-ip=209.85.210.182; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="postmaster@mail-ia0-f182.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArMBAAtFkFDRVdK2m2dsb2JhbAAqGrE5kiQIIwEBAQEBCAkLCRQngh4BAQEEEgITGQEbEgsBAwwGBQsDCg0hIQEBEQEFAQoSBhMSEIdRAQMPCy2cX2IJA4wwgWyBCoR0ChknAwpZiHUBBQyLBGeGXQOSQ4FdgVWBGooXgzAWKYQS X-IronPort-AV: E=Sophos;i="4.80,682,1344204000"; d="scan'208";a="179597046" Received: from mail-ia0-f182.google.com ([209.85.210.182]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 30 Oct 2012 22:26:23 +0100 Received: by mail-ia0-f182.google.com with SMTP id k10so919995iag.27 for ; Tue, 30 Oct 2012 14:26:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=vT90U2SLx1vgoC+HfEbrHqabHTM+QjoPReAvG7kC8xY=; b=hQlU52gqbQGYiVgXJUO5CouC5rdcgfsA9pVgMy4I++WKyyi7a2iahwLBlh9OAvRUVv cHwaNKTraBZ7Gv44EhOHbjLYxRBUZ8EQhAAlL9SoH7ibsKyVl0BuAWlQi8FgtSKbzWBg gmOIfc3kXucDIa7U1vDy8hxvi+7pG7oDOKJabQcBJXsXPxJAnadZGQ/YdGgtzRuV5TFd bKG8gU8a10QG9v+w60rNASFQIPs41UrGEyMZBvz2pyHoMzC1iqpnC7OrHCEzxSz0tJah b0NUv+a6aMhY4FNf6avoggbxLvwhkJIeKZcouRNbcuMaCEvAu+/pyFDa3TUrTESSthqK OjJg== Received: by 10.50.242.74 with SMTP id wo10mr2838054igc.51.1351632382427; Tue, 30 Oct 2012 14:26:22 -0700 (PDT) MIME-Version: 1.0 Received: by 10.50.65.9 with HTTP; Tue, 30 Oct 2012 14:25:42 -0700 (PDT) In-Reply-To: <50900466.2050000@gmail.com> References: <508F22BD.7010103@riken.jp> <026F32A8-2790-4CDD-A839-58655A8074CA@first.in-berlin.de> <508FE869.3070603@inria.fr> <508FFB12.9030307@gmail.com> <508FFE82.2050409@inria.fr> <50900466.2050000@gmail.com> From: Gabriel Scherer Date: Tue, 30 Oct 2012 22:25:42 +0100 Message-ID: To: Edgar Friendly Cc: caml-list@inria.fr Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Subject: Re: [Caml-list] Why should I use .mli files? Independently of this discussion, I have been discussed related issues with a friend for the last two weeks. I think there would be a feature proposal to make in this area that would be *relatively* natural and help solve the following different problems in this area: 1. Repeating types, module types and class types declarations in the .mli and the .ml is redundant and increases the maintenance burden 2. The semantics of type-checking the .ml as a whole, and then checking that it matches the .mli signature is sometimes inconvenient: checking values against their .mli counterpart could help detect typing mistakes sooner, which is helpful to get error messages that are easier to understand 3. Some users want to have annotations on values (at least at toplevel) for readability and the expression language is not optimal for this (in particular the fact that type variables do not enforce polymorphism) ## Proposal My suggestion would be to allow two features (the second depending on the first): 1. allow to insert signature items anywhere in a module structure or .ml file, giving a reasonable semantics to it (in particular allow inclusion of signature as a structure item, with the derived semantics) 2. have a syntax to designate the mli signature from the ml file (possible suggestion: "module type", used a signature expression) By combining the two features you could have something like a.mli type t =3D A | B of u type u val f : 'a -> 'a a.ml include (module type with type u =3D int) let f x =3D x But you could also write only a.ml: type t =3D A | B of u type u =3D int val f : 'a -> 'a let f x =3D x ## Corner cases and rejected extensions Regarding feature 1., there is indeed something to be decided about shadowing of values: val x : int let x =3D 0. let x =3D int_of_float x or: val x : float let x =3D 0. val x : int let x =3D 0 My suggestion would be, as a conservative measure, to reject all shadowing of an identifier that has been prototype with "val" -- as Edgar suggested. This can always be extended with time (eg. to accept the second example). It is also natural, in particular if you come from the world of C prototypes and predeclared types, to wish to accept something like type u type u =3D int (* refining semantics *) I think we should reject those because they have a muddier semantic (you could define what the least abstract "unifier" of two definitions would be, but let's net get down that road). The "S with type u =3D ..." (in particular when S is "module type") allows to refine type in a principle way without shadowing/redeclaration. Finally, the presence of these "val f : ..." ahead of the actual definition made me think of the well-known regret that OCaml compilation units are not very good at top-down programming: you have to define everything before use which enforces a bottom-up style which is sometimes not how you would like your code to be read. C prototypes allow top-down programming, but picking a right semantics without implicit mutual recursion is tricky. My feature idea willingly rejects this ambitious direction to stay simple and true. I would still be interested in thinking about this in the future. ## Known issues and competing proposals The main problem with this proposal is that the presence of the "module type" construction can only be explained in terms of the "superficial" semantics of OCaml compilation units, rather than as a "pure" feature of a typed lambda calculus. More precisely, I see no way to explain this semantics but saying "we include the signature of the .mli", which is admittedly an presentation detail of the OCaml compiler rather than a deep aspect of the OCaml language. "module type" could be explained in terms of the module sublanguage: module M : sig type t =3D int end =3D struct include (module type) let x : t =3D 1 end But as you see the context-dependent semantics is not very nice and it feels ad-hoc -- it is. I think this is the most serious problem with the proposal and the reason why I'm unsure it could ever be accepted by the language designers. Another criticism is that most of the benefits of this proposal can already be obtained with existing features: 1. one can define the relevant signatures in a "interfaces.ml" files reused in both the current .ml and the .mli, to avoid duplication 2. there is a rich annotation=B9 language for OCaml expressions patterns or modules that can express what would be independently expressed as a signature item =B9: while we're at it, I'm considering pushing for the syntax "fun pat pat ... pat : ty -> expr" as a synonym for "fun pat pat ... pat -> (expr : ty)", but the potential confusion between "fun x : int ->" and "fun (x : int) ->" is disheartening. There is another related proposal: Alain Frisch's suggestion to make compilation units *recursive* modules implicitly, rather than non-recursive modules. http://caml.inria.fr/mantis/print_bug_page.php?bug_id=3D5480 Using recursive modules induces a definitive amount of complexity, but is also more expressive in terms of allowing forward references (the top-to-bottom style aspect I mentioned earlier). I think a simpler, non-recursive proposition is a more reasonable choice. Alain described a compromise in use at Lexifi, where compilation units are *type-checked* are recursive modules but *compiled* as non-recursive modules. That's probably close in expressivity to this proposal (but harder, I think, to explain to the newcomer). On another level, the signature module of OCaml is not arbitrarily expressive, and there are some things that cannot be presently handled by signature manipulations and inclusion alone. For example, you cannot easily hide/remove values or types from a signature (destructive substitution "with t :=3D ..." partly does that for type). Heavy use of the proposed feature could run into these limitations. I think we should be ready to enrich the signature language in this case. On Tue, Oct 30, 2012 at 5:46 PM, Edgar Friendly wrot= e: > On 10/30/2012 12:21 PM, Romain Bardou wrote: >> >> Le 30/10/2012 17:06, Edgar Friendly a =E9crit : >>> >>> On 10/30/2012 10:47 AM, Romain Bardou wrote: >>>> >>>> Maybe ocaml should provide something like: >>>> >>>> include type t in sig >>>> >>> Is there any case where type declarations in the .mli file should not be >>> included as part of the .ml file? >> >> >> No, as it does not compile otherwise. However, maybe the user wants the >> declarations to be put in some specific order. >> >>>> Which would mean "copy-paste the definition of t from the .mli". >>>> >>> Why not have this be the default? i.e. when compiling a .ml file, if the >>> corresponding .mli file exists, its type declarations are in scope for >>> the .ml file, and its value declarations are applied to the >>> corresponding values in the .ml file. The only edge case I can think of >>> is when an identifier is bound multiple times in the .ml file, the type >>> from the .mli file currently only applies to the last binding, whereas >>> with this strategy, the type would most easily be implemented as >>> applying to all bindings of that identifier. >>> >>> E. >>> >> >> I'm not sure I understand what you mean, but here are some examples I >> worry about. >> >> 1) You have the following .mli: >> type t =3D A >> type u =3D A >> with the following implementation: >> let x =3D A >> Where should type t and type u be copied? In other words, should x be of >> type t or of type u? >> > They could be copied to the top of the .ml file in the order specified in > .mli file. > >> It may also happen that the user wants the order of t and u to be revers= ed >> in the implementation. >> >> 2) You have a declaration like this in the .mli: >> type t =3D private something >> Should it be copied as: >> type t =3D something >> or as: >> type t =3D private something >> I don't think the latter makes much sense most of the time, but it is a >> valid declaration. >> > Or the user could specify "type t =3D something" in the .ml file and over= ride > the .mli file's type. To do this, we'd need the concept of these "copied" > types being overrideable in the .ml file, but I think doing this would be > necessary for backwards compatibility, as all existing code would break > under the new .mli semantics. > > E. > > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs