Re: automatic construction of mli files

caml-list - the Caml user's mailing list

* Re: automatic construction of mli files
@ 2000-07-27 19:04 Damien Doligez
  0 siblings, 0 replies; 16+ messages in thread
From: Damien Doligez @ 2000-07-27 19:04 UTC (permalink / raw)
  To: frouaix; +Cc: caml-list

>From: Francois Rouaix <frouaix@liquidmarket.com>

>This has to be one of the most cryptic comments ever made to this list.
>And a "rather complex issue" coming from Damien, the mind boggles, especially 
>on this mysterious 8% figure.

>Care to give some details ?

OK.  It has to do with examining the roots at the beginning of each
minor collection.  The global variables are roots, but each global
variable is assigned only once, so we only need to examine it once
(after that, the value will be in the major heap, so it is not a root
for the minor collector).

We do it by remembering which modules have executed some
initialisation code since the last collection, and only examining
their globals.
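
To make the bookkeeping concrete, here is a toy model of the idea. It is
only a sketch and is not the runtime's actual code: the record type, the
`pending` list and the function names are all assumptions made up for
illustration, and "scanning" is reduced to counting globals.

```ocaml
(* Toy model only; this is NOT the OCaml runtime's actual code.
   All names below are hypothetical. *)
type modul = { name : string; globals : int list }

(* Modules that have run initialisation code since the last minor collection. *)
let pending : modul list ref = ref []

(* Total number of globals examined so far. *)
let scanned = ref 0

let module_initialised m = pending := m :: !pending

(* Examine only the globals of the pending modules, then forget them:
   after the collection their values are assumed to live in the major heap. *)
let minor_collection () =
  List.iter
    (fun m ->
      Printf.printf "scanning globals of %s\n" m.name;
      scanned := !scanned + List.length m.globals)
    !pending;
  pending := []

let () =
  module_initialised { name = "A"; globals = [ 1; 2; 3 ] };
  module_initialised { name = "B"; globals = [ 4 ] };
  minor_collection ();  (* examines the globals of A and B *)
  minor_collection ()   (* nothing is pending, so no globals are examined *)
```

In this model the second collection does no root-scanning work for A and
B, which is the saving described above.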

The 8% figure comes from the speedup on Coq that we got when we
implemented the trick.

Actually, now that you force me to remember the complex part, I want
to take back my comment.  It's the fact that only the exported symbols
are roots, so using .mli files will speed up the first garbage
collections, and only by a small amount.  Using an empty .mli
file for your main module (the one that's linked last) will speed up
all garbage collections (because the initialisation of that module is
only complete when the program stops running), again by a very small
amount.

I have to apologize for not checking my facts before I posted to the
list.

Oh, and this applies only to programs compiled with ocamlopt.

-- Damien


* Re: automatic construction of mli files
  2000-08-01 11:22   ` Anton Moscal
  2000-08-02 12:03     ` Dmitri Lomov
@ 2000-08-02 14:13     ` Gerard Huet
  1 sibling, 0 replies; 16+ messages in thread
From: Gerard Huet @ 2000-08-02 14:13 UTC (permalink / raw)
  To: dsl, caml-list

At 16:03 02/08/00 +0400, Dmitri Lomov wrote:

>There are some things that I DO NOT like about OCaml .mli/.ml files, however.
>1) I have to duplicate type definitions in .mli and in .ml files.
>   I think every ocamler knows how tedious keeping consistency is here.
>   Most of the time, you just change the definition in one of the files,
>   compile, get an error message, and (with curses ;)) copy the definition to
>   the other file. I suppose the compiler can do this work for the user.
>   Maybe there is some hidden philosophy behind this... I will be happy
>   if someone explains it to me

Just to add my 2 cents to this rambling discussion on interfaces. I used to
gripe myself about this duplication problem. But one day I realised that a
lot of this duplication could be avoided, because often the interfaces can
be designed as pieces of data structures definitions (DTDs), which consist
entirely of type declarations AND WHICH DO NOT NEED AN IMPLEMENTATION FILE
AT ALL, and pieces of method/exceptions descriptions, for which a .mli and
a .ml file share some redundancy, but where the programmer has the
opportunity to think carefully about what is visible from outside and at
what level of abstraction. 

Furthermore it is essential that we provide the possibility of designing
the interface BEFORE the code gets written, in order to do team work and
top-down design. For a large project, dummy implementations will typically
give a debugging environment to test your module, but here we want to check
the type consistency of these debugging dummies with respect to the master
.mli, as opposed to generating the .mli specs from them.

I am currently working on a linguistic data base, where the data base
format is a BIG .mli describing the abstract syntax of my data base,
thinking about it as a DTD. This format is used as interface both to the
parser which reads in the data, and to the various processors, extractors,
and printers of the data base. This .mli file has no implementation
associated with it, and thus there is not any redundancy in terms of source
code. 
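
A minimal sketch of this "types only" style follows. In a real project the
declarations inside `Syntax` would sit alone in a file such as syntax.mli
with no syntax.ml next to it; they are inlined as a module here only so the
snippet is a single file. The entry format, the type names and the two
functions are assumptions invented for illustration, not the actual data
base format.

```ocaml
(* Stand-in for a types-only syntax.mli (hypothetical names throughout). *)
module Syntax = struct
  type category = Noun | Verb
  type entry = { lemma : string; category : category }
end

(* A "parser" that reads the data in; it depends only on the types. *)
let parse_entry (line : string) : Syntax.entry option =
  match String.split_on_char ':' line with
  | [ lemma; "n" ] -> Some { Syntax.lemma = lemma; category = Syntax.Noun }
  | [ lemma; "v" ] -> Some { Syntax.lemma = lemma; category = Syntax.Verb }
  | _ -> None

(* A "printer" sharing the same types, with no other link to the parser. *)
let print_entry (e : Syntax.entry) : string =
  let cat =
    match e.Syntax.category with
    | Syntax.Noun -> "noun"
    | Syntax.Verb -> "verb"
  in
  Printf.sprintf "%s (%s)" e.Syntax.lemma cat
```

Both functions agree on the data format solely through the shared type
declarations, so nothing is written twice.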

Of course, it is useful to have the facility to get the .mli from the .ml
for quick-and-dirty bottom-up programming. But as your program gets large,
and your modules become parametric functors, you really want to keep
separate your interface specs and your code, and the redundancy is slight.  

Gérard


* Re: automatic construction of mli files
  2000-08-01 11:22   ` Anton Moscal
@ 2000-08-02 12:03     ` Dmitri Lomov
  2000-08-02 14:13     ` Gerard Huet
  1 sibling, 0 replies; 16+ messages in thread
From: Dmitri Lomov @ 2000-08-02 12:03 UTC (permalink / raw)
  To: caml-list

I have a lot of experience with Java, where no
explicit signatures are present.
'public' in Java is like the EXPORT keyword that was proposed.

This makes any Java class a total mess of interface and
implementation. If you look at Java class you cannot figure out
the interface, because implementation pokes out everywhere.

Usually, if I need to understand some Java class, I have to
javadoc it, and then look at the resulting html. IMHO, that's awful.
Even C++ headers are better than this.

From my point of view, OCaml .ml/.mli treatment is very good.

There are some things that I DO NOT like about OCaml .mli/.ml files, however.
1) I have to duplicate type definitions in .mli and in .ml files.
   I think every ocamler knows how tedious keeping consistency is here.
   Most of the time, you just change the definition in one of the files,
   compile, get an error message, and (with curses ;)) copy the definition to
   the other file. I suppose the compiler can do this work for the user.
   Maybe there is some hidden philosophy behind this... I will be happy
   if someone explains it to me

2) Just one little thing I could not figure how to do... 
   Suppose I have a module, called Eval, implementation of which
   differs for bytecode and native-code generation.
   What I wanted to do was:
      - create an eval.mli in the root directory of my project
      - create bytecode/eval.ml: bytecode implementation of eval.mli
      - create i386/eval.ml: implementation of eval.mli for i386 
      - create alpha/eval.ml: implementation of eval.mli for Alpha &c
      - compile versions of my program as follows:
          ocamlc -o program bytecode/eval.cmo program.ml
          ocamlopt -o program.opt i386/eval.cmx program.ml

      You see here that inside program.ml I need not know which version I
      compile (for bytecode or for i386 or for Alpha)

   But this trick didn't work, because I was unable to compile eval.cmo and eval.cmx
   When I did (in bytecode/)
      ocamlc -c -I .. eval.ml
   I got eval.cmi and eval.cmo (so ocamlc did not look for eval.cmi in the root directory).
   Can anybody suggest how to do this trick with OCaml?

Regards,
Dmitri

Anton Moscal wrote:
> 
> On Tue, 25 Jul 2000, Jacques Garrigue wrote:
> 
> > Not to say that the current situation is perfect. The fact you have to
> > duplicate all type definitions is not so nice for instance. But for
> > people used to the .ml/.mli dichotomy, having both kinds of information
> > united in a single file does not seem very attractive.
> 
> I think SML `local' is a good alternative to interfaces in many
> cases: `local' allows hiding definitions from the module interface
> without explicit signature specification.
> 
> Regards,
> Anton Moscal

-- 
_________________________________________________________________
Dmitri S. Lomov
mailto:dsl@tepkom.ru    ICQ#: 20524819 (Rusty)
+7 (812) 428-46-57 (b)   +7 (812) 295-94-15 (h)


* Re: automatic construction of mli files
  2000-07-25  1:13 ` Jacques Garrigue
@ 2000-08-01 11:22   ` Anton Moscal
  2000-08-02 12:03     ` Dmitri Lomov
  2000-08-02 14:13     ` Gerard Huet
  0 siblings, 2 replies; 16+ messages in thread
From: Anton Moscal @ 2000-08-01 11:22 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

On Tue, 25 Jul 2000, Jacques Garrigue wrote:

> Not to say that the current situation is perfect. The fact you have to
> duplicate all type definitions is not so nice for instance. But for
> people used to the .ml/.mli dichotomy, having both kinds of information
> united in a single file does not seem very attractive.

I think SML `local' is a good alternative to interfaces in many
cases: `local' allows hiding definitions from the module interface
without explicit signature specification.
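
For comparison, a similar effect can be had in OCaml itself for simple
cases by scoping a helper with `let ... in`. This is only a sketch of the
idea, not a full equivalent of SML's `local`; the function names are
made up for illustration.

```ocaml
(* [square] is visible only inside the definition of [sum_of_squares];
   it never appears in the inferred interface of the enclosing module. *)
let sum_of_squares =
  let square x = x * x in
  fun xs -> List.fold_left (fun acc x -> acc + square x) 0 xs

let () = Printf.printf "%d\n" (sum_of_squares [ 1; 2; 3 ])
```

Only `sum_of_squares` is exported, and no signature had to be written to
hide the helper.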

Regards, 
Anton Moscal




* Re: automatic construction of mli files
  2000-07-26 12:58 Damien Doligez
@ 2000-07-27 17:46 ` Francois Rouaix
  0 siblings, 0 replies; 16+ messages in thread
From: Francois Rouaix @ 2000-07-27 17:46 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list

> 2.  Due to rather complex implementation issues, if you don't use .mli
>     files and let the compiler generate the .cmi from the .ml, then
>     garbage collection will be slightly slower.  If you do it for all
>     your files, you might lose as much as 8% on the speed of your
>     program.
> -- Damien

This has to be one of the most cryptic comments ever made to this list.
And a "rather complex issue" coming from Damien, the mind boggles, especially 
on this mysterious 8% figure.

Care to give some details ?

--f




* Re: automatic construction of mli files
  2000-07-24 22:02 ` Jean-Christophe Filliatre
@ 2000-07-26 16:09   ` John Max Skaller
  0 siblings, 0 replies; 16+ messages in thread
From: John Max Skaller @ 2000-07-26 16:09 UTC (permalink / raw)
  To: Jean-Christophe Filliatre; +Cc: Julian Assange, caml-list

Jean-Christophe Filliatre wrote:

[discussion of programming techniques]

> But writing  an interface is always  a good idea,  since it encourages
> you to  abstract data types and  to provide only  small and orthogonal
> sets of functions, which cannot be done automatically in most cases.

On the other hand, writing the interface _first_ is not always 
appropriate when you want to rapidly prototype experimental code
where you don't actually know what the interface is going to turn
out to be until you experiment.

I often do this, _then_ write an interface, then reimplement
the code cleanly according to this interface. Using this methodology,
I often write the implementation first -- an 'abstraction' of the
interface being in my head:  an easier medium to 'edit' rapidly 
than a text file :-)

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net


* Re: automatic construction of mli files
  2000-07-24 20:48 ` Olivier Andrieu
@ 2000-07-26 16:03   ` John Max Skaller
  0 siblings, 0 replies; 16+ messages in thread
From: John Max Skaller @ 2000-07-26 16:03 UTC (permalink / raw)
  To: andrieu; +Cc: caml-list

Olivier Andrieu wrote:
> 
> Julian Assange wrote:
> >
> > .mli files are highly redundant. Almost without exception all, or at least
> > the vast majority of .mli information can be generated from the
> > underlying .ml implementation. We have programming languages to reduce
> > redundancy, not increase it. Keeping mli and ml files in-sync is not
> > only a waste of time, but error-prone and from my survey often not
> > performed correctly, particularly where consistency is not enforced by
> > the compiler (e.g comments describing functions and types). While
> > exactly the same problem exists in a number of other
> > separate-compilation language implementations, we, as camlers, should
> > strive for something better.
> 
> Urmpf. Well, it's still about type abstraction and name hiding isn't it
> ? The compiler can't decide for you what you want to keep in the
> interface and what should be hidden.

	Perhaps not, but sometimes I think it might be easier if the
interface specification was consolidated with the implementation.
When no .mli file is specified, ocaml generates an interface
-- with all symbols becoming public -- anyhow.
 
> But it's quite easy to generate an mli file with everything in it :
>  ocamlc -i -c mycode.ml > mycode.mli

	The problem is that subsequent refinements are lost
when the implementation file is extended to include symbols:
now you're committed to hand management of the .mli file.
But if you stick with the generated .mli file
the same interface is generated if the file
is simply omitted. So the main use is to make the interface
more compact by physically hiding hidden implementation details.

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net




* Re: automatic construction of mli files
@ 2000-07-26 12:58 Damien Doligez
  2000-07-27 17:46 ` Francois Rouaix
  0 siblings, 1 reply; 16+ messages in thread
From: Damien Doligez @ 2000-07-26 12:58 UTC (permalink / raw)
  To: caml-list

>From: Jean-Christophe Filliatre <filliatr@csl.sri.com>

>In the  extreme situation where there  is no real need  for writing an
>interface, you can either simply not write one (this is not mandatory)
>or generate it from the code with "ocamlc -c -i".

There are two technical details you should all know concerning .mli
files:

1.  If you don't use .mli files, or if you generate them automatically
    from the corresponding .ml files, then you lose separate
    compilation: whenever you change a semicolon in foo.ml, all
    the files that depend on module Foo will have to be recompiled.
    This may or may not be a big problem depending on the size of your
    project.

2.  Due to rather complex implementation issues, if you don't use .mli
    files and let the compiler generate the .cmi from the .ml, then
    garbage collection will be slightly slower.  If you do it for all
    your files, you might lose as much as 8% on the speed of your
    program.

-- Damien




* Re: automatic construction of mli files
  2000-07-24  5:34 Julian Assange
                   ` (5 preceding siblings ...)
  2000-07-25 11:48 ` Hendrik Tews
@ 2000-07-26 10:16 ` David Delahaye
  6 siblings, 0 replies; 16+ messages in thread
From: David Delahaye @ 2000-07-26 10:16 UTC (permalink / raw)
  To: Julian Assange; +Cc: caml-list

> .mli files are highly redundant. Almost without exception all, or at least
> the vast majority of .mli information can be generated from the
> underlying .ml implementation. We have programming languages to reduce
> redundancy, not increase it. Keeping mli and ml files in-sync is not
> only a waste of time, but error-prone and from my survey often not
> performed correctly, particularly where consistency is not enforced by
> the compiler (e.g comments describing functions and types). While
> exactly the same problem exists in a number of other
> separate-compilation language implementations, we, as camlers, should
> strive for something better.

    I don't think that .mli files are redundant. They essentially contain type
information and, in a strongly typed system, it is quite relevant and
especially useful. Moreover, with .mli files, you can build abstract data types,
which are, in general, widely used. So, keeping .mli and .ml files in sync is
not a waste of time but a way to ensure that your .ml file is an implementation
of your specification. If you have errors during this verification then it
could be explained by (semantic) errors in your code that you wouldn't have seen
without this check. Of course, it could also be due to the .mli file, which must
be changed but, again, it is not a waste of time and it should have been
done before changing the implementation. Personally, I don't like .ml files
without .mli because you don't have any type information, and even if I want to
export everything in an .ml file, I generate the .mli automatically.

    Regards.

    David.

===============================================================================
David Delahaye                                 <Email>: David.Delahaye@inria.fr
<Laboratory>: The Coq Project                                  <Domain>: Proofs
<Adress>: INRIA-Rocquencourt Domaine de Voluceau BP105 78153 Le Chesnay Cedex
          FRANCE
<Tel>: (33)-(0)1 39 63 57 53
<Fax>: (33)-(0)1 39 63 56 84
<Url>: http://pauillac.inria.fr/~delahaye
===============================================================================

[If you have time to waste, you can have a look on my proof that 2 = 1. We know
 that, for -1 < x <= 1:

   ln(1 + x) = x - 1/2(x^2) + 1/3(x^3) - 1/4(x^4) + ...

 Let x = 1:

   ln(2) = 1 - 1/2 + 1/3 - 1/4 + 1/5 - 1/6 + 1/7 - 1/8 + 1/9 - ...

 Let multiply the two members by 2:

   2ln(2) = 2 - 2/2 + 2/3 - 2/4 + 2/5 - 2/6 + 2/7 - 2/8 + 2/9 - ...
   2ln(2) = 2 - 1 + 2/3 - 1/2 + 2/5 - 1/3 + 2/7 - 1/4 + 2/9 - ...

 Let sum the terms with the same denominator:

   2ln(2) = 1 - 1/2 + 1/3 - 1/4 + 1/5 - 1/6 + 1/7 - ...
   2ln(2) = ln(2)

 Finally, 2 = 1.]




* Re: automatic construction of mli files
  2000-07-24  5:34 Julian Assange
                   ` (4 preceding siblings ...)
  2000-07-25  1:13 ` Jacques Garrigue
@ 2000-07-25 11:48 ` Hendrik Tews
  2000-07-26 10:16 ` David Delahaye
  6 siblings, 0 replies; 16+ messages in thread
From: Hendrik Tews @ 2000-07-25 11:48 UTC (permalink / raw)
  To: caml-list

Julian Assange writes:
   From: Julian Assange <proff@iq.org>
   Date: 24 Jul 2000 15:34:22 +1000
   Subject: automatic construction of mli files

   .mli files are highly redundant. Almost without exception all, or at least
   the vast majority of .mli information can be generated from the
   underlying .ml implementation. We have programming languages to reduce
   redundancy, not increase it. Keeping mli and ml files in-sync is not
   only a waste of time, but error-prone and from my survey often not
   performed correctly, particularly where consistency is not enforced by
   the compiler (e.g comments describing functions and types). While
   exactly the same problem exists in a number of other
   separate-compilation language implementations, we, as camlers, should
   strive for something better.

I don't share your point of view. We have a big project (>100
files, approx 60000 lines of ocaml code). And the policy
regarding mli files is quite simple: If the mli file could be
generated from the source code, then, don't write one! The ocaml
compiler (together with ocamldep) will recognize its absence and
compile the ml file into both a cmo and a cmi file. 

If, on the other side, you really want to hide some type
information, then you provide an mli file and in this case it
cannot be generated from the ml file. In our project we have 26
mli files for 63 ml files. The rest is generated by the compiler.

Moreover I don't think redundancy is that bad. Redundancy helps
you to catch errors. I often write mli files even if they contain
only the information that would be inferred by the compiler
automatically. But this redundant mli file helps me to catch
programming errors.

Bye,

Hendrik


* Re: automatic construction of mli files
  2000-07-24  5:34 Julian Assange
                   ` (3 preceding siblings ...)
  2000-07-24 23:14 ` David Brown
@ 2000-07-25  1:13 ` Jacques Garrigue
  2000-08-01 11:22   ` Anton Moscal
  2000-07-25 11:48 ` Hendrik Tews
  2000-07-26 10:16 ` David Delahaye
  6 siblings, 1 reply; 16+ messages in thread
From: Jacques Garrigue @ 2000-07-25  1:13 UTC (permalink / raw)
  To: proff; +Cc: caml-list

From: Julian Assange <proff@iq.org>

> .mli files are highly redundant. Almost without exception all, or at least
> the vast majority of .mli information can be generated from the
> underlying .ml implementation. We have programming languages to reduce
> redundancy, not increase it. Keeping mli and ml files in-sync is not
> only a waste of time, but error-prone and from my survey often not
> performed correctly, particularly where consistency is not enforced by
> the compiler (e.g comments describing functions and types). While
> exactly the same problem exists in a number of other
> separate-compilation language implementations, we, as camlers, should
> strive for something better.

I know the argument against .h files.
However I'm not sure it applies to .mli, which in my view have a role
rather different from C headers.

One first point is that they are optional. If you don't like the
burden of writing them (which is much simplified by the -i option of
the compiler), you can just avoid them. A tool like ocamlbrowser still
allows you to browse the contents of the .cmi file, thus giving access
to all type information. The only capacity you lose is export
control, but in many cases this doesn't really matter anyway.

So why do people write .mli files ? Just because it is nice to cleanly
separate interface and implementation. This goes completely against
some ideas in the OO community, particularly concerned by the close
relationship between specification and implementation, but thanks to a
more powerful type system, a larger part of the specification is
verifiable. A verifiable redundancy can be profitable, in fact all the
ML type system is about getting enough redundancy to your code to
detect errors.

Not to say that the current situation is perfect. The fact you have to
duplicate all type definitions is not so nice for instance. But for
people used to the .ml/.mli dichotomy, having both kinds of information
united in a single file does not seem very attractive.

Practically, what I usually do when I start a new project in Caml is
start without .mli files, until my project gets structured enough so
that I know what needs to be exported from each module. Then I
generate .mli's with the -i option of the compiler, trim definitions
to keep only the ones I need, and eventually add comments to
definitions (I'm often happy enough with labels only, but I know it's
not Good). From there on changes to the .mli are small enough, and
actually I want to know when the interface is changed, since this may
influence other modules.

Regards,

---------------------------------------------------------------------------
Jacques Garrigue      Kyoto University     garrigue at kurims.kyoto-u.ac.jp
		http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/




* Re: automatic construction of mli files
  2000-07-24  5:34 Julian Assange
                   ` (2 preceding siblings ...)
  2000-07-24 22:09 ` John Prevost
@ 2000-07-24 23:14 ` David Brown
  2000-07-25  1:13 ` Jacques Garrigue
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: David Brown @ 2000-07-24 23:14 UTC (permalink / raw)
  To: Julian Assange; +Cc: caml-list

On Mon, Jul 24, 2000 at 03:34:22PM +1000, Julian Assange wrote:

> .mli files are highly redundant. Almost without exception all, or at least
> the vast majority of .mli information can be generated from the
> underlying .ml implementation. We have programming languages to reduce
> redundancy, not increase it.

The redundancy is intentional, though.  There are two differing
philosophies concerning the information in the mli file.  One attitude,
taken by languages such as Eiffel, is that there should be no specification
file.  The source contains markers to determine what is "public" and what
is "private".  The compiler provides tools to generate specs for
documentation use.

The other attitude, taken by Ada, and sort of, but not really, by C, is
that the spec file is a document of the interface to the module.  The
implementation is independent from the spec.  The developer should feel
free to implement the spec as needed, as long as it complies with that
interface.

Both methods have advantages and disadvantages.

- The single file approach is easier for the programmer to create,
  especially when doing rapid prototyping.

- The single file approach makes it less obvious if a given change is going
  to change the interface.  A separate spec can be checked against to
  verify that the developer didn't change the interface.

- The separate spec is usually difficult for those first learning the
  language.  Ocaml helps here by not requiring the spec to be written, and
  even generating it for you.

There are very strong arguments by big names for both sides.  Ocaml, like
it does in many areas, leaves the choice up to the developer.

If you wish to have a single file, don't write a .mli file at all.  Modules
will export all symbols by default if there is no spec.  If you want to
make a spec when you are done, compile the modules with -i, and redirect
the output to the spec file.  Feel free to comment it, and delete the
symbols you didn't really want to export.

For those who wish to use a more "structured" approach to development, the
spec can be written first.  Then the compiler will verify that the code
written corresponds with the spec.

The challenge, then, is to figure out which methodology I prefer :-).

Dave Brown


* Re: automatic construction of mli files
  2000-07-24  5:34 Julian Assange
  2000-07-24 20:48 ` Olivier Andrieu
  2000-07-24 22:02 ` Jean-Christophe Filliatre
@ 2000-07-24 22:09 ` John Prevost
  2000-07-24 23:14 ` David Brown
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: John Prevost @ 2000-07-24 22:09 UTC (permalink / raw)
  To: Julian Assange; +Cc: caml-list

>>>>> "ja" == Julian Assange <proff@iq.org> writes:

    ja> .mli files are highly redundant. Almost without exception all,
    ja> or at least the vast majority of .mli information can be generated
    ja> from the underlying .ml implementation. We have programming
    ja> languages to reduce redundancy, not increase it. Keeping mli
    ja> and ml files in-sync is not only a waste of time, but
    ja> error-prone and from my survey often not performed correctly,
    ja> particularly where consistency is not enforced by the compiler
    ja> (e.g comments describing functions and types). While exactly
    ja> the same problem exists in a number of other
    ja> separate-compilation language implementations, we, as camlers,
    ja> should strive for something better.

I disagree.  .mli files are not at all redundant.  For one thing, if
you do not desire to use .mli files, you do not need to write them at
all--the system works perfectly well if you use only .ml files.

What .mli files are *for* is to restrict the type of a top-level
module.  Within the language we may write:

module Foo =
 (struct
    type t = int
    let foo x = x
    let bar x = x + 1
    let baz x = x
  end : sig
    type t
    val foo : int -> t
    val bar : t -> t
    val baz : t -> int
  end)

in order to restrict the visibility of types.  But using top-level
modules, there's no declaration which may be wrapped in a type
constraint.  .mli serves exactly this purpose.
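
As a small usage sketch of what the constraint buys (the module is the one
defined above, repeated so the snippet stands alone; the value 41 and the
name `result` are just illustrative choices):

```ocaml
module Foo =
 (struct
    type t = int
    let foo x = x
    let bar x = x + 1
    let baz x = x
  end : sig
    type t
    val foo : int -> t
    val bar : t -> t
    val baz : t -> int
  end)

(* Outside the module, Foo.t is abstract: the only way to obtain a Foo.t
   is through Foo.foo, and the only way back to an int is Foo.baz. *)
let result = Foo.baz (Foo.bar (Foo.foo 41))

(* By contrast, [Foo.bar 41] would be rejected by the type checker,
   because an int is not a Foo.t from the outside. *)
let () = Printf.printf "%d\n" result
```

A top-level foo.ml with a foo.mli containing the same signature would
behave the same way towards its clients.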

Now--what if the language were changed to annotate such things
in-line?  I argue that it would not, in fact, become a better
language.  For the above, we might write:

EXTERN opaque
type t = int

EXTERN int -> t
let foo x = x

EXTERN t -> t
let bar x = x + 1

EXTERN t -> int
let baz x = x

Note that I don't actually recommend this syntax, I'm just trying to
point out what information has to be provided.

The first thing to note is that each top-level declaration must be
annotated with the type you wish to export, if you want to export
anything at all.  I'm of two minds about this.  On the one level,
there are good arguments for doing it that way: the type constraints
are near the code they go with.  When I look at foo, I see that it's
actually meant to turn an int into a t.

On the other hand, it really breaks things up.  When I see:

type t

val foo : int -> t
val bar : t -> t
val baz : t -> int

it's very easy to see what basic types and operations on those types
are provided by the module.  If I want to change what information is
revealed, it's pretty easy to do.

And things get even hairier when you want to restrict things down
more from more complex types.

I personally think that making type constraints an aspect of the
module-level language, and hence not supporting inline declarations of
this sort of thing is good.  If you want things to be transparent,
don't use .mli files.  In general, if you want to hide anything, I
think you usually want to hide enough to make inline constraints more
confusing than not.

John.


* Re: automatic construction of mli files
  2000-07-24  5:34 Julian Assange
  2000-07-24 20:48 ` Olivier Andrieu
@ 2000-07-24 22:02 ` Jean-Christophe Filliatre
  2000-07-26 16:09   ` John Max Skaller
  2000-07-24 22:09 ` John Prevost
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Jean-Christophe Filliatre @ 2000-07-24 22:02 UTC (permalink / raw)
  To: Julian Assange; +Cc: caml-list

> .mli files are highly redundant. Almost without exception all, or at least
> the vast majority of .mli information can be generated from the
> underlying .ml implementation. We have programming languages to reduce
> redundancy, not increase it. [...]

We probably don't use interfaces (.mli files) in the same way.

Personally, I write the interface  *before* the code. It clarifies the
design  of  an  implementation,  helping  you  focus  first  on  the
features of  the module, independently of  the way you  will write it.
Therefore, there is no way of  generating the .mli from the .ml, since
it comes first.  Moreover, I don't put the  same kind of documentation
in  the   interface  and  in  the   code:  in  the   interface  I  put
specifications, mainly, and in the code I put implementation details.
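
A minimal sketch of this interface-first workflow, written with an inline
module type so that it fits in one file (in practice the signature would
be counter.mli and the structure counter.ml; the counter example and all
its names are assumptions chosen for illustration):

```ocaml
(* Step 1: the interface, written before any code exists.
   Specifications go here. *)
module type COUNTER = sig
  type t
  (** An abstract counter. *)

  val zero : t
  (** The counter that has never been incremented. *)

  val incr : t -> t
  (** [incr c] counts one more than [c]. *)

  val to_int : t -> int
  (** Number of increments applied since [zero]. *)
end

(* Step 2: one possible implementation, checked against the interface
   by the compiler. Implementation details go here. *)
module Counter : COUNTER = struct
  type t = int          (* representation hidden by the signature *)
  let zero = 0
  let incr n = n + 1
  let to_int n = n
end

let () = Printf.printf "%d\n" Counter.(to_int (incr (incr zero)))
```

The representation of `t` can later change without touching the
interface or its clients.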

In the  extreme situation where there  is no real need  for writing an
interface, you can either simply not write one (this is not mandatory)
or generate it from the code with "ocamlc -c -i".

But writing  an interface is always  a good idea,  since it encourages
you to  abstract data types and  to provide only  small and orthogonal
sets of functions, which cannot be done automatically in most cases.

-- 
Jean-Christophe Filliatre    
  Computer Science Laboratory   Phone (650) 859-5173
  SRI International             FAX   (650) 859-2844
  333 Ravenswood Ave.           email filliatr@csl.sri.com
  Menlo Park, CA 94025, USA     web   http://www.csl.sri.com/~filliatr


* Re: automatic construction of mli files
  2000-07-24  5:34 Julian Assange
@ 2000-07-24 20:48 ` Olivier Andrieu
  2000-07-26 16:03   ` John Max Skaller
  2000-07-24 22:02 ` Jean-Christophe Filliatre
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Olivier Andrieu @ 2000-07-24 20:48 UTC (permalink / raw)
  To: caml-list

Julian Assange wrote:
> 
> .mli files are highly redundant. Almost without exception all, or at least
> the vast majority of .mli information can be generated from the
> underlying .ml implementation. We have programming languages to reduce
> redundancy, not increase it. Keeping mli and ml files in-sync is not
> only a waste of time, but error-prone and from my survey often not
> performed correctly, particularly where consistency is not enforced by
> the compiler (e.g comments describing functions and types). While
> exactly the same problem exists in a number of other
> separate-compilation language implementations, we, as camlers, should
> strive for something better.

Urmpf. Well, it's still about type abstraction and name hiding, isn't it?
The compiler can't decide for you what you want to keep in the
interface and what should be hidden.

But it's quite easy to generate an mli file with everything in it :
 ocamlc -i -c mycode.ml > mycode.mli

You could even use a Makefile rule :
%.mli : %.ml
	$(OCAMLC) -i -c $^ > $@


	Olivier.




* automatic construction of mli files
@ 2000-07-24  5:34 Julian Assange
  2000-07-24 20:48 ` Olivier Andrieu
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Julian Assange @ 2000-07-24  5:34 UTC (permalink / raw)
  To: caml-list; +Cc: proff

.mli files are highly redundant. Almost without exception, all or at
least the vast majority of .mli information can be generated from the
underlying .ml implementation. We have programming languages to reduce
redundancy, not increase it. Keeping mli and ml files in-sync is not
only a waste of time, but error-prone and from my survey often not
performed correctly, particularly where consistency is not enforced by
the compiler (e.g. comments describing functions and types). While
exactly the same problem exists in a number of other
separate-compilation language implementations, we, as camlers, should
strive for something better.

The mli case parallels the hideous task of maintaining C extern
definitions in .h files (the C++ case is usually even worse). This is
always the first thing to go in any C project I work on. Instead I use
a small cpp / sed script that automagically generates this information
from the underlying C implementation file.  This is quite simple to
use and merely involves placing the token "EXPORT" before the
variable/function definition. There is some minor added complexity to
support the full range of C compile-time variable
instantiation. Appended to this email is the part of my C style-guide
that describes this approach. Having greater control over the language
and compiler proper we should be able to do better, but the general
approach seems sound and applicable to ocaml.

GENEXTERN EXPORT MACROS
-----------------------

Redundant code is bad code. Unprototyped code is bad code. Prototypes
and externs are redundant by their very nature, and it's depressing
that people put up with the soul-destroying chore of manually creating
and updating (an exceptionally tedious and error-prone task) the
great swag of prototypes and externs that a C program of any size
needs for its various bits to communicate with each other. God didn't
give you a computer in order to further the evils of redundant
behaviour, but to eliminate it.

Examine the following conventional situation, where we have two C
files, each of which has variables and functions that the other uses --
I don't recommend this way of passing information about, but we need it
for the example :)

	== frazer.c ==

	bool CIA_support = TRUE;

	static int campaign_fund;
	static int frazer_dollars;
	static char *frazer_mental_state = "hopeful";

	void
	frazer(void)
	{
		frazer_dollars -= kerr(frazer_dollars);
		campaign_fund -= frazer_dollars/2;
		if (dismiss_government &&
		    strcasecmp(dismiss_action, "care-taker"))
			frazer_mental_state = "hot doggarty dog";
	}

	== kerr.c ==

	bool dismiss_government;
	char *dismiss_action;

	#ifndef HAVE_STRCASECMP
	int strcasecmp (char *s, char *s2)
	{
		do
		{
			char c1=tolower(*s);
			char c2=tolower(*s2);
			if (c1>c2)
				return 1;
			if (c1<c2)
				return -1;
		} while (*s++ && *s2++);
		return 0;
	}
	#endif

	int
	kerr(int offer)
	{
		if (offer>KER_MIN_ACTION)
		{
			dismiss_government = TRUE;
			if (offer>KER_MIN_ACTION * 2)
				dismiss_action = "care-taker";
			else
				dismiss_action = "dissolution";
			return (CIA_support? offer/2: offer);
		}
		return offer/8;
	}

Now, let's look at the prototypes we will need to support these
shenanigans:

	== frazer.h ==
	extern bool CIA_support;
	void frazer(void);

	== kerr.h ==
	extern bool dismiss_government;
	extern char *dismiss_action;
	#ifndef HAVE_STRCASECMP
	int strcasecmp (char *s, char *s2);
	#endif
	int kerr(int offer);

In the marutukku build system this becomes:

	== frazer.c ==

	EXPORT bool CIA_support = TRUE;

	static int campaign_fund;
	static int frazer_dollars;
	static char *frazer_mental_state = "hopeful";

	EXPORT void frazer(void)
	{
		frazer_dollars -= kerr(frazer_dollars);
		campaign_fund -= frazer_dollars/2;
		if (dismiss_government &&
		    strcasecmp(dismiss_action, "care-taker"))
			frazer_mental_state = "hot doggarty dog";
	}

	== kerr.c ==

	EXPORT bool dismiss_government;
	EXPORT char *dismiss_action;

	#ifndef HAVE_STRCASECMP
	EXPORT int strcasecmp (char *s, char *s2)
	{
		do
		{
			char c1=tolower(*s);
			char c2=tolower(*s2);
			if (c1>c2)
				return 1;
			if (c1<c2)
				return -1;
		} while (*s++ && *s2++);
		return 0;
	}
	#endif

	EXPORT int kerr(int offer)
	{
		if (offer>KER_MIN_ACTION)
		{
			dismiss_government = TRUE;
			if (offer>KER_MIN_ACTION * 2)
				dismiss_action = "care-taker";
			else
				dismiss_action = "dissolution";
			return (CIA_support? offer/2: offer);
		}
		return offer/8;
	}

EXPORT is merely the token genextern.sh uses for parsing cues,
although it's nice to see at a glance what is being referenced (or at
least, is meant to be referenced) from other .c files. Everything not
EXPORT'ed should be static. Can you see why?
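
One way to read the question above: anything that is not static has
external linkage, so it can be reached (or clashed with) from other .c
files even though no extern is generated for it. The sketch below
illustrates the idea. It assumes EXPORT is an empty macro, which this
message does not actually show, and frazer_offer is a hypothetical
function invented for the example.

```c
/* Sketch only: the real EXPORT definition is not shown in this message.
   An empty macro is assumed, so the token vanishes before compilation. */
#include <assert.h>
#include <stdbool.h>

#define EXPORT /* nothing: only a cue for the extern generator */

/* External linkage: other .c files may refer to this via an extern. */
EXPORT bool CIA_support = true;

/* Internal linkage: invisible outside this file, so no extern is needed
   and the name cannot clash with a global in another file. */
static int frazer_dollars = 100;

/* Hypothetical function, for illustration only. */
EXPORT int frazer_offer(void)
{
	assert(frazer_dollars >= 0);
	return CIA_support ? frazer_dollars / 2 : frazer_dollars;
}
```

With this layout the set of EXPORT lines is exactly the set of names a
file offers to the rest of the program; everything else stays private.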

Now, let's look at what has happened to frazer.h and kerr.h:

	== frazer.h ==
	#include "frazer.ext"

	== kerr.h ==
	#include "kerr.ext"

frazer.ext and kerr.ext are automatically generated by the
following rule in mk/rules.mk.in

    %.ext : %.c %.h $(top_srcdir)/config.h $(top_srcdir)/scripts/genextern.sh
            CPP="$(CPP)";export CPP; sh $(top_srcdir)/scripts/genextern.sh $<\
	    > $@.tmp $(DEFS) $(INCLUDES) $(CPPFLAGS) $(CFLAGS) \
	    && mv -f $@.tmp $@ || rm -f $@.tmp
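
genextern.sh itself is not reproduced in this message. As a rough
illustration of the kind of transformation such a generator performs,
here is a minimal sketch written in C rather than cpp / sed. The name
make_extern is hypothetical, and the sketch assumes that the head of
each definition sits on one line starting with "EXPORT "; it does not
handle function pointers, arrays or multi-line declarations.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the real genextern.sh: turn one line that
   starts with the EXPORT token into the matching declaration.
   Returns 1 if a declaration was written to out, 0 otherwise. */
int make_extern(const char *line, char *out, size_t outsz)
{
	static const char tok[] = "EXPORT ";
	const size_t toklen = sizeof tok - 1;
	const char *decl, *end;
	size_t len;

	assert(line != NULL && out != NULL && outsz > 0);

	if (strncmp(line, tok, toklen) != 0)
		return 0;		/* not exported: emit nothing */
	decl = line + toklen;

	end = strchr(decl, ')');
	if (end != NULL) {
		/* Function definition: keep the text up to ')' as a prototype. */
		len = (size_t)(end - decl) + 1;
		snprintf(out, outsz, "%.*s;", (int)len, decl);
		return 1;
	}

	/* Variable definition: drop the initialiser and the trailing ';'. */
	len = strcspn(decl, "=;");
	while (len > 0 && decl[len - 1] == ' ')
		len--;
	snprintf(out, outsz, "extern %.*s;", (int)len, decl);
	return 1;
}
```

Fed the EXPORT lines of frazer.c and kerr.c one at a time, this yields
declarations of the same shape as the hand-written frazer.h and kerr.h
shown earlier, while lines without the token produce no output.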


