caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Lucas Dixon <lucas.dixon@gmail.com>
To: Andreas Rossberg <rossberg@mpi-sws.org>
Cc: caml-list List <caml-list@inria.fr>
Subject: [Caml-list] Re: functors and modules (was "What is an applicative functor?")
Date: Wed, 18 May 2011 00:45:49 -0400	[thread overview]
Message-ID: <BANLkTikNRy_5WomDwjhzS3cggzvHS0f2Jw@mail.gmail.com> (raw)
In-Reply-To: <2F01B4AD-D6F2-4503-8F27-C87ECAC987E1@mpi-sws.org>

On 15 May 2011 05:05, Andreas Rossberg <rossberg@mpi-sws.org> wrote:
> On May 11, 2011, at 15.11 h, Lucas Dixon wrote:
>>>>
>>>> Here's a better presentation of the idea:
>>>>
>>>>
>>>> http://informallyabstract.blogspot.com/2011/04/typed-signature-variables-for.html
>>>> [...]
>>>> I'd like to treat S1 as a signature variable... see blog post linked
>>>> above.
>>>
>>> Sorry, I still don't get it.
>>
>> Probably my fault, the post had some typos :( I think I've fixed them
>> now, as well as swapped the signature ordering, which makes more sense
>> now: more stuff = bigger.
>
> OK, after reading it again, I'm afraid I still have more questions than
> answers: ;)
>
> - What do you mean by "typed signature variables"? In what sense are ML's
> signature identifiers not variables, or not typed?

in a functor, the signatures are defined, there are no lambda
variables which have a signature as a type... see below...

> Also, it seems that all
> your examples are easily expressible with existing language features, such
> as `include'. Consequently, I'm somewhat confused what exactly you are
> trying to achieve.
>
[moved the next up]
> - Similarly, where do you express the dependency between argument and result
> in a functor signature? S1 -> S1 is not the most specific signature of the
> identity functor over S1, that would be "(X:S1) -> S1 with type T = X.T" in
> most ML's. How should I read your "exact" type of F2?

Ah, I think I know what was confusing things: I was using the same
name of signature when what I really meant was a lambda-abstraction...
(the old free-variable implicitly being a lambda abstraction...
sigh... ); how about this explanation:

I want to define a functor like this:
F(M : S1) = { open M; val y = ... };

and I want it to have a type like this:
F : "For S <= S', S -> (S + { val y: T. })"

if I give F an input of signature S1 (which must itself be expressible
as an include of S', that's what the <= means) then I get a structure
with signature:
  "sig include S1; val y : T end;"

if, on the other hand, I give F an input of signature S2, then I get
out a structure with signature:
  "sig include S2; val y : T end;"

This cannot (to my knowledge) be expressed in ML at present. In ML I'd
need to write the same functor for each typing. i.e.:

functor F1(M : S1) : sig include S1; val y : T end
= struct open M; val y : T end;

and

functor F2(M : S2) : sig include S2; val y : T end
= struct open M; val y : T end;

If you give yourself signature level variables in functors, you rather
nicely express many OO style things in functors.

> - I don't understand the semantics of your =: operator. AFAICS, the most
> specific signature of M1 is not S1, but "S1 with type T = int". (But even
> that depends on the details of the type system. If you had, for example,
> something akin to SML'90's structure sharing, then the most specific
> signature of M1 would, in fact, be the singleton M1.)

For M =: S, I'm considering M as a record, each type and value is an
entry. And M =: S says that M has no more entries in the record, than
are specified in S.

There is a fine distinction between the usual unification based notion
and this one, which I don't think I fully appreciated until I started
writing this email: unification mixes up whats in the domain with what
it can be instantiated with. I have a hunch that these are separate
things and should be treated as such.

> - It seems you want to express a form of mixin composition with your +
> operator, but you gloss over what the binding structure should be. That is,
> you seem to expect being able to say S1 + S2, where S2 arbitrarily refers to
> members from S1. How is that supposed to work? (And would it differ from our
> `with' operator in MixML?)

Nothing fancy; I was thinking simply of sequential union, I guess I
should not have used "+" as that's normally thought of as commutative.
But using MixML's "with" would be fine too.

>>> How can you project S1.t then?
>>
>> S1 still has a type/signature, even though it is a variable: it's just
>> like an extensible record type (which is now signatures are often
>> implemented). S1 : { type t, ... }. So you can still project out what
>> you know is in S1. It's just that S1 may have more stuff as well.
>>
>> Does that make sense?
>
> Not quite. :) I thought S1 is a signature itself? What does S1 : { type t,
> ... } mean then?

The reply above is not probably more meaningful than this thread of
the conversation, but what I meant was that S1 can be expressed using
an include of sig type T = ... end; I'm thinking of signatures as the
typing of a record: i.e. the names of the fields, and their typing.
Variables of record type don't need to specify all fields. You get the
natural mix-in style typing.

>>>> functor F (S1 : SIG1) =
>>>> struct
>>>>  use "submodule1.ML";
>>>>  use "submodule2.ML";
>>>> end;
>>>
>>> Yes, but what you _actually_ want is a way to compile the submodules
>>> separately, which is where the real problem lies.
>>
>> Ah, indeed! So we'd perhaps like each file to implicitly be a functor:
>>
>> in "submodule1.ML":
>>
>> given M1 : SIG1;
>> val x = M1.f "some argument string";
>> ...
>>
>> Is this what you're thinking of in terms of a looser module coupling?
>
> Depends on what that declaration does. AFAIU, it would still require me to
> functorize each individual module, just with a different syntax.

Yes, reading the stuff you wrote below, I think I finally understand
the distinction you're aiming for in terms of modules from libraries.
Makes sense. :)

>>>> But! I think a better language for expressing signatures can easily fix
>>>> this.
>>>> e.g. we should be able to write:
>>>>
>>>> MAP = {  ... }
>>>> SIG2A = { Map : SIG1 ... }
>>>> SIG2B = { Map : SIG1 ... }
>>>> SIG3A = {
>>>>  Map : MAP;
>>>>  S2a : SIG2A where S2a.Map := Map;
>>>>  S2b : SIG2B where S2a.Map := Map;
>>>>  ... }
>>>>
>>>> Having structure/signature level substitutions seems to naturally fix
>>>> this: you now check a single Map at the top level, and modify all
>>>> types in S2a to refer to those in the parent structure. This also
>>>> cleans up the many occurrences of the same structure: I'm assuming
>>>> that S2a.Map no longer exists after you write S2a.Map := ..., instead
>>>> just have the single top-level Map. Maybe ":=" is not the best
>>>> syntax... but I think you get the idea.
>>>
>>> You actually have that in OCaml 3.12, with almost exactly your syntax.
>>> But
>>> your idiom again implements a form of flattening that will pollute the
>>> namespace of the signature's toplevel, and I think can still grow
>>> exponentially, just with a smaller constant factor.
>>
>> I don't think it grows exponentially: Map does not exist in S2a or S2b.
>> I also don't think it is a form of name-space pollution. You need to
>> have some syntax for Map, so it has to live somewhere.  Or maybe I'm
>> missing something?
>
> What this feature buys you is that you can eliminate duplicates from your
> dependencies. That is, you actually get to represent your dependency graph
> in the signature as a DAG instead of a tree with copies. But you still
> potentially include that whole dependency graph in every signature.

If you eliminate all duplication, the size of the DAG will only be as
big as it needs to be. It can't be any smaller. I guess the key issue
is: can sufficiently nice syntax exists to express DAG in the way we
want it; and in particular, for abstracting certain details as you
build up your library/modules. That's perhaps the key issue. That's
where I think deletion syntax may be particularly useful...

>>> I think there is no general solution. Specifying the signatures of
>>> arbitrary
>>> slices through a complex software architecture will produce ugly, big and
>>> complex signatures. That's why I don't want to be forced to do it
>>> everywhere. I only want to specify signatures at suitable package
>>> boundaries, where I expect looser coupling than within. Within a package,
>>> I
>>> just want to use other modules of the same package without (explicitly)
>>> reiterating their signature at every import edge. I.e., writing functors
>>> is
>>> not what I want there.
>>
>> This sounds to me like just putting things in a global name-space, and
>> having a hierarchy of uses. I agree that this is useful, but isn't
>> this just avoiding the act of modularity altogether? I guess I'm
>> misunderstanding something. Maybe an small example would clarify?
>
> Well, small examples are difficult with this topic.
>
> A module for me, typically, is an individual ADT. More generally, you can
> have modules with several abstract types, or cases with none, but the ADT is
> the essence of small-scale modularity.

cool, that makes sense, that's what I think of too. :)
How's this for a starting point of an example... libraries will want
to define lots of different ADTs, e.g. for trees, heaps, queues,
directed graphs, tables, XML, IO, ... Many will be built in terms of
each other... I guess the ML Standard Basis library is an example, or
the Isabelle libraries, or the ocaml batteries?

In the Isabelle libraries there is essentially no modularity
management of the library as a whole. Locally things are functorised
for some genericity, but overall everything just sits on a top-level.
So I guess if you wanted to include "just the right" bits, it would be
a little unclear how to do it. What I do at present is to load it all
up, then push out stuff from the top-level, and use the garbage
collector to clean up...

I think I see the problem of library management/modularity now - how
to make the (inter) dependencies of bigger libraries less ad-hoc. So
that we can easily identify which bits we use, what to link and what
not to...  cool, it's nice to see a problem. :)

In a sense, I think there are two issues here: one is name-space
management, and the other is linking/compilation of just what is
needed... if name-space is delt with properly, then it feels like all
linking/compilation should be inferable from some small bit of
top-level code that is explicitly intended to be used... it seems like
a garbage collection-like problem - track the detailed dependencies...
? So maybe the real issue is name-space management?

> But a module is not a library. A library (or package, or whatever term you
> prefer), is a collection of modules that are interdependent and closely
> related in terms of functionality as well as implementation. Building
> applications (or other libraries) from libraries is large-scale modularity.

Nice terminology; that's good, and I think I see the distinction you
are making now between libraries and modules. It makes lots of sense.

> I don't think it's useful trying to turn every individual module into a
> stand-alone library -- which is what always functorizing amounts to.

I guess I'm still not sure why... or maybe I do... I mean I agree that
if you carelessly include sub-structures that's bad in many ways and
leads to terribly big signatures; and so I agree that this is not 'the
right way' to do it. I guess I still think that we can change some
syntax and it might be all rosy...  :)

>>>> But when you specify a signature, you don't have to specify everything
>>>> it has, only a super-set of what you use, I can't see how you could
>>>> get away with less and maintain checking of module compositions...
>>>> actually I'm not sure I'm understanding this point, maybe you can
>>>> expand on it a bit?
>>>
>>> Does my explanation above answer it?
>>
>> Perhaps; but I think I might be misunderstanding, it seems that you're
>> suggesting just using the top-level to place background libraries. I
>> think this is often sensible, but this seems independent of having a
>> module system. So I suspect you're thinking of something else... ?
>
> Yes: I simply want to parameterize libraries as a whole over their library
> dependencies, with as little fuzz as possible. At the level of individual
> modules, I want to use functors only where I need genericity. The ML module
> system is great for the latter, but currently not very useful for the
> former, i.e., it wasn't designed as a package system.

I totally agree with this. :)

I guess now I've understood, I don't mind letting this tread
trickle-out, and maybe speaking to you in person would be more
efficient... :)

cheers!
lucas


      reply	other threads:[~2011-05-18  4:45 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-28  5:06 [Caml-list] " Lucas Dixon
2011-04-29 13:09 ` [Caml-list] " Andreas Rossberg
2011-04-30 22:19   ` Lucas Dixon
2011-05-03 18:47     ` Andreas Rossberg
2011-05-11 13:11       ` Lucas Dixon
2011-05-15  9:05         ` Andreas Rossberg
2011-05-18  4:45           ` Lucas Dixon [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BANLkTikNRy_5WomDwjhzS3cggzvHS0f2Jw@mail.gmail.com \
    --to=lucas.dixon@gmail.com \
    --cc=caml-list@inria.fr \
    --cc=rossberg@mpi-sws.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).