To: caml-list@inria.fr
Subject: Module hierarchy revisited
From: John Prevost
Date: 04 Dec 1999 01:45:44 -0500
Message-ID: <87bt876w6f.fsf@isil.maya.com>

Idea for namespace management, followed by a question or two, and an
explanation of why.

I just came up with what seems like a reasonable way to package my
modules hierarchically, to avoid namespace collisions.

The idea: For each "package" of stuff, the various modules (individual
object files) have short names, like "Foo", "Bar", and "Baz".  The
danger, of course, is that other packages from other sources will also
have names like Foo, Bar, and Baz, because they're so short.  My current
working solution is to add as the last object file something like this:

===mypackage.ml===
module Foo = Foo
module Bar = Bar
module Baz = Baz
==================

Why am I doing this?  Well, with Gerd Stolpmann's findlib, there's a
convenient way to link against a bunch of "packages" by name.  This is
nice, because you don't have to manage the ordering and -I'ing of all of
those files and directories by hand.  The difficulty is controlling
which modules come first.
Say we have two packages:

  package1 => package1.cma (foo.cmo frubble.cmo)
  package2 => package2.cma (foo.cmo bar.cmo zotz.cmo)

If I do:

  ocamlc package1.cma package2.cma

I lose access to package1's Foo module in any files I use after this.
If I do it in the other order, I lose access to package2's Foo module.
I can work around this by putting things that want Foo from package1.cma
before the reference to package2.cma, but this can be a pain.  I could
also insert a file between the two which binds package1's Foo to
something else:

===foobinder.ml===
module Zot = Foo
==================

Now I can refer to the Foo from either package later on: package1's as
Zot and package2's as Foo.  (This is important if I want to use both in
one file, or if I have dependencies which don't allow a simple ordering
of my modules and the libraries.)

Even more, it's a pain to have to think about this stuff.  I'd rather
just be able to know it won't happen.

What adding the extra module at the end of the library does for me as a
library author is arrange for an "automatic" binding like this to take
place.  Mypackage.Foo, Mypackage.Bar, and Mypackage.Baz will always be
the modules from Mypackage unless Mypackage itself is shadowed.  And the
namespace of packages tends to be nicer and cleaner than the namespace
of individual modules in those packages.  (Say, Text.Parser for
low-level Unicode parsers vs XML.Parser for a module that does XML
parsing.)

One could extend this further by having super-packages which provide a
namespace for a number of other packages:

  module Apollo = struct
    module XML = struct
      module Parser = ...
      ...
    end
    module Text = struct
      module Parser = ...
      ...
    end
  end

or

  module Apollo = struct
    module XML = XML
    module Text = Text
  end

So the question I have is whether people think that organizing things in
this manner is a Good Thing, and whether people have opinions on whether
there's a Right Way to go about doing this and choosing names for
things.

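
For anyone who wants to try the package-level binding described above
without building two libraries, here is a rough single-file sketch.  It
rests on some assumptions: nested modules stand in for the separate .cmo
files, and the order of the two "open" lines stands in for link order.
Lib1, Lib2, Package1, Package2, and the "origin" value are made-up names
for illustration only.

```ocaml
(* Single-file analogy: each LibN plays the role of one .cma archive. *)

module Lib1 = struct
  (* package1's short-named unit *)
  module Foo = struct
    let origin = "package1"
  end

  (* the extra "last object file": re-export Foo under the package name *)
  module Package1 = struct
    module Foo = Foo
  end
end

module Lib2 = struct
  (* package2 ships a unit with the same short name *)
  module Foo = struct
    let origin = "package2"
  end

  module Package2 = struct
    module Foo = Foo
  end
end

(* Opening in this order mimics linking package1 and then package2:
   the later Foo hides the earlier one. *)
open Lib1
open Lib2

let () =
  (* The bare name now reaches only package2's Foo ... *)
  assert (Foo.origin = "package2");
  (* ... but the package-qualified names still reach both. *)
  assert (Package1.Foo.origin = "package1");
  assert (Package2.Foo.origin = "package2");
  Printf.printf "Foo -> %s, Package1.Foo -> %s\n"
    Foo.origin Package1.Foo.origin
```

The point of the sketch is only that the alias captures whichever Foo is
visible where the alias is written, so a later Foo cannot take it away.
Real archives and link order may behave differently in the details.
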
As an example, I have a package I call "text" which has a text.cma
containing a module Text which points at the other modules by name.  But
the name "Text" is pretty broad, and could easily collide with other
people's modules, even when both packages would be useful.

A second question is whether anyone has recommendations for hiding the
"other" bindings of modules (i.e. I don't want Iso_10646 to appear in
the top-level namespace, I only want Text to appear, containing
Text.Iso_10646) to keep people from referring to the modules in less
safe ways.

I'm thinking about this because I'd like to put some modules out there
for people to use, and the community-driven standards in the world of
Perl, for example, allow huge numbers of modules from all over to be
mixed and matched at will.  O'Caml stuff, on the other hand, tends to be
much more willy-nilly, making me think of the world of C libraries,
where people are much more likely to write their own library to do
something than to use someone else's, just because hooking things
together and finding libraries and the like is so painful.

findlib provides some nice features along these lines (though I think
it'd be nice if some of this functionality were folded into the standard
ocaml distribution, to encourage people to use it), but without a
discipline (community-driven, of course) for managing the published
module namespace, I don't think library development is likely to grow
like it has in Perl and Java-land, even with more people developing.

In short, it's nice that ocaml has a compilation model that allows more
traditional models of building software than SML's CM package.  I'm much
more comfortable with Makefiles and separate compilation than with CM.
But there *are* amenities that would make library creation, publication,
and use much more common.

jmp