From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id OAA04977; Mon, 3 May 2004 14:40:23 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id OAA05244 for ; Mon, 3 May 2004 14:40:22 +0200 (MET DST) Received: from qrnik.knm.org.pl (paf87.warszawa.sdi.tpnet.pl [217.96.225.87]) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i43CeIEV024685 for ; Mon, 3 May 2004 14:40:20 +0200 Received: from qrnik ([192.168.0.1] ident=qrczak) by qrnik.knm.org.pl with esmtp (Exim 3.36 #1) id 1BKck6-0001IZ-00 for caml-list@inria.fr; Mon, 03 May 2004 14:40:18 +0200 Subject: Re: [Caml-list] [ANN] The Missing Library From: "Marcin 'Qrczak' Kowalczyk" To: caml-list In-Reply-To: <1083581924.20722.584.camel@pelican.wigram> References: <200404280613.19547.jdh30@cam.ac.uk> <1083141467.9537.845.camel@pelican.wigram> <200404281018.14913.jdh30@cam.ac.uk> <1083151482.9537.904.camel@pelican.wigram> <93ADD9EA-9936-11D8-BD03-000A958FF2FE@wetware.com> <1083173516.9537.1162.camel@pelican.wigram> <1083542538.9152.65.camel@qrnik> <1083570883.20722.536.camel@pelican.wigram> <1083574691.1643.55.camel@qrnik> <1083581924.20722.584.camel@pelican.wigram> Content-Type: text/plain; charset=ISO-8859-2 Message-Id: <1083588018.1643.156.camel@qrnik> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.5.7 Date: Mon, 03 May 2004 14:40:18 +0200 Content-Transfer-Encoding: quoted-printable X-Miltered: at nez-perce with ID 40963DB2.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 marcin:01 'qrczak':01 kowalczyk:01 qrczak:01 statically:01 generics:01 subtypes:01 subtypes:01 dynamically:01 extensible:01 enrich:01 non-portable:01 hash:01 generic:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk W li=B6cie z pon, 03-05-2004, godz. 20:58 +1000, skaller napisa=B3: > Felix is statically typed, everything is recursive, > recursive types are supported, it has generics, > overloading by selecting the most specialised=20 > function (like C++). I left static typing of my language for possible future research, because designing a good static type system is extremely hard. Type systems from the functional side typically don't support open world assumption wrt. the set of subtypes of a type. They can't express the OO way of adding subtypes in various parts of the program, without a centralized place where all such extensions are gathered and tied together into one type, and with the ability to dynamically check whether the given object has the given subtype of the declared type. In other words, you can't emulate dynamic typing in them. Well, SML and OCaml do have one extensible type, exn, but using it for other purposes than exceptions feels like a hack. Trying to extend functions which were previously defined for only some cases is worse - you either make a reference to a function, incrementally enrich it, and in effect check cases linearly, or do some non-portable tricks of putting an exception constructor in a hash table. It's all ugly. They poorly support generic programming in the sense of working generically on pure data terms of various types. For example implementing the natural equality and ordering for algebraic types in both OCaml and Haskell relies on builtin compiler magic. OCaml has builtin hashing and serialization, Haskell has builtin conversion to and from text - but if you wanted to implement hashing and serialization in Haskell, or textual printing and parsing of arbitrary values in OCaml, you have do to it by hand for different types. OCaml doesn't have Haskell-like type classes, polymorphic recursion nor type variables instantiated with higher order kinds. Haskell doesn't have parametrized modules, polymorphic variants nor an OO subsystem. Neither language allows to share record field labels between types. Neither language allows to implement these features without modifying the compiler. And their implementations of type checking is so complex that I'm afraid a few people in the world understand it enough to be able to change it. Type systems from the OO side do support extending the set of subtypes, and thus can emulate dynamic typing. But they don't support the other dimension of extensibility: adding an interface to an old type. They only recently learned about parametric polymorphism, without which they plainly sucked. (They =3D Java and C#; Eiffel did have parametric polymorphism earlier, but its type system has major holes.) And they lead to a verbose syntax because of lack of type inference. They used to be simple, but parametric polymorphism adds a significant complexity. Compared to that, dynamic typing is extremely easy and flexible. Yes, it doesn't detect so many errors in compile time; I miss it when using a dynamically typed language. And it leads to slightly slower code. But it's so simple! Both conceptually and in the implementation. I will try to design a static type system for my language in future. Preferably such that after removing type declarations, the program will work in the dynamically typed variant, which will not go away. The language is lexically scoped including globals, it avoids type puns, e.g. it distinguishes lists from tuples, conditionals require booleans, the tail of a list must be a list, variables are immutable by default, there are no uninitialized variables - so it should be quite close to what a static type system expects, but I still think it would be very hard to design a static type system which is not overly restrictive. > > and using a name before its definition has been executed, in > > any other way than attaching it to a closure, is an error, by necessity > > sometimes detected only at runtime. >=20 > In Felix this can't be a problem for functions, only for > variables. Unlike Ocaml,=20 >=20 > fun f ... >=20 > declares a class, whilst >=20 > let f x =3D .. >=20 > in Ocaml constructs a value. I don't understand. The problem in my language is that here: let f x {y + 1}; let z =3D f 10; let y =3D z + 1; there is no single place which should be detected as an error. Well, the problem is solved: execution of f will cause a runtime excepion, and the compiler reports statically only obvious cases where the use of a variable is outside an expression set to be executed later. > But I do no such thing for functions, they just run on the stack > (even though their stack frames are heap allocated, the return > addresses are not). I don't like imposing arbitrary resource limits without a good reason. The stack of my implementation is resized when needed, so there is no reason to avoid deep recursion if it only replaced a deep stack with a long heap-allocated list. For example Map on lists is implemented in the traditional non-tail-recursive way, and it works for long lists too. A small stack size would be a problem for threads. Since arguments and the return address are passed in global C variables, there is no need to allocate a stack frame in many short functions which don't contain non-tail calls, and in others it's often allocated only in those execution paths which need it. So the fact that allocation of a stack frame performs an overflow check should not be a big worry. I think there are two major performance problems (but just guessing): 1. Since the virtual registers are in global C variables, there is lot of moving data to and from them, even in simple functions. Putting some global variables in machine registers, possible in GCC, is problematic: on x86 there are so few registers that gcc sometimes fails to generate code when %esi or %edi are taken, and of course it generates worse code if fewer registers are available. I tried to put some variables in physical registers, the improvement was not big. Perhaps I will try again now, the code generator has evolved a bit since then. 2. Ordinary code uses dynamic dispatch a lot, e.g. for all arithmetic, and currently the only way to avoid that is to use inline C code (similar to inline asm in C). Even some type inference would not help much because of bignums (did I say that I hate arbitrary limitations? I can't just not have bignums!). Even if I implement inlining, not much could be inlined from a dispatched function. Performing the dispatch statically when possible would be very hard... --=20 __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/ ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners