From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id VAA14300; Wed, 5 Dec 2001 21:47:12 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id VAA14242 for caml-list@pauillac.inria.fr; Wed, 5 Dec 2001 21:47:11 +0100 (MET) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id HAA02557 for ; Wed, 5 Dec 2001 07:19:23 +0100 (MET) Received: from tcsnpop1.tcsn.uswest.net (tcsnpop1.tcsn.uswest.net [207.108.112.1]) by nez-perce.inria.fr (8.11.1/8.11.1) with SMTP id fB56JL122156 for ; Wed, 5 Dec 2001 07:19:22 +0100 (MET) Received: (qmail 3274 invoked by uid 50); 5 Dec 2001 06:19:20 -0000 Delivered-To: fixup-caml-list@inria.fr@fixme Received: (qmail 3269 invoked by uid 0); 5 Dec 2001 06:19:19 -0000 Received: from adslppp183.tcsn.uswest.net (HELO dylan) (216.161.144.183) by tcsnpop1.tcsn.uswest.net with SMTP; 5 Dec 2001 06:19:19 -0000 Message-ID: <001e01c17d54$e5ed65f0$210148bf@dylan> From: "David McClain" To: Subject: [Caml-list] Code-base size comparison vs Language Date: Tue, 4 Dec 2001 23:20:12 -0700 MIME-Version: 1.0 Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2919.6600 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6600 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk I just finished my experiment to reduce the size of a fielded application by recoding in either of Lisp or OCaml. I had early indications that, aside from pure ease of programming in these HLL's, the overall code base would be drastically reduced (5x to 6x). That is certainly true if you count all the source code needed to produce the application, but an honest, impartial, comparison of the lines I actually had to write, of non-reusable, application specific code shows somewhat disappointing results on this basis alone. The application is a system network server that performs recursive prefix mappings of file pathnames, including environment variable substitutions. This is a variation on the system provided by the Sprite experimental OS developed at UCB by John Ousterhout, et. al. in the late 1980's and early 1990's. The existing version was coded in M$ VC++ making heavy use of STL. It is a COM/OLE process server based on M$ ATL. All three versions retain a machine generated ATL wrapper code for this COM/OLE behavior -- I only needed to write a few lines of IDL to produce the basic skeleton, and all three versions use identical stuff here... For the application specific coding, the scores are: Existing App: C/C++ = 1106 LOC Lisp Version: C/C++ = 284 LOC, Lisp = 798 LOC --> Total = 1082 LOC OCaml Version: C/C++ = 284 LOC, Lisp = 58 LOC, OCaml = 453 LOC --> Total = 888 These LOC counts do *not* include blank lines and comment only lines. On the basis of code-base size reduction, these results are nearly a tie. But on the basis of ease of programming, I have to award Lisp first, followed by OCaml, and distantly trailed by C/C++. The reasons for this are: 1. Lisp is a huge langauge with nearly everything you need already built in. But it produces very bulky DLL's -- on the order of 15 MBytes. 2. OCaml is equally terse as Lisp, or even slightly better, but needs a fair amount of additional support routines written, to cover the application needs. Some of this is in C/C++ (very little) but most has to do with providing things like unwind-protect, generalized string handling, generalized list operations. It produces very fast runtime code (not needed here) and quite reasonably sized DLL's -- about 300 KBytes (50x smaller than Lisp!!) 3. C/C++, making heavy use of classes and STL is nearly unreadable, took a long time to program, and is frightening to revisit after some time away from it (1 year or more since original writing). C/C++ retains the capability to utilize Unicode (FWIW -- I don't really need it), but it was written with some embedded bugs that I found only when I was able to remain at the abstract levels permitted by HOL's. Both the Lisp and OCaml versions were written in the course of 2-3 hours. Writing the C/C++ version took the better part of 1 week. Prior to that I had written experimental versions in Lisp and had more than a year of playing with the system to get an understanding of the needed algorithms. I will say that both Lisp and OCaml allowed me to spot some errors in the C/C++ implementation, fix those errors, and add some extra capability (about 20 LOC in both Lisp and OCaml for the extra stuff). I estimate the time needed to go back and refamiliarize myself with STL and the internal architecture of the existing application -- in order to fix the bugs I discovered and add the additional capabilities -- would be several days. I find it remarkable that OCaml has a slight edge on Lisp for terseness of expression. OCaml is a highly expressive syntax and you can say quite a lot in a few keystrokes. Lisp tends to be more wordy, use longer identifiers, and the code is quite a bit sparser for semantic content over a given number of LOC. This is as close as I can come to providing an honest, impartial, comparison of these languages for the purpose of rewriting existing code to be more maintainable, robust, and correct. I definitely think the effort is worthwhile, but not entirely for the reasons I had originally anticipated. Cheers, - David McClain, Sr. Scientist, Raytheon Systems Co., Tucson, AZ ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr