From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id IAA16905; Thu, 29 Jul 2004 08:27:01 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id IAA16523 for ; Thu, 29 Jul 2004 08:27:00 +0200 (MET DST) Received: from alex.baretta.com (h213-255-109-130.albacom.net [213.255.109.130] (may be forged)) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i6T6QxEV012739 for ; Thu, 29 Jul 2004 08:26:59 +0200 Received: from baretta.com (unknown [127.0.0.1]) by alex.baretta.com (Postfix) with ESMTP id E6E0F2D5B2E; Thu, 29 Jul 2004 02:28:00 -0400 (EDT) Message-ID: <410898F0.6030804@baretta.com> Date: Thu, 29 Jul 2004 08:28:00 +0200 From: Alex Baretta User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113 X-Accept-Language: it, en-us, en MIME-Version: 1.0 To: brogoff Cc: Ocaml Subject: Re: [Caml-list] looping recursion References: <16646.64470.304530.264731@soggy.deldotd.com> <16647.5177.849829.421587@soggy.deldotd.com> <4107544C.4040502@baretta.com> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Miltered: at nez-perce with ID 410898B3.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; baretta:01 baretta:01 caml-list:01 recursion:01 brogoff:01 brogoff:01 000,:01 caml-list:01 petty:01 whining:01 unspecified:01 locality:01 caching:01 unbounded:01 unbounded:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk brogoff wrote: > On Wed, 28 Jul 2004, Alex Baretta wrote: > >>brogoff wrote: >> >>>Sometimes lists are best. If lists are OK for 100, 1_000, or 10_000 items, >>>why not 100_000 or 1_000_000? Map and friends shouldn't blow stack. >>> >>>-- Brian >> >>Actually, it's not that bad. > > > Well, I'm still on the caml-list, so of course it isn't *that* bad. Also, as > I said, if you need a tail recursive map over built in lists, you have at > least two options. Unfortunately, my preference is to use Obj, which IMO > points to a deficiency in the core language. Most, maybe all, of my issues > with OCaml, amount to petty complaints, no showstoppers or dealbreakers. > Hey, at least I'm not whining about the license! Yes, I guess we simply can live with it. I can't wait to have STL iterators in Ocaml, but meanwhile I can live with this functional nonsense: lists, maps, sets... ;) > There are quite a few cases where singly linked lists are the most efficient > data structure, and they're just right for the problem. Streams and laziness > are not at all efficient in comparison. Stack overflow commonly happens whenm > one of my users (yup, I have users of my code who aren't me ;-) decides to > run on data much larger than I'd anticipated. Huge lists are usually not an efficient data structure. Lists are local structures: at any point you can only observe a fixed number of head elements (usually one at a time) and an unspecified tail. Such locality generally makes caching the entire data structure useless. Long lists (lists of a priori unbounded length) arise from user input: typically the are obtained by reading in a file of a priori unbounded length. In such cases lists are very much inefficient from the point of view of memory complexity: lists cost O(n) where n is the size of the input; streams cost O(1). Both cost O(n) in terms of time complexity. Usually the potential speed benefit of lists is amply counterbalanced by reduction in memory footprint, which usually means no memory swapping in the kernel. Here's an obvious example (not too obvious: I fell for it at first). I have a XML syntax specifying relational databases. Essentially an entire relational database can be represented with it. This XML lexicon is meant for now only for the database initial schema definition and for human-readable backups. I initially wrote the backup algorithm in terms of transformations on a list of PXP trees which were finally serialized. Algebraically, this is perfect. The only problem is that this algorithm stores the entire contents of a database in memory before serializing everything. Ooops, here's a SIGKILL landing in! Now I only use PXP trees to represent the database schema, which I immediately begin to serialize. All the while I pass around continuations which are invoked iteratively on subnodes. If a subnode is an table-data node I declare an SQL cursor in the database and begin walking through the table. Result, where my initial memory footprint bought me a SIGKILL on a 2GB server now I have bounded memory usage (a few megabytes). I don't swap, so the the backup process actually runs an order of magnitude faster. And I actually get my output at the end ;) > Also, lists are builtin, and have syntactic support in OCaml, by which I mean > nice, easy to read syntax. So the language encourages you to use them. They are sooo cool! I actually ask all trainees to reimplement the List module. But then the industrial practice is to use lists only where there is no input or as the result of some kind of slicing or research applied to bigger and more efficient data structures. I actually use a list iteration to schedule tasks in my real-time control kernel, but of course, I don't expect my collegues to write more than a few tasks at a time for my kernel to schedule: safety, liveness and UI. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners