Subject: Re: [Caml-list] looping recursion
From: skaller
Reply-To: skaller@users.sourceforge.net
To: Brian Hurt
Cc: briand@aracnet.com, John Prevost, Ocaml Mailing List
Message-Id: <1091052358.5870.1000.camel@pelican.wigram>
Date: 29 Jul 2004 08:05:59 +1000

On Thu, 2004-07-29 at 00:36, Brian Hurt wrote:
> On 28 Jul 2004, skaller wrote:
>
> > On Wed, 2004-07-28 at 11:43, Brian Hurt wrote:
> > > On Tue, 27 Jul 2004 briand@aracnet.com wrote:
> >
> > > Very long lists are a sign that you're using the wrong data structure.
> >
> > What would you recommend for a sequence of tokens?
> > Streams are slow and hard to match on..
> > bucket lists have lower storage overhead but hard to match on.
>
> Extlib Enumerations.  For short lists, yeah they're slower than lists.

That doesn't matter -- the lists are long by specification.

> But for long lists, I could see them being a lot faster.  Don't forget
> cache effects- streaming processing can have much better cache behavior
> than repeatedly walking a long list (too large to fit into cache).

Can't pattern match on them. One reason for building a list
is I filter it, for example, in Felix I strip out white space
tokens, in Vyper (Python interpreter written in Ocaml) I did
something like 13 separate passes to handle the indentation
and other quirks to precondition the input to the parser
so it became LALR(1).

So, I'd have to use a list as a buffer for the head of the
stream anyhow..

Also, there is a serious design problem with ExtLib Enums.
Although the data structure appears functional, it doesn't
specify when things happen precisely.

In particular if the input is a stream, that is, uses mutators
to extract elements, then instead of using the persistence
and laziness so you can use the Enums as forward iterators --
for example in a backtracking parser -- the Enums actually
degrade to uncopyable input iterators.

Since Ocamllex uses a mutable lex buffer, the Enums based
on them are also non-functional input iterators ..
[I can get around that by calling 'force()' but that totally
defeats the purpose of using Enums .. :]

Whereas, a plain old list is a purely functional forward
iterator, and unquestionably works with a backtracking parser.

As an example of a simple modification I could do that
won't work easily with uncontrolled control inversion:
suppose I cache the token stream on disk, and in particular
Marshal file 'fred.flx' out as 'fred.tokens'.

[Now you *have* to force() all the iterators, or each
one inside the #include will write the file to disk at
the end of the sub-file ..
but that should only be done once -- it's quite slow
writing a file to disk .. forcing all the enums makes
separate copies of the tokens .. argggg .. ]

The problem goes away when I manually build lists and
preprocess them because I have explicit control.

Bottom line is that Enums work fine to integrate purely
functional data structures together but they're not very
useful mixing coupled streams together.

Crudely -- if you have a hierarchy of streams you may need
to read them in a particular order due to the coupling ..
with STL input iterators you can do that, with hand written
Ocaml you can do that -- with Enums you can't.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
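
The distinction drawn in this message, between a plain list as a
purely functional forward iterator and a mutator-based stream as an
uncopyable input iterator, can be sketched in a few lines of OCaml.
This is only an illustration under assumptions: the token type, the
two toy grammar rules and every function name below are hypothetical,
and the "input iterator" is a hand-written closure over a ref, not
ExtLib's Enum and not an Ocamllex lex buffer.

```ocaml
(* Hypothetical token type, for illustration only. *)
type token = Ident of string | Plus | Num of int

(* List-based parsers return the value plus the remaining tokens.
   The input list itself is never modified. *)
let parse_sum = function
  | Num a :: Plus :: Num b :: rest -> Some (a + b, rest)
  | _ -> None

let parse_num = function
  | Num a :: rest -> Some (a, rest)
  | _ -> None

(* Backtracking: when the first alternative fails, the second one
   simply starts again from the same, unchanged list. *)
let parse_expr toks =
  match parse_sum toks with
  | Some _ as r -> r
  | None -> parse_num toks

(* A hand-rolled input iterator (assumption: stands in for any
   mutator-based stream). Each call to the returned function
   destructively consumes one token. *)
let make_input toks =
  let rest = ref toks in
  fun () ->
    match !rest with
    | [] -> None
    | t :: tl -> rest := tl; Some t

let parse_sum_input next =
  match next () with
  | Some (Num a) ->
    (match next () with
     | Some Plus ->
       (match next () with
        | Some (Num b) -> Some (a + b)
        | _ -> None)
     | _ -> None)
  | _ -> None

let parse_num_input next =
  match next () with
  | Some (Num a) -> Some a
  | _ -> None

(* Same alternation as parse_expr, but the tokens consumed by the
   failed first alternative are gone when the second one runs. *)
let parse_expr_input next =
  match parse_sum_input next with
  | Some _ as r -> r
  | None -> parse_num_input next
```

On the input `[Num 1; Ident "x"]`, `parse_expr` fails on the sum rule
and then succeeds on the number rule, because the list is still
intact. `parse_expr_input (make_input ...)` on the same tokens fails
outright: the sum rule has already consumed both tokens before the
number rule gets a chance. Buffering the head of the stream in a list
is the usual way around that, which is the point made above.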