caml-list - the Caml user's mailing list
From: skaller <skaller@users.sourceforge.net>
To: Brian Hurt <bhurt@spnz.org>
Cc: briand@aracnet.com, John Prevost <j.prevost@gmail.com>,
	Ocaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] looping recursion
Date: 29 Jul 2004 08:05:59 +1000
Message-ID: <1091052358.5870.1000.camel@pelican.wigram>
In-Reply-To: <Pine.LNX.4.44.0407280907470.6739-100000@localhost.localdomain>

On Thu, 2004-07-29 at 00:36, Brian Hurt wrote:
> On 28 Jul 2004, skaller wrote:
> 
> > On Wed, 2004-07-28 at 11:43, Brian Hurt wrote:
> > > On Tue, 27 Jul 2004 briand@aracnet.com wrote:
> > 
> > > Very long lists are a sign that you're using the wrong data structure.
> > 
> > What would you recommend for a sequence of tokens?
> > Streams are slow and hard to match on.. bucket lists
> > have lower storage overhead but hard to match on.
> 
> Extlib Enumerations.  For short lists, yeah they're slower than lists.  

That doesn't matter -- the lists are long by specification.

> But for long lists, I could see them being a lot faster.  Don't forget 
> cache effects- streaming processing can have much better cache behavior 
> than repeatedly walking a long list (too large to fit into cache). 

You can't pattern match on them. One reason for building
a list is that I filter it: for example, in Felix I strip out
whitespace tokens, and in Vyper (a Python interpreter written
in OCaml) I did something like 13 separate passes to handle
the indentation and other quirks, preconditioning the input
to the parser so it became LALR(1).

So I'd have to use a list as a buffer for the head of the stream
anyhow.
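
A minimal sketch of the kind of filtering pass described above. The
token type and function names are hypothetical, chosen for
illustration, and are not taken from Felix or Vyper:

```ocaml
(* Hypothetical token type, for illustration only. *)
type token =
  | Ident of string
  | Int of int
  | White of string
  | Newline

(* One filtering pass: drop whitespace tokens, keep everything else.
   List.filter is tail-recursive, so long lists are fine. *)
let strip_white (toks : token list) : token list =
  List.filter (function White _ -> false | _ -> true) toks

(* Pattern matching on the list gives multi-token lookahead for free.
   Written with an accumulator so it also runs in constant stack. *)
let count_ident_pairs (toks : token list) : int =
  let rec go acc = function
    | Ident _ :: Ident _ :: rest -> go (acc + 1) rest
    | _ :: rest -> go acc rest
    | [] -> acc
  in
  go 0 toks
```

Each pass takes a list and returns a list, so several such passes
can be chained one after another.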

Also, there is a serious design problem with ExtLib Enums.
Although the data structure appears functional, it doesn't
specify precisely when things happen.

In particular, if the input is a stream, that is, it uses
mutators to extract elements, then you lose the persistence
and laziness that would let you use the Enums as forward
iterators -- for example in a backtracking parser -- and
the Enums actually degrade to uncopyable input iterators.

Since ocamllex uses a mutable lex buffer, the Enums
based on it are also non-functional input iterators ..
[I can get around that by calling 'force()' but that
totally defeats the purpose of using Enums .. :]
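
The effect can be illustrated without ExtLib at all. The sketch
below uses a hand-written pull function over hidden mutable state,
standing in for a lexer over a mutable lex buffer. The names
make_source and drain_to_list are made up for this example and are
not ExtLib's API:

```ocaml
(* A hypothetical pull-style source: each call mutates hidden state,
   much like a lexer reading from a mutable lex buffer. *)
let make_source (items : 'a list) : unit -> 'a option =
  let remaining = ref items in
  fun () ->
    match !remaining with
    | [] -> None
    | x :: rest ->
        remaining := rest;
        Some x

(* Reading everything into a list restores persistence, at the
   cost of materialising the whole input up front. *)
let drain_to_list (next : unit -> 'a option) : 'a list =
  let rec go acc =
    match next () with
    | None -> List.rev acc
    | Some x -> go (x :: acc)
  in
  go []

let () =
  let next = make_source ["a"; "b"; "c"] in
  let copy = next in (* copies the handle, not the position *)
  assert (next () = Some "a");
  (* The "copy" does not restart at "a": it shares the consumed state. *)
  assert (copy () = Some "b")
```

Once an element has been pulled there is no way to go back to it,
which is what makes such a source an input iterator rather than a
forward iterator.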

Whereas, a plain old list is a purely functional
forward iterator, and unquestionably works with
a backtracking parser.
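
A small sketch of why a list works for backtracking: the "position"
is just a value, so retrying from an earlier point means reusing
the same list. The token type and combinator names here are
hypothetical:

```ocaml
(* Hypothetical token type for illustration. *)
type token = Ident of string | Int of int | Plus

(* A parse function takes the remaining tokens and returns a result
   plus the unconsumed rest, or None on failure. *)
type 'a parse_fn = token list -> ('a * token list) option

(* Try [p]; if it fails, try [q] from the same starting position.
   No explicit save/restore is needed: [toks] is immutable. *)
let alt (p : 'a parse_fn) (q : 'a parse_fn) : 'a parse_fn =
  fun toks ->
    match p toks with
    | Some _ as ok -> ok
    | None -> q toks

let int_plus_int : int parse_fn = function
  | Int a :: Plus :: Int b :: rest -> Some (a + b, rest)
  | _ -> None

let single_int : int parse_fn = function
  | Int a :: rest -> Some (a, rest)
  | _ -> None

(* First try the longer form, then fall back to a single integer. *)
let expr : int parse_fn = alt int_plus_int single_int
```

When int_plus_int fails partway through, single_int still sees the
original tokens, because nothing was consumed destructively.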

As an example of a simple modification I could make that
won't work easily with uncontrolled control inversion:
suppose I cache the token stream on disk, and in
particular Marshal file 'fred.flx' out as 'fred.tokens'.
[Now you *have* to force() all the iterators, or
each one inside the #include will write the file
to disk at the end of the sub-file .. but that
should only be done once -- it's quite slow writing
a file to disk .. forcing all the enums makes
separate copies of the tokens .. argggg .. ]

The problem goes away when I manually build lists
and preprocess them because I have explicit control.
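
With an explicit list, caching is a single write at a point the
programmer chooses. A minimal sketch using the standard Marshal
module; the token type and the save_tokens / load_tokens names are
assumptions for illustration, and Fun.protect requires OCaml 4.08
or later:

```ocaml
(* Hypothetical token type for illustration. *)
type token = Ident of string | Int of int | White of string

(* Write the whole token list to [path] in one go. *)
let save_tokens (path : string) (toks : token list) : unit =
  let oc = open_out_bin path in
  Fun.protect
    ~finally:(fun () -> close_out oc)
    (fun () -> Marshal.to_channel oc toks [])

(* Read a token list back. Marshal is not type-safe: the annotation
   is a promise that the file really holds a token list. *)
let load_tokens (path : string) : token list =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () -> (Marshal.from_channel ic : token list))
```

Because the list is already fully built when save_tokens is called,
the write happens exactly once and no extra copies of the tokens
are made.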

Bottom line is that Enums work fine for integrating
purely functional data structures, but they're
not very useful for mixing coupled streams together.

Crudely -- if you have a hierarchy of streams
you may need to read them in a particular order
due to the coupling .. with STL input iterators
you can do that, with hand-written OCaml
you can do that -- with Enums you can't.
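
As one illustration of hand-written control over read order, here
is a sketch of #include expansion over token lists. The token type,
the expand function and the load callback are all hypothetical, and
List.concat_map requires OCaml 4.10 or later:

```ocaml
(* Hypothetical token type: plain words plus an include directive. *)
type token = Word of string | Include of string

(* Expand includes depth-first, left to right. [load] is called for
   a sub-file exactly when its Include token is reached, so the
   order in which files are read is explicit in the code. *)
let rec expand (load : string -> token list) (toks : token list)
    : token list =
  List.concat_map
    (function
      | Include file -> expand load (load file)
      | t -> [t])
    toks
```

The recursion depth here follows the include nesting, not the
length of the token list. This sketch does not guard against
files that include each other in a cycle.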

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net




