caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: John Gerard Malecki <johnm@artisan.com>
To: Caml List <caml-list@inria.fr>
Subject: [Caml-list] Wanted - General Purpose "Glue Logic" Data-Structures
Date: Wed, 19 Mar 2003 18:33:48 -0800	[thread overview]
Message-ID: <15993.10380.206589.498703@barrow.artisan.com> (raw)

I want to broaden my arsenal of data-structures that interface data
between the specific structures that some algorithms require.  Here is
an example of a real problem that I solved too specifically:

  I had an existing reader that produced a list of objects, 'a list.

  I wrote a filter which fractured those items producing 'a list list.

  I then wanted to feed that data to an existing program which builds
  a balanced tree top-down.  It expects 'a list which it then passes
  to a median finder which prefers its input list to be in random
  order.

In the great ocaml tradition this worked just fine and the first time.
(What a great language.)  To really test things I then ran it on
MOADB, the mother of all data bases, a very large but real-world
data-base.

The program broke down in 2 places.  First, List.concat was used to
convert the output of the fracturer from 'a list list to 'a list.  Not
only is List.concat not tail-recursive but @ (Pervasives.append) is
also not tail-recursive.  I modified the fracturer to directly produce
'a list.

Second, the median program has some limitations.  It not only prefers
the input to be randomized but while it runs it too constructs some 'a
list list.  (Why?  This median program returns not just the median but
the set of items lt and gt the median.)  I re-wrote the median program.

I would prefer not to keep re-writing programs and instead have a
better collection intermediate data-structures that I can use to glue
my programs together efficiently and safely.  One can argue that the
median program was already flawed and it was best to re-write it.
Still, it would be nice to have a data-structure which the fracturer
could produce that not only can deal with large data sets but also has
the randomizing property.  Of course I want all this at very little
expense as none of these manipulations are germane to the real problem
at hand.

I considered having the fracturer build a priority queue over random
numbers but all that balancing book-keeping seems expensive.  I guess
what I'm looking for are inexpensive data-structures that have
properties like

   - fast computation of cardinality
 
   - fast addition of single elements
 
   - fast addition of lists of elements
 
   - fast removal of single elements in random order
 
   - not choking on a large data size

Any suggestions?  Does anyone have other useful glue-logic
data-structures which they use?

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


             reply	other threads:[~2003-03-20 14:45 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-03-20  2:33 John Gerard Malecki [this message]
2003-03-20 16:53 ` Brian Hurt
2003-03-20 17:36 ` Matthew W. Boyd
2003-03-24  6:08 ` Nicolas Cannasse
2003-04-08  0:59 ` [Caml-list] " Wheeler Ruml
2003-04-08  9:12   ` Markus Mottl
2003-04-08 12:03     ` Yaron M. Minsky
2003-04-09  6:51       ` Jean-Christophe Filliatre
2003-04-09 18:12         ` Brian Hurt
2003-04-10  8:12           ` Jean-Christophe Filliatre
2003-04-10 10:35             ` Markus Mottl
2003-04-10 15:30               ` David Brown
2003-04-10 15:03             ` Brian Hurt
2003-04-08 15:07   ` Brian Hurt
2003-04-08 16:38     ` John Gerard Malecki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=15993.10380.206589.498703@barrow.artisan.com \
    --to=johnm@artisan.com \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).