caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Wanted - General Purpose "Glue Logic" Data-Structures
@ 2003-03-20  2:33 John Gerard Malecki
  2003-03-20 16:53 ` Brian Hurt
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: John Gerard Malecki @ 2003-03-20  2:33 UTC (permalink / raw)
  To: Caml List

I want to broaden my arsenal of data-structures that interface data
between the specific structures that some algorithms require.  Here is
an example of a real problem that I solved too specifically:

  I had an existing reader that produced a list of objects, 'a list.

  I wrote a filter which fractured those items producing 'a list list.

  I then wanted to feed that data to an existing program which builds
  a balanced tree top-down.  It expects 'a list which it then passes
  to a median finder which prefers its input list to be in random
  order.

In the great ocaml tradition this worked just fine and the first time.
(What a great language.)  To really test things I then ran it on
MOADB, the mother of all data bases, a very large but real-world
data-base.

The program broke down in 2 places.  First, List.concat was used to
convert the output of the fracturer from 'a list list to 'a list.  Not
only is List.concat not tail-recursive but @ (Pervasives.append) is
also not tail-recursive.  I modified the fracturer to directly produce
'a list.

Second, the median program has some limitations.  It not only prefers
the input to be randomized but while it runs it too constructs some 'a
list list.  (Why?  This median program returns not just the median but
the set of items lt and gt the median.)  I re-wrote the median program.

I would prefer not to keep re-writing programs and instead have a
better collection intermediate data-structures that I can use to glue
my programs together efficiently and safely.  One can argue that the
median program was already flawed and it was best to re-write it.
Still, it would be nice to have a data-structure which the fracturer
could produce that not only can deal with large data sets but also has
the randomizing property.  Of course I want all this at very little
expense as none of these manipulations are germane to the real problem
at hand.

I considered having the fracturer build a priority queue over random
numbers but all that balancing book-keeping seems expensive.  I guess
what I'm looking for are inexpensive data-structures that have
properties like

   - fast computation of cardinality
 
   - fast addition of single elements
 
   - fast addition of lists of elements
 
   - fast removal of single elements in random order
 
   - not choking on a large data size

Any suggestions?  Does anyone have other useful glue-logic
data-structures which they use?

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2003-04-10 15:31 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-03-20  2:33 [Caml-list] Wanted - General Purpose "Glue Logic" Data-Structures John Gerard Malecki
2003-03-20 16:53 ` Brian Hurt
2003-03-20 17:36 ` Matthew W. Boyd
2003-03-24  6:08 ` Nicolas Cannasse
2003-04-08  0:59 ` [Caml-list] " Wheeler Ruml
2003-04-08  9:12   ` Markus Mottl
2003-04-08 12:03     ` Yaron M. Minsky
2003-04-09  6:51       ` Jean-Christophe Filliatre
2003-04-09 18:12         ` Brian Hurt
2003-04-10  8:12           ` Jean-Christophe Filliatre
2003-04-10 10:35             ` Markus Mottl
2003-04-10 15:30               ` David Brown
2003-04-10 15:03             ` Brian Hurt
2003-04-08 15:07   ` Brian Hurt
2003-04-08 16:38     ` John Gerard Malecki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).