Great work! One small question: someone suggested me to rewrite Vec to add a Leaf (Node_without_children) construct, to cut down on the number of Empty instances I use. I have so far not done that (no reason in particular, but I was busy with other things). In light of your experiments, would you also advise me to do it? All the best, Luca On 7/28/07, Mauricio Fernandez wrote: > > In the aftermath of the ICFP contest, during which I used Luca de Alfaro's > Vec, I felt like implementing ropes, based on Boehm's paper and the > well-known > code included in his conservative garbage collector. > > I later realized that the basic implementation strategies ("dense" leaves, > bounded tree height and amortized constant time concatenation of small > strings) could be generalized to build general extensible vectors similar > to > Vec. > > Such vectors (tentatively named "Vect" until I find a better name) have > some > interesting properties: > * lower space overhead than other structures based on balanced trees such > as > Vec. The overhead can be adjusted, allowing to make "get" faster at the > expense of "set" and viceversa. > * appending or prepending a small vector to an arbitrarily large one in > amortized constant time > * concat, subarray, insert, remove operations in amortized logarithmic > time > * access and modification (get, set) in logarithmic time > > The first one is probably the most compelling. Currently, Vec imposes a > 6-word > overhead per element. Even after the obvious modification consisting in > adding > a new constructor for leaves, the overhead would still be 350%... Vect > uses > compact leaves with a configurable number of elements (32-64 seem good > choices, leading to worst-case overheads of 100% and 50% respectively), > which > also helps with "get" due to the improved spatial locality. > > You can find the code for both Rope and Vect at > http://eigenclass.org/repos/oropes/head/ > It is still young and experimental, but it's maybe worth a look. Any > feedback > is very welcome. > > The documentation can be found under > http://eigenclass.org/repos/oropes/head/doc/ > > I've spent some time benchmarking it against Vec; you can also find the > code I used and the resulting graphs at the above address. > > To summarise how it compares to Vec: > * Vec can be used when persistence is required, but Vect would probably be > a > poor choice in this case (until that is fixed using lazy rebuilding, > which > doesn't seem too hard), unless rebalancing explicitly before "taking the > snapshot" is an option > * Vect can append/prepend single elements or small vectors very quickly, > in > amortized constant time. See > http://eigenclass.org/repos/oropes/head/append.png > * as expected, Vec.set is faster than Vect's in general > http://eigenclass.org/repos/oropes/head/set.png > However, if the vector is balanced explicitly shortly before an update > burst, Vect is somewhat surprisingly faster > http://eigenclass.org/repos/oropes/head/set-balanced.png > This might be attributed to Vect's smaller memory profile and the fact > that > it allows better use of the L2 cache, but there seems to be another > factor > that I have yet to discover. > * Vect.get is considerably faster than Vec.get > http://eigenclass.org/repos/oropes/head/get.png > > The above URL is a darcs repository, so if anybody shoots me a patch I'll > be > happy to apply it :) > > Regards, > > -- > Mauricio Fernandez - http://eigenclass.org > > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs >