caml-list - the Caml user's mailing list
From: james woodyatt <jhw@wetware.com>
To: Ocaml Trade <caml-list@inria.fr>
Subject: Re: [Caml-list] Set union/inter/diff efficiency
Date: Wed, 27 Jul 2005 09:04:19 -0700
Message-ID: <133225B6-7E0E-4FAE-9EFA-4AE9650BDD3E@wetware.com>
In-Reply-To: <200507271012.02020.jon@ffconsultancy.com>

On 27 Jul 2005, at 02:12, Jon Harrop wrote:
>
> Does anyone have any ideas or references on how the union/inter/diff
> functions of the Set module could be optimised by accepting a sequence
> of sets rather than a pair at a time? For example, if A overlaps B
> overlaps C but A does not overlap C then it is probably quicker to
> compute the union "(A U C) U B" rather than "A U B U C".
>
> Better still, does anyone have a replacement Set module which
> implements this functionality?

No, but you might be able to build such an extension more easily using
my OCaml NAE core foundation library.

Here is the pseudo-code for set union that I would try:

     Make a heap of sets [Cf_heap.of_seq].
     Map into a sequence of sets [Cf_heap.to_seq].
     Map into a sequence of element sequences [Cf_seq.map,
       Cf_set.to_seq_incr].
     Load into a priority queue ordered by the head element of each
       sequence.
     While queue is not empty,
       Take the element sequence with the smallest head from the queue.
       Take an element from the head of the sequence.
       If there is no output yet, or the element is greater than the
       current output, then
         Output the element.
       If the element sequence tail is not empty, then
         Push the element sequence tail onto the queue.
     End while
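
The loop above can be sketched with only the standard library. This is a
minimal sketch, not the Cf_* implementation: it assumes integer elements,
replaces Cf_set.to_seq_incr with Set.to_seq, and assumes the queue is a
priority queue ordered by head element (with a plain FIFO queue, the
"greater than current output" test would drop elements that arrive out of
order). The names IntSet, PQ, push and union_all are hypothetical.

```ocaml
module IntSet = Set.Make (Int)

(* Priority queue of element sequences, keyed on (head element, source
   index). Each source has at most one entry at a time, so keys are unique. *)
module PQ = Map.Make (struct
  type t = int * int
  let compare (a : t) (b : t) = Stdlib.compare a b
end)

(* Push a sequence onto the queue under its head; empty sequences vanish. *)
let push idx (s : int Seq.t) q =
  match s () with
  | Seq.Nil -> q
  | Seq.Cons (x, tl) -> PQ.add (x, idx) tl q

(* k-way union: returns the elements of all sets in ascending order. *)
let union_all (sets : IntSet.t list) : int list =
  let q0 =
    List.mapi (fun i s -> (i, IntSet.to_seq s)) sets
    |> List.fold_left (fun q (i, s) -> push i s q) PQ.empty
  in
  let rec loop q last acc =
    match PQ.min_binding_opt q with
    | None -> List.rev acc
    | Some ((x, idx), tl) ->
      (* Put the tail of the sequence back before deciding on output. *)
      let q = push idx tl (PQ.remove (x, idx) q) in
      (match last with
       | Some y when x <= y -> loop q last acc   (* duplicate: skip *)
       | _ -> loop q (Some x) (x :: acc))        (* new element: output *)
  in
  loop q0 None []
```

For example, union_all applied to the sets {1, 5}, {2, 3, 5} and {3, 9}
yields [1; 2; 3; 5; 9].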

You could do similar things for difference and intersection.
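
As one illustration of the "similar things", here is a hedged sketch of a
k-way intersection built on the same merge. It assumes integer elements and
standard-library sets rather than the Cf_* modules. An element is in the
intersection exactly when it shows up once from every input, i.e. k times
in a row in the merged stream. All names (PQ, push, merged, inter_all) are
hypothetical, and the empty list of sets is arbitrarily mapped to the empty
result.

```ocaml
module IntSet = Set.Make (Int)

(* Priority queue keyed on (head element, source index). *)
module PQ = Map.Make (struct
  type t = int * int
  let compare (a : t) (b : t) = Stdlib.compare a b
end)

let push idx (s : int Seq.t) q =
  match s () with
  | Seq.Nil -> q
  | Seq.Cons (x, tl) -> PQ.add (x, idx) tl q

(* Merge all sets into one ascending stream, keeping duplicates. *)
let merged (sets : IntSet.t list) : int Seq.t =
  let q0 =
    List.mapi (fun i s -> (i, IntSet.to_seq s)) sets
    |> List.fold_left (fun q (i, s) -> push i s q) PQ.empty
  in
  Seq.unfold
    (fun q ->
      match PQ.min_binding_opt q with
      | None -> None
      | Some ((x, idx), tl) -> Some (x, push idx tl (PQ.remove (x, idx) q)))
    q0

(* Keep an element only if its run of equal values has length k. *)
let inter_all (sets : IntSet.t list) : int list =
  let k = List.length sets in
  let flush (x, n) acc = if n = k then x :: acc else acc in
  let run, acc =
    Seq.fold_left
      (fun (run, acc) x ->
        match run with
        | Some (y, n) when y = x -> (Some (y, n + 1), acc)
        | Some r -> (Some (x, 1), flush r acc)
        | None -> (Some (x, 1), acc))
      (None, []) (merged sets)
  in
  List.rev (match run with Some r -> flush r acc | None -> acc)
```

Difference could be handled along the same lines by also tracking which
source index each head came from and keeping only elements whose run comes
solely from the first set.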

I'm not optimistic that this will actually improve performance.
Beating the implementation in the standard library is harder than one
might think.
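
For reference, the baseline that any multi-way version has to beat is just
a fold of the standard library's pairwise union. A minimal sketch, assuming
integer elements (IntSet and union_fold are hypothetical names):

```ocaml
module IntSet = Set.Make (Int)

(* Pairwise baseline: fold the stdlib union over the list, left to right. *)
let union_fold (sets : IntSet.t list) : IntSet.t =
  List.fold_left IntSet.union IntSet.empty sets
```

The caller controls the order in which sets are combined simply by
reordering the list.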


-- 
j h woodyatt <jhw@wetware.com>
markets are only free to the people who own them.

