From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id MAA10442; Thu, 29 Aug 2002 12:16:58 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id MAA10248 for ; Thu, 29 Aug 2002 12:16:57 +0200 (MET DST) Received: from shiva.jussieu.fr (shiva.jussieu.fr [134.157.0.129]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id g7TAGuX27436 for ; Thu, 29 Aug 2002 12:16:56 +0200 (MET DST) Received: from ibm3.cicrp.jussieu.fr (ibm3.cicrp.jussieu.fr [134.157.15.3]) by shiva.jussieu.fr (8.12.5/jtpda-5.4) with ESMTP id g7TAGsjR001260 ; Thu, 29 Aug 2002 12:16:54 +0200 (CEST) Received: from ibm1.cicrp.jussieu.fr (ibm1.cicrp.jussieu.fr [134.157.15.1]) by ibm3.cicrp.jussieu.fr (8.8.8/jtpda/mob-V8) with ESMTP id MAA21030 ; Thu, 29 Aug 2002 12:15:15 +0200 Received: from localhost (fernande@localhost) by ibm1.cicrp.jussieu.fr (8.8.8/jtpda/mob-v8) with SMTP id MAA13238 ; Thu, 29 Aug 2002 12:13:19 +0200 Date: Thu, 29 Aug 2002 12:13:18 +0200 (DST) From: Diego Olivier Fernandez Pons To: Tibor Simko cc: caml-list@inria.fr Subject: Re: [Caml-list] intersecting huge integer sets In-Reply-To: <87n0r835w9.fsf@pcdh91.cern.ch> Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="-2036527359-1850874265-1030615998=:108094" X-Antivirus: scanned by sophie at shiva.jussieu.fr Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. ---2036527359-1850874265-1030615998=:108094 Content-Type: TEXT/PLAIN; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Bonjour, I made a small module with tries in base n where n is inferior to the native integer size. I supports big-endian tries and little-endian tries (in fact, user defined decomposition) avaible functions : - insert (log n) - delete (log n) - from_list (n log n) - to_list (unknown) - union (shoud be n) - intersection (should be n) I do not know if they are faster or not than bit-vectors or patricia trees, but they save a lot of space A set of integers from 0 to 9999 (I tried with 1 000 000 items as you suggested but my function never ended. I remember a computation by Xavier Leroy showing that a counter could hardly overflow) (results given by JC Fill=E2tre's module Size) avl set : 200 000 bytes red-black set : 160 012 bytes 30 bit trie big-endian : 5 524 bytes 31 bit trie big-endian : 5 312 bytes 30 bit trie little-endian : 15 020 bytes 31 bit trie little-endian : 16 016 bytes The implementation is not speed optimized. In fact one could also save some more space compacting bits for bases inferior to int_size / 2 Diego Olivier ---2036527359-1850874265-1030615998=:108094 Content-Type: APPLICATION/octet-stream; name="bitSet.ml.gz" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: bitSet H4sICMnjbT0AA2JpdFNldC5tbADVV81u2zgQvvsp2J4kN+m66C3bFN3D3oq+ QBAItEWvCUuiIdGNXOSF+xY7nCGpESW7wXpRIDokMjk/38x8Q47s6aBEp6y4 F9+E2QrdWLHEBdm28rRYLCp4X1dmsy86/UOB3McVrGbLpdiaZmO1aTpRKlH+ 3Jj6YDrtVsRymZNmqzZirf8pVFNq2RTuZ6u6Y+Ucbo+kvxDiWfTiaaca8SQ+ cW+3n2Hj7s7reEFYTExmvahNyRWjTu42/xh2PK7BAAMya/vhUfRDLJW2tlJT VR5DP4nhoX8csM9hHZudRey8b1tTFxchAFpwscL3nZKlM26lrtwq/n7HPS9F NmPSyadOWb6ssbIa+Q1ucWcx4zkxQTUb4SCj71AzRz0CkDpnxap0Z88ZX+Eu mbCmOEi7A9VBiBmf2UN2m8PPVhK9j91RVZXqiNao+kO1hoPJcgflm8hWN+Lh +fnR5+/YaDshGEh9EFVXiX4kq7tC1Qd7SkvKjGKS26PC9QKDl1WnvH7TqdYW tez28F6qnjsGI27jBtSVcnbAhFP6Lquja2oCRGra+RWilnazE2iukk3pRZ+0 3eG2g7Dy4ZCQaUnGO4lSCLSVulPi717bgVkesXtFqAJr4ZQJICHAxeg2cG0L FHGL4i2zcicog07lbRDvUZ5np0cffj/hqkcdnCMYFjPpxGyGqrBdQYnF4xPC +sv9f1/LveKNlyF/sjwPyWYPar7PqBM+3fIsIUa+n+pGYCiUz4OmvVnUjXoq xsg35nDywUyhRvELcFOZs5CjYM75LEY0Hrhh2xN3lYUu7/NQXGSHIxx1jfJt Uir4q35/m7AGGDVGbJ8+7Z/YJx7ylX3CrFzoE56e19UnLL7X0Ccp3P/cJ2To Up8wVy/vEyddq3qt2muJx6xcIJ7zGWMscnCIfnTDeivcm5BAGnKuIGbB7lS6 RacCkQE8E1NyYb5IYrYMXHtSgHS46+yFKZlSy6bh2UkrGsn8GYpSYR5Ohisc o86NVYOl2IjDYIXJMCRDNXKq+JbA3tNg7NaTwfjR78e66e203ntWb2C0M5Wh ry/iYQ8VdPHP48j2MFN+CL2koMq/kgyRudbxktTVKBp6nkX36/iSCKNdzCuP w5MKwv1y3nkACvlZjG0llVwkx9kwQc5nYJUqDOw/p3EJ5ypO3ynDQld8hXU4 8Q/+I8R1RY79ESMCuvurMB3JFRC7XjvB0Vi+McfGFo1pJqO0RxVvdLzDQfce ir11jvypvYVbWMPqCpCPigl3f2kGkvpaaaQmH9GRnsH63T0YA3q+8QuoX5pG iT/xNa7HhmpMW8uKvrRnjvO5Qy18Ikzvq4zE06xQKpJDMZ4uvtOTrT3uDegi V8dyRbxE4mK+SLfYB4BVbacoSLphpFj7CAh7Jm/EOmdgn0XmGVpIn/ObwNli 7a6OfE6wGEuhXvhmIxk6ckYCc4Yw8EKOzdHiOh9PC1fMNy+iYXgYF2/HOfVg cc9jdO8jfUbH8JzJCZ86XPng6/b/qxsZP5/yF9QPcpZAfYX1o6ReX7hJMv4F Wur3sesTAAA= ---2036527359-1850874265-1030615998=:108094-- ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners