From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by c5ff346549e7 (Postfix) with ESMTPS id C98265D6 for ; Sat, 9 Oct 2021 19:59:26 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.84,326,1620684000"; d="asc'?scan'208";a="532999086" Received: from prod-listesu18.inria.fr (HELO sympa.inria.fr) ([128.93.162.160]) by mail2-relais-roc.national.inria.fr with ESMTP; 09 Oct 2021 21:59:24 +0200 Received: by sympa.inria.fr (Postfix, from userid 20132) id D4157E01CC; Sat, 9 Oct 2021 21:59:24 +0200 (CEST) Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id 22A76E00CF for ; Sat, 9 Oct 2021 21:59:19 +0200 (CEST) Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=christophe@raffalli.eu; spf=Pass smtp.mailfrom=christophe@raffalli.eu; spf=None smtp.helo=postmaster@mail.raffalli.eu IronPort-PHdr: =?us-ascii?q?A9a23=3AJmIF9haKSOZnEdt69O+GKEf/LTGg0IqcDmcuAno?= =?us-ascii?q?PtbtCf+yZ8oj4OwSHvLMx1gePDNyQtq0MotGVmpioYXYH75eFvSJKW713fDhBt?= =?us-ascii?q?/8rmRc9CtWOE0zxIa2iRSU7GMNfSA0tpCnjYgBaF8nkelLdvGC54yIMFRXjLwp?= =?us-ascii?q?1Ifn+FpLPg8it2O2+5YHfbx9MiTagbr9/LBe7phjNu8cLhodvNrw/wQbTrHtSf?= =?us-ascii?q?ORWy2JoJVaNkBv5+8y94p1t/TlOtvw478JPXrn0cKo+TbxDETQpKHs169HxtRn?= =?us-ascii?q?CVgSA+H0RWXgLnxVSAgjF6Bb6Xortsib/q+Fw1jWWMdHwQLspQjmp8btlRwH0h?= =?us-ascii?q?ycGLz458X/YispsjKJAvRmtowVzz5PIbI2JMfZzeL7Wc9EHSmpbRstfWSxPDJ2?= =?us-ascii?q?hYYUMAeoOMvpXoJT/qFYVohuyGROhCfnzxjNUhHL727Ax3eQ7EQHB2QwtB8wDs?= =?us-ascii?q?HTOrNX0L6cSTee1zLHIzTrdcvhYxS3y6IbGch87pfGMWbNwetfWxEYzFwPFlFS?= =?us-ascii?q?QqZf5PzOSzuQNs3aU4vF6Ve21hW4npRt+ojyrxsctkIXGmJ8Vx0nC+C5kz4k7O?= =?us-ascii?q?ce2R1RnYd64DpRQrSeaOpN5TM4tTG9kpSg0xL0Jt5O0YiUEx4kqyh7DZ/GDb4W?= =?us-ascii?q?G7BLtWPuPLDpmmn5odqyyihSw/EWh1+DxS8e53EpUoidFlNTHq34D1xvW6sedS?= =?us-ascii?q?/t9+F+s2TmO1wDP6uFEPFs7mbDHJJ4mx749kIcYv0fbHiLuhUn7g6Cbel8g9+S?= =?us-ascii?q?18ejqZqnqqoWCO4J6hAzyKrkiltCxDOgiLAQDX2eW9f6i2LDt/UD1WqtGg/4wn?= =?us-ascii?q?6LEqp7VP94bqbS8AwJN0oYs9RK/DzC+3dQdh3YHLVZFdAidj4fzNVHOLur3DfO?= =?us-ascii?q?7g1Stijtk2e3GMqXgApXLMHfDjK/scaty5kNT0gY+yc1T64hQB70dOv7/REH8u?= =?us-ascii?q?dLAAh88KQO0wuLnCNtn1oMZXGKCGrOWMKPIsV+J/eIvP+6MZJcVuDnjMPUl/eT?= =?us-ascii?q?hjXE3mVAHeamp2ZoXZGqmEfR7O0mZe2bjgs8dEWcWuQozVPDlh0eHUT5XfnqyW?= =?us-ascii?q?6M85ionCI+9FofCRoWtgKSb0yuhH51WYHpGClGWHnvyeYWEQaREVCXHB8Z7kzt?= =?us-ascii?q?MbbW7Ro473Fn6sQ3ijaJqNOfV4CQwu5n/ksVz46vLiUdh2yZzCpG203uMVH1zh?= =?us-ascii?q?mMFDwU/0q52pUh8ggOI0bJ5mOBfDdxez+lAXAkzMZrdieFnXYOhEjndd8uEHQ7?= =?us-ascii?q?1Cu6tBis8G5dom4dmi6dVHtyjilbZ1iSkCrsRlvqGGc5smko592L4Kch0z3LHk?= =?us-ascii?q?qQ83QFOqiRnL2SoiKd78wmVCpObyi2k?= IronPort-HdrOrdr: =?us-ascii?q?A9a23=3A+ga2h6GG93Mif887pLqE5ceALOsnbusQ8zAX?= =?us-ascii?q?Po5KJyC9Vvbo8/xG/c5rsCMc7Qx6ZJhOo7290cW7LU80sKQFhrX5Xo3SPjUO2l?= =?us-ascii?q?HJEGgK1+KLqAEIWReOldK1vp0AT0ERMrLNMWQ=3D?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0BUAAASaxph/+laW5BaGgEBAQEBAQEBA?= =?us-ascii?q?QEDAQEBARIBAQEBAgIBAQEBghmDIlcoEjGER4kEiGkDkGeMfQsBAwEIBTUKAgQ?= =?us-ascii?q?BAYRhAoJsHgcBBDQTAQIEFQEBBQEBAQIBAwMEAYEQE4VoDYI1KQGDYwEBAQECA?= =?us-ascii?q?SMdAQEjBwoDAQQLCw4KKgICVwaDBYJmIAEPpnJ6gTETboIHAQEGhiiBUwcDBoE?= =?us-ascii?q?6AYFShCeGf3pDP4FOgzlKNz6CYgKEd4Jkg3gZBoErEIEFEwWBWpEXrBuDMoE3i?= =?us-ascii?q?QaTe0UQgSuCKpFwN4ozhjaiT5kHgXckgVlNJk2CNQEzUBkOjiCDcoEBigEdMjg?= =?us-ascii?q?CBgEKAQEDCWkmE4hTAQE?= X-IPAS-Result: =?us-ascii?q?A0BUAAASaxph/+laW5BaGgEBAQEBAQEBAQEDAQEBARIBAQE?= =?us-ascii?q?BAgIBAQEBghmDIlcoEjGER4kEiGkDkGeMfQsBAwEIBTUKAgQBAYRhAoJsHgcBB?= =?us-ascii?q?DQTAQIEFQEBBQEBAQIBAwMEAYEQE4VoDYI1KQGDYwEBAQECASMdAQEjBwoDAQQ?= =?us-ascii?q?LCw4KKgICVwaDBYJmIAEPpnJ6gTETboIHAQEGhiiBUwcDBoE6AYFShCeGf3pDP?= =?us-ascii?q?4FOgzlKNz6CYgKEd4Jkg3gZBoErEIEFEwWBWpEXrBuDMoE3iQaTe0UQgSuCKpF?= =?us-ascii?q?wN4ozhjaiT5kHgXckgVlNJk2CNQEzUBkOjiCDcoEBigEdMjgCBgEKAQEDCWkmE?= =?us-ascii?q?4hTAQE?= X-IronPort-AV: E=Sophos;i="5.84,326,1620684000"; d="asc'?scan'208";a="395404066" X-MGA-submission: =?us-ascii?q?MDFNpb3LHLfOolJPM5PUkYGDeA6XEe3MngYEPQ?= =?us-ascii?q?Kn2SkiKeSIsG1geDQ1LCcWoLzGzPLi16n4DRYaY8Jh4IRTP8IewQFBJ+?= =?us-ascii?q?ANCMJzDsdF9q+XgRwEDMR9uldMFj85Q8QuhD8Ou8bqKRI5Ls+SwCkDpV?= =?us-ascii?q?uMOJa7pEft32GrzRFZ5Q3RmQ=3D=3D?= Received: from vmi311709.contaboserver.net (HELO mail.raffalli.eu) ([144.91.90.233]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 09 Oct 2021 21:59:18 +0200 Received: from localhost (28.235.254.103.dsl.dyn.mana.pf [103.254.235.28]) by mail.raffalli.eu (Postfix) with ESMTPSA id F22E2640E9C; Sat, 9 Oct 2021 21:59:15 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=raffalli.eu; s=mail; t=1633809557; bh=uj2MpCBj3vKu2g39pO+hN4NyelHyhYlK0w83pCp7CU0=; h=Date:From:To:Cc:Subject:References:In-Reply-To; b=xTPQrpgY9kvWjIWpKK0m10a28og9Of4pw64BPGlKlhUu3/XQZOJ7cgp7NUKUrOxeE HKGRZP3vQVq7ojCPp8BB08itM/Xpbeo3iv/0t6/Nactg22jL+SVKtyfZjxHam2tsdk W1/6QrJOj6iJptUTv9lp+/nnP+RRgQup+KaoqA+Pd8sGHnNfarvlCJMAe+PNroGSeX a7gdWeTC+VsT8nMdCB9SmiLZ2mFdgQ0hMcbAP50Sa1edirnzQfA/EUrU1t3Tcnue2x 2XcCfmpPpHY3t3tjdJmR8r7RxTQNAKSxlHSf6ehbPBD/7hOrZ4JCaYG6QndAuKWpg+ i3viwtwOf+MVg== Date: Sat, 9 Oct 2021 09:59:10 -1000 From: Christophe Raffalli To: Mario Pereira Cc: caml-list@inria.fr Message-ID: <20211009195910.mqivg4tsafiyb3tr@oulala> References: <20211009040533.chu7kqqk27r5wcok@oulala> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="o3sgaj7qbnwlvcxj" Content-Disposition: inline In-Reply-To: Subject: Re: [Caml-list] O(n ln k) sorting for ocaml on github and a challenge Reply-To: Christophe Raffalli X-Loop: caml-list@inria.fr X-Sequence: 18599 Errors-To: caml-list-owner@inria.fr Precedence: list Precedence: bulk Sender: caml-list-request@inria.fr X-no-archive: yes List-Id: List-Help: List-Subscribe: List-Unsubscribe: List-Post: List-Owner: List-Archive: Archived-At: --o3sgaj7qbnwlvcxj Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 21-10-09 09:58:12, Mario Pereira wrote: > Hi Christophe, > > Your implementation reminds me very much of TimSort (for an OCaml > implementation, see for instance=C2=A0https://github.com/LesBoloss-es/sor= ting/blob/ > master/src/list/timsort.ml). This is also a stable algorithm. Hi M=C3=A1rio, Yes the idea is similar, but - I use a reference to the list for the sorted/reverse-sorted block I build= in first phase (like Timsort on arrays, which actually is stable, in place a= nd as good complexity, but you have to get the balancing right, see below) - in the second phase, I use a merge, returning sorted list at even depth a= nd rev-sorted at odd depth (like OCaml's List.sort). This avoid the Asc/Desc constructor and the List.rev. - the balancement is ensured by a simple arithmetic computation, not by a s= tack and an invariant on sizes (probably building a balanced binary tree with the initial block instead of a list could be a bit better, I am not sure). - Last but not least: the code you point seems broken somewhere, it does not ends on large lists (probably quadratic because the balancing with size is probably wrong. cf the paragraph "bug" on wikipedia about Timsort. Anyway= , the code you point is probably optimised for arrays. > Was TimSort an inspiration? The inspiration was a paper I read long ago about the O(n ln n) not being t= he best we can do for lists, as O(n ln k) can be achieved where k is the numbe= r of change of direction. My work aims at being a replacement for OCaml sort (for list currently). He= re is the timing I get: Correctness test passed Stability test passed Random lists: random: tf =3D 1.44, tg =3D 1.39, factor =3D 1.04x, gain =3D 3.70% random small: tf =3D 1.10, tg =3D 1.19, factor =3D 0.92x, gain =3D -8.2= 7% Worst cases: worst1: tf =3D 0.83, tg =3D 0.71, factor =3D 1.17x, gain =3D 14.2= 1% worst2: tf =3D 0.65, tg =3D 0.70, factor =3D 0.93x, gain =3D -7.1= 7% Sorted (partially) lists: sorted: tf =3D 0.65, tg =3D 0.01, factor =3D 97.86x, gain =3D 98.= 98% reversed: tf =3D 0.65, tg =3D 0.09, factor =3D 7.62x, gain =3D 86.8= 8% sorted@rev: tf =3D 0.65, tg =3D 0.21, factor =3D 3.14x, gain =3D 68.1= 9% rev@sorted: tf =3D 0.65, tg =3D 0.21, factor =3D 3.13x, gain =3D 68.0= 3% Shuffled lists (permute k times 2 elements in a sorted list): shuffle 10: tf =3D 0.66, tg =3D 0.41, factor =3D 1.61x, gain =3D 38.0= 1% shuffle 100: tf =3D 0.64, tg =3D 0.51, factor =3D 1.25x, gain =3D 20.2= 8% shuffle 1000: tf =3D 0.63, tg =3D 0.56, factor =3D 1.14x, gain =3D 11.9= 4% shuffle 10000: tf =3D 0.64, tg =3D 0.61, factor =3D 1.04x, gain =3D 4.30% factor > 1 means Block_sort is faster that List.sort. Remark that in the ca= se of sorted list, my algorithm returns the original list using constant memory. Cheers, Christophe PS: I probably will try something much more like Timsort on arrays later ... > Cheers > -- > M=C3=A1rio --o3sgaj7qbnwlvcxj Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEJ369adg37nCCF0BCVXIaPfvmsfAFAmFh9IoACgkQVXIaPfvm sfB5Rgf/UeM6SnUbZnzIismhXUQoPmtzC9LWbyK7MsAlJ0nnsc0wsrNzP2R1ORTW Sbx03SV3zTTQE8QcBmXfuUCR9IrnXEzqEHIKOcQCWpW9FQQfGTDwFpOCY7gEWRhp 2wAh5xGyINz2PjjP1ZvWdTg2QZxSpe40y+HX3W1aC2/QRSSSIJ0N0SfV+9G2PDus EbrXWH2QyP6wx4ZijO5SGXs/Y9Y6mu9r6qUrjLkBbTDnC10gCcyjzVaXm6F6PwcV VSMuPixMAVBPc0FH1uM+uoU8FmBkBcZxIaWB7ot9MGXcOiTjcp3oLzSJY+XLhWPa PJSFo5fHKA8oGoqK01JO8VO1cqNmLA== =kQoB -----END PGP SIGNATURE----- --o3sgaj7qbnwlvcxj--