From: Peter Stephenson
Subject: Re: find duplicate files
Date: Mon, 8 Apr 2019 16:24:15 +0100
List-Id: Zsh Users List
Message-ID: <1554737055.5822.14.camel@samsung.com>
In-Reply-To: <391277a7-d604-4c20-a666-a2886b1d2939@eastlink.ca>
References: <86v9zrbsic.fsf@zoho.eu> <20190406130242.GA29292@trot>
 <86tvfb9ore.fsf@zoho.eu> <86mul2apj8.fsf@zoho.eu>
 <20190408143748.GA21630@trot>
 <391277a7-d604-4c20-a666-a2886b1d2939@eastlink.ca>

On Mon, 2019-04-08 at 07:58 -0700, Ray Andrews wrote:
> Pardon a comment from the peanut gallery, but it seems strange to me
> that the issue of duplicate files would be something that would not have
> been solved definitively 50 years ago. I'd have thought that for
> donkey's years there would be utilities that would let you find
> duplicates at various levels of rigour because it's so obviously an
> issue that's bound to come up. No?

Slightly philosophical answer.
(The right answer is probably that actually there are such utilities but
a lot of us don't know them.) Nothing really to do with zsh.

One of the answers may be the way issues of this kind typically arise.
For example: you've copied a whole pile of files somewhere else to
rationalise an interface, or make a new project, or something like that.
Then the problem isn't so much looking for individual files that happen
to be the same as looking for a bunch of stuff that got copied together
and could really be re-rationalised to avoid the copy. But often it's a
modified copy, so the files are similar but not quite the same.

On the other hand, if you're looking not for piles of stuff that got
copied but shouldn't have been, but for individual files that happen to
be the same, then you're quite likely to get lots of false positives,
e.g. files based on a template that are logically different in the sense
that they are generated with different arguments but happen to turn out
the same (that's just one random example off the top of my head), and
you probably wouldn't want to rationalise them down to one.

So the use case here maybe isn't quite as common as you might think.

pws