From: Bart Schaefer <schaefer@brasslantern.com>
To: Charles Blake <charlechaud@gmail.com>
Cc: Zsh Users <zsh-users@zsh.org>
Subject: Re: find duplicate files
Date: Mon, 8 Apr 2019 10:14:20 -0700
Message-ID: <CAH+w=7ZSJsCPhtG-vRDqxAoMEk3jAL5UpC+g-PCyKWm2isxqEA@mail.gmail.com>
In-Reply-To: <CAKiz1a90DsdjXmrE1wEviN5R9hP=225tVfQPFTJkC73f9pQ74A@mail.gmail.com>

On Mon, Apr 8, 2019 at 4:18 AM Charles Blake <charlechaud@gmail.com> wrote:
>
> >I find that a LOT more understandable than the python code.
>
> Understandability is, of course, somewhat subjective (e.g. some might say
> every 15th field is unclear relative to a named label)

Yes, lack of multi-dimensional data structures is a limitation on the
shell implementation.

I could have done it this way:

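zmodload -F zsh/stat b:zstat   # provides the zstat builtin
names=( **/*(.l+0) )           # plain files, found recursively
zstat -tA stats $names         # elements look like "size 1234"
sizes=( ${(M)stats:#size *} )  # keep only the "size ..." elements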

I chose the other way so the name and size would be directly connected
in the stats array rather than rely on implicit ordering (to one of
your later points, bad things happen with the above if a file is
removed between generating the list of names and collecting the file
stats).

> >unless you're NOT going to consider linked files as duplicates you
> >might as well just compare sizes.  (It would be faster to get inodes
>
> What may have been underappreciated is that handling hard-link identity also
> lets you skip recomputing hashes over a hard link cluster

Yes, this could be used to reduce the number of names passed to
"cksum" or the equivalent.

> Almost everything you say needs a "probably/maybe"
> qualifier.  I don't think you disagree.  I'm just elaborating a little
> for passers by.

Absolutely.  The flip side of this is that shells and utilities are
generally optimized for the average case, not for the extremes.
