From: Ralph Corderoy <ralph@inputplus.co.uk>
Subject: [TUHS] Re: Likely a one-liner in Unix
Date: Tue, 11 Jun 2024 09:05:06 +0100
Message-ID: <20240611080506.73D7B21309@orac.inputplus.co.uk>
In-Reply-To: <a5ddb4f9-e72f-4e0e-ac65-48aadcaed458@ucsb.edu>
Hi James,
> > > "Show me the last 5 files read in a directory tree"
Given sort(1) gained -u for efficiency, I've often wondered why, in
those constrained times, it didn't have a ‘-m n’ to output only the
n ‘minimums’, i.e. the effect of piping its output into ‘sed ${n}q’.
With ‘-m 5’, sort could track the current fifth entry and discard input
which was bigger, so avoiding both storing many unwanted lines and
finding each new line's location within them.
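A minimal sketch of that idea, as an awk filter wrapped in a shell function rather than a real sort(1) option. It holds at most n lines and drops any input line that is no smaller than the current n-th. The name topn is made up for illustration, and it compares whole lines as strings only, with none of sort's key handling.

```shell
# topn N: print the N smallest input lines in ascending order,
# holding at most N lines in memory.  Hypothetical helper, not a sort(1) flag.
topn() {
    awk -v n="$1" '
    {
        # Once n lines are held, discard anything not below the n-th.
        if (c == n && ($0 "") >= (a[n] "")) next
        # Insertion step: start at the free slot, or overwrite the n-th.
        i = (c < n) ? ++c : n
        while (i > 1 && (a[i-1] "") > ($0 "")) { a[i] = a[i-1]; i-- }
        a[i] = $0
    }
    END { for (i = 1; i <= c; i++) print a[i] }'
}

printf '%s\n' 5 3 9 1 7 2 | topn 3
```

Here the 7 is rejected with a single comparison against the held third entry, which is the saving the hypothetical ‘-m 5’ would give.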
> OK, I'll bite (NB: using GNU find):
I think the POSIX way of getting the atime would be ‘LC_TIME=C ls -lu’
and then parsing the two possible date formats. So non-POSIX find is
simpler. Also, GNU find shows me the sub-second part but ls doesn't.
Neither does GNU ‘stat -c '%X %n'’.
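To illustrate the two date formats, a small sketch assuming mktemp and a touch that accepts GNU-style -d dates; the directory and file names are arbitrary. LC_ALL=C is used instead of LC_TIME=C only so that an inherited LC_ALL cannot override it.

```shell
d=$(mktemp -d)
touch -a -d '2001-02-03 04:05:06' "$d/old"   # atime well over six months ago
touch -a "$d/new"                            # atime is now

# A recent atime is listed as 'Mon DD HH:MM', an old one as 'Mon DD  YYYY',
# so a parser has to handle both, and neither carries seconds.
LC_ALL=C ls -lu "$d/old" "$d/new"

rm -rf "$d"
```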
> find "$directory_tree" -type f -printf "%A+ %p\n" | sort -r | cut -d' ' -f2 | head -5
- I'd switch the atime format to seconds since epoch for easier
formatting given it's discarded.
- When atimes tie, sort's -r will give file Z before A so I'd add some
-k's so A comes first.
- I'd move the head to before the cut so cut processes fewer lines...
- But on so few lines, I'd just use sed to do both in one.
    find "$@" -type f -printf '%A@ %p\n' |
    sort -k1,1nr -k2 |
    sed 's/^[^ ]* //; 5q'
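One way to try this pipeline out, as a sketch: wrap it in a function and point it at a scratch tree whose atimes are set explicitly. The function name last_read and the file names are invented here, and GNU find plus a touch that accepts -d dates are assumed.

```shell
# The pipeline above as a function (hypothetical name).
last_read() {
    find "$@" -type f -printf '%A@ %p\n' |
    sort -k1,1nr -k2 |
    sed 's/^[^ ]* //; 5q'
}

d=$(mktemp -d)
i=1
for f in a b c d e f g; do
    # Give each file a distinct atime, a day apart, a oldest and g newest.
    touch -a -d "2024-01-0$i 12:00:00" "$d/$f"
    i=$((i + 1))
done

last_read "$d"    # lists g, f, e, d, c under $d, newest atime first
rm -rf "$d"
```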
Remaining issues...
If tied entries bridge the top-five border then this isn't shown.
Is the real requirement to show files with the five most recent distinct
atimes?
    awk '{t += !s[$1]; s[$1] = 1} t == 6 {exit} 1'
Though this might give many lines. Instead, an ellipsis could show
a tie bridged the cut-off.
    awk 't {if ($1 == l) print "..."; exit} NR == 5 {l = $1; t = 1} 1'
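A sketch of how these two filters behave, assuming they sit between the sort and a sed that only strips the atime, so that field one is still the atime when awk sees it. The function names and the sample data are invented for illustration. Note that awk may compare two numeric-looking fields as numbers, so atimes differing only in their last few digits could be treated as tied.

```shell
# Keep every line whose atime (field 1) is among the first five distinct ones.
distinct5() {
    awk '{t += !s[$1]; s[$1] = 1} t == 6 {exit} 1'
}

# Keep five lines, then print "..." if line six shares line five's atime.
mark_tie() {
    awk 't {if ($1 == l) print "..."; exit} NR == 5 {l = $1; t = 1} 1'
}

# Already-sorted 'atime path' lines; e and f tie across the top-five border.
sample() { printf '%s\n' '9 a' '8 b' '7 c' '6 d' '5 e' '5 f' '4 g'; }

sample | distinct5 | sed 's/^[^ ]* //'   # a b c d e f, one per line
sample | mark_tie  | sed 's/^[^ ]* //'   # a b c d e, then ...
```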
Paths can contain linefeeds, and some versions of these tools can work
with NUL-terminated records instead, which can be tediously employed.
    find "$@" -type f -printf '%A@ %p\0' |
    sort -z -k1,1nr -k2 |
    sed -z 's/^[^ ]* //; 5q' |
    tr \\0 \\n
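The final tr makes the list readable but loses the distinction between a record boundary and a linefeed inside a path. If the paths are to be fed to another program, a sketch of an alternative is to keep the NULs and hand them to xargs -0; this assumes GNU find, sort, sed and xargs, and the function and file names are made up.

```shell
# As above, but leaving the output NUL-terminated (hypothetical name).
last_read0() {
    find "$@" -type f -printf '%A@ %p\0' |
    sort -z -k1,1nr -k2 |
    sed -z 's/^[^ ]* //; 5q'
}

d=$(mktemp -d)
# The first file name contains a linefeed.
touch -a -d '2024-01-02 12:00:00' "$d/one
two"
touch -a -d '2024-01-01 12:00:00' "$d/three"

# Bracket each path so an embedded linefeed is visible as part of one name.
last_read0 "$d" | xargs -0 -r printf '<%s>\n'
rm -rf "$d"
```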
David Wheeler has a nice article he maintains on unusual characters in
filenames: how to cope, and what other systems do, e.g. Plan 9.
Fixing Unix/Linux/POSIX filenames: control characters (such as
newline), leading dashes, and other problems
David A. Wheeler, 2023-08-22 (originally 2009-03-24)
https://dwheeler.com/essays/fixing-unix-linux-filenames.html
As he writes, Linux already returns EINVAL for some paths on some
filesystem types. A mount option which had a syscall return an error on
meeting an insensible path would be useful. It avoids any attempt at
escapement and its greater risk of implementation errors. I could
always re-mount some old volume without the option to list the directory
and fix up its entries. The second-best day to plant a tree is today.
--
Cheers, Ralph.