* Re: [TUHS] Command line options and complexity
@ 2020-03-04 14:06 Nelson H. F. Beebe
2020-03-04 16:17 ` John P. Linderman
0 siblings, 1 reply; 68+ messages in thread
From: Nelson H. F. Beebe @ 2020-03-04 14:06 UTC (permalink / raw)
To: tuhs
Arnold Robbins writes:
>> There was no tac in V7 Unix. It was first posted to USENET, I don't
>> know by who, and picked up by Linux and *BSD.
That brought back memories, and to verify them, I checked the tac.c
source code in the latest GNU coreutils test release. It says
/* Written by Jay Lepreau (lepreau@cs.utah.edu).
GNU enhancements by David MacKenzie (djm@gnu.ai.mit.edu). */
So my memory was right that my old friend Jay was the author. Sadly,
we lost him in September 2008: see
https://www.legacy.com/obituaries/saltlaketribune/obituary.aspx?page=lifestory&pid=117597321
Jay founded the influential Flux group in advanced networking research:
http://www.flux.utah.edu/profile/lepreau
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: beebe@math.utah.edu -
- 155 S 1400 E RM 233 beebe@acm.org beebe@computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
* Re: [TUHS] Command line options and complexity
2020-03-04 14:06 [TUHS] Command line options and complexity Nelson H. F. Beebe
@ 2020-03-04 16:17 ` John P. Linderman
2020-03-04 17:25 ` Bakul Shah
` (2 more replies)
0 siblings, 3 replies; 68+ messages in thread
From: John P. Linderman @ 2020-03-04 16:17 UTC (permalink / raw)
To: Nelson H. F. Beebe; +Cc: The Eunuchs Hysterical Society
The "statute of limitations" must have passed long ago, so I confess to
having been the author of the original tac (cat in reverse). I was working
on a project that wrote log files, but the logs were very "bursty". Minutes
might go by without any activity, followed by a burst of logging activity.
We often wanted to see *the most recent* burst of activity, so "tail -f"
wouldn't do the job. It would show the *next* burst of activity, which
might not occur for quite some time. Somebody posted a functional
equivalent on some netnews group, but it was *ghastly*. I think it did
seeks of -1 characters at a time to accumulate each line. That would have
been fast enough to feed our pathetic 1200 baud terminals, but it would
have beat the system to death, and that would have been a disservice to
other users. My version did reads of 512 bytes on 512-byte boundaries, so
it put much less load on the system. I couldn't bear to see something like
the netnews version get adopted. The software release process at the Labs
was a bureaucratic nightmare, so I "tossed my version over the wall", into
the arms of Andy Tanenbaum, as I recall. He made it public, attributed to
"an unknown author".
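
The block-aligned approach described above can be sketched roughly as
follows. This is an illustration in Python, not the original C source: the
name tac_lines, the helper _split_keep_newlines and the block parameter are
all assumptions made for the example. It reads whole blocks that start on
block boundaries, working backward from the end of the file, and carries
the leading partial line over to the next (earlier) block.

```python
import os

BLOCK = 512  # block size mentioned in the message; an illustrative default


def _split_keep_newlines(data):
    """Split bytes into lines, keeping each trailing newline."""
    pieces = data.split(b"\n")
    lines = [p + b"\n" for p in pieces[:-1]]
    if pieces[-1]:
        lines.append(pieces[-1])  # final line without a newline
    return lines


def tac_lines(path, block=BLOCK):
    """Yield the lines of a file last-to-first using block-aligned reads."""
    with open(path, "rb") as f:
        pos = f.seek(0, os.SEEK_END)  # current end of the unread region
        pending = b""                 # possibly incomplete line from later blocks
        while pos > 0:
            # Start of the block containing byte pos-1, on a block boundary.
            start = (pos - 1) // block * block
            f.seek(start)
            data = f.read(pos - start) + pending
            lines = _split_keep_newlines(data)
            if start > 0:
                # The first piece may have begun in an earlier block.
                pending, lines = lines[0], lines[1:]
            else:
                pending = b""
            yield from reversed(lines)
            pos = start
```

Each byte of the file is read once, one block per read call, instead of
one seek and one read per character.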
I don't know how Rob Pike got ahold of it, but he recognized that mailbox
files had the same bursty growth. Unlike our log files, whose contents were
acceptably understandable in reverse order, mail messages were hard to read
in reverse order, so he proposed making it possible to recognize the
headers at the start of each mail message, and put the entire message out
in readable order. I think that was a useful option, but the irony of Rob
adding an option to "tac" was hard to overlook.
The version out there now was rewritten by Jay Lepreau, it seems:
/*
* tac.c - Print file segments in reverse order
*
* Original line-only version by unknown author off the net.
* Rewritten in 1985 by Jay Lepreau, Univ of Utah, to allocate memory
* dynamically, handle string bounded segments (suggested by Rob Pike),
* and handle pipes.
*/
Dynamic buffer allocation rather than relying on the time-honored
512-bytes-is-enough assumption was a positive, as was supporting Rob's
suggestion. Handling pipes strikes me as a waste of code, but hey, anything
is better than that version I replaced.
* Re: [TUHS] Command line options and complexity
2020-03-04 16:17 ` John P. Linderman
@ 2020-03-04 17:25 ` Bakul Shah
2020-03-05 0:55 ` Rob Pike
2020-03-05 2:05 ` Kurt H Maier
2 siblings, 0 replies; 68+ messages in thread
From: Bakul Shah @ 2020-03-04 17:25 UTC (permalink / raw)
To: John P. Linderman; +Cc: The Eunuchs Hysterical Society
I missed knowing about tac till now. I’ve used tail -r since
1982 when Yost pointed out that tail -r|rev was equivalent
to a toy recursive C program I had written to reverse a file.
He was almost right!
rev(){int c=getchar();if(c==EOF)return;rev();putchar(c);}
* Re: [TUHS] Command line options and complexity
2020-03-04 16:17 ` John P. Linderman
2020-03-04 17:25 ` Bakul Shah
@ 2020-03-05 0:55 ` Rob Pike
2020-03-05 2:05 ` Kurt H Maier
2 siblings, 0 replies; 68+ messages in thread
From: Rob Pike @ 2020-03-05 0:55 UTC (permalink / raw)
To: John P. Linderman; +Cc: The Eunuchs Hysterical Society
I have no memory of this, but that doesn't mean it's false.
Also in my defense, suggesting an option compared to actually adding
the code is a lesser crime. Or is it?
Anyway I removed all the options from research cat, including -u. That
counts for something.
-rob
* Re: [TUHS] Command line options and complexity
2020-03-04 16:17 ` John P. Linderman
2020-03-04 17:25 ` Bakul Shah
2020-03-05 0:55 ` Rob Pike
@ 2020-03-05 2:05 ` Kurt H Maier
2020-03-05 4:17 ` Ken Thompson via TUHS
2 siblings, 1 reply; 68+ messages in thread
From: Kurt H Maier @ 2020-03-05 2:05 UTC (permalink / raw)
To: John P. Linderman; +Cc: The Eunuchs Hysterical Society
On Wed, Mar 04, 2020 at 11:17:46AM -0500, John P. Linderman wrote:
> I think that was a useful option, but the irony of Rob
> adding an option to "tac" was hard to overlook.
tac came back from Jersey waving flags?
khm
* Re: [TUHS] Command line options and complexity
2020-03-05 2:05 ` Kurt H Maier
@ 2020-03-05 4:17 ` Ken Thompson via TUHS
2020-03-05 14:53 ` Dan Cross
2020-03-05 21:50 ` Dave Horsfall
0 siblings, 2 replies; 68+ messages in thread
From: Ken Thompson via TUHS @ 2020-03-05 4:17 UTC (permalink / raw)
To: Kurt H Maier; +Cc: The Eunuchs Hysterical Society
do i get a prize:
ls -tj
/bin/ls: illegal option -- j
usage: ls [-ABCFGHLOPRSTUWabcdefghiklmnopqrstuwx1] [file ...]
* Re: [TUHS] Command line options and complexity
2020-03-05 4:17 ` Ken Thompson via TUHS
@ 2020-03-05 14:53 ` Dan Cross
2020-03-05 21:50 ` Dave Horsfall
1 sibling, 0 replies; 68+ messages in thread
From: Dan Cross @ 2020-03-05 14:53 UTC (permalink / raw)
To: Ken Thompson; +Cc: The Eunuchs Hysterical Society
On Wed, Mar 4, 2020 at 11:18 PM Ken Thompson via TUHS <tuhs@minnie.tuhs.org>
wrote:
> do i get a prize:
>
Depends on whether you do your grocery shopping at Trader Joe's.
> ls -tj
> /bin/ls: illegal option -- j
> usage: ls [-ABCFGHLOPRSTUWabcdefghiklmnopqrstuwx1] [file ...]
>
Very nice. Wasn't there something in the fortune file at one point about
the "Monty Python and the Holy Grail" bridge crossing scene where the
question was, "what $n$ lower case letters are not options to ls(1)?"
- Dan C.
* Re: [TUHS] Command line options and complexity
2020-03-05 4:17 ` Ken Thompson via TUHS
2020-03-05 14:53 ` Dan Cross
@ 2020-03-05 21:50 ` Dave Horsfall
2020-03-05 21:56 ` Warner Losh
1 sibling, 1 reply; 68+ messages in thread
From: Dave Horsfall @ 2020-03-05 21:50 UTC (permalink / raw)
To: The Eunuchs Hysterical Society
On Wed, 4 Mar 2020, Ken Thompson via TUHS wrote:
> do i get a prize:
> ls -tj
> /bin/ls: illegal option -- j
> usage: ls [-ABCFGHLOPRSTUWabcdefghiklmnopqrstuwx1] [file ...]
Another candidate for option-cleansing... Interesting; I get different
options with the Mac and FreeBSD:
Mac:
usage: ls [-ABCFGHLOPRSTUWabcdefghiklmnopqrstuwx1] [file ...]
FreeBSD:
usage: ls [-ABCFGHILPRSTUWZabcdfghiklmnopqrstuwxy1,] [-D format] [file ...]
So FreeBSD has added "y,D:" (in getopt(3)-speak); my eyes are burning...
-- Dave
* Re: [TUHS] Command line options and complexity
2020-03-05 21:50 ` Dave Horsfall
@ 2020-03-05 21:56 ` Warner Losh
2020-03-08 5:26 ` Greg 'groggy' Lehey
0 siblings, 1 reply; 68+ messages in thread
From: Warner Losh @ 2020-03-05 21:56 UTC (permalink / raw)
To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society
On Thu, Mar 5, 2020 at 2:51 PM Dave Horsfall <dave@horsfall.org> wrote:
> On Wed, 4 Mar 2020, Ken Thompson via TUHS wrote:
>
> > do i get a prize:
> > ls -tj
> > /bin/ls: illegal option -- j
> > usage: ls [-ABCFGHLOPRSTUWabcdefghiklmnopqrstuwx1] [file ...]
>
> Another candidate for option-cleansing... Interesting; I get different
> options with the Mac and FreeBSD:
>
> Mac:
>
> usage: ls [-ABCFGHLOPRSTUWabcdefghiklmnopqrstuwx1] [file ...]
>
> FreeBSD:
>
> usage: ls [-ABCFGHILPRSTUWZabcdfghiklmnopqrstuwxy1,] [-D format] [file ...]
>
> So FreeBSD has added up "y,D:" (in getopt(3)-speak); my eyes are burning...
>
FreeBSD wouldn't need -, if there were a good filter to add , to large
numbers... Some of the proliferation of options has been due to a lack of
proper building-blocks....
Warner
* Re: [TUHS] Command line options and complexity
2020-03-05 21:56 ` Warner Losh
@ 2020-03-08 5:26 ` Greg 'groggy' Lehey
2020-03-08 5:32 ` Jon Steinhart
2020-03-08 9:51 ` Michael Kjörling
0 siblings, 2 replies; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-08 5:26 UTC (permalink / raw)
To: Warner Losh; +Cc: The Eunuchs Hysterical Society
On Thursday, 5 March 2020 at 14:56:58 -0700, Warner Losh wrote:
> On Thu, Mar 5, 2020 at 2:51 PM Dave Horsfall <dave@horsfall.org> wrote:
>> On Wed, 4 Mar 2020, Ken Thompson via TUHS wrote:
>>
>>> do i get a prize:
>>> ls -tj
>>> /bin/ls: illegal option -- j
>>> usage: ls [-ABCFGHLOPRSTUWabcdefghiklmnopqrstuwx1] [file ...]
>>
>> Another candidate for option-cleansing... Interesting; I get different
>> options with the Mac and FreeBSD:
>>
>> Mac:
>>
>> usage: ls [-ABCFGHLOPRSTUWabcdefghiklmnopqrstuwx1] [file ...]
>>
>> FreeBSD:
>>
>> usage: ls [-ABCFGHILPRSTUWZabcdfghiklmnopqrstuwxy1,] [-D format] [file ...]
>>
>> So FreeBSD has added up "y,D:" (in getopt(3)-speak); my eyes are burning...
>
> FreeBSD wouldn't need -, if there were a good filter to add , to large
> numbers... Some of the proliferation of options has been due to a lack of
> proper building-blocks....
I wasn't going to join this discussion, but as the perpetrator of all
three of the options that Dave complains about, I think it's worth
explaining the rationale.
First: yes, filters are good. They make for an extraordinarily
flexible system. And many options are just bloat.
But on the other hand, let's follow on with your example and assume a
clever filter, say commafy, which would insert commas as needed in its
input:
$ ls -l | commafy 5
You really need the 5 (column number), because you can't rely on all
large numeric values to require commas. Consider:
$ ls -l 939585975893478543543
-rw-r--r-- 2 grog home 1719298048 8 Mar 14:14 939585975893478543543
The alternative would be to have the column number explicitly stated
in the filter, but that would make the filter more specific to ls.
But do you really want to add that much input when typing
interactively into a shell? How much easier it is just to write:
$ ls -l, 939585975893478543543
-rw-r--r-- 2 grog home 1,719,298,048 8 Mar 14:14 939585975893478543543
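
For comparison, the hypothetical commafy filter could be sketched in a few
lines of Python. No such standard tool exists; the name commafy_field and
its 1-based column argument are assumptions taken from the example above.
Only the requested column is touched, which is why the all-digit file name
survives unchanged.

```python
import re


def commafy_field(line, column):
    """Add thousands separators to one whitespace-delimited column (1-based)."""
    # Splitting on a captured group keeps the whitespace runs in the list,
    # so the spacing of every other column is left exactly as it was.
    tokens = re.split(r"(\s+)", line)
    seen = 0
    for i, tok in enumerate(tokens):
        if tok and not tok.isspace():
            seen += 1
            if seen == column:
                if tok.isascii() and tok.isdigit():
                    tokens[i] = f"{int(tok):,}"
                break
    return "".join(tokens)


line = "-rw-r--r-- 2 grog home 1719298048 8 Mar 14:14 939585975893478543543"
print(commafy_field(line, 5))
# -rw-r--r-- 2 grog home 1,719,298,048 8 Mar 14:14 939585975893478543543
```

Note that the widened field shifts everything after it, so column alignment
across several lines of ls -l output would still need separate handling.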
And then there are things that a filter can't easily do, the
rationales for -y and -D format. -y is really a workaround for a bug
in the POSIX specification for ls(1). From
https://pubs.opengroup.org/onlinepubs/009695399/utilities/ls.html:
-t
Sort with the primary key being time modified (most recently
modified first) and the secondary key being filename in the
collating sequence.
It's not immediately obvious, but these two keys sort in the opposite
order. The file name is sorted alphabetically, but the modification
time is the other way round (*reverse* chronological). This problem
bites you, for example, when you list files from two different cameras
that can take more than one image with the same time stamp. FAT
timestamps have a granularity of 1 second, so they all end up with
exactly the same time stamp.
From a diary entry for 24 January 2009
(http://www.lemis.com/grog/diary-jan2009.php?subtitle=%E2%80%9CNot%20a%20bug,%20a%20feature%E2%80%9D:%20episode%204714&article=lsorder#lsorder):
=== grog@dereel (/dev/ttyp2) ~/Photos/20061223/orig 63 -> ls -lTrt
-rwxrwxrwx 1 grog home 2478324 Dec 23 15:35:08 2006 DSCN1325.JPG
-rwxr-xr-x 1 grog home 1628592 Dec 23 17:11:00 2006 img_5504.jpg
-rwxr-xr-x 1 grog home 1621982 Dec 23 17:11:00 2006 img_5503.jpg
-rwxrwxrwx 1 grog home 2583242 Dec 23 17:27:30 2006 DSCN1326.JPG
-rwxrwxrwx 1 grog home 2476707 Dec 23 17:27:48 2006 DSCN1327.JPG
The file names for images with different timestamps are sorted
alphabetically. The file names for images with the same timestamps
are sorted in reverse alphabetical order. What to do? Potentially
you could write a filter here too, though it wouldn't be simple,
because the timestamp representation depends on the age of the file.
And you can't just fix the bug, because it has been elevated to a
feature. So -y does the right thing.
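
The two sort keys and the effect of -r can be modelled in a short Python
sketch. The Entry type and the ls_t function are hypothetical names, the
times are seconds since midnight taken from the listing above, and the
same_direction flag is only my reading of what -y is meant to do, not
FreeBSD's actual code.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    mtime: int  # seconds; here seconds since midnight, for illustration


def ls_t(entries, reverse=False, same_direction=False):
    """Order entries as a model of ls -t, optionally with -r and -y."""
    # Secondary key: file name, ascending unless -y asks for the same
    # direction as the (descending) time sort.
    by_name = sorted(entries, key=lambda e: e.name, reverse=same_direction)
    # Primary key: newest first. Python's sort is stable, so entries with
    # equal times keep the name order established above.
    ordered = sorted(by_name, key=lambda e: e.mtime, reverse=True)
    # -r reverses the whole result, including the tie-breaking order.
    return ordered[::-1] if reverse else ordered


photos = [
    Entry("DSCN1325.JPG", 56108),  # 15:35:08
    Entry("img_5503.jpg", 61860),  # 17:11:00
    Entry("img_5504.jpg", 61860),  # 17:11:00
    Entry("DSCN1326.JPG", 62850),  # 17:27:30
    Entry("DSCN1327.JPG", 62868),  # 17:27:48
]

print([e.name for e in ls_t(photos, reverse=True)])
print([e.name for e in ls_t(photos, reverse=True, same_direction=True)])
```

In this model the first call reproduces the listing above, with img_5504
ahead of img_5503, and the second puts the two images in name order.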
And that date. There are three relatively arbitrary formats, two of
them depending on how long ago the timestamp was:
-rw-r--r-- 2 grog home 1,719,298,048 8 Mar 14:14 939585975893478543543
-rw-r--r-- 1 grog home 0 24 Sep 2012 foo
You can fix that (on FreeBSD and probably on macOS) with the equally
unsupported -T flag ("full timestamp"):
$ ls -lT 939585975893478543543 foo
-rw-r--r-- 2 grog home 1719298048 8 Mar 14:14:58 2020 939585975893478543543
-rw-r--r-- 1 grog home 0 24 Sep 14:42:57 2012 foo
Do we need another format? Maybe. Certainly it would help to have a
different format if you want to pass the output to a filter that looks
at the timestamp. What should it be? Your guess is as good as mine,
but probably different. Obvious choices are raw time_t and
YYYYMMDDhhmmss. So I introduced the -D option to allow the user to
choose his own output format.
Is this a good idea? I certainly had pangs of conscience every time,
and a non-standard option runs the risk of being incompatible with
other systems. For example, Linux uses -T to define the tab size
(arguably a better choice for a filter) and -D to produce output for
Emacs dired mode.
In summary: there's a tradeoff between the elegance of filters and the
effort that they require. Adding options has its disadvantages too.
You need to remember them, and they can easily become incompatible.
But these specific features make life considerably easier and add very
little to the size of the executable. I'd be interested to hear of
alternative solutions to the issues.
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
* Re: [TUHS] Command line options and complexity
2020-03-08 5:26 ` Greg 'groggy' Lehey
@ 2020-03-08 5:32 ` Jon Steinhart
2020-03-08 9:30 ` Tyler Adams
2020-03-08 9:51 ` Michael Kjörling
1 sibling, 1 reply; 68+ messages in thread
From: Jon Steinhart @ 2020-03-08 5:32 UTC (permalink / raw)
To: The Eunuchs Hysterical Society
After following this discussion, I guess that I have a simplistic way to
determine whether something should be a dash option or a filter. In
general, I'd make a filter if whatever it was doing was applicable to
more than one command, a dash option otherwise.
Jon
* Re: [TUHS] Command line options and complexity
2020-03-08 5:32 ` Jon Steinhart
@ 2020-03-08 9:30 ` Tyler Adams
[not found] ` <CAC0cEp8eFRkkLTw88WVaKZoKy+qsrhuC8LkzmmsbqtdZgMf8eQ@mail.gmail.com>
0 siblings, 1 reply; 68+ messages in thread
From: Tyler Adams @ 2020-03-08 9:30 UTC (permalink / raw)
To: Jon Steinhart; +Cc: The Eunuchs Hysterical Society
The idea of a simple rule is great, but the suggested rule fails on sort -u
which afaik came after sort | uniq for performance reasons.
Another idea in the same vein is that a flag should be added only when the
job can be done inside the program and not with stdin/stdout (or no flag
can be added if one can reproduce the same behavior using pipelines).
So, you need sort -u because only within sort can you get the performance
needed to get the job done.
But you don't need -h in ls -lh. All the information to render a human
readable number is present on stdout of ls -l. You could easily have a
filter which renders numbers with options like adding commas, dots,
scientific notation, precision, money, units, etc.
Tyler
* Re: [TUHS] Command line options and complexity
2020-03-08 5:26 ` Greg 'groggy' Lehey
2020-03-08 5:32 ` Jon Steinhart
@ 2020-03-08 9:51 ` Michael Kjörling
1 sibling, 0 replies; 68+ messages in thread
From: Michael Kjörling @ 2020-03-08 9:51 UTC (permalink / raw)
To: tuhs
On 8 Mar 2020 16:26 +1100, from grog@lemis.com (Greg 'groggy' Lehey):
> FAT timestamps have a granularity of 1 second,
Not quite.
Last modified time is recorded to within two seconds (FAT squeezes the
seconds into a 5-bit field, which allows packing a time into two bytes).
Other times are recorded with different granularity, sometimes
depending on the OS/version used to make the change to the file
system.
And of course FAT has no concept of time zones; everything is local
time, all the time.
https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system#Directory_entry
has some of the gory details.
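
The two-byte packing of the last-modified time can be illustrated with a
small Python sketch. The function names fat_pack_time and fat_unpack_time
are made up for the example; the layout (5 bits of hours, 6 bits of
minutes, 5 bits of seconds divided by two) is the one described in the
article linked above.

```python
def fat_pack_time(hour, minute, second):
    """Pack a local wall-clock time into FAT's 16-bit time field."""
    # Bits 15-11: hours, bits 10-5: minutes, bits 4-0: seconds / 2.
    return (hour << 11) | (minute << 5) | (second // 2)


def fat_unpack_time(value):
    """Unpack a 16-bit FAT time field into (hour, minute, second)."""
    return (value >> 11) & 0x1F, (value >> 5) & 0x3F, (value & 0x1F) * 2


# Two shots taken one second apart end up with the same stored time.
assert fat_pack_time(17, 11, 0) == fat_pack_time(17, 11, 1)
print(fat_unpack_time(fat_pack_time(17, 11, 1)))  # (17, 11, 0)
```

Because the low bit of the seconds is discarded, odd seconds round down to
the preceding even second.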
--
Michael Kjörling • https://michael.kjorling.se • michael@kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”
* Re: [TUHS] Command line options and complexity
@ 2020-03-13 10:45 Dave Horsfall
2020-03-14 4:35 ` Greg 'groggy' Lehey
0 siblings, 1 reply; 68+ messages in thread
From: Dave Horsfall @ 2020-03-13 10:45 UTC (permalink / raw)
To: The Eunuchs Hysterical Society
Meant for the list (and don't get me started on Reply All)...
-- Dave
---------- Forwarded message ----------
Date: Fri, 13 Mar 2020 21:43:51 +1100 (EST)
From: Dave Horsfall <dave@horsfall.org>
To: Greg 'groggy' Lehey <grog@lemis.com>
Subject: Re: [TUHS] Command line options and complexity
On Fri, 13 Mar 2020, Greg 'groggy' Lehey wrote:
>> -h is a gnuism, isn't it?
>
> It might have originated there, but then I would expect it to be spelt
> '--produce-human-readable-output'. I haven't been able to establish from the
> FreeBSD sources or commit logs when it was introduced. It would clearly have
> been a reimplementation.
It's in "df" as well, praise Cthulhu:
aneurin# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/ad0s1a    496M    302M    154M    66%    /
devfs          1.0K    1.0K      0B   100%    /dev
tmpfs          1000    272K    999M     0%    /tmp
/dev/ad0s1d    2.9G    1.4G    1.2G    54%    /usr
/dev/ad0s1e    989M    581M    329M    64%    /var
/dev/ad0s1f    3.9G    2.2G    1.4G    62%    /home
/dev/ad0s1g    8.9G    8.0G    127M    98%    /usr/local
fdescfs        1.0K    1.0K      0B   100%    /dev/fd
procfs         4.0K    4.0K      0B   100%    /proc
(Memo to self: see where all the room has gone in /usr/local, as that's where I
assigned the leftover space after the other partitions.)
No, I've never liked stuffing everything under the root file system as both the
Mac and Penguin do; fill the root file system and you're hosed (and I also have
an itch about /tmp being there as it's a world-writable directory).
>> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html does
>> specify the -S switch. That's POSIX, isn't it?
>
> So it is! This was the first option that I wanted to add, back when I still
> had practice wheels. I asked my mentor, and he said "not the Unix way", so I
> let it be. Then Wes Peters came up with the idea, and I thought he committed
> it, but it seems that it ultimately came from Kostas Blekos in 2005, based on
> the same feature on NetBSD and OpenBSD. I wonder when it made it to POSIX.
Years ago I wrote a simple script "lss" which did the sort after being
howled down on one of the FreeBSD lists; what a surprise to see "-S"...
Heck, back in my UNSW days I suggested extending stty() to cover non-TTY
devices and got trashed by the AGSM/ElecEng mob; well well, look at ioctl()
when it appeared.
-- Dave
* Re: [TUHS] Command line options and complexity
2020-03-13 10:45 Dave Horsfall
@ 2020-03-14 4:35 ` Greg 'groggy' Lehey
2020-03-14 19:52 ` John P. Linderman
0 siblings, 1 reply; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-14 4:35 UTC (permalink / raw)
To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society
On Friday, 13 March 2020 at 21:45:21 +1100, Dave Horsfall wrote:
> On Fri, 13 Mar 2020, Greg 'groggy' Lehey wrote:
>
>>> -h is a gnuism, isn't it?
>>
>> It might have originated there, but then I would expect it to be spelt
>> '--produce-human-readable-output'. I haven't been able to establish from the
>> FreeBSD sources or commit logs when it was introduced. It would clearly have
>> been a reimplementation.
>
> It's in "df" as well, praise Cthulhu:
>
> aneurin# df -h
> Filesystem Size Used Avail Capacity Mounted on
> /dev/ad0s1a 496M 302M 154M 66% /
> /dev/ad0s1d 2.9G 1.4G 1.2G 54% /usr
> /dev/ad0s1e 989M 581M 329M 64% /var
...
It also has the -, option:
=== grog@eureka (/dev/pts/72) ~ 8 -> df -,
Filesystem   1048576-blocks      Used     Avail Capacity  Mounted on
/dev/ada0p4          39,662    21,918    14,571    60%    /
/dev/ada0p2          39,662    13,447    23,042    37%    /destdir
/dev/ada0p5       3,705,520 1,831,345 1,577,733    54%    /home
/dev/ada1p1       7,629,565 6,358,607 1,194,661    84%    /Photos
I find it much easier to see the relative size like that.
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-14 4:35 ` Greg 'groggy' Lehey
@ 2020-03-14 19:52 ` John P. Linderman
2020-03-14 20:25 ` Steffen Nurpmeso
0 siblings, 1 reply; 68+ messages in thread
From: John P. Linderman @ 2020-03-14 19:52 UTC (permalink / raw)
To: Greg 'groggy' Lehey; +Cc: The Eunuchs Hysterical Society
Here's a command I wrote long ago using a different way to deal with
options:
isee
Usage: isee format file ...
    Display specified inode information for files passed as arguments.
    Items of the form ``%X'' in format will be replaced for these X:
        dev inode ino mode nlink uid gid rdev size atime
        mtime ctime now filename
    Parenthesized printf-style format specifications can follow a %
    to override the default format for the various items.
    %filename is the name of the current file argument.
    %now is the time (in seconds) when the command started running.
    The other items are from the stat structure.
    Example: isee "%(40s)filename: %mtime %mode" /dev/null
        Show file modification time and mode of /dev/null
inode is just a synonym for ino.
Instead of a kazillion options, the %-stat-field items identify *what* you
want to see and the printf-style formats identify *how* you want them
shown. Someone in the Murray Hill library added strftime formats for date
fields, a fine addition, in my view. Adding readable user and group names
rather than numerical ids would be worth considering. *Maybe* having a
"rwx"-style form for mode. Sorting can be done by piping the output through
sort. Don't get hung up on shortcomings of the command, just consider how a
few familiar concepts and pipes can be combined to provide a large number
of options.
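To make the "format string instead of options" idea concrete, here is a minimal shell sketch. It is not the original isee, whose source is not shown in this thread; the function name isee_sketch is hypothetical, it handles only a handful of %-items, and it takes its values from "ls -ln" columns rather than from the stat structure directly, so %mode comes out in "rwx" form.

```shell
#!/bin/sh
# isee_sketch FORMAT FILE...
# Hypothetical stand-in for isee: expands a few %-items using "ls -ln" columns.
# Assumes regular files or directories (device files print "major, minor"
# where the size would be) and file names without "&" or backslashes,
# which awk would treat specially in -v assignments and gsub replacements.
isee_sketch() {
    fmt=$1
    shift
    for f in "$@"; do
        # ls -lnd columns: mode nlink uid gid size ...
        ls -lnd -- "$f" | awk -v fmt="$fmt" -v name="$f" '{
            out = fmt
            gsub(/%mode/, $1, out)
            gsub(/%nlink/, $2, out)
            gsub(/%uid/, $3, out)
            gsub(/%gid/, $4, out)
            gsub(/%size/, $5, out)
            gsub(/%filename/, name, out)   # last, so a name cannot inject %-items
            print out
        }'
    done
}

# The format says what to show; ordering is left to sort.
isee_sketch '%size %filename' . / | sort -n
```

The sketch has no sorting or selection flags at all: the caller picks the fields with the format and pipes the result through sort, which is the composition the message argues for.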
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-14 19:52 ` John P. Linderman
@ 2020-03-14 20:25 ` Steffen Nurpmeso
0 siblings, 0 replies; 68+ messages in thread
From: Steffen Nurpmeso @ 2020-03-14 20:25 UTC (permalink / raw)
To: John P. Linderman; +Cc: The Eunuchs Hysterical Society
John P. Linderman wrote in
<CAC0cEp-dL2iPikiGvaQ_s9_6AS=mFO4RvbT423fNJ3gQiLdthQ@mail.gmail.com>:
|Here's a command I wrote long ago using a different way to deal with \
|options:
|
| isee
|Usage: isee format file ...
| Display specified inode information for files passed as arguments.
| Items of the form ``%X'' in format will be replaced for these X:
|dev inode ino mode nlink uid gid rdev size atime
|mtime ctime now filename
| Parenthesized printf-style format specifications can follow a %
| to override the default format for the various items.
| %filename is the name of the current file argument.
| %now is the time (in seconds) when the command started running.
| The other items are from the stat structure.
|
| Example: isee "%(40s)filename: %mtime %mode" /dev/null
| Show file modification time and mode of /dev/null
|
|inode is just a synonym for ino.
|
|Instead of a kazillion options, the %-stat-field items identify what \
|you want to see and the printf-style formats identify how you want \
|them shown. Someone in the Murray Hill library added strftime
|formats for date fields, a fine addition, in my view. Adding readable \
|user and group names rather than numerical ids would be worth considering. \
|Maybe having a "rwx"-style form for mode. Sorting can be
|done by piping the output through sort. Don't get hung up on shortcomings \
|of the command, just consider how a few familiar concepts and pipes \
|can be combined to provide a large number of options.
When i switched to FreeBSD around 2001, the handbook was on the
CDs i had, and i stumbled upon a very impressive assembler
example. It is still there[1], at least in parts(?). Coming from
C64, then DOS/4DOS and <2 years Linux, aka kid games,
grey-industry, MS and xeyes background, i read
[1] https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/x86-fpu.html
  Personally, I like to keep it simple. Something either is
  a number, so I process it. Or it is not a number, so I discard
  it. I do not like the computer complaining about me typing in an
  extra character when it is obvious that it is an extra
  character. Duh!
  Plus, it allows me to break up the monotony of computing and
  type in a query instead of just a number:
    What is the best pinhole diameter for the
    focal length of 150?
  There is no reason for the computer to spit out a number of complaints:
    Syntax error: What
    Syntax error: is
    Syntax error: the
    Syntax error: best
  Et cetera, et cetera, et cetera.
  Secondly, I like the # character to denote the start of
  a comment which extends to the end of the line. This does not
  take too much effort to code, and lets me treat input files for
  my software as executable scripts.
and it was like being warped from Chaplin's Modern Times to a rich
man's California style living! And that in assembler!!
  % pinhole
  Computer,
  What size pinhole do I need for the focal length of 150?
  150 490 306 362 2930 12
  Hmmm... How about 160?
  160 506 316 362 3125 12
  Let's make it 155, please.
  155 498 311 362 3027 12
  Ah, let's try 157...
  157 501 313 362 3066 12
  156?
  156 500 312 362 3047 12
  That's it! Perfect! Thank you very much!
  ^D
Nonetheless: i never managed to create Hippie-proof programs in
real life.
--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
@ 2020-03-10 18:42 Doug McIlroy
2020-03-10 19:38 ` Dan Cross
0 siblings, 1 reply; 68+ messages in thread
From: Doug McIlroy @ 2020-03-10 18:42 UTC (permalink / raw)
To: tuhs
> This begs questions of stability
Astute question. I had that in my original draft, but eliminated
it for what I thought was clarity. Anyway, depending on implementation
of sort, you may need sort -s. Of course it doesn't matter which copy
among several equal lines uniq produces, nor does it matter in sort
when there are no comparison options--they're all the same.
> I don't know enough about the
> internals of sed to know even what algorithm it uses
> (... a disk-based merge sort?)
sed is not a sorting program--basically it copies input to
output, making line-by-line editing changes. That's the
way I meant to use it in sed s/nonkeys//|sort -keys|uniq.
(I have added options to sort, hopefully for clarity).
The argument to sed here means substitute the empty
string for the nonkey fields (specified by a regular expression).
If "sed" was a typo for "sort", all versions of sort that
I know of use an internal sorting algorithm for big chunks
of the file, then combines the chunks by merge. But internal
sorting varies all over the map--variations on quicksort,
radix sort, merge sort, ...
Doug
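For readers who want to try the sed-then-sort pipeline described above, here is a small sketch. The sample data and the function name keys_only are made up for illustration, and the key is assumed to be the first space-separated field.

```shell
#!/bin/sh
# Hypothetical data: field 1 is the key, field 2 is a non-key payload.
data='alice 10
bob 20
alice 30'

# Delete the non-key part of each line, then sort and drop duplicate keys.
keys_only() {
    sed 's/ .*//' | LC_ALL=C sort | uniq
}

printf '%s\n' "$data" | keys_only
```

Because sed has already discarded the payload, the result contains only the keys; this is the information loss that makes the construction insufficient when the rest of the line matters.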
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-10 18:42 Doug McIlroy
@ 2020-03-10 19:38 ` Dan Cross
0 siblings, 0 replies; 68+ messages in thread
From: Dan Cross @ 2020-03-10 19:38 UTC (permalink / raw)
To: Doug McIlroy; +Cc: The Eunuchs Hysterical Society
On Tue, Mar 10, 2020 at 2:43 PM Doug McIlroy <doug@cs.dartmouth.edu> wrote:
> > This begs questions of stability
>
> Astute question. I had that in my original draft, but eliminited
> it for what I thought was clarity. Anyway, depending on implementation
> of sort, you may need sort -s. Of course it doesn't matter which copy
> among several equal lines uniq produces, nor does it matter in sort
> when there are no comparison options--they're all the same.
>
Thanks. That's interesting.
Did `sort -s` come later? The idea that you preferred clarity over
stability for `sort -u` would indicate so, otherwise one might imagine that
`-u` would just imply `-s` and that would be that.
> I don't know enough about the
> > internals of sed to know even what algorithm it uses
> > (... a disk-based merge sort?)
>
> sed is not a sorting program--basically it copies input to
> output, making line-by-line editing changes. That's the
> way I meant to use it in sed s/nonkeys//|sort -keys|uniq.
> (I have added options to sort, hopefully for clarity).
> The argument to sed here means substitute the empty
> string for the nonkey fields (specified by a regular expression).
>
`sed` in my email was a typo, as you speculated below.
Interestingly, this `sed` construction prior to `sort` loses information,
which perhaps doesn't matter in any given specific case, but is
insufficient in general, which I gathered to be the entire reason you
implemented `sort -u`.
If "sed" was a typo for "sort",
It was.
all versions of sort that
> I know of use an internal sorting algorithm for big chunks
> of the file, then combines the chunks by merge. But internal
> sorting varies all over the map--variations on quicksort,
> radix sort, merge sort, ...
>
It's the details of the internal sorts that are most interesting in some
sense, as the merges are probably fairly straightforward but the internal
sorts will affect stability and have other interesting characteristics.
As an aside, one must imagine that, in this day and age, a "big chunk" is
probably big enough to hold the vast majority of files entirely in RAM, and
only exceptionally large files actually require merging multiple blocks.
- Dan C.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
@ 2020-03-10 16:15 Doug McIlroy
2020-03-10 17:38 ` Dan Cross
0 siblings, 1 reply; 68+ messages in thread
From: Doug McIlroy @ 2020-03-10 16:15 UTC (permalink / raw)
To: tuhs
> The idea of a simple rule is great, but the suggested rule fails on sort -u
> which afaik came after sort | uniq for performance reasons.
As the guilty party for most of sort's comparison options, I can
attest that efficiency was not an objective of -u. It was invented
precisely because uniq had proved useful, but not when one was
interested in uniqueness only of some key aspect of the data.
-u differs from uniq in that -u selects samples based on
equality of keys, not equality of lines. In the default
case of whole-line keys, sort -u of course does exactly
what sort|uniq does.
For many applications of -u with keys, the non-key fields
are not of interest. Then sed s/nonkeys//|sort|uniq may
suffice. But sed did not exist when -u was invented.
And not all sort key specs are easily imitated in sed.
Doug
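The difference between key equality and line equality can be seen by counting surviving lines. This is only an illustrative sketch: the data and the two function names are invented, and the key is assumed to be the first blank-separated field.

```shell
#!/bin/sh
# Hypothetical data: two lines share the key "alice" but differ elsewhere.
data='alice 10
bob 20
alice 30'

# Whole-line uniqueness: every distinct line survives.
count_whole_lines() {
    LC_ALL=C sort | uniq | wc -l | tr -d ' '
}

# Key-based uniqueness: one line per distinct first field survives.
count_by_key() {
    LC_ALL=C sort -u -k1,1 | wc -l | tr -d ' '
}

printf '%s\n' "$data" | count_whole_lines
printf '%s\n' "$data" | count_by_key
```

Which of the two "alice" lines sort -u keeps is deliberately not shown, since that depends on the implementation.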
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-10 16:15 Doug McIlroy
@ 2020-03-10 17:38 ` Dan Cross
2020-03-10 17:44 ` Bakul Shah
0 siblings, 1 reply; 68+ messages in thread
From: Dan Cross @ 2020-03-10 17:38 UTC (permalink / raw)
To: Doug McIlroy; +Cc: The Eunuchs Hysterical Society
On Tue, Mar 10, 2020 at 12:16 PM Doug McIlroy <doug@cs.dartmouth.edu> wrote:
> > The idea of a simple rule is great, but the suggested rule fails on sort
> -u
> > which afaik came after sort | uniq for performance reasons.
>
> As the guilty party for most of sort's comparison options, I can
> attest that efficiency was not an objective of -u. It was invented
> precisely because uniq had proved useful, but not when one was
> interested in uniqueness only of some key aspect of the data.
>
> -u differs from uniq in that -u selects samples based on
> equality of keys, not equality of lines. In the default
> case of whole-line keys, sort -u of course does exactly
> what sort|uniq does.
>
> For many applications of -u with keys, the non-key fields
> are not of interest. Then sed s/nonkeys//|sort|uniq may
> suffice. But sed did not exist when -u was invented.
> And not all sort key specs are easily imitated in sed.
>
This begs questions of stability: in the event of non-unique keys and
non-key fields in the sortable data, which "records" (lines) are kept and
which are discarded? Surely the "first" is kept and subsequent entries with
the same key suppressed, but I confess I don't know enough about the
internals of sed to know even what algorithm it uses (I assume a disk-based
merge sort?), but I would imagine these details have changed over time.
- Dan C.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-10 17:38 ` Dan Cross
@ 2020-03-10 17:44 ` Bakul Shah
2020-03-10 18:09 ` Dan Cross
0 siblings, 1 reply; 68+ messages in thread
From: Bakul Shah @ 2020-03-10 17:44 UTC (permalink / raw)
To: Dan Cross; +Cc: The Eunuchs Hysterical Society, Doug McIlroy
On Tue, 10 Mar 2020 13:38:23 -0400 Dan Cross <crossd@gmail.com> wrote:
>
> This begs questions of stability: in the event of non-unique keys and
> non-key fields in the sortable data, which "records" (lines) are kept and
> which are discarded? Surely the "first" is kept and subsequent entries with
> the same key suppressed, but I confess I don't know enough about the
> internals of sed to know even what algorithm it uses (I assume a disk-based
> merge sort?), but I would imagine these details have changed over time.
FreeBSD manpage for sort says that -u implies a stable sort,
similar to -s.
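A quick way to see what a stable sort means here, assuming a sort that accepts -s (GNU and FreeBSD sort do; it is not required by POSIX). The data and the function name stable_by_key are made up for the example.

```shell
#!/bin/sh
# Sort on the first field only; -s keeps input order among lines with equal keys.
stable_by_key() {
    LC_ALL=C sort -s -k1,1
}

# Within each key, the "2" line comes before the "1" line in the input,
# and a stable sort preserves that order.
printf '%s\n' 'b 2' 'a 2' 'b 1' 'a 1' | stable_by_key
```

Without -s, an implementation is free to break ties some other way, for example by comparing the whole line.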
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-10 17:44 ` Bakul Shah
@ 2020-03-10 18:09 ` Dan Cross
0 siblings, 0 replies; 68+ messages in thread
From: Dan Cross @ 2020-03-10 18:09 UTC (permalink / raw)
To: Bakul Shah; +Cc: The Eunuchs Hysterical Society, Doug McIlroy
On Tue, Mar 10, 2020 at 1:44 PM Bakul Shah <bakul@bitblocks.com> wrote:
> On Tue, 10 Mar 2020 13:38:23 -0400 Dan Cross <crossd@gmail.com> wrote:
> >
> > This begs questions of stability: in the event of non-unique keys and
> > non-key fields in the sortable data, which "records" (lines) are kept and
> > which are discarded? Surely the "first" is kept and subsequent entries
> with
> > the same key suppressed, but I confess I don't know enough about the
> > internals of sed to know even what algorithm it uses (I assume a
> disk-based
> > merge sort?), but I would imagine these details have changed over time.
>
> FreeBSD manpage for sort says that -u implies a stable sort,
> similar to -s.
>
Thanks; that makes sense. I'm still interested in historical data, though.
:-)
- Dan C.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
@ 2020-03-05 4:57 Doug McIlroy
2020-03-05 22:17 ` Diomidis Spinellis
0 siblings, 1 reply; 68+ messages in thread
From: Doug McIlroy @ 2020-03-05 4:57 UTC (permalink / raw)
To: tuhs
> These go all the way back to v7 unix, where ls has an option to reverse
the sort order (which could have been done by passing the output to tac).
A cool idea, but tac was not in v7. And tail didn't get the -r
option until v8.
As for rev, I don't know why it was first written, but one
use was to examine suffixes--a kind of thing that several
word lovers in the Unix lab were prone to do.
Apropos of using rev to make rhyming dictionaries, Walker's
Rhyming Dictionary was published decades before Noah
Webster's dictionary appeared and stayed in print
for about 200 years. Notionally the relation between
webster and walker is
rev <webster | sort | rev >walker
Doug
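The rev | sort | rev idiom can be tried on a tiny word list. This sketch assumes rev is installed (it is not a POSIX utility) and uses an invented four-word list in place of a real dictionary; rhyme_order is a hypothetical name.

```shell
#!/bin/sh
# Order words by their endings: reverse each line, sort, reverse back.
rhyme_order() {
    rev | LC_ALL=C sort | rev
}

# Words sharing a suffix end up adjacent.
printf '%s\n' nation cat station bat | rhyme_order
```

With this input the two "-ation" words come out next to each other, followed by the two "-at" words.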
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-05 4:57 Doug McIlroy
@ 2020-03-05 22:17 ` Diomidis Spinellis
0 siblings, 0 replies; 68+ messages in thread
From: Diomidis Spinellis @ 2020-03-05 22:17 UTC (permalink / raw)
To: tuhs
On 05-Mar-20 6:57, Doug McIlroy wrote:
>> These go all the way back to v7 unix, where ls has an option to reverse
> the sort order (which could have been done by passing the output to tac).
>
> A cool idea, but tac was not in v7. And tail didn't get the -r
> option until v8.
Tail acquired a -r option between 3BSD [1] and 4BSD [2].
I remember using that option on SunOS in 1990 as part of a prank we
played on a friend at the university. On the Sun 3 workstations we were
using at the time, one could enter the monitor/debugger program by
pressing L1-A. By remotely logging into a workstation and running a
shell loop, one could ensure that when the monitor was entered the
active program would be that shell. It was then easy to modify the uid
field for the active process (the loop-running shell) and set it to
zero. After exiting the monitor, a subshell launched from that shell
would have full root privileges. All we had to do was wait for the
friend to lock his workstation when taking a break in order to obtain
root privileges on his workstation and then change to his uid in order
to modify his files via NFS on the university's Gould file server.
Based on this capability, I wrote the following script that would rename
all our friend's files and directories to words from the dictionary.
The script also created (via tail -r) another script that would undo
this change.
#!/bin/sh
TMP=/tmp
DIR=$1
FILES=$TMP/f.$$
WORDS=$TMP/w.$$
CMD=$TMP/c.$$
REV=$TMP/r.$$
trap '' 0 1 2 3 15
find $DIR -depth -print >$FILES
head -`wc -l <$FILES|sed 's/[ ]*//'` /usr/dict/words >$WORDS
paste $FILES $WORDS |
sed -e '
/^\. /d
s/\(.*\)\/\(.*\) \(.*\)/mv \1\/\2 \1\/\3/
' >$CMD
rm $FILES $WORDS
tail -r $CMD |
sed -e '
s/mv \(.*\) \(.*\)/mv \2 \1/
' >$REV
sh <$CMD
rm $CMD
Unfortunately, it turned out that tail -r had a limit on the number of
lines it could reverse. Although the script and its undo worked fine on
a test set of a small number of files, when run on our friend's
directory it created a faulty undo script. Our friend ended up
graduating with files named "abaca" and "abacinate".
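For comparison, one way to reverse a file without relying on tail -r is to let awk hold the lines. This is a sketch of an alternative, not what the script above used; reverse_lines is a hypothetical name, and awk keeps the whole input in memory, so the practical limit becomes available memory rather than a fixed buffer.

```shell
#!/bin/sh
# Print the lines of stdin in reverse order.
reverse_lines() {
    awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
}

printf '%s\n' one two three | reverse_lines
```

An undo script built this way would still need the same sed step to swap the mv arguments.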
[1]
https://dspinellis.github.io/manview/?src=https%3A%2F%2Fraw.githubusercontent.com%2Fdspinellis%2Funix-history-repo%2FBSD-3%2Fusr%2Fman%2Fman1%2Ftail.1&name=BSD%203%3A%20tail(1)&link=https%3A%2F%2Fgithub.com%2Fdspinellis%2Funix-history-repo%2Fblob%2FBSD-3%2Fusr%2Fman%2Fman1%2Ftail.1
[2]
https://dspinellis.github.io/manview/?src=https%3A%2F%2Fraw.githubusercontent.com%2Fdspinellis%2Funix-history-repo%2FBSD-4%2Fusr%2Fman%2Fman1%2Ftail.1&name=BSD%204%3A%20tail(1)&link=https%3A%2F%2Fgithub.com%2Fdspinellis%2Funix-history-repo%2Fblob%2FBSD-4%2Fusr%2Fman%2Fman1%2Ftail.1
--
Diomidis Spinellis
Free edX MOOC on Unix Tools: Data, Software, and Production Engineering
https://www.spinellis.gr/unix?tuhs20200306
^ permalink raw reply [flat|nested] 68+ messages in thread
* [TUHS] Command line options and complexity
@ 2020-03-03 18:15 Jon Steinhart
2020-03-03 18:44 ` Adam Thornton
2020-03-10 23:03 ` Dan Stromberg
0 siblings, 2 replies; 68+ messages in thread
From: Jon Steinhart @ 2020-03-03 18:15 UTC (permalink / raw)
To: tuhs
OK, this should be good for some conversation. A friend sent me this
link today: http://danluu.com/cli-complexity/
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-03 18:15 Jon Steinhart
@ 2020-03-03 18:44 ` Adam Thornton
2020-03-04 4:11 ` Tyler Adams
` (2 more replies)
2020-03-10 23:03 ` Dan Stromberg
1 sibling, 3 replies; 68+ messages in thread
From: Adam Thornton @ 2020-03-03 18:44 UTC (permalink / raw)
To: Jon Steinhart, The Eunuchs Hysterical Society
> I've heard people say that there isn't really any alternative to this kind
> of complexity for command line tools, but people who say that have never
> really tried the alternative, something like PowerShell. I have plenty of
> complaints about PowerShell, but passing structured data around and easily
> being able to operate on structured data without having to hold metadata
> information in my head so that I can pass the appropriate metadata to the
> right command line tools at the right places in the pipeline isn't among my
> complaints [3] <https://danluu.com/cli-complexity/#fn:W>.
Somewhat disingenuous. I mean, yes, that's true, but on the other hand it
means that you have to keep the "what Powershell commands operate on what
structure" in your head instead, since you can no longer assume the
pipelines to be a universal interface.
Same basic problem as CMS Pipelines. Fantastically powerful, and nowhere
near as easy to compose good functionality as "it's just a byte stream."
Adam
On Tue, Mar 3, 2020 at 11:16 AM Jon Steinhart <jon@fourwinds.com> wrote:
> OK, this should be good for some conversation. A friend sent me this
> link today: http://danluu.com/cli-complexity/
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-03 18:44 ` Adam Thornton
@ 2020-03-04 4:11 ` Tyler Adams
2020-03-04 6:03 ` Dave Horsfall
2020-03-04 21:50 ` Random832
2020-03-04 22:03 ` Random832
2 siblings, 1 reply; 68+ messages in thread
From: Tyler Adams @ 2020-03-04 4:11 UTC (permalink / raw)
To: Adam Thornton; +Cc: The Eunuchs Hysterical Society
> These go all the way back to v7 unix, where ls has an option to reverse
the sort order (which could have been done by passing the output to tac).
Good point. Why was this done in v7 unix and why wasn't it thrown out?
Tyler
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-04 4:11 ` Tyler Adams
@ 2020-03-04 6:03 ` Dave Horsfall
2020-03-04 6:48 ` arnold
0 siblings, 1 reply; 68+ messages in thread
From: Dave Horsfall @ 2020-03-04 6:03 UTC (permalink / raw)
To: The Eunuchs Hysterical Society
On Wed, 4 Mar 2020, Tyler Adams wrote:
> > These go all the way back to v7 unix, where ls has an option to
> > reverse the sort order (which could have been done by passing the
> > output to tac).
>
> Good point. Why was this done in v7 unix and why wasn't it thrown out?
I seem to recall that "sort -r" was in V6, or perhaps that was one of the
programs I'd back-ported from V7 (being stuck with 11/40-class boxes).
And speaking of "tac" (which I never saw), I couldn't think of a single
use for "rev" (although no doubt I'll now get told). Mind you, you get
some amusing output with the "man" command because of the way that the
underlining works...
-- Dave
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-04 6:03 ` Dave Horsfall
@ 2020-03-04 6:48 ` arnold
2020-03-04 21:17 ` Dave Horsfall
2020-03-05 0:49 ` Lyndon Nerenberg
0 siblings, 2 replies; 68+ messages in thread
From: arnold @ 2020-03-04 6:48 UTC (permalink / raw)
To: tuhs, dave
Dave Horsfall <dave@horsfall.org> wrote:
> On Wed, 4 Mar 2020, Tyler Adams wrote:
>
> > > These go all the way back to v7 unix, where ls has an option to
> > > reverse the sort order (which could have been done by passing the
> > > output to tac).
> >
> > Good point. Why was this done in v7 unix and why wasn't it thrown out?
There was no tac in V7 Unix. It was first posted to USENET, I don't
know by who, and picked up by Linux and *BSD.
> And speaking of "tac" (which I never saw), I couldn't think of a single
> use for "rev" (although no doubt I'll now get told).
It's useful for reading Hebrew sent in plain text email :-). Hebrew is
read right to left but stored in physical order (left to right) in files.
Arnold
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-03 18:44 ` Adam Thornton
2020-03-04 4:11 ` Tyler Adams
@ 2020-03-04 21:50 ` Random832
2020-03-04 23:19 ` Steffen Nurpmeso
2020-03-05 6:12 ` Alan D. Salewski
2020-03-04 22:03 ` Random832
2 siblings, 2 replies; 68+ messages in thread
From: Random832 @ 2020-03-04 21:50 UTC (permalink / raw)
To: Grant Taylor via TUHS
On Tue, Mar 3, 2020, at 13:44, Adam Thornton wrote:
> I've heard people say that there isn't really any alternative to this
> kind of complexity for command line tools, but people who say that have
> never really tried the alternative, something like PowerShell. I have
> plenty of complaints about PowerShell, but passing structured data
> around and easily being able to operate on structured data without
> having to hold metadata information in my head so that I can pass the
> appropriate metadata to the right command line tools at that right
> places the pipeline isn't among my complaints3
> <https://danluu.com/cli-complexity/#fn:W>.
>
> Somewhat disingenuous. I mean, yes, that's true, but on the other hand
> it means that you have to keep the "what Powershell commands operate on
> what structure" in your head instead, since you can no longer assume
> the pipelines to be a universal interface.
Sure, but "stdin is a sequence of any type, and the argument is an expression that operates on that type or the name of a property that that type has" is universal enough.
The part that has to operate on a specific structure isn't the command, it's the arguments.
For example, a powershell pipeline to produce a list of files sorted by modified date is:
gci . | sort lastwritetime | select name
All three *commands* are universal - not all objects have a "lastwritetime" and "name" property, but sort and select can operate on any property that the objects passed into them have.
(gci is an alias for get-childitem... it also has aliases ls and dir, but I'm emphasizing that it's not exclusive to directories)
*assuming that ls -t didn't exist*, to do this with unix tools that operate on text you would need:
ls -l | [somehow convert the date to a sortable format, probably in awk] | sort | [somehow pick the filename alone out of the output - possibly with cut or sed or awk again]
and it's very difficult to get tools like awk, sort, and cut to work on formats that contain more than one field that may contain embedded spaces (you can just about get away with it for ls output because the date is always three "words").
A significant portion of ls's options are related to sorting, because you can sort based on fields that are either not present in the output, or are not in a format that can be sorted textually.
Maybe it would be enough to have the universal interface be "tables" (i.e. text streams in some format that supports adequate escaping of embedded row and column delimiters)... or maybe even just table rows, and let the user deal with memorizing column numbers (or let each originating command support a fully general way to specify what columns are requested, as ps alone does on modern systems). Of course, this isn't *really* different from allowing any data structure - after all, the value for any field could itself be a fully escaped table in text format.
The benefit of having actual data structures with types is that when you *don't* end the pipeline with select, each object knows how to print itself [files print mode, mtime, size, and name in a human-readable format, more or less equivalent to ls -l] rather than just dumping out every single field that you might want sort or select to operate on.
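As a rough sketch of the "tables" variant with plain text tools: if the producer emits one row per file with a tab between columns and a numeric timestamp in the first column, sort and cut can do the rest even when names contain spaces. The row format, the sample rows, and the name names_by_mtime are assumptions for illustration; no real command is claimed to emit exactly this, and names containing tabs or newlines would still need escaping.

```shell
#!/bin/sh
# Hypothetical table format: epoch-seconds mtime, a tab, then the file name.
TAB=$(printf '\t')

# Sort rows numerically on column 1, then keep only column 2.
names_by_mtime() {
    sort -t "$TAB" -k1,1n | cut -f2
}

printf '%s\t%s\n' \
    1583300000 'notes from march.txt' \
    1583200000 'old file' | names_by_mtime
```

The caller still has to remember that the timestamp is column 1 and the name is column 2, which is the "memorizing column numbers" cost mentioned above.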
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-04 21:50 ` Random832
@ 2020-03-04 23:19 ` Steffen Nurpmeso
2020-03-05 6:12 ` Alan D. Salewski
1 sibling, 0 replies; 68+ messages in thread
From: Steffen Nurpmeso @ 2020-03-04 23:19 UTC (permalink / raw)
To: Random832; +Cc: Grant Taylor via TUHS
Random832 wrote in
<5019a751-d69a-4839-9a56-b977b275070d@www.fastmail.com>:
|On Tue, Mar 3, 2020, at 13:44, Adam Thornton wrote:
|> I've heard people say that there isn't really any alternative to this
|> kind of complexity for command line tools, but people who say that have
|> never really tried the alternative, something like PowerShell. I have
|> plenty of complaints about PowerShell, but passing structured data
|> around and easily being able to operate on structured data without
|> having to hold metadata information in my head so that I can pass the
|> appropriate metadata to the right command line tools at the right
|> places in the pipeline isn't among my complaints
|> <https://danluu.com/cli-complexity/#fn:W>.
|>
|> Somewhat disingenuous. I mean, yes, that's true, but on the other hand
|> it means that you have to keep the "what Powershell commands operate on
|> what structure" in your head instead, since you can no longer assume
|> the pipelines to be a universal interface.
|
|Sure, but "stdin is a sequence of any type, and the argument is an \
|expression that operates on that type or the name of a property that \
|that type has" is universal enough.
|
|The part that has to operate on a specific structure isn't the command, \
|it's the arguments.
|
|For example, a powershell pipeline to produce a list of files sorted \
|by modified date is:
|
|gci . | sort lastwritetime | select name
...
|*assuming that ls -t didn't exist*, to do this with unix tools that \
|operate on text you would need:
|
|ls -l | [somehow convert the date to a sortable format, probably in \
|awk] | sort | [somehow pick the filename alone out of the output - \
|possibly with cut or sed or awk again]
|
|and it's very difficult to get tools like awk, sort, and cut to work \
|on formats that contain more than one field that may contain embedded \
|spaces (you can just about get away with it for ls output because the \
|date is always three "words").
Yes, that is really bad; the only saving grace is that a lot of
output has been fairly portable for a very long time. FreeBSD has
started using libxo in many base utilities, which lets them emit
structured formats. These include CSV and even CBOR :), though I do
not know how the latter integrates with the Unix text utilities.
(I think the format string syntax, which seems to originate partly
in Qt (?), could have been shaped into something better, like
Python's, plus further extensions. But it is an improvement over
what the standard formats end up with when reordering etc. comes
into play.)
--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
* Re: [TUHS] Command line options and complexity
2020-03-04 21:50 ` Random832
2020-03-04 23:19 ` Steffen Nurpmeso
@ 2020-03-05 6:12 ` Alan D. Salewski
1 sibling, 0 replies; 68+ messages in thread
From: Alan D. Salewski @ 2020-03-05 6:12 UTC (permalink / raw)
To: tuhs
On 2020-03-04 16:50:34, Random832 spake thus:
[...]
> Sure, but "stdin is a sequence of any type, and the argument is an expression that operates on that type or the name of a property that that type has" is universal enough.
>
> The part that has to operate on a specific structure isn't the command, it's the arguments.
>
> For example, a powershell pipeline to produce a list of files sorted by modified date is:
>
> gci . | sort lastwritetime | select name
>
> all three *commands* are universal - not all objects have a "lastwritetime" and "name" property, but sort and select can operate on any property that the sequence of objects passed into it has.
There are some examples of that type of thing in widely used Unix tools;
my use of 'sort -k1,1n' further down is demonstrating such a use case (the
'sort' command is being told that it is operating on numbers). But beyond
some lowest common denominator types ("number", "string", ...) how many
commands can really usefully operate on a large number of types? For
example, a program that can operate on IP addresses is probably doing
something different than a program that wants to operate on email
addresses.
I could see where named properties of some object can be used more
generally than types, but again there are widely used tools that do do
that (e.g., jq(1)). IMHO, though, they are more cumbersome to use than
most of the commands I need to use minute to minute.
> (gci is an alias for get-childitem... it also has aliases ls and dir, but I'm emphasizing that it's not exclusive to directories)
>
> *assuming that ls -t didn't exist*, to do this with unix tools that operate on text you would need:
>
> ls -l | [somehow convert the date to a sortable format, probably in awk] | sort | [somehow pick the filename alone out of the output - possibly with cut or sed or awk again]
(Just nit-picking at this particular example)
You could do it without ls[0]:
$ stat -c '%Y %n' * | sort -k1,1n | xargs -L1 sh -c 'echo "$@"'
That doesn't seem so bad to me, but if it was something I needed regularly
I'd of course put it in an alias[1] or (more likely) a short script file.
> and it's very difficult to get tools like awk, sort, and cut to work on formats that contain more than one field that may contain embedded spaces (you can just about get away with it for ls output because the date is always three "words").
[...]
Yes, that's often true. And when I encounter it I typically start out by
seeing if I can inject and remove tokens in the data at key places in the
pipeline. Beyond anything trivial, though, I then quickly start reaching
for tools to put the data into some form that more easily allows for it
(CSV, JSON, ...). But that invariably adds other complications (such as
the need to find or build tools to marshal/unmarshal the data, and to
deal with data-domain-specific notions of null-vs-empty-string).
For the (more common (for me)) case where there is only one field that
contains embedded spaces, I just try to get 'em at the end of the line
and let the shell deal with it:
$ some-command | while read -r first second rest; do ... ; done
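The "put the field with embedded spaces last" trick can also be sketched in Python, where `str.split` with a maximum split count plays the role of the trailing `rest` variable. `split_fields` is a hypothetical helper name; unlike the shell's `read`, this sketch does not strip trailing whitespace from the last field.

```python
def split_fields(line, nfields):
    """Split a line into at most nfields whitespace-separated fields.

    The last field receives the rest of the line, embedded spaces and all,
    much as the final variable does in `read -r first second rest`.
    """
    return line.rstrip("\n").split(None, nfields - 1)


# Example record: two fixed fields, then a name containing spaces.
first, second, rest = split_fields("1583300000 644 my file name.txt\n", 3)
print(first, second, repr(rest))
```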
> Maybe it would be enough to have the universal interface be "tables" (i.e. text streams in some format that supports adequate escaping of embedded row and column delimiters)... or maybe even just table rows, and let the user deal with memorizing column numbers (or let each originating command support a fully general way to specify what columns are requested, as ps alone does on modern systems) Of course, this isn't *really* different from allowing any data structure - after all, the value for any field could itself be a fully escaped table in text format.
[...]
Well, in some sense with byte streams you have a table of newline-delimited
bytes (rows), and byte subfields separated by whitespace (columns). And
anything on top of that could (in some context, and with some syntax) be
considered just further escaped tables in text format. I think that's
essentially the same thing that you said, only with the outermost table
syntax removed. But like you said, this isn't really different from
allowing any data structure. Importantly, though, it doesn't impose any
particular data structure, either.
I've worked at a couple of different places that had in-house tools for
working with explicit table semantics in command line suites, and where
they fit the data domain, that was hugely useful. Generally speaking, they
were special purpose enough to warrant their own tools, but still general
purpose enough to be composable (were designed for use in shell pipelines)
and applicable in domains beyond the intentions of their original authors.
Still, the burden of "thinking in tables" would make them too heavyweight
for a lot of common use cases. Sometimes my data structure is "paragraphs
of text":
$ lorem -p 3 | perl -00 -wnle '2 == $. && print' | wc -w
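For comparison, a rough Python sketch of the same "paragraph mode" idea: treat runs of blank lines as record separators, pick the second record, and count its words. The function names are made up for illustration, and the blank-line handling is an approximation of what `perl -00` does.

```python
import re


def paragraphs(text):
    """Split text into paragraphs separated by one or more blank lines."""
    return [p for p in re.split(r"\n\s*\n", text.strip()) if p]


def word_count(paragraph):
    """Count whitespace-separated words, as `wc -w` would."""
    return len(paragraph.split())


sample = "one two\nthree\n\nfour five six\nseven\n\n\neight\n"
second = paragraphs(sample)[1]
print(word_count(second))
```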
Other times I want a tree (JSON, s-expressions, ...), or even a stream of
trees[2]. I consider it a feature that these more complex data structures
are not assumed or imposed in contexts where they are not needed.
Take care,
-Al
[0] You could get 'ls' to do it, too, (without '-t') but here the use of
TIME_STYLE is a presumably non-portable (but handy!) GNU-ism:
$ TIME_STYLE='+%s' ls -l | tail -n +2 | sort -k6,6n | xargs -L1 sh -c 'shift 5; echo "$@"'
It's different from the '-t' option, though, in that it forces a
predictable date field format in the output of 'ls -l', so side-steps
the need for downstream date parsing altogether and simply jumps into
sorting (after chopping off the 'total N' header (groans all around)).
[1] E.g.,
$ # read 'bmt' as: "by mtime"
$ alias bmt='stat -c "%Y %n" * | sort -k1,1n | xargs -L1 sh -c '"'echo "'"$@"'"'"
$ bmt
[2] Probably flattened.
* Re: [TUHS] Command line options and complexity
2020-03-03 18:44 ` Adam Thornton
2020-03-04 4:11 ` Tyler Adams
2020-03-04 21:50 ` Random832
@ 2020-03-04 22:03 ` Random832
2020-03-04 23:25 ` Terry Jones
2 siblings, 1 reply; 68+ messages in thread
From: Random832 @ 2020-03-04 22:03 UTC (permalink / raw)
To: Grant Taylor via TUHS
I put a lot of thoughts in my previous message, but hit send before thinking of a good way to summarize my main point...
On Tue, Mar 3, 2020, at 13:44, Adam Thornton wrote:
> Somewhat disingenuous. I mean, yes, that's true, but on the other hand
> it means that you have to keep the "what Powershell commands operate on
> what structure" in your head instead, since you can no longer assume
> the pipelines to be a universal interface.
The thing is, each Unix command imposes an implied structure on its
input, so it's not *really* a universal interface. Some operate on
lines as free text, some operate on space-delimited fields [with no
good way to escape them, though some do support an IFS environment
variable to at least change the delimiter], some work best with
fixed-width fields. Few provide a way to embed delimiters [be they
newline/null for record separator, tab/comma/space field separators, or
a user-defined separator for commands that support that] within a
value. Sort requires all values to be comparable as either strings or
numbers. Most commands you might want to use as a source in a pipeline
also expect to be used directly for human-readable output, so they
produce output that can be difficult to use for further processing
(e.g. dates in ls, which not only can't be sorted directly, but also
are limited to minutes for dates in the past year, and days for dates
before that, and are in the local time zone)
Hardly *any* commands you'd use in a pipeline really operate on unstructured bytes. Compression, I suppose. But other than that, you have just as much need to know what commands operate on what structure in Unix as in Powershell - the only difference is that the serialization is explicitly part of the interface... and due to the typical inability to escape delimiters, leaky.
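To make the "leaky serialization" point concrete, here is a small Python sketch that uses the standard `csv` module as one example of a text format that can escape embedded delimiters. CSV is chosen here only for illustration; the message does not propose a specific format, and the helper names are hypothetical.

```python
import csv
import io


def to_text(rows):
    """Serialize rows to CSV text, quoting embedded delimiters as needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def from_text(text):
    """Parse CSV text back into rows of strings."""
    return list(csv.reader(io.StringIO(text, newline="")))


rows = [
    ["a b", "x,y", "two\nlines"],   # space, comma, and newline inside values
    ["plain", "", 'q"q'],           # empty field and an embedded quote
]

text = to_text(rows)
print(text)

# A naive line split sees more "records" than there really are.
print(len(text.splitlines()), "lines for", len(rows), "rows")
```

With an escaping-aware reader the values survive the round trip; with a plain line or field split they do not, which is the leak described above.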
* Re: [TUHS] Command line options and complexity
2020-03-04 22:03 ` Random832
@ 2020-03-04 23:25 ` Terry Jones
0 siblings, 0 replies; 68+ messages in thread
From: Terry Jones @ 2020-03-04 23:25 UTC (permalink / raw)
To: Random832; +Cc: Grant Taylor via TUHS
[-- Attachment #1: Type: text/plain, Size: 3694 bytes --]
On Wed, Mar 4, 2020 at 11:04 PM Random832 <random832@fastmail.com> wrote:
> Hardly *any* commands you'd use in a pipeline really operate on
> unstructured bytes. Compression, I suppose. But other than that, you have
> just as much need to know what commands operate on what structure in Unix
> as in Powershell - the only difference is that the serialization is
> explicitly part of the interface... and due to the typical inability to
> escape delimiters, leaky.
>
Another difference is that probably most people on this list are extremely
familiar with the various quirks and I/O nuances of the tools many have
been using every day for decades. Just as the native speakers of a natural
language can't so easily see/appreciate its complexity (e.g., pronunciation
in English!), I suspect many of us have internalized these idiosyncrasies.
I teach occasional shell/Python courses to absolute beginners (no computing
experience at all) and came to appreciate how weird the shell is (in the
sense of having baked-in historical accidents that cannot / will not /
should not be "corrected"). Some of my appreciation of that was due to
discussions on this list (e.g., regarding comment syntax, and the :
command) - so thanks!
I know what follows won't be to everyone's taste, but I like Python and I
love shell pipelines, so I tried to write a shell that gave you both and
which allowed fairly free mixing of invoking UNIX tools and running Python.
You can send anything down its pipelines - lines of text, atoms, numbers,
Python objects, whatever (in the Python _ variable). Of course the
receiving end of the pipeline needs to know (or figure out) what it's
getting. One advantage is that you have a carefully designed programming
language (no offence intended!) underlying the shell, so you can e.g.,
write shell functions in Python (and put them in a start-up file if you
want) and just pipe regular UNIX output into them and pipe their output
into whatever's next (more Python, another UNIX command, etc). Probably
almost no one would actually want to regularly do the following on the
command line, but you could:
>>> from os import stat
>>> def fd(): return [name for (time, name) in sorted((stat(f).st_mtime, f)
for f in _)]
>>> ls | fd() | tail -n 3
Here I've stuck a simple (DSU - see [1]) Python function in between two
UNIX commands and used it to get the most recently modified files.
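For readers unfamiliar with the decorate-sort-undecorate (DSU) idiom referenced in [1], here is a minimal standalone sketch in plain Python, outside any shell. The function name and the sample data are made up for illustration; the key must come first in each decorated tuple so that the sort orders by it.

```python
def sort_dsu(items, keyfunc):
    """Sort items by keyfunc(item) using decorate-sort-undecorate."""
    # Decorate: pair each item with its key; the index breaks ties
    # so the items themselves are never compared.
    decorated = [(keyfunc(item), index, item) for index, item in enumerate(items)]
    # Sort: tuples compare element by element, so the key dominates.
    decorated.sort()
    # Undecorate: strip the key and index again.
    return [item for _key, _index, item in decorated]


# Hypothetical modification times, standing in for stat(f).st_mtime.
mtimes = {"c.txt": 3.0, "a.txt": 2.0, "b.txt": 1.0}
names = list(mtimes)

print(sort_dsu(names, mtimes.get))
# The built-in key= argument gives the same ordering.
print(sorted(names, key=mtimes.get))
```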
You probably wouldn't want to do this either, but you could:
>>> seq 0 9 | list(map(lambda x: 2 ** int(x), _)) | tee /tmp/powers-of-two | sum(map(int, _))
1023
>>> cat /tmp/powers-of-two
1
2
4
8
16
32
64
128
256
512
Of course it also lets you do things you *would* want to do :-)
More at https://github.com/terrycojones/daudin
Python has fairly nice
tools for reading and evaluating Python code, which meant that getting a
first version of this implemented took only one evening of playing around.
It's pretty simple (and still has plenty of rough edges). Apologies if
this seems like self-promotion, but I very much enjoy thinking about things
in this thread and about how we work with information. I'm also constantly
blown away by how elegant UNIX is and how the core ideas have endured.
Pipelines are really wonderful, as a "natural" alternative to function
composition as a mathematician or programmer would do it (see point #1 at
https://github.com/terrycojones/daudin#background--thanks), and I wanted to
build a shell that preserved that, while giving you Python. The overview of
their history on pages 67-70 of bwk's recent book [2] is very interesting.
Terry
[1] https://en.wikipedia.org/wiki/Schwartzian_transform
[2] https://www.amazon.com/UNIX-History-Memoir-Brian-Kernighan/dp/1695978552
* Re: [TUHS] Command line options and complexity
2020-03-03 18:15 Jon Steinhart
2020-03-03 18:44 ` Adam Thornton
@ 2020-03-10 23:03 ` Dan Stromberg
2020-03-11 3:18 ` Dave Horsfall
1 sibling, 1 reply; 68+ messages in thread
From: Dan Stromberg @ 2020-03-10 23:03 UTC (permalink / raw)
To: Jon Steinhart; +Cc: tuhs
[-- Attachment #1: Type: text/plain, Size: 475 bytes --]
When I took a comparative languages class in school, the teacher said that
the complexity of a programming language varies with the square of its
number of features.
I wonder if it's similar for command line options in shell-callables?
On the other hand, adding command line options was (at least at one time)
seen as a way of distinguishing GNU tools from Unix tools - that is, they
were seen as a way of avoiding the copyright lawsuits that were nipping at
BSD's heels.
* Re: [TUHS] Command line options and complexity
2020-03-10 23:03 ` Dan Stromberg
@ 2020-03-11 3:18 ` Dave Horsfall
2020-03-11 4:02 ` Steve Nickolas
2020-03-11 22:56 ` Greg 'groggy' Lehey
0 siblings, 2 replies; 68+ messages in thread
From: Dave Horsfall @ 2020-03-11 3:18 UTC (permalink / raw)
To: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 1952 bytes --]
On Tue, 10 Mar 2020, Dan Stromberg wrote:
> When I took a comparative languages class in school, the teacher said
> that the complexity of a programming language varies with the square of
> its number of features.
That sort of makes sense from a mathematical point of view, if you regard
it as a matrix of side effects. I hate to think about how it affects Perl
(my favourite language) though :-)
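One way to read the "matrix of side effects" remark, offered here only as an assumption about what the teacher meant, is that every pair of features can potentially interact, so the number of pairs to think about grows roughly with the square of the feature count. A short Python sketch of that arithmetic:

```python
from itertools import combinations


def pairwise_interactions(n):
    """Number of unordered pairs among n features: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (5, 10, 20, 40):
    # Cross-check the closed form against explicit enumeration.
    assert pairwise_interactions(n) == len(list(combinations(range(n), 2)))
    print(n, "features ->", pairwise_interactions(n), "possible pairs")
```

Doubling the number of features roughly quadruples the number of pairs, which is the quadratic growth the quoted claim describes.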
> I wonder if it's similar for command line options in shell-callables?
I'm starting to think that if a utility requires many options then perhaps
they ought to be split into filters (or at least environment variables); I
despair at how *ix is drifting from "one tool, one job" to "one size fits
all"...
The "ls" command for example really needs an option-ectomy; I find that I
don't really care about the exact number of bytes there are in a file as
the nearest KiB or MiB (or even GiB) is usually good enough, so I'd be
happy if "-h" was the default with some way to turn it off (yes, I know
that it's occasionally useful to add them all up in a column, but that
won't tell you how many media blocks are required).
Quickly now, without looking: which option shows unprintable characters in
a filename? Unless you use it regularly (in which case you have real
problems) you would have to look it up; I find "ls ... | od -bc" to
be quicker, especially on filenames with trailing blanks etc (which "-B"
won't show).
> On the other hand, adding command line options was (at least at one
> time) seen as a way of distinguishing GNU tools from Unix tools -
> that is, they were seen as a way of avoiding the copyright lawsuits that
> were nipping at BSD's heels.
I've never liked GNU's "--bloody-long-option" convention as you still have
to look up which one does what, but I've never thought about that view; a
lot of long options still accept a single character (subject to feeping
creaturism, of course).
-- Dave
* Re: [TUHS] Command line options and complexity
2020-03-11 3:18 ` Dave Horsfall
@ 2020-03-11 4:02 ` Steve Nickolas
2020-03-11 22:56 ` Greg 'groggy' Lehey
1 sibling, 0 replies; 68+ messages in thread
From: Steve Nickolas @ 2020-03-11 4:02 UTC (permalink / raw)
To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society
On Wed, 11 Mar 2020, Dave Horsfall wrote:
> I'm starting to think that if a utility requires many options then perhaps
> they ought to be split into filters (or at least environment variables); I
> despair at how *ix is drifting from "one tool, one job" to "one size fits
> all"...
>
> The "ls" command for example really needs an option-ectomy; I find that I
> don't really care about the exact number of bytes there are in a file as the
> nearest KiB or MiB (or even GiB) is usually good enough, so I'd be happy if
> "-h" was the default with some way to turn it off (yes, I know that it's
> occasionally useful to add them all up in a column, but that won't tell you
> how many media blocks are required).
>
> Quickly now, without looking: which option shows unprintable characters in a
> filename? Unless you use it regularly (in which case you have real problems)
> you would have to look it up; I find that "ls ... | od -bc" to be quicker,
> especially on filenames with trailing blanks etc (which "-B" won't show).
It would probably be interesting to define a simplified standard, because
yeesh, trying to implement even a command as basic as ls is just torture
(mainly because it basically requires putting all of "column" and most of
"sort" into it)!
> I've never liked GNU's "--bloody-long-option" convention as you still have to
> look up which one does what, but I've never thought about that view; a lot of
> long options still accept a single character (subject to feeping creaturism,
> of course).
I'm still into the one-character switch thing, personally.
-uso.
* Re: [TUHS] Command line options and complexity
2020-03-11 3:18 ` Dave Horsfall
2020-03-11 4:02 ` Steve Nickolas
@ 2020-03-11 22:56 ` Greg 'groggy' Lehey
2020-03-11 23:14 ` Dan Cross
` (2 more replies)
1 sibling, 3 replies; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-11 22:56 UTC (permalink / raw)
To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 1821 bytes --]
On Wednesday, 11 March 2020 at 14:18:08 +1100, Dave Horsfall wrote:
>
> The "ls" command for example really needs an option-ectomy; I find that I
> don't really care about the exact number of bytes there are in a file as
> the nearest KiB or MiB (or even GiB) is usually good enough, so I'd be
> happy if "-h" was the default with some way to turn it off (yes, I know
> that it's occasionally useful to add them all up in a column, but that
> won't tell you how many media blocks are required).
A good example. But you're not removing options, you're just
redefining them. In fact I find the -h option particularly emetic, so
a better choice in removing options would be to remove -h and use a
filter to mutilate the sizes:
$ ls -l | humanize
But that's a pain, isn't it? That's why there's a -h option for
people who like it. Note that you can't do it the other way round:
you can't get the exact size from -h output.
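The asymmetry described above (exact sizes can be humanized, but humanized sizes cannot be made exact again) is easy to see in a sketch. The `humanize` function below is a hypothetical stand-in for such a filter's core logic, not the code behind any real `ls -h`; its rounding rules are an assumption and may differ from actual implementations.

```python
def humanize(nbytes):
    """Render a byte count with a binary unit suffix (lossy by design)."""
    units = ["B", "K", "M", "G", "T", "P"]
    value = float(nbytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    if unit == "B":
        return f"{nbytes}B"
    # One decimal place for small values, none otherwise (an assumption).
    if value < 10:
        return f"{value:.1f}{unit}"
    return f"{value:.0f}{unit}"


print(humanize(8234010624))
print(humanize(13225168900))

# Many different exact sizes collapse onto the same short string,
# so there is no inverse function.
print(humanize(8234010624) == humanize(8234010625))
```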
And then there's the question why you don't like the standard output.
Because the number strings are too long and difficult to read, maybe?
That's the rationale for the -, option.
> Quickly now, without looking: which option shows unprintable
> characters in a filename? Unless you use it regularly (in which
> case you have real problems) you would have to look it up; I find
> that "ls ... | od -bc" to be quicker, especially on filenames with
> trailing blanks etc (which "-B" won't show).
This is arguably a bug in the -B option. I certainly don't think the
pipe notation is quicker. But it's nice to have both alternatives.
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
* Re: [TUHS] Command line options and complexity
2020-03-11 22:56 ` Greg 'groggy' Lehey
@ 2020-03-11 23:14 ` Dan Cross
2020-03-12 0:42 ` Greg 'groggy' Lehey
2020-03-12 0:53 ` Steve Nickolas
2020-03-12 5:22 ` Dave Horsfall
2 siblings, 1 reply; 68+ messages in thread
From: Dan Cross @ 2020-03-11 23:14 UTC (permalink / raw)
To: Greg 'groggy' Lehey; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 3408 bytes --]
On Wed, Mar 11, 2020 at 6:57 PM Greg 'groggy' Lehey <grog@lemis.com> wrote:
> On Wednesday, 11 March 2020 at 14:18:08 +1100, Dave Horsfall wrote:
> >
> > The "ls" command for example really needs an option-ectomy; I find that I
> > don't really care about the exact number of bytes there are in a file as
> > the nearest KiB or MiB (or even GiB) is usually good enough, so I'd be
> > happy if "-h" was the default with some way to turn it off (yes, I know
> > that it's occasionally useful to add them all up in a column, but that
> > won't tell you how many media blocks are required).
>
> A good example. But you're not removing options, you're just
> redefining them. In fact I find the -h option particularly emetic, so
> a better choice in removing options would be to remove -h and use a
> filter to mutilate the sizes:
>
> $ ls -l | humanize
>
> But that's a pain, isn't it?
I don't know; that's subjective.
> That's why there's a -h option for
> people who like it.
That's incomplete, in that it implies that an option is the only way to
achieve the goal of reducing the perceived pain, but that's not the case.
(Note I'm not saying you intended that as an interpretation, but it's a
reasonable intuition for an intention.)
An interesting counterpoint to this argument is how columnized "ls" is
handled under Plan 9: there is no `-C` option to `ls` there; instead,
there's a general-purpose `mc` filter that figures out the size of the
window it's running in, reads its input, decides how many columns the input
will fit into, and emits it columnized. But yes, it would be a pain to type
`ls | mc` every time one wanted columnized `ls` output, so this is wrapped
up into a shell script called `lc`. Note that this lets you do stuff like,
`lc -l` and see multi-column long listings if the window is wide enough.
I got so used to this from plan9 that I keep an approximation in
$HOME/bin/lc: `exec ls -ACF "$@"`.
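A general-purpose columnizing filter in the spirit of the description above can be sketched in a few lines of Python. This is not Plan 9's `mc`; the `columnize` name, the two-space gutter, and the column-major fill order are assumptions made for the sketch.

```python
import shutil


def columnize(names, width=None):
    """Lay out names in as many columns as fit in the given width."""
    if not names:
        return ""
    if width is None:
        # Ask the terminal how wide it is, falling back to a default.
        width = shutil.get_terminal_size().columns
    col_width = max(len(name) for name in names) + 2
    ncols = max(1, width // col_width)
    nrows = -(-len(names) // ncols)  # ceiling division
    lines = []
    for r in range(nrows):
        # Column-major order: row r holds items r, r + nrows, r + 2*nrows, ...
        row = names[r::nrows]
        lines.append("".join(name.ljust(col_width) for name in row).rstrip())
    return "\n".join(lines)


print(columnize(["a", "bb", "ccc", "d", "e"], width=10))
```

Because the function only sees a list of strings, it works equally well on the output of any command, which is the point of keeping it out of `ls`.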
For the `humanize` thing, I don't see why one couldn't have an `lh` command
that generated "human-friendly long output from ls."
> Note that you can't do it the other way round:
> you can't get the exact size from -h output.
>
That's true, but now the logic is specialized to ls, and not applicable to
anything else (e.g., du? df? wc, perhaps?). Similarly with `-,`. It is not
general purpose, which is unfortunate.
Granted, combining these things would be a little challenging, but is it
likely that one would want `ls -l,h`? Optimize for the common case, etc....
> And then there's the question why you don't like the standard output.
> Because the number strings are too long and difficult to read, maybe?
> That's the rationale for the -, option.
>
> > Quickly now, without looking: which option shows unprintable
> > characters in a filename? Unless you use it regularly (in which
> > case you have real problems) you would have to look it up; I find
> > that "ls ... | od -bc" to be quicker, especially on filenames with
> > trailing blanks etc (which "-B" won't show).
>
> This is arguably a bug in the -B option. I certainly don't think the
> pipe notation is quicker. But it's nice to have both alternatives.
By default, plan9 would quote filenames that had characters that were
special to the shell (there wasn't really the concept of "non-printable
characters" in the Unix/TTY sense); this could be disabled by specifying the
`-Q` option.
- Dan C.
* Re: [TUHS] Command line options and complexity
2020-03-11 23:14 ` Dan Cross
@ 2020-03-12 0:42 ` Greg 'groggy' Lehey
0 siblings, 0 replies; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-12 0:42 UTC (permalink / raw)
To: Dan Cross; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 5879 bytes --]
On Wednesday, 11 March 2020 at 19:14:32 -0400, Dan Cross wrote:
> On Wed, Mar 11, 2020 at 6:57 PM Greg 'groggy' Lehey <grog@lemis.com> wrote:
>
>> On Wednesday, 11 March 2020 at 14:18:08 +1100, Dave Horsfall wrote:
>>>
>>> The "ls" command for example really needs an option-ectomy; I find that I
>>> don't really care about the exact number of bytes there are in a file as
>>> the nearest KiB or MiB (or even GiB) is usually good enough, so I'd be
>>> happy if "-h" was the default with some way to turn it off (yes, I know
>>> that it's occasionally useful to add them all up in a column, but that
>>> won't tell you how many media blocks are required).
>>
>> A good example. But you're not removing options, you're just
>> redefining them. In fact I find the -h option particularly emetic, so
>> a better choice in removing options would be to remove -h and use a
>> filter to mutilate the sizes:
>>
>> $ ls -l | humanize
>>
>> But that's a pain, isn't it?
>
> I don't know; that's subjective.
It's certainly more work than -h.
>> That's why there's a -h option for people who like it.
>
> That's incomplete, in that it implies that an option is the only way
> to achieve the goal of reducing the perceived pain, but that's not
> the case. (Note I'm not saying you intended that as an
> interpretation, but it's a reasonable intuition for an intention.)
What I meant (and this is certainly my interpretation) was that
somebody added the -h option because of perceived pain with piping
output through another program. I didn't intend to imply that it was
the only alternative.
> An interesting counterpoint to this argument is how columnized "ls"
> is handled under Plan 9: there is no `-C` option to `ls` there;
> instead, there's a general-purpose `mc` filter that figures out the
> size of the window it's running in, reads its input, decides how
> many columns the input will fit into, and emits it columnized. But
> yes, it would be a pain to type `ls | mc` every time one wanted
> columnized `ls` output, so this is wrapped up into a shell script
> called `lc`. Note that this lets you do stuff like, `lc -l` and see
> multi-column long listings if the window is wide enough.
Yes, that sounds like an excellent method.
> For the `humanize` thing, I don't see why one couldn't have an `lh`
> command that generated "human-friendly long output from ls."
And yes, I deliberately didn't mention this option, though it occurred
to me. I have a couple of scripts like this, like:
alias l="ls -lbL,"
>> Note that you can't do it the other way round: you can't get the
>> exact size from -h output.
>
> That's true, but now the logic is specialized to ls, and not
> applicable to anything else (e.g., du? df? wc, perhaps?). Similarly
> with `-,`. It is not general purpose, which is unfortunate.
Yes, this is an issue that I mentioned in an earlier message (I added
a positional parameter to work around it). But this is in the nature
of the output. mc doesn't have this issue.
> Granted, combining these things would be a little challenging, but is it
> likely that one would want `ls -l,h`? Optimize for the common case,
> etc....
Heh. Never thought of that. But since -h (apparently) never produces
output with 4 digits, the -, doesn't ever come into effect. I've just
tried it on some big files, and the -, is effectively ignored.
> And then there's the question why you don't like the standard
> output.
I don't like the standard output because things like this are hard to
read:
-rw-r--r-- 1 grog lemis 8234010624 22 Mar 2012 Casanova-TV-1-5
-rw-r--r-- 1 grog home 13225168900 31 Aug 2019 Movie:_Sahara_2005-2016-04-11-2028
I find this easier to read:
-rw-r--r-- 1 grog lemis 8,234,010,624 22 Mar 2012 Casanova-TV-1-5
-rw-r--r-- 1 grog home 13,225,168,900 31 Aug 2019 Movie:_Sahara_2005-2016-04-11-2028
I can't speak for Dave, but this is also less painful:
-rw-r--r-- 1 grog lemis 7.7G 22 Mar 2012 Casanova-TV-1-5
-rw-r--r-- 1 grog home 12G 31 Aug 2019 Movie:_Sahara_2005-2016-04-11-2028
The problem for me there is the difficulty comparing lengths, and the
implicit inaccuracy.
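The digit-grouping form shown above loses no information, which a generic filter could exploit. As a small sketch, Python's format mini-language can produce and reverse it; `group_digits` and `ungroup_digits` are hypothetical names used only here.

```python
def group_digits(n):
    """Format an integer with comma thousands separators."""
    return f"{n:,}"


def ungroup_digits(text):
    """Recover the exact integer from its grouped form."""
    return int(text.replace(",", ""))


for size in (8234010624, 13225168900):
    grouped = group_digits(size)
    # Unlike a humanized size, the grouped form round-trips exactly.
    print(grouped, ungroup_digits(grouped) == size)
```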
>> Because the number strings are too long and difficult to read, maybe?
>> That's the rationale for the -, option.
>>
>>> Quickly now, without looking: which option shows unprintable
>>> characters in a filename? Unless you use it regularly (in which
>>> case you have real problems) you would have to look it up; I find
>>> that "ls ... | od -bc" to be quicker, especially on filenames with
>>> trailing blanks etc (which "-B" won't show).
>>
>> This is arguably a bug in the -B option. I certainly don't think the
>> pipe notation is quicker. But it's nice to have both alternatives.
>
> By default, plan9 would quote filenames that had characters that
> were special to the shell (there wasn't really the concept of
> "non-printable characters" in the Unix/TTY sense); this could be
> disabled by specifying the `-Q` option.
Hmm. In this particular case, so does Linux:
=== grog@bilbo (/dev/pts/11) ~ 2 -> touch "foo "
=== grog@bilbo (/dev/pts/11) ~ 4 -> l foo*
-rw-r--r-- 1 grog grog 1499570 Jun 30 2012 foo
-rw-r--r-- 1 grog grog 0 Mar 12 10:40 'foo '
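The quoting behaviour in that listing can be approximated in Python with the standard `shlex.quote`, which leaves safe names alone and wraps names containing shell-special characters (such as a trailing blank) in single quotes. The `display_name` helper is hypothetical, and falling back to `repr` for control characters is a choice made for this sketch, not a description of what any `ls` does.

```python
import shlex


def display_name(name):
    """Render a filename so that odd characters become visible."""
    if not name.isprintable():
        # repr() escapes control characters, e.g. a tab shows as \t.
        return repr(name)
    # Quote only when the name contains shell-special characters.
    return shlex.quote(name)


for name in ["foo", "foo ", "a\tb"]:
    print(display_name(name))
```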
I wonder if that's something we should emulate in FreeBSD. At the
very least we should consider whether the lack of identification of
trailing blanks is a bug in the FreeBSD implementation of -B. This
option isn't in POSIX, and in Linux it means
-B, --ignore-backups
do not list implied entries ending with ~
So maybe it's a candidate for fixing.
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 163 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-11 22:56 ` Greg 'groggy' Lehey
2020-03-11 23:14 ` Dan Cross
@ 2020-03-12 0:53 ` Steve Nickolas
2020-03-12 3:09 ` Greg 'groggy' Lehey
` (2 more replies)
2020-03-12 5:22 ` Dave Horsfall
2 siblings, 3 replies; 68+ messages in thread
From: Steve Nickolas @ 2020-03-12 0:53 UTC (permalink / raw)
To: Greg 'groggy' Lehey; +Cc: The Eunuchs Hysterical Society
On Thu, 12 Mar 2020, Greg 'groggy' Lehey wrote:
> On Wednesday, 11 March 2020 at 14:18:08 +1100, Dave Horsfall wrote:
>>
>> The "ls" command for example really needs an option-ectomy; I find that I
>> don't really care about the exact number of bytes there are in a file as
>> the nearest KiB or MiB (or even GiB) is usually good enough, so I'd be
>> happy if "-h" was the default with some way to turn it off (yes, I know
>> that it's occasionally useful to add them all up in a column, but that
>> won't tell you how many media blocks are required).
>
> A good example. But you're not removing options, you're just
> redefining them. In fact I find the -h option particularly emetic, so
> a better choice in removing options would be to remove -h and use a
> filter to mutilate the sizes:
>
> $ ls -l | humanize
>
> But that's a pain, isn't it? That's why there's a -h option for
> people who like it. Note that you can't do it the other way round:
> you can't get the exact size from -h output.
>
> And then there's the question why you don't like the standard output.
> Because the number strings are too long and difficult to read, maybe?
> That's the rationale for the -, option.
>
>> Quickly now, without looking: which option shows unprintable
>> characters in a filename? Unless you use it regularly (in which
>> case you have real problems) you would have to look it up; I find
>> that "ls ... | od -bc" to be quicker, especially on filenames with
>> trailing blanks etc (which "-B" won't show).
>
> This is arguably a bug in the -B option. I certainly don't think the
> pipe notation is quicker. But it's nice to have both alternatives.
>
> Greg
> --
> Sent from my desktop computer.
> Finger grog@lemis.com for PGP public key.
> See complete headers for address and phone numbers.
> This message is digitally signed. If your Microsoft mail program
> reports problems, please read http://lemis.com/broken-MUA
>
I went through all the switches defined by POSIX, and figured that those
26 could be cut down. My concept reduced the number of switches from 26
to 9 (FLRadfiln). Of course, the idea is to be more minimalist than
POSIX, so some people's opinions on what is or isn't necessary may differ
from mine.
Of course, this changes the default behavior of ls because it no longer
would be able to do columnar listings (|column for that).
I felt -A was a redundant "almost -a".
I felt -C and -x were redundant because a tool like column(1) could be
used to do the same job (even though column(1) isn't POSIX).
I felt -H was a redundant "almost -L".
I felt -S, -r and -t could be implemented in other ways using sort(1).
I felt -c and -u were meaningless, but that's because of the filesystems I
usually work with that do not have functional equivalents. -u for one is
completely useless on VFAT even though it has such timestamps! YMMV.
I felt -g and -o could be replaced by cut(1).
I felt -k wasn't really all that important. Just halve the numbers.
I felt -m wasn't really all that important. There's other ways to convert
to that format, no doubt, through filters.
I felt -p was a redundant "almost -F".
I felt -q could be done just fine with something like tr(1).
I felt -s was a redundant "kindasorta -l".
And -1 becomes the new default, so it's redundant. ;)
Again, YMMV. ;)
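Two of the replacements suggested in the list above, sketched in Python rather than with sort(1) and tr(1). The function names, the default of field 5 for the size column, and the choice of "?" as the substitute character are assumptions for illustration; the size sort only works on exact byte counts, not on humanized output.

```python
def sort_by_size(lines, size_field=5):
    """Order long-format listing lines by their size column, largest first."""
    def key(line):
        fields = line.split()
        if len(fields) >= size_field and fields[size_field - 1].isdigit():
            return int(fields[size_field - 1])
        # Lines without a numeric size (such as a "total" header) sort last.
        return -1
    return sorted(lines, key=key, reverse=True)


def mask_unprintable(name: str) -> str:
    """Replace every non-printable character in a name with '?'."""
    return "".join(c if c.isprintable() else "?" for c in name)


if __name__ == "__main__":
    listing = [
        "-rw-r--r-- 1 grog lemis 8234010624 22 Mar 2012 Casanova-TV-1-5",
        "-rw-r--r-- 1 grog home 13225168900 31 Aug 2019 Movie:_Sahara_2005-2016-04-11-2028",
    ]
    for line in sort_by_size(listing):
        print(line)
    print(mask_unprintable("a\x07b"))
```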
-uso.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 0:53 ` Steve Nickolas
@ 2020-03-12 3:09 ` Greg 'groggy' Lehey
2020-03-12 3:34 ` Steve Nickolas
2020-03-12 5:38 ` Dave Horsfall
2020-03-12 6:48 ` Peter Jeremy
2 siblings, 1 reply; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-12 3:09 UTC (permalink / raw)
To: Steve Nickolas; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 4120 bytes --]
On Wednesday, 11 March 2020 at 20:53:12 -0400, Steve Nickolas wrote:
> I went through all the switches defined by POSIX, and figured that
> those 26 could be cut down.
A brave man to defy POSIX! I wasn't so brave, which is why we have
the -y option.
> My concept reduced the number of switches from 26 to 9 (FLRadfiln).
> Of course, the idea is to be more minimalist than POSIX, so some
> people's opinions on what is or isn't necessary may differ from
> mine.
OK, let's compare notes:
> I felt -A was a redundant "almost -a".
Arguably -a could go too. The distinction seems arbitrary.
> I felt -C and -x were redundant because a tool like column(1) could be
> used to do the same job (even though column(1) isn't POSIX).
Neither would this ls(1) be.
> I felt -H was a redundant "almost -L".
No arguments, but I suspect that somebody had a good reason for this
distinction, and removing it could cause problems.
> I felt -S, -r and -t could be implemented in other ways using sort(1).
-S isn't POSIX. And to implement it without an option would mean
removing -h.
As I mentioned earlier, -t can't be done by a filter without
significantly modifying the timestamp output. That was my rationale
for the -D option, which allows sorting by an external filter.
-r could work.
> I felt -c and -u were meaningless, but that's because of the filesystems I
> usually work with that do not have functional equivalents. -u for one is
> completely useless on VFAT even though it has such timestamps! YMMV.
I think this says more about your file systems than about the options.
I find both incredibly useful, and there's no easy way to get the
information elsewhere. stat(1) would be an option, but then that
could replace ls(1) completely.
> I felt -g and -o could be replaced by cut(1).
-g is already obsolete in FreeBSD (accepted and ignored). -o has
already been repurposed (show file flags).
> I felt -k wasn't really all that important. Just halve the numbers.
Agreed.
> I felt -m wasn't really all that important. There's other ways to convert
> to that format, no doubt, through filters.
Possibly. Certainly I wouldn't miss it.
> I felt -p was a redundant "almost -F".
OK.
> I felt -q could be done just fine with something like tr(1).
I think that it could be replaced by -b. "?" isn't really very
helpful.
> I felt -s was a redundant "kindasorta -l".
I can't agree with that, but I've never used it. The only sensible
use would appear to be talking about disk blocks, but on FreeBSD at
any rate it looks at the BLOCKSIZE environment variable, which I have
set to 1048576 (so that utilities will display in MB where
appropriate), and that's what -s does too:
2079 -rw-r--r-- 1 grog wheel 2,178,735,915 4 Oct 11:15 Willkommen-bei-den-Honeckers---Spielfilm,-Deutschland-2016-20191003-125200.mp4
That makes it pretty useless.
So, any others?
-G: Colorized output. I'd be *really* happy to get rid of this, but
it's not easy to instate with a filter, so I suppose there are
enough people who like it that it will have to stay.
-P: Seems only to be there to cancel a -H or -L.
-W: "Display whiteouts when scanning directories". I don't even
understand what that is.
-a: See discussion of -A.
--color: Again, no thanks.
-f: We haven't really discussed this one. If you want to remove -S,
-r and -t, then arguably -f should become the default and be
removed.
-n: Make it the default and require a filter to convert group and user
numbers to IDs.
-y: If we get rid of all sorting, it will no longer be needed.
-,: Make the option standard: output numbers with commas every 3
digits. Then this option specification wouldn't be needed.
Of course, none of this will happen. But it is interesting to think
about it. In particular, options like -g and -o, which are no longer
modern.
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 163 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 3:09 ` Greg 'groggy' Lehey
@ 2020-03-12 3:34 ` Steve Nickolas
2020-03-13 1:02 ` Greg 'groggy' Lehey
0 siblings, 1 reply; 68+ messages in thread
From: Steve Nickolas @ 2020-03-12 3:34 UTC (permalink / raw)
To: Greg 'groggy' Lehey; +Cc: The Eunuchs Hysterical Society
On Thu, 12 Mar 2020, Greg 'groggy' Lehey wrote:
> On Wednesday, 11 March 2020 at 20:53:12 -0400, Steve Nickolas wrote:
>> I went through all the switches defined by POSIX, and figured that
>> those 26 could be cut down.
>
> A brave man to defy POSIX! I wasn't so brave, which is why we have
> the -y option.
xD
>> My concept reduced the number of switches from 26 to 9 (FLRadfiln).
>> Of course, the idea is to be more minimalist than POSIX, so some
>> people's opinions on what is or isn't necessary may differ from
>> mine.
>
> OK, let's compare notes:
>
>> I felt -A was a redundant "almost -a".
>
> Arguably -a could go too. The distinction seems arbitrary.
Well, I think one or the other would be desirable. I figured -a was the
better to keep - since it shows all dotfiles where -A leaves off . and ..
.
>> I felt -C and -x were redundant because a tool like column(1) could be
>> used to do the same job (even though column(1) isn't POSIX).
>
> Neither would this ls(1) be.
Of course. ;)
<snip>
> -S isn't POSIX. And to implement it without an option would mean
> removing -h.
-h is a gnuism, isn't it?
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html does
specify the -S switch. That's POSIX, isn't it?
> As I mentioned earlier, -t can't be done by a filter without
> significantly modifying the timestamp output. That was my rationale
> for the -D option, which allows sorting by an external filter.
Understandable.
Honestly if the date format weren't standardized as it were, I would've
standardized on "yyyy-mm-dd,mm:ss" - which wouldn't need special
processing in order to pump into sort(1).
>> I felt -c and -u were meaningless, but that's because of the filesystems I
>> usually work with that do not have functional equivalents. -u for one is
>> completely useless on VFAT even though it has such timestamps! YMMV.
>
> I think this says more about your file systems than about the options.
> I find both incredibly useful, and there's no easy way to get the
> information elsewhere. stat(1) would be an option, but then that
> could replace ls(1) completely.
Perhaps true.
<snip>
> So, any others?
>
> -G: Colorized output. I'd be *really* happy to get rid of this, but
> it's not easy to instate with a filter, so I suppose there are
> enough people who like it that it will have to stay.
>
> -P: Seems only to be there to cancel a -H or -L.
>
> -W: "Display whiteouts when scanning directories". I don't even
> understand what that is.
I was using the link I referenced as my "standard", which doesn't have any
of those.
I can take or leave color ls. I don't like the GNU defaults because dark
blue is TOO dark on my default settings. I think the flags are adequate
to know what kind of file I'm dealing with.
> -f: We haven't really discussed this one. If you want to remove -S,
> -r and -t, then arguably -f should become the default and be
> removed.
I used to use "dir|sort" a lot on PC DOS before it got "dir /o" in 5.0. I
wouldn't have a problem with removing sort from ls altogether.
<snip>
> Of course, none of this will happen. But it is interesting to think
> about it. In particular, options like -g and -o, which are no longer
> modern.
-uso.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 3:34 ` Steve Nickolas
@ 2020-03-13 1:02 ` Greg 'groggy' Lehey
0 siblings, 0 replies; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-13 1:02 UTC (permalink / raw)
To: Steve Nickolas; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 1946 bytes --]
On Wednesday, 11 March 2020 at 23:34:46 -0400, Steve Nickolas wrote:
> On Thu, 12 Mar 2020, Greg 'groggy' Lehey wrote:
>> -S isn't POSIX. And to implement it without an option would mean
>> removing -h.
>
> -h is a gnuism, isn't it?
It might have originated there, but then I would expect it to be spelt
'--produce-human-readable-output'. I haven't been able to establish
from the FreeBSD sources or commit logs when it was introduced. It
would clearly have been a reimplementation.
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html does
> specify the -S switch. That's POSIX, isn't it?
So it is! This was the first option that I wanted to add, back when I
still had practice wheels. I asked my mentor, and he said "not the
Unix way", so I let it be. Then Wes Peters came up with the idea, and
I thought he committed it, but it seems that it ultimately came from
Kostas Blekos in 2005, based on the same feature on NetBSD and
OpenBSD. I wonder when it made it to POSIX.
>> As I mentioned earlier, -t can't be done by a filter without
>> significantly modifying the timestamp output. That was my rationale
>> for the -D option, which allows sorting by an external filter.
>
> Understandable.
>
> Honestly if the date format weren't standardized as it were, I would've
> standardized on "yyyy-mm-dd,mm:ss" - which wouldn't need special
> processing in order to pump into sort(1).
Yes, that was one of the possibilities I thought of. Another obvious
one was time_t, which is even easier to process. And then there's ISO
8601. That's why it didn't take me long to decide "do it *your* way"
with the -D option.
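The reason a year-first timestamp suits an external sort can be shown with a short Python sketch. The exact format string below (with hours added) and the use of UTC are assumptions; the point is only that such strings sort chronologically as plain text, the way sort(1) would treat them.

```python
import time


def sortable_stamp(epoch_seconds: float) -> str:
    """Format a time_t-style value as a year-first, fixed-width string."""
    # UTC avoids ambiguity around daylight-saving changes.
    return time.strftime("%Y-%m-%d,%H:%M:%S", time.gmtime(epoch_seconds))


if __name__ == "__main__":
    times = [1583978400, 0, 946684800]
    stamps = [sortable_stamp(t) for t in times]
    # Sorting the strings gives the same order as sorting the numbers.
    print(sorted(stamps))
```

Because every field is zero-padded and ordered from most to least significant, no date parsing is needed in the filter; the same holds for a bare time_t sorted numerically.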
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 163 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 0:53 ` Steve Nickolas
2020-03-12 3:09 ` Greg 'groggy' Lehey
@ 2020-03-12 5:38 ` Dave Horsfall
2020-03-12 6:48 ` Peter Jeremy
2 siblings, 0 replies; 68+ messages in thread
From: Dave Horsfall @ 2020-03-12 5:38 UTC (permalink / raw)
To: The Eunuchs Hysterical Society
On Wed, 11 Mar 2020, Steve Nickolas wrote:
> I felt -c and -u were meaningless, but that's because of the filesystems
> I usually work with that do not have functional equivalents. -u for one
> is completely useless on VFAT even though it has such timestamps!
> YMMV.
I find those flags really useful when doing forensic analysis on a file
system :-) One particular instance was at $ORKPLACE some years back when
a critical chunk of a file system had somehow disappeared overnight (it
was our source base!). I got to work by comparing login sessions with
those someone-unknown "ls" flags and had just about nailed the perp who
was online at the time when I was ordered off it in no uncertain terms.
Ummm, did I mention that my then $BOSS had a habit of working from home
after a few (and quite a few) drinks? As I said, I was this -><- far away
from fingering him... As it stood I knew who it was but wasn't able to
prove it in time.
-- Dave
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 0:53 ` Steve Nickolas
2020-03-12 3:09 ` Greg 'groggy' Lehey
2020-03-12 5:38 ` Dave Horsfall
@ 2020-03-12 6:48 ` Peter Jeremy
2020-03-12 7:37 ` Steve Nickolas
2020-03-12 23:57 ` Greg 'groggy' Lehey
2 siblings, 2 replies; 68+ messages in thread
From: Peter Jeremy @ 2020-03-12 6:48 UTC (permalink / raw)
To: Steve Nickolas; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 729 bytes --]
On 2020-Mar-11 20:53:12 -0400, Steve Nickolas <usotsuki@buric.co> wrote:
>On Thu, 12 Mar 2020, Greg 'groggy' Lehey wrote:
>> a better choice in removing options would be to remove -h and use a
>> filter to mutilate the sizes:
>>
>> $ ls -l | humanize
How does humanize decide which column to work on? If it only works on
"ls -l", then it's not useful if I want other columns as well. Maybe
it could just humanize any large number it found, but you probably
don't want to "humanize" the inode number or filename.
>I felt -s was a redundant "kindasorta -l".
Except they are reporting completely different things - consider sparse
files or filesystems (like ZFS) that support compression.
--
Peter Jeremy
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 963 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 6:48 ` Peter Jeremy
@ 2020-03-12 7:37 ` Steve Nickolas
2020-03-12 7:42 ` Warner Losh
2020-03-12 23:57 ` Greg 'groggy' Lehey
1 sibling, 1 reply; 68+ messages in thread
From: Steve Nickolas @ 2020-03-12 7:37 UTC (permalink / raw)
To: Peter Jeremy; +Cc: The Eunuchs Hysterical Society
On Thu, 12 Mar 2020, Peter Jeremy wrote:
> On 2020-Mar-11 20:53:12 -0400, Steve Nickolas <usotsuki@buric.co> wrote:
>
>> I felt -s was a redundant "kindasorta -l".
>
> Except they are reporting completely different things - consider sparse
> files or filesystems (like ZFS) that support compression.
I was under the impression that -s simply showed the file size divided by
512 and didn't account for sparseness or compression.
(Of the filesystems I frequently work with, one of them does actually
support sparseness (ProDOS).)
-uso.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 7:37 ` Steve Nickolas
@ 2020-03-12 7:42 ` Warner Losh
0 siblings, 0 replies; 68+ messages in thread
From: Warner Losh @ 2020-03-12 7:42 UTC (permalink / raw)
To: Steve Nickolas; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 790 bytes --]
On Thu, Mar 12, 2020, 1:37 AM Steve Nickolas <usotsuki@buric.co> wrote:
> On Thu, 12 Mar 2020, Peter Jeremy wrote:
>
> > On 2020-Mar-11 20:53:12 -0400, Steve Nickolas <usotsuki@buric.co> wrote:
> >
> >> I felt -s was a redundant "kindasorta -l".
> >
> > Except they are reporting completely different things - consider sparse
> > files or filesystems (like ZFS) that support compression.
>
> I was under the impression that -s simply showed the file size divided by
> 512 and didn't account for sparseness or compression.
>
Stat returns two values. The offset of the last byte and the number of
blocks allocated to the file. Useful if you have a sparse file too...
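A small Python sketch of the two values, assuming a POSIX system where `st_blocks` is available and counted in 512-byte units. The helper names are hypothetical, and whether the file really ends up sparse depends on the filesystem.

```python
import os
import tempfile


def make_sparse(path: str, size: int) -> None:
    """Extend an empty file to `size` bytes without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


def size_and_allocated(path: str):
    """Return (logical size in bytes, bytes actually allocated)."""
    st = os.stat(path)
    # st_size is the logical length; st_blocks counts 512-byte units.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        make_sparse(path, 1 << 20)
        print(size_and_allocated(path))
    finally:
        os.remove(path)
```

On a filesystem that supports holes, the second number is typically far smaller than the first, which is the difference between what -l and -s report.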
Warner
> (Of the filesystems I frequently work with, one of them does actually
> support sparseness (ProDOS).)
>
> -uso.
>
[-- Attachment #2: Type: text/html, Size: 1514 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 6:48 ` Peter Jeremy
2020-03-12 7:37 ` Steve Nickolas
@ 2020-03-12 23:57 ` Greg 'groggy' Lehey
1 sibling, 0 replies; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-12 23:57 UTC (permalink / raw)
To: Peter Jeremy; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 1085 bytes --]
On Thursday, 12 March 2020 at 17:48:07 +1100, Peter Jeremy wrote:
> On 2020-Mar-11 20:53:12 -0400, Steve Nickolas <usotsuki@buric.co> wrote:
>> On Thu, 12 Mar 2020, Greg 'groggy' Lehey wrote:
>>> a better choice in removing options would be to remove -h and use a
>>> filter to mutilate the sizes:
>>>
>>> $ ls -l | humanize
>
> How does humanize decide which column to work on?
It knows. It was written that way.
> If it only works on "ls -l", then it's not useful if I want other
> columns as well.
Right. You'd have to change it. Recall that this was just an
example.
> Maybe it could just humanize any large number it found, but you
> probably don't want to "humanize" the inode number or filename.
Yes, this is exactly the scenario I described in an earlier mail
message, where I called it
$ ls -l | commafy 5
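A filter of that kind might look like the following Python sketch. The name `commafy_field` and its behaviour (1-based field number, whitespace-separated fields, other columns left untouched) are assumptions for illustration, not the actual commafy program.

```python
import re
import sys


def commafy_field(line: str, field: int) -> str:
    """Add thousands separators to one whitespace-separated field (1-based)."""
    # Split on whitespace but keep the separators, so the spacing of
    # every other column is preserved exactly.
    parts = re.split(r"(\s+)", line)
    seen = 0
    for i, tok in enumerate(parts):
        if tok and not tok.isspace():
            seen += 1
            if seen == field:
                # Only touch the field if it is a plain decimal number.
                if tok.isascii() and tok.isdigit():
                    parts[i] = f"{int(tok):,}"
                break
    return "".join(parts)


if __name__ == "__main__":
    # Usage sketch: ls -l | python3 commafy.py 5
    column = int(sys.argv[1]) if len(sys.argv) > 1 else 5
    for line in sys.stdin:
        sys.stdout.write(commafy_field(line, column))
```

Naming the column explicitly is what keeps the inode number or a numeric file name from being rewritten, at the cost of the caller having to know the layout.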
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 163 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-11 22:56 ` Greg 'groggy' Lehey
2020-03-11 23:14 ` Dan Cross
2020-03-12 0:53 ` Steve Nickolas
@ 2020-03-12 5:22 ` Dave Horsfall
2020-03-12 5:35 ` Steve Nickolas
2020-03-13 0:36 ` Greg 'groggy' Lehey
2 siblings, 2 replies; 68+ messages in thread
From: Dave Horsfall @ 2020-03-12 5:22 UTC (permalink / raw)
To: The Eunuchs Hysterical Society
On Thu, 12 Mar 2020, Greg 'groggy' Lehey wrote:
> A good example. But you're not removing options, you're just redefining
> them. In fact I find the -h option particularly emetic, so a better
> choice in removing options would be to remove -h and use a filter to
> mutilate the sizes:
>
> $ ls -l | humanize
I also had something like that in mind, except being British/Australian
I'd spell it with an "s" :-)
> But that's a pain, isn't it? That's why there's a -h option for people
> who like it. Note that you can't do it the other way round: you can't
> get the exact size from -h output.
Which is why I suggested there be a means to turn it off; I'm becoming a
fan of environment variables to modify the standard behaviour of tools
(but I loathe the Penguin/OS default to use colours).
> And then there's the question why you don't like the standard output.
> Because the number strings are too long and difficult to read, maybe?
> That's the rationale for the -, option.
More than likely; as I approach age 68 I notice that I'm losing some
cognitive facility... I might start using "," and see if I like it, but I
see that the Mac doesn't have it (my Penguin is off the air at the
moment), and having it as an environment variable would be nice.
>> Quickly now, without looking: which option shows unprintable
>> characters in a filename? Unless you use it regularly (in which
>> case you have real problems) you would have to look it up; I find
>> that "ls ... | od -bc" to be quicker, especially on filenames with
>> trailing blanks etc (which "-B" won't show).
>
> This is arguably a bug in the -B option. I certainly don't think the
> pipe notation is quicker. But it's nice to have both alternatives.
Agreed; as for the bug I think it comes down to what is meant by an
unprintable character. I certainly remember finding "hidden" set-uid
shells with the name of ".. " etc back when I was going after the
UNSW kiddies with an axe back in the late 70s...
-- Dave
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 5:22 ` Dave Horsfall
@ 2020-03-12 5:35 ` Steve Nickolas
2020-03-13 0:36 ` Greg 'groggy' Lehey
1 sibling, 0 replies; 68+ messages in thread
From: Steve Nickolas @ 2020-03-12 5:35 UTC (permalink / raw)
To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society
On Thu, 12 Mar 2020, Dave Horsfall wrote:
> Which is why I suggested there be a means to turn it off; I'm becoming a fan
> of environment variables to modify the standard behaviour of tools (but I
> loathe the Penguin/OS default to use colours).
When I first used Linux, that wasn't the default. Personally, I don't
think it should be (actually I think there simply shouldn't be a color
mode at all to ls).
> More than likely; as I approach age 68 I notice that I'm losing some
> cognitive facility... I might start using "," and see if I like it, but I
> see that the Mac doesn't have it (my Penguin is off the air at the moment),
> and having it as an environment variable would be nice.
GNU ls does not appear to have a -, switch.
IBM, interestingly, introduced an environment variable in PC DOS 6.3 that
did the opposite thing. If the NO_SEP variable existed, it suppressed
commas in file sizes.
-uso.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-12 5:22 ` Dave Horsfall
2020-03-12 5:35 ` Steve Nickolas
@ 2020-03-13 0:36 ` Greg 'groggy' Lehey
2020-03-13 11:26 ` Dave Horsfall
2020-03-14 2:13 ` Greg A. Woods
1 sibling, 2 replies; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-13 0:36 UTC (permalink / raw)
To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society
[-- Attachment #1: Type: text/plain, Size: 2396 bytes --]
On Thursday, 12 March 2020 at 16:22:01 +1100, Dave Horsfall wrote:
> On Thu, 12 Mar 2020, Greg 'groggy' Lehey wrote:
>
>> A good example. But you're not removing options, you're just redefining
>> them. In fact I find the -h option particularly emetic, so a better
>> choice in removing options would be to remove -h and use a filter to
>> mutilate the sizes:
>>
>> $ ls -l | humanize
>
> I also had something like that in mind, except being British/Australian
> I'd spell it with an "s" :-)
It's a common misconception that -ize is US English. The Oxford
English Dictionary, normally not prescriptive, prefers it. See
https://www.oed.com/page/faqs/Frequently+asked+questions#spell. I
personally had -ise drummed out of me by my uncle, very much
Australian.
>> And then there's the question why you don't like the standard output.
>> Because the number strings are too long and difficult to read, maybe?
>> That's the rationale for the -, option.
>
> More than likely; as I approach age 68 I notice that I'm losing some
> cognitive facility... I might start using "," and see if I like it, but I
> see that the Mac doesn't have it (my Penguin is off the air at the
> moment), and having it as an environment variable would be nice.
Yes, currently only FreeBSD has it. But you have the sources. Apart
from option handling, it's only:
--- print.c (.../head/bin/ls/print.c) (revision 241014)
+++ print.c (.../stable/10/bin/ls/print.c) (working copy)
@@ -606,6 +606,10 @@
humanize_number(buf, sizeof(buf), (int64_t)bytes, "",
HN_AUTOSCALE, HN_B | HN_NOSPACE | HN_DECIMAL);
(void)printf("%*s ", (u_int)width, buf);
+ } else if (f_thousands) { /* with commas */
+ /* This format assignment needed to work round gcc bug. */
+ const char *format = "%*j'd ";
+ (void)printf(format, (u_int)width, bytes);
} else
(void)printf("%*jd ", (u_int)width, bytes);
}
A quick and dirty fix would be simply to replace the format string.
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 163 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-13 0:36 ` Greg 'groggy' Lehey
@ 2020-03-13 11:26 ` Dave Horsfall
2020-03-14 2:13 ` Greg A. Woods
1 sibling, 0 replies; 68+ messages in thread
From: Dave Horsfall @ 2020-03-13 11:26 UTC (permalink / raw)
To: The Eunuchs Hysterical Society
On Fri, 13 Mar 2020, Greg 'groggy' Lehey wrote:
>>> $ ls -l | humanize
>>
>> I also had something like that in mind, except being British/Australian
>> I'd spell it with an "s" :-)
>
> It's a common misconception that -ize is US English. The Oxford English
> Dictionary, normally not prescriptive, prefers it. See
> https://www.oed.com/page/faqs/Frequently+asked+questions#spell. I
> personally had -ise drummed out of me by my uncle, very much Australian.
I'm familiar with that (and also the fact that "aluminum" and "color" etc
were British spelling). Being born and bred British with pedantic parents
I've always hated "American" spelling as we called it, and it's sad to see
such noted media as the Sydney Morning Herald slowly adopting it over the
past few years; Australia has used British spelling at least since I
emigrated here in 1965.
Oh, it was meant to be a creat/create joke, BTW...
>> More than likely; as I approach age 68 I notice that I'm losing some
>> cognitive facility... I might start using "," and see if I like it,
>> but I see that the Mac doesn't have it (my Penguin is off the air at
>> the moment), and having it as an environment variable would be nice.
>
> Yes, currently only FreeBSD has it. But you have the sources. Apart
> from option handling, it's only:
[...]
I don't like my chances with suggesting that to Apple; I'm not even sure
if they even take user contributions (although back when I was on the dole
and having delusions of grandeur I did register as an Apple developer, but
I suspect that that's for non-Apple stuff i.e. it goes into the Apple
Store).
> A quick and dirty fix would be simply to replace the format string.
I have done the odd binary patch (usually to reconfigure Unify database
volumes back when I was with FGH)... Not right now, though, as it's time
for bed.
-- Dave
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-13 0:36 ` Greg 'groggy' Lehey
2020-03-13 11:26 ` Dave Horsfall
@ 2020-03-14 2:13 ` Greg A. Woods
2020-03-14 4:31 ` Greg 'groggy' Lehey
1 sibling, 1 reply; 68+ messages in thread
From: Greg A. Woods @ 2020-03-14 2:13 UTC (permalink / raw)
To: The Unix Heritage Society mailing list
[-- Attachment #1: Type: text/plain, Size: 1230 bytes --]
At Fri, 13 Mar 2020 11:36:47 +1100, Greg 'groggy' Lehey <grog@lemis.com> wrote:
Subject: Re: [TUHS] Command line options and complexity
>
> On Thursday, 12 March 2020 at 16:22:01 +1100, Dave Horsfall wrote:
> > On Thu, 12 Mar 2020, Greg 'groggy' Lehey wrote:
> > >
> > > And then there's the question why you don't like the standard output.
> > > Because the number strings are too long and difficult to read, maybe?
> > > That's the rationale for the -, option.
> >
> > More than likely; as I approach age 68 I notice that I'm losing some
> > cognitive facility... I might start using "," and see if I like it, but I
> > see that the Mac doesn't have it (my Penguin is off the air at the
> > moment), and having it as an environment variable would be nice.
>
> Yes, currently only FreeBSD has it.
Because of course NetBSD has chosen a different option letter: 'M'
Unfortunately on NetBSD and FreeBSD the appearance of commas (or
whatever is appropriate) depends on the locale being correctly
configured, and this is not always so easy to do!
--
Greg A. Woods <gwoods@acm.org>
Kelowna, BC +1 250 762-7675 RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Avoncote Farms <woods@avoncote.ca>
[-- Attachment #2: OpenPGP Digital Signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [TUHS] Command line options and complexity
2020-03-14 2:13 ` Greg A. Woods
@ 2020-03-14 4:31 ` Greg 'groggy' Lehey
0 siblings, 0 replies; 68+ messages in thread
From: Greg 'groggy' Lehey @ 2020-03-14 4:31 UTC (permalink / raw)
To: The Unix Heritage Society mailing list
[-- Attachment #1: Type: text/plain, Size: 933 bytes --]
On Friday, 13 March 2020 at 19:13:53 -0700, Greg A. Woods wrote:
> At Fri, 13 Mar 2020 11:36:47 +1100, Greg 'groggy' Lehey <grog@lemis.com> wrote:
>> Yes, currently only FreeBSD has it.
>
> Because of course NetBSD has chosen a different option letter: 'M'
Oh. Somehow I missed that. Damn.
> Unfortunately on NetBSD and FreeBSD the appearance of commas (or
> whatever is appropriate) depends on the locale being correctly
> configured, and this is not always so easy to do!
Agreed. I've been meaning to default to , if the locale doesn't
specify a delimiter, but haven't got round to it. Give me a problem
report (https://bugs.freebsd.org/bugzilla/) and I'll fix it.
Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA
Thread overview: 68+ messages
2020-03-04 14:06 [TUHS] Command line options and complexity Nelson H. F. Beebe
2020-03-04 16:17 ` John P. Linderman
2020-03-04 17:25 ` Bakul Shah
2020-03-05 0:55 ` Rob Pike
2020-03-05 2:05 ` Kurt H Maier
2020-03-05 4:17 ` Ken Thompson via TUHS
2020-03-05 14:53 ` Dan Cross
2020-03-05 21:50 ` Dave Horsfall
2020-03-05 21:56 ` Warner Losh
2020-03-08 5:26 ` Greg 'groggy' Lehey
2020-03-08 5:32 ` Jon Steinhart
2020-03-08 9:30 ` Tyler Adams
[not found] ` <CAC0cEp8eFRkkLTw88WVaKZoKy+qsrhuC8LkzmmsbqtdZgMf8eQ@mail.gmail.com>
[not found] ` <CAEuQd1D7+dfap98AwPo2W41+06prrcVaAWk3Ve-ve0uQ0xBu3Q@mail.gmail.com>
2020-03-09 21:06 ` John P. Linderman
2020-03-09 21:22 ` Kurt H Maier
2020-03-11 17:41 ` John P. Linderman
2020-03-11 21:29 ` Warner Losh
2020-03-12 0:13 ` John P. Linderman
2020-03-12 0:34 ` Chet Ramey
2020-03-12 12:57 ` John P. Linderman
2020-03-12 19:24 ` Steffen Nurpmeso
2020-03-08 9:51 ` Michael Kjörling
-- strict thread matches above, loose matches on Subject: below --
2020-03-13 10:45 Dave Horsfall
2020-03-14 4:35 ` Greg 'groggy' Lehey
2020-03-14 19:52 ` John P. Linderman
2020-03-14 20:25 ` Steffen Nurpmeso
2020-03-10 18:42 Doug McIlroy
2020-03-10 19:38 ` Dan Cross
2020-03-10 16:15 Doug McIlroy
2020-03-10 17:38 ` Dan Cross
2020-03-10 17:44 ` Bakul Shah
2020-03-10 18:09 ` Dan Cross
2020-03-05 4:57 Doug McIlroy
2020-03-05 22:17 ` Diomidis Spinellis
2020-03-03 18:15 Jon Steinhart
2020-03-03 18:44 ` Adam Thornton
2020-03-04 4:11 ` Tyler Adams
2020-03-04 6:03 ` Dave Horsfall
2020-03-04 6:48 ` arnold
2020-03-04 21:17 ` Dave Horsfall
2020-03-05 0:49 ` Lyndon Nerenberg
2020-03-05 20:54 ` Dave Horsfall
2020-03-05 22:01 ` William Cheswick
2020-03-04 21:50 ` Random832
2020-03-04 23:19 ` Steffen Nurpmeso
2020-03-05 6:12 ` Alan D. Salewski
2020-03-04 22:03 ` Random832
2020-03-04 23:25 ` Terry Jones
2020-03-10 23:03 ` Dan Stromberg
2020-03-11 3:18 ` Dave Horsfall
2020-03-11 4:02 ` Steve Nickolas
2020-03-11 22:56 ` Greg 'groggy' Lehey
2020-03-11 23:14 ` Dan Cross
2020-03-12 0:42 ` Greg 'groggy' Lehey
2020-03-12 0:53 ` Steve Nickolas
2020-03-12 3:09 ` Greg 'groggy' Lehey
2020-03-12 3:34 ` Steve Nickolas
2020-03-13 1:02 ` Greg 'groggy' Lehey
2020-03-12 5:38 ` Dave Horsfall
2020-03-12 6:48 ` Peter Jeremy
2020-03-12 7:37 ` Steve Nickolas
2020-03-12 7:42 ` Warner Losh
2020-03-12 23:57 ` Greg 'groggy' Lehey
2020-03-12 5:22 ` Dave Horsfall
2020-03-12 5:35 ` Steve Nickolas
2020-03-13 0:36 ` Greg 'groggy' Lehey
2020-03-13 11:26 ` Dave Horsfall
2020-03-14 2:13 ` Greg A. Woods
2020-03-14 4:31 ` Greg 'groggy' Lehey