The Unix Heritage Society mailing list
* [TUHS] On Bloat and the Idea of Small Specialized Tools
@ 2024-05-10 16:36 Clem Cole
       [not found] ` <CAGg_6+Ov6hYTxQ5M-hEBoOiUQ0UVRP0V+aVi0STKAALLDUGY7g@mail.gmail.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Clem Cole @ 2024-05-10 16:36 UTC (permalink / raw)
  To: Rob Pike; +Cc: Computer Old Farts Followers


While the idea of small tools that do one job well is the core tenet of
what I think of as the UNIX philosophy, this goes a bit beyond UNIX, so I
have moved this discussion to COFF and BCCing TUHS for now.

The key is that not all "bloat" is the same (really)—or maybe one person's
bloat is another person's preference.  That said, NIH leads to pure bloat
with little to recommend it, while multiple offerings are a choice. Maybe
the difference between the two is just one person's view over another's.

On Fri, May 10, 2024 at 6:08 AM Rob Pike <robpike@gmail.com> wrote:

> Didn't recognize the command, looked it up. Sigh.
>
Like Rob -- this was a new one for me, too.
I looked, and it is on the SYS3 tape; see:
https://www.tuhs.org/cgi-bin/utree.pl?file=SysIII/usr/src/man/man1/nl.1


>   pr -tn <file>
>

> seems sufficient for me, but then that raises the question of your
> question.
>
Agreed, that has been burned into the ROMs in my  fingers since the
mid-1970s 😀
BTW: SYS3 has pr(1) with both switches too  (more in a minute)


> I've been developing a theory about how the existence of something leads
> to things being added to it that you didn't need at all and only thought of
> when the original thing was created.
>
That is a good point, and I generally agree with you.


> Bloat by example, if you will. I suspect it will not be a popular theory,
> however accurately it may describe the technological world.
>

Of course, sometimes the new features >>are<< easier (more natural *for
some people*).  And herein lies the core problem. The bloat is often
repetitive, and I suggest that it is often implemented in the wrong place -
and usually for the wrong reasons.

Bloat comes about because somebody thinks they need some feature and
probably doesn't understand that it is already there or how they can use
it. But if they do know about it, their tool must be set up to exploit it -
so they do not need to reinvent it.  GUI-based tools are notorious for this
failure. Everyone seems to have a built-in (unique) editor, or a private
way to set up configuration options et al. But ... that walled garden is
comfortable for many users and >>can be<< useful sometimes.

Long ago, UNIX programmers learned that looking for $EDITOR in the
environment was way better than creating one.  Configuration was as ASCII
text, stored in /etc for system-wide and dot files in the home for users.
But it also means the >>output<< of each tool needs to be usable by each
other (*i.e.*, docx or xlsx files are a no-no).
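
A minimal sketch of the $EDITOR convention, in Python purely for
illustration. The function name and the "vi" fallback are assumptions made
for this example, not anything a particular tool is known to do:

```python
import os
import shlex


def editor_command(path, environ=None, fallback="vi"):
    """Build the argv for editing `path` with the user's preferred editor."""
    env = os.environ if environ is None else environ
    # Ask the environment first; only fall back to a default if it is silent.
    editor = env.get("EDITOR") or fallback
    # shlex.split lets the variable carry flags, e.g. EDITOR="emacs -nw".
    return shlex.split(editor) + [path]
```

A caller would hand the result to something like subprocess.call() rather
than growing its own built-in editor.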

For example, for many things on my Mac, I do use the GUI-based tools --
there is no doubt they are better integrated with the core Mac system >>for
some tasks.<< But only if I obey a set of rules Apple decrees.  For
instance, this email reader is easier much of the time than MH (or the HM
front end, for that matter), which I used for probably 25-30 years. But on
my Mac, I always have 4 or 5 iterm2(1) open running zsh(1) these days. And,
much of my typing (and everything I do as a programmer) is done in the shell
(including a simple text editor, not an 'IDE').  People who love IDEs swear
by them -- I'm just not impressed - there is nothing they do for me that
makes it easier, and I would have to learn yet another scheme.

That said, sadly, Apple is forcing me to learn yet another debugger since
none of the traditional UNIX-based ones still work on the M1-based systems.
But at least LLDB is in the same key as sdb/dbx/gdb *et al*., so it is a
PITA but not a huge thing as, in the end, LLDB is still based on the UNIX
idea of a single, well-designed, task-specific tool for each job, with
tools that can work with each other.

FWIW: I was recently a tad gob-smacked by the core idea of UNIX and its
tools, which I have taken for a fact since the 1970s.

It turns out that I've been helping with the PiDP-10 users (all of the
PiDPs are cool, BTW). Before I saw UNIX, I was paid to program a PDP-10. In
fact, my first UNIX job was helping move programs from the 10 to the UNIX.
Thus ... I had been thinking that doing a little PDP-10 hacking shouldn't
be too hard to dust off some of that old knowledge.  Some of it has,
of course, come back.  But daily, I am discovering that small things that
are so natural with a few simple tools can be hard on those systems.

I am realizing (rediscovering) that the "build it into my tool" was the
norm in those days.   So instead of a pr(1) command, there was a tool that
created output to the lineprinter. You give it a file, and it is its job to
figure out what to do with it, so it has its set of features (switches) -
so "bloat" is that each tool (like many current GUI tools) has private ways
of doing things. If the maker of tool X decided to support some idea, they
would do it like tool Y.  The problem, of course, was that tools X and Y
had to 'know about' each type of file (in IBM terms, use its "access
method").  Yes, the engineers at DEC, in their wisdom, tried to
"standardize" those access methods/switches/features >>if you implemented
them<< -- but they are not all there.

This leads me back to the question Rob raises.  Years ago, I got into an
argument with Dave Cutler RE: UNIX *vs.* VMS. Dave's #1 complaint about
UNIX in those days was that it was not "standardized."  Every program was
different, and more to Dave's point, there was no attempt to make switches
or errors the same (getopt(3) had been introduced but was not being used by
most applications).  He hated that tar/tp used "keys" and tools like cpio
used switches.  Dave hated that I/O was so simple - in his world all user
programs should use his RMS access method of course [1].  VMS, TOPS, *etc.*,
tried to maintain a system-wide error scheme, and users could look things
like errors up in a system DB by error number, *etc*.  Simply put, VMS is
very "top-down."

My point with Dave was that by being "bottom-up," the best ideas in  UNIX
were able to rise. And yes, it did mean some rough edges and repeated
implementations of the same idea.  But UNIX offered a choice, and while Rob
and I like and find: pr -tn perfectly acceptable thank you, clearly someone
else desired the features that nl provides. The folks that put together
System 3 offer both solutions and let the user choose.

This, of course, comes as bloat, but maybe that type of bloat is not so bad?


My own thinking is this - get things down to the basics and simplest
primitives and then build back up.  It's okay to offer choices, as long as
the foundation is simple and clean.  To me, bloat becomes an issue when you
do the same thing over and over again, particularly because you can not
utilize what is there already.  The worst example is NIH - which happens way
more than it should.


I think the kind of bloat that GUI tools and TOPS et al. created forces
recreation, not reuse. But offering choice at the expense of multiple
tools that do the same things strikes me as reasonable/probably a good
thing.


[1]  BTW: One of my favorite DEC stories WRT VMS engineering has to do
with the RMS I/O system.  Supporting C using VMS was a bit of a PITA.
 Eventually, the VMS engineers added Stream I/O - which simplified the C
runtime, but it was also made available for all technical languages.
Fairly soon after it was released, the DEC Marketing folks discovered
almost all new programs, regardless of language, had started to use Stream
I/O and many older programs were being rewritten by customers to use it. In
fact, inside of DEC itself, the languages group eventually rewrote things
like the FTN runtime to use streams, making it much smaller/easier to
maintain.   My line in the old days: "It's not so bad that every I/O has to
offer 1000 options, it's that Dave has to check each one for every I/O."  It's a
classic example of how you can easily build RMS I/O out of stream-based
I/O, but the other way around is much harder.   My point here is to *use
the right primitives*. RMS may have made it easier to build RDB, but it
impeded everything else.
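
To make the "records out of streams" direction concrete, here is a rough
Python sketch. It is not RMS's actual on-disk format; the 2-byte length
prefix and the function names are made up for the illustration:

```python
import io
import struct

# Hypothetical record layout: 2-byte big-endian length, then the payload.
_LEN = struct.Struct(">H")


def write_record(stream, payload):
    """Append one length-prefixed record to a plain byte stream."""
    stream.write(_LEN.pack(len(payload)))
    stream.write(payload)


def read_records(stream):
    """Yield each record's payload until the stream is exhausted."""
    while True:
        header = stream.read(_LEN.size)
        if not header:
            return  # clean end of stream
        if len(header) < _LEN.size:
            raise ValueError("truncated record header")
        (length,) = _LEN.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record")
        yield payload
```

The record layer is a couple of dozen lines on top of read() and write().
Going the other way, pulling a plain byte stream out of an interface that
only deals in records, means undoing whatever structure the record layer
imposed.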


* [TUHS] Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
       [not found]     ` <20240511213532.GB8330@mit.edu>
@ 2024-05-12 19:34       ` Adam Thornton
  2024-05-12 19:47         ` Larry McVoy
                           ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Adam Thornton @ 2024-05-12 19:34 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society


On Sat, May 11, 2024 at 2:35 PM Theodore Ts'o <tytso@mit.edu> wrote:

>
> I bet most of the young'uns would not be trying to do this as a shell
> script, but using the Cloud SDK with perl or python or Go, which is
> *way* more bloaty than using /bin/sh.
>
> So while some of us old farts might be bemoaning the death of the Unix
> philosophy, perhaps part of the reality is that the Unix philosophy
> were ideal for a simpler time, but might not be as good of a fit
> today


I'm finding myself in agreement.  I might well do this with jq, but as you
point out, you're using the jq DSL pretty extensively to pull out the
fields.  On the other hand, I don't think that's very different than piping
stuff through awk, and I don't think anyone feels like _that_ would be
cheating.  And jq -f is pretty much equivalent to awk -f, which is how I
would do this in practice, rather than trying to inline the whole jq bit.

But it does come down to the same argument as
https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf

And it is true that while fork() is a great model for single-threaded
pipeline-looking tasks, it's not really what you want for an interactive
multithreaded application on your phone's GUI.

Oddly, I'd have a slightly different reason for reaching for Python (which
is probably how I'd do this anyway), and that's the batteries-included
bit.  If I write in Python, I've got the gcloud api available as a Python
module, and I've got a JSON parser also available as a Python module (but I
bet all the JSON unmarshalling is already handled in the gcloud library),
and I don't have to context-switch to the same degree that I would if I
were stringing it together in the shell.  Instead of "make an HTTP request
to get JSON text back, then parse that with repeated calls to jq", I'd just
get an object back from the instance fetch request, pick out the fields I
wanted, and I'd be done.
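
Roughly what "get an object back and pick out the fields" looks like,
using only the standard json module. The sample record and its field names
are invented for the example and are not taken from any real cloud API
response:

```python
import json

# Made-up stand-in for the JSON an instance-describe call might return.
SAMPLE = """
{"name": "build-01", "status": "RUNNING",
 "networkInterfaces": [{"networkIP": "10.0.0.7"}],
 "labels": {"team": "infra"}}
"""


def summarize(instance_json):
    """Parse one instance record and keep only the fields of interest."""
    inst = json.loads(instance_json)
    return {
        "name": inst["name"],
        "status": inst["status"],
        # Nested lookups replace what would be separate jq filters in a shell script.
        "ip": inst["networkInterfaces"][0]["networkIP"],
    }
```

With a client library doing the unmarshalling, even the json.loads() call
would go away and only the field lookups would be left.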

I'm afraid only old farts write anything in Perl anymore.  The kids just
mutter "OK, Boomer" when you try to tell them how much better CPAN was than
PyPI.  And it sure feels like all the cool kids have abandoned Go for Rust,
although Go would be a perfectly reasonable choice for this task as well
(and would look a lot like Python: get an object back, pick off the useful
fields).

Adam


* [TUHS] Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 19:34       ` [TUHS] Re: [COFF] " Adam Thornton
@ 2024-05-12 19:47         ` Larry McVoy
  2024-05-12 20:13           ` [TUHS] Re: forking, " John Levine
  2024-05-12 20:43         ` [TUHS] " Dave Horsfall
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Larry McVoy @ 2024-05-12 19:47 UTC (permalink / raw)
  To: Adam Thornton; +Cc: The Eunuchs Hysterical Society

On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
> But it does come down to the same argument as
> https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
> 
> And it is true that while fork() is a great model for single-threaded
> pipeline-looking tasks, it's not really what you want for an interactive
> multithreaded application on your phone's GUI.

Perhaps a meaningless aside, but I agree on fork().  In the last major
project I did, which was cross platform {windows,macos, all the major
Unices, Linux}, we adopted spawn() rather than fork/exec.  There is no way
(that I know of) to fake fork() on Windows but it's easy to fake spawn().
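
For illustration only (this is not the code from that project), a
spawn-style call is the shape that ports cleanly. Python's subprocess
module is one example: the caller describes the child up front, and the
library maps that onto CreateProcess on Windows and onto fork/exec or
posix_spawn on Unix-like systems. The helper name below is made up:

```python
import os
import subprocess


def spawn_and_wait(argv, extra_env=None):
    """Start argv as a new process, wait for it, return (status, stdout)."""
    # Everything the child needs is described before it starts; no code of
    # ours ever runs "between" process creation and program start.
    env = dict(os.environ)
    if extra_env:
        env.update(extra_env)
    done = subprocess.run(argv, env=env, stdout=subprocess.PIPE, check=False)
    return done.returncode, done.stdout
```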

--lm


* [TUHS] Re: forking, Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 19:47         ` Larry McVoy
@ 2024-05-12 20:13           ` John Levine
  2024-05-12 22:56             ` Dan Cross
  0 siblings, 1 reply; 14+ messages in thread
From: John Levine @ 2024-05-12 20:13 UTC (permalink / raw)
  To: tuhs

It appears that Larry McVoy <lm@mcvoy.com> said:
>Perhaps a meaningless aside, but I agree on fork().  In the last major
>project I did, which was cross platform {windows,macos, all the major
>Unices, Linux}, we adopted spawn() rather than fork/exec.  There is no way
>(that I know of) to fake fork() on Windows but it's easy to fake spawn().

The whole point of fork() is that it lets you get the effect of spawn with
a lot less internal mechanism.  Spawn is equivalent to:

  fork()
  ... do stuff to files and environment ...
  exec()

By separating the fork and the exec, they didn't have to put all of
the stuff in the 12 paragraphs in the spawn() man page into the
tiny PDP-11 kernel.
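
The same three steps, spelled out as a small Python sketch for Unix-like
systems. The helper name and the choice of "redirect stdout to a file" as
the example of "doing stuff" are assumptions made for the illustration:

```python
import os


def run_with_stdout_to(path, argv, extra_env=None):
    """fork, adjust the child's files and environment, then exec argv."""
    pid = os.fork()
    if pid == 0:
        # Child: the "do stuff" step is just ordinary system calls.
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)  # make the file the child's stdout
            os.close(fd)
            if extra_env:
                os.environ.update(extra_env)
            os.execvp(argv[0], argv)  # does not return on success
        finally:
            os._exit(127)  # reached only if setup or exec failed
    # Parent: wait for the child and report its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Nothing here needs a special "spawn with redirection" interface: open(),
dup2() and the environment calls already exist, and fork() only has to
provide a place to run them.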

These days now that programs include multi-megabyte shared libraries
just for fun, I agree that the argument is less persuasive.  On the
third hand, we now understand virtual memory and paging systems a lot
better so we don't need kludges like vfork().

R's,
John



* [TUHS] Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 19:34       ` [TUHS] Re: [COFF] " Adam Thornton
  2024-05-12 19:47         ` Larry McVoy
@ 2024-05-12 20:43         ` Dave Horsfall
  2024-05-13  2:33         ` Alexis
  2024-05-13  5:23         ` markus schnalke
  3 siblings, 0 replies; 14+ messages in thread
From: Dave Horsfall @ 2024-05-12 20:43 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society


On Sun, 12 May 2024, Adam Thornton wrote:

> I'm afraid only old farts write anything in Perl anymore.  The kids just
> mutter "OK, Boomer" when you try to tell them how much better CPAN was than
> PyPi.  And it sure feels like all the cool kids have abandoned Go for Rust,
> although Go would be a perfectly reasonable choice for this task as well
> (and would look a lot like Python: get an object back, pick off the useful
> fields).

I must be an old fart then; the last language I used where white space was 
part of the syntax was FORTRAN...

-- Dave


* [TUHS] Re: forking, Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 20:13           ` [TUHS] Re: forking, " John Levine
@ 2024-05-12 22:56             ` Dan Cross
  2024-05-12 23:34               ` Larry McVoy
  2024-05-13  3:29               ` Andrew Warkentin
  0 siblings, 2 replies; 14+ messages in thread
From: Dan Cross @ 2024-05-12 22:56 UTC (permalink / raw)
  To: John Levine; +Cc: tuhs

On Sun, May 12, 2024 at 4:14 PM John Levine <johnl@taugh.com> wrote:
> It appears that Larry McVoy <lm@mcvoy.com> said:
> >Perhaps a meaningless aside, but I agree on fork().  In the last major
> >project I did, which was cross platform {windows,macos, all the major
> >Unices, Linux}, we adopted spawn() rather than fork/exec.  There is no way
> >(that I know of) to fake fork() on Windows but it's easy to fake spawn().
>
> The whole point of fork() is that it lets you get the effect of spawn with
> a lot less internal mechanism.  Spawn is equivalent to:
>
>   fork()
>   ... do stuff to files and environment ...
>   exec()
>
> By separating the fork and the exec, they didn't have to put all of
> the stuff in the 12 paragraphs in the spawn() man page into the
> tiny PDP-11 kernel.

Perhaps, but as I've written here before, `fork`/`exec` vs `spawn` is
a false dichotomy. Another alternative is a `proccreate`/`procrun`
pair, the former of which creates an unrunnable process, the latter of
which marks it runnable. Coupled with a set of primitives to
manipulate the state of an extant, but unrunnable, process and you
have the advantages of fork/exec without the downsides (which are
well-known; https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf).
Similarly, this gives you the functionality of spawn, without the
downside of a singularly complicated interface. Could you have
implemented that in something as small as the PDP-7? Perhaps not, but
it does not follow that `fork` now remains a good primitive.
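
To show the shape of such an interface, here is a userspace model in
Python. No such system calls exist in Unix; `proccreate`, `PendingProcess`
and its methods are hypothetical names, and the model is layered on
subprocess rather than on a real kernel object:

```python
import os
import subprocess


class PendingProcess:
    """Stand-in for a process that exists but is not yet runnable."""

    def __init__(self, argv):
        self.argv = list(argv)
        self.env = dict(os.environ)
        self.cwd = None
        self.stdout = None

    # Small primitives that manipulate the state of the not-yet-running process.
    def setenv(self, name, value):
        self.env[name] = value
        return self

    def chdir(self, path):
        self.cwd = path
        return self

    def redirect_stdout(self, target):
        self.stdout = target
        return self

    def run(self):
        """The procrun step: only now does the process start executing."""
        return subprocess.Popen(
            self.argv, env=self.env, cwd=self.cwd, stdout=self.stdout
        )


def proccreate(argv):
    """Create the (modelled) unrunnable process."""
    return PendingProcess(argv)
```

The parent sets up the child from the outside, one small call at a time,
so no copy of the parent is needed and no single call has to take every
option at once.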

My spelunking in the original GENIE documentation leads me to believe
that its `fork` provided functionality similar to what I described.

        - Dan C.


* [TUHS] Re: forking, Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 22:56             ` Dan Cross
@ 2024-05-12 23:34               ` Larry McVoy
  2024-05-13  1:34                 ` Dave Horsfall
  2024-05-13  3:29               ` Andrew Warkentin
  1 sibling, 1 reply; 14+ messages in thread
From: Larry McVoy @ 2024-05-12 23:34 UTC (permalink / raw)
  To: Dan Cross; +Cc: John Levine, tuhs

On Sun, May 12, 2024 at 06:56:35PM -0400, Dan Cross wrote:
> Similarly, this gives you the functionality of spawn, without the
> downside of a singularly complicated interface. Could you have
> implemented that in something as small as the PDP-7? Perhaps not, but
> it does not follow that `fork` now remains a good primitive.

Our spawnvp() implementation is 40 lines of code.  Worked fine everywhere.
I can post it if you like.


* [TUHS] Re: forking, Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 23:34               ` Larry McVoy
@ 2024-05-13  1:34                 ` Dave Horsfall
  2024-05-13 13:21                   ` Larry McVoy
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Horsfall @ 2024-05-13  1:34 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Sun, 12 May 2024, Larry McVoy wrote:

> Our spawnvp() implementation is 40 lines of code.  Worked fine everywhere.
> I can post it if you like.

Pretty please...

-- Dave


* [TUHS] Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 19:34       ` [TUHS] Re: [COFF] " Adam Thornton
  2024-05-12 19:47         ` Larry McVoy
  2024-05-12 20:43         ` [TUHS] " Dave Horsfall
@ 2024-05-13  2:33         ` Alexis
  2024-05-13  2:57           ` Warner Losh
  2024-05-13  5:23         ` markus schnalke
  3 siblings, 1 reply; 14+ messages in thread
From: Alexis @ 2024-05-13  2:33 UTC (permalink / raw)
  To: The Unix Heritage Society


> On Sat, May 11, 2024 at 2:35 PM Theodore Ts'o <tytso@mit.edu> 
> wrote:
>
> So while some of us old farts might be bemoaning the death of 
> the
> Unix
> philosophy, perhaps part of the reality is that the Unix 
> philosophy
> were ideal for a simpler time, but might not be as good of a fit
> today

Hm .... i guess it might depend on the specific use-case(s) 
involved?

At one point i realised that a primary reason i enjoy using *n*x 
systems is that they're fundamentally 
_text-oriented_. (Unsurprisingly, of course, given the context in 
which Unix was developed.) i spend a lot of my time interacting 
and working with text, and *n*x systems provide me with many 
useful tools for this. Quoting the old "UNIX As Literature" piece, 
https://theody.net/elements.html:

"[T]he most recurrent complaint was that [Unix] was too 
text-oriented. People really hated the command line, with all the 
utilities, obscure flags, and arguments they had to memorize. They 
hated all the typing. One mislaid character and you had to start 
over. Interestingly, this complaint came most often from users of 
the GUI-laden Macintosh or Windows platforms. ...

"[A] suspiciously high proportion of my UNIX colleagues had 
already developed, in some prior career, a comfort and fluency 
with text and printed words. ...

"With UNIX, text — on the command line, STDIN, STDOUT, STDERR — is 
the primary interface mechanism: UNIX system utilities are a sort 
of Lego construction set for word-smiths. Pipes and filters 
connect one utility to the next, text flows invisibly 
between. Working with a shell, awk/lex derivatives, or the utility 
set is literally a word dance."

Perl, with its pervasive regex-based functionality and extensive 
Unicode support, fits neatly into this. i find regexes an 
_incredibly_ powerful tool for working with text, whether via 
Perl, sed, awk, or whatever. But my experience is that many people 
treat regexes as an anathema, with Zawinski's "Now you have two 
problems" regularly trotted out as a thought-terminating 
cliché. Sure, regexes can, and do, get used where they shouldn't 
be[a]; that doesn't mean the baby should be thrown out with the 
bathwater. 

But if one is only working with text under sufferance, trying to 
avoid it via substantially more graphically-oriented environments, 
the text-based "Unix philosophy" and the tools associated with it 
might feel (and actually be) much less appropriate and 
useful. Fair enough. The Unix construction set will still be there 
for those of us who find them very appropriate and tremendously 
useful.


Alexis.

[a] It seems unlikely that anyone on this list hasn't already seen 
this, but just in case:

https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

i'm looking forward to that comment sending OpenAI over the 
Mountains of Madness.


* [TUHS] Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-13  2:33         ` Alexis
@ 2024-05-13  2:57           ` Warner Losh
  0 siblings, 0 replies; 14+ messages in thread
From: Warner Losh @ 2024-05-13  2:57 UTC (permalink / raw)
  To: Alexis; +Cc: The Unix Heritage Society


On Sun, May 12, 2024, 8:34 PM Alexis <flexibeast@gmail.com> wrote:

>
> > On Sat, May 11, 2024 at 2:35 PM Theodore Ts'o <tytso@mit.edu>
> > wrote:
> >
> > So while some of us old farts might be bemoaning the death of
> > the
> > Unix
> > philosophy, perhaps part of the reality is that the Unix
> > philosophy
> > were ideal for a simpler time, but might not be as good of a fit
> > today
>
> Hm .... i guess it might depend on the specific use-case(s)
> involved?
>

I created, years ago, a set of time legos. They were connected as a network
of producer / consumer interfaces. Each lego would do one thing and pass
the results to the next thing in the chain. A driver would read timing data
from the driver and convert it to a MI interface. Different other legos
would take time differences, compute phase or frequency differences and
these would feed into more sophisticated algorithms or output etc. All
locking was on the pipe's queues so all these algorithms were lock free
apart from the queueing or dequeueing of data.

Conceptually, this is just a bunch of pipes, with many to 1 or 1 to many
added. Each lego did one thing and passed the results along to the next thing
in the chain... much like 'cmd | grep | awk | more'. Plus MI data
representations for almost everything so only the driver reader thread
cared about the hw. See also tty abstraction or ifnet abstraction in
unix....

So actually not a set of FDs passing data between process, but threads
doing the same sort of thing.

The whole data filtering paradigm works in lots of different ways. And it
still works really well by analogy.
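
A toy version of the same idea, as a Python sketch rather than the
original code (which is not shown here): each stage is a thread, the only
shared state is the queue between stages, and the stage functions
themselves take no locks. The names are made up for the example:

```python
import queue
import threading

_DONE = object()  # sentinel marking end of stream


def stage(func, inbox, outbox):
    """Start a thread that applies func to each inbox item and forwards it."""
    def worker():
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)  # pass end-of-stream down the chain
                return
            outbox.put(func(item))

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def run_pipeline(items, funcs):
    """Feed items through funcs in order, like 'a | b | c' with threads."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [stage(f, queues[i], queues[i + 1]) for i, f in enumerate(funcs)]
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results
```

This sketch has no error handling: a stage function that raises would
stall the chain, which a real implementation would have to deal with.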

Warner

ObComplaint: fork sucks for address spaces with 100s of threads. First
thing, we created a child process that we used to broker different threads
needing to run popen or system... having a create process / munge process /
start process API is kinda what we did behind the scenes though with "send
this data" and "receive that data". We iterated to this after the first
dozen attempts to closely broker the fork/exec dance proved... unreliable.

> At one point i realised that a primary reason i enjoy using *n*x
> systems is that they're fundamentally
> _text-oriented_. (Unsurprisingly, of course, given the context in
> which Unix was developed.) i spend a lot of my time interacting
> and working with text, and *n*x systems provide me with many
> useful tools for this. Quoting the old "UNIX As Literature" piece,
> https://theody.net/elements.html:
>
> "[T]he most recurrent complaint was that [Unix] was too
> text-oriented. People really hated the command line, with all the
> utilities, obscure flags, and arguments they had to memorize. They
> hated all the typing. One mislaid character and you had to start
> over. Interestingly, this complaint came most often from users of
> the GUI-laden Macintosh or Windows platforms. ...
>
> "[A] suspiciously high proportion of my UNIX colleagues had
> already developed, in some prior career, a comfort and fluency
> with text and printed words. ...
>
> "With UNIX, text — on the command line, STDIN, STDOUT, STDERR — is
> the primary interface mechanism: UNIX system utilities are a sort
> of Lego construction set for word-smiths. Pipes and filters
> connect one utility to the next, text flows invisibly
> between. Working with a shell, awk/lex derivatives, or the utility
> set is literally a word dance."
>
> Perl, with its pervasive regex-based functionality and extensive
> Unicode support, fits neatly into this. i find regexes an
> _incredibly_ powerful tool for working with text, whether via
> Perl, sed, awk, or whatever. But my experience is that many people
> treat regexes as an anathema, with Zawinski's "Now you have two
> problems" regularly trotted out as a thought-terminating
> cliché. Sure, regexes can, and do, get used where they shouldn't
> be[a]; that doesn't mean the baby should be thrown out with the
> bathwater.
>
> But if one is only working with text under sufferance, trying to
> avoid it via substantially more graphically-oriented environments,
> the text-based "Unix philosophy" and the tools associated with it
> might feel (and actually be) much less appropriate and
> useful. Fair enough. The Unix construction set will still be there
> for those of us who find them very appropriate and tremendously
> useful.
>
>
> Alexis.
>
> [a] It seems unlikely that anyone on this list hasn't already seen
> this, but just in case:
>
>
> https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
>
> i'm looking forward to that comment sending OpenAI over the
> Mountains of Madness.
>


* [TUHS] Re: forking, Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 22:56             ` Dan Cross
  2024-05-12 23:34               ` Larry McVoy
@ 2024-05-13  3:29               ` Andrew Warkentin
  1 sibling, 0 replies; 14+ messages in thread
From: Andrew Warkentin @ 2024-05-13  3:29 UTC (permalink / raw)
  To: The Eunuchs Historic Society

On Sun, May 12, 2024 at 4:57 PM Dan Cross <crossd@gmail.com> wrote:
>
> Perhaps, but as I've written here before, `fork`/`exec` vs `spawn` is
> a false dichotomy. Another alternative is a `proccreate`/`procrun`
> pair, the former of which creates an unrunnable process, the latter of
> which marks it runnable. Coupled with a set of primitives to
> manipulate the state of an extant, but unrunnable, process and you
> have the advantages of fork/exec without the downsides (which are
> well-known; https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf).
> Similarly, this gives you the functionality of spawn, without the
> downside of a singularly complicated interface. Could you have
> implemented that in something as small as the PDP-7? Perhaps not, but
> it does not follow that `fork` now remains a good primitive.
>
IMO something like that is the best model (although it probably would
have been a bit complicated for a PDP-7/PDP-11). That's basically what
I'm doing in the OS that I'm writing
<https://gitlab.com/uxrt/uxrt-toplevel>. Processes will basically just
be containers for hierarchical groups of threads, and will have pretty
much no other state besides the command line. All of the context
normally associated with a process (file descriptor space,
permissions/UID/GID, filesystem namespace, virtual address space) will
instead be in separate objects that are explicitly bound to threads.
Separate APIs for creating an empty process, creating threads within
it, manipulating context objects and binding threads to them, and
starting the process will be provided (all of these APIs will use a
file-based transport underneath; this will be the first OS I know of
where literally everything is a file). The base process APIs will be
general enough to allow an efficient copy-on-write fork() to be
implemented on top of them for backwards compatibility and the
remaining use cases where forking still makes sense (since even all
process memory will be implemented with files, this will be
implemented with a special in-memory "shadow filesystem" that creates
alternate mappings of other memory filesystems).

Really I'd say there are actually several design decisions in
conventional Unix that made sense on a PDP-7 or PDP-11, but no longer
make sense in the modern world. For instance, the rather inflexible
security model with its fixed set of root-only system calls rather
than some form of role-based access control, or the use of on-disk
device nodes bound by numbers rather than something like separate
special filesystems for each driver that get union mounted together,
or the lack of integrated support for userspace filesystem servers
(yes, there's FUSE, but it's kind of a poorly integrated hack that is
rarely used for anything important).


* [TUHS] Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-12 19:34       ` [TUHS] Re: [COFF] " Adam Thornton
                           ` (2 preceding siblings ...)
  2024-05-13  2:33         ` Alexis
@ 2024-05-13  5:23         ` markus schnalke
  2024-05-13  6:18           ` Andrew Warkentin
  3 siblings, 1 reply; 14+ messages in thread
From: markus schnalke @ 2024-05-13  5:23 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

Hoi.

> On Sat, May 11, 2024 at 2:35 PM Theodore Ts'o <tytso@mit.edu> wrote:
> 
>     I bet most of the young'uns would not be trying to do this as a shell
>     script, but using the Cloud SDK with perl or python or Go, which is
>     *way* more bloaty than using /bin/sh.
> 
>     So while some of us old farts might be bemoaning the death of the Unix
>     philosophy, perhaps part of the reality is that the Unix philosophy
>     were ideal for a simpler time, but might not be as good of a fit
>     today

It depends on what the Unix philosophy is seen to be. If it is
solving problems by reading text from standard in and printing to
standard out, then that might not be suitable anymore for many of
today's problems. But if it is preferring plain text to binary,
preferring simple solutions to complex ones, increasing the number
of operations one can perform by combining small generic parts,
... all because of good reasons ... Focussing on simplicity,
clarity, generality ... Omitting needless words! ... All this still
holds true, no matter if applied as shell scripts or within the
design of a new programming language or a programming interface.

It's not so much about the tools we use -- these should be suited
for the times you live in and the problems you have to solve --
but it's more about how you look at them and how you look at the
problems and what ideas for solutions you can imagine in your
mind. Here, Unix provides a continuing inspiration.

Only, like with every old book: when we read it today, we have to
read it against the background of the times back then and transfer
its message to today's times. The older the book, the more transfer
work has to be done, and the more knowledgeable the younger and
more distant readers have to be to really understand it.

Thus, in my opinion, the Unix philosophy remains a good and very
relevant fit today, although not all of its applications from back
then still are.


meillo


* [TUHS] Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-13  5:23         ` markus schnalke
@ 2024-05-13  6:18           ` Andrew Warkentin
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Warkentin @ 2024-05-13  6:18 UTC (permalink / raw)
  To: The Eunuchs Historic Society

On Sun, May 12, 2024 at 11:23 PM markus schnalke <meillo@marmaro.de> wrote:
>
>
> It depends on what the Unix philosophy is seen to be. If it is
> solving problems by reading text from standard in and printing to
> standard out, then that might not be suitable anymore for many of
> today's problems. But if it is preferring plain text to binary,
> preferring simple solutions to complex ones, increasing the number
> of operations one can perform by combining small generic parts,
> ... all because of good reasons ... Focussing on simplicity,
> clarity, generality ... Omitting needless words! ... All this still
> holds true, no matter if applied as shell scripts or within the
> design of a new programming language or a programming interface.
>
> It's not so much about the tools we use -- these should be suited
> for the times you live in and the problems you have to solve --
> but it's more about how you look at them and how you look at the
> problems and what ideas for solutions you can imagine in your
> mind. Here, Unix provides a continuing inspiration.
>
> Only, like with every old book: when we read it today, we have to
> read it against the background of the times back then and transfer
> its message to today's times. The older the book, the more transfer
> work has to be done, and the more knowledgeable the younger and
> more distant readers have to be to really understand it.
>
> Thus, in my opinion, the Unix philosophy remains a good and very
> relevant fit today, although not all of its applications from back
> then still are.
>
I agree, but it seems that most Unix developers haven't really cared
since the side branches and clones effectively took over from Research
Unix in the early 80s. They've added system calls and ad-hoc socket
RPC interfaces with abandon instead of using generic filesystem-based
extensibility APIs, added options to various commands that should just
have been separate programs, and written desktop
environments/applications that have poor composability, extensibility
and modularity (I guess KDE's KParts kind of counts as a mechanism for
composing applications, but it's limited by being based on plugins
rather than an open IPC-based API). The only Unix desktop I can think
of that really tries to follow the Unix philosophy somewhat is the
now-abandoned Étoilé <http://etoileos.com/etoile/>. There's also the
desktops of the rather obscure BTRON family
<http://tronweb.super-nova.co.jp/btronproducts.html>, although those
OSes are only vaguely Unix-like. Both have an object-centric rather
than application-centric model with support for embedding applications
within each other and controlling them with RPC APIs.

IMO, the best practical realization of the Unix philosophy for the
modern era would be a QNX/Plan 9-like OS with an Étoilé/BTRON-like
desktop, which is why I'm working on one. Some of the specifics of the
original Unix philosophy may not be relevant to large parts of modern
computing, but I'd say the general ideas still are.


* [TUHS] Re: forking, Re: [COFF] Re: On Bloat and the Idea of Small Specialized Tools
  2024-05-13  1:34                 ` Dave Horsfall
@ 2024-05-13 13:21                   ` Larry McVoy
  0 siblings, 0 replies; 14+ messages in thread
From: Larry McVoy @ 2024-05-13 13:21 UTC (permalink / raw)
  To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society

On Mon, May 13, 2024 at 11:34:38AM +1000, Dave Horsfall wrote:
> On Sun, 12 May 2024, Larry McVoy wrote:
> 
> > Our spawnvp() implementation is 40 lines of code.  Worked fine everywhere.
> > I can post it if you like.
> 
> Pretty please...
> 
> -- Dave

/*
 * Copyright 1999-2002,2004-2006,2015-2016 BitMover, Inc
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#include "system.h"

void	(*spawn_preHook)(int flags, char *av[]) = 0;

#ifndef WIN32
pid_t
bk_spawnvp(int flags, char *cmdname, char *av[])
{
	int	fd, status;
	pid_t	pid;
	char	*exec;

	/* Tell the calling process right away if there is no such program */
	unless (exec = which((char*)cmdname)) return (-1);

	if (spawn_preHook) spawn_preHook(flags, av);
	if (pid = fork()) {	/* parent */
		free(exec);
		if (pid == -1) return (pid);
		unless (flags & (_P_DETACH|_P_NOWAIT)) {
			if (waitpid(pid, &status, 0) != pid) status = -1;
			return (status);
		}
		return (pid);
	} else {		/* child */
		/*
		 * See win32/uwtlib/wapi_intf.c:spawnvp_ex()
		 * We leave nothing open on a detach, but leave
		 * in/out/err open on a normal fork/exec.
		 */
		if (flags & _P_DETACH) {
			unless (getenv("_NO_SETSID")) setsid();
			/* close everything to match winblows */
			for (fd = 0; fd < 100; fd++) (close)(fd);
		} else {
			/*
			 * Emulate having everything except in/out/err
			 * as being marked as close on exec to match winblows.
			 */
			for (fd = 3; fd < 100; fd++) (close)(fd);
		}
		execv(exec, av);
		perror(exec);
		_exit(19);
	}
}

#else /* ======== WIN32 ======== */

pid_t
bk_spawnvp(int flags, char *cmdname, char *av[])
{
	pid_t	pid;
	char	*exec;

	/* Tell the calling process right away if there is no such program */
	unless (exec = which((char*)cmdname)) return (-1);

	if (spawn_preHook) spawn_preHook(flags, av);
	/*
	 * We use our own version of spawn in uwtlib
	 * because the NT spawn() does not work well with tcl
	 */
	pid = _spawnvp_ex(flags, exec, av, 1);
	free(exec);
	return (pid);
}
#endif /* WIN32 */
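
For readers who do not have BitKeeper's system.h (which supplies
unless, which() and the _P_* flags), the POSIX branch above can be
approximated with only standard headers. This is a minimal sketch
under stated assumptions, not the BitKeeper code: spawn_sketch and
SPAWN_NOWAIT are hypothetical names, the detach case is left out, and
it lets execvp() search PATH in the child instead of calling which()
up front. A missing program therefore shows up as exit status 19 in
the returned wait status rather than as an immediate -1.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical flag standing in for _P_NOWAIT from system.h. */
#define SPAWN_NOWAIT 0x1

/*
 * Fork, close every descriptor above stderr in the child, then exec.
 * Returns the child's pid if SPAWN_NOWAIT is set, otherwise the raw
 * wait status from waitpid(); returns -1 if fork or waitpid fails.
 */
pid_t
spawn_sketch(int flags, const char *cmd, char *const av[])
{
	int	fd, status;
	pid_t	pid;

	assert(cmd != NULL && av != NULL);

	pid = fork();
	if (pid == -1) return (-1);
	if (pid == 0) {		/* child */
		/* Emulate close-on-exec for everything except in/out/err. */
		for (fd = 3; fd < 100; fd++) close(fd);
		execvp(cmd, av);
		/* Only reached if the exec failed. */
		perror(cmd);
		_exit(19);
	}
	/* parent */
	if (flags & SPAWN_NOWAIT) return (pid);
	if (waitpid(pid, &status, 0) != pid) return (-1);
	return (status);
}
```

A caller passes a NULL-terminated argument vector and, in the waiting
case, decodes the result with WIFEXITED() and WEXITSTATUS() from
<sys/wait.h>. The fixed upper bound of 100 descriptors is carried over
from the original listing.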



end of thread, other threads:[~2024-05-13 13:22 UTC | newest]

Thread overview: 14+ messages
2024-05-10 16:36 [TUHS] On Bloat and the Idea of Small Specialized Tools Clem Cole
     [not found] ` <CAGg_6+Ov6hYTxQ5M-hEBoOiUQ0UVRP0V+aVi0STKAALLDUGY7g@mail.gmail.com>
     [not found]   ` <CAEoi9W7FbGZFhiddHWWqdivGFfgFAj9nsUApomswfP56rqTMpQ@mail.gmail.com>
     [not found]     ` <20240511213532.GB8330@mit.edu>
2024-05-12 19:34       ` [TUHS] Re: [COFF] " Adam Thornton
2024-05-12 19:47         ` Larry McVoy
2024-05-12 20:13           ` [TUHS] Re: forking, " John Levine
2024-05-12 22:56             ` Dan Cross
2024-05-12 23:34               ` Larry McVoy
2024-05-13  1:34                 ` Dave Horsfall
2024-05-13 13:21                   ` Larry McVoy
2024-05-13  3:29               ` Andrew Warkentin
2024-05-12 20:43         ` [TUHS] " Dave Horsfall
2024-05-13  2:33         ` Alexis
2024-05-13  2:57           ` Warner Losh
2024-05-13  5:23         ` markus schnalke
2024-05-13  6:18           ` Andrew Warkentin
