Was thinking about our recent discussion about system call bloat and such.
Seemed to me that there was some argument that it was needed in order to
support modern needs. As I tried to say, I think that a good part of the
bloat stemmed from we-need-to-add-this-to-support-that thinking instead
of what's-the-best-way-to-extend-the-system-to-support-this-need thinking.

So if y'all are up for it, I'd like to have a discussion on what abstractions
would be appropriate in order to meet modern needs. Any takers?

Jon
On Mon, 15 Feb 2021, Jon Steinhart wrote:
[...]
> So if y'all are up for it, I'd like to have a discussion on what
> abstractions would be appropriate in order to meet modern needs. Any
> takers?
Somebody once suggested a filesystem interface (it certainly fits the Unix
philosophy); I don't recall the exact details.
-- Dave
Dave Horsfall <dave@horsfall.org> wrote:
> On Mon, 15 Feb 2021, Jon Steinhart wrote:
>
> [...]
>
> > So if y'all are up for it, I'd like to have a discussion on what
> > abstractions would be appropriate in order to meet modern needs. Any
> > takers?
>
> Somebody once suggested a filesystem interface (it certainly fits the Unix
> philosophy); I don't recall the exact details.
>
> -- Dave
And it was done, over 30 years ago; see Plan 9 from Bell Labs....
Arnold
Jon Steinhart <jon@fourwinds.com> writes:
> So if y'all are up for it, I'd like to have a discussion on what
> abstractions would be appropriate in order to meet modern needs. Any
> takers?
A late friend of mine felt strongly that Unix needed an SQL interface to
the kernel. With all information and configuration in a well designed
schema, system administration could be greatly enhanced, he felt, and
could have standard interaction patterns across components -- instead of
all the quirky command line interfaces we have today, and their user
oriented output formats that you need to parse to use the data.
sysctl done right, so to speak.
-tih
--
Most people who graduate with CS degrees don't understand the significance
of Lisp. Lisp is the most important idea in computer science. --Alan Kay
Now that is an interesting idea. Did he ever get around to developing
it? Any documents? Any experimental results? (Mind you, he'd've run into
CJ Date's reservations on the incompleteness of SQL as a language stuck
between relational algebra and relational calculus ... :) )
Wesley Parish
On 16/02/21 9:15 pm, Tom Ivar Helbekkmo via TUHS wrote:
> Jon Steinhart <jon@fourwinds.com> writes:
>
>> So if y'all are up for it, I'd like to have a discussion on what
>> abstractions would be appropriate in order to meet modern needs. Any
>> takers?
> A late friend of mine felt strongly that Unix needed an SQL interface to
> the kernel. With all information and configuration in a well designed
> schema, system administration could be greatly enhanced, he felt, and
> could have standard interaction patterns across components -- instead of
> all the quirky command line interfaces we have today, and their user
> oriented output formats that you need to parse to use the data.
>
> sysctl done right, so to speak.
>
> -tih
> On Feb 15, 2021, at 11:56, Jon Steinhart <jon@fourwinds.com> wrote:
>
> Was thinking about our recent discussion about system call bloat and such.
> Seemed to me that there was some argument that it was needed in order to
> support modern needs. As I tried to say, I think that a good part of the
> bloat stemmed from we-need-to-add-this-to-support-that thinking instead
> of what's-the-best-way-to-extend-the-system-to-support-this-need thinking.
>
> So if y'all are up for it, I'd like to have a discussion on what abstractions
> would be appropriate in order to meet modern needs. Any takers?

The folks behind the Nerves Project (https://www.nerves-project.org) have
done some serious thinking about this question, albeit mostly confined to
the IoT space. They have also written (and distribute) some nifty
implementation code. I won't try to cover all of their work here, but
some high points include:

- automated build and cross-compilation of entire Linux-based systems
- automated distribution of (and fallbacks for) updated system code
- separation of code and data using read-only and read/write file systems
- support for multiple target platforms (e.g., processors, boards)
- Erlang-style supervision trees (via Elixir) for critical services, etc.
- extremely rapid boot times for the resulting (Linux-based) systems

For more information, check out their web site, watch some presentations,
and/or (gasp!) try out the code...

-r
Tom Ivar Helbekkmo writes:
> Jon Steinhart <jon@fourwinds.com> writes:
>
> > So if y'all are up for it, I'd like to have a discussion on what
> > abstractions would be appropriate in order to meet modern needs. Any
> > takers?
>
> A late friend of mine felt strongly that Unix needed an SQL interface to
> the kernel. With all information and configuration in a well designed
> schema, system administration could be greatly enhanced, he felt, and
> could have standard interaction patterns across components -- instead of
> all the quirky command line interfaces we have today, and their user
> oriented output formats that you need to parse to use the data.
>
> sysctl done right, so to speak.
OK, that's interesting and makes my brain a bit crazy. Are we talking
select file_descriptor from file_table where file_name='foo' && flags='O_EXCL';
delete from process_table where process_id=pid;
and so on? Lots of possibilities for weird joins.
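The two statements above can at least be tried out against an ordinary
in-memory database. What follows is only a sketch of the idea, not a kernel
interface: it uses Python's sqlite3 module, the table and column names are
modelled on the ones in this message and are purely hypothetical, and the
open_files and kill helpers are invented for illustration.

```python
import sqlite3

# Hypothetical schema: neither table corresponds to a real kernel interface.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE process_table (process_id INTEGER PRIMARY KEY, command TEXT);
    CREATE TABLE file_table (
        file_descriptor INTEGER,
        process_id INTEGER REFERENCES process_table(process_id),
        file_name TEXT,
        flags TEXT
    );
""")
conn.executemany("INSERT INTO process_table VALUES (?, ?)",
                 [(1, "init"), (42, "ed")])
conn.executemany("INSERT INTO file_table VALUES (?, ?, ?, ?)",
                 [(3, 42, "foo", "O_EXCL"), (4, 42, "bar", "O_RDONLY")])


def open_files(conn, name, flags):
    """The 'select' from the message, joined against the process table."""
    return conn.execute(
        "SELECT p.command, f.file_descriptor "
        "FROM file_table AS f JOIN process_table AS p USING (process_id) "
        "WHERE f.file_name = ? AND f.flags = ?", (name, flags)).fetchall()


def kill(conn, pid):
    """The 'delete' from the message: drop a process and its descriptors."""
    with conn:  # one transaction, so the two tables stay consistent
        conn.execute("DELETE FROM file_table WHERE process_id = ?", (pid,))
        conn.execute("DELETE FROM process_table WHERE process_id = ?", (pid,))
```

Even in this toy form the join is the interesting part: "which command holds
foo open" becomes one query rather than a walk over per-process output.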
But, this wasn't exactly what I was looking for in my original post which was
maybe too terse.
There have been heated discussions on this list about kernel API bloat. In my
opinion, these discussions have mainly been people grumbling about what they
don't like. I'd like to flip the discussion around to what we would like.
Ken and Dennis did a great job with initial abstractions. Some on this list
have claimed that these abstractions weren't sufficient for modern times.
Now that we have new information from modern use cases, how would we rethink
the basic abstractions?
Quoting from something that I wrote a few years ago:
The original Apple Macintosh API was published in 1985 in a three-
volume set of books called Inside Macintosh (Addison-Wesley). The
set was over 1,200 pages long. It’s completely obsolete; modern
(UNIX-based) Macs don’t use any of it. Why didn’t this API
design last?
By contrast, version 6 of the UNIX operating system was released 10
years earlier in 1975, with a 321-page manual. It embodied a completely
different approach that sported a narrow and deep API.
Both the UNIX API and a large number of the original applications are
still in widespread use today, more than 40 years later, which is a
testament to the quality of the design. Not only that, but a large
number of the libraries are still in use and essentially unchanged,
though their functionality has been copied into many other systems.
While I don't have a count of the number of entries in the original Mac API,
I'm guessing that the number of Linux system calls is getting closer to that
number.
Is there any way that the abstractions can be rethought to get us back to an
API that is more concise and flexible? By flexible I mean the ability to
support new functionality without adding more system calls.
While the SQL interface notion is interesting, to me it's more in line with
using a different language to access the API. But it would be interesting to
see it fleshed out because maybe the abstractions provided by various tables
would be different.
Because it's easy pickings, I would claim that the socket system call is out
of line with the UNIX abstractions; it exists because of practical political
considerations, not because it's needed. I think that it would have fit
better folded into the open system call.
Something else added along with the networking was readv/writev. In this case,
I would claim that those are the correct modern abstraction and that read/write
are a subset.
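The subset claim is easy to demonstrate. The sketch below uses Python's
os.readv and os.writev wrappers (Unix only) over a pipe; the read and write
helper functions are illustrative names, defined here as nothing more than
the one-buffer case of the vectored calls.

```python
import os


def read(fd, n):
    """read(2) expressed as a one-buffer readv(2)."""
    buf = bytearray(n)
    got = os.readv(fd, [buf])
    return bytes(buf[:got])


def write(fd, data):
    """write(2) expressed as a one-buffer writev(2)."""
    return os.writev(fd, [data])


r, w = os.pipe()

# Gather: header and body leave in a single system call.
os.writev(w, [b"HDR:", b"payload"])

# Scatter: the same bytes land in two separately sized buffers.
header, body = bytearray(4), bytearray(7)
scattered = os.readv(r, [header, body])

# The plain calls still work, as the degenerate one-buffer case.
write(w, b"plain")
plain = read(r, 16)

os.close(r)
os.close(w)
```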
Hope that clarifies the discussion that I'm trying to kick off.
Jon
Jon Steinhart wrote in <202102151956.11FJuRIh3079869@darkstar.fourwinds.com>:
 |Was thinking about our recent discussion about system call bloat and such.
 |Seemed to me that there was some argument that it was needed in order to
 |support modern needs. As I tried to say, I think that a good part of the
 |bloat stemmed from we-need-to-add-this-to-support-that thinking instead
 |of what's-the-best-way-to-extend-the-system-to-support-this-need thinking.
 |
 |So if y'all are up for it, I'd like to have a discussion on what abstractions
 |would be appropriate in order to meet modern needs. Any takers?

Proper program exit integer status codes.

Now that "set -o pipefail" is a standardized feature of POSIX shells all
that is needed are programs which properly handle errors and also report
that to the outside. This is very hard, especially when put over existing
codebases. But also new code.

For example i use BTRFS (with a long term perspective to switch to ZFS,
because of restartable snapshot sends, and also because of ZFS encrypted
partitions to replace my several encfs-encrypted on-demand storages, these
now can even be shared in between FreeBSD and Linux), (i use it at all
because it ships with the Linux kernel, can be compiled-in, is
copyright-compatible, that is i wanted to test that coming from over two
decades of ext2/3/4 on Linux and of course the default of FreeBSD, and
i really drive the entire thing with subvolumes, only the EFI boot
partition is truly separate), anyhow, receiving snapshots can fail but the
snapshot counts as having been properly received, and no exit status
whatsoever will report the failure. (At least in my practical
experiences.)

Easy scriptability with proper (also meaning automatically interpretable)
error reports.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
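To make the pipefail point concrete, here is a small sketch of the rule it
implements: a pipeline reports the status of the last stage that exited
non-zero, or 0 if every stage succeeded. It is written with Python's
subprocess module rather than a shell so that it does not depend on which
shell is installed; run_pipeline and the two example stages are invented
names, and nothing here is specific to btrfs.

```python
import subprocess
import sys


def run_pipeline(*stages):
    """Run argv lists as a pipeline and return a pipefail-style status."""
    procs, prev = [], None
    for argv in stages:
        p = subprocess.Popen(argv, stdin=prev, stdout=subprocess.PIPE)
        if prev is not None:
            prev.close()  # the child holds its own copy of the pipe end
        procs.append(p)
        prev = p.stdout
    prev.read()  # drain the final stage so it can exit
    prev.close()
    status = 0
    for p in procs:  # left to right, so the rightmost failure wins
        code = p.wait()
        if code != 0:
            status = code
    return status


# A producer that emits partial output and then fails, and a consumer that
# reads everything and exits 0, as a stand-in for a failed "send | receive".
PRODUCER_FAILS = [sys.executable, "-c",
                  "import sys; print('partial'); sys.exit(3)"]
PRODUCER_OK = [sys.executable, "-c", "print('complete')"]
CONSUMER = [sys.executable, "-c", "import sys; sys.stdin.read()"]
```

Without this rule the first pipeline would report 0, because only the
consumer's status would be visible. It only helps, of course, if each
program reports its own failure in the first place.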
It's always useful to talk about requirements as the first part of the
design process. At the high level, how important is backwards
compatibility? Is the problem of how to support existing applications in
scope, or not? Or is the assumption that emulation libraries will always
be sufficient? How about performance, either of applications using the
new API, or applications using the legacy APIs?

And what are the hardware platforms that this new set of abstractions is
going to target? Is the goal only to target small embedded systems?
Mobile handsets? Desktop systems? Is it supposed to be able to scale to
super computers? Are web front-ends that need to be able to accept
thousands of incoming TCP connections per second, and then redirect those
connections to application logic servers, in scope? Solutions that
involve being able to interpret general SQL queries may not scale in
terms of performance and the ability to support thousands of file
descriptors in a single process.

Backwards compatibility is why we have multiple asynchronous I/O
interfaces --- from select, poll, epoll, kqueue, and io_uring. And the
reason why we've had multiple asynchronous I/O interfaces over the
decades is because the performance requirements have changed, and the
capability of hardware interfaces for high performance I/O has changed;
it's no longer about I/O ports and interrupts, but instead, having
multiple request and response queues through memory mapped I/O, and the
need to be able to use multiple CPUs and to multiplex multiple network or
storage transactions across a single doorbell or system call.

If all of this is out of scope, then the design process will be much
simpler, and perhaps more elegant; but the resulting design will not be
useful for many of the use cases where Linux is used today. And perhaps
that's OK. On the other hand, one person's simple, elegant design is
another person's toy that isn't fit for their purpose.

IBM once said that part of Linux's power is that it scales from wrist
watches to super computers. Is that in scope for this theoretical design
question?

- Ted
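One common way to live with the pile of readiness interfaces listed above
is a thin layer that picks whichever one the platform offers. Python's
standard selectors module is such a layer (it chooses among epoll, kqueue,
/dev/poll, poll and select; it does not cover io_uring). The sketch below
is only an illustration of that approach, with a socket pair standing in
for a network connection.

```python
import selectors
import socket

# DefaultSelector resolves to the most efficient backend available here.
sel = selectors.DefaultSelector()
backend = type(sel).__name__

a, b = socket.socketpair()
sel.register(b, selectors.EVENT_READ, data="peer b")

a.sendall(b"ping")

# One wait covers every registered descriptor, whatever the backend.
ready = [(key.data, key.fileobj.recv(16))
         for key, _events in sel.select(timeout=1)]

sel.unregister(b)
sel.close()
a.close()
b.close()
```

The application code is the same on every platform; only the value of
backend differs. That hides the historical layering but does not remove it.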
On Feb 16, 2021, at 11:59 AM, Jon Steinhart <jon@fourwinds.com> wrote:
>
> The original Apple Macintosh API was published in 1985 in a three-
> volume set of books called Inside Macintosh (Addison-Wesley). The
> set was over 1,200 pages long. It’s completely obsolete; modern
> (UNIX-based) Macs don’t use any of it. Why didn’t this API
> design last?

I think this is a little bit of a red herring; most of the original
Macintosh Toolbox APIs would not be considered "system calls" then or now.
The Macintosh Operating System APIs were a much more tightly-scoped set on
top of which was the Toolbox.

For example, in the original filesystem and device driver interfaces, you
had _PBOpen, _PBClose, _PBRead, _PBWrite, and _PBControl. Sound familiar?
One major difference is that these took a struct full of arguments (a
parameter block in Macintosh API terminology) and could be used either
synchronously or asynchronously with a callback, unlike the core UNIX
filesystem calls.

A more oranges-to-oranges comparison would be to look at the Macintosh
Operating System and Toolbox API surface compared with, say, the SunOS and
SunWindows API surface…

And then, of course, there's the question of how long the design lasted:
The Carbon API set is a direct descendant of the original Macintosh
Operating System and Toolbox API set, and was supported for the entire
lifetime of 32-bit executables on the Mac. I ported plenty of OS & Toolbox
code to Carbon and it was mostly a matter of updating UI metrics and
replacing direct structure accesses with equivalent function calls.

  -- Chris
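The shape of a parameter-block call, one struct in and either a blocking
return or a later callback, can be sketched in a few lines. This is not the
Macintosh API: ParamBlock, its field names and pb_read are all hypothetical,
and a thread stands in for whatever the real system did to complete a
request asynchronously.

```python
import os
import threading
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ParamBlock:
    """Hypothetical parameter block: arguments in, results out."""
    fd: int
    count: int
    completion: Optional[Callable[["ParamBlock"], None]] = None
    result: bytes = b""
    done: threading.Event = field(default_factory=threading.Event)


def pb_read(pb, asynchronous=False):
    """One entry point; the caller chooses sync or async per request."""
    def work():
        pb.result = os.read(pb.fd, pb.count)
        if pb.completion is not None:
            pb.completion(pb)  # completion routine, as with a callback
        pb.done.set()

    if asynchronous:
        threading.Thread(target=work).start()
    else:
        work()
    return pb


r, w = os.pipe()
os.write(w, b"sync async")

sync_pb = pb_read(ParamBlock(fd=r, count=5))

seen = []
async_pb = pb_read(
    ParamBlock(fd=r, count=16, completion=lambda pb: seen.append(pb.result)),
    asynchronous=True)
async_pb.done.wait(timeout=5)

os.close(r)
os.close(w)
```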
> On 16 Feb 2021, at 06:56, Jon Steinhart <jon@fourwinds.com> wrote:
>
> Was thinking about our recent discussion about system call bloat and such.
> Seemed to me that there was some argument that it was needed in order to
> support modern needs. As I tried to say, I think that a good part of the
> bloat stemmed from we-need-to-add-this-to-support-that thinking instead
> of what's-the-best-way-to-extend-the-system-to-support-this-need thinking.
>
> So if y'all are up for it, I'd like to have a discussion on what abstractions
> would be appropriate in order to meet modern needs. Any takers?
Plan9 showed that it’s possible to evolve the Unix model to encompass new needs without compromising the abstraction, although to be fair, it basically addressed only the first 15-20 years of changes since V7. Freedom to break backward compatibility is obviously a key enabler, and difficult to manage for a commercial system.
Despite its various issues, I think the Mach abstractions also stand up well as an insightful effort for their time.
One area that has continued to evolve in Unix, with a trail of (mostly) still-supported-but-no-longer-recommended APIs, is asynchronous event handling. mpx, select, poll, kevents, AIO, /dev/poll, epoll, port_create, inotify, dnotify, FEN, etc. What a mess!
Containers, jails, zones, namespaces, etc, is another area with diverse solutions, none of which have been sufficiently the Right Thing to be adopted by everyone else.
For today’s uses and hardware, the Unix API does too much: rich, stateful APIs copying everything from userland to kernel and back again — the context switching and data copying time is prohibitive, and so the kernel ends up being bypassed once it’s checked the permissions and allocated the hardware resources. I hesitate to call it a micro-kernel model, but the kernel is used less, and libraries and services take on more of the work.
d
> - separation of code and data using read-only and read/write file systems
I'll bite. How do you install code in a read-only file system? And
where does a.out go?
My guess is that /bin is in a file system of its own. Executables from
/etc and /lib are probably there too. On the other hand, I guess
users' personal code is still read/write.
I agree that such an arrangement is prudent. I don't see a way,
though, to update bin without disrupting most running programs.
Doug
On Sat, Feb 20, 2021 at 06:09:42PM -0500, M Douglas McIlroy wrote:
> > - separation of code and data using read-only and read/write file systems
>
> I'll bite. How do you install code in a read-only file system? And
> where does a.out go?
>
> My guess is that /bin is in a file system of its own. Executables from
> /etc and /lib are probably there too. On the other hand, I guess
> users' personal code is still read/write.
>
> I agree that such an arrangement is prudent. I don't see a way,
> though, to update bin without disrupting most running programs.
>
> Doug
I always wonder how to distinguish data and programs when people want
to separate them. One person's data is another person's program and
vice versa. Think scripting, config files, grammar definitions,
postscript files, executables to be fed to emulators, compilers,
linkers, code analysis tools; the examples are endless.
Turing already saw that from the theoretical point of view, others
(like Von Neumann) more from the practical perspective.
Data = Programs.
-Otto
To quote from Jon’s post:

> There have been heated discussions on this list about kernel API bloat. In my
> opinion, these discussions have mainly been people grumbling about what they
> don't like. I'd like to flip the discussion around to what we would like.
> Ken and Dennis did a great job with initial abstractions. Some on this list
> have claimed that these abstractions weren't sufficient for modern times.
> Now that we have new information from modern use cases, how would we rethink
> the basic abstractions?

I’d like to add the constraint of things that would have been
implementable on the hardware of the late 1970’s, let’s say a PDP11/70
with Datakit or 3Mbps Ethernet or Arpanet; maybe also Apple 2 class
bitmap graphics.

And quote some other posts:

> Because it's easy pickings, I would claim that the socket system call is out
> of line with the UNIX abstractions; it exists because of practical political
> considerations, not because it's needed. I think that it would have fit
> better folded into the open system call.

>>
>> Somebody once suggested a filesystem interface (it certainly fits the Unix
>> philosophy); I don't recall the exact details.
>
> And it was done, over 30 years ago; see Plan 9 from Bell Labs....

I would argue that quite a bit of that was implementable as early as 6th
Edition. I was researching that very topic last Spring [1] and back
ported Peter Weinberger’s File System Switch (FSS) from 8th to 6th
Edition; the switch itself bloats the kernel by about half a kilobyte. I
think it may be one of the few imaginable extensions that do not dilute
the incredible bang/buck ratio of the V6 kernel.

With that change in place a lot of other things become possible:
- a Killian style procfs
- a Weinberger style network FS
- a text/file based ioctl
- a clean approach to named pipes
- a different starting point to sockets

Each of these would add to kernel size of course, hence I’m thinking
about a split I/D kernel.

To some extent it is surprising that the FSS did not happen around 1975,
as many ideas around it were 'in the air' at the time (Heinz Lycklama’s
peripheral Unix, the Spider network Filestore, Rand ports, Arpanet Unix,
etc). With the benefit of hindsight, it isn’t a great code leap from the
cdev switch to the FSS - but probably the ex ante conceptual leap was
just too big at the time.

Paul

[1] Code diffs here:
https://1587660.websites.xs4all.nl/cgi-bin/9995/vdiff?from=fab15b88a6a0f36bdb41f24f0b828a67c5f9fe03&to=b95342aaa826bb3c422963108c76d09969b1de93&sbs=1
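The step from a device switch to a file system switch is small enough to
show in miniature. The sketch below is not the 8th Edition code: DiskFS,
ProcFS, fs_switch and fs_read are hypothetical names, and the dispatch is
by mount-point prefix purely to keep the example short, where a real switch
would index a table of operations by the file system type recorded for the
inode or mount.

```python
class DiskFS:
    """Stand-in for the ordinary on-disk file system."""

    def __init__(self, files):
        self.files = files

    def read(self, path):
        return self.files[path]


class ProcFS:
    """Stand-in for a procfs: contents are synthesized on each read."""

    def __init__(self, procs):
        self.procs = procs

    def read(self, path):
        return self.procs[int(path.strip("/"))].encode()


# The switch: each entry supplies the same operations, implemented
# differently, much as each cdevsw entry supplies open/read/write.
fs_switch = {
    "/proc": ProcFS({1: "init", 42: "ed"}),
    "/": DiskFS({"/etc/motd": b"hello\n"}),
}


def fs_read(path):
    """The generic read path: find the owner, then call through the switch."""
    for mount in sorted(fs_switch, key=len, reverse=True):
        prefix = mount.rstrip("/") + "/"
        if path.startswith(prefix):
            inner = path if mount == "/" else path[len(mount):]
            return fs_switch[mount].read(inner)
    raise FileNotFoundError(path)
```

Adding a network file system or a named-pipe type then means adding one
more entry, with no change to fs_read or its callers.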
> On Feb 20, 2021, at 15:09, M Douglas McIlroy <m.douglas.mcilroy@dartmouth.edu> wrote:
>
>> - separation of code and data using read-only and read/write file systems
>
> I'll bite. How do you install code in a read-only file system?
Disclaimer: I haven't actually used Nerves myself, just watched some presentations, read various web pages, etc. So anything I say about it is quite unreliable. And, although that item was (sort of) true, it was obviously rather misleading if interpreted too broadly. So, I'll try to provide some context to explain what I meant by it.
As I understand it, Nerves is intended as a build and delivery mechanism for IoT system software. It's supposed to be possible to upgrade a deployed device without blowing away its persistent saved state. And, if the upgrade fails, to back down to the previous version. Also, the running code on the device should not be able to trash the system software.
To support this, they use multiple file systems, with various updating attributes. For example, they might have two file systems for the system software and a third one for the persistent saved state. This lets a developer upload and boot a new copy of the system software, but fall back to the old version if something goes wrong.
-r
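The arrangement described above (two system file systems, a third for
persistent state, and a fallback if the new system fails) can be modelled
in a few lines. This is a sketch of the general A/B idea only, not of how
Nerves implements it; Device, its slot names and the boots_ok check are all
invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device: two system slots plus separate saved state."""
    slots: dict = field(default_factory=lambda: {"a": "v1", "b": None})
    active: str = "a"
    state: dict = field(default_factory=dict)  # never touched by upgrades

    def upgrade(self, image, boots_ok):
        """Write the image to the inactive slot; keep it only if it boots."""
        spare = "b" if self.active == "a" else "a"
        self.slots[spare] = image
        if boots_ok(image):
            self.active = spare
            return True
        return False  # fall back: the running slot was never modified

    def running(self):
        return self.slots[self.active]


dev = Device()
dev.state["boot_count"] = 7

failed = dev.upgrade("v2-broken", boots_ok=lambda image: False)
after_failure = dev.running()

succeeded = dev.upgrade("v2", boots_ok=lambda image: True)
after_success = dev.running()
```

Because the upgrade only ever writes to the slot that is not running, the
system software in use can stay mounted read-only throughout.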
On Sat, 20 Feb 2021, M Douglas McIlroy wrote:

>> - separation of code and data using read-only and read/write file
>> systems
>
> I'll bite. How do you install code in a read-only file system? And where
> does a.out go?

I once worked for a place who reckoned that /bin and /lib etc ought to be
in an EEPROM; I reckon that he was right (Penguin/OS dumps everything
under /usr/bin, for example).

> My guess is that /bin is in a file system of its own. Executables from
> /etc and /lib are probably there too. On the other hand, I guess users'
> personal code is still read/write.

That's how we ran our RK-05 11/40s since Ed 5... Good fun writing a DJ-11
driver from the DH-11 source; even more fun when I wrote a UT-200 driver
from the manual alone (I'm sure that "ei.c" is Out There Somewhere),
junking IanJ's driver.

The war stories that I could tell...

> I agree that such an arrangement is prudent. I don't see a way, though,
> to update bin without disrupting most running programs.

Change is inevitable; the trick is to minimise the disruption.

-- Dave, who carried RK-05s all over the UNSW campus
On Sat, Feb 20, 2021 at 6:10 PM M Douglas McIlroy <
m.douglas.mcilroy@dartmouth.edu> wrote:

> > - separation of code and data using read-only and read/write file
> systems
>
> I'll bite. How do you install code in a read-only file system? And
> where does a.out go?
>

The best way I have seen this done is with overlay and union file system
support. The 'writeable' versions of the files in /bin are overlaid as
needed. To do this properly you need the stackable file system stuff we
worked on at LCC and Sun. If you can interpose at the inode level it's
very cool and flexible (Sun played with this, but it makes the Sun symlink
nightmare seem like an easy night at the movies); interposing at the
filesystem switch layer (the Locus and UCLA scheme that was in BSD at one
point) is easier to manage and administer.
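The lookup rule behind an overlay is simple to state: writes go to a
writable upper layer, reads prefer the upper layer and fall through to the
read-only lower one, and deletions are recorded as whiteouts. The class
below is a toy model of that rule using dictionaries, assuming nothing
about the LCC, Sun, Locus or BSD implementations; Overlay and its method
names are invented for illustration.

```python
class Overlay:
    """Toy union mount: a writable upper layer over a read-only lower one."""

    def __init__(self, lower):
        self.lower = dict(lower)   # treated as read-only from here on
        self.upper = {}            # receives every write
        self.whiteouts = set()     # names deleted from the merged view

    def read(self, name):
        if name in self.whiteouts:
            raise FileNotFoundError(name)
        if name in self.upper:
            return self.upper[name]
        if name in self.lower:
            return self.lower[name]
        raise FileNotFoundError(name)

    def write(self, name, data):
        self.whiteouts.discard(name)
        self.upper[name] = data

    def remove(self, name):
        self.read(name)            # raises if the name is not visible
        self.upper.pop(name, None)
        if name in self.lower:
            self.whiteouts.add(name)  # hide it; the lower copy is untouched


fs = Overlay({"/bin/ls": b"old"})
before = fs.read("/bin/ls")
fs.write("/bin/ls", b"new")
after = fs.read("/bin/ls")
```

This answers the "how do you install code in a read-only file system"
question in the small: the installed copy lives in the upper layer and the
lower layer never changes.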
On Mon, 22 Feb 2021, Dave Horsfall wrote:
> I once worked for a place who reckoned that /bin and /lib etc ought to be in
> an EEPROM; I reckon that he was right (Penguin/OS dumps everything under
> /usr/bin, for example).
I have used distributions in the past that maintained the traditional
distinction.
While I've been stuck regarding bringing up a kernel, C compiler and libc
all together, (keeping in mind my desire to avoid gcc and glibc for the
project) the conceptual distribution I've been working on for some time
uses more or less the same abstraction as the BSDs, with distinct /bin and
/sbin vs. /usr/bin and /usr/sbin as I personally believe it should be,
that the stuff in /bin should be enough to bring up and/or run diagnostics
on a system, and everything else go in /usr.
-uso.
On Mon, Feb 22, 2021 at 09:40:57AM +1100, Dave Horsfall wrote:
> That's how we ran our RK-05 11/40s since Ed 5... Good fun writing a DJ-11
> driver from the DH-11 source; even more fun when I wrote a UT-200 driver
> from the manual alone (I'm sure that "ei.c" is Out There Somewhere), junking
> IanJ's driver.

https://minnie.tuhs.org/cgi-bin/utree.pl?file=AUSAM/sys/dmr/ei.c

Cheers, Warren
I've just checked Slackware 14.* and it's still got a few binaries in
/bin, unlike the RedHat* group which has indeed sent them all to
/usr/bin. I don't know about the Debian* group, or if the Mandrake*
group have gone with the RedHat* or not. Let alone all the other
distros.
Wesley Parish
On 2/22/21, Dave Horsfall <dave@horsfall.org> wrote:
> On Sat, 20 Feb 2021, M Douglas McIlroy wrote:
>
>>> - separation of code and data using read-only and read/write file
>>> systems
>>
>> I'll bite. How do you install code in a read-only file system? And where
>> does a.out go?
>
> I once worked for a place who reckoned that /bin and /lib etc ought to be
> in an EEPROM; I reckon that he was right (Penguin/OS dumps everything
> under /usr/bin, for example).
>
>> My guess is that /bin is in a file system of its own. Executables from
>> /etc and /lib are probably there too. On the other hand, I guess users'
>> personal code is still read/write.
>
> That's how we ran our RK-05 11/40s since Ed 5... Good fun writing a DJ-11
> driver from the DH-11 source; even more fun when I wrote a UT-200 driver
> from the manual alone (I'm sure that "ei.c" is Out There Somewhere),
> junking IanJ's driver.
>
> The war stories that I could tell...
>
>> I agree that such an arrangement is prudent. I don't see a way, though,
>> to update bin without disrupting most running programs.
>
> Change is inevitable; the trick is to minimise the disruption.
>
> -- Dave, who carried RK-05s all over the UNSW campus
>
On Tue, 23 Feb 2021, Wesley Parish wrote:
> I've just checked Slackware 14.* and it's still got a few binaries in
> /bin, unlike the RedHat* group which has indeed sent them all to
> /usr/bin. I don't know about the Debian* group, or if the Mandrake*
> group have gone with the RedHat* or not. Let alone all the other
> distros.
Debian links /bin to /usr/bin.
-uso.
On Tue, Feb 23, 2021 at 01:25:51PM +1300, Wesley Parish wrote:
> I've just checked Slackware 14.* and it's still got a few binaries in
> /bin, unlike the RedHat* group which has indeed sent them all to
> /usr/bin. I don't know about the Debian* group, or if the Mandrake*
> group have gone with the RedHat* or not. Let alone all the other
> distros.

More information about the /usr migration can be found at:

* https://wiki.debian.org/UsrMerge
* https://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMerge/

One of the interesting points made in the above is that merging /bin
and /usr/bin, et al., was first done by Solaris 11 (ten years ago) and
so one of the arguments for Linux distributions for proceeding with
the /usr merge was to improve cross compatibility with legacy
commercial Unix systems.

So obviously, like so many other things, it's all Oracle's fault. :-)

- Ted
On Mon, Feb 22, 2021 at 07:38:21PM -0500, Steve Nickolas wrote:
> On Tue, 23 Feb 2021, Wesley Parish wrote:
>
> > I've just checked Slackware 14.* and it's still got a few binaries in
> > /bin, unlike the RedHat* group which has indeed sent them all to
> > /usr/bin. I don't know about the Debian* group, or if the Mandrake*
> > group have gone with the RedHat* or not. Let alone all the other
> > distros.
>
> Debian links /bin to /usr/bin.
New installs of Debian will use a /usr merged configuration. However,
for pre-existing installations, we are not yet forcing, or even
strongly recommending, system administrators to install the usrmerge
package which will transition a legacy directory hierarchy to be /usr
merged. So at the moment, Debian packages need to support both merged
and non-merged configurations, which is not ideal from a package
maintainer's POV.
- Ted
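Since packages have to cope with both layouts, a script sometimes needs to
find out which one it is running on. One plausible check, sketched here in
Python, is whether /bin is a symbolic link that resolves to /usr/bin. The
is_usr_merged helper is an invented name, this is not how Debian's own
tooling decides, and the temporary directory trees exist only so the
example does not depend on the host system.

```python
import os
import tempfile


def is_usr_merged(root="/"):
    """True if root/bin is a symlink that resolves to root/usr/bin."""
    bin_dir = os.path.join(root, "bin")
    usr_bin = os.path.join(root, "usr", "bin")
    return (os.path.islink(bin_dir)
            and os.path.realpath(bin_dir) == os.path.realpath(usr_bin))


with tempfile.TemporaryDirectory() as merged, \
        tempfile.TemporaryDirectory() as split:
    # Merged layout: /bin is a relative symlink to usr/bin.
    os.makedirs(os.path.join(merged, "usr", "bin"))
    os.symlink("usr/bin", os.path.join(merged, "bin"))

    # Traditional layout: /bin and /usr/bin are separate directories.
    os.makedirs(os.path.join(split, "usr", "bin"))
    os.makedirs(os.path.join(split, "bin"))

    results = (is_usr_merged(merged), is_usr_merged(split))
```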
On Mon, Feb 22, 2021, 7:50 PM Theodore Ts'o <tytso@mit.edu> wrote:

> On Mon, Feb 22, 2021 at 07:38:21PM -0500, Steve Nickolas wrote:
> > On Tue, 23 Feb 2021, Wesley Parish wrote:
> >
> > > I've just checked Slackware 14.* and it's still got a few binaries in
> > > /bin, unlike the RedHat* group which has indeed sent them all to
> > > /usr/bin. I don't know about the Debian* group, or if the Mandrake*
> > > group have gone with the RedHat* or not. Let alone all the other
> > > distros.
> >
> > Debian links /bin to /usr/bin.
>
> New installs of Debian will use a /usr merged configuration. However,
> for pre-existing installations, we are not yet forcing, or even
> strongly recommending, system administrators to install the usrmerge
> package which will transition a legacy directory hierarchy to be /usr
> merged. So at the moment, Debian packages need to support both merged
> and non-merged configurations, which is not ideal
>

I anticipate needing a /usr/bin/bash soon on my FreeBSD system for the
same reason I have a /bin/bash pointing at /usr/local/bin/bash.

Progress :)

Warner
On 2/21/21, Steve Nickolas <usotsuki@buric.co> wrote:
>
> While I've been stuck regarding bringing up a kernel, C compiler and libc
> all together, (keeping in mind my desire to avoid gcc and glibc for the
> project) the conceptual distribution I've been working on for some time
> uses more or less the same abstraction as the BSDs, with distinct /bin and
> /sbin vs. /usr/bin and /usr/sbin as I personally believe it should be,
> that the stuff in /bin should be enough to bring up and/or run diagnostics
> on a system, and everything else go in /usr.
>
I don't see much of a point in maintaining the separation these days.
/bin and /usr/bin were originally separated because it wasn't possible
to fit everything on one disk, and (AFAIK) the separation was mostly
maintained after that to reduce the chance of filesystem corruption
rendering the system unbootable (which is much less of a problem
nowadays because of journalled and log-structured filesystems).
Under UX/RT, the OS I'm writing, all commands (administrative or
otherwise) will appear to be in /bin, and all daemons will appear to
be in /sbin (with corresponding symlinks in /usr). The separation into
administrative and regular commands will be meaningless since the
traditional root/non-root security model will be completely eliminated
in favor of role-based access control. The / and /usr separation will
be useless since it will be impossible to have a separate /usr
partition (the contents of the root will be dynamically bound from a
collection of individual package directories, and won't correspond to
the root of the system volume).
At Mon, 22 Feb 2021 20:31:49 -0700, Andrew Warkentin <andreww591@gmail.com> wrote:
Subject: Re: [TUHS] Abstractions
>
> On 2/21/21, Steve Nickolas <usotsuki@buric.co> wrote:
> >
> > While I've been stuck regarding bringing up a kernel, C compiler and libc
> > all together, (keeping in mind my desire to avoid gcc and glibc for the
> > project) the conceptual distribution I've been working on for some time
> > uses more or less the same abstraction as the BSDs, with distinct /bin and
> > /sbin vs. /usr/bin and /usr/sbin as I personally believe it should be,
> > that the stuff in /bin should be enough to bring up and/or run diagnostics
> > on a system, and everything else go in /usr.
>
> I don't see much of a point in maintaining the separation these days.
> /bin and /usr/bin were originally separated because it wasn't possible
> to fit everything on one disk, and (AFAIK) the separation was mostly
> maintained after that to reduce the chance of filesystem corruption
> rendering the system unbootable (which is much less of a problem
> nowadays because of journalled and log-structured filesystems).

Maybe there isn't any impetus to _create_ a separate /usr these days of
large software but even larger disks.

However I think there are at least two good reasons to _maintain_ a
separate /usr.  At least for ostensibly POSIX and Unix compatible
systems, that is.

For one there's a huge amount of deeply embedded lore, human (finger and
brain) memory, actual code, documentation, and widespread practices that
use this separation and rely on it, effectively making it a requirement.

As Steve mentions above there's also the concept of knowing the minimum
requirements for bringing up a system capable of the most basic tasks.
Of course there's likely going to be some variance in what any given
person might define as "most basic tasks", but that's mostly a separate
issue.

However I will give one example of why this might be a good thing to
know and preserve:  it is highly useful for those creating "embedded"
systems, or application specific systems.  They can start with just the
minimal root filesystem, and then know exactly what they have to add in
order to meet their application's requirements precisely.  (and the
reasons for doing that can be much wider than many might assume)

Also the basic idea of having a root filesystem that contains just and
only what's necessary for the system to boot and run, and putting
everything else that makes the system usable to users into /usr, is also
still a worthwhile concept even just on its own.

The maintenance of an illusion of a separate /usr can of course be
easily done with a farm of symlinks, thus preserving any dependencies in
anyone's memory, documentation, or code.

However the reality of maintaining a separate minimal toolset for system
bring-up is that it cannot be reliably done without constant and
pervasive testing; and the very best (and perhaps only) way to achieve
this, especially in any smaller open-source project, is for everyone to
use it that way as much of the time as possible.  I say this from
decades long experience of slowly moving systems to having just one
partition for both root and /usr and then on occasion testing with
separate root and /usr, and every time I do this testing I find
dependencies have crept in on something in /usr for basic booting.  (and
that's even when I base my system on a platform that still tries hard to
maintain this separation of root and /usr!)

BTW, I think it was Sun that first did some of this merging of root and
/usr a very long time ago.

--
					Greg A. Woods <gwoods@acm.org>

Kelowna, BC     +1 250 762-7675           RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>     Avoncote Farms <woods@avoncote.ca>
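The "dependencies have crept in on something in /usr" problem described above lends itself to a mechanical check. What follows is only a minimal sketch, not anything from the thread: it assumes the output of an `ldd`-style tool has already been captured as text, and the function name `usr_dependencies` is hypothetical. It reports any shared library that resolves to a path under /usr, which is the kind of dependency that would break a root-only boot.

```python
def usr_dependencies(ldd_output: str) -> list[str]:
    """Return library paths under /usr named in ldd-style output.

    Handles the two common line shapes:
        libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)
        /lib/ld-linux.so.2 (0x...)
    Lines such as "libfoo.so => not found" are ignored.
    """
    found = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "=>" in fields:
            idx = fields.index("=>")
            if idx + 1 >= len(fields):
                continue  # nothing after the arrow
            path = fields[idx + 1]
        else:
            path = fields[0]
        if path.startswith("/usr/"):
            found.append(path)
    return found
```

Running this over the captured `ldd` output of every program in /bin and /sbin would give a list of suspects. It says nothing about programs that open files under /usr at run time, so it complements rather than replaces booting with /usr unmounted.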
On 2/23/21 10:29 AM, Greg A. Woods wrote:
> Maybe there isn't any impetus to _create_ a separate /usr these days
> of large software but even larger disks.

I'm undecided.

Part of me likes the / (root) and /usr split.  But another part of me
questions /if/ and (if so) /why/ it is (still) /needed/.

> However I think there are at least two good reasons to _maintain_
> a separate /usr.  At least for ostensibly POSIX and Unix compatible
> systems, that is.

Does /usr actually /need/ to be a /separate/ file system?  Or would a
wholesale link from /usr to / (root) suffice?  Or would perhaps a
collection of sym-links from /usr/<foo> to /<foo> suffice?

> For one there's a huge amount of deeply embedded lore, human
> (finger and brain) memory, actual code, documentation, and widespread
> practices that use this separation and rely on it, effectively making
> it a requirement.

Are they relying on the /separation/ of separate file systems?  Or are
they simply relying on rote memory for the path?  Ergo sym-links could
fulfill the perceived need?

> As Steve mentions above there's also the concept of knowing the
> minimum requirements for bringing up a system capable of the most
> basic tasks.

The pat response to this in the Linux community is "That's what the
initrd / initramfs is for!"

What that fails to take into account is if the system actually uses an
initrd / initramfs or not.  Many of the systems I maintain do /not/ use
an initrd / initramfs.  Thus the systems have /some/ actual /need/ to be
able to bring up a minimal system to repair file system problems.  Even
if the so called problem is simply that the extent file system needs an
fsck with human interaction (time since last check and / or maximum
number of mounts).

If you do use an initrd / initramfs, then you can reasonably safely lump
everything* in the / (root) file system.

* /boot still tends to be its own file system on Linux, mostly because
that's where the initrd / initramfs images live, which contain drivers
for more fancy things (software RAID, LVM, ZFS, SAN, etc.) which are
needed to bring up / (root).

> Of course there's likely going to be some variance in what any
> given person might define as "most basic tasks", but that's mostly a
> separate issue.

Agreed.

However, I posit that "most basic tasks" be what is necessary to
transition from single user mode to multi-user mode.  Including any and
all utilities required to fix file systems, work with logical volumes,
SAN, etc.

> However I will give one example of why this might be a good thing to
> know and preserve:  it is highly useful for those creating "embedded"
> systems, or application specific systems.  They can start with just the
> minimal root filesystem, and then know exactly what they have to add
> in order to meet their application's requirements precisely.  (and the
> reasons for doing that can be much wider than many might assume)

Please elaborate on what that has to do with the / (root) vs /usr split?

I feel like you're differentiating between a minimal install vs a
kitchen sink install.  Which seems to me to be independent of how the
underlying file system(s) is (are) arranged.

> Also the basic idea of having a root filesystem that contains just
> and only what's necessary for the system to boot and run, and putting
> everything else that makes the system usable to users into /usr,
> is also still a worthwhile concept even just on its own.

Many in the Linux community think this is the job of the initrd /
initramfs.  I personally believe that this is the job of the / (root)
file system.

Aside:  In the event that /usr is on the / (root) file system, then the
system should still be able to come up as if /usr didn't exist b/c it
had been renamed or was on a separate file system.

> The maintenance of an illusion of a separate /usr can of course be
> easily done with a farm of symlinks, thus preserving any dependencies
> in anyone's memory, documentation, or code.

Agreed.

With things like bind mounts, we don't even need to use sym-links.  }:-)

Though, one potential danger is that people see duplication between
/bin/<foo> and /usr/bin/<foo> and decide to remove one of them.  Doing
so will ultimately remove both and cause someone to have a not good day.

Aside:  Perhaps these not good days are not something to be avoided, but
instead something to be treated as a learning opportunity.  Much like
young kids need to learn that fire is hot for themselves.

> However the reality of maintaining a separate minimal toolset for
> system bring-up is that it cannot be reliably done without constant
> and pervasive testing; and the very best (and perhaps only) way to
> achieve this, especially in any smaller open-source project, is for
> everyone to use it that way as much of the time as possible.  I say
> this from decades long experience of slowly moving systems to having
> just one partition for both root and /usr and then on occasion testing
> with separate root and /usr, and every time I do this testing I find
> dependencies have crept in on something in /usr for basic booting.
> (and that's even when I base my system on a platform that still tries
> hard to maintain this separation of root and /usr!)

I have a different conundrum regarding */bin.  Why do I need nine
different (s)bin directories in my path?  I -- possibly naively --
believe that we have the technology to have all commands in /one/
directory, namely /bin.

Quickly after that thought, I realize that I want different things in my
path than other people do.  So I end up with custom /bin directories.
Which usually ends up with sym-links that reference variables or custom
mounts (possibly via auto-mount applying some logic).

> BTW, I think it was Sun that first did some of this merging of root
> and /usr a very long time ago.

Agreed.  Though I'm far from authoritative.

--
Grant. . . .
unix || die
On Tue, Feb 23, 2021 at 11:28:13AM -0700, Grant Taylor via TUHS wrote:
>
> What that fails to take into account is if the system actually uses an
> initrd / initramfs or not.  Many of the systems I maintain do /not/ use an
> initrd / initramfs.  Thus the systems have /some/ actual /need/ to be able
> to bring up a minimal system to repair file system problems.  Even if the so
> called problem is simply that the extent file system needs an fsck with
> human interaction (time since last check and / or maximum number of mounts).

There are two reasons why you might want to have an initramfs.  One is
that you are using a distribution-provided generic kernel, in which
case the device driver / kernel modules needed to access the root file
system need to be loaded from *somewhere*, and that's the in-memory
initramfs/initrd.

The other reason is how you run fsck on the root file system.  That
won't be needed if hardware is perfect, the kernel is bug-free(tm),
and the root file system has journalling support, as all modern file
systems tend to have.  However, if it is needed, there are two ways to
do this.  One is the traditional way, which is to mount the root file
system read/only, repair the file system, and if any changes were
required to the root file system, force a reboot; otherwise, remount
the root file system read-write, and proceed.

The other way of doing this is to include the fsck program in the
initramfs, and run fsck on the root file system before it is mounted.
Now you never have to worry about rebooting if any changes were made,
since the root file system wasn't mounted and so there is no danger of
invalid metadata being cached in memory.

That being said, it's certainly possible to skip using an initramfs;
it's generally not required, and if you're building your own kernel,
with the device drivers you need for your hardware compiled into the
kernel, most distributions will support skipping the initramfs.
(Debian certainly does, in any case.)

> */boot still tends to be its own file system on Linux, mostly because
> that's where the initrd / initramfs images live, which contain drivers for
> more fancy things (software RAID, LVM, ZFS, SAN, etc.) which are needed to
> bring up / (root).

/boot needs to exist due to limitations to the firmware and/or boot
loader being used.  If the boot loader is using the legacy PC BIOS
interfaces to read the kernel and initial ramdisk/file system, then
those files need to be in a low-numbered LBA disk space, due to legacy
BIOS/firmware limitations.  It could also be a concern if you are
using some exotic file system (say, ZFS), and the bootloader doesn't
support that file system due to copyright licensing incompatibilities,
or the boot loader just not supporting that bleeding-edge file system.
In that case, you might have to keep /boot as an ext4 file system.

Other than that, there is no reason why /boot needs to be its own file
system, except that most installers will create one just because it's
simpler to use the same approach for all cases, even if it's not
needed for a particular use case.

					- Ted

P.S.  Oh, and if you are using UEFI, you might need to have yet
another file system which is a Microsoft FAT file system, typically
mounted as /boot/efi, to keep the UEFI firmware happy....
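The two fsck strategies described above differ mainly in what has to happen after a repair. As a rough sketch only: the bit values below are the exit-status bits documented in the fsck(8) manual page, while the function name `next_step`, the returned action strings, and the policy itself are assumptions made for illustration, not what any particular init system does.

```python
# Exit-status bits documented in fsck(8); the status is a bitwise OR of these.
FSCK_ERRORS_CORRECTED = 1
FSCK_REBOOT_NEEDED = 2
FSCK_ERRORS_UNCORRECTED = 4
FSCK_OPERATIONAL_ERROR = 8
FSCK_USAGE_ERROR = 16
FSCK_CANCELED = 32
FSCK_LIBRARY_ERROR = 128

# Any of these means the automatic path cannot safely continue.
NEEDS_HUMAN = (
    FSCK_ERRORS_UNCORRECTED
    | FSCK_OPERATIONAL_ERROR
    | FSCK_USAGE_ERROR
    | FSCK_CANCELED
    | FSCK_LIBRARY_ERROR
)


def next_step(status: int, root_mounted_readonly: bool) -> str:
    """Pick the boot action after fsck has run on the root file system.

    root_mounted_readonly=True models the traditional approach (root is
    mounted read-only while being checked); False models running fsck
    from an initramfs before root is mounted at all.
    """
    if status & NEEDS_HUMAN:
        return "manual"  # hypothetical: drop to a maintenance shell
    changed = bool(status & (FSCK_ERRORS_CORRECTED | FSCK_REBOOT_NEEDED))
    if root_mounted_readonly:
        # Cached metadata may be stale after a repair, so reboot.
        return "reboot" if changed else "remount-rw"
    # Nothing was cached, so a repaired file system can simply be mounted.
    return "mount"
```

For example, `next_step(1, True)` gives `"reboot"` while `next_step(1, False)` gives `"mount"`, which is the saving the initramfs approach buys. Both paths still end in `"manual"` when errors are left uncorrected.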
The recent discussions on the TUHS list of whether /bin and /usr/bin
are different, or symlinked, brought to mind the limited disk and tape
sizes of the 1970s and 1980s.  Especially the lower-cost tape
technologies had issues with correct recognition of an end-of-tape
condition, making it hard to span a dump across tape volumes, and
strongly suggesting that directory tree sizes be limited to what could
fit on a single tape.

I made an experiment today across a broad range of operating systems
(many with multiple versions in our test farm), and produced these two
tables, where version numbers are included only if the O/S changed
practices:

------------------------------------------------------------------------
Systems with /bin a symlink to /usr/bin (or both to yet another common
directory) [42 major variants]:

	ArchLinux		Kali			RedHat 8
	Arco			Kubuntu 19, 20		Q4OS
	Bitrig			Lite			ScientificLinux 7
	CentOS 7, 8		Lubuntu 19		Septor
	ClearLinux		Mabox			Solaris 10, 11
	Debian 10, 11		Mageia			Solydk
	Deepin			Manjaro			Sparky
	DilOS			Mint 20			Springdale
	Dyson			MXLinux 19		Ubuntu 19, 20, 21
	Fedora			Neptune			UCS
	Gnuinos			Netrunner		Ultimate
	Gobolinux		Oracle Linux		Unleashed
	Hefftor			Parrot 4.7		Void
	IRIX			PureOS			Xubuntu 19, 20

------------------------------------------------------------------------
Systems with separate /bin and /usr/bin [60 major variants]:

	Alpine			Hipster			OS108
	AltLinux		KaOS			Ovios
	Antix			KFreeBSD		PacBSD
	Bitrig			Kubuntu 18		Parrot 4.5
	Bodhi			LibertyBSD		PCBSD
	CentOS 5, 6		LMDE			PCLinuxOS
	ClonOS			Lubuntu 17		Peppermint
	Debian 7--10		LXLE			Salix
	DesktopBSD		macOS			ScientificLinux 6
	Devuan			MidnightBSD		SlackEX
	DragonFlyBSD		Mint 18--20		Slackware
	ElementaryOS		MirBSD			Solus
	FreeBSD 9--13		MXLinux 17, 18		T2
	FuryBSD			NetBSD 6-1010		Trident
	Gecko			NomadBSD		Trisquel
	Gentoo			OmniOS			TrueOS
	GhostBSD		OmniTribblix		Ubuntu 14--18
	GNU/Hurd		OpenBSD			Xubuntu 18
	HardenedBSD		OpenMandriva		Zenwalk
	Helium			openSUSE		Zorinos

------------------------------------------------------------------------

Some names appear in both tables, indicating a transition from
separate directories to symlinked directories in more recent O/S
releases.

Many of these system names are spelled in mixed lettercase, and if
I've botched some of them, I extend my apologies to their authors.

Some of those systems run on multiple CPU architectures, and our test
farm exploits that; however, I found no instance of the CPU type
changing the separation or symbolic linking of /bin and /usr/bin.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe@math.utah.edu  -
- 155 S 1400 E RM 233                       beebe@acm.org  beebe@computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
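The message above does not say how each system was probed, so the following is just one plausible way to run the same check, written as a small sketch. The function name `classify_bin_layout` and its return strings are hypothetical. It uses `os.path.samefile`, which follows symlinks, so it also covers the "both to yet another common directory" case.

```python
import os


def classify_bin_layout(root: str = "/") -> str:
    """Classify how bin and usr/bin relate beneath the given root.

    Returns "merged" when both names resolve to the same directory,
    "separate" when they are distinct directories, and "missing" when
    either one is absent.
    """
    bin_path = os.path.join(root, "bin")
    usr_bin_path = os.path.join(root, "usr", "bin")
    if not (os.path.isdir(bin_path) and os.path.isdir(usr_bin_path)):
        return "missing"
    # samefile() compares device and inode after following symlinks.
    if os.path.samefile(bin_path, usr_bin_path):
        return "merged"
    return "separate"
```

Calling `classify_bin_layout()` with no argument inspects the running system. Passing another root is only reliable when the symlinks under it are relative, since an absolute link such as /bin -> /usr/bin would resolve against the host's own root.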
On 2/23/21 11:57 AM, Theodore Ts'o wrote:
> There are two reasons why you might want to have an initramfs.

Rather than getting into a tit for tat debate, I'll agree that we have
both proposed reasons why you /might/ want to use an initramfs.  The
operative words are "you" and "might".  Each person probably wants
slightly different things.  It's far from one size fits all.

> The other reason is how you run fsck on the root file system.

The same way that it's been done for years.  Root is mounted read only
and you run fsck to repair damage.  If it's severe damage, you will
likely need to boot off of something else.

I've had both situations happen multiple times.  The quintessential max
mount count / max days since last check have happily been fixed while
root was mounted read only.

> That won't be needed if hardware is perfect, the kernel is
> bug-free(tm), and the root file system has journalling support,
> as all modern file systems tend to have.

I wouldn't bet on that.  I've had to run fsck on journalling file
systems at boot / mount time multiple times.

> However, if it is needed, there are two ways to do this.  One is the
> traditional way, which is to mount the root file system read/only,
> repair the file system, and if any changes were required to the root
> file system, force a reboot; otherwise, remount the root file system
> read-write, and proceed.

This is what happened in /most/ of the cases that I've needed to
interact with fsck of a root file system.

> The other way of doing this is to include the fsck program in the
> initramfs, and run fsck on the root file system before it is mounted.
> Now you never have to worry about rebooting if any changes were made,
> since the root file system wasn't mounted and so there is no danger
> of invalid metadata being cached in memory.

Oh ... I would definitely *NOT* say /never/.  There are ways that a file
system can get corrupted that will cause fsck to stop and require manual
intervention.

> That being said, it's certainly possible to skip using an initramfs;
> it's generally not required, and if you're building your own kernel,
> with the device drivers you need for your hardware compiled into
> the kernel, most distributions will support skipping the initramfs.
> (Debian certainly does, in any case.)

And if you're building a minimal kernel, removing support for modules
and what's required for swing-root saves space.  ;-)

> /boot needs to exist due to limitations to the firmware and/or boot
> loader being used.

Not necessarily.  E.g. one single partition containing /boot and / (root).

> If the boot loader is using the legacy PC BIOS interfaces to read the
> kernel and initial ramdisk/file system, then those files need to be in
> a low-numbered LBA disk space, due to legacy BIOS/firmware limitations.

So make sure said /boot & / (root) partition stays within that
limitation.  I don't recall exactly what that is.  I think it's ~8 GB.
But it's definitely possible to have small installations in that space.

> It could also be a concern if you are using some exotic file system
> (say, ZFS), and the bootloader doesn't support that file system due
> to copyright licensing incompatibilities, or the boot loader just not
> supporting that bleeding-edge file system.  In that case, you might
> have to keep /boot as an ext4 file system.

That scenario is definitely a possibility.  Though such scenarios are
not a requirement and tend to be antithetical to minimal installations,
like the type that would be used in embedded devices and possibly copied
to ROM as indicated in a different post.

> Other than that, there is no reason why /boot needs to be its own
> file system, except that most installers will create one just because
> it's simpler to use the same approach for all cases, even if it's
> not needed for a particular use case.

As Steve Gibson is famous for saying:  the tyranny of the default.

> P.S.  Oh, and if you are using UEFI, you might need to have yet
> another file system which is a Microsoft FAT file system, typically
> mounted as /boot/efi, to keep the UEFI firmware happy....

Yes, the file system needs to exist.  But that's part of the firmware,
not the operating system.  I also question if that FAT file system needs
to be mounted or not.  -- I don't know how GRUB et al. deal with a
non-mounted UEFI file system.

But even if it does need to be mounted, you can still get away with two
partitions; / (root) and /boot/efi.  I suspect UEFI does away with the
LBA issue you mentioned.

--
Grant. . . .
unix || die
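The "~8 GB" figure recalled above matches the classic BIOS INT 13h cylinder/head/sector addressing limit, assuming that is the limitation in question. A small worked calculation, with the function name `chs_limit_bytes` being hypothetical: the interface allows 10 bits of cylinder (1024), 8 bits of head (256), and 6 bits of sector numbered from 1 (63), with 512-byte sectors.

```python
SECTOR_BYTES = 512


def chs_limit_bytes(cylinders: int = 1024, heads: int = 256,
                    sectors_per_track: int = 63) -> int:
    """Largest disk offset addressable with the given CHS geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_BYTES


limit = chs_limit_bytes()
print(limit)               # bytes addressable through INT 13h CHS
print(limit / 10**9)       # the same figure in decimal gigabytes
print(limit / 2**30)       # and in binary gibibytes
```

That comes to 8,455,716,864 bytes, or 7.875 GiB. Many BIOSes exposed only 255 heads, which lowers the figure slightly; `chs_limit_bytes(heads=255)` shows that variant.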
To add to the inventory below:

	Dell SVR4		/bin is a symlink to /usr/bin
	NEXTSTEP/486 3.3	/bin and /usr/bin are separate

On 2/23/2021 1:37 PM, Nelson H. F. Beebe wrote:
> I made an experiment today across a broad range of operating systems
> (many with multiple versions in our test farm), and produced these two
> tables, where version numbers are included only if the O/S changed
> practices:
>
> [...]

--
voice: +1.512.784.7526       e-mail: sauer@technologists.com
fax: +1.512.346.5240         Web: https://technologists.com/sauer/
Facebook/Google/Skype/Twitter: CharlesHSauer
On Tue, 23 Feb 2021 at 16:03, Charles H. Sauer <sauer@technologists.com> wrote:

> To add to the inventory below:
> Dell SVR4             /bin is a symlink to /usr/bin
> NEXTSTEP/486 3.3      /bin and /usr/bin are separate
>
> On 2/23/2021 1:37 PM, Nelson H. F. Beebe wrote:
> > [...]

Solaris /bin was a symlink to /usr/bin as early as 2.5.1.  It's also worth
pointing out that NetBSD, in addition to having a separate /bin and
/usr/bin, has /rescue which has a large selection of statically linked
binaries.

-Henry
At Tue, 23 Feb 2021 12:37:57 -0700, "Nelson H. F. Beebe" <beebe@math.utah.edu> wrote:
Subject: Re: [TUHS] Abstractions
>
> The recent discussions on the TUHS list of whether /bin and /usr/bin
> are different, or symlinked, brought to mind the limited disk and tape
> sizes of the 1970s and 1980s.  Especially the lower-cost tape
> technologies had issues with correct recognition of an end-of-tape
> condition, making it hard to span a dump across tape volumes, and
> strongly suggesting that directory tree sizes be limited to what could
> fit on a single tape.

Hmmmm... you may just be mixing up the names of the archive tools you
mean, but on the other hand maybe you don't know that "dump" does whole
filesystems, not just sub-directories.

That of course doesn't take anything away from what you were saying
about making sure you could do a full dump onto a single tape with some
types of less high-end and high-quality tape devices.

But that's a "newer" problem.  Original Unix dump(1m) had no trouble
asking for additional tapes to be mounted when the filesystem required
multiple tapes.  So it has nothing to do with the legacy of the original
root and /usr split.

> I made an experiment today across a broad range of operating systems
> (many with multiple versions in our test farm), and produced these two
> tables, where version numbers are included only if the O/S changed
> practices:

An interesting compilation, but sadly (to me at least) it is mostly a
mess of GNU/Linux which, rightly or wrongly, I categorize all under one
(extremely opaque) umbrella.

BTW, when I said "long ago" for Solaris, I meant a REALLY long time ago.

/bin has been a symlink to /usr/bin since Solaris 2.0 and yet /usr
could/can still be a separate filesystem on a Solaris installation.

This is accomplished by putting everything necessary to boot the system
up to the point where other additional filesystems can be mounted using
just the programs found in /sbin.

Of course this wasn't done smoothly and completely all in one go.  IIRC
/sbin/sh didn't exist until Solaris-9, and (also IIRC) it is just a copy
of /usr/bin/sh.

So Sun pushed everything in /bin to /usr/bin, then copied a few things
back to /sbin as they found they needed them.  Kind of a half-assed hack
that wasn't well thought out, and had very poor motivations.

I'd forgotten that IRIX was following Solaris on this track.

--
					Greg A. Woods <gwoods@acm.org>
Greg Woods responds to my posting:

>> Hmmmm... you may just be mixing up the names of the archive tools you
>> mean, but on the other hand maybe you don't know that "dump" does whole
>> filesystems, not just sub-directories.

I meant "dump" as a generic verb, not specifically the Unix dump
utility.  Many sites also used tar to back up directory trees: after
all, tar means Tape ARchiver.

>> Original Unix dump(1m) had no trouble asking for additional tapes ...

That was, however, contingent on a reliable signal from the tape unit,
and my strong recollection is that when we moved to various types of
cheap cassette tapes, the end-of-tape indicator was unreliable.  Thus,
we paid attention to both disk and tape sizes.

Today, with 10TB+ on LTO-8 tapes, it isn't an issue for us, and we
also tend to have many different ZFS volumes representing various
parts of the filesystem, allowing different backup and snapshotting
policies.  Besides tapes and snapshots, we also have a live SAN
mirror, and a remote snapshot server, giving plenty of data
replication, and the warm fuzzy feelings from that.

After 20 years of ZFS, I don't recall us ever losing data.  We have
also gone through two generations of major fileserver upgrades and
complete data migrations without service interruptions (except for a
brief interval for each user account to synchronize data on old and
new servers).

--
Nelson H. F. Beebe <beebe@math.utah.edu>
At Tue, 23 Feb 2021 16:15:52 -0500, Henry Bent <henry.r.bent@gmail.com> wrote:
Subject: Re: [TUHS] Abstractions
>
> It's also worth
> pointing out that NetBSD, in addition to having a separate /bin and
> /usr/bin, has /rescue which has a large selection of statically linked
> binaries.

Indeed.

However /rescue is really just a hack to avoid the problems that occur
when basic tools are dynamic-linked.

My vastly preferred alternative is to static-link everything.  Of course
with C libraries these days that means the binaries can be rather large
-- albeit still relatively small in comparison to modern disks.

In any case I've also built NetBSD such that all of the base system
binaries are linked together into one binary (we call this "crunchgen",
but Linux usually calls it "Busybox(tm)").  I decided to put all the bin
directories together into one for the ultimate savings of space and time
and effort, but it would be trivial to keep the root and /usr split for
better managing application-specific embedded systems.

This hard-static-linking of everything into one binary results in a
surprisingly small, indeed very tiny, system.  For i386 (32-bit) it
could probably boot multiuser in about 16MB of RAM.

What I've got so far is a bootable image file of a "complete"
NetBSD-5/i386 system that's just a tiny bit over 7MB.  It contains a
kernel and a ramdisk image with a 12MB filesystem containing a crunchgen
binary with almost everything in it (247 system programs, including all
the networking tools, but no named, and no toolchain, no mailer, and no
manual pages -- not atypical of what was delivered with some commercial
unix systems of days gone by, but of course updated with modern things
like ssh, etc.)

--
					Greg A. Woods <gwoods@acm.org>
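For readers who have not met the "everything in one binary" technique mentioned above: a crunched or multi-call binary is installed under many names (usually hard links), and its single entry point chooses what to run by looking at the name it was invoked as. The sketch below only illustrates that dispatch idea in Python. The applet table and function names are hypothetical, and real crunchgen output is a statically linked C program, not a script.

```python
import os


def applet_echo(args: list[str]) -> str:
    """Join the arguments with spaces, like a bare-bones echo."""
    return " ".join(args)


def applet_basename(args: list[str]) -> str:
    """Return the final path component of the first argument."""
    return os.path.basename(args[0].rstrip("/"))


# One table maps every installed name to the code that implements it.
APPLETS = {
    "echo": applet_echo,
    "basename": applet_basename,
}


def dispatch(argv: list[str]) -> str:
    """Run the applet selected by the name the program was invoked as."""
    name = os.path.basename(argv[0])
    applet = APPLETS.get(name)
    if applet is None:
        raise ValueError(f"{name}: no such applet")
    return applet(argv[1:])
```

Here `dispatch(["/bin/echo", "hello", "world"])` and `dispatch(["/usr/bin/basename", "/usr/bin/"])` run different code from the same "binary". Because only the last component of `argv[0]` matters, the same image works whether its links live in /bin, /usr/bin, or one merged directory.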
On 2/23/21, Greg A. Woods <woods@robohack.ca> wrote:
>
> For one there's a huge amount of deeply embedded lore, human (finger and
> brain) memory, actual code, documentation, and widespread practices that
> use this separation and rely on it, effectively making it a requirement.
>

That is only a justification for keeping the /usr hierarchy around (and
using symlinks/binding to make stuff appear in both places), not for
arbitrarily separating programs and libraries between the two.

>
> However the reality of maintaining a separate minimal toolset for system
> bring-up is that it cannot be reliably done without constant and
> pervasive testing; and the very best (and perhaps only) way to achieve
> this, especially in any smaller open-source project, is for everyone to
> use it that way as much of the time as possible. I say this from
> decades long experience of slowly moving systems to having just one
> partition for both root and /usr and then on occasion testing with
> separate root and /usr, and every time I do this testing I find
> dependencies have crept in on something in /usr for basic booting. (and
> that's even when I base my system on a platform that still tries hard to
> maintain this separation of root and /usr!)
>

With a system-wide package manager a set of basic packages can be
maintained without having an arbitrary separation into root and usr.
The reference distribution of UX/RT will have several nested sets of
packages rather than a separation of binaries between root and usr.
The smallest will be what is included in the supervisor image (the
equivalent of a kernel image and initramfs combined into one), which
will be what is required to mount the system volume. Above that will
be the minimal system, which will be the set of packages required to
boot to a multi-user login. All of this will be in the base system
repository, along with a few other optional groups of packages
(including a full desktop environment). Most optional third-party
application packages will be in a separate repository (like ports or
pkgsrc under BSD, but using the same package manager as the base system
and available by default without any special configuration).

On 2/23/21, Theodore Ts'o <tytso@mit.edu> wrote:
>
> /boot needs to exist due to limitations to the firmware and/or boot
> loader being used. If the boot loader is using the legacy PC BIOS
> interfaces to read the kernel and initial ramdisk/file system, then
> those files need to be in a low-numbered LBA disk space, due to legacy
> BIOS/firmware limitations. It could also be a concern if you are
> using some exotic file system (say, ZFS), and the bootloader doesn't
> support that file system due to copyright licensing incompatibilities,
> or the boot loader just not supporting that bleeding-edge file system.
> In that case, you might have to keep /boot as an ext4 file system.
>

The BIOS addressing limitations only happen with CHS-only BIOSes, which
haven't really been a thing since the mid-to-late 90s. The only reason
to have a separate /boot partition for anything newer than that is
because of bootloader limitations.

On 2/23/21, Grant Taylor via TUHS <tuhs@minnie.tuhs.org> wrote:
>
> I have a different conundrum regarding */bin. Why do I need nine
> different (s)bin directories in my path? I -- possibly naively --
> believe that we have the technology to have all commands in /one/
> directory, namely /bin.
>
> Quickly after that thought, I realize that I want different things in my
> path than other people do. So I end up with custom /bin directories.
> Which usually ends up with sym-links that reference variables or custom
> mounts (possibly via auto-mount applying some logic).
>

UX/RT will solve the issue of different sets of programs in the path in
different user or application contexts with per-process and per-user
namespaces (since fine-grained security will be deeply integrated into
the system and neither on-disk device files nor setuid binaries will
exist, there shouldn't be any security concerns with letting regular
users bind and mount stuff for themselves). $PATH will just be set to
"/bin" in the vast majority of cases.
On Tue, Feb 23, 2021, 7:47 PM Greg A. Woods <woods@robohack.ca> wrote:
> At Tue, 23 Feb 2021 16:15:52 -0500, Henry Bent <henry.r.bent@gmail.com>
> wrote:
> Subject: Re: [TUHS] Abstractions
> >
> > It's also worth
> > pointing out that NetBSD, in addition to having a separate /bin and
> > /usr/bin, has /rescue which has a large selection of statically linked
> > binaries.
>
> Indeed. However /rescue is really just a hack to avoid the problems
> that occur when basic tools are dynamic-linked.
>
> My vastly preferred alternative is to static-link everything.
>
> Of course with C libraries these days that means the binaries can be
> rather large -- albeit still relatively small in comparison to modern
> disks.
>
> In any case I've also built NetBSD such that all of the base system
> binaries are linked together into one binary (we call this "crunchgen",
> but Linux usually calls it "Busybox(tm)"). I decided to put all the bin
> directories together into one for the ultimate savings of space and time
> and effort, but it would be trivial to keep the root and /usr split for
> better managing application-specific embedded systems.
>
> This hard-static-linking of everything into one binary results in a
> surprisingly small, indeed very tiny, system. For i386 (32-bit) it
> could probably boot multiuser in about 16MB of RAM.
>

I booted a FreeBSD/i386 4 system, sans compilers and a few other things,
off a 16MB CF card in the early 2000s. I did both static (one binary) and
dynamic and found dynamic worked a lot better for the embedded system...

I also did an 8MB PoC router and data logger image that was stripped to
the bone.

PicoBSD fit onto a 1.44MB floppy as late as FreeBSD 4 and made a good
firewall...

Warner

> What I've got so far is a bootable image file of a "complete"
> NetBSD-5/i386 system that's just a tiny bit over 7MB. It contains a
> kernel and a ramdisk image with a 12MB filesystem containing a crunchgen
> binary with almost everything in it (247 system programs, including all
> the networking tools, but no named, and no toolchain, no mailer, and no
> manual pages -- not atypical of what was delivered with some commercial
> unix systems of days gone by, but of course updated with modern things
> like ssh, etc.)
>
> --
> Greg A. Woods <gwoods@acm.org>
>
> Kelowna, BC     +1 250 762-7675           RoboHack <woods@robohack.ca>
> Planix, Inc. <woods@planix.com>     Avoncote Farms <woods@avoncote.ca>
>
Some additions:

Systems with /bin a symlink to /usr/bin
  Digital UNIX 4.0
  Tru64 UNIX 5.0 to 5.1B
  HP-UX 11i 11.23 and 11.31

Systems with separate /bin and /usr/bin
  SCO UNIX 3.2 V4.0 to V4.2

--
The more I learn the better I understand I know nothing.
On Tue, Feb 23, 2021 at 01:29:11PM -0700, Grant Taylor via TUHS wrote:
> On 2/23/21 11:57 AM, Theodore Ts'o wrote:
> > There are two reasons why you might want to have an initramfs.
>
> Rather than getting into a tit for tat debate, I'll agree that we have both
> proposed reasons why you /might/ want to use an initramfs. The operative
> words are "you" and "might". Each person probably wants slightly different
> things. It's far from one size fits all.

Sure, I was trying to enumerate the reasons why initramfs, for some
combinations of hardware / configurations, might be necessary.

> > /boot needs to exist due to limitations to the firmware and/or boot
> > loader being used.
>
> Not necessarily. E.g. one single partition containing /boot and / (root).

Sorry, I should have written, "/boot MAY need to exist".

> > Other than that, there is no reason why /boot needs to be its own file
> > system, except that most installers will create one just because it's
> > simpler to use the same approach for all cases, even if it's not needed
> > for a particular use case.
>
> As Steve Gibson is famous for saying; The tyranny of the default.

I wouldn't say that; I'd rather say that if you have a huge
combination of configurations that you have to test, those
configurations which aren't regularly tested will tend to bitrot, or
have odd failures in various error cases. The more corners that you
have, the more corner cases.

And this is where it's all about *who* gets to pay, either via money,
or via their labor, to support these various cases. Weren't people
just complaining, in other TUHS threads, of "bloat" in Linux? Well,
this is how you get bloat. It's just that if it's a feature *you*
want, then it's not bloat, but an essential feature, and if it's not
provided, you whine mightily. And when you have a large number of
enterprise customers paying $$$ to enterprise distribution vendors,
each with their own set of essential features, and where *binary*
backwards compatibility is considered an essential feature, then
that's how you get what others will call "bloat".

I would call this the "Tyranny of Gold", as in the reformulated Golden
Rule, "The ones with the Gold, makes the Rules".

> > P.S. Oh, and if you are using UEFI, you might need to have yet another
> > file system which is a Microsoft FAT file system, typically mounted as
> > /boot/efi, to keep the UEFI firmware happy....
>
> Yes, the file system needs to exist. But that's part of the firmware, not
> the operating system. I also question if that FAT file system needs to be
> mounted or not. -- I don't know how GRUB et al. deal with a non-mounted
> UEFI file system.

GRUB doesn't care. But various system administration utilities that
want to manage the UEFI boot menu (as distinct from the GRUB boot
menu) need to modify the files that are read by the UEFI firmware.
So it's convenient if it's mounted *somewhere*. Also, even if it's
not mounted, it's still a partition that has to be around, and one
reason to keep it mounted is to keep a system administrator from
saying, "hmmm, what's this unused /dev/sda1 partition? I guess I can
use it as an extra swap partition!"

And then the system won't boot, and then they call the enterprise
distro's help desk, and unnecessary calls into the help desk cost
$$$, and distros tend to optimize against unnecessary cost. (Plus lots
of unhappy customers who are down, even if it is their own d*mned
fault, is not good for business.)

> But even if it does need to be mounted, you can still get away with two
> partitions; / (root) and /boot/efi. I suspect UEFI does away with the LBA
> issue you mentioned.

Yes, in another 5 or 10 years, we can probably completely deprecate
the MBR-based boot sequence. At which point there will be another
series of whiners on TUHS a la the complaint that distributions are
dropping support for i386....

But since most TUHS posters aren't paying $$$ to enterprise
distributions, most enterprise distro engineers are going to give
precisely zero f*cks. But hey, if you want to volunteer to provide
the hard work for supporting these configurations to the community
distribution, like Debian, those distros will be happy to accept the
volunteer help. :-)

					- Ted
On 2/24/21 7:14 AM, Theodore Ts'o wrote:
> I wouldn't say that; I'd rather say that if you have a huge combination
> of configurations that you have to test, those configurations which
> aren't regularly tested will tend to bitrot, or have odd failures
> in various error cases. The more corners that you have, the more
> corner cases.

Fair enough.

> I would call this the "Tyranny of Gold", as in the reformulated Golden
> Rule, "The ones with the Gold, makes the Rules".

Being a fan of the golden rule, I would not make, much less use, that
derivation. I think it completely changes the meaning of the spirit
behind the golden rule.

I don't fault your logic. I just dislike where it ended up.

> GRUB doesn't care. But various system administration utilities that
> want to manage the UEFI boot menu (as distinct from the GRUB boot menu)
> need to modify the files that are read by the UEFI firmware.

Valid distinction.

> So it's convenient if it's mounted *somewhere*. Also, even if it's not
> mounted, it's still a partition that has to be around, and one reason
> to keep it mounted is to keep a system administrator from saying,
> "hmmm, what's this unused /dev/sda1 partition? I guess I can use it
> as an extra swap partition!"

I seem to recall hearing about a problem where a rogue rm could
accidentally wipe out part of the UEFI. Maybe it was the contents of
the /boot/efi partition. So, I'd suggest a happy medium of mounting it
Read-Only. That way it's known to be used /and/ it's protected from a
simple rogue rm. It can relatively easily be re-mounted as Read-Write
when necessary. As well as subsequently re-mounted back to Read-Only.

> Yes, in another 5 or 10 years, we can probably completely deprecate
> the MBR-based boot sequence. At which point there will be another
> series of whiners on TUHS a la the complaint that distributions are
> dropping support for i386....

I feel like we've already abandoned i386 as in 80386 (or compatible)
architecture. I think we now require Pentium (586?) or better. At
some point, we'll completely remove 32-bit support from mainstream Linux
distributions, thus requiring something from the 21st century.

> But since most TUHS posters aren't paying $$$ to enterprise
> distributions, most enterprise distro engineers are going to give
> precisely zero f*cks. But hey, if you want to volunteer to provide
> the hard work for supporting these configurations to the community
> distribution, like Debian, those distros will be happy to accept the
> volunteer help. :-)

~chuckle~

--
Grant. . . .
unix || die
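On the read-only suggestion above: a tool that wants to modify the EFI partition would first need to know whether it is currently mounted read-only. A minimal sketch of that check, assuming a POSIX system where `os.statvfs` reports mount flags; the function name is made up, and `/boot/efi` is just the conventional mount point mentioned in the thread.

```python
import os

def is_read_only(mount_point):
    """True if the filesystem holding mount_point is mounted read-only."""
    flags = os.statvfs(mount_point).f_flag
    return bool(flags & os.ST_RDONLY)
```

A management utility could call `is_read_only("/boot/efi")` and, if it returns True, remount read-write, make its change, and remount read-only again. The remount itself needs root and is left out of the sketch.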
On Wed, Feb 24, 2021 at 10:50:03AM -0700, Grant Taylor via TUHS wrote:
> > I would call this the "Tyranny of Gold", as in the reformulated Golden
> > Rule, "The ones with the Gold, makes the Rules".
>
> Being a fan of the golden rule, I would not make, much less use, that
> derivation. I think it completely changes the meaning of the spirit behind
> the golden rule.

Oh, sure. I agree completely that it's 180 degrees from the original
golden rule; it was intended to be a joke. Unfortunately, years of
living in a country where the ones with the Gold really do make all of
the Rules has gotten me to the point where if I don't laugh at it, I
would have to cry....

> I seem to recall hearing about a problem where a rogue rm could accidentally
> wipe out part of the UEFI. Maybe it was the contents of the /boot/efi
> partition. So, I'd suggest a happy medium of mounting it Read-Only. That
> way it's known to be used /and/ it's protected from a simple rogue rm. It
> can relatively easily be re-mounted as Read-Write when necessary. As well
> as subsequently re-mounted back to Read-Only.

So technically it doesn't wipe out UEFI; it just will destroy the
ability to boot the system. (e.g., this is where Grub lives, and if
you delete it, UEFI will no longer be able to launch Grub, and hence,
not boot Linux.) Fortunately, if you have a rescue CD / USB Thumb
drive, it's relatively easy to recover from this.

A rogue rm which deletes /bin (even if /bin is a symlink to /usr/bin,
all of the shell scripts and /etc/passwd entries probably still refer
to /bin/sh) is going to make the system similarly unbootable.

As far as making a system more robust against rogue rm's, I really
like the scheme used by ChromeOS, where the entire file system is not
only read-only, but protected by a cryptographic Merkle Tree such that
if malware attempts to modify it, the system will crash. This is
combined with firmware which will only load a kernel with a valid
digital signature, and the user data is stored on an encrypted file
system mounted on /mnt/stateful_partition and it is the only file
system mounted read/write on a ChromeOS system. It violates a lot of
expectations about where files should live on a "normal" Unix or Linux
system, but it's definitely way more safe and secure.

> I feel like we've already abandoned i386 as in 80386 (or compatible)
> architecture. I think we now require Pentium (586?) or better. At some
> point, we'll completely remove 32-bit support from mainstream Linux
> distributions, thus requiring something from the 21st century.

For now, as far as I know, Debian still supports a 486 (or i386 with
an i387 co-processor, which was my first Linux system). But yes, it
is very likely, absent people showing up to volunteer to support
32-bit userspace at Debian (e.g., ongoing security updates, support
for the i386 build farm, reporting and triaging build failures of
packages on i386, etc.), that the i386 arch will probably get dropped
after the Debian Bullseye release (which will probably happen sometime
in mid-2021 if I had to guess).

I'm not sure there are any 486's around any more, and it's likely most
uses of systems with i386 binaries are on 64-bit processors running in
32-bit mode, so 486 vs 586 is probably not all that important in the
grand scheme of things.

					- Ted
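To make the Merkle tree idea concrete: the file system is cut into blocks, each block is hashed, pairs of hashes are hashed again, and so on up to a single root hash that can be signed or stored somewhere trusted. Changing any block changes the root. The sketch below is a toy illustration only, not ChromeOS's actual on-disk format; SHA-256, the in-memory block list, and the pairing rule for an odd number of hashes are all assumptions made for the example.

```python
import hashlib

def merkle_root(blocks):
    """Return the root hash of a binary hash tree built over the blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [hashlib.sha256(b).digest() for b in blocks]
    while len(level) > 1:
        if len(level) % 2:              # odd count: pair the last hash with itself
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

def verify(blocks, trusted_root):
    """True if the blocks still hash to the trusted root."""
    return merkle_root(blocks) == trusted_root
```

With `root = merkle_root([b"a", b"b", b"c"])` recorded at build time, `verify([b"a", b"x", b"c"], root)` fails because one block was altered. A real implementation verifies each block on read by walking only its path to the root rather than rehashing everything.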
On 2/24/21 11:37 AM, Theodore Ts'o wrote:
> Oh, sure. I agree completely that it's 180 degrees from the original
> golden rule; it was intended to be a joke. Unfortunately, years of
> living in a country where the ones with the Gold really do make all
> of the Rules has gotten me to the point where if I don't laugh at it,
> I would have to cry....

When colleagues would say "you would think" or "I've been thinking" or
the like, the response was "We don't do that! The logo does it for us!"
when dealing with IBM shenanigans.

Again, laugh, lest I cry.

> So technically it doesn't wipe out UEFI; it just will destroy the
> ability to boot the system. (e.g., this is where Grub lives, and if
> you delete it, UEFI will no longer be able to launch Grub, and hence,
> not boot Linux.)

ACK

Either way, it causes someone to have a Bad Day™.

> Fortunately, if you have a rescue CD / USB Thumb drive, it's relatively
> easy to recover from this.

And now we're back towards the start of this (sub)thread of a system
being able to boot strap itself or not.

> A rogue rm which deletes /bin (even if /bin is a symlink to /usr/bin,
> all of the shell scripts and /etc/passwd entries probably still refer
> to /bin/sh) is going to make the system similarly unbootable.

Agreed.

Though I think there is a difference in containing the damage to the OS
vs going beyond it and damaging the firmware configuration.

> As far as making a system more robust against rogue rm's, I really
> like the scheme used by ChromeOS, where the entire file system is
> not only read-only, but protected by a cryptographic Merkle Tree
> such that if malware attempts to modify it, the system will crash.
> This is combined with firmware which will only load a kernel with a
> valid digital signature, and the user data is stored on an encrypted
> file system mounted on /mnt/stateful_partition and it is the only
> file system mounted read/write on a ChromeOS system. It violates
> a lot of expectations about where files should live on a "normal"
> Unix or Linux system, but it's definitely way more safe and secure.

I've not looked at Chrome OS or how it does things because of my dislike
for actually /using/ it. However, it sounds like it's worth popping the
hood and looking at things.

> For now, as far as I know, Debian still supports a 486 (or i386 with
> an i387 co-processor, which was my first Linux system). But yes,
> it is very likely, absent people showing up to volunteer to support
> 32-bit userspace at Debian (e.g., ongoing security updates, support
> for the i386 build farm, reporting and triaging build failures of
> packages on i386, etc.), that the i386 arch will probably get dropped
> after the Debian Bullseye release (which will probably happen sometime
> in mid-2021 if I had to guess).

I don't know how quickly 32-bit will disappear. I think the embedded
market and other non-i386 32-bit platforms will likely keep 32-bit code
around for a while yet. At least user space application code. Maybe
the i386 kernel code will languish ~> bit rot. Or worse, get in the way
of maintaining 64-bit code and thereby be ejected.

> I'm not sure there are any 486's around any more, and it's likely most
> uses of systems with i386 binaries are on 64-bit processors running
> in 32-bit mode, so 486 vs 586 is probably not all that important in
> the grand scheme of things.

¯\_(ツ)_/¯

--
Grant. . . .
unix || die
At Tue, 23 Feb 2021 20:20:55 -0700, Warner Losh <imp@bsdimp.com> wrote:
Subject: Re: [TUHS] Abstractions
>
> I booted a FreeBSD/i386 4 system, sans compilers and a few other things,
> off a 16MB CF card in the early 2000s. I did both static (one binary) and
> dynamic and found dynamic worked a lot better for the embedded system...

I guess it may depend on your measure of "better"?

With a single static-linked binary on a modern demand paged system with
shared text pages, the effect is that almost all instructions for any
and all programs (and of course all libraries) are almost always paged
in at any given time. The result is that program startup requires so
few page-in faults that it appears to happen instantaneously.

My little i386 image feels faster at the command line (e.g. running on
an old Soekris box, even when the root filesystem is on a rather slow
flash drive) than on any of the fastest non-static-linked systems I've
ever used because of this -- that is of course until it is asked to do
any actual computing or other I/O operations. :-)

So, in an embedded system there will be many influencing factors,
such as how many exec()s there are during normal operations.

For machines with oodles of memory and very fast and large SSDs (and
using any kernel with a decently tuneable paging system) one can simply
static-link all binaries separately and achieve similar results, at
least for programs that are run relatively often.

For example the build times of a full system build of, e.g. NetBSD, with
a fully static-linked host system and toolchain are remarkably lower
than on a fully dynamic-linked system since all the extra processing
(and esp. any extra I/Os) done by the "stupid" dynamic linker (i.e. the
one that's ubiquitous in modern unixy systems) are completely and
forever eliminated. I haven't even measured the difference in years now
because I find fully dynamic-linked systems too painful to use for
intensive development of large systems.

Taking this to the opposite extreme one need only use modern macOS on a
machine with an older spinning-rust hard drive that has a loud seek arm
to hear and feel how incredibly slow even the simplest tasks can be,
e.g. typing "man man" after a reboot or a few days of not running "man".
This is because on top of the "stupid" dynamic linker that's needed to
start the "man" program, there's also a huge stinking pile of additional
wrappers that has been added to all of the toolchain command-line tools
that require doing even more gratuitous I/O operations (as well as
running perhaps millions more gratuitous instructions) for infrequent
invocations (luckily these wrappers seem to cache some of the most
expensive overhead). (note: "man" is not in the same boat as, e.g. the
toolchain progs, and I'm not quite sure why it churns so much on first
invocations)

My little static-linked i386 system can run "man man" several (many?)
thousand times before my old iMac can display even the first line of
output. And that's for a simple small program -- just imagine the
immense atrocities necessary to run a program that links to several
dozen libraries (e.g. the typical GUI application like a web browser,
with the saving grace that we don't usually restart browsers in a loop
like we restart compilers; but, e.g. /usr/bin/php on macOS links to 21
libraries, and even the linker (ld) needs 7 dynamic libraries).

BTW, a non-stupid dynamic linker would work the way Multics did (and to
some extent I think that's more how dynamic linking worked in AT&T UNIX
(SysVr3.2) on the 3B2s), but such things are so much more complicated in
a flat address space. Pre-binding, such as I think macOS and IRIX do
(and maybe can be done with the most modern binutils), is somewhat like
Multics "bound segments" (though still less flexible and perhaps less
performant).

--
Greg A. Woods <gwoods@acm.org>

Kelowna, BC     +1 250 762-7675           RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>     Avoncote Farms <woods@avoncote.ca>
On Wed, 24 Feb 2021, Theodore Ts'o wrote:
> On Wed, Feb 24, 2021 at 10:50:03AM -0700, Grant Taylor via TUHS wrote:
>> Being a fan of the golden rule, I would not make, much less use, that
>> derivation. I think it completely changes the meaning of the spirit behind
>> the golden rule.
>
> Oh, sure. I agree completely that it's 180 degrees from the original
> golden rule; it was intended to be a joke. Unfortunately, years of
> living in a country where the ones with the Gold really do make all of
> the Rules has gotten me to the point where if I don't laugh at it, I
> would have to cry....

I first heard this form used in the movie "Aladdin" (the 1992 Disney one,
with Robin Williams).

>> I seem to recall hearing about a problem where a rogue rm could accidentally
>> wipe out part of the UEFI. Maybe it was the contents of the /boot/efi
>> partition. So, I'd suggest a happy medium of mounting it Read-Only. That
>> way it's known to be used /and/ it's protected from a simple rogue rm. It
>> can relatively easily be re-mounted as Read-Write when necessary. As well
>> as subsequently re-mounted back to Read-Only.

<snip>

> As far as making a system more robust against rogue rm's, I really
> like the scheme used by ChromeOS, where the entire file system is not only
> read-only, but protected by a cryptographic Merkle Tree such that if
> malware attempts to modify it, the system will crash. This is
> combined with firmware which will only load a kernel with a valid
> digital signature, and the user data is stored on an encrypted file
> system mounted on /mnt/stateful_partition and it is the only file
> system mounted read/write on a ChromeOS system. It violates a lot of
> expectations about where files should live on a "normal" Unix or Linux
> system, but it's definitely way more safe and secure.

It may not be as much of a protection, but I replaced the system rm on my
Debian with one based on 4.4BSD (since I already had the code lying
around) to which I added a bit of protection against attempts to "rm -rf
/" after a worm got in and ran an obfuscated version of that...thankfully
it didn't run as the superuser.

I do get occasional "invalid switch" errors from it while using apt, so
apt probably uses a GNUism (since afaict, the code I used was strictly
conformant to POSIX). Otherwise, it hasn't caused any issues.

-uso.
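The message does not say how the guard in that rm was written, so the following is only a guess at the general shape of such a check, sketched in Python rather than C. The point is to compare what the operand resolves to, not how it is spelled, so that obfuscated forms such as "/./" are caught too. Both function names are hypothetical.

```python
import os

def is_root(path):
    """True if path names the same directory as "/", however it is spelled."""
    try:
        return os.path.samefile(path, "/")
    except OSError:              # nonexistent or unreadable operand
        return False

def refused_operands(operands):
    """Return the operands a guarded rm would refuse to remove."""
    return [p for p in operands if is_root(p)]
```

A guarded rm would call `refused_operands()` on its arguments before recursing and bail out if the list is non-empty. This does nothing for `rm -rf /*`, where the shell has already expanded the operands into the individual top-level directories.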
Steve Nickolas wrote in
 <alpine.DEB.2.21.2102241520550.15020@sd-119843.dedibox.fr>:
 |On Wed, 24 Feb 2021, Theodore Ts'o wrote:
 ...
 |> As far as making a system more robust against rogue rm's, I really
 |> like the scheme used by ChromeOS, where the entire file system is not only
 |> read-only, but protected by a cryptographic Merkle Tree such that if
 |> malware attempts to modify it, the system will crash. This is
 |> combined with firmware which will only load a kernel with a valid
 |> digital signature, and the user data is stored on an encrypted file
 |> system mounted on /mnt/stateful_partition and it is the only file
 |> system mounted read/write on a ChromeOS system. It violates a lot of
 |> expectations about where files should live on a "normal" Unix or Linux
 |> system, but it's definitely way more safe and secure.
 |
 |It may not be as much of a protection, but I replaced the system rm on my
 |Debian with one based on 4.4BSD (since I already had the code lying
 |around) to which I added a bit of protection against attempts to "rm -rf
 |/" after a worm got in and ran an obfuscated version of that...thankfully
 |it didn't run as the superuser.
 |
 |I do get occasional "invalid switch" errors from it while using apt, so
 |apt probably uses a GNUism (since afaict, the code I used was strictly
 |conformant to POSIX). Otherwise, it hasn't caused any issues.

Just this week i finished my move from BSD compatibility to plain
Linux-only (which you seem to run) for my "web" and my "web with
credentials" user accounts; the accounts are gone now, instead
i as "i" execute the corresponding overlays. pstree for example now
says

  [sudo..] box-browse.sh---box-browse.sh---unshare---
    su---.box-browse-gui-+-firefox-bin-+-Web Content

when i browse totally boxed and unprivileged. (Still not CPU and
memory restricted, but other than that.) The / root is the lower
level of an overlayfs, the upper level is a tmpfs that may not use
more than 5 percent of RAM. It has its own minimal /dev (with
audio even) and has read/write access to one shared folder.
Ditto with credentials, but that runs in the global network
namespace, whereas the unprivileged one even runs isolated from
that.

It is a bit messy if you want to be portable to Linux
distributions which use busybox unshare etc., because there you
need to use chroot(1) yourself, and therefore mount /proc also
yourself thereafter (ie unshare(1)'s --mount-proc is effectively
useless). Also it would be nice to be able to execute a few
commands before you switch aka map user and group IDs in the
containment (if you do so). But for open source software the
answer there usually is "shut up and hack", thus.

Of course with this approach the containers need to live in the
same X11 session, therefore the one mounts only /tmp/.X11-unix (it
is tremendous that Linux can "mount" a normal directory now!), the
other just the plain /tmp.

So an rm -f could destroy the shared folder (it lives on
a filesystem with snapshot support though). For the credential
account it could even wipe /tmp/ and the --bind mounted .mozilla
encfs that is in the containment there (but ditto, plus specific
backups).

I have not looked at overlayfs code, but i think the whiteouts of
the upper layer will be saved in the "work" layer, so an rm -rf /
could possibly even not finish because the 5 percent RAM could be
exceeded earlier? Happy to (un)share ~150 lines sh(1) script.
Yes Mr. Cole, thanks for this work on overlay (less so union)
filesystems, it is tremendous!

(P.S.: that .box-browse-gui.sh is a condom to prevent that
firefox-bin as compiled by Mozilla locks me out of the system.
Have seen this twice already when browsing serious German and
Austrian magazines, needing a reboot. So i now have my browser
session protected by a shell guard, and my window manager
menu has a "TOUCH" entry. If the timestamp of the touch file
becomes older than 300 seconds i hear the first gong of Big Ben
and need to touch .. after the fifth a "kill -TERM -1" happens.)

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
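The shell guard described in the P.S. is essentially a dead-man's switch driven by a file timestamp. Since the script itself was not posted, here is a guess at the core logic, written in Python for illustration. The 300 seconds and the five gongs come from the message; the function names, the reset-on-touch behaviour, and the reading that the kill follows the fifth gong on the next check are assumptions.

```python
import os
import time

LIMIT_SECONDS = 300   # from the message
MAX_GONGS = 5         # from the message

def is_stale(touch_file, now=None, limit=LIMIT_SECONDS):
    """True if touch_file was last modified more than `limit` seconds ago."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(touch_file) > limit

def step(stale, gongs, max_gongs=MAX_GONGS):
    """Advance the guard by one check; return (action, new gong count)."""
    if not stale:
        return "ok", 0              # a fresh touch resets the count (assumed)
    if gongs < max_gongs:
        return "gong", gongs + 1    # warn the user
    return "kill", gongs            # the fifth gong has already sounded
```

A guard loop would sleep, call `is_stale()` on the touch file, feed the result to `step()`, and then play a sound or signal the session depending on the action returned. How long the original waits between gongs is not stated, so the loop timing is left out.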
On Wed, Feb 24, 2021 at 11:48:28AM -0700, Grant Taylor via TUHS wrote:
> I've not looked at Chrome OS or how it does things because of my dislike for
> actually /using/ it. However, it sounds like it's worth popping the hood
> and looking at things.

If you don't like using Chromebooks, the same scheme is used for
Google's Container Optimized OS (intended for use in cloud VMs
running Docker images):

   Container-Optimized OS is an operating system image for your
   Compute Engine VMs that is optimized for running Docker
   containers. With Container-Optimized OS, you can bring up your
   Docker containers on Google Cloud Platform quickly, efficiently,
   and securely. Container-Optimized OS is maintained by Google and
   is based on the open source Chromium OS project.

   https://cloud.google.com/container-optimized-os/docs

Cheers,

					- Ted
On Mon, 22 Feb 2021, Warren Toomey wrote:
>> That's how we ran our RK-05 11/40s since Ed 5... Good fun writing a
>> DJ-11 driver from the DH-11 source; even more fun when I wrote a UT-200
>> driver from the manual alone (I'm sure that "ei.c" is Out There
>> Somewhere), junking IanJ's driver.
>
> https://minnie.tuhs.org/cgi-bin/utree.pl?file=AUSAM/sys/dmr/ei.c
Nah; that's the IanJ rubbish. Mine was written from scratch (and actually
worked) but I don't think that it ever left UNSW.
-- Dave