From mboxrd@z Thu Jan 1 00:00:00 1970
From: tytso at mit.edu (Theodore Y. Ts'o)
Date: Sun, 18 Nov 2018 16:24:02 -0500
Subject: [COFF] [TUHS] man-page style
In-Reply-To: <20181117200908.2FAF918C073@mercury.lcs.mit.edu>
References: <20181117200908.2FAF918C073@mercury.lcs.mit.edu>
Message-ID: <20181118212402.GD32299@thunk.org>

On Sat, Nov 17, 2018 at 03:09:08PM -0500, Noel Chiappa wrote:
>
> I looked at how Dave Clark was doing it on Multics, and I was green with envy.
> He added, debugged and improved his code _on the running main campus system_,
> sharing the machine with dozens of real users! Try doing that on UNIX
> (although nowadays it's getting there, with loadable kernel stuff - but this
> was in the 70's)!

One of the things that made Multics amazing is that you could replace
a shared library while other processes were using it --- and without
anything crashing.

To achieve this, each data structure has a tag field which identifies
the object type and its length.  So if a library defines an expanded
version of the data structure, the new fields are tacked onto the end
of the data structure and the length field is bumped.  Older callers
of the library might pass in a version of the data structure with the
original length field; hence, the newer fields can't be accessed
without first checking the structure tag.

I stole this idea and used it in Kerberos v5 and Linux's userspace
ext2/3/4 utilities, where we use an error table code --- another
Multics concept --- as the structure tag.
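
A minimal sketch of the tag-plus-length scheme described above.  All of
the names here (obj_tag, point_v1, point_v2, point_sum, OBJ_TYPE_POINT)
are hypothetical and chosen for illustration; this is not the Multics,
Kerberos, or libext2fs code, only one way the idea could look in C:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header placed at the start of every object. */
struct obj_tag {
	uint32_t type;		/* identifies the kind of object */
	uint32_t length;	/* bytes in the structure the caller built */
};

#define OBJ_TYPE_POINT 0x504f494eU	/* made-up type code */

/* Layout compiled into older callers. */
struct point_v1 {
	struct obj_tag tag;
	int32_t x, y;
};

/* Newer layout: z is tacked onto the end, so x and y keep their offsets. */
struct point_v2 {
	struct obj_tag tag;
	int32_t x, y;
	int32_t z;
};

/*
 * Library routine built against v2 that still accepts v1 objects.
 * Stores x + y (+ z when present) in *out.  Returns 0 on success,
 * -1 if the tag is wrong or the object is too short.
 */
int point_sum(const void *obj, long *out)
{
	const struct obj_tag *tag = obj;

	assert(obj != NULL && out != NULL);
	if (tag->type != OBJ_TYPE_POINT)
		return -1;
	if (tag->length >= sizeof(struct point_v2)) {
		const struct point_v2 *p = obj;
		*out = (long)p->x + p->y + p->z;
		return 0;
	}
	if (tag->length >= sizeof(struct point_v1)) {
		const struct point_v1 *p = obj;
		*out = (long)p->x + p->y;
		return 0;
	}
	return -1;
}
```

An old caller fills in sizeof(struct point_v1) as the length and gets
x + y; a caller rebuilt against the newer header fills in
sizeof(struct point_v2) and gets all three fields, with neither side
needing to know which version the other was compiled against.
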
So in the error_table file, we might have:

ec	EXT2_ET_MAGIC_BADBLOCKS_ITERATE,
	"Wrong magic number for badblocks_iterate structure"

And then in each function that uses that structure, there'd be
something like this:

	EXT2_CHECK_MAGIC(iter, EXT2_ET_MAGIC_BADBLOCKS_ITERATE);

Where EXT2_CHECK_MAGIC is defined as:

#define EXT2_CHECK_MAGIC(struct, code) \
	  if ((struct)->magic != (code)) return (code)

(All MIT KerberosV5 and libext2fs structures have a 32-bit unsigned
magic field as the first 4 bytes of the structure.)

This technique was also useful when I needed to add support for 64-bit
block numbers: I could use the structure magic numbers to disambiguate
which version of the object we were using.  Hence unlike some shared
libraries where the magic number has been incremented to indicate an
ABI break every few months, e2fsprogs has not had an ABI break in over
ten years.

This also made it a bit easier to find use-after-free bugs in an era
before valgrind/purify, by the simple expedient of zeroing the magic
field when deallocating an object.

> The security wasn't good, because Multics didn't have set-uid (so that only
> Dave's code would have had access to that state database) - when they later
> productized the code, they used Multics rings to make it secure.

So that's a bit misleading.  Setuid isn't really a good analogue for
protection rings.  The proper analogue is user mode versus kernel mode
in the Unix world.  (Where user mode is, roughly speaking, Multics ring
4, and kernel mode is Multics ring 0 --- the Honeywell hardware had
support for 8 rings, but processes running with less privilege than
ring 4 have so little access that using them isn't terribly practical
for general purpose programs.  Processes running at rings 5 and higher
wouldn't have access to most of what we in the Unix world would call
"the standard POSIX system calls".)

Code running in one ring can transition to more privileged rings via
"gates", which would be the Unix equivalent of a system call.
Hence, a Multics program running at Ring 4 could create its own gates
that would provide an extremely limited set of system services to
programs running at Ring 5.  Those programs wouldn't have access to the
normal system calls, but only to the specified functions exposed via
the Ring 4 gates.  This is sort of like Capsicum, but it's more
powerful --- and it was designed decades before FreeBSD's Capsicum.

> The nice thing was that to call up some subsystem to perform some service for
> you, you didn't have to do IPC and then a process switch - it was a
> _subroutine call_, in the CPU's hardware.

Well, when you call a system call, you don't do a process switch
either.  So when Ring 4 code calls a ring 0 service, you can think of
it as a system call.  It might not have been any slower than a normal
function call, but remember, this is a CISC system.  So another way of
saying things is that normal function calls weren't any faster than a
privilege transition via a system call!

> The 386-Pentium actually had support for many segments, but I gather they are
> in the process of deleting it in the latest machines because nobody's using
> it. Which is a pity, because when done correctly (which it was - Intel hired
> Paul Karger to architect it) it's just what you need for a truly secure system
> (which Multics also had) - but that's another long message.

One unfortunate thing about the 386 VM is that a segment plus offset
gets translated to a 32-bit global virtual address, which is then
translated to a physical address via a single page table.  With
Multics, each segment had its own page table which translated the
segment+offset to a physical address.  With only 32 bits of virtual
address space on the 386, it's not at all clear that aggressive use of
segments a la Multics would have worked terribly well, due to the
internal fragmentation of that 32-bit address space.  So I've talked to
some Multicians at MIT who might quibble with the claim that the 386's
design was "done correctly".
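
To make the address-space pressure concrete, here is a rough sketch of
the first 386 translation step (segment plus offset to a 32-bit linear
address).  The struct and function names are hypothetical, and the
descriptor is heavily simplified: real descriptors also carry type,
privilege, and granularity bits, and the paging step is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical view of a segment descriptor. */
struct segment {
	uint32_t base;		/* start of the segment in the linear space */
	uint32_t limit;		/* highest valid offset within the segment */
};

/*
 * Segment + offset -> 32-bit linear address.  Every segment lands in
 * the same shared linear space; returns false if the offset is past
 * the segment limit.
 */
bool seg_to_linear(const struct segment *seg, uint32_t offset,
		   uint32_t *linear)
{
	assert(seg != NULL && linear != NULL);
	if (offset > seg->limit)
		return false;
	*linear = seg->base + offset;
	return true;
}

/*
 * How many segments fit if each one reserves reserve_bytes of the
 * 4 GiB linear space so that it has room to grow.
 */
uint32_t linear_space_slots(uint32_t reserve_bytes)
{
	assert(reserve_bytes != 0);
	return (uint32_t)((UINT64_C(1) << 32) / reserve_bytes);
}
```

Under these assumptions, reserving 256 MiB of growth room per segment
leaves room for only 16 segments in the whole linear space, whereas a
per-segment page table would not need to carve segments out of one
shared range at all.
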
In any case, since no one really used segments on 32-bit x86, segment
support ended up getting mostly dropped in 64-bit mode.  (The FS and GS
segments still kinda work, mostly to keep Windows happy.  The CS, DS,
ES, and SS segments are basically no-ops in the 64-bit x86 world.)

Which is too bad.  I suspect that with a 64-bit address space,
designing an OS with a Multics-style segmentation architecture might
have been possible.  (But see Rob Pike's "Systems Software Research is
Irrelevant" rant for the argument that even if it was *possible* it was
very unlikely to have happened, so for AMD and Intel to have neutered
segmentation in the x86-64 architecture might have been a well
justified decision.)

Cheers,

					- Ted