* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
@ 2004-12-17 15:31 bmaroshe
0 siblings, 0 replies; 18+ messages in thread
From: bmaroshe @ 2004-12-17 15:31 UTC (permalink / raw)
To: Russ Cox, Fans of the OS Plan 9 from Bell Labs
> You're stuck with the operating system you have,
> not the operating system you'd like to have. If one were
> designing the system from scratch one could always do
> better. Sadly this Coda discussion is about how to deal with
> what's already available on Unix.
So in your opinion it's better to design systems from scratch than to try to bring good ideas to kludgey ancient systems?
boris
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader
@ 2004-12-16 15:08 David Leimbach
  2004-12-16 23:22 ` geoff
  0 siblings, 1 reply; 18+ messages in thread
From: David Leimbach @ 2004-12-16 15:08 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Hmmm, not sure where this all came from on this thread of discussion :).
On Wed, 15 Dec 2004 16:24:47 -0800, geoff@collyer.net <geoff@collyer.net> wrote:
> OS X is in no sense a micro-kernel. The OS X kernel is huge:
>
> ; size /mach_kernel
> __TEXT   __DATA  __OBJC  others  dec      hex
> 3022848  458752  0       643984  4125584  3ef390
>
> and consists of a heavily-hacked Mach (3, I believe) kernel and a
> FreeBSD kernel (with bits from other BSDs), combined into a single
> kernel and running in a single address space. The BSD kernel does not
> run in user mode. Remember that Mach was, as far as I know, the
> largest ``micro-kernel'' ever produced, larger than most or all of its
> contemporary ``macro-kernels'', so that some of us called it a
> ``Machro-kernel''.
It's an OSF Mach3 with "optimizations" :). The kernel is really nothing like FreeBSD. It's more like the BSD from NeXTStep with some Free/Open/Net BSD stuff hacked in. Also you are forgetting IOKit [the C++ framework for device drivers]. The Apple marketing team is just putting rubbish on the internet when it comes to claiming things are based on FreeBSD 5. In fairness, some of the userland applications and command line tools are, in fact, from FreeBSD, but the amount of FreeBSD in XNU [the kernel] and Darwin is exaggerated. Porting things from FreeBSD 5 to Mac OS X, however, is quite painful because you have to deal with IOKit and the hardly FreeBSD-like bsd kernel portion.
>
> I haven't looked very hard (one could check out the mount_* sources
> from the Darwin CVS servers), but mount(2) doesn't seem to have much
> that's new, except for union mounts, which surprised me. I suspect
> that most of the mount_* commands either invoke kernel machinery
> (through the ``type'' argument to mount) or pretend to be NFS servers.
> I've never yet seen a (l)unix system other than late Research Unix
> that made user-mode file servers relatively easy and painless to write
> (though I'd love to be shown a counter-example!). Of course, since
> many (l)unix systems only allow the super-user to mount anything,
> their maintainers may not see much utility in user-mode file servers.
> It's sort of a cascade of vision-failures.
Maybe because people don't know why Plan 9 is better than Unix they think Unix is "the way". Religion often overrides common sense. Do we need more plan 9 "missionaries"? [probably]
DragonflyBSD is working on making the VFS a message passing layer instead of a system call layer, so doing something like 9p is probably already in their grand scheme of development.
http://www.dragonflybsd.org/goals/vfsmodel.cgi
This doesn't help Mac OS X of course.
>
> Also, /sys/src/cmd/upas/README is a little dated:
>
> --rw-rw-r-- M 5174 sys sys 1041 Dec 11 1999 README
>
> I'm not sure if it pre-dates upas/fs, but it describes how to port the
> parts of upas that don't rely on Plan 9 facilities (transport more
> than reading). I ported Plan 9's upas back to Unix while at the labs
> (and also translated it into limbo), but some parts (e.g., upas/fs)
> didn't have an obvious implementation, other than painfully pretending
> to be an NFS server, at least at the time.
>
Might be interesting to see how DragonFlyBSD has come along and if it's possible to implement upas/fs with whatever they've done. Again this doesn't really help Mac OS X. I just think it's interesting.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader
  2004-12-16 15:08 [9fans] Acme mailreader David Leimbach
@ 2004-12-16 23:22 ` geoff
  2004-12-17 4:55 ` [9fans] Acme mailreader - now: User mode filesystems in linux Martin C.Atkins
  0 siblings, 1 reply; 18+ messages in thread
From: geoff @ 2004-12-16 23:22 UTC (permalink / raw)
To: 9fans
I worked at Apple in the BSD group. The XNU non-Mach code was clearly some BSD kernel and I don't really care which. My colleagues told me it started out with NetBSD but that that was eventually dwarfed by FreeBSD with contributions from elsewhere. While I was there, there was talk of dragging in code from the latest FreeBSD, notably the FFS with soft updates; I'm pretty sure that happened. Given that the group was (and probably still is) headed by Jordan Hubbard of FreeBSD fame, I suspect that they're continuing to pull in FreeBSD code and it isn't just hype.
Note too that the XNU BSD code, measured in source lines, is almost exactly as huge as the Mach code, so the volume of *BSD code in XNU is not, in my opinion, exaggerated: it is (or was in 2002) half the kernel source. (I don't remember which side of the fence IOKit was counted against.)
Yes, the XNU kernel details are different from a stock BSD kernel. It co-exists with Mach, after all. Porting graphical applications to native OS X (avoiding X11) is a pain too; Apple do a lot of things their own way, inheriting baggage from the pre-Unix Mac OS and NextStep (netinfo is just the French spelling of `Yellow Pages', ugh).
Nevertheless, I stand by my statement that OS X is in no sense a micro-kernel, and that user-mode file servers will not (as a result of access to a micro-kernel) be easier to implement on OS X than on other (l)unixes.
However, Martin Atkins has revealed the mystery kernel agent: coda. Apparently it's somewhat specialised but lets user-mode file servers catch opens and closes.
Anyone in (l)unixland for a filesystem switch? Research Unix had one ~20 years ago, so it should be mouldy (er, mature) enough to be acceptable to (l)unixland. Throw in mounts by ordinary users and use of 9P as a unifying filesystem protocol (now pretty well aged in Plan 9), and it becomes possible to push lots of code and some hacks out of the kernel, while permitting some new and interesting work.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-16 23:22 ` geoff
@ 2004-12-17 4:55 ` Martin C.Atkins
  2004-12-17 9:54 ` Martin C.Atkins
  2004-12-17 15:44 ` Ronald G. Minnich
  0 siblings, 2 replies; 18+ messages in thread
From: Martin C.Atkins @ 2004-12-17 4:55 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Hi all,
On Thu, 16 Dec 2004 15:22:59 -0800 geoff@collyer.net wrote:
>..
> However, Martin Atkins has revealed the mystery kernel agent: coda.
> Apparently it's somewhat specialised but lets user-mode file servers
> catch opens and closes.
I was going to write a longer message today, anyway :-).
Yes, the kernel-mode agent I use is the Coda filesystem driver, which has been in the stock linux kernel for several years.
For those that don't already know: Coda is a remote filesystem that copes (more or less well) with disconnection from, and reconnection to, the fileserver, thus allowing clients to continue working in the disconnected state. I'm not sure how successful it was at this - I've never tried it - but it sounds like an interesting goal. This goal is also shared by Intermezzo, which was (also) started by Peter Braam - so presumably he felt Coda could be improved upon. However, judging by the News pages on their web sites, more seems to have been happening with Coda than with Intermezzo recently.
Anyway, the interesting thing about both these systems, from the point of view of this discussion, is that the real work is done in a user-mode agent, which communicates with a kernel-mode stub driver. In Coda the user-mode agent is, for some obscure reason, called Venus.
Thus it was only necessary to reverse-engineer the kernel module <-> Venus protocol. This was surprisingly easy: it is documented in some detail on the Coda website, and Pavel Machek's podfuk had already worked out some of the more obscure details (but this wasn't a general-purpose library, didn't do some things I wanted, and was, I thought, messy). The Coda driver is stable enough that I don't remember any hangs/etc, even while I was going through the trial-and-error process!
The user-mode agent simply opens /dev/cfsN, for some N, and reads and writes messages down to the kernel module. My library for this is, as I said, about 1400 lines of Python. Trivial fileserver applications can be as small as 10-20 lines, and faulty fileservers (or library) never crash/hang the kernel (which is how things should be! :-). It is also possible to kill the fileserver, and restart with minimal side effects. It would be easy to make libraries for other languages.
One curiosity, which is both an advantage and a disadvantage depending on what you want to do: the user-mode agent is not involved in - does not even see - reads and writes. When a client opens a file, the user-mode agent makes a file somewhere containing the contents of the "virtual" file, opens it, and writes the file descriptor down into the kernel. The kernel module returns this file descriptor (more or less) to the client, who reads and writes it as a normal file, with no intervention of Coda. When the client closes the file, the kernel driver tells the user-mode agent, which deals with the (possibly new) contents of the file, and might then remove it from the local filesystem.
The advantage of this is that reads/writes happen at the same speed as reads and writes to local files. The disadvantages are that the open has to make a local copy of the entire contents of the file - even if it is very big - and that the agent can't process individual writes as commands, as is common in Plan 9. The user-mode agent might also have to read the file to work out what changed.
However, you can rather easily process open+write+close as a command to the user-mode agent, or have a file whose contents are different every time it is opened. (So you do open+read+close, to read a status value, for example)
Ideally, I'd like the processing of open to be able to decide whether to send a file descriptor down into the kernel, or to receive read/write messages - this seems to have been in previous versions of Coda - such is life!
Re: Intermezzo - as Ron pointed out, its kernel driver could also possibly be used for this purpose. However, it uses hooks into an underlying journalled filesystem - a requirement that I couldn't easily satisfy back when I started this work, and I wasn't sure that the kernel-user space interface was so easy to apply to my purpose. However, it might allow one to avoid the disadvantages mentioned in the last paragraph.
I've been meaning to opensource the library for a while now - but I'd like to clean it up in a few places, before a proper release. This interest might be the spur to make me get around to it...
>...
> 9), and it becomes possible to push lots of code and some hacks out of
> the kernel, while permitting some new and interesting work.
No disagreements there!
Martin
--
Martin C. Atkins martin_ml@parvat.com
Parvat Infotech Private Limited http://www.parvat.com{/,/martin}
^ permalink raw reply [flat|nested] 18+ messages in thread
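The open/close flow Martin describes (the agent materialises a whole local copy at open, never sees reads or writes, and only looks at the result at close) can be modelled in a few lines of Python. This is a toy sketch only: it does not speak the Coda kernel protocol or touch /dev/cfsN, and the names `ContainerFileAgent`, `handle_open` and `handle_close` are made up for illustration.

```python
import os
import tempfile


class ContainerFileAgent:
    """Toy model of the open/close flow (hypothetical, not the Coda protocol)."""

    def __init__(self):
        self.virtual = {}     # virtual path -> current contents (bytes)
        self.containers = {}  # virtual path -> path of the local copy
        self.commands = []    # (path, contents) pairs noticed at close time

    def handle_open(self, vpath):
        # On open, write the entire virtual file out as a local file.
        fd, local = tempfile.mkstemp()
        with os.fdopen(fd, "wb") as f:
            f.write(self.virtual.get(vpath, b""))
        self.containers[vpath] = local
        # A real agent would pass a file descriptor down to the kernel;
        # here the local path stands in for it.
        return local

    def handle_close(self, vpath):
        # On close, read back the (possibly new) contents and act on them.
        local = self.containers.pop(vpath)
        with open(local, "rb") as f:
            data = f.read()
        os.remove(local)
        if data != self.virtual.get(vpath, b""):
            # open+write+close treated as one command to the agent
            self.commands.append((vpath, data))
            self.virtual[vpath] = data
        return data


agent = ContainerFileAgent()
agent.virtual["/ctl"] = b""
local = agent.handle_open("/ctl")
with open(local, "wb") as client:  # the client writes; the agent is not involved
    client.write(b"reload\n")
agent.handle_close("/ctl")
print(agent.commands)
```

The sketch also shows the cost Martin mentions: `handle_open` copies the whole file however large it is, and `handle_close` has to compare contents to find out whether anything changed.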
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 4:55 ` [9fans] Acme mailreader - now: User mode filesystems in linux Martin C.Atkins
@ 2004-12-17 9:54 ` Martin C.Atkins
  2004-12-17 10:22 ` geoff
  ` (3 more replies)
  2004-12-17 15:44 ` Ronald G. Minnich
  1 sibling, 4 replies; 18+ messages in thread
From: Martin C.Atkins @ 2004-12-17 9:54 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Of course, nothing I wrote in the previous message avoids (l)unix's limitations regarding mount, etc. Nor does it allow users to write their own fileservers (without opening up various security holes, and having root access).
However, thinking over lunch, I realised that there is a way of doing something quite nice (in the linux sense, if not the Plan 9 sense!) with what we already have.
On login, each user starts a user-filesystem-daemon, which uses setuid to create a /dev/cfsN, if necessary, opens it to start serving, and mounts it on a conventional place: $HOME/mnt, say.
When the user runs a program that wants to serve a filesystem, it attaches to the daemon, which creates $HOME/mnt/servicename, and forwards all requests for this directory hierarchy to the program. Replies are adjusted to be consistent with security - setuid bits removed, ownership forced to the user, etc., and sent back to the kernel. When the program terminates, the daemon cleans up, and removes servicename.
Advantages over just hacking mount to allow anyone to mount anything on $HOME/mnt:
1) We don't need a /dev/cfsN for every filesystem, just one for each concurrent user. Also client fileserving programs don't have to compete to allocate /dev/cfsN's - which would need some sort of setuid - only the daemons do this.
2) The user-filesystem-daemon only has to run as root during initialisation, everything else runs as the user.
3) The user-filesystem-daemon can enforce file ownership (as the user) in the served directory hierarchy. It can also force off setuid bits, etc. Furthermore, users can only attach their fileservers to their own daemons! (A bit like per-process mount tables - of course, linux has this already, but not in a very user-friendly form)
4) The user-filesystem-daemon can clean up when/if a fileserver crashes.
5) Knowledge of the Coda protocol could be limited to the daemon, and a higher-level protocol used with the "real" fileservers. Thus we could move to other kernel mechanisms (e.g. fuse) if/when they become available.
6) All user filesystems are under $HOME/mnt - symlinks could be used from elsewhere. (or is this a disadvantage?)
Disadvantages:
1) greater overhead - each fileserving message has to make the extra hops from daemon to fileserver program, and back.
2) Complexity?
3) Others...?
The same approach would probably also work (better, easier?) with p9fs - but some of the advantages might already have been solved there, in other ways.
What do people think?
Martin
--
Martin C. Atkins martin_ml@parvat.com
Parvat Infotech Private Limited http://www.parvat.com{/,/martin}
^ permalink raw reply [flat|nested] 18+ messages in thread
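The forwarding and "adjusting replies" steps of the proposed per-user daemon might look roughly like the following Python sketch. It assumes a much simplified reply of the form (mode, uid, gid); the names `UserFsDaemon`, `attach`, `detach`, `getattr` and `sanitize_attr` are hypothetical and not part of Coda or of Martin's library. Only the `stat` constants are real.

```python
import stat


def sanitize_attr(mode, user_uid, user_gid):
    """Clear setuid/setgid bits and force ownership to the logged-in user."""
    safe_mode = mode & ~(stat.S_ISUID | stat.S_ISGID)
    return safe_mode, user_uid, user_gid


class UserFsDaemon:
    """Routes requests under $HOME/mnt/<servicename> to attached fileservers."""

    def __init__(self, user_uid, user_gid):
        self.user_uid = user_uid
        self.user_gid = user_gid
        self.services = {}  # servicename -> callable(path) -> (mode, uid, gid)

    def attach(self, name, handler):
        self.services[name] = handler

    def detach(self, name):
        # Also the place to clean up if a fileserver crashes.
        self.services.pop(name, None)

    def getattr(self, path):
        # path is relative to $HOME/mnt, e.g. "mail/inbox/1"
        name, _, rest = path.partition("/")
        handler = self.services.get(name)
        if handler is None:
            raise FileNotFoundError(path)
        mode, _uid, _gid = handler(rest)
        # Whatever the fileserver claims, the kernel only sees safe attributes.
        return sanitize_attr(mode, self.user_uid, self.user_gid)


daemon = UserFsDaemon(user_uid=1000, user_gid=1000)
# A misbehaving fileserver that claims a root-owned setuid file:
daemon.attach("mail", lambda rest: (stat.S_IFREG | stat.S_ISUID | 0o755, 0, 0))
mode, uid, gid = daemon.getattr("mail/inbox/1")
print(oct(mode), uid, gid)
```

The extra hop through `getattr` is the overhead listed under disadvantage 1; the sanitising step is advantage 3.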
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 9:54 ` Martin C.Atkins
@ 2004-12-17 10:22 ` geoff
  2004-12-17 10:45 ` Martin C.Atkins
  ` (3 more replies)
  2004-12-17 13:41 ` Derek Fawcus
  ` (2 subsequent siblings)
  3 siblings, 4 replies; 18+ messages in thread
From: geoff @ 2004-12-17 10:22 UTC (permalink / raw)
To: 9fans
Someone at the 9bof claimed that at least one of the BSDs already permits users to mount things on any directory for which they have write permission. I suspect that the policy actually needs to be a little stricter than that; you don't want people mounting (system-wide) on /tmp. Perhaps any directory that you own would make more sense. But we also heard that the maintainers of at least one of the other BSDs or Linux have a religious aversion to users mounting anything. Certainly one would want to think through the interactions of set-id and user mounts.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 10:22 ` geoff
@ 2004-12-17 10:45 ` Martin C.Atkins
  2004-12-17 11:42 ` Andy Newman
  ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Martin C.Atkins @ 2004-12-17 10:45 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Fri, 17 Dec 2004 02:22:22 -0800 geoff@collyer.net wrote:
> ... But we also heard that the maintainers of at least one of
> the other BSDs or Linux have a religious aversion to users mounting
> anything. ...
What was it you called it?... "a cascade of vision-failures." Very apt.
Martin
--
Martin C. Atkins martin_ml@parvat.com
Parvat Infotech Private Limited http://www.parvat.com{/,/martin}
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 10:22 ` geoff
  2004-12-17 10:45 ` Martin C.Atkins
@ 2004-12-17 11:42 ` Andy Newman
  2004-12-17 15:57 ` Ronald G. Minnich
  2004-12-17 12:30 ` Latchesar Ionkov
  2004-12-17 15:55 ` Ronald G. Minnich
  3 siblings, 1 reply; 18+ messages in thread
From: Andy Newman @ 2004-12-17 11:42 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
geoff@collyer.net wrote:
> Someone at the 9bof claimed that at least one of the BSDs already
> permits users to mount things
FreeBSD does, don't know about the others.
> Perhaps any directory that you own would make more sense.
Comments taken from mount (in vfs_syscalls.c)...
/*
 * If the user is not root, ensure that they own the directory
 * onto which we are attempting to mount.
 */
> Certainly one would want to think through the interactions
> of set-id and user mounts.
/*
 * Silently enforce MNT_NOSUID and MNT_NODEV for non-root users
 */
^ permalink raw reply [flat|nested] 18+ messages in thread
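The policy stated in the two kernel comments quoted above can be written out as a small Python sketch. This is not the FreeBSD code: `user_mount_flags` is a made-up name, and the numeric values given to `MNT_NOSUID` and `MNT_NODEV` are placeholders chosen for the example rather than values taken from any header.

```python
# Placeholder flag values for illustration only (assumption, not from sys/mount.h).
MNT_NOSUID = 0x08
MNT_NODEV = 0x10


def user_mount_flags(caller_uid, dir_owner_uid, requested_flags):
    """Return the flags a mount would actually get under the quoted policy."""
    if caller_uid == 0:
        # root mounts with exactly the flags requested
        return requested_flags
    if caller_uid != dir_owner_uid:
        # "ensure that they own the directory onto which we are attempting to mount"
        raise PermissionError("non-root users may only mount on directories they own")
    # "Silently enforce MNT_NOSUID and MNT_NODEV for non-root users"
    return requested_flags | MNT_NOSUID | MNT_NODEV


print(hex(user_mount_flags(1000, 1000, 0)))
```

Note that the enforcement is silent: an ordinary user who asks for no flags still gets both restrictions, which addresses the set-id concern raised earlier in the thread.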
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 11:42 ` Andy Newman
@ 2004-12-17 15:57 ` Ronald G. Minnich
  0 siblings, 0 replies; 18+ messages in thread
From: Ronald G. Minnich @ 2004-12-17 15:57 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Fri, 17 Dec 2004, Andy Newman wrote:
> > Certainly one would want to think through the interactions
> > of set-id and user mounts.
>
> /*
>  * Silently enforce MNT_NOSUID and MNT_NODEV for non-root users
>  */
that's what I did on 2.0.36, and the fact that you could not express the suid and other such nasty bits in the attributes of a 9p file made it pretty easy to do that :-)
ron
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 10:22 ` geoff
  2004-12-17 10:45 ` Martin C.Atkins
  2004-12-17 11:42 ` Andy Newman
@ 2004-12-17 12:30 ` Latchesar Ionkov
  2004-12-17 15:55 ` Ronald G. Minnich
  3 siblings, 0 replies; 18+ messages in thread
From: Latchesar Ionkov @ 2004-12-17 12:30 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
If you combine the restriction for mounting filesystems only on directories you have write access to, with the (enforced) creation of private namespace that Linux allows, mounting on /tmp is not a problem anymore.
Lucho
On Fri, Dec 17, 2004 at 02:22:22AM -0800, geoff@collyer.net said:
> Someone at the 9bof claimed that at least one of the BSDs already
> permits users to mount things on any directory for which they have
> write permission. I suspect that the policy actually needs to be a
> little stricter than that; you don't want people mounting
> (system-wide) on /tmp. Perhaps any directory that you own would make
> more sense. But we also heard that the maintainers of at least one of
> the other BSDs or Linux have a religious aversion to users mounting
> anything. Certainly one would want to think through the interactions
> of set-id and user mounts.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 10:22 ` geoff
  ` (2 preceding siblings ...)
  2004-12-17 12:30 ` Latchesar Ionkov
@ 2004-12-17 15:55 ` Ronald G. Minnich
  3 siblings, 0 replies; 18+ messages in thread
From: Ronald G. Minnich @ 2004-12-17 15:55 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Fri, 17 Dec 2004 geoff@collyer.net wrote:
> Someone at the 9bof claimed that at least one of the BSDs already
> permits users to mount things on any directory for which they have
> write permission.
Russ Cox mentioned that NetBSD (??) allows you to mount on a directory you own. You have to own it. Still lotsa danger here.
Linux has a religious aversion to users mounting things, and has since at least 1996 when I started asking them about this for a very early version of v9fs. They did not object to v9fs per se, just the concept of user mounts!
ron
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 9:54 ` Martin C.Atkins
  2004-12-17 10:22 ` geoff
@ 2004-12-17 13:41 ` Derek Fawcus
  2004-12-17 14:42 ` Karl Magdsick
  2004-12-18 0:13 ` Tim Newsham
  3 siblings, 0 replies; 18+ messages in thread
From: Derek Fawcus @ 2004-12-17 13:41 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Fri, Dec 17, 2004 at 03:24:56PM +0530, Martin C.Atkins wrote:
> 2) The user-filesystem-daemon only has to run as root during initialisation,
> everything else runs as the user.
>
> 3) The user-filesystem-daemon can enforce file ownership (as the user) in
> the served directory hierarchy. It can also force off setuid bits, etc.
> Furthermore, users can only attach their fileservers to their own daemons!
> (A bit like per-process mount tables - of course, linux has this already, but
> not in a very user-friendly form)
So while it's running, I can use gdb to attach to it and get around any security it's trying to enforce (turn setuid back on, change ownership to root, etc).
DF
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 9:54 ` Martin C.Atkins
  2004-12-17 10:22 ` geoff
  2004-12-17 13:41 ` Derek Fawcus
@ 2004-12-17 14:42 ` Karl Magdsick
  2004-12-17 14:56 ` Russ Cox
  2004-12-18 0:13 ` Tim Newsham
  3 siblings, 1 reply; 18+ messages in thread
From: Karl Magdsick @ 2004-12-17 14:42 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> On login, each user starts a user-filesystem-daemon, which uses setuid
> to create a /dev/cfsN, if necessary, opens it to start serving, and
> mounts it on a conventional place: $HOME/mnt, say.
Why are the setuid meta-daemons needed? You'll want security/sanity checks within the kernel code anyway, so the main reason for the setuid meta-daemons is creating(??/destroying??) device files. It seems that you could get around this by using one device file for everyone, which brings us to...
Why N different devices? The kernel can distinguish between open filehandles for /dev/cfs and use an array, tree, or map of structs to keep track of which file handle goes with which mount point. (Yay, flyweight design pattern!) This prevents nastiness with trying to find an unused device, creating devices on demand, and "garbage collecting" unused device files. (You wouldn't want a trillion unused devices sitting around.) The kernel can free associated structs when filehandles close. Having the meta-daemons garbage collecting unused device files seems like trouble.
> When the user runs a program that wants to serve a filesystem, it attaches to
> the daemon, which creates $HOME/mnt/servicename, and forwards all
> requests for this directory hierarchy to the program.
As mentioned in another post, you can prevent device files and setuid executables in non-root owned filesystems and allow any user process to mount a fs on any mount point they own. The kernel needs its own cleanup code (what if one of the per-user filesystem meta-daemons crashes?) and security/sanity checks (only allowing root-owned processes to open filehandles to the device).
Granted, using a map of N structs and 1 device instead of N devices each with 1 struct does increase kernel complexity, but only by a small amount. The kernel needs its own security/sanity checks and cleanup code even in the case of per-user setuid filesystem meta-daemons. Setuid per-user filesystem meta-daemons seem like they would greatly increase user-space complexity (and duplicate functionality that MUST also be in the kernel) and only minimally reduce kernel complexity.
Don't get me wrong, I'm a big fan of micro/nanokernels. I dual booted BeOS 5 and L4-Linux on my desktop, back when the latest port to L4 user-space was Linux 2.2.(??20??). I got a warm fuzzy feeling every time my flakey BeOS 3Com NIC driver crashed and BeOS asked me if I wanted to restart the network driver. Buggy drivers weren't so fun, but a system that easily recovers from driver crashes is just cool! (I had a UDFS CD that would cause fs driver crashes in both Linux (kernel panic) and Win2k (BSOD). I really wished for user-space fs drivers on "normal" systems.) After a few days of 2 L4-Linux system lock-ups per day, I went back to using Linux as a regular kernel. (The diagnostic counters kept spinning, so it seemed that the Linux server had locked up, not L4, but all of my processes depended on the Linux server anyway.) In any case, I'd love to see user-space filesystem (and device) drivers on mainstream OSes.
I don't have much knowledge of/experience with Plan9, but I've read that the system is designed so that it's very easy to port drivers between user-space and kernel-space. Is this correct? In a "standard" setup, how many of the drivers are (mostly) in user-space?
-Karl
^ permalink raw reply [flat|nested] 18+ messages in thread
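Karl's alternative (one /dev/cfs device, with the kernel keeping a map from each open file handle to its mount state and freeing the entry on close) can be modelled in user space as follows. This is only an illustration of the bookkeeping, not kernel code; the class `CfsDevice` and its methods are hypothetical.

```python
import itertools


class CfsDevice:
    """Toy model of a single device that tracks per-open-handle mount state."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.mounts = {}  # open handle -> mount point (None until mounted)

    def open(self):
        # Every open of the one device gets its own handle and state slot,
        # so there is no hunting for an unused /dev/cfsN.
        handle = next(self._ids)
        self.mounts[handle] = None
        return handle

    def mount(self, handle, mount_point):
        if handle not in self.mounts:
            raise KeyError("handle is not open")
        if mount_point in self.mounts.values():
            raise ValueError("mount point already in use")
        self.mounts[handle] = mount_point

    def close(self, handle):
        # State is freed when the handle closes, e.g. when a fileserver
        # exits or crashes; nothing is left behind to garbage collect.
        return self.mounts.pop(handle, None)


dev = CfsDevice()
a = dev.open()
b = dev.open()
dev.mount(a, "/home/alice/mnt/mail")
dev.mount(b, "/home/bob/mnt/music")
dev.close(a)
print(dev.mounts)
```

A dictionary stands in for the "array, tree, or map of structs"; the point is that the cleanup lives in one place, next to the state it cleans up.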
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
  2004-12-17 14:42 ` Karl Magdsick
@ 2004-12-17 14:56 ` Russ Cox
  0 siblings, 0 replies; 18+ messages in thread
From: Russ Cox @ 2004-12-17 14:56 UTC (permalink / raw)
To: Karl Magdsick, Fans of the OS Plan 9 from Bell Labs
> [Lots of good arguments]
You're stuck with the operating system you have, not the operating system you'd like to have. If one were designing the system from scratch one could always do better. Sadly this Coda discussion is about how to deal with what's already available on Unix.
> I don't have much knowledge of/experience with Plan9, but I've read
> that the system is designed so that it's very easy to port drivers
> between user-space and kernel-space. Is this correct?
No, it's not. It's not hard (I can't think of any system-level programming task I'd characterize as "hard" using Plan 9) but it's not trivial either.
> In a "standard" setup, how many of the drivers are (mostly) in user-space?
Anything that touches hardware is typically in the kernel, though in the case of particularly complicated hardware (like vga and usb), the kernel part just makes it possible for user-mode programs to get at the hardware and do the complicated stuff. On the other hand, file system drivers are typically outside the kernel, and the network and graphics devices have moved back and forth a few times.
Russ
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux 2004-12-17 9:54 ` Martin C.Atkins ` (2 preceding siblings ...) 2004-12-17 14:42 ` Karl Magdsick @ 2004-12-18 0:13 ` Tim Newsham 2004-12-18 0:13 ` boyd, rounin 3 siblings, 1 reply; 18+ messages in thread From: Tim Newsham @ 2004-12-18 0:13 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > However, thinking over lunch, I realised that there is a way of doing > something quite nice (in the linux sense, if not the Plan 9 sense!) with > what we already have. > > On login, each user starts a user-filesystem-daemon, which uses setuid > to create a /dev/cfsN, if necessary, opens it to start serving, and > mounts it on a conventional place: $HOME/mnt, say. [...] I think it would be better to just implement the v9fs protocol and let users mount it in a similar way as nfs. The v9fs layer in the kernel would simply communicate the kernel filesystem requests over some pipe to another machine (or the same machine) in much the same way as nfs requests get sent out. The layer could support options (or strictly enforce) to disable setuid bits and/or file ownership. A setuid mounting utility that enforced any no-setuid options could allow users to perform mounts. Userland filesystems could be implemented by providing a service that adheres to the v9fs. Adding a seperate userland demon that proxies a filesystem protocol (probably the same one) through to the kernel seems like an uneeded layer of complexity. It seems like there are a lot of projects out there that have interest in providing userland filesystems. They typically use nfs because its the easiest vector into the existing infrastructure. V9fs would definitely fill a useful niche. > 2) The user-filesystem-daemon only has to run as root during initialisation, > everything else runs as the user. A network daemon talking v9fs shouldn't impose any ownership restrictions. The restrictions should be imposed at mount time. 

> 3) The user-filesystem-daemon can enforce file ownership (as the user) in
> the served directory hierarchy. It can also force off setuid bits, etc.
> Furthermore, users can only attach their fileservers to their own daemons!
> (A bit like per-process mount tables - of course, linux has this already, but
> not in a very user-friendly form)

This can be done in-kernel. In fact, there are already options
in nfs to neuter suid bits, device nodes and provide user mappings.

> 5) Knowledge of the Coda protocol could be limited to the daemon, and a
> higher-level protocol used with the "real" fileservers. Thus we could move to
> other kernel mechanisms (e.g. fuse) if/when they become available.

V9fs already provides a useful protocol.

> 6) All user filesystems are under $HOME/mnt - symlinks could be used
> from elsewhere. (or is this a disadvantage?)

Restrictions as to where mount points could be placed could be
put in any setuid binary that mediates user mounts.

Tim N.
^ permalink raw reply [flat|nested] 18+ messages in thread
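[Editorial sketch] The mount-time restrictions discussed in the message above (forcing off setuid bits and device nodes, and limiting where a user may place a mount point) might look roughly like the following. This is only an illustration, in Python, of the policy checks a setuid mount helper could apply before it calls mount(2). The function names `sanitize_options` and `mountpoint_allowed` are hypothetical, and the `$HOME/mnt` rule is borrowed from Martin's proposal; none of this is taken from v9fs or from any existing tool.

```python
import os

# Options every user-initiated mount must carry; "nosuid" and "nodev"
# are standard mount(8) options (nfs accepts them as well).
FORCED = ("nosuid", "nodev")
# Options that would undo the forced ones, so they are dropped.
FORBIDDEN = {"suid", "dev"}


def sanitize_options(requested):
    """Return the requested option string with unsafe options removed
    and the forced ones appended (hypothetical helper)."""
    kept = [o for o in requested.split(",") if o and o not in FORBIDDEN]
    for opt in FORCED:
        if opt not in kept:
            kept.append(opt)
    return ",".join(kept)


def mountpoint_allowed(home, target):
    """True if target is $HOME/mnt or lies below it (hypothetical policy).

    normpath only collapses ".." textually; a real helper would also
    have to resolve symlinks and check ownership of the directory."""
    base = os.path.normpath(os.path.join(home, "mnt"))
    path = os.path.normpath(target)
    return path == base or path.startswith(base + os.sep)
```

A helper built this way would pass `sanitize_options(user_opts)` to the real mount call only after `mountpoint_allowed` succeeds, so the policy lives in one small setuid program rather than in each fileserver.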
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
2004-12-18 0:13 ` Tim Newsham
@ 2004-12-18 0:13 ` boyd, rounin
2004-12-18 3:49 ` Ronald G. Minnich
0 siblings, 1 reply; 18+ messages in thread
From: boyd, rounin @ 2004-12-18 0:13 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> I think it would be better to just implement the v9fs protocol
> and let users mount it in a similar way as nfs.

been there, done that. nfs is no fun. you have to get right up into
the vfs layer.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
2004-12-18 0:13 ` boyd, rounin
@ 2004-12-18 3:49 ` Ronald G. Minnich
2004-12-23 16:04 ` boyd, rounin
0 siblings, 1 reply; 18+ messages in thread
From: Ronald G. Minnich @ 2004-12-18 3:49 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Sat, 18 Dec 2004, boyd, rounin wrote:
> > I think it would be better to just implement the v9fs protocol
> > and let users mount it in a similar way as nfs.
>
> been there, done that. nfs is no fun. you have to get right up into
> the vfs layer.

ah, boyd, you gotta read what he's saying cause it's what we want.

All Tim is saying is "let's just go 9p2000 from Linux VFS layer to
userland servers". This is good. And since v9fs does that, and it's
basically there on 2.6, it's the right way to go.

ron
^ permalink raw reply [flat|nested] 18+ messages in thread
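[Editorial sketch] For readers who have not seen the 9P2000 wire format that a userland server would speak: each message is a little-endian size[4] type[1] tag[2] header followed by type-specific fields. A minimal sketch in Python of encoding and decoding the first message of a session, Tversion, follows. The helper names `pack_tversion` and `unpack_tversion` are invented for this example, and nothing here is taken from the v9fs sources.

```python
import struct

TVERSION = 100   # message type of Tversion in 9P2000
NOTAG = 0xFFFF   # tag value used for version messages


def pack_tversion(msize, version="9P2000"):
    """Encode size[4] Tversion tag[2] msize[4] version[s], little-endian.
    A 9P string is a 2-byte length followed by UTF-8 bytes."""
    v = version.encode("utf-8")
    body = struct.pack("<BHIH", TVERSION, NOTAG, msize, len(v)) + v
    # size counts the whole message, including the size field itself.
    return struct.pack("<I", len(body) + 4) + body


def unpack_tversion(msg):
    """Decode a Tversion message back into (msize, version)."""
    size, mtype, tag, msize, slen = struct.unpack_from("<IBHIH", msg)
    if size != len(msg) or mtype != TVERSION:
        raise ValueError("not a well-formed Tversion message")
    # The fixed-size fields above occupy the first 13 bytes.
    return msize, msg[13:13 + slen].decode("utf-8")
```

For example, `pack_tversion(8192)` yields a 19-byte message, and `unpack_tversion` applied to it returns `(8192, "9P2000")`.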
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
2004-12-18 3:49 ` Ronald G. Minnich
@ 2004-12-23 16:04 ` boyd, rounin
0 siblings, 0 replies; 18+ messages in thread
From: boyd, rounin @ 2004-12-23 16:04 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> ah, boyd, you gotta read what he's saying cause it's what we want.

that's what i meant. i wasn't being explicit enough.

if you stick into the VFS (which i looked at once for the ULTRIX GFS,
but it was too horrible) then everything is cool.

btw: in brussels on my way back from Twente. gotta find me a prepaid
WiFi login. i have some photos i'll stick 'em up some point.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
2004-12-17 4:55 ` [9fans] Acme mailreader - now: User mode filesystems in linux Martin C.Atkins
2004-12-17 9:54 ` Martin C.Atkins
@ 2004-12-17 15:44 ` Ronald G. Minnich
2004-12-18 12:35 ` Martin C.Atkins
1 sibling, 1 reply; 18+ messages in thread
From: Ronald G. Minnich @ 2004-12-17 15:44 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Fri, 17 Dec 2004, Martin C.Atkins wrote:
>
> For those that don't already know: Coda is a remote filesystem that
> copes (more or less well) with disconnection from, and reconnection
> to the fileserver. Thus allowing clients to continue work in the
> disconnected state. I'm not sure how successful it was at this - I've
> never tried it - but it sounds like an interesting goal. This goal is
> also shared by Intermezzo, which was (also) started by Peter Braam -
> so presumably he felt Coda could be improved upon.

As peter used to put it, 5KLOC (intermezzo) was in his mind better than
500KLOC (coda). He brought both file systems to fruition.

> However, judging by the News pages on their web sites, more seems to
> be happening with Coda, than with Intermezzo, recently.

yeah, intermezzo limped along for 5 years, never quite worked, then died.
But the kernel->user interface of intermezzo is perfectly usable. Sounds
like you've gotten far with coda, so this is just an FYI.

I only know a bit about this because I did a lot of work with imezzo early
in the game, and got to the point where I could boot a linux node with
imezzo as the root file system. That was interesting.

ron
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] Acme mailreader - now: User mode filesystems in linux
2004-12-17 15:44 ` Ronald G. Minnich
@ 2004-12-18 12:35 ` Martin C.Atkins
0 siblings, 0 replies; 18+ messages in thread
From: Martin C.Atkins @ 2004-12-18 12:35 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Gosh - lots of comments..

On Fri, 17 Dec 2004 08:44:52 -0700 (MST) "Ronald G. Minnich" <rminnich@lanl.gov> wrote:
> > so presumably he felt Coda could be improved upon.
>
> As peter used to put it, 5KLOC (intermezzo) was in his mind better than
> 500KLOC (coda). He brought both file systems to fruition.

Good reason! But the kernel modules were only a small fraction of that
total size, right?

> yeah, intermezzo limped along for 5 years, never quite worked, then died.
> But the kernel->user interface of intermezzo is perfectly usable. Sounds
> like you've gotten far with coda, so this is just an FYI.

Pity about Intermezzo - trying it (for its designed purpose) was on my
todo list...

Thanks for your assessment of the Intermezzo kernel->user interface. I've
often wondered if I could have used it for my driver instead of coda,
but like you say, I've gotten thus far with Coda...

> in the game, and got to the point where I could boot a linux node with
> imezzo as the root file system. That was interesting.

Sounds fun!

On Fri, 17 Dec 2004 13:41:09 +0000 Derek Fawcus <dfawcus@cisco.com> wrote:
> > 3) The user-filesystem-daemon can enforce file ownership (as the user) in
> > the served directory hierarchy. It can also force off setuid bits, etc.
> > Furthermore, users can only attach their fileservers to their own daemons!
> > (A bit like per-process mount tables - of course, linux has this already, but
> > not in a very user-friendly form)
>
> So while it's running, I can use gdb to attach to it and get around any security
> it's trying to enforce (turn setuid back on, change ownership to root, etc).

Good point! I guess it would have to stay setuid to avoid that - pity.
Is there another way of stopping gdb, et al? I see that EPERM on ptrace
can be caused if the debuggee is already being debugged. Is there any way
that a program can ptrace itself, just to stop anyone else from debugging
it? (1/2 :-))

On Fri, 17 Dec 2004 09:42:15 -0500 Karl Magdsick <kmagnum@gmail.com> wrote:
> > On login, each user starts a user-filesystem-daemon, which uses setuid
> > to create a /dev/cfsN, if necessary, opens it to start serving, and
> > mounts it on a conventional place: $HOME/mnt, say.
>
> Why are the setuid meta-daemons needed? You'll want security/sanity
> checks within the kernel code anyway, so the main reason for the
> setuid meta-daemons is creating(??/destroying??) device files. It
> seems that you could get around this by using one device file for
> everyone, which brings us to...
>
> Why N different devices? The kernel can distinguish between open
> filehandles for /dev/cfs and use an array, tree, or map of structs to
> [plus more good points]

Two reasons:
1) I was trying to see how far we could get without having to change
*anything* in the stock linux kernel distribution. This makes
distributing the result much easier...
2) When you mount a coda device, you are talking to the fileserver that
opened that coda device. If you multiplexed the device, you would have
to have some other way of saying *which* user-mode fileserver you were
trying to mount.

Otherwise, I agree.

> As mentioned in another post, you can prevent device files and setuid
> executables in non-root owned filesystems and allow any user process
> to mount a fs on any mount point they own.

Yes, and these mechanisms are probably best.

> The kernel needs its own cleanup code (what if one of the per-user
> filesystem meta-daemons crashes?) and security/sanity checks (only

If the per-user filesystem meta-daemon crashes, then, like I said, with
coda, the kernel is pretty safe already. However I'd like the user's view
of things to be cleaned up too.

and:

On Fri, 17 Dec 2004 14:13:35 -1000 (HST) Tim Newsham <newsham@lava.net> wrote:
> I think it would be better to just implement the v9fs protocol
> and let users mount it in a similar way as nfs. The v9fs layer

In the future, I totally agree. Unfortunately my requirement was for
something that would work with all the linux systems already out there.

> Adding a separate userland daemon that proxies a filesystem
> protocol (probably the same one) through to the kernel seems
> like an unneeded layer of complexity.

So it now seems - I was just thinking aloud, and the result was
interesting, and informative for me.

> It seems like there are a lot of projects out there that have
> interest in providing userland filesystems. They typically use
> nfs because it's the easiest vector into the existing infrastructure.
> V9fs would definitely fill a useful niche.

Yes. The lack of close messages (until recently) in nfs certainly
restricted its usefulness for this purpose though.

> V9fs already provides a useful protocol.

No disagreements there, however there have already been discussions in
other threads pointing out that V9fs doesn't (yet?) have messages for
linux things that Plan 9 doesn't have - such as symlinks.

> Restrictions as to where mount points could be placed could be
> put in any setuid binary that mediates user mounts.

I was trying to get around the restriction that user mounts could only
be under $HOME/mnt, imposed by my scheme. Given access to a more general
mount, I don't see any reason to restrict it unnecessarily!

Thanks all for interesting feedback!
Martin
--
Martin C. Atkins martin_ml@parvat.com
Parvat Infotech Private Limited http://www.parvat.com{/,/martin}
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2004-12-23 16:04 UTC | newest]
Thread overview: 18+ messages
2004-12-17 15:31 [9fans] Acme mailreader - now: User mode filesystems in linux bmaroshe
-- strict thread matches above, loose matches on Subject: below --
2004-12-16 15:08 [9fans] Acme mailreader David Leimbach
2004-12-16 23:22 ` geoff
2004-12-17 4:55 ` [9fans] Acme mailreader - now: User mode filesystems in linux Martin C.Atkins
2004-12-17 9:54 ` Martin C.Atkins
2004-12-17 10:22 ` geoff
2004-12-17 10:45 ` Martin C.Atkins
2004-12-17 11:42 ` Andy Newman
2004-12-17 15:57 ` Ronald G. Minnich
2004-12-17 12:30 ` Latchesar Ionkov
2004-12-17 15:55 ` Ronald G. Minnich
2004-12-17 13:41 ` Derek Fawcus
2004-12-17 14:42 ` Karl Magdsick
2004-12-17 14:56 ` Russ Cox
2004-12-18 0:13 ` Tim Newsham
2004-12-18 0:13 ` boyd, rounin
2004-12-18 3:49 ` Ronald G. Minnich
2004-12-23 16:04 ` boyd, rounin
2004-12-17 15:44 ` Ronald G. Minnich
2004-12-18 12:35 ` Martin C.Atkins