From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 17 Dec 2004 10:25:26 +0530
From: Martin C.Atkins
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] Acme mailreader - now: User mode filesystems in linux
Message-Id: <20041217102526.0b64d965.martin_ml@parvat.com>
In-Reply-To: <9ccf822edf0a9a77c141ae47312638dd@collyer.net>
References: <3e1162e6041216070874f424e5@mail.gmail.com>
 <9ccf822edf0a9a77c141ae47312638dd@collyer.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 19c9d89a-eace-11e9-9e20-41e7f4b1d025

Hi all,

On Thu, 16 Dec 2004 15:22:59 -0800 geoff@collyer.net wrote:
>..
> However, Martin Atkins has revealed the mystery kernel agent: coda.
> Apparently it's somewhat specialised but lets user-mode file servers
> catch opens and closes.

I was going to write a longer message today, anyway :-).

Yes, the kernel-mode agent I use is the Coda filesystem driver, which
has been in the stock Linux kernel for several years.

For those who don't already know: Coda is a remote filesystem that
copes (more or less well) with disconnection from, and reconnection
to, the fileserver, thus allowing clients to continue working in the
disconnected state. I'm not sure how successful it was at this - I've
never tried it - but it sounds like an interesting goal. This goal is
also shared by Intermezzo, which was (also) started by Peter Braam - so
presumably he felt Coda could be improved upon. However, judging by the
News pages on their web sites, more seems to have been happening
recently with Coda than with Intermezzo.

Anyway, the interesting thing about both these systems, from the point
of view of this discussion, is that the real work is done in a
user-mode agent, which communicates with a kernel-mode stub driver. In
Coda the user-mode agent is, for some obscure reason, called Venus.

Thus it was only necessary to reverse-engineer the kernel module <->
Venus protocol.
This was surprisingly easy: it is documented in some detail on the Coda
website, and Pavel Machek's podfuk had already worked out some of the
more obscure details (but podfuk wasn't a general-purpose library,
didn't do some things I wanted, and was, I thought, messy). The Coda
driver is stable enough that I don't remember any hangs/etc, even while
I was going through the trial-and-error process!

The user-mode agent simply opens /dev/cfsN, for some N, and exchanges
messages with the kernel module by reading and writing that device. My
library for this is, as I said, about 1400 lines of Python. Trivial
fileserver applications can be as small as 10-20 lines, and a faulty
fileserver (or library) never crashes or hangs the kernel (which is how
things should be! :-). It is also possible to kill the fileserver and
restart it with minimal side effects. It would be easy to make
libraries for other languages.

One curiosity, which is both an advantage and a disadvantage, depending
on what you want to do: the user-mode agent is not involved in - does
not even see - reads and writes. When a client opens a file, the
user-mode agent makes a file somewhere containing the contents of the
"virtual" file, opens it, and writes the file descriptor down into the
kernel. The kernel module returns this file descriptor (more or less)
to the client, which reads and writes it as a normal file, with no
intervention from Coda. When the client closes the file, the kernel
driver tells the user-mode agent, which deals with the (possibly new)
contents of the file, and might then remove it from the local
filesystem.

The advantage of this is that reads and writes happen at the same speed
as reads and writes to local files. The disadvantages are that the open
has to make a local copy of the entire contents of the file - even if
it is very big - and that the agent can't process individual writes as
commands, as is common in Plan 9. The user-mode agent might also have
to read the file to work out what changed.
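
[A minimal sketch of the open/close flow described above, under stated
assumptions. It does not talk to /dev/cfsN or the Coda kernel module,
and it is not the library discussed in this message: ToyAgent,
handle_open and handle_close are hypothetical names, and the "client"
is simply the same process using the returned file descriptor directly.
It only illustrates that the agent sees the open and the close, but
none of the reads and writes in between.]

```python
import os
import tempfile


class ToyAgent:
    """Hypothetical stand-in for a user-mode agent (no kernel involved)."""

    def __init__(self):
        self.files = {}       # virtual path -> current contents (bytes)
        self.containers = {}  # virtual path -> local container file path

    def handle_open(self, vpath):
        # On open, materialise the *entire* virtual file as a local file
        # and hand back a descriptor positioned at the start.
        fd, local = tempfile.mkstemp(prefix="container-")
        os.write(fd, self.files.get(vpath, b""))
        os.lseek(fd, 0, os.SEEK_SET)
        self.containers[vpath] = local
        return fd

    def handle_close(self, vpath):
        # On close, read the container back to find out what (if anything)
        # changed, then remove it from the local filesystem.
        local = self.containers.pop(vpath)
        with open(local, "rb") as f:
            new = f.read()
        os.remove(local)
        changed = new != self.files.get(vpath, b"")
        self.files[vpath] = new
        return changed


agent = ToyAgent()
agent.files["/status"] = b"idle\n"

# "Client" side: ordinary local I/O on the descriptor; the agent sees none of it.
fd = agent.handle_open("/status")
container = agent.containers["/status"]
data = os.read(fd, 4096)
os.lseek(fd, 0, os.SEEK_SET)
os.ftruncate(fd, 0)
os.write(fd, b"run\n")
os.close(fd)

changed = agent.handle_close("/status")
```

[In this toy the cost mentioned above is visible in handle_open, which
copies the whole file before the client can read a byte, and in
handle_close, which has to compare contents to learn what changed.]
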
However, you can rather easily process open+write+close as a command to
the user-mode agent, or have a file whose contents are different every
time it is opened. (So you do open+read+close to read a status value,
for example.)

Ideally, I'd like the processing of open to be able to decide whether
to send a file descriptor down into the kernel, or to receive
read/write messages - this seems to have been in previous versions of
Coda - such is life!

Re: Intermezzo - as Ron pointed out, its kernel driver could also
possibly be used for this purpose. However, it uses hooks into an
underlying journalled filesystem - a requirement that I couldn't easily
satisfy back when I started this work - and I wasn't sure that its
kernel-user space interface was so easy to apply to my purpose.
However, it might allow one to avoid the disadvantages mentioned in the
last paragraph.

I've been meaning to open-source the library for a while now - but I'd
like to clean it up in a few places before a proper release. This
interest might be the spur to make me get around to it...

>...
> 9), and it becomes possible to push lots of code and some hacks out of
> the kernel, while permitting some new and interesting work.

No disagreements there!

Martin

-- 
Martin C. Atkins                martin_ml@parvat.com
Parvat Infotech Private Limited
http://www.parvat.com{/,/martin}