From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue,  9 May 2000 10:46:12 -0400
From: Alexander Viro viro@math.psu.edu
Subject: [9fans] Plan 9 future (Was: Re: Are the Infernospaces gone?)
Topicbox-Message-UUID: a8e214c6-eac8-11e9-9e20-41e7f4b1d025
Message-ID: <20000509144612.ynhlTtc-KZi2dwM2Il8NxMwj98NajKZkCksnPBN9fiQ@z>



On Tue, 9 May 2000 forsyth@vitanuova.com wrote:

> >>1995 vintage Plan 9 one - e.g. our design handles ".." for any mount
> >>graphs in the right way, ditto for pwd, etc. I'ld love to compare the
>
> what is `the right way'?

See the comment in pwd(1) for the wrong one ;-) In our variant the effect
of bind() can't be distinguished from the effect of a fresh mount() - you
get a tree spliced onto a point in your namespace and that's it.
And no, we don't create a new dentry tree. The main idea is that we added
a new class of objects - mount nodes - and use them to represent mount
linkage. The namespace is represented as a tree of mount nodes, and each
node refers to a couple of dentries - its mountpoint and its root. These
dentries belong to the forest - each tree comes from a filesystem and this
forest is the same for everyone. You have to deal with pairs (mount node,
dentry) to identify points in the namespace. That's the basic idea - doing
that required some changes to our data structures, etc., but the pain was
surprisingly small. Implementing namei(), bind(), etc. with these data
structures is more or less an obvious exercise. Union mounts are done with
a special kind of mount node that anchors a cyclic list of the components'
nodes. To avoid special-casing the covered directory we add to this list
a node whose root is the dentry of the covered directory (in effect, we
bind the covered directory into the union).
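
To make the shape of this concrete, here is a rough userland model in
Python rather than kernel C. It is only a sketch under assumptions: the
names Dentry, MountNode, attach, climb, step, walk and path_of are made up
for illustration and are not the kernel's, union mounts are left out, and
so are locking and refcounts. It shows one dentry reachable through two
mount nodes, and ".." and pwd coming out right for both.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False, repr=False)
class Dentry:
    """A directory in the shared forest; one tree per filesystem."""
    name: str
    parent: Optional["Dentry"] = None
    children: dict = field(default_factory=dict)

    def mkdir(self, name):
        self.children[name] = Dentry(name, self)
        return self.children[name]


@dataclass(eq=False, repr=False)
class MountNode:
    """Mount linkage: the tree at `root` is spliced over `mountpoint`."""
    root: Dentry
    parent: Optional["MountNode"] = None
    mountpoint: Optional[Dentry] = None
    children: list = field(default_factory=list)


def attach(parent, mountpoint, root):
    """bind() and mount() both reduce to adding one mount node."""
    node = MountNode(root, parent, mountpoint)
    parent.children.append(node)
    return node


def climb(mnt, dentry):
    """While sitting on the root of a spliced tree, hop to its mountpoint."""
    while dentry is mnt.root and mnt.parent is not None:
        mnt, dentry = mnt.parent, mnt.mountpoint
    return mnt, dentry


def step(mnt, dentry, name):
    """Resolve one path component starting from the point (mnt, dentry)."""
    if name == "..":
        mnt, dentry = climb(mnt, dentry)
        if dentry is mnt.root:          # root of the whole namespace
            return mnt, dentry
        return mnt, dentry.parent
    dentry = dentry.children[name]
    covering = [m for m in mnt.children if m.mountpoint is dentry]
    while covering:                     # cross into whatever is mounted here
        mnt = covering[-1]
        dentry = mnt.root
        covering = [m for m in mnt.children if m.mountpoint is dentry]
    return mnt, dentry


def walk(ns_root, path):
    point = (ns_root, ns_root.root)
    for name in filter(None, path.split("/")):
        point = step(*point, name)
    return point


def path_of(mnt, dentry):
    """pwd: walk up through dentries and mount nodes to the namespace root."""
    parts = []
    while True:
        mnt, dentry = climb(mnt, dentry)
        if dentry is mnt.root:
            break
        parts.append(dentry.name)
        dentry = dentry.parent
    return "/" + "/".join(reversed(parts))


# One filesystem, with /foo/bar bound onto /barf.
fs = Dentry("")
foo = fs.mkdir("foo")
bar = foo.mkdir("bar")
barf = fs.mkdir("barf")
ns = MountNode(fs)
bound = attach(ns, barf, bar)

via_foo = walk(ns, "/foo/bar")    # (ns, bar)
via_barf = walk(ns, "/barf")      # (bound, bar): same dentry, other node
```

The dentry alone cannot say where ".." leads; the mount node in the pair
is what tells /barf/.. apart from /foo/bar/.. in this model.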

One difference is that our equivalent of bind() binds only the chunk
that contains the object we are binding. If /foo and /foo/bar/baz
are mountpoints and you bind /foo/bar onto /barf you will _not_ get
/barf/baz as a mountpoint. For one thing, if you want the "recursive"
behaviour of bind() (a la the Plan 9 one) you can emulate it in userland,
unless there are _real_ loops. And those are a rather bad idea - say ls -lR
and watch it run forever... In our case you can bind / onto /mnt and
that will not create any loops - you'll have the same effect as you would
get from mounting the root filesystem onto /mnt.
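
The two examples above can be played out in a small Python model. Again
this is a sketch under assumptions, not the kernel code: Dentry, MountNode,
walk and bind are illustrative names, and bind() here does nothing except
add a single mount node whose root is the source dentry. Mount nodes
hanging below the source are deliberately not copied.

```python
class Dentry:
    def __init__(self, name, parent=None):
        self.name, self.parent, self.children = name, parent, {}

    def mkdir(self, name):
        return self.children.setdefault(name, Dentry(name, self))


class MountNode:
    def __init__(self, root, parent=None, mountpoint=None):
        self.root, self.parent, self.mountpoint = root, parent, mountpoint
        self.children = []
        if parent is not None:
            parent.children.append(self)


def covering(mnt, dentry):
    """Most recent mount node in mnt that covers dentry, or None."""
    for child in reversed(mnt.children):
        if child.mountpoint is dentry:
            return child
    return None


def walk(mnt, path):
    """Downward-only lookup; returns the point (mount node, dentry)."""
    dentry = mnt.root
    for name in filter(None, path.split("/")):
        dentry = dentry.children[name]
        child = covering(mnt, dentry)
        while child is not None:
            mnt, dentry = child, child.root
            child = covering(mnt, dentry)
    return mnt, dentry


def bind(ns, source, target):
    """Non-recursive bind: one new node, nothing below the source copied."""
    _, src_dentry = walk(ns, source)
    dst_mnt, dst_dentry = walk(ns, target)
    return MountNode(src_dentry, dst_mnt, dst_dentry)


# Root filesystem with /foo, /barf and /mnt as plain directories.
root_fs = Dentry("")
for d in ("foo", "barf", "mnt"):
    root_fs.mkdir(d)

# Filesystem B mounted on /foo, filesystem C mounted on /foo/bar/baz.
fs_b = Dentry("")
b_baz = fs_b.mkdir("bar").mkdir("baz")
fs_c = Dentry("")

ns = MountNode(root_fs)
m_b = MountNode(fs_b, ns, root_fs.children["foo"])
m_c = MountNode(fs_c, m_b, b_baz)

barf = bind(ns, "/foo/bar", "/barf")
barf_baz = walk(ns, "/barf/baz")   # B's own baz directory, not C's root

again = bind(ns, "/", "/mnt")
mnt_mnt = walk(ns, "/mnt/mnt")     # the plain /mnt directory, uncovered
mnt_foo = walk(ns, "/mnt/foo")     # the plain /foo directory, B not visible
```

Since the new node starts with no children of its own, /mnt/mnt is just
the empty directory from the root filesystem and a recursive walk ends
there.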

So there... I consider this tradeoff (non-recursive behaviour of bind())
worth the things it gives. YMMV. With that data structure per-process
namespaces become absolutely trivial - nothing cares whether the mount
nodes form a connected graph or not. So it's a matter of garbage collection
(plain refcounting) and cloning the tree + switching your pwd, etc. in our
equivalent of rfork() (clone(2)). BFD...
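
As a rough illustration of "cloning the tree + switching your pwd", here
is a minimal Python sketch. The names (Dentry, MountNode, clone_namespace)
are hypothetical, dentries are opaque shared objects, and Python's own
reference counting stands in for the refcounting mentioned above; none of
this is the actual clone(2) code.

```python
class Dentry:
    def __init__(self, name):
        self.name = name


class MountNode:
    def __init__(self, root, parent=None, mountpoint=None):
        self.root, self.parent, self.mountpoint = root, parent, mountpoint
        self.children = []
        if parent is not None:
            parent.children.append(self)


def clone_namespace(old_root, pwd):
    """Copy the mount-node tree and re-point pwd at the copy.

    Only mount nodes are duplicated; the dentries they refer to stay
    shared, since the forest is the same for everyone.
    """
    twin_of = {}

    def copy(node, new_parent):
        twin = MountNode(node.root, new_parent, node.mountpoint)
        twin_of[node] = twin
        for child in node.children:
            copy(child, twin)
        return twin

    new_root = copy(old_root, None)
    pwd_mnt, pwd_dentry = pwd
    return new_root, (twin_of[pwd_mnt], pwd_dentry)


# Parent namespace: root filesystem with another one mounted on /home.
root_dir, home_dir, tmp_dir = Dentry("/"), Dentry("home"), Dentry("tmp")
home_fs, scratch_fs = Dentry("home-fs-root"), Dentry("scratch-fs-root")

ns = MountNode(root_dir)
m_home = MountNode(home_fs, ns, home_dir)
pwd = (m_home, home_fs)

child_ns, child_pwd = clone_namespace(ns, pwd)

# A mount made in the child's namespace is invisible to the parent.
MountNode(scratch_fs, child_ns, tmp_dir)
```

The pwd keeps its dentry and only swaps the mount node for the node's
twin, so the process stays where it was while gaining a private view.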

If you need more details - feel free to ask, I'll be only too happy to
discuss them. I'm really curious about the implementation in the current
Plan 9 kernel - it would be interesting to compare...

There were some rather, erm, funny moments - we had to support fchdir(2),
for one. And link(2). And full-blown rename(2) <stream of expletives>. Oh,
well - it took a long time, but it got done...