From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Sun,  4 Jan 2009 20:30:08 -0800
From: "Roman V. Shaposhnik" <rvs@sun.com>
In-reply-to: <20090103235716.GG8355@masters10.cs.jhu.edu>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Message-id: <1231129808.11463.424.camel@goose.sun.com>
MIME-version: 1.0
Content-type: text/plain
Content-transfer-encoding: 7BIT
References: <a3fe2d1742ad4715ded95ccc69e6409b@quanstro.net>
	<1230850413.11463.79.camel@goose.sun.com>
	<20090101235716.GC8355@masters10.cs.jhu.edu>
	<1231017782.11463.193.camel@goose.sun.com>
	<20090103235716.GG8355@masters10.cs.jhu.edu>
Subject: Re: [9fans] sendfd() on native Plan 9?
Topicbox-Message-UUID: 796086a4-ead4-11e9-9d60-3106f5b1d025

On Sat, 2009-01-03 at 18:57 -0500, Nathaniel W Filardo wrote:
> That is, if I cannot refer to an object in my namespace and no
> server I can refer to will grant access, then I have achieved isolation of
> that object and any process running in my namespace.

That sounds like a reasonable expectation.

> Users make sense for remote services -- it's an excellent mechanism for
> reconnecting a principal with its capabilities (nabable objects and
> permissions) on a remote server.

Ok. So we are in agreement that the model doesn't need to be changed for
anything that gets to you through the #M. That leaves all the other
drivers in need of discussion.

In fact, if there's more interest in this area -- this looks like
a subject worthy of its own thread: fine tuning of the access
permissions to the services provided by kernel drivers.

> "OS-level virtualization" is only a meaningful concept because,
> traditionally, our kernels were not.  I want there to be no distinction
> between the security guarantees that can be made when one or many kernels
> are in flight.

Agreed. Unfortunately, in UNIX "one kernel vs. many kernels"
requirements can NOT be easily unified. Thus the need for things like
Solaris zones. In Plan9, however, I suspect such unification can be made
with a trivial number of changes to the core system. That's why
I'm interested in this discussion to begin with.

In fact, why don't we re-frame this discussion (yet another new
thread? ;-)) as "what changes (if any) need to be implemented in
Plan9 so that there is no distinction between the security guarantees
that can be made when one or many kernels are in flight"?

If nothing else, that would at least increase the practicality level.

> Run with me for a moment to see how this might work without the extra level
> of grouping.  Let's introduce an fdbind() call which takes a fd and a path,
> and binds the fd's chan on the path in the current namespace.  This is
> essentially an optimization around spawning a billion exportfs processes and
> mount()ing the resulting descriptors.

And as with any optimization, can we, please, leave it out of the
discussion for a while? Since it is *just* an optimization, we can
always discuss it later.

> Now, suppose '#s' and '#p' -- the
> kernel services -- expose only the current process group (==namespace).  If
> I want different behavior, when I construct the new namespace, I ensure that
> I have open fds to my /proc and /srv and fdbind() these in the new namespace
> along with '#s' and '#p'.  If desired, the new namespaces' '#s' and '#p' can
> be posted (or sendfd()'d; see below) to the parent process, doing the
> inverse kind of exposure.

Ok. So the idea here boils down to partitioning the namespaces that
#s and #p serve based on the process groups (as defined by a shared
namespace), right? Then, the key problem really seems to be -- what's
the best way of taking the union of these disjoint sets. If you're
willing to keep the old semantics for #s around (maybe under a different name)
then exportfs doesn't seem to be too prohibitive, does it?

> > That said, the requirement itself makes sense. Although, even with
> > sendfd() you have to have a socket connection established before you can
> > pass fd's around. Except for the case where such socket connection
> > was created using socketpair() the sockets are *named* objects.
> > And that should present all the same problems that naming files
> > under /srv presents, shouldn't it?
>
> sendfd() is a little special since it uses "out of band structured ancillary
> data" (or something like that?) to do its magic.  See below.

I wasn't talking about ancillary data, I was talking about the point of
rendezvous for two processes. In the case of UNIX, for unrelated processes,
that point of rendezvous has to be a *named* entity. The fact that you
can exchange an unlimited number of descriptors using it doesn't solve
the /tmp-like issues. You're still dealing with namespace collisions,
etc.
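
For reference, the socketpair() exception looks roughly like this on a
POSIX system. This is only a sketch: send_fd() and recv_fd() are names
I made up for illustration, and error handling is minimal. The
descriptor travels as SCM_RIGHTS ancillary data, and no named
rendezvous is involved because both ends come from socketpair().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: pass descriptor fd over the connected
 * UNIX-domain socket sock. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 'F';   /* one byte of ordinary data carries the message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {            /* the union keeps the control buffer aligned */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof msg);
    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;      /* payload is an array of fds */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof msg);
    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

Two unrelated processes would have to replace the socketpair() with a
socket bound to a path, and that is exactly where the naming problem
comes back.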

> I've been imagining sendfd() to be based on a non-enumerable /srv
> replacement, where possession of the file name and the ability to walk to it
> suffices to prove access rights to the posted descriptor.

These two form a very nice set of requirements. If I were to paraphrase,
what you really need is an srv-like device driven via a clone-like
interface.

> Let's say the protocol is that this magical device serves a ctl file,
> into which one writes a fd number and reads back a long, guaranteed unique,

Why does it have to be long and not just a sequential id?

> string.  This string names a file beside the aforementioned ctl file.
> Despite being a global service, this can be safely mounted in every namespace
> with no information exposure (non-enumerable).  Like /srv, it's even possible
> and sane to exportfs this (assuming that the clients are willing to trust the
> exportfs process).  It's possible to emulate /srv atop this by publishing a
> list of file names somewhere.
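
For concreteness, here is a rough user-space model of that protocol as
I read it. It is not a kernel device, just a sketch: post_fd() stands
in for the write to ctl, lookup_fd() for walking to and opening the
returned name, and both names (as well as the table size and the use of
/dev/urandom) are my own assumptions. There is deliberately no call
that lists the posted names. The names are random here on the
assumption that they must not be guessable.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

enum { NSLOT = 16, NAMELEN = 32 };   /* 32 hex digits = 128 random bits */

struct slot {
    int used;
    int fd;                          /* the table's own reference */
    char name[NAMELEN + 1];
};
static struct slot table[NSLOT];

/* Fill name with NAMELEN hex digits read from /dev/urandom. */
static int random_name(char name[NAMELEN + 1])
{
    unsigned char raw[NAMELEN / 2];
    size_t i;
    ssize_t n;
    int r = open("/dev/urandom", O_RDONLY);

    if (r < 0)
        return -1;
    n = read(r, raw, sizeof raw);
    close(r);
    if (n != (ssize_t)sizeof raw)
        return -1;
    for (i = 0; i < sizeof raw; i++)
        snprintf(name + 2 * i, 3, "%02x", raw[i]);
    return 0;
}

/* Model of "write an fd number to ctl, read back a unique string".
 * Returns 0 and fills name on success, -1 on error or a full table. */
int post_fd(int fd, char name[NAMELEN + 1])
{
    int i;

    assert(name != NULL);
    for (i = 0; i < NSLOT; i++) {
        if (table[i].used)
            continue;
        if (random_name(table[i].name) < 0)
            return -1;
        table[i].fd = dup(fd);       /* survive the poster closing fd */
        if (table[i].fd < 0)
            return -1;
        table[i].used = 1;
        memcpy(name, table[i].name, NAMELEN + 1);
        return 0;
    }
    return -1;
}

/* Model of opening the named file: knowing the exact name is the only
 * way in. Returns a fresh descriptor, or -1 if no such name exists. */
int lookup_fd(const char *name)
{
    int i;

    for (i = 0; i < NSLOT; i++)
        if (table[i].used && strcmp(table[i].name, name) == 0)
            return dup(table[i].fd);
    return -1;
}
```

Emulating /srv on top of this would then amount to publishing some of
the returned names where others can read them.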

I need to think about it. But I understand what you have in mind now and
I agree that existing machinery doesn't seem to solve it. Now, before I
go away, here's a crazy idea that is guaranteed to get me in trouble ;-)

Would it be completely out of the question to slightly modify devdup.c
so that it accepts attaches of the following form: '#dPID' ?
Actually, since it already has ctl associated with every file id, you
can reuse that ctl to enforce a particular security policy. Then, your
sendfd() becomes a trivial matter of:
   open("#d[TARGET-PID]/[fid]");
on the receiving end and (maybe) a control message (or even a chmod)
on the sending end. You will have to know the fid, but that information
needs to be exchanged anyway.
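
To spell out the naming, here is a small sketch of how the receiving
end might build the path. Keep in mind that the '#dPID' attach form is
only the proposal above and does not exist today; dup_path() is a
hypothetical helper, and the "ctl" suffix simply follows the way devdup
already names its per-descriptor control files.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format "#d<pid>/<fid>", or "#d<pid>/<fid>ctl"
 * when ctl is nonzero, into buf. The "#d<pid>" attach form is the
 * proposal in this message, not an existing devdup feature.
 * Returns the length of the path, or -1 if it does not fit. */
int dup_path(char *buf, size_t len, int pid, int fid, int ctl)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, "#d%d/%d%s", pid, fid, ctl ? "ctl" : "");
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

The receiver would pass the resulting string to open(), while the
sender would use the ctl variant to grant or refuse access.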

If the above sounds stupid -- I'd like to blame the late time of the
day ;-)

Thanks,
Roman.