From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 10888 invoked from network); 31 May 2022 15:11:18 -0000
Received: from 9front.inri.net (168.235.81.73)
  by inbox.vuxu.org with ESMTPUTF8; 31 May 2022 15:11:18 -0000
Received: from mail.posixcafe.org ([45.76.19.58]) by 9front; Tue May 31 11:09:31 -0400 2022
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=posixcafe.org;
	s=20200506; t=1654009767;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=0GFCNSNnEEOe3TKsQDFpDnAKL9xh/DcdGFGNLAKzHX0=;
	b=c/yFaN+8t/uM4cT3iyRdUOWwb7gpsyY7Q1Ps8fX8GuRQqbE9VBMp0FsL5z7sCiQhUzGZX3
	LOItsu7viMURykBLus2tU5L9rUuUB0Ba3LekRcOb9X9c+RoMwLGw1RRJ0PAq2w+K/zzSxH
	pcgWcQu4bp1YW2NfPUB3aXmbu+hUB28=
Received: from [192.168.168.200] (161-97-228-135.lpcnextlight.net [161.97.228.135])
	by mail.posixcafe.org (OpenSMTPD) with ESMTPSA id 735b8ffd (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO)
	for <9front@9front.org>;
	Tue, 31 May 2022 10:09:26 -0500 (CDT)
Message-ID: <475b654b-641e-6f08-c5bc-bcf6aab4f51a@posixcafe.org>
Date: Tue, 31 May 2022 09:09:03 -0600
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.8.1
Content-Language: en-US
To: 9front@9front.org
References: <b3419c6e-04a8-69f0-f7da-badbf0721155@posixcafe.org>
 <847F45EC7225C1D0B69E796B69D6E3ED@eigenstate.org>
 <CAFSF3XOFoOwjY+SDNyDVo8dL8-VM_Y4uBjc=Q_GoqG5PCM2-eA@mail.gmail.com>
From: Jacob Moody <moody@mail.posixcafe.org>
In-Reply-To: <CAFSF3XOFoOwjY+SDNyDVo8dL8-VM_Y4uBjc=Q_GoqG5PCM2-eA@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
List-ID: <9front.9front.org>
List-Help: <http://lists.9front.org>
X-Glyph: ➈
X-Bullshit: self-healing rails-scale component app 
Subject: Re: [9front] [PATCH] private /srv attach option
Reply-To: 9front@9front.org
Precedence: bulk

I spent the better part of yesterday thinking about how /srv fits into
the current work.

First, let's talk about how we build an isolated namespace as is. The
general idea is to use only the root folders provided by #/, selectively
bind in the necessities from the real root, then drop any reference to
the real root from the namespace. This is roughly how the current
tcp80.namespace works, but this solution does not scale too well
to programs that require a lot of file system state.
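
A rough sketch of what such a namespace file could look like. This is
not the contents of tcp80.namespace; the choice of binds is an
assumption, for a program that needs nothing beyond /dev and /bin:

# assumed set of binds, adjust per program
mount -aC #s/boot /root $rootspec
bind #c /dev
bind /root/$cputype/bin /bin
bind -a /root/rc/bin /bin
unmount /root

The binds onto /bin should remain usable after the unmount, while
nothing else from the real root is reachable through the namespace.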

An example is generally anything required out of /lib and/or /sys. These
are not folders given by #/, and binding in the entirety of them from
the real root isn't right either. So how do we set up the correct
state in /?

One way is to build a skeleton hierarchy, stashed somewhere, that can
be unioned on to / and used as the target for binds from the real root
in the namespace file. This works pretty well for third party programs;
an example would be something like werc.
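
A sketch of the skeleton approach, assuming a hypothetical skeleton at
/usr/web/skel on the real root that holds only the empty directories
rc and lib/ndb:

mount -aC #s/boot /root $rootspec
# hypothetical skeleton path; supplies /rc and /lib/ndb as bind targets
bind -a /root/usr/web/skel /
bind /root/rc /rc
bind /root/lib/ndb /lib/ndb
unmount /root

The directories chosen are for illustration only; what werc really
needs from the real root is not worked out here.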

But shipping a skeleton for programs in tree sucks. So what else
do we have? Well, there is always mntgen. But mntgen has some
issues:

1. There is some data leakage about what one client is using, through which folders show up when listing the root.
	Likely a somewhat easy fix.
2. Mntgen only makes up directories at the root; we would want to extend this to child directories.
	Also sounds like a doable fix.
3. Mntgen is a userspace program, so there is no facility for getting a fresh one in namespace files. You have to have
one sitting in /srv somewhere for it to be useful.

This last point is interesting and one I wanted to discuss more. We could just provide a 'mntgen' command
to namespace files to get a private mount of mntgen. But you wouldn't want to do this if you are rebuilding
the namespace on each connection. A program wishing to do this could set up a mntgen/ramfs with the correct skeleton
in a private /srv to avoid having to share it with the rest of the system.

Now, is the restriction of not being able to recurse through subsequent private /srv's once the global /srv
has been blocked an issue? I don't really think it is. I explicitly do not want chdev to proliferate
to every program on the system like OpenBSD's pledge. It is designed to be used at the very edge
of the process tree, for leaf processes that are handling potentially malicious input, in part because
we never allow a process, or its children, to regain access to something that has been lost.
This is to say that a program should know where its process tree ends, and what capabilities its children
will need. So you either end up with:

bind -c '#sp' /srv
chdev -r 's'

in the 'setup' phase of the parent, if you know your children will never need to make a further private /srv.
Or, if you do need to create a further private /srv, you simply kick that code snippet down the process
tree to the last place you need to instantiate a private /srv. If you know the child will want a private /srv
but the parent will not, fork the child with RFNAMEG, then do 'chdev -r s' in the parent after the fork call.

This is only an issue if you:

1. Don't know what capabilities your children want.
	At which point chdev isn't very helpful to begin with.
2. Conditionally make a private /srv after you've begun reading potentially malicious input.
	I think you kinda get what you ought to here.

I'm just trying to really rack my brain for a scenario where you need to leave the global /srv on the table
because you don't know if your children will want another private /srv. If others have examples, please
let me know. I am pushing on this more because I _want_ it to be good enough for devsrv to just keep track of
sessions itself. I don't expect devsrv to be the last device we may want private sessions for, and growing
a new process group per device is not ideal. But maybe we can do a bit better than this.
I am going to poke around with an in-tree way (ctl file or similar) to see if that shakes out to being
a bit nicer. But I will say I do prefer the semantics of an attach option for getting a different 'tree'
of /srv; it feels right.

My grand plan is as you laid out: I want to turn namespaces into security boundaries, and for the most part
we're there now. Chdev allows namespaces to be security boundaries.

The tricky part that we're trying to decide now is what tools we want to provide in order to make these types of namespaces
easier to build. It is much harder to lay out a plan for this. This mail is an attempt to lay out what I have in my head
in this regard. Which programs we choose to grow new features in depends on their implementation quirks, such that our new use
case does not impose too much on the existing code. So a lot of it turns into 'see what the low hanging fruits are' and reflecting
on how/if these modifications produce a worthwhile tool.

So there's the big thought dump, pick it apart as desired.


Thanks,
moody