From: Jacob Moody <moody@mail.posixcafe.org>
To: 9front@9front.org
Subject: Re: [9front] [PATCH] Unmount to remove sharp devices.
Date: Sun, 22 May 2022 23:42:29 -0600
Message-ID: <4bffa657-6b9e-8069-ae45-e9969c3542c5@posixcafe.org>
In-Reply-To: <F08256331012A8DFA68D0E25B44A4968@eigenstate.org>
Another go at this. Bit of a refactor of the
kernel changes and wrapped up the userspace work.
Since some of the thoughts are now scattered throughout
the thread, I wanted to provide an overview of how things ended up:
A process can eject devices from its namespace through
writes to /dev/drivers. Ejected devices cannot be walked.
A 'permit' command is available to allow processes to eject
devices by whitelist rather than blacklist.
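
For readers skimming the thread, here is a rough, portable model of the
bookkeeping behind eject and permit. It is a sketch, not the kernel code:
Nsmodel, devindex, model_eject and model_allowed are made-up names, the
device table is invented, and it handles ASCII driver characters only
(the patch works on runes). It follows the patch in one respect worth
noting: permit is treated as "block everything except the list" and the
result is OR'd into the existing set, so a later permit cannot bring back
a device that was already ejected.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NWORD = 4, WBITS = 64 };	/* 4 x 64 bits: room for 256 devices */

typedef struct Nsmodel Nsmodel;
struct Nsmodel {
	uint64_t notallowed[NWORD];	/* a set bit means the device is blocked */
};

/* Hypothetical stand-in for devno(): position of a driver character
 * in an invented device table, or -1 if the character is unknown. */
static const char devchars[] = "/cdeMpsIl|";

static int
devindex(int c)
{
	const char *p;

	if(c == '\0')
		return -1;
	p = strchr(devchars, c);
	return p == NULL ? -1 : (int)(p - devchars);
}

/* invert == 0: eject the listed devices.
 * invert == 1: permit only the listed devices.
 * Either way the mask is OR'd in, so access can only shrink. */
static void
model_eject(Nsmodel *ns, int invert, const char *devs)
{
	uint64_t mask[NWORD];
	int i, t;

	assert(ns != NULL && devs != NULL);
	memset(mask, invert ? 0xFF : 0, sizeof mask);
	for(; *devs != '\0'; devs++){
		t = devindex(*devs);
		if(t < 0)
			continue;	/* unknown characters are ignored */
		if(invert)
			mask[t/WBITS] &= ~((uint64_t)1 << (t%WBITS));
		else
			mask[t/WBITS] |= (uint64_t)1 << (t%WBITS);
	}
	for(i = 0; i < NWORD; i++)
		ns->notallowed[i] |= mask[i];
}

/* Returns 1 if the device may be attached, 0 if blocked or unknown. */
static int
model_allowed(const Nsmodel *ns, int c)
{
	int t;

	t = devindex(c);
	if(t < 0)
		return 0;
	return (ns->notallowed[t/WBITS] & ((uint64_t)1 << (t%WBITS))) == 0;
}
```

Starting from an all-zero Nsmodel, model_eject(&ns, 0, "s") blocks only
's', and a following model_eject(&ns, 1, "cs") leaves 'c' usable while
's' stays blocked.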
Support for these commands has been added to newns, allowing
namespace files to use eject/permit commands. These new commands
are used in /lib/namespace.ftp to restrict anonymous login.
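
As a small illustration of what ends up on the wire, the sketch below
formats one eject or permit message per call, the same "verb devices"
line that the newns change writes for each argument. It is portable C,
not Plan 9 C: driverscmd is a made-up helper, and the stream is a
parameter so the example does not assume the kernel change is present.
On a patched system the stream would be opened on #c/drivers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a single "eject devs" or "permit devs" message to f.
 * The stream is flushed after every message so that each command
 * reaches the file as its own write.
 * Returns 0 on success, -1 on an unknown verb or a failed write.
 */
static int
driverscmd(FILE *f, const char *verb, const char *devs)
{
	assert(f != NULL && verb != NULL && devs != NULL);
	if(strcmp(verb, "eject") != 0 && strcmp(verb, "permit") != 0)
		return -1;
	if(fprintf(f, "%s %s\n", verb, devs) < 0)
		return -1;
	return fflush(f) == 0 ? 0 : -1;
}
```

For example, driverscmd(f, "permit", "|MedIa/") produces the line used in
the /lib/namespace.ftp hunk of the diff.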
aux/listen now allows individual namespace files to be used
per service. This namespace is sourced once per listener. Each
connection receives a copy of the namespace. An example
/rc/bin/service/!tcp80.namespace is provided which builds
an isolated webroot. In order to accomplish this without
bringing in most of /root, a /rc folder was added to #/.
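
To make the naming rule concrete, here is a portable sketch of how a
namespace file is located for a service and how such files are kept out
of the service scan. nspath and isservice are made-up names that mirror
the snprint and strstr calls in the listen.c hunks of the diff; they are
not the listen.c code itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<srvdir>/<proto><port>.namespace" into buf.
 * Returns 0 on success, -1 if the result would not fit. */
static int
nspath(char *buf, size_t n, const char *srvdir, const char *proto, const char *port)
{
	int r;

	assert(buf != NULL && n > 0);
	r = snprintf(buf, n, "%s/%s%s.namespace", srvdir, proto, port);
	return (r >= 0 && (size_t)r < n) ? 0 : -1;
}

/* Directory scan filter: 1 if name is a service file for proto,
 * 0 if it belongs to another protocol or is a namespace file. */
static int
isservice(const char *name, const char *proto)
{
	size_t nlen;

	nlen = strlen(proto);
	if(strncmp(name, proto, nlen) != 0)
		return 0;
	if(strstr(name + nlen, ".namespace") != NULL)
		return 0;
	return 1;
}
```

With srvdir /rc/bin/service, proto tcp and port 80 this yields
/rc/bin/service/tcp80.namespace. Because that is the path looked up, the
example shipped as !tcp80.namespace only takes effect once it is renamed
without the leading '!'.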
This is presented as one diff, but I plan to commit this
in chunks.
Thanks,
moody
---
diff 6fbb1acc8fa0b6655b14e8c46240a4a8d2d8c672 uncommitted
--- a/lib/namespace
+++ b/lib/namespace
@@ -22,6 +22,7 @@
# standard bin
bind /$cputype/bin /bin
+bind $rootdir'/rc' /rc
bind -a /rc/bin /bin
# internal networks
--- a/lib/namespace.ftp
+++ b/lib/namespace.ftp
@@ -8,5 +8,6 @@
# bind a personal incoming directory below incoming
bind -c /usr/none/incoming /usr/web/incoming/none
+permit |MedIa/
# this cuts off everything not mounted below /usr/web
bind /usr/web /
--- /tmp/diff100000406351
+++ b/rc/bin/service/!tcp80.namespace
@@ -1,0 +1,24 @@
+mount -aC #s/boot /root $rootspec
+
+# kernel devices
+bind #c /dev
+bind #d /fd
+bind -c #e /env
+bind #p /proc
+bind -a #l /net
+bind -a #I /net
+
+bind /root/$cputype/bin /bin
+bind /root/rc /rc
+bind -a /rc/bin /bin
+
+permit Mcde|pslI/
+
+# grab just our webroot
+bind /root/usr/web /srv
+
+# or bind in the actual root
+# bind -a /root /
+
+unmount /root
+eject Ms
--- a/sys/man/3/cons
+++ b/sys/man/3/cons
@@ -90,10 +90,32 @@
.PP
The
.B drivers
-file contains, one per line, a listing of the drivers configured in the kernel, in the format
+file contains, one per line, a listing of available kernel drivers, in the format
.IP
.EX
#c cons
+.EE
+.PP
+A process can eject a driver from the current namespace through a write to
+.BR drivers .
+A message is one of:
+.IP "eject \f2drivers\fP"
+block access to the listed
+.IR drivers .
+.IP "permit \f2drivers\fP"
+permit access to only the provided
+.IR drivers .
+.PP
+\f2Drivers\fP is a string of driver characters. Ejecting
+.IR mnt (3)
+prevents new mounts into the current namespace.
+The following blocks access to
+.IR env (3)
+and
+.IR srv (3):
+.IP
+.EX
+eject se
.EE
.PP
The
--- a/sys/man/3/root
+++ b/sys/man/3/root
@@ -10,6 +10,7 @@
.B /net
.B /net.alt
.B /proc
+.B /rc
.B /root
.B /srv
.fi
--- a/sys/man/6/namespace
+++ b/sys/man/6/namespace
@@ -59,6 +59,17 @@
.I new
is missing.
.TP
+.BI eject \ drivers
+Ejects the listed kernel
+.I drivers
+from the namespace.
+.I Drivers
+is a string of driver characters.
+.TP
+.BI permit \ drivers
+Permit access to only the listed kernel
+.IR drivers .
+.TP
.BR clear
Clear the name space with
.BR rfork(RFCNAMEG) .
@@ -80,4 +91,5 @@
.SH "SEE ALSO"
.IR bind (1),
.IR namespace (4),
-.IR init (8)
+.IR init (8),
+.IR cons (3)
--- a/sys/man/8/listen
+++ b/sys/man/8/listen
@@ -96,6 +96,14 @@
an inbound call on the TCP network for port 565 executes service
.BR tcp565 .
.PP
+Services may have individual
+.IR namespace (6)
+files specified within
+.IR srvdir .
+If provided, the namespace is used as the parent for each connection
+to the corresponding service. Namespace files are found by appending a .namespace
+suffix to the service name.
+.PP
At least the following services are available in
.BR /bin/service .
.TF \ tcp0000
--- a/sys/src/9/boot/boot.c
+++ b/sys/src/9/boot/boot.c
@@ -25,6 +25,7 @@
buf[1+read(open("/env/cputype", OREAD|OCEXEC), buf+1, sizeof buf - 6)] = '\0';
strcat(buf, bin);
bind(buf, bin, MAFTER);
+ bind("/root/rc", "/rc", MREPL);
bind("/rc/bin", bin, MAFTER);
exec("/bin/bootrc", argv);
--- a/sys/src/9/port/chan.c
+++ b/sys/src/9/port/chan.c
@@ -1272,7 +1272,7 @@
Chan*
namec(char *aname, int amode, int omode, ulong perm)
{
- int len, n, t, nomount;
+ int len, n, t, nomount, devunmount;
Chan *c;
Chan *volatile cnew;
Path *volatile path;
@@ -1292,6 +1292,24 @@
name = aname;
/*
+ * When unmounting, the name parameter must be accessed
+ * using Aopen in order to get the real chan from
+ * something like /srv/cs or /fd/0. However when sandboxing,
+ * unmounting a sharp from a union is a valid operation even
+ * if the device is blocked.
+ */
+ devunmount = 0;
+ if(amode == Aunmount){
+ /*
+ * Doing any walks down the device could leak information
+ * about the existence of files.
+ */
+ if(name[0] == '#' && utflen(name) == 2)
+ devunmount = 1;
+ amode = Aopen;
+ }
+
+ /*
* Find the starting off point (the current slash, the root of
* a device tree, or the current dot) as well as the name to
* evaluate starting there.
@@ -1313,24 +1331,13 @@
up->genbuf[n++] = *name++;
}
up->genbuf[n] = '\0';
- /*
- * noattach is sandboxing.
- *
- * the OK exceptions are:
- * | it only gives access to pipes you create
- * d this process's file descriptors
- * e this process's environment
- * the iffy exceptions are:
- * c time and pid, but also cons and consctl
- * p control of your own processes (and unfortunately
- * any others left unprotected)
- */
n = chartorune(&r, up->genbuf+1)+1;
- if(up->pgrp->noattach && utfrune("|decp", r)==nil)
- error(Enoattach);
t = devno(r, 1);
if(t == -1)
error(Ebadsharp);
+ if(!devunmount && !devallowed(up->pgrp, r))
+ error(Enoattach);
+
c = devtab[t]->attach(up->genbuf+n);
break;
--- a/sys/src/9/port/dev.c
+++ b/sys/src/9/port/dev.c
@@ -31,6 +31,63 @@
}
void
+deveject(Pgrp *pgrp, int invert, char *devs)
+{
+ int i, t, w;
+ char *p;
+ Rune r;
+ u64int mask[nelem(pgrp->notallowed)];
+
+ if(invert)
+ memset(mask, 0xFF, sizeof mask);
+ else
+ memset(mask, 0, sizeof mask);
+
+ w = sizeof mask[0] * 8;
+ for(p = devs; *p != '\0';){
+ p += chartorune(&r, p);
+ t = devno(r, 1);
+ if(t == -1)
+ continue;
+ if(invert)
+ mask[t/w] &= ~((u64int)1<<t%w);
+ else
+ mask[t/w] |= (u64int)1<<t%w;
+ }
+
+ wlock(&pgrp->ns);
+ for(i=0; i < nelem(pgrp->notallowed); i++)
+ pgrp->notallowed[i] |= mask[i];
+ wunlock(&pgrp->ns);
+}
+
+int
+devallowed(Pgrp *pgrp, int r)
+{
+ int t, w, b;
+
+ t = devno(r, 1);
+ if(t == -1)
+ return 0;
+
+ w = sizeof(u64int) * 8;
+ rlock(&pgrp->ns);
+ b = !(pgrp->notallowed[t/w] & (u64int)1<<t%w);
+ runlock(&pgrp->ns);
+ return b;
+}
+
+int
+canmount(Pgrp *pgrp)
+{
+ /*
+ * Devmnt is not usable directly from user procs, so
+ * having it removed is interpreted to block any mounts.
+ */
+ return devallowed(pgrp, 'M');
+}
+
+void
devdir(Chan *c, Qid qid, char *n, vlong length, char *user, long perm, Dir *db)
{
db->name = n;
--- a/sys/src/9/port/devcons.c
+++ b/sys/src/9/port/devcons.c
@@ -39,6 +39,18 @@
CMrdb, "rdb", 0,
};
+enum
+{
+ CMeject,
+ CMpermit,
+};
+
+Cmdtab drivermsg[] =
+{
+ CMeject, "eject", 0,
+ CMpermit, "permit", 0,
+};
+
void
printinit(void)
{
@@ -332,7 +344,7 @@
"cons", {Qcons}, 0, 0660,
"consctl", {Qconsctl}, 0, 0220,
"cputime", {Qcputime}, 6*NUMSIZE, 0444,
- "drivers", {Qdrivers}, 0, 0444,
+ "drivers", {Qdrivers}, 0, 0666,
"hostdomain", {Qhostdomain}, DOMLEN, 0664,
"hostowner", {Qhostowner}, 0, 0664,
"kmesg", {Qkmesg}, 0, 0440,
@@ -583,9 +595,15 @@
case Qdrivers:
b = smalloc(READSTR);
k = 0;
- for(i = 0; devtab[i] != nil; i++)
+
+ rlock(&up->pgrp->ns);
+ for(i = 0; devtab[i] != nil; i++){
+ if(up->pgrp->notallowed[i/(sizeof(u64int)*8)] & (u64int)1<<i%(sizeof(u64int)*8))
+ continue;
k += snprint(b+k, READSTR-k, "#%C %s\n",
devtab[i]->dc, devtab[i]->name);
+ }
+ runlock(&up->pgrp->ns);
if(waserror()){
free(b);
nexterror();
@@ -622,7 +640,7 @@
long l, bp;
char *a;
Mach *mp;
- int id;
+ int id, i, invert;
ulong offset;
Cmdbuf *cb;
Cmdtab *ct;
@@ -674,6 +692,32 @@
case Qconfig:
error(Eperm);
+ break;
+
+ case Qdrivers:
+ cb = parsecmd(a, n);
+
+ if(waserror()) {
+ free(cb);
+ nexterror();
+ }
+ ct = lookupcmd(cb, drivermsg, nelem(drivermsg));
+ invert = 0;
+ switch(ct->index) {
+ case CMeject:
+ invert = 0;
+ break;
+ case CMpermit:
+ invert = 1;
+ break;
+ default:
+ error(Ebadarg);
+ break;
+ }
+ for(i = 1; i < cb->nf; i++)
+ deveject(up->pgrp, invert, cb->f[i]);
+ poperror();
+ free(cb);
break;
case Qreboot:
--- a/sys/src/9/port/devroot.c
+++ b/sys/src/9/port/devroot.c
@@ -105,6 +105,7 @@
addrootdir("net");
addrootdir("net.alt");
addrootdir("proc");
+ addrootdir("rc");
addrootdir("root");
addrootdir("srv");
addrootdir("shr");
--- a/sys/src/9/port/devshr.c
+++ b/sys/src/9/port/devshr.c
@@ -464,7 +464,7 @@
cclose(c);
return nc;
case Qcroot:
- if(up->pgrp->noattach)
+ if(!canmount(up->pgrp))
error(Enoattach);
if((perm & DMDIR) == 0 || mode != OREAD)
error(Eperm);
@@ -498,7 +498,7 @@
sch->shr = shr;
break;
case Qcshr:
- if(up->pgrp->noattach)
+ if(!canmount(up->pgrp))
error(Enoattach);
if((perm & DMDIR) != 0 || mode != OWRITE)
error(Eperm);
@@ -731,7 +731,7 @@
Mhead *h;
Mount *m;
- if(up->pgrp->noattach)
+ if(!canmount(up->pgrp))
error(Enoattach);
sch = tosch(c);
if(sch->level != Qcmpt)
--- a/sys/src/9/port/mkdevc
+++ b/sys/src/9/port/mkdevc
@@ -78,6 +78,9 @@
if(ARGC < 2)
exit "usage"
+ if(ndev >= 256)
+ exit "device count will overflow Pgrp.notallowed"
+
printf "#include \"u.h\"\n";
printf "#include \"../port/lib.h\"\n";
printf "#include \"mem.h\"\n";
--- a/sys/src/9/port/portdat.h
+++ b/sys/src/9/port/portdat.h
@@ -121,6 +121,7 @@
Amount, /* to be mounted or mounted upon */
Acreate, /* is to be created */
Aremove, /* will be removed by caller */
+ Aunmount, /* unmount arg[0] */
COPEN = 0x0001, /* for i/o */
CMSG = 0x0002, /* the message channel for a mount */
@@ -484,7 +485,7 @@
{
Ref;
RWlock ns; /* Namespace n read/one write lock */
- int noattach;
+ u64int notallowed[4]; /* Room for 256 devices */
Mhead *mnthash[MNTHASH];
};
--- a/sys/src/9/port/portfns.h
+++ b/sys/src/9/port/portfns.h
@@ -413,6 +413,9 @@
ushort nhgets(void*);
ulong µs(void);
long lcycles(void);
+void deveject(Pgrp*,int,char*);
+int devallowed(Pgrp*, int);
+int canmount(Pgrp*);
#pragma varargck argpos iprint 1
#pragma varargck argpos panic 1
--- a/sys/src/9/port/sysfile.c
+++ b/sys/src/9/port/sysfile.c
@@ -1048,7 +1048,7 @@
nexterror();
}
- if(up->pgrp->noattach)
+ if(!canmount(up->pgrp))
error(Enoattach);
ac = nil;
@@ -1160,14 +1160,8 @@
nexterror();
}
if(name != nil) {
- /*
- * This has to be namec(..., Aopen, ...) because
- * if arg[0] is something like /srv/cs or /fd/0,
- * opening it is the only way to get at the real
- * Chan underneath.
- */
validaddr((uintptr)name, 1, 0);
- cmounted = namec(name, Aopen, OREAD, 0);
+ cmounted = namec(name, Aunmount, OREAD, 0);
}
cunmount(cmount, cmounted);
poperror();
--- a/sys/src/9/port/sysproc.c
+++ b/sys/src/9/port/sysproc.c
@@ -34,6 +34,7 @@
Egrp *oeg;
ulong pid, flag;
Mach *wm;
+ char *devs;
flag = va_arg(list, ulong);
/* Check flags before we commit */
@@ -44,6 +45,11 @@
if((flag & (RFENVG|RFCENVG)) == (RFENVG|RFCENVG))
error(Ebadarg);
+ /*
+ * Code using RFNOMNT expects to block all but
+ * the following devices.
+ */
+ devs = "|decp";
if((flag&RFPROC) == 0) {
if(flag & (RFMEM|RFNOWAIT))
error(Ebadarg);
@@ -60,12 +66,12 @@
up->pgrp = newpgrp();
if(flag & RFNAMEG)
pgrpcpy(up->pgrp, opg);
- /* inherit noattach */
- up->pgrp->noattach = opg->noattach;
+ /* inherit notallowed */
+ memmove(up->pgrp->notallowed, opg->notallowed, sizeof up->pgrp->notallowed);
closepgrp(opg);
}
if(flag & RFNOMNT)
- up->pgrp->noattach = 1;
+ deveject(up->pgrp, 1, devs);
if(flag & RFREND) {
org = up->rgrp;
up->rgrp = newrgrp();
@@ -177,8 +183,8 @@
p->pgrp = newpgrp();
if(flag & RFNAMEG)
pgrpcpy(p->pgrp, up->pgrp);
- /* inherit noattach */
- p->pgrp->noattach = up->pgrp->noattach;
+ /* inherit notallowed */
+ memmove(p->pgrp->notallowed, up->pgrp->notallowed, sizeof p->pgrp->notallowed);
}
else {
p->pgrp = up->pgrp;
@@ -185,7 +191,7 @@
incref(p->pgrp);
}
if(flag & RFNOMNT)
- p->pgrp->noattach = 1;
+ deveject(p->pgrp, 1, devs);
if(flag & RFREND)
p->rgrp = newrgrp();
--- a/sys/src/cmd/aux/listen.c
+++ b/sys/src/cmd/aux/listen.c
@@ -136,6 +136,7 @@
{
int ctl, pid, start;
char dir[40], err[128], ds[128];
+ char prog[Maxpath], serv[Maxserv], ns[Maxpath];
long childs;
Announce *a;
Waitmsg *wm;
@@ -178,6 +179,10 @@
sleep((pid*10)%200);
snprint(ds, sizeof ds, "%s!%s!%s", protodir, addr, a->a);
+ snprint(serv, sizeof serv, "%s%s", proto, a->a);
+ snprint(prog, sizeof prog, "%s/%s", srvdir, serv);
+ snprint(ns, sizeof ns, "%s.namespace", prog);
+
whined = a->whined;
/* a process per service */
@@ -201,7 +206,11 @@
else
exits("ctl");
}
- dolisten(dir, ctl, srvdir, a->a, &childs);
+ procsetname("%s %s", dir, ds);
+ if(!trusted)
+ if(newns("none", ns) < 0)
+ syslog(0, listenlog, "can't build namespace %s: %r\n", ns);
+ dolisten(dir, ctl, serv, prog, &childs);
close(ctl);
}
default:
@@ -299,6 +308,8 @@
continue;
if(strncmp(nm, proto, nlen) != 0)
continue;
+ if(strstr(nm + nlen, ".namespace") != nil)
+ continue;
addannounce(nm + nlen);
}
free(db);
@@ -329,15 +340,10 @@
}
void
-dolisten(char *dir, int ctl, char *srvdir, char *port, long *pchilds)
+dolisten(char *dir, int ctl, char *serv, char *prog, long *pchilds)
{
char ndir[40], wbuf[64];
- char prog[Maxpath], serv[Maxserv];
int nctl, data, wfd, nowait;
-
- procsetname("%s %s!%s!%s", dir, proto, addr, port);
- snprint(serv, sizeof serv, "%s%s", proto, port);
- snprint(prog, sizeof prog, "%s/%s", srvdir, serv);
wfd = -1;
nowait = RFNOWAIT;
--- a/sys/src/libauth/newns.c
+++ b/sys/src/libauth/newns.c
@@ -14,8 +14,8 @@
static int setenv(char*, char*);
static char *expandarg(char*, char*);
static int splitargs(char*, char*[], char*, int);
-static int nsfile(char*, Biobuf *, AuthRpc *);
-static int nsop(char*, int, char*[], AuthRpc*);
+static int nsfile(char*, Biobuf *, AuthRpc *, int);
+static int nsop(char*, int, char*[], AuthRpc*, int);
static int catch(void*, char*);
int newnsdebug;
@@ -35,7 +35,7 @@
{
Biobuf *b;
char home[4*ANAMELEN];
- int afd, cdroot;
+ int afd, cdroot, dfd;
char *path;
AuthRpc *rpc;
@@ -51,6 +51,10 @@
}
/* rpc != nil iff afd >= 0 */
+ dfd = open("#c/drivers", OWRITE|OCEXEC);
+ if(dfd < 0 && newnsdebug)
+ fprint(2, "open #c/drivers: %r\n");
+
if(file == nil){
if(!newns){
werrstr("no namespace file specified");
@@ -70,7 +74,8 @@
setenv("home", home);
}
- cdroot = nsfile(newns ? "newns" : "addns", b, rpc);
+ cdroot = nsfile(newns ? "newns" : "addns", b, rpc, dfd);
+ close(dfd);
Bterm(b);
freecloserpc(rpc);
@@ -87,7 +92,7 @@
}
static int
-nsfile(char *fn, Biobuf *b, AuthRpc *rpc)
+nsfile(char *fn, Biobuf *b, AuthRpc *rpc, int dfd)
{
int argc;
char *cmd, *argv[NARG+1], argbuf[MAXARG*NARG];
@@ -103,7 +108,7 @@
continue;
argc = splitargs(cmd, argv, argbuf, NARG);
if(argc)
- cdroot |= nsop(fn, argc, argv, rpc);
+ cdroot |= nsop(fn, argc, argv, rpc, dfd);
}
atnotify(catch, 0);
return cdroot;
@@ -143,7 +148,7 @@
}
static int
-nsop(char *fn, int argc, char *argv[], AuthRpc *rpc)
+nsop(char *fn, int argc, char *argv[], AuthRpc *rpc, int dfd)
{
char *argv0;
ulong flags;
@@ -181,7 +186,7 @@
b = Bopen(argv[0], OREAD|OCEXEC);
if(b == nil)
return 0;
- cdroot |= nsfile(fn, b, rpc);
+ cdroot |= nsfile(fn, b, rpc, dfd);
Bterm(b);
}else if(strcmp(argv0, "clear") == 0 && argc == 0){
rfork(RFCNAMEG);
@@ -212,6 +217,14 @@
}else if(strcmp(argv0, "cd") == 0 && argc == 1){
if(chdir(argv[0]) == 0 && *argv[0] == '/')
cdroot = 1;
+ }else if(argc >= 1 && (strcmp(argv0, "permit") == 0 || strcmp(argv0, "eject") == 0)){
+ //We should not silently fail if we can not honor a permit/eject
+ //due to the parent namespace missing #c/drivers.
+ if(dfd < 0)
+ sysfatal("%s requested, but could not open #c/drivers", argv0);
+ for(i=0; i < argc; i++)
+ if(fprint(dfd, "%s %s\n", argv0, argv[i]) < 0 && newnsdebug)
+ fprint(2, "%s: %s %s %r\n", fn, argv0, argv[i]);
}
return cdroot;
}