From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-0.1 required=5.0 tests=DKIM_INVALID,DKIM_SIGNED, NICE_REPLY_A,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 31241 invoked from network); 12 May 2022 05:13:47 -0000 Received: from 9front.inri.net (168.235.81.73) by inbox.vuxu.org with ESMTPUTF8; 12 May 2022 05:13:47 -0000 Received: from mail.posixcafe.org ([45.76.19.58]) by 9front; Thu May 12 01:10:36 -0400 2022 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=posixcafe.org; s=20200506; t=1652332232; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Lvdz49CTe1wn8OIDAa1OW1MQTg2Ti0WGv+acgKC9RtE=; b=p9bTY2QZPPeRqKFgFzQBW2pXxTHoRmAaLb6KHSG99AltAl3vVh2L+J9caNTohJeBjU0IX9 6NgiOHbpjGgftWpTyf79gJ476uOtNBjIVIEKLjfWCWrtFzc2mfVBbj83CusvVJNhoVr+Tb zFYnxJo0SCuhZOuCcSK4/6m4GjGzpVI= Received: from [192.168.168.200] (161-97-228-135.lpcnextlight.net [161.97.228.135]) by mail.posixcafe.org (OpenSMTPD) with ESMTPSA id 75313fd5 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for <9front@9front.org>; Thu, 12 May 2022 00:10:32 -0500 (CDT) Message-ID: Date: Wed, 11 May 2022 23:10:26 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.1 Content-Language: en-US To: 9front@9front.org References: <31DDCB42A0F36DFFCF3ECB669151D8C6@eigenstate.org> From: Jacob Moody In-Reply-To: <31DDCB42A0F36DFFCF3ECB669151D8C6@eigenstate.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit List-ID: <9front.9front.org> List-Help: X-Glyph: ➈ X-Bullshit: scale-out progressive dependency ActivityPub over SVG markup-based singleton general-purpose optimizer Subject: Re: [9front] [PATCH] Unmount to remove sharp devices. Reply-To: 9front@9front.org Precedence: bulk On 5/11/22 21:18, ori@eigenstate.org wrote: > Quoth Jacob Moody : >> Hello, >> >> Gave this another go: http://okturing.com/src/13592/body >> >> This patch includes new namepace file for rc-httpd >> and changes the existing anonymous login namespace for >> ftpd, both make use of the sandboxing features. > > Looking like it's getting useful :) I'm glad it's starting to look that way > comments inline. > >> thanks, >> moody >> >> diff 9126ee3eea90d639f4e877c01400248581d10f65 uncommitted >> --- a/lib/namespace >> +++ b/lib/namespace >> @@ -1,5 +1,6 @@ >> # root >> mount -aC #s/boot /root $rootspec >> +bind $rootdir'/rc' /rc >> bind -a $rootdir / > > Wondering why this change was needed; did a local change > leak in? > Yeah I dont love this, but there is some reason. rc needs /rc/lib/rcmain and most scripts expect to find themselves under /rc/bin. My goal for the namespace for rc-httpd was to steal what was just needed from the real root fs, unmount the root fs, then block /srv to make #s/boot inaccessible. So I figured the easiest way to go about that was for #/ to expost a rc dir. Or at least I couldn't find a better way of bringing in /root/rc to / without bringing in most of /root. /root/rc needs explicitely bound over / because the original bind of /root to / is done with MAFTER. So #/'s empty dirs take priority. If it were to stay I think the line should be moved down to where /$cputype/bin is bound. Doing the first /root to / bind with MBEFORE would also fix this. >> >> # kernel devices >> --- a/lib/namespace.ftp >> +++ b/lib/namespace.ftp >> @@ -8,5 +8,6 @@ >> # bind a personal incoming directory below incoming >> bind -c /usr/none/incoming /usr/web/incoming/none >> >> +permit |MedIa/ >> # this cuts off everything not mounted below /usr/web >> bind /usr/web / >> --- /tmp/diff100000742872 >> +++ b/lib/namespace.rc-httpd >> @@ -1,0 +1,21 @@ >> +mount -aC #s/boot /root $rootspec >> + >> +# kernel devices >> +bind #c /dev >> +bind #d /fd >> +bind -c #e /env >> +bind #p /proc >> +bind -a #l /net >> +bind -a #I /net > > duplicated line? > Upper case I and lower case L. >> + >> +bind /root/$cputype/bin /bin >> +bind /root/rc /rc >> +bind -a /rc/bin /bin >> + >> +permit Mcde|plI/ >> + >> +. /root/cfg/$sysname/namespace.rc-httpd >> + >> +unmount /root >> +block Mc >> +unmount #c /dev >> --- a/sys/man/3/cons >> +++ b/sys/man/3/cons >> @@ -90,10 +90,30 @@ >> .PP >> The >> .B drivers >> -file contains, one per line, a listing of the drivers configured in the kernel, in the format >> +file contains, one per line, a listing of available kernel drivers, in the format >> .IP >> .EX >> #c cons >> +.EE >> +.PP >> +A process can forfeit access to a driver, for itself and it's future children, through a write to >> +.B drivers. > > Is this accurate? I'd expect the way this is implement to affect all procs > in the current namespace, not just the current one and its children. > > Maybe: 'a process can eject a driver from > the current namespace through a write to > drivers'. You are correct about the implementation, it is shared by Pgrp. Your suggestion sounds good to me. >> +A message is one of: >> +.IP "block \f2drivers\fP" >> +block access to the listed >> +.I drivers. >> +.IP "permit \f2drivers\fP" >> +permit access to just the provided >> +.I drivers. >> +.PP >> +Drivers are identified by their short hand, the first column returned on reads, >> +without the leading sharp. The following blocks access to > > Clunky phrasing. Perhaps '\f2drivers\f2 is a string of driver characters'. > May also need to mention the implicication of disallowing 'M'. That makes more sense, and I agree we should mention the quirks of 'M'. >> +.IR env (3) >> +and >> +.IR sd (3): >> +.IP >> +.EX >> +block se >> .EE >> .PP >> The >> --- a/sys/man/6/namespace >> +++ b/sys/man/6/namespace >> @@ -59,6 +59,15 @@ >> .I new >> is missing. >> .TP >> +.BI block \ drivers >> +Removes access to the listed kernel >> +.I drivers. >> +Devices are identified by their sharp name. >> +.TP >> +.BI permit \ drivers >> +Permit access to only the listed kernel >> +.I drivers. >> +.TP >> .BR clear >> Clear the name space with >> .BR rfork(RFCNAMEG) . >> @@ -80,4 +89,5 @@ >> .SH "SEE ALSO" >> .IR bind (1), >> .IR namespace (4), >> -.IR init (8) >> +.IR init (8), >> +.IR cons (3) >> --- a/sys/src/9/boot/boot.c >> +++ b /sys/src/9/boot/boot.c >> @@ -19,6 +19,7 @@ >> if(await(buf, sizeof(buf)) < 0) >> goto Err; >> >> + bind("/root/rc", "/rc", MREPL); >> bind(root, "/", MAFTER); > > Hm. as before, I don't understand why this bind is needed: > binding /root to / should already lead to /rc being /root/rc. > See above. >> >> buf[0] = '/'; >> --- a/sys/src/9/port/chan.c >> +++ b/sys/src/9/port/chan.c >> @@ -1272,7 +1272,7 @@ >> Chan* >> namec(char *aname, int amode, int omode, ulong perm) >> { >> - int len, n, t, nomount; >> + int len, n, t, nomount, devunmount; >> Chan *c; >> Chan *volatile cnew; >> Path *volatile path; >> @@ -1292,6 +1292,24 @@ >> name = aname; >> >> /* >> + * When unmounting, the name parameter must be accessed >> + * using Aopen in order to get the real chan from >> + * something like /srv/cs or /fd/0. However when sandboxing, >> + * unmounting a sharp from a union is a valid operation even >> + * if the device is blocked. >> + */ >> + if(amode == Aunmount){ >> + /* >> + * Doing any walks down the device could leak information >> + * about the existence of files. >> + */ >> + if(name[0] == '#' && utflen(name) == 2) >> + devunmount = 1; >> + amode = Aopen; >> + } else >> + devunmount = 0; >> + >> + /* >> * Find the starting off point (the current slash, the root of >> * a device tree, or the current dot) as well as the name to >> * evaluate starting there. >> @@ -1313,24 +1331,13 @@ >> up->genbuf[n++] = *name++; >> } >> up->genbuf[n] = '\0'; >> - /* >> - * noattach is sandboxing. >> - * >> - * the OK exceptions are: >> - * | it only gives access to pipes you create >> - * d this process's file descriptors >> - * e this process's environment >> - * the iffy exceptions are: >> - * c time and pid, but also cons and consctl >> - * p control of your own processes (and unfortunately >> - * any others left unprotected) >> - */ >> n = chartorune(&r, up->genbuf+1)+1; >> - if(up->pgrp->noattach && utfrune("|decp", r)==nil) >> - error(Enoattach); >> t = devno(r, 1); >> if(t == -1) >> error(Ebadsharp); >> + if(!devunmount && !driversallowed(up->pgrp, r)) >> + error(Enoattach); >> + >> c = devtab[t]->attach(up->genbuf+n); >> break; >> >> --- a/sys/src/9/port/devcons.c >> +++ b/sys/src/9/port/devcons.c >> @@ -39,6 +39,18 @@ >> CMrdb, "rdb", 0, >> }; >> >> +enum >> +{ >> + CMblock, >> + CMpermit, >> +}; >> + >> +Cmdtab drivermsg[] = >> +{ >> + CMblock, "block", 0, >> + CMpermit, "permit", 0, >> +}; >> + >> void >> printinit(void) >> { >> @@ -332,7 +344,7 @@ >> "cons", {Qcons}, 0, 0660, >> "consctl", {Qconsctl}, 0, 0220, >> "cputime", {Qcputime}, 6*NUMSIZE, 0444, >> - "drivers", {Qdrivers}, 0, 0444, >> + "drivers", {Qdrivers}, 0, 0666, >> "hostdomain", {Qhostdomain}, DOMLEN, 0664, >> "hostowner", {Qhostowner}, 0, 0664, >> "kmesg", {Qkmesg}, 0, 0440, >> @@ -583,9 +595,15 @@ >> case Qdrivers: >> b = smalloc(READSTR); >> k = 0; >> - for(i = 0; devtab[i] != nil; i++) >> + >> + rlock(&up->pgrp->ns); >> + for(i = 0; devtab[i] != nil; i++){ >> + if(up->pgrp->notallowed[i/(sizeof(u64int)*8)] & 1<> + continue; >> k += snprint(b+k, READSTR-k, "#%C %s\n", devtab[i]->dc, devtab[i]->name); >> + } >> + runlock(&up->pgrp->ns); >> if(waserror()){ >> free(b); >> nexterror(); >> @@ -676,6 +694,26 @@ >> error(Eperm); >> break; >> >> + case Qdrivers: >> + cb = parsecmd(a, n); >> + >> + if(waserror()) { >> + free(cb); >> + nexterror(); >> + } >> + ct = lookupcmd(cb, drivermsg, nelem(drivermsg)); >> + switch(ct->index) { >> + case CMblock: >> + driverscmd(up->pgrp, 0, cb->nf-1, cb->f+1); >> + break; >> + case CMpermit: >> + driverscmd(up->pgrp, 1, cb->nf-1, cb->f+1); >> + break; >> + } >> + poperror(); >> + free(cb); >> + break; >> + >> case Qreboot: >> if(!iseve()) >> error(Eperm); >> @@ -935,6 +973,64 @@ >> break; >> } >> return n+1; >> +} >> + >> +void >> +driverscmd(Pgrp *pgrp, int invert, int argc, char *argv[]) >> +{ >> + int i, t, w; >> + char *p; >> + Rune r; >> + u64int mask[nelem(pgrp->notallowed)]; >> + >> + if(invert) >> + memset(mask, 0xFF, sizeof mask); >> + else >> + memset(mask, 0, sizeof mask); >> + >> + w = sizeof mask[0] * 8; >> + for(i=0; i < argc; i++) >> + for(p = argv[i]; *p != '\0';){ >> + p += chartorune(&r, p); >> + t = devno(r, 1); >> + if(t == -1) >> + continue; >> + if(invert) >> + mask[t/w] &= ~(1<> + else >> + mask[t/w] |= 1<> + } >> + >> + wlock(&pgrp->ns); >> + for(i=0; i < nelem(pgrp->notallowed); i++) >> + pgrp->notallowed[i] |= mask[i]; >> + wunlock(&pgrp->ns); >> +} >> + >> +int >> +driversallowed(Pgrp *pgrp, int r) >> +{ >> + int t, w, b; >> + >> + t = devno(r, 1); >> + if(t == -1) >> + return 0; >> + >> + w = sizeof(u64int) * 8; >> + rlock(&pgrp->ns); >> + b = !(pgrp->notallowed[t/w] & 1<> + runlock(&pgrp->ns); >> + return b; >> +} >> + >> +int >> +canmount(Pgrp *pgrp) >> +{ >> + /* >> + * Devmnt is not usable directly from user procs, so >> + * having it removed is interpreted to block any mounts. >> + */ >> + return driversallowed(pgrp, 'M'); >> } >> >> void >> --- a/sys/src/9/port/devroot.c >> +++ b/sys/src/9/port/devroot.c >> @@ -105,6 +105,7 @@ >> addrootdir("net"); >> addrootdir("net.alt"); >> addrootdir("proc"); >> + addrootdir("rc"); >> addrootdir("root"); >> addrootdir("srv"); >> addrootdir("shr"); >> --- a/sys/src/9/port/devshr.c >> +++ b/sys/src/9/port/devshr.c >> @@ -464,7 +464,7 @@ >> cclose(c); >> return nc; >> case Qcroot: >> - if(up->pgrp->noattach) >> + if(!canmount(up->pgrp)) >> error(Enoattach); >> if((perm & DMDIR) == 0 || mode != OREAD) >> error(Eperm); >> @@ -498,7 +498,7 @@ >> sch->shr = shr; >> break; >> case Qcshr: >> - if(up->pgrp->noattach) >> + if(!canmount(up->pgrp)) >> error(Enoattach); >> if((perm & DMDIR) != 0 || mode != OWRITE) >> error(Eperm); >> @@ -731,7 +731,7 @@ >> Mhead *h; >> Mount *m; >> >> - if(up->pgrp->noattach) >> + if(!canmount(up->pgrp)) >> error(Enoattach); >> sch = tosch(c); >> if(sch->level != Qcmpt) >> --- a/sys/src/9/port/mkdevc >> +++ b/sys/src/9/port/mkdevc >> @@ -78,6 +78,9 @@ >> if(ARGC < 2) >> exit "usage" >> >> + if(ndev >= 256) >> + exit "device count will overflow Pgrp.notallowed" >> + >> printf "#include \"u.h\"\n"; >> printf "#include \"../port/lib.h\"\n"; >> printf "#include \"mem.h\"\n"; >> --- a/sys/src/9/port/portdat.h >> +++ b/sys/src/9/port/portdat.h >> @@ -121,6 +121,7 @@ >> Amount, /* to be mounted or mounted upon */ >> Acreate, /* is to be created */ >> Aremove, /* will be removed by caller */ >> + Aunmount, /* unmount arg[0] */ >> >> COPEN = 0x0001, /* for i/o */ >> CMSG = 0x0002, /* the message channel for a mount */ >> @@ -484,7 +485,7 @@ >> { >> Ref; >> RWlock ns; /* Namespace n read/one write lock */ >> - int noattach; >> + u64int notallowed[4]; /* Room for 256 devices */ >> Mhead *mnthash[MNTHASH]; >> }; >> >> --- a/sys/src/9/port/portfns.h >> +++ b/sys/src/9/port/portfns.h >> @@ -413,6 +413,9 @@ >> ushort nhgets(void*); >> ulong µs(void); >> long lcycles(void); >> +void driverscmd(Pgrp*,int,int,char**); >> +int driversallowed(Pgrp*, int); >> +int canmount(Pgrp*); >> >> #pragma varargck argpos iprint 1 >> #pragma varargck argpos panic 1 >> --- a/sys/src/9/port/sysfile.c >> +++ b/sys/src/9/port/sysfile.c >> @@ -1048,7 +1048,7 @@ >> nexterror(); >> } >> >> - if(up->pgrp->noattach) >> + if(!canmount(up->pgrp)) >> error(Enoattach); >> >> ac = nil; >> @@ -1160,14 +1160,8 @@ >> nexterror(); >> } >> if(name != nil) { >> - /* >> - * This has to be namec(..., Aopen, ...) because >> - * if arg[0] is something like /srv/cs or /fd/0, >> - * opening it is the only way to get at the real >> - * Chan underneath. >> - */ >> validaddr((uintptr)name, 1, 0); >> - cmounted = namec(name, Aopen, OREAD, 0); >> + cmounted = namec(name, Aunmount, OREAD, 0); >> } >> cunmount(cmount, cmounted); >> poperror(); >> --- a/sys/src/9/port/sysproc.c >> +++ b/sys/src/9/port/sysproc.c >> @@ -34,6 +34,7 @@ >> Egrp *oeg; >> ulong pid, flag; >> Mach *wm; >> + char *devs; >> >> flag = va_arg(list, ulong); >> /* Check flags before we commit */ >> @@ -44,6 +45,11 @@ >> if((flag & (RFENVG|RFCENVG)) == (RFENVG|RFCENVG)) >> error(Ebadarg); >> >> + /* >> + * Code using RFNOMNT expects to block all but >> + * the following devices. >> + */ >> + devs = "|decp"; >> if((flag&RFPROC) == 0) { >> if(flag & (RFMEM|RFNOWAIT)) >> error(Ebadarg); >> @@ -60,12 +66,12 @@ >> up->pgrp = newpgrp(); >> if(flag & RFNAMEG) >> pgrpcpy(up->pgrp, opg); >> - /* inherit noattach */ >> - up->pgrp->noattach = opg->noattach; >> + /* inherit notallowed */ >> + memmove(up->pgrp->notallowed, opg->notallowed, sizeof up->pgrp->notallowed); >> closepgrp(opg); >> } >> if(flag & RFNOMNT) >> - up->pgrp->noattach = 1; >> + driverscmd(up->pgrp, 1, 1, &devs); >> if(flag & RFREND) { >> org = up->rgrp; >> up->rgrp = newrgrp(); >> @@ -177,8 +183,8 @@ >> p->pgrp = newpgrp(); >> if(flag & RFNAMEG) >> pgrpcpy(p->pgrp, up->pgrp); >> - /* inherit noattach */ >> - p->pgrp->noattach = up->pgrp->noattach; >> + /* inherit notallowed */ >> + memmove(p->pgrp->notallowed, up->pgrp->notallowed, sizeof p->pgrp->notallowed); >> } >> else { >> p->pgrp = up->pgrp; >> @@ -185,7 +191,7 @@ >> incref(p->pgrp); >> } >> if(flag & RFNOMNT) >> - p->pgrp->noattach = 1; >> + driverscmd(p->pgrp, 1, 1, &devs); >> >> if(flag & RFREND) >> p->rgrp = newrgrp(); >> --- a/sys/src/libauth/newns.c >> +++ b/sys/src/libauth/newns.c >> @@ -14,8 +14,8 @@ >> static int setenv(char*, char*); >> static char *expandarg(char*, char*); >> static int splitargs(char*, char*[], char*, int); >> -static int nsfile(char*, Biobuf *, AuthRpc *); >> -static int nsop(char*, int, char*[], AuthRpc*); >> +static int nsfile(char*, Biobuf *, AuthRpc *, int); >> +static int nsop(char*, int, char*[], AuthRpc*, int); >> static int catch(void*, char*); >> >> int newnsdebug; >> @@ -35,7 +35,7 @@ >> { >> Biobuf *b; >> char home[4*ANAMELEN]; >> - int afd, cdroot; >> + int afd, cdroot, dfd; >> char *path; >> AuthRpc *rpc; >> >> @@ -51,6 +51,10 @@ >> } >> /* rpc != nil iff afd >= 0 */ >> >> + dfd = open("#c/drivers", OWRITE|OCEXEC); >> + if(dfd < 0 && newnsdebug) >> + fprint(2, "open #c/drivers: %r\n"); >> + >> if(file == nil){ >> if(!newns){ >> werrstr("no namespace file specified"); >> @@ -70,7 +74,8 @@ >> setenv("home", home); >> } >> >> - cdroot = nsfile(newns ? "newns" : "addns", b, rpc); >> + cdroot = nsfile(newns ? "newns" : "addns", b, rpc, dfd); >> + close(dfd); >> Bterm(b); >> freecloserpc(rpc); >> >> @@ -87,7 +92,7 @@ >> } >> >> static int >> -nsfile(char *fn, Biobuf *b, AuthRpc *rpc) >> +nsfile(char *fn, Biobuf *b, AuthRpc *rpc, int dfd) >> { >> int argc; >> char *cmd, *argv[NARG+1], argbuf[MAXARG*NARG]; >> @@ -103,7 +108,7 @@ >> continue; >> argc = splitargs(cmd, argv, argbuf, NARG); >> if(argc) >> - cdroot |= nsop(fn, argc, argv, rpc); >> + cdroot |= nsop(fn, argc, argv, rpc, dfd); >> } >> atnotify(catch, 0); >> return cdroot; >> @@ -143,7 +148,7 @@ >> } >> >> static int >> -nsop(char *fn, int argc, char *argv[], AuthRpc *rpc) >> +nsop(char *fn, int argc, char *argv[], AuthRpc *rpc, int dfd) >> { >> char *argv0; >> ulong flags; >> @@ -181,7 +186,7 @@ >> b = Bopen(argv[0], OREAD|OCEXEC); >> if(b == nil) >> return 0; >> - cdroot |= nsfile(fn, b, rpc); >> + cdroot |= nsfile(fn, b, rpc, dfd); >> Bterm(b); >> }else if(strcmp(argv0, "clear") == 0 && argc == 0){ >> rfork(RFCNAMEG); >> @@ -212,6 +217,14 @@ >> }else if(strcmp(argv0, "cd") == 0 && argc == 1){ >> if(chdir(argv[0]) == 0 && *argv[0] == '/') >> cdroot = 1; >> + }else if(argc >= 1 && (strcmp(argv0, "permit") == 0 || strcmp(argv0, "block") == 0)){ >> + //We should not silently fail if we can not honor a permit/block >> + //due to the parent namespace missing #c/drivers. >> + if(dfd <= 0) >> + sysfatal("%s requested, but could not open #c/drivers", argv0); >> + for(i=0; i < argc; i++) >> + if(fprint(dfd, "%s %s\n", argv0, argv[i]) < 0 && newnsdebug) >> + fprint(2, "%s: %s %s %r\n", fn, argv0, argv[i]); >> } >> return cdroot; >> } >> >