9front - general discussion about 9front
 help / color / mirror / Atom feed
* [9front] [PATCH] kernel: disallow executing from #| or #d
@ 2022-05-10  6:40 Jacob Moody
  2022-05-10 14:40 ` ori
  2022-05-10 20:52 ` [9front] " Anthony Martin
  0 siblings, 2 replies; 13+ messages in thread
From: Jacob Moody @ 2022-05-10  6:40 UTC (permalink / raw)
  To: 9front

Hello,

I noticed that you could execute from #| and #d.
ex:

cpu% bind '#|' /n/pipetest
cpu% /n/pipetest/data1 &
cpu% echo '#!/bin/rc' >> /n/pipetest/data
cpu% echo 'echo hello' >> /n/pipetest/data
cpu% hello

I believe this would also work for binaries
if the writer could predict what parts of the binary
the kernel will want to read. I am inclined to
block this behavior but would be curious what others
think. Included is a patch to error on OEXEC opens
in devdup and devpipe.

thanks,
moody

---
diff 6ca8e6bbafcc871301a90aa7bd4ca10533b1999a 521f304f5db4c74740b3e00503e2bf22e9f60dc6
--- a/sys/src/9/port/devdup.c	Mon May  9 11:22:00 2022
+++ b/sys/src/9/port/devdup.c	Tue May 10 00:35:16 2022
@@ -63,7 +63,7 @@
 	Chan *f;
 	int fd, twicefd;

-	if(omode & ORCLOSE)
+	if(omode & ORCLOSE || omode & OEXEC)
 		error(Eperm);
 	if(c->qid.type & QTDIR){
 		if(omode != 0)
--- a/sys/src/9/port/devpipe.c	Mon May  9 11:22:00 2022
+++ b/sys/src/9/port/devpipe.c	Tue May 10 00:35:16 2022
@@ -228,6 +228,8 @@
 		c->offset = 0;
 		return c;
 	}
+	if(omode == OEXEC)
+		error(Eperm);

 	p = c->aux;
 	qlock(p);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-10  6:40 [9front] [PATCH] kernel: disallow executing from #| or #d Jacob Moody
@ 2022-05-10 14:40 ` ori
  2022-05-10 16:34   ` Jacob Moody
  2022-05-10 20:52 ` [9front] " Anthony Martin
  1 sibling, 1 reply; 13+ messages in thread
From: ori @ 2022-05-10 14:40 UTC (permalink / raw)
  To: 9front

Quoth Jacob Moody <moody@mail.posixcafe.org>:
> Hello,
> 
> I noticed that you could execute from #| and #d.
> ex:
> 
> cpu% bind '#|' /n/pipetest
> cpu% /n/pipetest/data1 &
> cpu% echo '#!/bin/rc' >> /n/pipetest/data
> cpu% echo 'echo hello' >> /n/pipetest/data
> cpu% hello
> 
> I believe this would also work for binaries
> if the writer could predict what parts of the binary
> the kernel will want to read. I am inclined to
> block this behavior but would be curious what others
> think. Included is a patch to error on OEXEC opens
> in devdup and devpipe.

I think all files should be the same, as much as
they can be -- we shouldn't necessarily need to
care if we have a pipe or not.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-10 14:40 ` ori
@ 2022-05-10 16:34   ` Jacob Moody
  2022-05-10 19:59     ` Amavect
  0 siblings, 1 reply; 13+ messages in thread
From: Jacob Moody @ 2022-05-10 16:34 UTC (permalink / raw)
  To: 9front

On 5/10/22 08:40, ori@eigenstate.org wrote:
> I think all files should be the same, as much as
> they can be -- we shouldn't necessarily need to
> care if we have a pipe or not.
> 

In general I think I agree, but I would argue that its not the
exec code that is special casing itself from using these files.
The devices are electing themselves to say that execution is
not a supported operation on the files they serve, but perhaps
that is just semantics.

I want to explain more of why I think this is not ideal. I am
approaching this from the question "what capabilities does a sharp
device give you?". And I think it is a bit surprising to say access
to #| or #d also gives a process the ability to execute arbitrary code
stashed in to one end of a pipe. I can imagine such a case of building
a namespace where the binaries exposed are hand picked, and I think its
reasonable to want to restrict what binaries can be executed while also
allowing programs to use dup() and pipe().

But perhaps my approach here is wrong, it could be that it is inherently misleading
to think about a programs capabilities in terms of what kernel devices
the program has access to.


moody

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-10 16:34   ` Jacob Moody
@ 2022-05-10 19:59     ` Amavect
  2022-05-10 22:47       ` Jacob Moody
  0 siblings, 1 reply; 13+ messages in thread
From: Amavect @ 2022-05-10 19:59 UTC (permalink / raw)
  To: 9front

I agree with ori. It should be consistent.
Now, what's weird to me is why it can execute the pipe without the x bit set.
I can still do this as user none. ACE is fun.
Also, trying to cp /bin/ls into data fails to execute properly.

Thanks,
Amavect

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [9front] Re: [PATCH] kernel: disallow executing from #| or #d
  2022-05-10  6:40 [9front] [PATCH] kernel: disallow executing from #| or #d Jacob Moody
  2022-05-10 14:40 ` ori
@ 2022-05-10 20:52 ` Anthony Martin
  1 sibling, 0 replies; 13+ messages in thread
From: Anthony Martin @ 2022-05-10 20:52 UTC (permalink / raw)
  To: 9front

Jacob Moody <moody@mail.posixcafe.org> once said:
> I believe this would also work for binaries
> if the writer could predict what parts of the binary
> the kernel will want to read. I am inclined to
> block this behavior but would be curious what others
> think. Included is a patch to error on OEXEC opens
> in devdup and devpipe.

OREAD and OEXEC are essentially the same thing on
Plan 9. If you can open and read a file, you can always
just copy the contents to a new file that you own and
change the mode. Unix retains the illusion.

Cheers,
  Anthony

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-10 19:59     ` Amavect
@ 2022-05-10 22:47       ` Jacob Moody
  2022-05-11  4:21         ` Amavect
  0 siblings, 1 reply; 13+ messages in thread
From: Jacob Moody @ 2022-05-10 22:47 UTC (permalink / raw)
  To: 9front

On 5/10/22 13:59, Amavect wrote:
> I agree with ori. It should be consistent.

I guess I don't quite understand what is inconsistent with a device
telling the rest of the kernel that its files are unfit to be used
in an exec. It seems no more different then the device/file system
saying 'no this file is not fit for reads' and giving an error on reads.
I don't view that as read() being inconsistent.

> Now, what's weird to me is why it can execute the pipe without the x bit set.

That's due to the responsibility for honoring permission bits to be on the device
exposing the files itself. None of the code checks for DMEXEC before allowing
an open with OEXEC.

> Also, trying to cp /bin/ls into data fails to execute properly.

This is because #| does not honor reads with an offset, but #e does:

term% cat /bin/file > '#e/filetest'
term% '#e/filetest' /bin/file
/bin/file: amd64 plan 9 executable

On 5/10/22 14:52, Anthony Martin wrote:
> OREAD and OEXEC are essentially the same thing on
> Plan 9. If you can open and read a file, you can always
> just copy the contents to a new file that you own and
> change the mode. Unix retains the illusion.

This is true only if you have another place to copy the
binary to that you can then also exec from. Depending on your
namespace and current user, that may not be the case.

My interest here is really in making a restricted
namespace, like one with no writable directories,
useful from a capabilities context. You are right,
having an executable #e or a directory you can
copy something in to and execute from accomplishes
the same capability. But that also means you can't remove
that capability itself from the program without removing
access to both methods. We currently have a way of removing
that writable directory method through namespace operations,
but no way of removing the ability to use #e, #d, or #| this way.

Blocking the device entirely from the namespace is one way to do the later,
but thinking of access to #e as access to the capability of stashing and executing
arbitrary code didn't sit well with me on a first pass. On a second pass, #e
is really just a proc local writable directory that is restricted to only stashing
files. As such it probably makes sense that access to #e should be no different, in
terms of capabilities it gives, then any other writable folder.

For #| and #d, I am less convinced. Not only is the interface to execute from them
really not useful, due to not honoring read offsets, but I don't think I could generalize
their function in the same way I could with #e. That is to say, I don't think the capabilities
they give should be conflated with those of a typical writable dir.

This discussion is really interesting to me. Thank you for your thoughts.

moody

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-10 22:47       ` Jacob Moody
@ 2022-05-11  4:21         ` Amavect
  2022-05-11  6:31           ` Jacob Moody
  0 siblings, 1 reply; 13+ messages in thread
From: Amavect @ 2022-05-11  4:21 UTC (permalink / raw)
  To: 9front

I think the correct thing to do is to get the devices to respect the permission bits,
potentially allowing execution if chmod made it that way. Also check rw bits.
This would make it more consistent to my judgement.
Then, we can decide whether to reject a wstat for trying to set the x bit.

Additionally, if you're restricting namespaces,
you pretty much have to set RFNOMNT.
This removes access to # directories.
Devpipe is rendered useless due to using attach messages (can still use pipe(2)),
while env and dup can still be pre-binded or not (like capabilities).

Thanks,
Amavect

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-11  4:21         ` Amavect
@ 2022-05-11  6:31           ` Jacob Moody
  2022-05-11 16:32             ` Amavect
  0 siblings, 1 reply; 13+ messages in thread
From: Jacob Moody @ 2022-05-11  6:31 UTC (permalink / raw)
  To: 9front

On 5/10/22 22:21, Amavect wrote:
> I think the correct thing to do is to get the devices to respect the permission bits,
> potentially allowing execution if chmod made it that way. Also check rw bits.
> This would make it more consistent to my judgement.
> Then, we can decide whether to reject a wstat for trying to set the x bit.

I disagree, I put forward that having #|/^(data data1) executable has no valid
use case. Requiring +x set, then preventing +x from being set
is just a roundabout way of disabling opening with OEXEC, why
not just do that?

> Additionally, if you're restricting namespaces,
> you pretty much have to set RFNOMNT.

As is yes, I was poking around this code because I wish
to change those semantics. There is a patch
for this a bit earlier up in the mailing list.
Some of my thoughts are within the context
of that code.

> This removes access to # directories.

RFNOMNT does not remove access to #|, #d, #e, #c, or #p

> Devpipe is rendered useless due to using attach messages (can still use pipe(2)),

You can still attach to '#|' just fine after an RFNOMNT

cpu% rofk nm
cpu% bind '#|' /n/test

you can use pipe(2) after RFNOMNT specifically because RFNOMNT does not block #|.

The attach semantics of #| are a bit tricky with relation to the pipe
system call. An attach is sent to the device only when a walk is
started from '#', any walk to '#|' always gets you a new pair of pipes.
So in order to get an fd to each side, you need to only walk '#|'
once. That means without having the specific instance of #| 'pined'
somewhere, you can not refer to one end of the pipe by name which is
required to call exec. The pipe() system call does the walk for you,
and gives you back the fds without ever exposing the instance of
#|. So it is impossible to exec an end of a pipe walked to by
the pipe system call.

Or it would be, but you can execute that end of the pipe
by referring to it through #d:

#include <u.h>
#include <libc.h>

void
main(int argc, char **argv)
{
	int pipes[2];
	char *s;

	rfork(RFNAMEG|RFNOMNT);
	pipe(pipes);
	s = smprint("#d/%d", pipes[1]);
	if(!rfork(RFPROC|RFMEM)){
		execl(s, s, nil);
		sysfatal("failed to exec: %r");
	}
	fprint(pipes[0], "#!/bin/rc\n");
	fprint(pipes[0], "echo hello\n\n");
}

Now lets say you can block #d and #| for your process.
If my goal is to build a namespace that is not capable of executing
arbitrary memory, why should I have to block #d and #| to achieve this?
It doesn't seem like there is any practical use for them to leak this capability.

moody

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-11  6:31           ` Jacob Moody
@ 2022-05-11 16:32             ` Amavect
  2022-05-11 16:50               ` Jacob Moody
  0 siblings, 1 reply; 13+ messages in thread
From: Amavect @ 2022-05-11 16:32 UTC (permalink / raw)
  To: 9front

On May 11, 2022 1:31:27 AM CDT, Jacob Moody <moody@mail.posixcafe.org> wrote:
>Requiring +x set, then preventing +x from being set
>is just a roundabout way of disabling opening with OEXEC, why
>not just do that?
I'm confused why devpipe allows wstat.
The roundabout way is more general,
checking read/write permissions as well.
The permission info should line up with how a dev allows permissions,
and making open dependent on what the mode is actually set to
would make the info and the behavior always line up.
It might be as easy as a bitwise or.
I'll hack up something later tonight.

If we don't decide on that route,
then devpipe should disallow wstat.

>RFNOMNT does not remove access to #|, #d, #e, #c, or #p
That's what I get for only reading the man page without testing.


Thanks,
Amavect

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-11 16:32             ` Amavect
@ 2022-05-11 16:50               ` Jacob Moody
  2022-05-15  2:43                 ` Amavect
  0 siblings, 1 reply; 13+ messages in thread
From: Jacob Moody @ 2022-05-11 16:50 UTC (permalink / raw)
  To: 9front

On 5/11/22 10:32, Amavect wrote:
> If we don't decide on that route,
> then devpipe should disallow wstat.

The roundabout way of checking for permission bits was
my first pass at removing this capability. I am much
more in favor of just disabling wstat for devpipe
personally. If you did want to make one end
of a pipe read only or write only, doing it through
wstat feels obtuse due to the attach semantics.

> 
>> RFNOMNT does not remove access to #|, #d, #e, #c, or #p
> That's what I get for only reading the man page without testing.

Yeah the man page should not lie, how's this instead?

thanks,
moody

diff 51669adf2446385b38bab4efcb4133c19e9be806 uncommitted
--- a//sys/man/2/fork
+++ b//sys/man/2/fork
@@ -70,9 +70,16 @@
 .TP
 .B RFNOMNT
 If set, subsequent mounts into the new name space and dereferencing
-of pathnames starting with
+of most pathnames starting with
 .B #
-are disallowed.
+are disallowed. Specifically
+.IR pipe(3) ,
+.IR dup(3) ,
+.IR env(3) ,
+.IR cons(3) ,
+and
+.IR proc(3)
+are still permitted.
 .TP
 .B RFENVG
 If set, the environment variables are copied;


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-11 16:50               ` Jacob Moody
@ 2022-05-15  2:43                 ` Amavect
  2022-05-15 15:26                   ` Amavect
  0 siblings, 1 reply; 13+ messages in thread
From: Amavect @ 2022-05-15  2:43 UTC (permalink / raw)
  To: 9front

On Wed, 11 May 2022 10:50:44 -0600
Jacob Moody <moody@mail.posixcafe.org> wrote:

> The roundabout way of checking for permission bits was
> my first pass at removing this capability. I am much
> more in favor of just disabling wstat for devpipe
> personally. If you did want to make one end
> of a pipe read only or write only, doing it through
> wstat feels obtuse due to the attach semantics.

Fair enough. I'm all for it now.
Just make sure #| stat has mode 666 since user none can read it despite
saying mode 600.

> Yeah the man page should not lie, how's this instead?

The .IR parts are off. The (3)s need to be R. See below.
(you already committed yours, whoops)

Thanks,
Amavect


diff 51669adf2446385b38bab4efcb4133c19e9be806 uncommitted
--- a//sys/man/2/fork
+++ b//sys/man/2/fork
@@ -70,9 +70,16 @@
 .TP
 .B RFNOMNT
 If set, subsequent mounts into the new name space and dereferencing
-of pathnames starting with
+of most pathnames starting with
 .B #
-are disallowed.
+are disallowed. Specifically
+.IR pipe (3),
+.IR dup (3),
+.IR env (3),
+.IR cons (3),
+and
+.IR proc (3)
+are still permitted.
 .TP
 .B RFENVG
 If set, the environment variables are copied;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-15  2:43                 ` Amavect
@ 2022-05-15 15:26                   ` Amavect
  2022-05-15 16:28                     ` Jacob Moody
  0 siblings, 1 reply; 13+ messages in thread
From: Amavect @ 2022-05-15 15:26 UTC (permalink / raw)
  To: 9front

Moody,

Your patch made the commas italic.

2c67fb9794369eea87c0d2179560f8f79198d34c
-.IR pipe(3) ,
-.IR dup(3) ,
-.IR env(3) ,
-.IR cons(3) ,
+.IR pipe (3) ,
+.IR dup (3) ,
+.IR env (3) ,
+.IR cons (3) ,

It needs to be the following:
+.IR pipe (3),
+.IR dup (3),
+.IR env (3),
+.IR cons (3),

This will make the commas regular.

Thanks,
Amavect

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [9front] [PATCH] kernel: disallow executing from #| or #d
  2022-05-15 15:26                   ` Amavect
@ 2022-05-15 16:28                     ` Jacob Moody
  0 siblings, 0 replies; 13+ messages in thread
From: Jacob Moody @ 2022-05-15 16:28 UTC (permalink / raw)
  To: 9front

On 5/15/22 09:26, Amavect wrote:
> Moody,
> 
> Your patch made the commas italic.
> 

Nice catch! I completely missed that.
Pushed.


Thank you,
moody

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-05-15 16:34 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-10  6:40 [9front] [PATCH] kernel: disallow executing from #| or #d Jacob Moody
2022-05-10 14:40 ` ori
2022-05-10 16:34   ` Jacob Moody
2022-05-10 19:59     ` Amavect
2022-05-10 22:47       ` Jacob Moody
2022-05-11  4:21         ` Amavect
2022-05-11  6:31           ` Jacob Moody
2022-05-11 16:32             ` Amavect
2022-05-11 16:50               ` Jacob Moody
2022-05-15  2:43                 ` Amavect
2022-05-15 15:26                   ` Amavect
2022-05-15 16:28                     ` Jacob Moody
2022-05-10 20:52 ` [9front] " Anthony Martin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).