[PATCH] faccessat: fix error code on setreXid failure

mailing list of musl libc

* [PATCH] faccessat: fix error code on setreXid failure
@ 2018-01-30 20:32 Alexander Monakov
  2018-01-30 21:33 ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Monakov @ 2018-01-30 20:32 UTC
  To: musl

Commit 316d6741b68b485205d7233c98bd6c795bb80370 changed one use of
SYS_exit in 'checker' without changing another just three lines above,
and then commit f9fb20b42da0e755d93de229a5a737d79a0e8f60 changed the
meaning of return value, causing EACCES to be reported instead of EBUSY
if preparatory setregid/setreuid fail.
---

This is the minimal fix for the issue, but it appears there's another:
collecting checker's exit code and reaping the zombie is implemented as

		int status;
		do {
			__syscall(SYS_wait4, pid, &status, __WCLONE, 0);
		} while (!WIFEXITED(status) && !WIFSIGNALED(status));

but I don't understand why this retry loop is required and correct:

- if another thread won the race to collect the zombie by doing something
  like waitpid(-1, 0, __WALL), it fails to check syscall's return value and
  uses uninitialized 'status', possibly causing an infinite loop or OOB access
  in the parent;

- the code seems to assume that the zombie will not be auto-collected even if
  the SIGCHLD disposition is set to SIG_IGN; this sounds logical, but it is not
  explicitly documented as far as I can tell;

- if the two problems above don't arise, I don't see how the test in while ()
  condition can fail; we have signals blocked, so waitpid can only return when
  the child no longer exists.

Plus, using CLONE_VM | CLONE_VFORK would help conserve resources.

Alexander

 src/unistd/faccessat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/unistd/faccessat.c b/src/unistd/faccessat.c
index 33478959..954cbdb4 100644
--- a/src/unistd/faccessat.c
+++ b/src/unistd/faccessat.c
@@ -25,7 +25,7 @@ static int checker(void *p)
 	int i;
 	if (__syscall(SYS_setregid, __syscall(SYS_getegid), -1)
 	    || __syscall(SYS_setreuid, __syscall(SYS_geteuid), -1))
-		__syscall(SYS_exit, 1);
+		return sizeof errors/sizeof *errors - 1;
 	ret = __syscall(SYS_faccessat, c->fd, c->filename, c->amode, 0);
 	for (i=0; i < sizeof errors/sizeof *errors - 1 && ret!=errors[i]; i++);
 	return i;
-- 
2.11.0
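
The first concern in the list above (using 'status' without checking whether the wait succeeded) can be illustrated with a minimal sketch. It is not musl's code: it uses plain fork() and waitpid() rather than clone() with __WCLONE, and the helper names reap_exit_code and demo_reap are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reap `pid` and return its exit code, or -1 if the
 * child could not be reaped or did not terminate via a normal exit. */
static int reap_exit_code(pid_t pid)
{
	int status = 0;
	pid_t r;

	/* Retry only on EINTR. Any other failure (for example ECHILD because
	 * another waiter already collected the child) means `status` was
	 * never written and must not be inspected. */
	do r = waitpid(pid, &status, 0);
	while (r < 0 && errno == EINTR);

	if (r != pid) return -1;
	if (!WIFEXITED(status)) return -1;
	return WEXITSTATUS(status);
}

/* Hypothetical demo: fork a child that exits with `code`, then reap it. */
static int demo_reap(int code)
{
	pid_t pid = fork();
	if (pid < 0) return -1;
	if (pid == 0) _exit(code);
	return reap_exit_code(pid);
}
```

With this shape, a lost race surfaces as an explicit failure value, which the caller could map to an error such as EBUSY, instead of as a read of an indeterminate 'status'.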




* Re: [PATCH] faccessat: fix error code on setreXid failure
  2018-01-30 20:32 [PATCH] faccessat: fix error code on setreXid failure Alexander Monakov
@ 2018-01-30 21:33 ` Rich Felker
  2018-01-30 21:51   ` Alexander Monakov
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Felker @ 2018-01-30 21:33 UTC
  To: musl

On Tue, Jan 30, 2018 at 11:32:37PM +0300, Alexander Monakov wrote:
> Commit 316d6741b68b485205d7233c98bd6c795bb80370 changed one use of
> SYS_exit in 'checker' without changing another just three lines above,
> and then commit f9fb20b42da0e755d93de229a5a737d79a0e8f60 changed the
> meaning of return value, causing EACCES to be reported instead of EBUSY
> if preparatory setregid/setreuid fail.

This looks right.

> This is the minimal fix for the issue, but it appears there's another:
> collecting checker's exit code and reaping the zombie is implemented as
> 
> 		int status;
> 		do {
> 			__syscall(SYS_wait4, pid, &status, __WCLONE, 0);
> 		} while (!WIFEXITED(status) && !WIFSIGNALED(status));
> 
> but I don't understand why this retry loop is required and correct:

AFAIK wait4 can also return due to Stopped status or trace-related
reasons, not just exit. That was the motivation I think.

> - if another thread won the race to collect the zombie by doing something
>   like waitpid(-1, 0, __WALL), it fails to check syscall's return value and
>   uses uninitialized 'status', possibly causing an infinite loop or OOB access
>   in the parent;

Yes, I think that should be fixed even though I think it's broken
usage.

> - the code seems to assume that the zombie will not be auto-collected even if
>   the SIGCHLD disposition is set to SIG_IGN; this sounds logical, but it is not
>   explicitly documented as far as I can tell;

Indeed, I'm not sure, but I don't know any good fix.

> - if the two problems above don't arise, I don't see how the test in while ()
>   condition can fail; we have signals blocked, so waitpid can only return when
>   the child no longer exists.
> 
> Plus, using CLONE_VM | CLONE_VFORK would help conserve resources.

I'm not sure if it's safe -- having tasks sharing vm with different
permissions is usually a very bad thing. It might be ok here but I'm
not sure.

> 
> Alexander
> 
>  src/unistd/faccessat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/unistd/faccessat.c b/src/unistd/faccessat.c
> index 33478959..954cbdb4 100644
> --- a/src/unistd/faccessat.c
> +++ b/src/unistd/faccessat.c
> @@ -25,7 +25,7 @@ static int checker(void *p)
>  	int i;
>  	if (__syscall(SYS_setregid, __syscall(SYS_getegid), -1)
>  	    || __syscall(SYS_setreuid, __syscall(SYS_geteuid), -1))
> -		__syscall(SYS_exit, 1);
> +		return sizeof errors/sizeof *errors - 1;
>  	ret = __syscall(SYS_faccessat, c->fd, c->filename, c->amode, 0);
>  	for (i=0; i < sizeof errors/sizeof *errors - 1 && ret!=errors[i]; i++);
>  	return i;

Looks ok except it encodes an assumption that EBUSY is last. It might
make more sense to goto the errno-searching loop.

Rich
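
The point about the position of EBUSY can be sketched as follows. The table contents here are illustrative and are not copied from musl; the only properties taken from the thread are that exit code 1 ends up reported as EACCES and that EBUSY is the last entry. The name error_index is hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Illustrative table (an assumption, not musl's actual list): index 0 is
 * success, index 1 is EACCES, and the last entry, EBUSY, is the fallback. */
static const int errors[] = { 0, -EACCES, -ENOENT, -ENOTDIR, -EBUSY };

#define NERRORS (sizeof errors / sizeof *errors)

/* Map a raw syscall result to a table index small enough for an exit code.
 * The loop stops at the last entry whether or not it matches, so any
 * unrecognized value yields the index of the fallback entry. */
static size_t error_index(int ret)
{
	size_t i;
	for (i = 0; i < NERRORS - 1 && ret != errors[i]; i++);
	return i;
}
```

Returning error_index(-EBUSY) on setreXid failure would go through the search loop instead of hard-coding NERRORS - 1, so it would keep working if the table were reordered, as long as EBUSY stays somewhere in it.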



* Re: [PATCH] faccessat: fix error code on setreXid failure
  2018-01-30 21:33 ` Rich Felker
@ 2018-01-30 21:51   ` Alexander Monakov
  2018-01-30 22:07     ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Monakov @ 2018-01-30 21:51 UTC
  To: musl

On Tue, 30 Jan 2018, Rich Felker wrote:
> 
> AFAIK wait4 can also return due to Stopped status or trace-related
> reasons, not just exit. That was the motivation I think.

We know we are not tracing this child, and stop notifications are only
delivered if WUNTRACED is given in flags, aren't they?

> > - the code seems to assume that the zombie will not be auto-collected even if
> >   the SIGCHLD disposition is set to SIG_IGN; this sounds logical, but it is not
> >   explicitly documented as far as I can tell;
> 
> Indeed, I'm not sure, but I don't know any good fix.

Bring back the pipe (similar to how posix_spawn receives the status)?

> > --- a/src/unistd/faccessat.c
> > +++ b/src/unistd/faccessat.c
> > @@ -25,7 +25,7 @@ static int checker(void *p)
> >  	int i;
> >  	if (__syscall(SYS_setregid, __syscall(SYS_getegid), -1)
> >  	    || __syscall(SYS_setreuid, __syscall(SYS_geteuid), -1))
> > -		__syscall(SYS_exit, 1);
> > +		return sizeof errors/sizeof *errors - 1;
> >  	ret = __syscall(SYS_faccessat, c->fd, c->filename, c->amode, 0);
> >  	for (i=0; i < sizeof errors/sizeof *errors - 1 && ret!=errors[i]; i++);
> >  	return i;
> 
> Looks ok except it encodes an assumption that EBUSY is last. It might
> make more sense to goto the errno-searching loop.

Well, the loop also implicitly encodes that assumption anyway: it stops
at the last entry regardless of whether it matches, making EBUSY the
fallback code for unrecognized SYS_faccessat return values.

The loop will be gone if the pipe method is re-introduced.

Alexander
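
A rough sketch of the pipe approach mentioned above, in which the child writes its full result to a pipe so the parent does not depend on the exit status. This is an illustration under stated assumptions, not the code that was in musl: it uses fork() instead of clone(), does not block signals or handle EINTR on read(), and the names run_in_child, sample_denied and sample_ok are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run check() in a child process and return the int
 * it produced, or -EBUSY if the result could not be obtained. */
static int run_in_child(int (*check)(void))
{
	const ssize_t len = sizeof(int);
	int p[2], ret = -EBUSY, status;
	pid_t pid;

	if (pipe(p)) return -EBUSY;

	pid = fork();
	if (pid < 0) {
		close(p[0]);
		close(p[1]);
		return -EBUSY;
	}
	if (pid == 0) {
		/* Child: report the full result through the pipe. */
		int r = check();
		close(p[0]);
		_exit(write(p[1], &r, sizeof r) != len);
	}

	/* Parent: close the write end first so read() sees EOF if the
	 * child dies without writing anything. */
	close(p[1]);
	if (read(p[0], &ret, sizeof ret) != len) ret = -EBUSY;
	close(p[0]);

	/* The child is still reaped, but its exit status is not needed. */
	while (waitpid(pid, &status, 0) < 0 && errno == EINTR);
	return ret;
}

/* Hypothetical stand-ins for the real access check. */
static int sample_denied(void) { return -EACCES; }
static int sample_ok(void) { return 0; }
```

Because the result travels as a whole int, no table of known error codes is needed, and a failed or stolen wait no longer affects the value returned to the caller.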



* Re: [PATCH] faccessat: fix error code on setreXid failure
  2018-01-30 21:51   ` Alexander Monakov
@ 2018-01-30 22:07     ` Rich Felker
  2018-01-30 22:20       ` Alexander Monakov
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Felker @ 2018-01-30 22:07 UTC
  To: musl

On Wed, Jan 31, 2018 at 12:51:58AM +0300, Alexander Monakov wrote:
> On Tue, 30 Jan 2018, Rich Felker wrote:
> > 
> > AFAIK wait4 can also return due to Stopped status or trace-related
> > reasons, not just exit. That was the motivation I think.
> 
> We know we are not tracing this child, and stop notifications are only
> delivered if WUNTRACED is given in flags, aren't they?

I'm not sure what can happen if it's all running under strace -f or
something. And I'm not sure what the conditions for stop notifications
are. If it's assured that they can't happen then maybe the loop can be
removed.

> > > - the code seems to assume that the zombie will not be auto-collected even if
> > >   the SIGCHLD disposition is set to SIG_IGN; this sounds logical, but it is not
> > >   explicitly documented as far as I can tell;
> > 
> > Indeed, I'm not sure, but I don't know any good fix.
> 
> Bring back the pipe (similar to how posix_spawn receives the status)?

Ah yes.

> > > --- a/src/unistd/faccessat.c
> > > +++ b/src/unistd/faccessat.c
> > > @@ -25,7 +25,7 @@ static int checker(void *p)
> > >  	int i;
> > >  	if (__syscall(SYS_setregid, __syscall(SYS_getegid), -1)
> > >  	    || __syscall(SYS_setreuid, __syscall(SYS_geteuid), -1))
> > > -		__syscall(SYS_exit, 1);
> > > +		return sizeof errors/sizeof *errors - 1;
> > >  	ret = __syscall(SYS_faccessat, c->fd, c->filename, c->amode, 0);
> > >  	for (i=0; i < sizeof errors/sizeof *errors - 1 && ret!=errors[i]; i++);
> > >  	return i;
> > 
> > Looks ok except it encodes an assumption that EBUSY is last. It might
> > make more sense to goto the errno-searching loop.
> 
> Well, the loop also implicitly encodes that assumption anyway: it stops
> at the last entry regardless of whether it matches, making EBUSY the
> fallback code for unrecognized SYS_faccessat return values.

Indeed.

> The loop will be gone if the pipe method is re-introduced.

Maybe this is the right approach. Anyone else have opinions on it?
This would just mean reverting the most recent commit to it then
possibly fixing other issues you found, right?

Rich



* Re: [PATCH] faccessat: fix error code on setreXid failure
  2018-01-30 22:07     ` Rich Felker
@ 2018-01-30 22:20       ` Alexander Monakov
  2018-02-01  3:19         ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Monakov @ 2018-01-30 22:20 UTC
  To: musl

On Tue, 30 Jan 2018, Rich Felker wrote:
> > We know we are not tracing this child, and stop notifications are only
> > delivered if WUNTRACED is given in flags, aren't they?
> 
> I'm not sure what can happen if it's all running under strace -f or
> something. And I'm not sure what the conditions for stop notifications
> are. If it's assured that they can't happen then maybe the loop can be
> removed.

Well, currently musl is just inconsistent, as in other instance(s?)
(most notably in posix_spawn) it makes a single call to waitpid
without retrying even though the same concerns apply.

Alexander



* Re: [PATCH] faccessat: fix error code on setreXid failure
  2018-01-30 22:20       ` Alexander Monakov
@ 2018-02-01  3:19         ` Rich Felker
  0 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2018-02-01  3:19 UTC
  To: musl

On Wed, Jan 31, 2018 at 01:20:45AM +0300, Alexander Monakov wrote:
> On Tue, 30 Jan 2018, Rich Felker wrote:
> > > We know we are not tracing this child, and stop notifications are only
> > > delivered if WUNTRACED is given in flags, aren't they?
> > 
> > I'm not sure what can happen if it's all running under strace -f or
> > something. And I'm not sure what the conditions for stop notifications
> > are. If it's assured that they can't happen then maybe the loop can be
> > removed.
> 
> Well, currently musl is just inconsistent, as in other instance(s?)
> (most notably in posix_spawn) it makes a single call to waitpid
> without retrying even though the same concerns apply.

OK, reading the spec I think without WUNTRACED or WCONTINUED, waitpid
can only succeed if the child has terminated. So it's probably fine
not to loop as long as signals are blocked.

It's perhaps (probably?) also safe to use CLONE_VM for the same reason
as in posix_spawn: because we have all (including implementation
internal) signals blocked, set*id() cannot succeed while the child
exists and thus you can't get into a situation where the child is
sharing memory but has different credentials. (The attack I have in
mind here to defend against is that the child drops root then runs
user-provided code, and the user-provided code modifies the code
that's running in the other process with elevated privileges.)

Rich
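
For readers unfamiliar with the flags, here is a minimal Linux-only sketch of what CLONE_VM | CLONE_VFORK gives: the child runs in the parent's address space while the parent is suspended, so a result can be handed back through ordinary memory. It assumes the glibc/musl clone() wrapper and a downward-growing stack; the names struct shared, child_fn and run_shared_vm are hypothetical. It deliberately omits the two things the safety argument above depends on, blocking all signals and changing credentials in the child, so it only shows the mechanics.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

struct shared { int result; };

/* Runs in the child. Because of CLONE_VM this store lands directly in
 * the parent's memory. */
static int child_fn(void *p)
{
	struct shared *s = p;
	s->result = 42;
	return 0;
}

/* Hypothetical demo: returns the value the child stored, or -1 on error. */
static int run_shared_vm(void)
{
	enum { STACK_SIZE = 64 * 1024 };
	struct shared s = { -1 };
	int status;
	pid_t pid;
	char *stack = malloc(STACK_SIZE);

	if (!stack) return -1;

	/* Pass the top of the buffer as the child's stack. CLONE_VFORK keeps
	 * the parent suspended until the child exits; SIGCHLD as the exit
	 * signal lets a plain waitpid() reap it. */
	pid = clone(child_fn, stack + STACK_SIZE,
	            CLONE_VM | CLONE_VFORK | SIGCHLD, &s);
	if (pid < 0) {
		free(stack);
		return -1;
	}
	waitpid(pid, &status, 0);
	free(stack);
	return s.result;
}
```

No page tables are copied for the child, which is the resource saving mentioned earlier in the thread; the cost is that the child must be trusted not to run arbitrary code while it shares memory with the parent.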


Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
