preventable SIGSEGV when bad AT_SYSINFO_EHDR

mailing list of musl libc

* preventable SIGSEGV when bad AT_SYSINFO_EHDR
@ 2017-09-19 16:46 John Reiser
  2017-09-19 17:21 ` Markus Wichmann
  0 siblings, 1 reply; 3+ messages in thread
From: John Reiser @ 2017-09-19 16:46 UTC
  To: musl

__dls3() and friends in musl/ldso/dynlink.c should check Elf headers more carefully.
I saw a SIGSEGV in decode_dyn() because vdso_base = ElfXX_auxv[{AT_SYSINFO_EHDR}].a_ptr
pointed to a region that was all zero, and thus vdso.dynv == 0.  The operating system
kernel is the only one who can perform a fork() or clone(), but other software can
perform execve().  In my case that other software had a bug.  However, the blame
for the SIGSEGV rests on __dls3() because it did not validate input data.  [This is
the stuff of exploits.]  Calling a_crash() is OK; but a preventable SIGSEGV must be
avoided, both directly and because it indicates a lack of secure implementation.

It is [mostly] reasonable that __dls3() should trust that a non-zero vdso_base points to
a region that is readable, is as big and as aligned as an ElfXX_Ehdr, and is const
(no other thread is writing it, neither is any other process via a shared memory mapping);
but after that ldso should check.

In particular, these should be checked:
   0 == memcmp(ELFMAG, &.e_ident[EI_MAG0], SELFMAG)
   .e_machine matches the executing ldso
   .e_ident[{EI_CLASS, EI_DATA}] match the executing ldso
   .e_phnum != 0
   .e_phentsize >= sizeof(ElfXX_Phdr);  and larger *IS ALLOWED*: derived classes, etc.
   .e_phnum * .e_phentsize is not too large  [loops that increment a pointer by .e_phentsize]
   .e_phoff >= sizeof(ElfXX_Ehdr);  overlap of Ehdr and Phdr is a logical error
   (.e_phoff + .e_phnum * .e_phentsize) < .st_size;  no access beyond EOF

-- 


* Re: preventable SIGSEGV when bad AT_SYSINFO_EHDR
  2017-09-19 16:46 preventable SIGSEGV when bad AT_SYSINFO_EHDR John Reiser
@ 2017-09-19 17:21 ` Markus Wichmann
  2017-09-19 18:48   ` Rich Felker
  0 siblings, 1 reply; 3+ messages in thread
From: Markus Wichmann @ 2017-09-19 17:21 UTC
  To: musl

On Tue, Sep 19, 2017 at 09:46:19AM -0700, John Reiser wrote:
> __dls3() and friends in musl/ldso/dynlink.c should check Elf headers more carefully.
> I saw a SIGSEGV in decode_dyn() because vdso_base = ElfXX_auxv[{AT_SYSINFO_EHDR}].a_ptr
> pointed to a region that was all zero, and thus vdso.dynv == 0.  The operating system
> kernel is the only one who can perform a fork() or clone(), but other software can
> perform execve().  In my case that other software had a bug.  However, the blame
> for the SIGSEGV rests on __dls3() because it did not validate input data.  [This is
> the stuff of exploits.]  Calling a_crash() is OK; but a preventable SIGSEGV must be
> avoided, both directly and because it indicates a lack of secure implementation.
> 

How esoteric.

As far as I know, the aux headers come from the kernel and are
implicitly trusted because of that. If you have some userspace program
trying to do execve() without a kernel call, then that program needs to
correctly implement the aux headers. Mistrusting aux headers is as
sensible as mistrusting the kernel. For example, we fetch our user
credentials out of the aux headers in __init_libc().

But if your program emulates execve(), then how does security come into
play here? Security is only important if security domains are switched,
which a userspace program can't do (discounting kernel bugs, of course).
And so you're left with a program that might be able to exploit musl
into running arbitrary code in its own security domain. Newsflash: You
can already run arbitrary code in your own security domain. It's called
program execution. And if a buggy program was running with elevated
privileges and somehow could be tricked into running its execve()
emulation with attacker controlled data, then the problem has already
happened a long time ago, and you're merely asking us to fix the
symptoms.

In short, the quickest fix is to not emulate execve() but call the
kernel instead, hoping it won't screw up the ABI (if it does, abandon
ship, since all hope is lost).

Another fix would be to just link your programs statically. That way,
all the __dls*() functions don't run at all.

> It is [mostly] reasonable that __dls3() should trust that a non-zero vdso_base points to
> a region that is readable, is as big and as aligned as an ElfXX_Ehdr, and is const
> (no other thread is writing it, neither is any other process via a shared memory mapping);
> but after that ldso should check.
> 

No, if the value is set then it is correct by definition, just as much
as AT_UID or AT_SECURE are.

> In particular, these should be checked:
>   0 == memcmp(ELFMAG, &.e_ident[EI_MAG0], SELFMAG)
>   .e_machine matches the executing ldso
>   .e_ident[{EI_CLASS, EI_DATA}] match the executing ldso
>   .e_phnum != 0
>   .e_phentsize >= sizeof(ElfXX_Phdr);  and larger *IS ALLOWED*: derived classes, etc.

How do classes come into play in a file format? To my knowledge,
program headers have an explicitly defined layout, and a mismatched
e_phentsize indicates that the program headers in the file are not what
you thought they were.
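
For reference, the two readings of e_phentsize differ only in the loop stride.
A minimal sketch, assuming the ELF header has already been checked; find_phdr()
is a hypothetical name, not a musl function. Stepping by e_phentsize rather
than by sizeof(ElfXX_Phdr) is what lets such a loop tolerate a larger entry
size:

```c
#include <assert.h>
#include <elf.h>
#include <link.h>      /* ElfW() */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the first program header of the given type
 * into *out and returns 1, or returns 0 (leaving *out untouched) if there
 * is none. Assumes the header at base was validated beforehand. */
static int find_phdr(const unsigned char *base, uint32_t type, ElfW(Phdr) *out)
{
    ElfW(Ehdr) eh;
    ElfW(Phdr) ph;
    const unsigned char *p;
    unsigned i;

    assert(base != NULL && out != NULL);
    memcpy(&eh, base, sizeof eh);
    p = base + eh.e_phoff;
    for (i = 0; i < eh.e_phnum; i++, p += eh.e_phentsize) {
        memcpy(&ph, p, sizeof ph);   /* read only the known leading fields */
        if (ph.p_type == type) {
            *out = ph;
            return 1;
        }
    }
    return 0;
}
```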

>   .e_phnum * .e_phentsize is not too large  [loops that increment a pointer by .e_phentsize]
>   .e_phoff >= sizeof(ElfXX_Ehdr);  overlap of Ehdr and Phdr is a logical error
>   (.e_phoff + .e_phnum * .e_phentsize) < .st_size;  no access beyond EOF
> 

And where does that EOF check come from? We only know the start of the
kernel vDSO, but not its length. We can't stat() it because it is not a
file; the kernel merely maps it into every process.

Ciao,
Markus


* Re: preventable SIGSEGV when bad AT_SYSINFO_EHDR
  2017-09-19 17:21 ` Markus Wichmann
@ 2017-09-19 18:48   ` Rich Felker
  0 siblings, 0 replies; 3+ messages in thread
From: Rich Felker @ 2017-09-19 18:48 UTC
  To: musl

On Tue, Sep 19, 2017 at 07:21:29PM +0200, Markus Wichmann wrote:
> On Tue, Sep 19, 2017 at 09:46:19AM -0700, John Reiser wrote:
> > __dls3() and friends in musl/ldso/dynlink.c should check Elf headers more carefully.
> > I saw a SIGSEGV in decode_dyn() because vdso_base = ElfXX_auxv[{AT_SYSINFO_EHDR}].a_ptr
> > pointed to a region that was all zero, and thus vdso.dynv == 0.  The operating system
> > kernel is the only one who can perform a fork() or clone(), but other software can
> > perform execve().  In my case that other software had a bug.  However, the blame
> > for the SIGSEGV rests on __dls3() because it did not validate input data.  [This is
> > the stuff of exploits.]  Calling a_crash() is OK; but a preventable SIGSEGV must be
> > avoided, both directly and because it indicates a lack of secure implementation.
> > 
> 
> How esoteric.
> 
> As far as I know, the aux headers come from the kernel and are
> implicitly trusted because of that. If you have some userspace program
> trying to do execve() without a kernel call, then that program needs to
> correctly implement the aux headers. Mistrusting aux headers is as
> sensible as mistrusting the kernel. For example, we fetch our user
> credentials out of the aux headers in __init_libc().
> 
> But if your program emulates execve(), then how does security come into
> play here? Security is only important if security domains are switched,
> which a userspace program can't do (discounting kernel bugs, of course).
> And so you're left with a program that might be able to exploit musl
> into running arbitrary code in its own security domain. Newsflash: You
> can already run arbitrary code in your own security domain. It's called

This is pretty much the end of the story -- no crossing of privilege
domains takes place so there is no "untrusted input from outside the
privilege domain" to validate. Theoretically there's a huge amount of
initial environmental state provided by the kernel/program-loader to
the entry point, not just auxv. Aside from not giving you any new
security properties, maximally validating it would be a huge amount of
work, both implementation work and runtime work, and still could not
catch all possible invalid cases.

Pretty much the only time it does make sense to validate what the
kernel gives you is when there are known historical kernels or other
loaders (Linux or Linux-ABI-compat loaders in other operating systems
or emulators) with bugs that would cause program misbehavior. For
example, see commit 54482898abe8d6d937ee67ea5974cd8eae859c37 which
validates that AT_SYSINFO_EHDR is not just present but nonzero, since
Linux/s390x wrongly passes it but with a zero value.
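
The kind of check described here can be sketched as below. This is not the
code from that commit; vdso_base_from_auxv() is a hypothetical name and the
auxv is assumed to be a flat array of (type, value) pairs ending in AT_NULL.
The point is only that a present-but-zero entry ends up treated the same as a
missing one:

```c
#include <assert.h>
#include <elf.h>       /* AT_NULL, AT_SYSINFO_EHDR */
#include <stddef.h>

/* Hypothetical helper: returns the AT_SYSINFO_EHDR value, or 0 when the
 * entry is absent. A caller that tests the result against 0 therefore
 * skips the vDSO both when the entry is missing and when it is zero. */
static unsigned long vdso_base_from_auxv(const unsigned long *auxv)
{
    assert(auxv != NULL);
    for (; auxv[0] != AT_NULL; auxv += 2)
        if (auxv[0] == AT_SYSINFO_EHDR)
            return auxv[1];
    return 0;
}
```

A loader would then only go on to decode the vDSO when the returned base is
nonzero.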

Rich
