From: "LeMay, Michael"
Newsgroups: gmane.linux.lib.musl.general
Subject: [RFC PATCH v3] support separate stack segment
Date: Fri, 4 Nov 2016 20:35:40 +0000
Message-ID: <390CE752059EB848A71F4F676EBAB76D3AC298BC@ORSMSX114.amr.corp.intel.com>
Reply-To: musl@lists.openwall.com
To: "musl@lists.openwall.com"

This patch adds support for the separate stack segment feature that is
enabled by the LLVM Clang -mseparate-stack-seg flag.
This feature hardens SafeStack by defining a limit for DS and ES that is
below all of the safe stacks.  Only memory operands that need to refer
to the safe stacks are directed to the SS segment.  Thus, even if a
pointer to a safe stack is used by some other memory operand, the
segmentation feature of the CPU will block that access.

The i386 __clone implementation is written in assembly language, so the
compiler is unable to automatically add a stack segment override prefix
to an instruction in that routine that accesses a safe stack.  This
patch adds that prefix to the source code.

The Linux vDSO code may be incompatible with programs that use a
separate stack segment.  This patch prevents the vDSO from being invoked
when that feature is enabled.

The developer should add something like
"-I/usr/include/x86_64-linux-gnu/asm" to CPPFLAGS during configuration
so that ldt.h can be included in separate_stack_seg.c.

Signed-off-by: Michael LeMay
---
 Makefile                               |  31 ++++++-
 arch/i386/syscall_arch.h               |   9 ++
 configure                              |  10 ++
 src/env/__libc_start_main.c            |   2 +
 src/internal/i386/separate_stack_seg.c | 163 +++++++++++++++++++++++++++++++++
 src/internal/safe_stack.c              |  18 +++-
 src/internal/separate_stack_seg.c      |  13 +++
 src/thread/i386/clone.s                |   3 +-
 8 files changed, 240 insertions(+), 9 deletions(-)
 create mode 100644 src/internal/i386/separate_stack_seg.c
 create mode 100644 src/internal/separate_stack_seg.c

diff --git a/Makefile b/Makefile
index 1eb7dca..a8126ae 100644
--- a/Makefile
+++ b/Makefile
@@ -48,8 +48,8 @@ CFLAGS_C99FSE = -std=c99 -ffreestanding -nostdinc
 CFLAGS_ALL = $(CFLAGS_C99FSE)
 CFLAGS_ALL += -D_XOPEN_SOURCE=700 -I$(srcdir)/arch/$(ARCH) -I$(srcdir)/arch/generic -Iobj/src/internal -I$(srcdir)/src/internal -Iobj/include -I$(srcdir)/include
 CFLAGS_ALL += $(CPPFLAGS) $(CFLAGS_AUTO)
-# This flag is selectively re-added for certain files below.
-CFLAGS_ALL += $(filter-out -fsanitize=safe-stack,$(CFLAGS))
+# These flags are selectively re-added for certain files below.
+CFLAGS_ALL += $(filter-out -fsanitize=safe-stack -mseparate-stack-seg,$(CFLAGS))
 
 LDFLAGS_ALL = $(LDFLAGS_AUTO) $(LDFLAGS)
 
@@ -134,14 +134,13 @@ NOSSP_SRCS = $(wildcard crt/*.c) \
 	ldso/dlstart.c ldso/dynlink.c
 $(NOSSP_SRCS:%.c=obj/%.o) $(NOSSP_SRCS:%.c=obj/%.lo): CFLAGS_ALL += $(CFLAGS_NOSSP)
 
+SEP_STACK_SEG_OBJS = $(filter-out $(CRT_OBJS) obj/ldso/dlstart.o,$(ALL_OBJS))
 # The safestack attribute will be selectively forced within these files below.
 SAFE_STACK_OBJS = $(filter-out \
-	$(CRT_OBJS) \
-	obj/ldso/dlstart.o \
 	$(addprefix obj/src/env/__init_tls.,o lo) \
 	$(addprefix obj/src/env/__libc_start_main.,o lo) \
 	$(addprefix obj/src/internal/safe_stack.,o lo), \
-	$(ALL_OBJS))
+	$(SEP_STACK_SEG_OBJS))
 
 ifeq ($(SAFE_STACK),yes)
 
@@ -181,6 +180,28 @@ $(SAFE_STACK_OBJS) $(SAFE_STACK_OBJS:%.o=%.lo): CFLAGS_ALL += -fsanitize=safe-st
 
 endif
 
+ifeq ($(ARCH)$(SUBARCH),i386ss)
+
+CFLAGS_ALL += -DSEP_STACK_SEG=1
+
+$(SEP_STACK_SEG_OBJS) $(SEP_STACK_SEG_OBJS:%.o=%.lo): CFLAGS_ALL += -mseparate-stack-seg
+
+# The following function is run prior to segment restrictions being activated.
+# It contains code that compiles to an instruction that may access either a
+# stack allocation or a non-stack allocation depending on a condition.
+# The X86FixupSeparateStack pass in LLVM does not support such code, since
+# simply directing the affected memory operand to either DS or SS could result
+# in an invalid memory access if DS and SS have different base addresses.
+# Specifying this attribute causes the X86FixupSeparateStack pass to not
+# process this function.  An alternative approach could have been to direct such
+# memory operands to SS if DS and SS have the same base address, which is the
+# way that musl configures the segments.  However, such compiler support has not
+# been implemented.  Regardless, adding segment override prefixes to functions
+# that only run with a flat memory model is superfluous.
+obj/ldso/dynlink.lo: CFLAGS_ALL += -mllvm -sep-stk-seg-flat-mem-func=__dls3
+
+endif
+
 $(CRT_OBJS): CFLAGS_ALL += -DCRT
 
 $(LOBJS) $(LDSO_OBJS): CFLAGS_ALL += -fPIC
diff --git a/arch/i386/syscall_arch.h b/arch/i386/syscall_arch.h
index 4c9d874..4d7c3c2 100644
--- a/arch/i386/syscall_arch.h
+++ b/arch/i386/syscall_arch.h
@@ -52,8 +52,17 @@ static inline long __syscall6(long n, long a1, long a2, long a3, long a4, long a
 	return __ret;
 }
 
+#if !SEP_STACK_SEG
+/* The vDSO may not be compiled with support for a separate stack segment.
+ * Avoid invoking the vDSO when this feature is enabled, since it may try to
+ * access the stack using memory operands with base registers other than EBP or
+ * ESP without also using a stack segment override prefix.  A special compiler
+ * pass needs to be used to add such prefixes, and it is unlikely that a pass
+ * of that sort was applied when the vDSO was compiled.
+ */
 #define VDSO_USEFUL
 #define VDSO_CGT_SYM "__vdso_clock_gettime"
 #define VDSO_CGT_VER "LINUX_2.6"
+#endif
 
 #define SYSCALL_USE_SOCKETCALL
diff --git a/configure b/configure
index a70009f..7d1d0b4 100755
--- a/configure
+++ b/configure
@@ -599,6 +599,16 @@ t="$CFLAGS_C99FSE $CPPFLAGS $CFLAGS"
 
 fnmatch '-fsanitize=safe-stack*|*\ -fsanitize=safe-stack*' "$CFLAGS" && SAFE_STACK=yes
 
+if test "$ARCH" = "i386" ; then
+if fnmatch '-mseparate-stack-seg*|*\ -mseparate-stack-seg*' "$CFLAGS" ; then
+if test "$SAFE_STACK" = "yes" ; then
+SUBARCH=ss
+else
+fail "$0: the separate stack segment feature requires that SafeStack also be enabled"
+fi
+fi
+fi
+
 if test "$ARCH" = "x86_64" ; then
 trycppif __ILP32__ "$t" && ARCH=x32
 fi
diff --git a/src/env/__libc_start_main.c b/src/env/__libc_start_main.c
index dfb4ebb..424cf88 100644
--- a/src/env/__libc_start_main.c
+++ b/src/env/__libc_start_main.c
@@ -19,6 +19,7 @@ weak_alias(dummy1, __init_ssp);
 
 void __preinit_unsafe_stack(void);
 void __init_unsafe_stack(void);
+void __sep_stack_seg_init(int argc, char ***argvp, char ***envpp);
 
 #define AUX_CNT 38
 
@@ -73,6 +74,7 @@ int __libc_start_main(int (*main)(int,char **,char **), int argc, char **argv)
 
 	__preinit_unsafe_stack();
 	__init_libc(envp, argv[0]);
+	__sep_stack_seg_init(argc, &argv, &envp);
 	__libc_start_init();
 
 	/* Pass control to the application */
diff --git a/src/internal/i386/separate_stack_seg.c b/src/internal/i386/separate_stack_seg.c
new file mode 100644
index 0000000..fb796cc
--- /dev/null
+++ b/src/internal/i386/separate_stack_seg.c
@@ -0,0 +1,163 @@
+#define _GNU_SOURCE
+#include "atomic.h"
+#include "libc.h"
+#include "syscall.h"
+#include
+#include
+#include
+
+#if SEP_STACK_SEG
+
+#include
+
+char *__strdup(const char *s);
+
+extern uintptr_t __stack_base;
+
+static int modify_ldt(int func, void *ptr, unsigned long bytecount) {
+	return syscall(SYS_modify_ldt, func, ptr, bytecount);
+}
+
+static void update_ldt(size_t len) {
+	struct user_desc stack_desc;
+	stack_desc.entry_number = 0;
+	stack_desc.base_addr = 0;
+	stack_desc.contents = 0; /* data */
+	stack_desc.limit = (int)((len - 1) >> 12);
+	stack_desc.limit_in_pages = 1;
+	stack_desc.read_exec_only = 0;
+	stack_desc.seg_not_present = 0;
+	stack_desc.seg_32bit = 1;
+	stack_desc.useable = 1;
+
+	if (modify_ldt(1, &stack_desc, sizeof(stack_desc)) == -1)
+		a_crash();
+}
+
+static void verify_ldt_empty(void) {
+	uint64_t ldt;
+
+	/* read the current LDT */
+	int ldt_len = modify_ldt(0, &ldt, sizeof(ldt));
+	if (ldt_len != 0)
+		/* LDT not empty */
+		a_crash();
+}
+
+#define SEG_SEL_LDT 4
+#define SEG_SEL_CPL3 3
+/* require restricted segment to occupy the first LDT entry */
+#define SEG_SEL_RESTRICTED (SEG_SEL_LDT | SEG_SEL_CPL3)
+
+__attribute__((__visibility__("hidden")))
+void __restrict_segments(void)
+{
+	uintptr_t limit, stack_base = __stack_base;
+	int data_seg_sel;
+
+	__asm__ __volatile__ ("mov %%ds, %0" : "=r"(data_seg_sel));
+	/* assume that ES is identical to DS */
+
+	if ((data_seg_sel & SEG_SEL_LDT) == SEG_SEL_LDT) {
+		if (data_seg_sel != SEG_SEL_RESTRICTED)
+			a_crash();
+
+		/* Read the current limit from the segment register rather than
+		 * relying on __stack_base, since __stack_base is in the
+		 * default data segment and could potentially be subject to
+		 * memory corruption. */
+		__asm__ __volatile__ ("lsl %1, %0" : "=r"(limit) : "m"(data_seg_sel));
+
+		if (limit < stack_base)
+			return;
+	} else
+		verify_ldt_empty();
+
+	update_ldt(stack_base);
+
+	/* Reload the DS and ES segment registers from the new or updated LDT
+	 * entry. */
+	__asm__ __volatile__ (
+		"mov %0, %%ds\n\t"
+		"mov %0, %%es\n\t"
+		::
+		"r"(SEG_SEL_RESTRICTED)
+	);
+}
+
+extern char **__environ;
+
+/* Programs and much of the libc code expect to be able to access the arguments,
+ * environment, and auxv in DS, but they are initially located on the stack.
+ * This function moves them to the heap. This uses __strdup to copy data from
+ * the stack, so it must run before segment limits are restricted.
+ */
+__attribute__((__visibility__("hidden")))
+void __sep_stack_seg_init(int argc, char ***argvp, char ***envpp)
+{
+	char **argv = *argvp;
+	char **envp = *envpp;
+	char **environ_end = envp;
+	size_t *auxv, *auxv_end;
+	char **new_argv = 0;
+
+	while (*environ_end) environ_end++;
+
+	auxv_end = (size_t *)environ_end + 1;
+	while (*auxv_end) auxv_end++;
+	auxv_end++;
+
+	new_argv = malloc((uintptr_t)auxv_end - (uintptr_t)argvp);
+	if (!new_argv)
+		a_crash();
+
+	*new_argv = (char *)argc;
+	new_argv++;
+
+	*argvp = new_argv;
+
+	for (int i = 0; i < argc; i++)
+		new_argv[i] = __strdup(argv[i]);
+	new_argv += argc;
+	*new_argv = NULL;
+	new_argv++;
+
+	*envpp = __environ = new_argv;
+	while (envp != environ_end) {
+		*new_argv = __strdup(*envp);
+		envp++;
+		new_argv++;
+	}
+	*new_argv = NULL;
+	envp++;
+	new_argv++;
+
+	libc.auxv = (size_t *)new_argv;
+	memcpy(new_argv, envp, (uintptr_t)auxv_end - (uintptr_t)envp);
+
+	__restrict_segments();
+}
+
+uintptr_t __safestack_addr_hint(size_t size)
+{
+	/* Try to allocate the new safe stack just below the lowest existing safe
+	 * stack to help avoid a data segment limit that is too low and causes
+	 * faults when accessing non-stack data above the limit. */
+
+	return __stack_base - size;
+}
+
+#else
+
+static void dummy(void) {}
+weak_alias(dummy, __restrict_segments);
+static void dummy1(int argc, char ***argvp, char ***envpp) {}
+weak_alias(dummy1, __sep_stack_seg_init);
+
+__attribute__((__visibility__("hidden")))
+uintptr_t __safestack_addr_hint(size_t size)
+{
+	return 0;
+}
+
+#endif /*SEP_STACK_SEG*/
diff --git a/src/internal/safe_stack.c b/src/internal/safe_stack.c
index 86804aa..54df703 100644
--- a/src/internal/safe_stack.c
+++ b/src/internal/safe_stack.c
@@ -8,9 +8,14 @@
 
 static bool unsafe_stack_ptr_inited = false;
 
+/* base address of safe stack allocated most recently */
+__attribute__((__visibility__("hidden")))
+uintptr_t __stack_base;
+
 void *__mmap(void *, size_t, int, int, int, off_t);
 int __munmap(void *, size_t);
 int __mprotect(void *, size_t, int);
+void __restrict_segments(void);
 
 /* There are no checks for overflows past the end of this stack buffer.  It must
  * be allocated with adequate space to meet the requirements of all of the code
@@ -21,7 +26,6 @@ static unsigned char preinit_us[4096];
 __attribute__((__visibility__("hidden")))
 void __init_unsafe_stack(void)
 {
-	void *stack_base;
 	size_t stack_size;
 	pthread_attr_t attr;
 	struct pthread *self;
@@ -39,7 +43,7 @@ void __init_unsafe_stack(void)
 	if (pthread_getattr_np(self, &attr) != 0)
 		a_crash();
 
-	if (pthread_attr_getstack(&attr, &stack_base, &stack_size) != 0)
+	if (pthread_attr_getstack(&attr, (void **)&__stack_base, &stack_size) != 0)
 		a_crash();
 
 	stack_size *= 2;
@@ -77,6 +81,8 @@ void __preinit_unsafe_stack(void)
 
 #define ROUND(x) (((x)+PAGE_SIZE-1)&-PAGE_SIZE)
 
+uintptr_t __safestack_addr_hint(size_t size);
+
 __attribute__((__visibility__("hidden")))
 int __safestack_init_thread(struct pthread *restrict new, const pthread_attr_t *restrict attr)
 {
@@ -88,7 +94,9 @@ int __safestack_init_thread(struct pthread *restrict new, const pthread_attr_t *
 	guard = ROUND(DEFAULT_GUARD_SIZE + attr->_a_guardsize);
 	size = ROUND(new->stack_size + guard);
 
-	map = __mmap(0, size, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0);
+	uintptr_t try_map = __safestack_addr_hint(size);
+
+	map = __mmap((void *)try_map, size, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0);
 	if (map == MAP_FAILED)
 		goto fail;
 	if (__mprotect(map+guard, size-guard, PROT_READ|PROT_WRITE)
@@ -101,6 +109,10 @@ int __safestack_init_thread(struct pthread *restrict new, const pthread_attr_t *
 	new->safe_stack_size = size;
 	new->stack = map + size;
 
+	__stack_base = (uintptr_t)map;
+
+	__restrict_segments();
+
 	return 0;
 fail:
 	return EAGAIN;
diff --git a/src/internal/separate_stack_seg.c b/src/internal/separate_stack_seg.c
new file mode 100644
index 0000000..fe665e3
--- /dev/null
+++ b/src/internal/separate_stack_seg.c
@@ -0,0 +1,13 @@
+#include "libc.h"
+#include
+
+static void dummy(void) {}
+weak_alias(dummy, __restrict_segments);
+static void dummy1(int argc, char ***argvp, char ***envpp) {}
+weak_alias(dummy1, __sep_stack_seg_init);
+
+__attribute__((__visibility__("hidden")))
+uintptr_t __safestack_addr_hint(size_t size)
+{
+	return 0;
+}
diff --git a/src/thread/i386/clone.s b/src/thread/i386/clone.s
index 52fe7ef..f864231 100644
--- a/src/thread/i386/clone.s
+++ b/src/thread/i386/clone.s
@@ -22,7 +22,8 @@ __clone:
 	and $-16,%ecx
 	sub $16,%ecx
 	mov 20(%ebp),%edi
-	mov %edi,(%ecx)
+	/* the ss prefix is needed to support the separate stack segment feature: */
+	mov %edi,%ss:(%ecx)
 	mov 24(%ebp),%edx
 	mov %esp,%esi
 	mov 32(%ebp),%edi
-- 
2.7.4
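
Note for readers (not part of the patch): the segment arithmetic used by
update_ldt() and __restrict_segments() can be sketched in isolation as
below. This is only an illustration under stated assumptions: it assumes
the usual x86 selector layout (descriptor index in bits 3 and up, table
indicator in bit 2, requested privilege level in bits 0-1) and an
expand-up data segment with a page-granular limit. The helper names
ldt_selector, page_limit and last_valid_addr are hypothetical and do not
appear in musl or in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only; not part of the patch. */

/* Build a selector that refers to the LDT:
 * (index << 3) | TI bit (bit 2, 1 = LDT) | RPL (bits 0-1). */
static unsigned ldt_selector(unsigned index, unsigned rpl)
{
	return (index << 3) | 4u | (rpl & 3u);
}

/* Page-granular limit field for a segment that covers [0, len).
 * Mirrors the (len - 1) >> 12 expression in update_ldt(). */
static uint32_t page_limit(uint32_t len)
{
	assert(len > 0);
	return (len - 1) >> 12;
}

/* Highest byte offset that a page-granular limit field permits in an
 * expand-up segment: the low 12 bits are treated as all ones. */
static uint32_t last_valid_addr(uint32_t limit_field)
{
	return (limit_field << 12) | 0xfffu;
}
```

Under these assumptions, ldt_selector(0, 3) evaluates to 7, which is the
value SEG_SEL_RESTRICTED expands to (SEG_SEL_LDT | SEG_SEL_CPL3). For a
page-aligned stack base such as 0xbf800000 (an example value, not taken
from the patch), page_limit() gives 0xbf7ff and last_valid_addr() gives
0xbf7fffff, so the last byte reachable through DS/ES lies just below the
safe stack.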