* thoughts on reallocarray, explicit_bzero? @ 2014-05-19 15:31 Isaac Dunham 2014-05-19 15:43 ` Rich Felker ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Isaac Dunham @ 2014-05-19 15:31 UTC (permalink / raw) To: musl Having read up on the LibreSSL fork of OpenSSL and also recently backported a nuber of libXfont CVE fixes for integer overflows, I've seen the risk posed by malloc(n*sizeof(x)) and realloc(ptr, n*sizeof(x)). calloc(n, sizeof(x)) can be used in place of malloc(n * sizeof(x)), but there's no standard function that does overflow checking for realloc(). OpenBSD has provided the extension reallocarray(), which provides for bounds checking like calloc() does. Additionally, there are times when a compiler will optimize away calls to bzero() on areas that are not used before free(); this can result in passwords getting left in memory. OpenBSD uses a wrapper function called explicit_bzero() to keep this from happening, thugh it seems to be possible to use some ugliness with volatile to stop it. Should musl provide reallocarray()? And what's the best way to ensure that memory gets zeroed out? Thanks, Isaac Dunham ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 15:31 thoughts on reallocarray, explicit_bzero? Isaac Dunham @ 2014-05-19 15:43 ` Rich Felker 2014-05-19 16:19 ` Daniel Cegiełka 2014-05-19 15:44 ` Daniel Cegiełka 2014-05-19 16:25 ` Szabolcs Nagy 2 siblings, 1 reply; 30+ messages in thread From: Rich Felker @ 2014-05-19 15:43 UTC (permalink / raw) To: musl On Mon, May 19, 2014 at 08:31:31AM -0700, Isaac Dunham wrote: > Having read up on the LibreSSL fork of OpenSSL and also recently > backported a nuber of libXfont CVE fixes for integer overflows, > I've seen the risk posed by malloc(n*sizeof(x)) and realloc(ptr, > n*sizeof(x)). > calloc(n, sizeof(x)) can be used in place of malloc(n * sizeof(x)), > but there's no standard function that does overflow checking for > realloc(). OpenBSD has provided the extension reallocarray(), which > provides for bounds checking like calloc() does. > > Additionally, there are times when a compiler will optimize away calls > to bzero() on areas that are not used before free(); this can result in > passwords getting left in memory. OpenBSD uses a wrapper function called > explicit_bzero() to keep this from happening, thugh it seems to be possible > to use some ugliness with volatile to stop it. > > Should musl provide reallocarray()? I don't think so; there's no precedent for it yet except OpenBSD. LibreSSL should just add a fallback implementation (trivial to write as a wrapper to realloc) for all non-OpenBSD systems that lack it. I'm not fundamentally opposed to reallocarray, but I also don't think it's useful to implement it unless there's a consensus from other implementations on adding it. > And what's the best way to ensure that memory gets zeroed out? read() from /dev/zero? ;-) Seriously, this is a tough one, because no matter what you try, a smart enough compiler will work around it. There's the Annex K memset_s, but Annex K is generally horrible. The OpenBSD approach of "explicit_bzero" is also pretty bad since "bzero" is a deprecated interface. Note that there's not even a way to _implement_ this kind of zero-filling without resorting to compiler extensions or asm. The compiler is free (e.g. with LTO) to look inside the implementation and determine that it does nothing. Solar has one clever "solution" to this problem in the bcrypt code he contributed to musl (see the source for the trick) but it's not an explicit zeroing; rather it's a reuse of the buffer in a way that would be very difficult for the compiler to optimize out. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 15:43 ` Rich Felker @ 2014-05-19 16:19 ` Daniel Cegiełka 2014-05-20 6:19 ` Rich Felker 0 siblings, 1 reply; 30+ messages in thread From: Daniel Cegiełka @ 2014-05-19 16:19 UTC (permalink / raw) To: musl 2014-05-19 17:43 GMT+02:00 Rich Felker <dalias@libc.org>: >> >> Should musl provide reallocarray()? > > I don't think so; there's no precedent for it yet except OpenBSD. > LibreSSL should just add a fallback implementation (trivial to write > as a wrapper to realloc) for all non-OpenBSD systems that lack it. The problem is that these functions are now used everywhere in openbsd's repo., so it's not just LibreSSL but ed, pax, mandoc, m4, (open)ssh, (open)smtpd, etc. Thanks, Daniel ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:19 ` Daniel Cegiełka @ 2014-05-20 6:19 ` Rich Felker 2014-05-20 15:50 ` Daniel Cegiełka 0 siblings, 1 reply; 30+ messages in thread From: Rich Felker @ 2014-05-20 6:19 UTC (permalink / raw) To: musl On Mon, May 19, 2014 at 06:19:11PM +0200, Daniel Cegiełka wrote: > 2014-05-19 17:43 GMT+02:00 Rich Felker <dalias@libc.org>: > > >> > >> Should musl provide reallocarray()? > > > > I don't think so; there's no precedent for it yet except OpenBSD. > > LibreSSL should just add a fallback implementation (trivial to write > > as a wrapper to realloc) for all non-OpenBSD systems that lack it. > > The problem is that these functions are now used everywhere in > openbsd's repo., so it's not just LibreSSL but ed, pax, mandoc, m4, > (open)ssh, (open)smtpd, etc. It looks like glibc is willing to add reallocarray, so that makes at least two systems now. If there's demand maybe we should add it. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-20 6:19 ` Rich Felker @ 2014-05-20 15:50 ` Daniel Cegiełka 0 siblings, 0 replies; 30+ messages in thread From: Daniel Cegiełka @ 2014-05-20 15:50 UTC (permalink / raw) To: musl 2014-05-20 8:19 GMT+02:00 Rich Felker <dalias@libc.org>: > On Mon, May 19, 2014 at 06:19:11PM +0200, Daniel Cegiełka wrote: >> 2014-05-19 17:43 GMT+02:00 Rich Felker <dalias@libc.org>: >> >> >> >> >> Should musl provide reallocarray()? >> > >> > I don't think so; there's no precedent for it yet except OpenBSD. >> > LibreSSL should just add a fallback implementation (trivial to write >> > as a wrapper to realloc) for all non-OpenBSD systems that lack it. >> >> The problem is that these functions are now used everywhere in >> openbsd's repo., so it's not just LibreSSL but ed, pax, mandoc, m4, >> (open)ssh, (open)smtpd, etc. > > It looks like glibc is willing to add reallocarray, so that makes at > least two systems now. If there's demand maybe we should add it. If glibc will support this function, then we can also in musl. These functions will be distributed together with openssh, libressl, mandoc or pax etc., so I think it is better to have it well done in libc (like strlcat and strlcpy). Daniel > > Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 15:31 thoughts on reallocarray, explicit_bzero? Isaac Dunham 2014-05-19 15:43 ` Rich Felker @ 2014-05-19 15:44 ` Daniel Cegiełka 2014-05-19 16:16 ` Rich Felker 2014-05-19 16:25 ` Szabolcs Nagy 2 siblings, 1 reply; 30+ messages in thread From: Daniel Cegiełka @ 2014-05-19 15:44 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 1115 bytes --] 2014-05-19 17:31 GMT+02:00 Isaac Dunham <ibid.ag@gmail.com>: > Having read up on the LibreSSL fork of OpenSSL and also recently > backported a nuber of libXfont CVE fixes for integer overflows, > I've seen the risk posed by malloc(n*sizeof(x)) and realloc(ptr, > n*sizeof(x)). > calloc(n, sizeof(x)) can be used in place of malloc(n * sizeof(x)), > but there's no standard function that does overflow checking for > realloc(). OpenBSD has provided the extension reallocarray(), which > provides for bounds checking like calloc() does. > > Additionally, there are times when a compiler will optimize away calls > to bzero() on areas that are not used before free(); this can result in > passwords getting left in memory. OpenBSD uses a wrapper function called > explicit_bzero() to keep this from happening, thugh it seems to be possible > to use some ugliness with volatile to stop it. > > Should musl provide reallocarray()? In my opinion, yes, we should. btw. no bzero()/bcopy() but memset() and memcpy() etc. Daniel > And what's the best way to ensure that memory gets zeroed out? > > Thanks, > Isaac Dunham [-- Attachment #2: explicit_bzero.diff --] [-- Type: text/plain, Size: 796 bytes --] diff -urN musl.orig/include/string.h musl/include/string.h --- musl.orig/include/string.h Fri May 9 09:49:36 2014 +++ musl/include/string.h Fri May 9 09:57:10 2014 @@ -82,6 +82,7 @@ char *strsep(char **, const char *); size_t strlcat (char *, const char *, size_t); size_t strlcpy (char *, const char *, size_t); +void explicit_bzero(void *b, size_t len); #endif #ifdef _GNU_SOURCE diff -urN musl.orig/src/string/explicit_bzero.c musl/src/string/explicit_bzero.c --- musl.orig/src/string/explicit_bzero.c Thu Jan 1 00:00:00 1970 +++ musl/src/string/explicit_bzero.c Fri May 9 09:57:45 2014 @@ -0,0 +1,8 @@ +#include <string.h> + +static void *(*volatile explicit_memset)(void *, int, size_t) = memset; + +void explicit_bzero(void *b, size_t len) +{ + (*explicit_memset)(b, 0, len); +} [-- Attachment #3: reallocarray.diff --] [-- Type: text/plain, Size: 1131 bytes --] diff -urN musl.orig/include/stdlib.h musl/include/stdlib.h --- musl.orig/include/stdlib.h Thu May 8 09:04:08 2014 +++ musl/include/stdlib.h Thu May 8 09:11:06 2014 @@ -44,6 +44,9 @@ void *realloc (void *, size_t); void free (void *); void *aligned_alloc(size_t alignment, size_t size); +#ifdef _BSD_SOURCE +void *reallocarray(void *, size_t, size_t); +#endif _Noreturn void abort (void); int atexit (void (*) (void)); diff -urN musl.orig/src/stdlib/reallocarray.c musl/src/stdlib/reallocarray.c --- musl.orig/src/stdlib/reallocarray.c Thu Jan 1 00:00:00 1970 +++ musl/src/stdlib/reallocarray.c Thu May 8 09:06:30 2014 @@ -0,0 +1,17 @@ +#include <stdlib.h> +#include <limits.h> +#include <errno.h> + +/* this is sqrt(SIZE_MAX+1), as s1*s2 <= SIZE_MAX + * if both s1 < MUL_NO_OVERFLOW and s2 < MUL_NO_OVERFLOW */ +#define MUL_NO_OVERFLOW (1UL << (sizeof(size_t) * 4)) + +void *reallocarray(void *optr, size_t nmemb, size_t size) +{ + if ((nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) && + nmemb > 0 && SSIZE_MAX / nmemb < size) { + errno = ENOMEM; + return NULL; + } + return realloc(optr, size * nmemb); +} ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 15:44 ` Daniel Cegiełka @ 2014-05-19 16:16 ` Rich Felker 2014-05-19 16:30 ` Daniel Cegiełka ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Rich Felker @ 2014-05-19 16:16 UTC (permalink / raw) To: musl On Mon, May 19, 2014 at 05:44:59PM +0200, Daniel Cegiełka wrote: > 2014-05-19 17:31 GMT+02:00 Isaac Dunham <ibid.ag@gmail.com>: > > Having read up on the LibreSSL fork of OpenSSL and also recently > > backported a nuber of libXfont CVE fixes for integer overflows, > > I've seen the risk posed by malloc(n*sizeof(x)) and realloc(ptr, > > n*sizeof(x)). > > calloc(n, sizeof(x)) can be used in place of malloc(n * sizeof(x)), > > but there's no standard function that does overflow checking for > > realloc(). OpenBSD has provided the extension reallocarray(), which > > provides for bounds checking like calloc() does. > > > > Additionally, there are times when a compiler will optimize away calls > > to bzero() on areas that are not used before free(); this can result in > > passwords getting left in memory. OpenBSD uses a wrapper function called > > explicit_bzero() to keep this from happening, thugh it seems to be possible > > to use some ugliness with volatile to stop it. > > > > Should musl provide reallocarray()? > > In my opinion, yes, we should. In the long term I'm not strongly decided one way or the other, but LibreSSL needs to provide its own fallbacks for these to be portable to any system but OpenBSD. I don't want to be part of their game of imposing new nonstandard interfaces where they could just as easily achieve the same via wrapping standard interfaces. > btw. no bzero()/bcopy() but memset() and memcpy() etc. > > Daniel > > > And what's the best way to ensure that memory gets zeroed out? > > > > Thanks, > > Isaac Dunham > diff -urN musl.orig/include/string.h musl/include/string.h > --- musl.orig/include/string.h Fri May 9 09:49:36 2014 > +++ musl/include/string.h Fri May 9 09:57:10 2014 > @@ -82,6 +82,7 @@ > char *strsep(char **, const char *); > size_t strlcat (char *, const char *, size_t); > size_t strlcpy (char *, const char *, size_t); > +void explicit_bzero(void *b, size_t len); > #endif > > #ifdef _GNU_SOURCE > diff -urN musl.orig/src/string/explicit_bzero.c musl/src/string/explicit_bzero.c > --- musl.orig/src/string/explicit_bzero.c Thu Jan 1 00:00:00 1970 > +++ musl/src/string/explicit_bzero.c Fri May 9 09:57:45 2014 > @@ -0,0 +1,8 @@ > +#include <string.h> > + > +static void *(*volatile explicit_memset)(void *, int, size_t) = memset; > + > +void explicit_bzero(void *b, size_t len) > +{ > + (*explicit_memset)(b, 0, len); > +} This is a nice trick, but IIRC I actually observed GCC optimizing out similar code before (instead of your static volatile, I used a volatile compound literal). At least the concept is right though: you want to prevent the compiler from being able to do any flow analysis at compile time, and making the function pointer volatile achieves this rather well. On the other hand, GCC will put the volatile pointer (if it even emits it) in non-constant memory, meaning it's an additional attack vector for function-pointer-overwrite attacks. And as mentioned in the other email, I don't really like the idea of making a new variant of a deprecated/removed function (bzero). The name "explicit" is rather unclear what it means too. > diff -urN musl.orig/include/stdlib.h musl/include/stdlib.h > --- musl.orig/include/stdlib.h Thu May 8 09:04:08 2014 > +++ musl/include/stdlib.h Thu May 8 09:11:06 2014 > @@ -44,6 +44,9 @@ > void *realloc (void *, size_t); > void free (void *); > void *aligned_alloc(size_t alignment, size_t size); > +#ifdef _BSD_SOURCE > +void *reallocarray(void *, size_t, size_t); > +#endif > > _Noreturn void abort (void); > int atexit (void (*) (void)); > diff -urN musl.orig/src/stdlib/reallocarray.c musl/src/stdlib/reallocarray.c > --- musl.orig/src/stdlib/reallocarray.c Thu Jan 1 00:00:00 1970 > +++ musl/src/stdlib/reallocarray.c Thu May 8 09:06:30 2014 > @@ -0,0 +1,17 @@ > +#include <stdlib.h> > +#include <limits.h> > +#include <errno.h> > + > +/* this is sqrt(SIZE_MAX+1), as s1*s2 <= SIZE_MAX > + * if both s1 < MUL_NO_OVERFLOW and s2 < MUL_NO_OVERFLOW */ > +#define MUL_NO_OVERFLOW (1UL << (sizeof(size_t) * 4)) > + > +void *reallocarray(void *optr, size_t nmemb, size_t size) > +{ > + if ((nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) && > + nmemb > 0 && SSIZE_MAX / nmemb < size) { > + errno = ENOMEM; > + return NULL; > + } > + return realloc(optr, size * nmemb); > +} While it's a bit ugly, if your goal is efficiency, it makes a lot more sense to special-case 32-bit systems and do a 32x32 -> 64 multiply. This makes it so you don't need division code or any extra branches. And for 64-bit systems, either nmemb or size being >32bit would be a pathological corner case (and very slow already anyway), so your check is efficient. Also, is there a reason you're using SSIZE_MAX? SIZE_MAX should work just as well here, but functionally it makes no difference. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:16 ` Rich Felker @ 2014-05-19 16:30 ` Daniel Cegiełka 2014-05-19 16:32 ` Szabolcs Nagy 2015-01-28 22:01 ` Daniel Cegiełka 2 siblings, 0 replies; 30+ messages in thread From: Daniel Cegiełka @ 2014-05-19 16:30 UTC (permalink / raw) To: musl 2014-05-19 18:16 GMT+02:00 Rich Felker <dalias@libc.org>: >> _Noreturn void abort (void); >> int atexit (void (*) (void)); >> diff -urN musl.orig/src/stdlib/reallocarray.c musl/src/stdlib/reallocarray.c >> --- musl.orig/src/stdlib/reallocarray.c Thu Jan 1 00:00:00 1970 >> +++ musl/src/stdlib/reallocarray.c Thu May 8 09:06:30 2014 >> @@ -0,0 +1,17 @@ >> +#include <stdlib.h> >> +#include <limits.h> >> +#include <errno.h> >> + >> +/* this is sqrt(SIZE_MAX+1), as s1*s2 <= SIZE_MAX >> + * if both s1 < MUL_NO_OVERFLOW and s2 < MUL_NO_OVERFLOW */ >> +#define MUL_NO_OVERFLOW (1UL << (sizeof(size_t) * 4)) >> + >> +void *reallocarray(void *optr, size_t nmemb, size_t size) >> +{ >> + if ((nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) && >> + nmemb > 0 && SSIZE_MAX / nmemb < size) { >> + errno = ENOMEM; >> + return NULL; >> + } >> + return realloc(optr, size * nmemb); >> +} > > While it's a bit ugly, if your goal is efficiency, it makes a lot more > sense to special-case 32-bit systems and do a 32x32 -> 64 multiply. > This makes it so you don't need division code or any extra branches. > And for 64-bit systems, either nmemb or size being >32bit would be a > pathological corner case (and very slow already anyway), so your check > is efficient. It was a quick fix only from malloc.c from openbsd :) I use a lot of programs from openbsd and I had problems with the compilation. > > Also, is there a reason you're using SSIZE_MAX? SIZE_MAX should work > just as well here, but functionally it makes no difference. yes, I should correct it: http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/stdlib/reallocarray.c?rev=1.1;content-type=text%2Fplain Thanks, Daniel > Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:16 ` Rich Felker 2014-05-19 16:30 ` Daniel Cegiełka @ 2014-05-19 16:32 ` Szabolcs Nagy 2015-01-28 22:01 ` Daniel Cegiełka 2 siblings, 0 replies; 30+ messages in thread From: Szabolcs Nagy @ 2014-05-19 16:32 UTC (permalink / raw) To: musl * Rich Felker <dalias@libc.org> [2014-05-19 12:16:54 -0400]: > On Mon, May 19, 2014 at 05:44:59PM +0200, Daniel Cegie??ka wrote: > > +#define MUL_NO_OVERFLOW (1UL << (sizeof(size_t) * 4)) > > + > > +void *reallocarray(void *optr, size_t nmemb, size_t size) > > +{ > > + if ((nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) && > > + nmemb > 0 && SSIZE_MAX / nmemb < size) { > > + errno = ENOMEM; > > + return NULL; > > + } > > + return realloc(optr, size * nmemb); > > +} > > While it's a bit ugly, if your goal is efficiency, it makes a lot more > sense to special-case 32-bit systems and do a 32x32 -> 64 multiply. > This makes it so you don't need division code or any extra branches. > And for 64-bit systems, either nmemb or size being >32bit would be a > pathological corner case (and very slow already anyway), so your check > is efficient. > > Also, is there a reason you're using SSIZE_MAX? SIZE_MAX should work > just as well here, but functionally it makes no difference. i think this ugly code is from openbsd they like to use such obfuscated checks for no apparent reason.. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:16 ` Rich Felker 2014-05-19 16:30 ` Daniel Cegiełka 2014-05-19 16:32 ` Szabolcs Nagy @ 2015-01-28 22:01 ` Daniel Cegiełka 2015-01-28 22:34 ` Daniel Cegiełka 2 siblings, 1 reply; 30+ messages in thread From: Daniel Cegiełka @ 2015-01-28 22:01 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 1539 bytes --] 2014-05-19 18:16 GMT+02:00 Rich Felker <dalias@libc.org>: > On Mon, May 19, 2014 at 05:44:59PM +0200, Daniel Cegiełka wrote: >> diff -urN musl.orig/src/string/explicit_bzero.c musl/src/string/explicit_bzero.c >> --- musl.orig/src/string/explicit_bzero.c Thu Jan 1 00:00:00 1970 >> +++ musl/src/string/explicit_bzero.c Fri May 9 09:57:45 2014 >> @@ -0,0 +1,8 @@ >> +#include <string.h> >> + >> +static void *(*volatile explicit_memset)(void *, int, size_t) = memset; >> + >> +void explicit_bzero(void *b, size_t len) >> +{ >> + (*explicit_memset)(b, 0, len); >> +} > > This is a nice trick, but IIRC I actually observed GCC optimizing out > similar code before (instead of your static volatile, I used a > volatile compound literal). At least the concept is right though: you > want to prevent the compiler from being able to do any flow analysis > at compile time, and making the function pointer volatile achieves > this rather well. On the other hand, GCC will put the volatile pointer > (if it even emits it) in non-constant memory, meaning it's an > additional attack vector for function-pointer-overwrite attacks. Linux kernel has similar functions and uses a barrier() here: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/string.c?id=refs/tags/v3.19-rc6#n600 https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/compiler.h?id=refs/tags/v3.19-rc6#n162 Is such a solution is more correct (and still portable)? Daniel > Rich [-- Attachment #2: explicit_bzero.c --] [-- Type: text/x-csrc, Size: 220 bytes --] #include <string.h> static void *(*volatile explicit_memset)(void *, int, size_t) = memset; void explicit_bzero(void *b, size_t len) { (*explicit_memset)(b, 0, len); __asm__ volatile ( "" : : "r" (b) : "memory" ); } [-- Attachment #3: explicit_bzero2.c --] [-- Type: text/x-csrc, Size: 135 bytes --] #include <string.h> void explicit_bzero(void *b, size_t len) { memset(b, 0, len); __asm__ volatile ( "" : : "r" (b) : "memory" ); } ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-28 22:01 ` Daniel Cegiełka @ 2015-01-28 22:34 ` Daniel Cegiełka 2015-01-28 22:38 ` Nathan McSween 2015-01-29 2:19 ` Rich Felker 0 siblings, 2 replies; 30+ messages in thread From: Daniel Cegiełka @ 2015-01-28 22:34 UTC (permalink / raw) To: musl 2015-01-28 23:01 GMT+01:00 Daniel Cegiełka <daniel.cegielka@gmail.com>: > 2014-05-19 18:16 GMT+02:00 Rich Felker <dalias@libc.org>: >> On Mon, May 19, 2014 at 05:44:59PM +0200, Daniel Cegiełka wrote: > >>> diff -urN musl.orig/src/string/explicit_bzero.c musl/src/string/explicit_bzero.c >>> --- musl.orig/src/string/explicit_bzero.c Thu Jan 1 00:00:00 1970 >>> +++ musl/src/string/explicit_bzero.c Fri May 9 09:57:45 2014 >>> @@ -0,0 +1,8 @@ >>> +#include <string.h> >>> + >>> +static void *(*volatile explicit_memset)(void *, int, size_t) = memset; >>> + >>> +void explicit_bzero(void *b, size_t len) >>> +{ >>> + (*explicit_memset)(b, 0, len); >>> +} >> >> This is a nice trick, but IIRC I actually observed GCC optimizing out >> similar code before (instead of your static volatile, I used a >> volatile compound literal). At least the concept is right though: you >> want to prevent the compiler from being able to do any flow analysis >> at compile time, and making the function pointer volatile achieves >> this rather well. On the other hand, GCC will put the volatile pointer >> (if it even emits it) in non-constant memory, meaning it's an >> additional attack vector for function-pointer-overwrite attacks. > > Linux kernel has similar functions and uses a barrier() here: > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/string.c?id=refs/tags/v3.19-rc6#n600 > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/compiler.h?id=refs/tags/v3.19-rc6#n162 > > Is such a solution is more correct (and still portable)? I'm afraid that the only appropriate solution is to use memset_s() from C11 and the expectation that the compiler will accept it. barrier() does not give any guarantee that this function will be secure. Only compiler decides. I'm afraid that OpenBSD goes bad path with explicit_bzero(). The same applies to the linux kernel and memzero_explicit().. very stupid name... Daniel ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-28 22:34 ` Daniel Cegiełka @ 2015-01-28 22:38 ` Nathan McSween 2015-01-28 22:54 ` Daniel Cegiełka 2015-01-29 2:19 ` Rich Felker 1 sibling, 1 reply; 30+ messages in thread From: Nathan McSween @ 2015-01-28 22:38 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 2248 bytes --] How can a compiler optimize out a memory barrier? On Jan 28, 2015 2:34 PM, "Daniel Cegiełka" <daniel.cegielka@gmail.com> wrote: > 2015-01-28 23:01 GMT+01:00 Daniel Cegiełka <daniel.cegielka@gmail.com>: > > 2014-05-19 18:16 GMT+02:00 Rich Felker <dalias@libc.org>: > >> On Mon, May 19, 2014 at 05:44:59PM +0200, Daniel Cegiełka wrote: > > > >>> diff -urN musl.orig/src/string/explicit_bzero.c > musl/src/string/explicit_bzero.c > >>> --- musl.orig/src/string/explicit_bzero.c Thu Jan 1 00:00:00 1970 > >>> +++ musl/src/string/explicit_bzero.c Fri May 9 09:57:45 2014 > >>> @@ -0,0 +1,8 @@ > >>> +#include <string.h> > >>> + > >>> +static void *(*volatile explicit_memset)(void *, int, size_t) = > memset; > >>> + > >>> +void explicit_bzero(void *b, size_t len) > >>> +{ > >>> + (*explicit_memset)(b, 0, len); > >>> +} > >> > >> This is a nice trick, but IIRC I actually observed GCC optimizing out > >> similar code before (instead of your static volatile, I used a > >> volatile compound literal). At least the concept is right though: you > >> want to prevent the compiler from being able to do any flow analysis > >> at compile time, and making the function pointer volatile achieves > >> this rather well. On the other hand, GCC will put the volatile pointer > >> (if it even emits it) in non-constant memory, meaning it's an > >> additional attack vector for function-pointer-overwrite attacks. > > > > Linux kernel has similar functions and uses a barrier() here: > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/string.c?id=refs/tags/v3.19-rc6#n600 > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/compiler.h?id=refs/tags/v3.19-rc6#n162 > > > > Is such a solution is more correct (and still portable)? > > I'm afraid that the only appropriate solution is to use memset_s() > from C11 and the expectation that the compiler will accept it. > barrier() does not give any guarantee that this function will be > secure. Only compiler decides. I'm afraid that OpenBSD goes bad path > with explicit_bzero(). The same applies to the linux kernel and > memzero_explicit().. very stupid name... > > Daniel > [-- Attachment #2: Type: text/html, Size: 3178 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-28 22:38 ` Nathan McSween @ 2015-01-28 22:54 ` Daniel Cegiełka 2015-01-28 23:02 ` Josiah Worcester 0 siblings, 1 reply; 30+ messages in thread From: Daniel Cegiełka @ 2015-01-28 22:54 UTC (permalink / raw) To: musl 2015-01-28 23:38 GMT+01:00 Nathan McSween <nwmcsween@gmail.com>: > How can a compiler optimize out a memory barrier? The problem is not whether to do it or not, but in the fact that the right solution is to use memset_s(). http://en.wikichip.org/wiki/c/c11#Bounds-checking_interfaces_annex_.28Annex_K.29 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-28 22:54 ` Daniel Cegiełka @ 2015-01-28 23:02 ` Josiah Worcester 0 siblings, 0 replies; 30+ messages in thread From: Josiah Worcester @ 2015-01-28 23:02 UTC (permalink / raw) To: musl On Wed, Jan 28, 2015 at 4:54 PM, Daniel Cegiełka <daniel.cegielka@gmail.com> wrote: > 2015-01-28 23:38 GMT+01:00 Nathan McSween <nwmcsween@gmail.com>: >> How can a compiler optimize out a memory barrier? > > The problem is not whether to do it or not, but in the fact that the > right solution is to use memset_s(). > > http://en.wikichip.org/wiki/c/c11#Bounds-checking_interfaces_annex_.28Annex_K.29 memset_s inherits the typical Annex K brain damage which makes it not at all library safe or sane to use, though. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-28 22:34 ` Daniel Cegiełka 2015-01-28 22:38 ` Nathan McSween @ 2015-01-29 2:19 ` Rich Felker 2015-01-29 4:03 ` Brent Cook 1 sibling, 1 reply; 30+ messages in thread From: Rich Felker @ 2015-01-29 2:19 UTC (permalink / raw) To: musl On Wed, Jan 28, 2015 at 11:34:20PM +0100, Daniel Cegiełka wrote: > 2015-01-28 23:01 GMT+01:00 Daniel Cegiełka <daniel.cegielka@gmail.com>: > > 2014-05-19 18:16 GMT+02:00 Rich Felker <dalias@libc.org>: > >> On Mon, May 19, 2014 at 05:44:59PM +0200, Daniel Cegiełka wrote: > > > >>> diff -urN musl.orig/src/string/explicit_bzero.c musl/src/string/explicit_bzero.c > >>> --- musl.orig/src/string/explicit_bzero.c Thu Jan 1 00:00:00 1970 > >>> +++ musl/src/string/explicit_bzero.c Fri May 9 09:57:45 2014 > >>> @@ -0,0 +1,8 @@ > >>> +#include <string.h> > >>> + > >>> +static void *(*volatile explicit_memset)(void *, int, size_t) = memset; > >>> + > >>> +void explicit_bzero(void *b, size_t len) > >>> +{ > >>> + (*explicit_memset)(b, 0, len); > >>> +} > >> > >> This is a nice trick, but IIRC I actually observed GCC optimizing out > >> similar code before (instead of your static volatile, I used a > >> volatile compound literal). At least the concept is right though: you > >> want to prevent the compiler from being able to do any flow analysis > >> at compile time, and making the function pointer volatile achieves > >> this rather well. On the other hand, GCC will put the volatile pointer > >> (if it even emits it) in non-constant memory, meaning it's an > >> additional attack vector for function-pointer-overwrite attacks. > > > > Linux kernel has similar functions and uses a barrier() here: > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/string.c?id=refs/tags/v3.19-rc6#n600 > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/compiler.h?id=refs/tags/v3.19-rc6#n162 > > > > Is such a solution is more correct (and still portable)? > > I'm afraid that the only appropriate solution is to use memset_s() > from C11 and the expectation that the compiler will accept it. > barrier() does not give any guarantee that this function will be > secure. Only compiler decides. I'm afraid that OpenBSD goes bad path > with explicit_bzero(). The same applies to the linux kernel and > memzero_explicit().. very stupid name... I see no way memset_s is technically "better". It's unable to find and clear other temporary copies that have been made, and the barrier method described above already reliably clears the pointed-to copy. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-29 2:19 ` Rich Felker @ 2015-01-29 4:03 ` Brent Cook 2015-01-29 4:15 ` Rich Felker 0 siblings, 1 reply; 30+ messages in thread From: Brent Cook @ 2015-01-29 4:03 UTC (permalink / raw) To: musl On Wed, Jan 28, 2015 at 8:19 PM, Rich Felker <dalias@libc.org> wrote: > On Wed, Jan 28, 2015 at 11:34:20PM +0100, Daniel Cegiełka wrote: >> 2015-01-28 23:01 GMT+01:00 Daniel Cegiełka <daniel.cegielka@gmail.com>: >> > 2014-05-19 18:16 GMT+02:00 Rich Felker <dalias@libc.org>: >> >> On Mon, May 19, 2014 at 05:44:59PM +0200, Daniel Cegiełka wrote: >> > >> >>> diff -urN musl.orig/src/string/explicit_bzero.c musl/src/string/explicit_bzero.c >> >>> --- musl.orig/src/string/explicit_bzero.c Thu Jan 1 00:00:00 1970 >> >>> +++ musl/src/string/explicit_bzero.c Fri May 9 09:57:45 2014 >> >>> @@ -0,0 +1,8 @@ >> >>> +#include <string.h> >> >>> + >> >>> +static void *(*volatile explicit_memset)(void *, int, size_t) = memset; >> >>> + >> >>> +void explicit_bzero(void *b, size_t len) >> >>> +{ >> >>> + (*explicit_memset)(b, 0, len); >> >>> +} >> >> >> >> This is a nice trick, but IIRC I actually observed GCC optimizing out >> >> similar code before (instead of your static volatile, I used a >> >> volatile compound literal). At least the concept is right though: you >> >> want to prevent the compiler from being able to do any flow analysis >> >> at compile time, and making the function pointer volatile achieves >> >> this rather well. On the other hand, GCC will put the volatile pointer >> >> (if it even emits it) in non-constant memory, meaning it's an >> >> additional attack vector for function-pointer-overwrite attacks. >> > >> > Linux kernel has similar functions and uses a barrier() here: >> > >> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/string.c?id=refs/tags/v3.19-rc6#n600 >> > >> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/compiler.h?id=refs/tags/v3.19-rc6#n162 >> > >> > Is such a solution is more correct (and still portable)? >> >> I'm afraid that the only appropriate solution is to use memset_s() >> from C11 and the expectation that the compiler will accept it. >> barrier() does not give any guarantee that this function will be >> secure. Only compiler decides. I'm afraid that OpenBSD goes bad path >> with explicit_bzero(). The same applies to the linux kernel and >> memzero_explicit().. very stupid name... > > I see no way memset_s is technically "better". It's unable to find and > clear other temporary copies that have been made, and the barrier > method described above already reliably clears the pointed-to copy. > > Rich Whatever method you choose, the method of testing is an interesting one, since seeing if the compiler optimized out a memset (because the memory was not read after a write) requires tricking the compiler into believing you aren't reading it. This test is pretty cool, IMO: https://github.com/libressl-portable/openbsd/blob/master/src/regress/lib/libc/explicit_bzero/explicit_bzero.c it is described a bit more here: https://plus.google.com/+MatthewDempsky/posts/KQHFBouxurX Getting around link-time optimizations required building the explicit_bzero function with independent compiler flags to ensure LTO was not enabled. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-29 4:03 ` Brent Cook @ 2015-01-29 4:15 ` Rich Felker 2015-01-29 9:30 ` Daniel Cegiełka 0 siblings, 1 reply; 30+ messages in thread From: Rich Felker @ 2015-01-29 4:15 UTC (permalink / raw) To: musl On Wed, Jan 28, 2015 at 10:03:33PM -0600, Brent Cook wrote: > >> > Linux kernel has similar functions and uses a barrier() here: > >> > > >> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/lib/string.c?id=refs/tags/v3.19-rc6#n600 > >> > > >> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/compiler.h?id=refs/tags/v3.19-rc6#n162 > >> > > >> > Is such a solution is more correct (and still portable)? > >> > >> I'm afraid that the only appropriate solution is to use memset_s() > >> from C11 and the expectation that the compiler will accept it. > >> barrier() does not give any guarantee that this function will be > >> secure. Only compiler decides. I'm afraid that OpenBSD goes bad path > >> with explicit_bzero(). The same applies to the linux kernel and > >> memzero_explicit().. very stupid name... > > > > I see no way memset_s is technically "better". It's unable to find and > > clear other temporary copies that have been made, and the barrier > > method described above already reliably clears the pointed-to copy. > > Whatever method you choose, the method of testing is an interesting > one, since seeing if the compiler optimized out a memset (because the > memory was not read after a write) requires tricking the compiler into > believing you aren't reading it. This test is pretty cool, IMO: > > https://github.com/libressl-portable/openbsd/blob/master/src/regress/lib/libc/explicit_bzero/explicit_bzero.c > > it is described a bit more here: > https://plus.google.com/+MatthewDempsky/posts/KQHFBouxurX The comment that pthread_attr_setstack could be used instead is interesting and would make the test a lot simpler, I think. > Getting around link-time optimizations required building the > explicit_bzero function with independent compiler flags to ensure LTO > was not enabled. As long as there's a barrier, LTO is no problem. The asm is a black box that's required to see the results of memset, since the address of the object reaches the asm, and the only way to ensure that such a black box sees the writes is for them to actually be performed. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-29 4:15 ` Rich Felker @ 2015-01-29 9:30 ` Daniel Cegiełka 2015-01-29 10:04 ` Szabolcs Nagy 0 siblings, 1 reply; 30+ messages in thread From: Daniel Cegiełka @ 2015-01-29 9:30 UTC (permalink / raw) To: musl The concept of safe memory cleaning was mostly promoted by cryptographic libraries - eg. secure_memzero(). Unfortunately, we have currently too many interfaces for the same functionality: memset_s(), secure_memzero(), explicit_bzero(), memzero_explicit(). This is why I believe that OpenBSD (and linux developers) goes bad path, introducing yet another secure_memzero(). A better solution would be to promote a single standard (eg. memset_s()) and the expectation that the compiler will respect it. summing up: we have several options: * volatile based, but fails with LTO http://openwall.com/lists/musl/2014/05/19/5 * weak symbols based (from Matthew Dempsky): https://plus.google.com/+MatthewDempsky/posts/KQHFBouxurX http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/explicit_bzero.c?rev=1.3&content-type=text/x-cvsweb-markup * barrier based, but with asm inline: http://openwall.com/lists/musl/2015/01/28/34 Is the musl will support this feature to improve compatibility with BSD? Daniel ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-29 9:30 ` Daniel Cegiełka @ 2015-01-29 10:04 ` Szabolcs Nagy 2015-01-29 10:31 ` Daniel Cegiełka 2015-01-29 10:54 ` Daniel Cegiełka 0 siblings, 2 replies; 30+ messages in thread From: Szabolcs Nagy @ 2015-01-29 10:04 UTC (permalink / raw) To: musl * Daniel Cegie??ka <daniel.cegielka@gmail.com> [2015-01-29 10:30:40 +0100]: > yet another secure_memzero(). A better solution would be to promote a > single standard (eg. memset_s()) and the expectation that the compiler > will respect it. > i think you don't know the semantics of memset_s (it uses nonsense types, has superflous arguments, handles constraint violations through global state etc) it is a complicated mess and not a good api to standardize on if all you want is to avoid information leak in crypto code (btw no memset based solution can provide complete protection against info leak: if the crypto function is interrupted by a signal then all the register state will be copied to the stack or altstack and kept around for arbitrarily long time which is plenty information leaked) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-29 10:04 ` Szabolcs Nagy @ 2015-01-29 10:31 ` Daniel Cegiełka 2015-01-29 10:54 ` Daniel Cegiełka 1 sibling, 0 replies; 30+ messages in thread From: Daniel Cegiełka @ 2015-01-29 10:31 UTC (permalink / raw) To: musl 2015-01-29 11:04 GMT+01:00 Szabolcs Nagy <nsz@port70.net>: > * Daniel Cegie??ka <daniel.cegielka@gmail.com> [2015-01-29 10:30:40 +0100]: >> yet another secure_memzero(). A better solution would be to promote a >> single standard (eg. memset_s()) and the expectation that the compiler >> will respect it. >> > > i think you don't know the semantics of memset_s > (it uses nonsense types, has superflous arguments, handles > constraint violations through global state etc) > > it is a complicated mess and not a good api to standardize on > if all you want is to avoid information leak in crypto code I gave this as an example - the intention is to have a single standard (vs secure_memzero(), explicit_bzero(), memzero_explicit(), ...). http://openwall.com/lists/musl/2015/01/14/5 btw. libsodium prefers memset_s() over explicit_bzero() and over weak symbols. https://github.com/jedisct1/libsodium/blob/master/src/libsodium/sodium/utils.c#L56 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2015-01-29 10:04 ` Szabolcs Nagy 2015-01-29 10:31 ` Daniel Cegiełka @ 2015-01-29 10:54 ` Daniel Cegiełka 1 sibling, 0 replies; 30+ messages in thread From: Daniel Cegiełka @ 2015-01-29 10:54 UTC (permalink / raw) To: musl 2015-01-29 11:04 GMT+01:00 Szabolcs Nagy <nsz@port70.net>: > * Daniel Cegie??ka <daniel.cegielka@gmail.com> [2015-01-29 10:30:40 +0100]: >> yet another secure_memzero(). A better solution would be to promote a >> single standard (eg. memset_s()) and the expectation that the compiler >> will respect it. >> > > i think you don't know the semantics of memset_s > (it uses nonsense types, has superflous arguments, handles > constraint violations through global state etc) btw. memset_s() is an attempt to solve the same problem. However, this version will not work with LTO: ftp://ftp.netbsd.org/pub/NetBSD/misc/apb/memset_s.20120224.diff #include <sys/cdefs.h> __RCSID("$NetBSD$"); #define __STDC_WANT_LIB_EXT1__ 1 #include <errno.h> #include <stdint.h> #include <string.h> /* * __memset_vp is a volatile pointer to a function. * It is initialised to point to memset, and should never be changed. */ static void * (* const volatile __memset_vp)(void *, int, size_t) = (memset); #undef memset_s /* in case it was defined as a macro */ errno_t memset_s(void *s, rsize_t smax, int c, rsize_t n) { errno_t err = 0; if (s == NULL) { err = EINVAL; goto out; } if (smax > RSIZE_MAX) { err = E2BIG; goto out; } if (n > RSIZE_MAX) { err = E2BIG; n = smax; } if (n > smax) { err = EOVERFLOW; n = smax; } /* Calling through a volatile pointer should never be optimised away. */ (*__memset_vp)(s, c, n); out: if (err == 0) return 0; else { errno = err; /* XXX call runtime-constraint handler */ return err; } } ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 15:31 thoughts on reallocarray, explicit_bzero? Isaac Dunham 2014-05-19 15:43 ` Rich Felker 2014-05-19 15:44 ` Daniel Cegiełka @ 2014-05-19 16:25 ` Szabolcs Nagy 2014-05-19 16:45 ` Daniel Cegiełka ` (3 more replies) 2 siblings, 4 replies; 30+ messages in thread From: Szabolcs Nagy @ 2014-05-19 16:25 UTC (permalink / raw) To: musl * Isaac Dunham <ibid.ag@gmail.com> [2014-05-19 08:31:31 -0700]: > Having read up on the LibreSSL fork of OpenSSL and also recently > backported a nuber of libXfont CVE fixes for integer overflows, > I've seen the risk posed by malloc(n*sizeof(x)) and realloc(ptr, > n*sizeof(x)). > calloc(n, sizeof(x)) can be used in place of malloc(n * sizeof(x)), > but there's no standard function that does overflow checking for > realloc(). OpenBSD has provided the extension reallocarray(), which > provides for bounds checking like calloc() does. i'd use a saturated multiplication, because malloc/realloc are not the only places where overflowing size calculations may cause problems and in such cases (size_t)-1 is just as good as a failure and it can be added to your code without portability issues static size_t sizemul(size_t a, size_t b) { return b>1 && a>1 && a>-1/b ? -1 : a*b; } > Additionally, there are times when a compiler will optimize away calls > to bzero() on areas that are not used before free(); this can result in > passwords getting left in memory. OpenBSD uses a wrapper function called > explicit_bzero() to keep this from happening, thugh it seems to be possible > to use some ugliness with volatile to stop it. i don't see how the openbsd explicit_bzero stops the compiler to do optimizations.. (i guess they rely on that their gcc does not do lto or that libc is dynamic linked and the compiler has no 'explicit_bzero' builtin, neither of which is a great solution..) the usual approach to this is volatile function pointer: static void *(*volatile force_memset)(void,int,size_t) = memset; in general in c one cannot be sure that the secret bits are not leaked somewhere since the languge spec cannot give such guarantees that said either the volatile funcptr or actually reusing the memory such that it cannot be optimized away works in practice > Should musl provide reallocarray()? > And what's the best way to ensure that memory gets zeroed out? > > Thanks, > Isaac Dunham ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:25 ` Szabolcs Nagy @ 2014-05-19 16:45 ` Daniel Cegiełka 2014-05-19 16:58 ` Rich Felker 2014-05-19 16:55 ` Rich Felker ` (2 subsequent siblings) 3 siblings, 1 reply; 30+ messages in thread From: Daniel Cegiełka @ 2014-05-19 16:45 UTC (permalink / raw) To: musl 2014-05-19 18:25 GMT+02:00 Szabolcs Nagy <nsz@port70.net>: > i don't see how the openbsd explicit_bzero stops the > compiler to do optimizations.. > > (i guess they rely on that their gcc does not do lto > or that libc is dynamic linked and the compiler has no > 'explicit_bzero' builtin, neither of which is a great > solution..) > > the usual approach to this is volatile function pointer: > > static void *(*volatile force_memset)(void,int,size_t) = memset; > > in general in c one cannot be sure that the secret bits > are not leaked somewhere since the languge spec cannot > give such guarantees > > that said either the volatile funcptr or actually reusing > the memory such that it cannot be optimized away works in > practice first version: void explicit_bzero(void * const b, const size_t l) { volatile unsigned char *p = (volatile unsigned char *) b; size_t i = (size_t) 0U; while (i < l) { p[i++] = 0U; } } Of course, if someone has better ideas... I'm very curious :) Daniel ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:45 ` Daniel Cegiełka @ 2014-05-19 16:58 ` Rich Felker 0 siblings, 0 replies; 30+ messages in thread From: Rich Felker @ 2014-05-19 16:58 UTC (permalink / raw) To: musl On Mon, May 19, 2014 at 06:45:08PM +0200, Daniel Cegiełka wrote: > 2014-05-19 18:25 GMT+02:00 Szabolcs Nagy <nsz@port70.net>: > > > i don't see how the openbsd explicit_bzero stops the > > compiler to do optimizations.. > > > > (i guess they rely on that their gcc does not do lto > > or that libc is dynamic linked and the compiler has no > > 'explicit_bzero' builtin, neither of which is a great > > solution..) > > > > the usual approach to this is volatile function pointer: > > > > static void *(*volatile force_memset)(void,int,size_t) = memset; > > > > in general in c one cannot be sure that the secret bits > > are not leaked somewhere since the languge spec cannot > > give such guarantees > > > > that said either the volatile funcptr or actually reusing > > the memory such that it cannot be optimized away works in > > practice > > first version: > > void explicit_bzero(void * const b, const size_t l) > { > volatile unsigned char *p = (volatile unsigned char *) b; > size_t i = (size_t) 0U; > > while (i < l) { > p[i++] = 0U; > } > } > > Of course, if someone has better ideas... I'm very curious :) I'm pretty sure this does not work. The volatile pointer cast (which BTW is not necessary; it happens implicitly) does not, as far as I can tell, mean "access the object via an overlapped volatile object". Rather, it just means that the compiler cannot _automatically_ assume the pointed-to object is non-volatile. It's still free to determine via other means (e.g. inter-procedural analysis/LTO/etc.) that the pointed-to object is non-volatile (and of course, in cases where this matters, that its lifetime is ending) and thereby optimize out the whole thing as dead code. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:25 ` Szabolcs Nagy 2014-05-19 16:45 ` Daniel Cegiełka @ 2014-05-19 16:55 ` Rich Felker 2014-05-19 18:12 ` Szabolcs Nagy 2014-05-19 22:08 ` Andy Lutomirski 2014-06-11 9:59 ` Thorsten Glaser 3 siblings, 1 reply; 30+ messages in thread From: Rich Felker @ 2014-05-19 16:55 UTC (permalink / raw) To: musl On Mon, May 19, 2014 at 06:25:57PM +0200, Szabolcs Nagy wrote: > i'd use a saturated multiplication, because malloc/realloc > are not the only places where overflowing size calculations > may cause problems and in such cases (size_t)-1 is just as > good as a failure and it can be added to your code without > portability issues > > static size_t sizemul(size_t a, size_t b) > { > return b>1 && a>1 && a>-1/b ? -1 : a*b; > } On 32-bit this can easily be optimized to just one conditional instead of three: uint64_t tmp = (uint64_t)a * b; return tmp>SIZE_MAX ? SIZE_MAX : tmp; Of course that requires an ifdef, which is perhaps ugly. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:55 ` Rich Felker @ 2014-05-19 18:12 ` Szabolcs Nagy 0 siblings, 0 replies; 30+ messages in thread From: Szabolcs Nagy @ 2014-05-19 18:12 UTC (permalink / raw) To: musl * Rich Felker <dalias@libc.org> [2014-05-19 12:55:23 -0400]: > On Mon, May 19, 2014 at 06:25:57PM +0200, Szabolcs Nagy wrote: > > static size_t sizemul(size_t a, size_t b) > > { > > return b>1 && a>1 && a>-1/b ? -1 : a*b; > > } > > On 32-bit this can easily be optimized to just one conditional instead > of three: > > uint64_t tmp = (uint64_t)a * b; > return tmp>SIZE_MAX ? SIZE_MAX : tmp; > > Of course that requires an ifdef, which is perhaps ugly. or without ifdef static size_t sizemul(size_t a, size_t b) { if (2*sizeof a <= sizeof 1ULL) { unsigned long long m = 1ULL*a*b; return m>>8*sizeof a ? -1 : m; } return (a|b)>>4*sizeof a && b && a>-1/b ? -1 : a*b; } ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:25 ` Szabolcs Nagy 2014-05-19 16:45 ` Daniel Cegiełka 2014-05-19 16:55 ` Rich Felker @ 2014-05-19 22:08 ` Andy Lutomirski 2014-05-20 0:41 ` Szabolcs Nagy 2014-06-11 9:59 ` Thorsten Glaser 3 siblings, 1 reply; 30+ messages in thread From: Andy Lutomirski @ 2014-05-19 22:08 UTC (permalink / raw) To: musl On 05/19/2014 09:25 AM, Szabolcs Nagy wrote: > * Isaac Dunham <ibid.ag@gmail.com> [2014-05-19 08:31:31 -0700]: >> Having read up on the LibreSSL fork of OpenSSL and also recently >> backported a nuber of libXfont CVE fixes for integer overflows, >> I've seen the risk posed by malloc(n*sizeof(x)) and realloc(ptr, >> n*sizeof(x)). >> calloc(n, sizeof(x)) can be used in place of malloc(n * sizeof(x)), >> but there's no standard function that does overflow checking for >> realloc(). OpenBSD has provided the extension reallocarray(), which >> provides for bounds checking like calloc() does. > > i'd use a saturated multiplication, because malloc/realloc > are not the only places where overflowing size calculations > may cause problems and in such cases (size_t)-1 is just as > good as a failure and it can be added to your code without > portability issues > > static size_t sizemul(size_t a, size_t b) > { > return b>1 && a>1 && a>-1/b ? -1 : a*b; > } Before going nuts trying to optimize this, it may pay to write some good-enough helper and to use native compiler support for this, which is already available in Clang [1] and should be coming reasonably soon in gcc [2]. I suspect that, on all reasonably platforms, if doublesize_t is the unsigned type that's twice as wide as size_t, then this isn't too bad either: doublesize_t total = (doublesize_t)a * (doublesize_t)b; if (total > SIZE_MAX) fail; For quite a while, gcc has had a 128-bit integer type that works on 64-bit platforms, and gcc should always support a 64-bit type on 32-bit platforms. On systems with widening multiply (e.g. x86), even if the optimizer doesn't detect the idiom, this is only a few cycles slower than the optimal code. [1] http://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61129 --Andy ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 22:08 ` Andy Lutomirski @ 2014-05-20 0:41 ` Szabolcs Nagy 0 siblings, 0 replies; 30+ messages in thread From: Szabolcs Nagy @ 2014-05-20 0:41 UTC (permalink / raw) To: musl * Andy Lutomirski <luto@amacapital.net> [2014-05-19 15:08:11 -0700]: > On 05/19/2014 09:25 AM, Szabolcs Nagy wrote: > > i'd use a saturated multiplication, because malloc/realloc > > are not the only places where overflowing size calculations > > may cause problems and in such cases (size_t)-1 is just as > > good as a failure and it can be added to your code without > > portability issues > > > > static size_t sizemul(size_t a, size_t b) > > { > > return b>1 && a>1 && a>-1/b ? -1 : a*b; > > } > > Before going nuts trying to optimize this, it may pay to write some > good-enough helper and to use native compiler support for this, which is > already available in Clang [1] and should be coming reasonably soon in > gcc [2]. it's a shame that clang came up with this nonsese they managed to add 18 new compiler specific builtins, without actually addressing the practical issue: easy to use overflow check of size_t multiplication.. (or checking arithmetics of various other non-builtin types) (the several new multiprecision arithmetics builtins are bad too but less problematic in practice) they didn't make it easy to write backward compatible code either: historically ifdef hackery was used to "detect" builtins support using the __GNUC__ version macros, but clang has incompatible versioning and builtins now making the use of new builtins more painful (meanwhile a lot of code has idiomatic overflow checks in iso c which is not recognized by gcc or clang.. and many c parsing tools don't understand the fancy new builtins) > I suspect that, on all reasonably platforms, if doublesize_t is the > unsigned type that's twice as wide as size_t, then this isn't too bad > either: > > doublesize_t total = (doublesize_t)a * (doublesize_t)b; > if (total > SIZE_MAX) > fail; > > For quite a while, gcc has had a 128-bit integer type that works on > 64-bit platforms, and gcc should always support a 64-bit type on 32-bit > platforms. On systems with widening multiply (e.g. x86), even if the > optimizer doesn't detect the idiom, this is only a few cycles slower > than the optimal code. umm __int128 is only supported in gcc since 4.6 i think (and even after that there were some related brokenness in hacked toolchains so >=gcc-4.6 is not enough to check) otherwise yes with doublesize_t it is easy to do but the point was to do it in c for doublesize_t you would need configure time checks.. or nasty ifdef hackery.. and in the end you still need the fallback for implementations without such a type (the code i showed can be included in any source file where size_t is defined) > [1] > http://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins > [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61129 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: thoughts on reallocarray, explicit_bzero? 2014-05-19 16:25 ` Szabolcs Nagy ` (2 preceding siblings ...) 2014-05-19 22:08 ` Andy Lutomirski @ 2014-06-11 9:59 ` Thorsten Glaser 2014-06-11 12:59 ` Rich Felker 3 siblings, 1 reply; 30+ messages in thread From: Thorsten Glaser @ 2014-06-11 9:59 UTC (permalink / raw) To: musl Szabolcs Nagy <nsz <at> port70.net> writes: > static size_t sizemul(size_t a, size_t b) > { > return b>1 && a>1 && a>-1/b ? -1 : a*b; > } There is no -1 in size_t. (And *you* complain about OpenBSD checks…) > i don't see how the openbsd explicit_bzero stops the > compiler to do optimizations.. On OpenBSD: by being in libc which is not built with LTO. I’ve wondered about how to do this either. Maybe: void explicit_bzero(void *s, size_t n) { bzero(s, n); __lto_boundary } Then you #define __lto_boundary to something like __asm__ volatile ("" : : : "memory"); or __sync_synchronize(); or some C11 barrier function. bye, //mirabilos ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Re: thoughts on reallocarray, explicit_bzero? 2014-06-11 9:59 ` Thorsten Glaser @ 2014-06-11 12:59 ` Rich Felker 0 siblings, 0 replies; 30+ messages in thread From: Rich Felker @ 2014-06-11 12:59 UTC (permalink / raw) To: musl On Wed, Jun 11, 2014 at 09:59:56AM +0000, Thorsten Glaser wrote: > Szabolcs Nagy <nsz <at> port70.net> writes: > > > static size_t sizemul(size_t a, size_t b) > > { > > return b>1 && a>1 && a>-1/b ? -1 : a*b; > > } > > There is no -1 in size_t. (And *you* complain about OpenBSD checks…) The standard way (especially for generic programming, but it's usable anywhere) to get the max value for an unsigned type is to convert -1. b and a*b both have type size_t, so... > > i don't see how the openbsd explicit_bzero stops the > > compiler to do optimizations.. > > On OpenBSD: by being in libc which is not built with LTO. > I’ve wondered about how to do this either. Maybe: Yeah, that's a poor hack. We still probably have some places where "extern" is used as a compiler barrier, but it's wrong, and I'm working to identify and remove them all. > void > explicit_bzero(void *s, size_t n) > { > bzero(s, n); > __lto_boundary > } > > Then you #define __lto_boundary to something like > __asm__ volatile ("" : : : "memory"); > or > __sync_synchronize(); > or some C11 barrier function. These are not sufficient. It would probably need to be: __asm__ volatile ("" : : "r"(s) : "memory"); or similar. This is because volatility and memory-clobber only provide a barrier with respect to objects which exist in memory, and at the point of the asm (after transformations which do not disturb the observable behavior of the program), the pointed-to object with automatic storage does not exist. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2015-01-29 10:54 UTC | newest] Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-05-19 15:31 thoughts on reallocarray, explicit_bzero? Isaac Dunham 2014-05-19 15:43 ` Rich Felker 2014-05-19 16:19 ` Daniel Cegiełka 2014-05-20 6:19 ` Rich Felker 2014-05-20 15:50 ` Daniel Cegiełka 2014-05-19 15:44 ` Daniel Cegiełka 2014-05-19 16:16 ` Rich Felker 2014-05-19 16:30 ` Daniel Cegiełka 2014-05-19 16:32 ` Szabolcs Nagy 2015-01-28 22:01 ` Daniel Cegiełka 2015-01-28 22:34 ` Daniel Cegiełka 2015-01-28 22:38 ` Nathan McSween 2015-01-28 22:54 ` Daniel Cegiełka 2015-01-28 23:02 ` Josiah Worcester 2015-01-29 2:19 ` Rich Felker 2015-01-29 4:03 ` Brent Cook 2015-01-29 4:15 ` Rich Felker 2015-01-29 9:30 ` Daniel Cegiełka 2015-01-29 10:04 ` Szabolcs Nagy 2015-01-29 10:31 ` Daniel Cegiełka 2015-01-29 10:54 ` Daniel Cegiełka 2014-05-19 16:25 ` Szabolcs Nagy 2014-05-19 16:45 ` Daniel Cegiełka 2014-05-19 16:58 ` Rich Felker 2014-05-19 16:55 ` Rich Felker 2014-05-19 18:12 ` Szabolcs Nagy 2014-05-19 22:08 ` Andy Lutomirski 2014-05-20 0:41 ` Szabolcs Nagy 2014-06-11 9:59 ` Thorsten Glaser 2014-06-11 12:59 ` Rich Felker
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/musl/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).