mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] [C23 string conversion 0/3]
@ 2023-05-26  9:25 Jens Gustedt
  2023-05-26  9:25 ` [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function Jens Gustedt
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jens Gustedt @ 2023-05-26  9:25 UTC (permalink / raw)
  To: musl

A collection of mem and string conversions

Jens Gustedt (3):
  C23: add the new memset_explicit function
  C23: implement the c8rtomb and mbrtoc8 functions
  C23: add the new include guards for string.h and wchar.h

 include/string.h             |  5 ++--
 include/uchar.h              |  6 +++++
 include/wchar.h              |  4 +--
 src/multibyte/c8rtomb.c      | 31 ++++++++++++++++++++++
 src/multibyte/mbrtoc8.c      | 51 ++++++++++++++++++++++++++++++++++++
 src/string/memset_explicit.c | 14 ++++++++++
 6 files changed, 107 insertions(+), 4 deletions(-)
 create mode 100644 src/multibyte/c8rtomb.c
 create mode 100644 src/multibyte/mbrtoc8.c
 create mode 100644 src/string/memset_explicit.c

-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-26  9:25 [musl] [C23 string conversion 0/3] Jens Gustedt
@ 2023-05-26  9:25 ` Jens Gustedt
  2023-05-26  9:52   ` Joakim Sindholt
  2023-05-28 10:13   ` NRK
  2023-05-26  9:25 ` [musl] [C23 string conversion 2/3] C23: implement the c8rtomb and mbrtoc8 functions Jens Gustedt
  2023-05-26  9:25 ` [musl] [C23 string conversion 3/3] C23: add the new include guards for string.h and wchar.h Jens Gustedt
  2 siblings, 2 replies; 13+ messages in thread
From: Jens Gustedt @ 2023-05-26  9:25 UTC (permalink / raw)
  To: musl

This function is meant to work around the fact that C compilers are
allowed to optimize calls to memset out, if they are able to detect
that the byte array will die soon, anyway. This permission for memset
may lead to data leak when non-priveledged parts of an application
would be able to reconstruct secret information from memory received
through malloc or on the stack.

This function here is to force compilers to do the clean up operation
under all circumstances. How to do that is out of the scope of the C
standard, so there is not much help there, it only describes the
intent.

By having a slow bytewise copy, we intent also to have predictable
timing, such that we can avoid side-channel attacks. We also do our
best to remove the meta-information, which is the pointer value from
the stack and combine that with a synchronizing operation at the end.
---
 include/string.h             |  1 +
 src/string/memset_explicit.c | 14 ++++++++++++++
 2 files changed, 15 insertions(+)
 create mode 100644 src/string/memset_explicit.c

diff --git a/include/string.h b/include/string.h
index 05019c03..78ccccbd 100644
--- a/include/string.h
+++ b/include/string.h
@@ -27,6 +27,7 @@ extern "C" {
 void *memcpy (void *__restrict, const void *__restrict, size_t);
 void *memmove (void *, const void *, size_t);
 void *memset (void *, int, size_t);
+void *memset_explicit(void *, int, size_t);
 int memcmp (const void *, const void *, size_t);
 
 void *(memchr) (const void *, int, size_t);
diff --git a/src/string/memset_explicit.c b/src/string/memset_explicit.c
new file mode 100644
index 00000000..49ced751
--- /dev/null
+++ b/src/string/memset_explicit.c
@@ -0,0 +1,14 @@
+#include <string.h>
+#include <stdlib.h>
+#include <atomic.h>
+
+void *memset_explicit(void *dest, register int c, register size_t n)
+{
+  register unsigned char volatile *p    = dest;
+  register unsigned char volatile *stop = p + n;
+  for (; p < stop; ++p)
+    *p = c;
+  // the CAS operation serves as memory barrier, and destroys the
+  // information, if it happened to be spilled on the stack
+  return a_cas_p(&dest, dest, 0);
+}
-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [musl] [C23 string conversion 2/3] C23: implement the c8rtomb and mbrtoc8 functions
  2023-05-26  9:25 [musl] [C23 string conversion 0/3] Jens Gustedt
  2023-05-26  9:25 ` [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function Jens Gustedt
@ 2023-05-26  9:25 ` Jens Gustedt
  2023-05-26  9:25 ` [musl] [C23 string conversion 3/3] C23: add the new include guards for string.h and wchar.h Jens Gustedt
  2 siblings, 0 replies; 13+ messages in thread
From: Jens Gustedt @ 2023-05-26  9:25 UTC (permalink / raw)
  To: musl

The implementations each separate two cases: a fast path that
corresponds to calls with state variables that are in the initial
state and a leading input character that is ASCII; the other cases are
handled by a function for the slow path. These functions are
implemented such that the fast versions should not need stack
variables of their own, and thus can tail call into the slow path with
a jmp instruction if necessary. The fast versions could typically be
also used as shallow inline wrapper, if we like to go that way.

Only the implementation of mbrtoc8 is a bit tricky. This is because of
the weird conventions for dripping the output bytes one call after the
other. This is handled by using the state variable as a stack for the
up to three characters that are still to be sent to the output.
---
 include/uchar.h         |  6 +++++
 src/multibyte/c8rtomb.c | 31 +++++++++++++++++++++++++
 src/multibyte/mbrtoc8.c | 51 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 88 insertions(+)
 create mode 100644 src/multibyte/c8rtomb.c
 create mode 100644 src/multibyte/mbrtoc8.c

diff --git a/include/uchar.h b/include/uchar.h
index 7e5c4d40..9f6a3706 100644
--- a/include/uchar.h
+++ b/include/uchar.h
@@ -9,6 +9,9 @@ extern "C" {
 typedef unsigned short char16_t;
 typedef unsigned char32_t;
 #endif
+#if  __cplusplus < 201811L
+typedef unsigned char char8_t;
+#endif
 
 #define __NEED_mbstate_t
 #define __NEED_size_t
@@ -16,6 +19,9 @@ typedef unsigned char32_t;
 #include <features.h>
 #include <bits/alltypes.h>
 
+size_t c8rtomb(char *__restrict, char8_t, mbstate_t *__restrict);
+size_t mbrtoc8(char8_t *__restrict, const char *__restrict, size_t, mbstate_t *__restrict);
+
 size_t c16rtomb(char *__restrict, char16_t, mbstate_t *__restrict);
 size_t mbrtoc16(char16_t *__restrict, const char *__restrict, size_t, mbstate_t *__restrict);
 
diff --git a/src/multibyte/c8rtomb.c b/src/multibyte/c8rtomb.c
new file mode 100644
index 00000000..8645e112
--- /dev/null
+++ b/src/multibyte/c8rtomb.c
@@ -0,0 +1,31 @@
+#include <uchar.h>
+#include <wchar.h>
+
+__attribute__((__noinline__))
+static size_t c8rtomb_slow(char *__restrict s, unsigned char c8, mbstate_t *__restrict st)
+{
+	// We need an internal state different from mbrtowc.
+	static mbstate_t internal_state;
+	if (!st) st = &internal_state;
+	wchar_t wc;
+
+	// mbrtowc has return values -2, -1, 0, 1.
+	size_t res = mbrtowc(&wc, (char const*)&c8, 1, st);
+	switch (res) {
+	// our implementation of wcrtomb ignores the state
+	case  1: res = wcrtomb(s, wc, 0); break;
+	case  0: res = 1; if (s) *s = 0;  break;
+	}
+	return res;
+}
+
+static size_t __c8rtomb(char *__restrict s, unsigned char c8, mbstate_t *__restrict st)
+{
+        if (st && !*(unsigned*)st && (c8 < 0x80)) {
+		if (s) *s = c8;
+		return 1;
+	}
+	return c8rtomb_slow(s, c8, st);
+}
+
+weak_alias(__c8rtomb, c8rtomb);
diff --git a/src/multibyte/mbrtoc8.c b/src/multibyte/mbrtoc8.c
new file mode 100644
index 00000000..0c73fa10
--- /dev/null
+++ b/src/multibyte/mbrtoc8.c
@@ -0,0 +1,51 @@
+#include <stdlib.h>
+#include <wchar.h>
+#include <uchar.h>
+#include <errno.h>
+#include "internal.h"
+
+__attribute__((__noinline__))
+static size_t mbrtoc8_slow(unsigned char *__restrict pc8, const char *__restrict src, size_t n, mbstate_t *__restrict st)
+{
+	static unsigned internal_state;
+	wchar_t wc;
+	unsigned* pending = st ? (void*)st : &internal_state;
+
+	// The high bit is set if there is still missing input. If it
+	// is not set and the value is not zero, a previous call has
+	// stacked the missing output bytes withing the state.
+	if ((-*pending) > INT_MIN) {
+		if (pc8) *pc8 = *pending;
+		(*pending) >>= 8;
+		return -3;
+	}
+
+	// mbrtowc has return values -2, -1, 0, 1, ..., 4.
+	size_t res = mbrtowc(&wc, src, n, (void*)pending);
+	if (res <= 4) {
+		_Alignas(unsigned) unsigned char s[8] = { 0 };
+		// Write the result bytes to s over the word boundary
+		// as we need it. Our wcrtomb implementation ignores
+		// the state variable, there will be no errors and the
+		// return value can be ignored.
+		wcrtomb((void*)(s+3), wc, 0);
+		// Depending on the endianess this maybe a single load
+		// instruction. We want to be sure that the bytes are
+		// in this order, and that the high-order byte is 0.
+		*pending = (0u|s[4])<<000 | (0u|s[5])<<010 | (0u|s[6])<<020 | (0u|s[7])<<030;
+		if (pc8) *pc8 = s[3];
+	}
+	return res;
+}
+
+static size_t __mbrtoc8(unsigned char *__restrict pc8, const char *__restrict src, size_t n, mbstate_t *__restrict st)
+{
+	unsigned char *s = (void*)src;
+	if (st && !*(unsigned*)st && (s[0] < 0x80u)) {
+		if (pc8) *pc8 = s[0];
+		return s[0]>0;
+	}
+	return mbrtoc8_slow(pc8, src, n, st);
+}
+
+weak_alias(__mbrtoc8, mbrtoc8);
-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [musl] [C23 string conversion 3/3] C23: add the new include guards for string.h and wchar.h
  2023-05-26  9:25 [musl] [C23 string conversion 0/3] Jens Gustedt
  2023-05-26  9:25 ` [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function Jens Gustedt
  2023-05-26  9:25 ` [musl] [C23 string conversion 2/3] C23: implement the c8rtomb and mbrtoc8 functions Jens Gustedt
@ 2023-05-26  9:25 ` Jens Gustedt
  2 siblings, 0 replies; 13+ messages in thread
From: Jens Gustedt @ 2023-05-26  9:25 UTC (permalink / raw)
  To: musl

---
 include/string.h | 4 ++--
 include/wchar.h  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/string.h b/include/string.h
index 78ccccbd..3e1de8d5 100644
--- a/include/string.h
+++ b/include/string.h
@@ -1,5 +1,5 @@
-#ifndef	_STRING_H
-#define	_STRING_H
+#ifndef	__STDC_VERSION_STRING_H__
+#define	__STDC_VERSION_STRING_H__ 202311L
 
 #ifdef __cplusplus
 extern "C" {
diff --git a/include/wchar.h b/include/wchar.h
index 3816a7cd..7febf76e 100644
--- a/include/wchar.h
+++ b/include/wchar.h
@@ -1,5 +1,5 @@
-#ifndef _WCHAR_H
-#define _WCHAR_H
+#ifndef __STDC_VERSION_WCHAR_H__
+#define __STDC_VERSION_WCHAR_H__ 202311L
 
 #ifdef __cplusplus
 extern "C" {
-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-26  9:25 ` [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function Jens Gustedt
@ 2023-05-26  9:52   ` Joakim Sindholt
  2023-05-26 10:18     ` Jₑₙₛ Gustedt
  2023-05-28 10:13   ` NRK
  1 sibling, 1 reply; 13+ messages in thread
From: Joakim Sindholt @ 2023-05-26  9:52 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 2781 bytes --]

On Fri, 26 May 2023 11:25:43 +0200, Jens Gustedt <Jens.Gustedt@inria.fr> wrote:
> This function is meant to work around the fact that C compilers are
> allowed to optimize calls to memset out, if they are able to detect
> that the byte array will die soon, anyway. This permission for memset
> may lead to data leak when non-priveledged parts of an application
> would be able to reconstruct secret information from memory received
> through malloc or on the stack.
> 
> This function here is to force compilers to do the clean up operation
> under all circumstances. How to do that is out of the scope of the C
> standard, so there is not much help there, it only describes the
> intent.
> 
> By having a slow bytewise copy, we intent also to have predictable
> timing, such that we can avoid side-channel attacks. We also do our
> best to remove the meta-information, which is the pointer value from
> the stack and combine that with a synchronizing operation at the end.

I don't see how this is in any way useful. It's certainly not part of
the standard, which only says:

> The intention is that the memory store is always performed (i.e.,
> never elided), regardless of optimizations. This is in contrast to
> calls to the memset function (7.26.6.1)

> ---
>  include/string.h             |  1 +
>  src/string/memset_explicit.c | 14 ++++++++++++++
>  2 files changed, 15 insertions(+)
>  create mode 100644 src/string/memset_explicit.c
> 
> diff --git a/include/string.h b/include/string.h
> index 05019c03..78ccccbd 100644
> --- a/include/string.h
> +++ b/include/string.h
> @@ -27,6 +27,7 @@ extern "C" {
>  void *memcpy (void *__restrict, const void *__restrict, size_t);
>  void *memmove (void *, const void *, size_t);
>  void *memset (void *, int, size_t);
> +void *memset_explicit(void *, int, size_t);
>  int memcmp (const void *, const void *, size_t);
>  
>  void *(memchr) (const void *, int, size_t);
> diff --git a/src/string/memset_explicit.c b/src/string/memset_explicit.c
> new file mode 100644
> index 00000000..49ced751
> --- /dev/null
> +++ b/src/string/memset_explicit.c
> @@ -0,0 +1,14 @@
> +#include <string.h>
> +#include <stdlib.h>
> +#include <atomic.h>
> +
> +void *memset_explicit(void *dest, register int c, register size_t n)
> +{
> +  register unsigned char volatile *p    = dest;
> +  register unsigned char volatile *stop = p + n;
> +  for (; p < stop; ++p)
> +    *p = c;
> +  // the CAS operation serves as memory barrier, and destroys the
> +  // information, if it happened to be spilled on the stack
> +  return a_cas_p(&dest, dest, 0);
> +}
> -- 
> 2.34.1
> 

Musl effectively already has this function in that it has
explicit_bzero. Why not simply copy it? Hell, while we're at it,
implement explicit_bzero in terms of memset_explicit.

[-- Attachment #2: 0001-implement-C23-memset_explicit.patch --]
[-- Type: application/octet-stream, Size: 1571 bytes --]

From 60a2ad2c4f0b1af8a0c2e693d23e83ce87a6c489 Mon Sep 17 00:00:00 2001
From: Joakim Sindholt <opensource@zhasha.com>
Date: Fri, 26 May 2023 11:50:34 +0200
Subject: [PATCH] implement C23 memset_explicit

---
 include/string.h             | 1 +
 src/string/explicit_bzero.c  | 3 +--
 src/string/memset_explicit.c | 8 ++++++++
 3 files changed, 10 insertions(+), 2 deletions(-)
 create mode 100644 src/string/memset_explicit.c

diff --git a/include/string.h b/include/string.h
index db73d2a9..de4232b5 100644
--- a/include/string.h
+++ b/include/string.h
@@ -27,6 +27,7 @@ extern "C" {
 void *memcpy (void *__restrict, const void *__restrict, size_t);
 void *memmove (void *, const void *, size_t);
 void *memset (void *, int, size_t);
+void *memset_explicit (void *, int, size_t);
 int memcmp (const void *, const void *, size_t);
 void *memchr (const void *, int, size_t);
 
diff --git a/src/string/explicit_bzero.c b/src/string/explicit_bzero.c
index f2e12f23..738a096b 100644
--- a/src/string/explicit_bzero.c
+++ b/src/string/explicit_bzero.c
@@ -3,6 +3,5 @@
 
 void explicit_bzero(void *d, size_t n)
 {
-	d = memset(d, 0, n);
-	__asm__ __volatile__ ("" : : "r"(d) : "memory");
+	memset_explicit(d, 0, n);
 }
diff --git a/src/string/memset_explicit.c b/src/string/memset_explicit.c
new file mode 100644
index 00000000..ac54f0cf
--- /dev/null
+++ b/src/string/memset_explicit.c
@@ -0,0 +1,8 @@
+#include <string.h>
+
+void *memset_explicit(void *d, int c, size_t n)
+{
+	d = memset(d, c, n);
+	__asm__ __volatile__ ("" : : "r"(d) : "memory");
+	return d;
+}
-- 
2.26.3


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-26  9:52   ` Joakim Sindholt
@ 2023-05-26 10:18     ` Jₑₙₛ Gustedt
  2023-05-26 20:16       ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-26 10:18 UTC (permalink / raw)
  To: Joakim Sindholt; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 1532 bytes --]

Joakim,

on Fri, 26 May 2023 11:52:36 +0200 you (Joakim Sindholt
<opensource@zhasha.com>) wrote:

> I don't see how this is in any way useful. It's certainly not part of
> the standard, which only says:
> 
> > The intention is that the memory store is always performed (i.e.,
> > never elided), regardless of optimizations. This is in contrast to
> > calls to the memset function (7.26.6.1)  

There has been a long discussion in WG14 about this what is even
possible to say here. The clear intent in all discussions was to have
something that best inhibits all sorts of information leak.

What you are citing is just a footnote. The normative text says:

     The purpose of this function is to make sensitive information
     stored in the object inaccessible

So this is ist the expressed intent.

This is clearly a QoI issue. I think that indeed a sequential
reordering barrier is the minimal quality that implementations can
provide. But since this is not time critical, we might be able to
provide a bit more, with modest cost, such as synchronization with
other threads, and such as deleting the information where even the
object was located in the first place.


Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-26 10:18     ` Jₑₙₛ Gustedt
@ 2023-05-26 20:16       ` Rich Felker
  2023-05-26 20:35         ` Jₑₙₛ Gustedt
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2023-05-26 20:16 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: Joakim Sindholt, musl

On Fri, May 26, 2023 at 12:18:29PM +0200, Jₑₙₛ Gustedt wrote:
> Joakim,
> 
> on Fri, 26 May 2023 11:52:36 +0200 you (Joakim Sindholt
> <opensource@zhasha.com>) wrote:
> 
> > I don't see how this is in any way useful. It's certainly not part of
> > the standard, which only says:
> > 
> > > The intention is that the memory store is always performed (i.e.,
> > > never elided), regardless of optimizations. This is in contrast to
> > > calls to the memset function (7.26.6.1)  
> 
> There has been a long discussion in WG14 about this what is even
> possible to say here. The clear intent in all discussions was to have
> something that best inhibits all sorts of information leak.
> 
> What you are citing is just a footnote. The normative text says:
> 
>      The purpose of this function is to make sensitive information
>      stored in the object inaccessible
> 
> So this is ist the expressed intent.
> 
> This is clearly a QoI issue. I think that indeed a sequential
> reordering barrier is the minimal quality that implementations can
> provide. But since this is not time critical, we might be able to
> provide a bit more, with modest cost, such as synchronization with
> other threads, and such as deleting the information where even the
> object was located in the first place.

Zeroing the local var dest does not do that. It makes things worse, by
forcing dest to get spilled to memory so it can be acted on as an
object by a_cas_p. It will do nothing to clear whatever register or
stack slot the value in dest might have previously lived in.

We had all of these discussions back when explicit_bzero was added,
and it was done the way it was done because it's portable (within the
framework of what musl already requires) and non-arch-specific, has
zero overhead, avoids any code duplication or bad performance
open-coding another memset variant, and avoids taking part in any
security theater (pretending we can clear things we can't).

Rich

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-26 20:16       ` Rich Felker
@ 2023-05-26 20:35         ` Jₑₙₛ Gustedt
  2023-05-26 20:57           ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-26 20:35 UTC (permalink / raw)
  To: Rich Felker; +Cc: Joakim Sindholt, musl

[-- Attachment #1: Type: text/plain, Size: 1314 bytes --]

Rich,

on Fri, 26 May 2023 16:16:03 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> We had all of these discussions back when explicit_bzero was added,
> and it was done the way it was done because it's portable (within the
> framework of what musl already requires) and non-arch-specific, has
> zero overhead, avoids any code duplication or bad performance
> open-coding another memset variant,

My impression is that such information is quite difficult to find, but
maybe I didn't search enough. Sometimes code comments would help ;-)

Bad performance really isn't a valid argument in this case. This is
not supposed to be efficient. Any user that uses this has to know that
they are trading it for something.

> and avoids taking part in any security theater (pretending we can
> clear things we can't).

It is not about taking part. For me it is just about offering to our
users the best service for this tricky question that we may, and not
less.


Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-26 20:35         ` Jₑₙₛ Gustedt
@ 2023-05-26 20:57           ` Rich Felker
  2023-05-27  6:49             ` Jₑₙₛ Gustedt
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2023-05-26 20:57 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: Joakim Sindholt, musl

On Fri, May 26, 2023 at 10:35:52PM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Fri, 26 May 2023 16:16:03 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
> 
> > We had all of these discussions back when explicit_bzero was added,
> > and it was done the way it was done because it's portable (within the
> > framework of what musl already requires) and non-arch-specific, has
> > zero overhead, avoids any code duplication or bad performance
> > open-coding another memset variant,
> 
> My impression is that such information is quite difficult to find, but
> maybe I didn't search enough. Sometimes code comments would help ;-)

It was discussed in the "thoughts on reallocarray, explicit_bzero?"
thread in 2014, starting here:

https://www.openwall.com/lists/musl/2014/05/19/3

> Bad performance really isn't a valid argument in this case. This is
> not supposed to be efficient. Any user that uses this has to know that
> they are trading it for something.

I'm not sure whether performance here is a good criterion or not, but
making it ~gratuitously~ slow with an extra implementation of memcpy
open-coded here does not make sense.

> > and avoids taking part in any security theater (pretending we can
> > clear things we can't).
> 
> It is not about taking part. For me it is just about offering to our
> users the best service for this tricky question that we may, and not
> less.

Do you have something in mind about how the explcit_bzero
implementation we have doesn't do that?

Rich

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-26 20:57           ` Rich Felker
@ 2023-05-27  6:49             ` Jₑₙₛ Gustedt
  2023-05-27 13:52               ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-27  6:49 UTC (permalink / raw)
  To: Rich Felker; +Cc: Joakim Sindholt, musl

[-- Attachment #1: Type: text/plain, Size: 891 bytes --]

Rich,

on Fri, 26 May 2023 16:57:21 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> Do you have something in mind about how the explcit_bzero
> implementation we have doesn't do that?

I would be more comfortable with a stronger synchronization barrier
that works for all memory models and also in the presence of threads
and signals. So maybe using `a_barrier()`?

Also I think that relying on the compiler's `memset` is not a good
strategy. This puts us at their merci of whatever efficiency games
they are playing, now or in the future.

Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-27  6:49             ` Jₑₙₛ Gustedt
@ 2023-05-27 13:52               ` Rich Felker
  0 siblings, 0 replies; 13+ messages in thread
From: Rich Felker @ 2023-05-27 13:52 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: Joakim Sindholt, musl

On Sat, May 27, 2023 at 08:49:47AM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Fri, 26 May 2023 16:57:21 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
> 
> > Do you have something in mind about how the explcit_bzero
> > implementation we have doesn't do that?
> 
> I would be more comfortable with a stronger synchronization barrier
> that works for all memory models and also in the presence of threads
> and signals. So maybe using `a_barrier()`?

There is no distinction here with repect to signals. With respect to
threads, I guess it's a distinction of whether data races in other
threads could still see the value that was supposed to have been
cleared. This is impossible already is there is any synchronization
ordering the operation of clearing with the access from another
thread. If there is not, then the operations are unordered with
respect to one another and the value *could already have been seen*
just by execution in a different order, before the explicit_memset. So
I don't see how this is supposed to be helpful.

> Also I think that relying on the compiler's `memset` is not a good
> strategy. This puts us at their merci of whatever efficiency games
> they are playing, now or in the future.

It's not the compiler's memset. It's the external function. Because
this is all built as -ffreestanding, the compiler is not able to
expand calls to standard functions as the builtins. -ffreestanding
implies -fno-builtin.

However, I claim it would actually be better/safer if it were the
compiler's memset.

Consider the case where explicit_bzero is inlined in the caller (the
only one in which anything significantly differs), and the caller is
trying to clear a private key in a local variable whose address has
not otherwise been taken (or for which all accesses through the
address were already collapsed via ipa/inlining).

With the compiler's memset, it can operate directly on the caller's
local variable in-place in a register, just zeroing the register,
*then* spill the zero to actual memory for the __asm__ barrier.

With an external memset, it has to first spill the key to memory for
memset to clear it. The copy in the register remains in place until
the register is reused for something else.

This is one aspect in which your version is preferable, but it
would/will be fixed by making src/include/string.h re-enable use of
builtins conditional on a configure test for them.

Rich

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-26  9:25 ` [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function Jens Gustedt
  2023-05-26  9:52   ` Joakim Sindholt
@ 2023-05-28 10:13   ` NRK
  2023-05-29  7:48     ` Jₑₙₛ Gustedt
  1 sibling, 1 reply; 13+ messages in thread
From: NRK @ 2023-05-28 10:13 UTC (permalink / raw)
  To: musl

Hi Jens,

On Fri, May 26, 2023 at 11:25:43AM +0200, Jens Gustedt wrote:
> By having a slow bytewise copy, we intent also to have predictable
> timing, such that we can avoid side-channel attacks.

I don't believe `volatile` provides any guarantee of emitting
constant-time operations (which can be CPU dependent). But even if it
happens to work out in practice, from a user/non-cryptographer's
perspective, I feel like claims like "avoiding side-channel attacks"
needs much more substantiation than just slapping a `volatile` on top of
a pointer.

But as I've said, not a cryptographer, so please *do* correct me if I'm
wrong or am being unnecessarily paranoid.

P.S: even if the claim is correct, other major implementation would also
have to agree to provide such guarantee in a documented manner for this
to be useful to the users. Otherwise, users will have to resort to
hard-coded libc checks or simply not rely on this property at all.

- NRK

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function
  2023-05-28 10:13   ` NRK
@ 2023-05-29  7:48     ` Jₑₙₛ Gustedt
  0 siblings, 0 replies; 13+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-29  7:48 UTC (permalink / raw)
  To: NRK; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 3182 bytes --]

Hi,

on Sun, 28 May 2023 16:13:18 +0600 you (NRK <nrk@disroot.org>) wrote:

> On Fri, May 26, 2023 at 11:25:43AM +0200, Jens Gustedt wrote:
> > By having a slow bytewise copy, we intent also to have predictable
> > timing, such that we can avoid side-channel attacks.  
> 
> I don't believe `volatile` provides any guarantee of emitting
> constant-time operations (which can be CPU dependent). But even if it
> happens to work out in practice, from a user/non-cryptographer's
> perspective, I feel like claims like "avoiding side-channel attacks"
> needs much more substantiation than just slapping a `volatile` on top
> of a pointer.
> 
> But as I've said, not a cryptographer,

I am not a cryptographer either, I wrote that code in best effort
mode: doing anything that I can that doesn't harm the goals of this
function. Very likely, this code will sit there unattended for the
next 20 years. I personnally do not know what the future brings, what
compilers will be capable of, and what the expectation of users will
be. In my opinion putting `volatile` in here is worth it, even if it
only might inhibit one single exploit in the next 20 years.

The same holds for stronger synchronization, I don't know about the
capabilities (or bugs) of future systems where an external agent from
a different thread, process or outside tool might guess addresses and
peek into storage, caches, whatever, before the information is
erased. Closing the window for this as much as possible seems a valid
strategy to me.

`memset` and `memset_explicit` have quite orthogonal goals. The
first is to put a buffer in a known state *before* you use it, it is
used often and performance is crucial.

The second is to put a buffer back into a known state *after* you use
it, to remove all the traces that your application migh have left
there, and avoid any unintended outflow of information. It will be
used rarely, probably on small bits of data (pathwords and such) and
performance is irrelevant.

So I think that the arguments should go just in an opposite direction
for this particular function than we usually have it.

> so please *do* correct me if I'm wrong or am being unnecessarily
> paranoid.

In the contrary, you may not be sufficiently paranoid. Paranoia is key
for this function.

> P.S: even if the claim is correct, other major implementation would
> also have to agree to provide such guarantee in a documented manner
> for this to be useful to the users. Otherwise, users will have to
> resort to hard-coded libc checks or simply not rely on this property
> at all.

I don't buy that argument either. This function is not there for
specific guaratees that a platform gives. It is a best effort attempt
by *us* to avoid information leakage. This is our responsability here,
nothing else.


Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-05-29  7:48 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-26  9:25 [musl] [C23 string conversion 0/3] Jens Gustedt
2023-05-26  9:25 ` [musl] [C23 string conversion 1/3] C23: add the new memset_explicit function Jens Gustedt
2023-05-26  9:52   ` Joakim Sindholt
2023-05-26 10:18     ` Jₑₙₛ Gustedt
2023-05-26 20:16       ` Rich Felker
2023-05-26 20:35         ` Jₑₙₛ Gustedt
2023-05-26 20:57           ` Rich Felker
2023-05-27  6:49             ` Jₑₙₛ Gustedt
2023-05-27 13:52               ` Rich Felker
2023-05-28 10:13   ` NRK
2023-05-29  7:48     ` Jₑₙₛ Gustedt
2023-05-26  9:25 ` [musl] [C23 string conversion 2/3] C23: implement the c8rtomb and mbrtoc8 functions Jens Gustedt
2023-05-26  9:25 ` [musl] [C23 string conversion 3/3] C23: add the new include guards for string.h and wchar.h Jens Gustedt

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).