mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] c23 memset_explicit()
@ 2024-03-19 11:18 Aaron Peter Bachmann
  2024-03-19 13:50 ` Jₑₙₛ Gustedt
  0 siblings, 1 reply; 8+ messages in thread
From: Aaron Peter Bachmann @ 2024-03-19 11:18 UTC (permalink / raw)
  To: musl

I recognized neither
https://git.musl-libc.org/cgit/musl
nor
https://forge.icube.unistra.fr/icps/musl/-/branches
seem to include c23 memset_explicit().
Or it slipped my attention.

So I provide a patch. It compiles but is otherwise untested.
It is trivial enough that you would spot an error when merging.
No guards for c23 as mem is a reserved prefix.
It closely follows explicit_bzero.c. So I assume it fits into the coding 
style musl uses.

Regards, Aaron Peter Bachmann


diff --git a/include/string.h b/include/string.h
index 83e2b946..563b3b0a 100644
--- a/include/string.h
+++ b/include/string.h
@@ -27,6 +27,7 @@ extern "C" {
  void *memcpy (void *__restrict, const void *__restrict, size_t);
  void *memmove (void *, const void *, size_t);
  void *memset (void *, int, size_t);
+void *memset_explicit(void *, int, size_t);
  int memcmp (const void *, const void *, size_t);
  void *memchr (const void *, int, size_t);

diff --git a/src/string/memset_explicit.c b/src/string/memset_explicit.c
new file mode 100644
index 00000000..ac54f0cf
--- /dev/null
+++ b/src/string/memset_explicit.c
@@ -0,0 +1,8 @@
+#include <string.h>
+
+void *memset_explicit(void *d, int c, size_t n)
+{
+       d = memset(d, c, n);
+       __asm__ __volatile__ ("" : : "r"(d) : "memory");
+       return d;
+}


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] c23 memset_explicit()
  2024-03-19 11:18 [musl] c23 memset_explicit() Aaron Peter Bachmann
@ 2024-03-19 13:50 ` Jₑₙₛ Gustedt
  2024-03-19 14:02   ` Rich Felker
  0 siblings, 1 reply; 8+ messages in thread
From: Jₑₙₛ Gustedt @ 2024-03-19 13:50 UTC (permalink / raw)
  To: Aaron Peter Bachmann; +Cc: musl

Aaron,

on Tue, 19 Mar 2024 12:18:20 +0100 you (Aaron Peter Bachmann
<aaron_ng@inode.at>) wrote:

> I recognized neither
> https://git.musl-libc.org/cgit/musl
> nor
> https://forge.icube.unistra.fr/icps/musl/-/branches
> seem to include c23 memset_explicit().
> Or it slipped my attention.

There had been such an implementation, but I removed it from the set
because there was no consensus how it should look like. I'd prefer
that someone else does it. If you want to read things up, there has
been a discussion on this list in May last year.

Your patch looks like the minimal thing that one would expect. For me
personally that does not seem good enough. One of the things that
bother me is that `memset` could have varying processing times, not
only depending on the length of the input (which is unavoidable), but
also depending on its contents.

Anyhow, Rich had elaborated a whole strategy how this feature would
better fallback to a builtin, if such a builtin exists. So I prefer
them doing it, whenever they are ready.

Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] c23 memset_explicit()
  2024-03-19 13:50 ` Jₑₙₛ Gustedt
@ 2024-03-19 14:02   ` Rich Felker
  2024-03-19 14:24     ` Jₑₙₛ Gustedt
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Rich Felker @ 2024-03-19 14:02 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: Aaron Peter Bachmann, musl

On Tue, Mar 19, 2024 at 02:50:26PM +0100, Jₑₙₛ Gustedt wrote:
> Aaron,
> 
> on Tue, 19 Mar 2024 12:18:20 +0100 you (Aaron Peter Bachmann
> <aaron_ng@inode.at>) wrote:
> 
> > I recognized neither
> > https://git.musl-libc.org/cgit/musl
> > nor
> > https://forge.icube.unistra.fr/icps/musl/-/branches
> > seem to include c23 memset_explicit().
> > Or it slipped my attention.
> 
> There had been such an implementation, but I removed it from the set
> because there was no consensus how it should look like. I'd prefer
> that someone else does it. If you want to read things up, there has
> been a discussion on this list in May last year.
> 
> Your patch looks like the minimal thing that one would expect. For me
> personally that does not seem good enough. One of the things that
> bother me is that `memset` could have varying processing times, not
> only depending on the length of the input (which is unavoidable), but
> also depending on its contents.
> 
> Anyhow, Rich had elaborated a whole strategy how this feature would
> better fallback to a builtin, if such a builtin exists. So I prefer
> them doing it, whenever they are ready.

I think this implementation looks exactly like what I recall
requesting. I'm not sure what the builtin thing was. It might have
just been that I'd like (in general) to make it so musl is able to use
the builtins internally, but that only makes any distinction here if
LTO is in use (i.e. if memset_explicit is inlined into the caller).

I don't see where constant-time was part of the intended purpose of
memset_explicit (rather it seems to be intended just as a best-effort
way to avoid leaving around sensitive data, with all the possible
pitfalls that entails) and in general we don't make any promises of
constant-time in musl, but I don't see any reason the memset
implementations we use would have time dependency on original data
being overwritten, just things like whether it's cached. Maybe with a
large (many pages) buffer, something like zram could expose
information about the old contents through time or memory utilization,
but that's not really something we can defend against (and probably
not a good idea for robustness or data privacy).

Rich

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] c23 memset_explicit()
  2024-03-19 14:02   ` Rich Felker
@ 2024-03-19 14:24     ` Jₑₙₛ Gustedt
  2024-03-19 15:51     ` Aaron Peter Bachmann
  2024-03-19 16:37     ` [musl] c23 memset_explicit() NRK
  2 siblings, 0 replies; 8+ messages in thread
From: Jₑₙₛ Gustedt @ 2024-03-19 14:24 UTC (permalink / raw)
  To: Rich Felker; +Cc: Aaron Peter Bachmann, musl

Rich,

on Tue, 19 Mar 2024 10:02:13 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> I think this implementation looks exactly like what I recall
> requesting.

I still have different expectations for that function, that's why I
removed my version, and why I would not sign it off as it is given
here. But if everybody else is satisfied, then this would be a perfect
target for a first inclusion. We have to start somewhere, and it would
be good if we did, better now than never ;-)

Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] c23 memset_explicit()
  2024-03-19 14:02   ` Rich Felker
  2024-03-19 14:24     ` Jₑₙₛ Gustedt
@ 2024-03-19 15:51     ` Aaron Peter Bachmann
  2024-03-19 16:19       ` Rich Felker
  2024-03-19 16:37     ` [musl] c23 memset_explicit() NRK
  2 siblings, 1 reply; 8+ messages in thread
From: Aaron Peter Bachmann @ 2024-03-19 15:51 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl, Jens Gustedt

I have read the discussion  from 2023-05-26 to and including 2023-05-29.

https://www.openwall.com/lists/musl/2023/05/26/8 says
"implement explicit_bzero in terms of memset_explicit"
As a non-native speaker I am not entirely sure what that is supposed to 
mean.
I have two destination.
1. Implementing  explicit_bzero () by calling  memset_explicit()
That would save one line of source at the cost of an additional branch,
as it is a tail-call. I do not think that is a good tradeoff. But it is 
unlikely to to make a difference in practice.
2. Implement it like the implementation of explicit_bzero()
That is indeed what I have done. I lean towards this 2nd interpretation.

In
https://www.openwall.com/lists/musl/2023/05/26/8
Joakim Sindholt has proposed THE ABSOLUTE SAME PATCH already!
Even with d for destination, as it is done in explicit_bzero().
c23 uses s as destination, memset() in musl dest.

But Jens has a point.
In the long run I also hope for compilers recognizing memset_explicit() 
and also erase other copies it has made (in registers or on stack) 
without being explicitly requested to do so in the source.
And it should NOT be by an intrinsic someone has to call, but due to the 
knowledge of the semantic of memset_explicit().
If this is inline or by a function call is an implementation detail.

If I had not tried to adapt to the musl coding style I would have
accepted a few 100 cycles delay for the writes to take effect:

static void *(*volatile lib_memset_fp)(void *restrict,int,size_t)=memset;
void *memset_explicit(void *restrict dest,int val,size_t len){
     return (*lib_memset_fp)(dest,val,len);
}

Regards, Aaron Peter Bachmann

On 3/19/24 15:02, Rich Felker wrote:
> On Tue, Mar 19, 2024 at 02:50:26PM +0100, Jₑₙₛ Gustedt wrote:
>> Aaron,
>>
>> on Tue, 19 Mar 2024 12:18:20 +0100 you (Aaron Peter Bachmann
>> <aaron_ng@inode.at>) wrote:
>>
>>> I recognized neither
>>> https://git.musl-libc.org/cgit/musl
>>> nor
>>> https://forge.icube.unistra.fr/icps/musl/-/branches
>>> seem to include c23 memset_explicit().
>>> Or it slipped my attention.
>> There had been such an implementation, but I removed it from the set
>> because there was no consensus how it should look like. I'd prefer
>> that someone else does it. If you want to read things up, there has
>> been a discussion on this list in May last year.
>>
>> Your patch looks like the minimal thing that one would expect. For me
>> personally that does not seem good enough. One of the things that
>> bother me is that `memset` could have varying processing times, not
>> only depending on the length of the input (which is unavoidable), but
>> also depending on its contents.
>>
>> Anyhow, Rich had elaborated a whole strategy how this feature would
>> better fallback to a builtin, if such a builtin exists. So I prefer
>> them doing it, whenever they are ready.
> I think this implementation looks exactly like what I recall
> requesting. I'm not sure what the builtin thing was. It might have
> just been that I'd like (in general) to make it so musl is able to use
> the builtins internally, but that only makes any distinction here if
> LTO is in use (i.e. if memset_explicit is inlined into the caller).
>
> I don't see where constant-time was part of the intended purpose of
> memset_explicit (rather it seems to be intended just as a best-effort
> way to avoid leaving around sensitive data, with all the possible
> pitfalls that entails) and in general we don't make any promises of
> constant-time in musl, but I don't see any reason the memset
> implementations we use would have time dependency on original data
> being overwritten, just things like whether it's cached. Maybe with a
> large (many pages) buffer, something like zram could expose
> information about the old contents through time or memory utilization,
> but that's not really something we can defend against (and probably
> not a good idea for robustness or data privacy).
>
> Rich



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] c23 memset_explicit()
  2024-03-19 15:51     ` Aaron Peter Bachmann
@ 2024-03-19 16:19       ` Rich Felker
  2024-03-20  7:42         ` [musl] Re: Potential bug in __res_msend_rc() wrt to union, initialization Aaron Peter Bachmann
  0 siblings, 1 reply; 8+ messages in thread
From: Rich Felker @ 2024-03-19 16:19 UTC (permalink / raw)
  To: Aaron Peter Bachmann; +Cc: musl, Jens Gustedt

On Tue, Mar 19, 2024 at 04:51:02PM +0100, Aaron Peter Bachmann wrote:
> I have read the discussion  from 2023-05-26 to and including 2023-05-29.
> 
> https://www.openwall.com/lists/musl/2023/05/26/8 says
> "implement explicit_bzero in terms of memset_explicit"
> As a non-native speaker I am not entirely sure what that is supposed
> to mean.
> I have two destination.
> 1. Implementing  explicit_bzero () by calling  memset_explicit()

This is what it means.

> That would save one line of source at the cost of an additional branch,
> as it is a tail-call. I do not think that is a good tradeoff. But it
> is unlikely to to make a difference in practice.

It's not a tail-call except on archs with args-in-registers and no
caller-reserved space for spilling args.

> 2. Implement it like the implementation of explicit_bzero()
> That is indeed what I have done. I lean towards this 2nd interpretation.

I think this is a better approach.

> In
> https://www.openwall.com/lists/musl/2023/05/26/8
> Joakim Sindholt has proposed THE ABSOLUTE SAME PATCH already!
> Even with d for destination, as it is done in explicit_bzero().
> c23 uses s as destination, memset() in musl dest.

Yes.

> But Jens has a point.
> In the long run I also hope for compilers recognizing
> memset_explicit() and also erase other copies it has made (in
> registers or on stack) without being explicitly requested to do so
> in the source.
> And it should NOT be by an intrinsic someone has to call, but due to
> the knowledge of the semantic of memset_explicit().
> If this is inline or by a function call is an implementation detail.

Yes, it's better for the compiler to use its own intrinsic so that it
can know to clear out anywhere else it may have spilled the same data.
But that's separate from the external function we need to implement in
libc.

> If I had not tried to adapt to the musl coding style I would have
> accepted a few 100 cycles delay for the writes to take effect:
> 
> static void *(*volatile lib_memset_fp)(void *restrict,int,size_t)=memset;
> void *memset_explicit(void *restrict dest,int val,size_t len){
>     return (*lib_memset_fp)(dest,val,len);
> }

This is a worse implementation that does not convey the semantics that
the result of the zeroing is used (and can't be optimized out), using
volatile hacks rather than an actual dependency. This approach was
rejected when explicit_bzero was proposed, IIRC.

Rich

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] c23 memset_explicit()
  2024-03-19 14:02   ` Rich Felker
  2024-03-19 14:24     ` Jₑₙₛ Gustedt
  2024-03-19 15:51     ` Aaron Peter Bachmann
@ 2024-03-19 16:37     ` NRK
  2 siblings, 0 replies; 8+ messages in thread
From: NRK @ 2024-03-19 16:37 UTC (permalink / raw)
  To: musl; +Cc: Jₑₙₛ Gustedt, Aaron Peter Bachmann

> I'm not sure what the builtin thing was.

IIRC, the idea was that if memset_explicit() is recognized by compilers
then they can make better decision. E.g if the data is held in a
register without any further copy in memory, then compiler can just
clear the register instead of spilling it onto the stack and calling
memset_explicit() on it.

But that's not really relevant for libcs trying to add support for it.
Both of them can co-exist.

- NRK

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [musl] Re: Potential bug in __res_msend_rc() wrt to union, initialization.
  2024-03-19 16:19       ` Rich Felker
@ 2024-03-20  7:42         ` Aaron Peter Bachmann
  0 siblings, 0 replies; 8+ messages in thread
From: Aaron Peter Bachmann @ 2024-03-20  7:42 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

On

19 Mar 2024 17:36:41 -0400, Rich Felker wrote:
> > > actually the introduction of `{}` versus `{0}` in C23 was not meant to
> > > provide any difference in semantics, just to make the syntax nicer and
> > > consistent with C++.
> > 
> > Regardless of what the intention was, the reality is that it *does* have
> > semantic difference. Specifically, empty initialization `{}` benefits
> > from the default initialization rules, which specify zero-ing out the
> > padding bits whereas `{0}` doesn't guarantee that.
>
> That's simply not true. There is no difference in the rules as
> specified by the standard.
There is:
Before C23:
{ } was not mentioned in the std so it was implicitly undefined.
{0} was defined.
=> This is a difference.

C23:
We can look at n3096: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf
Later versions not generally available have - to the best of my knowledge -
not changed in regard of whet we are discussing.

6.7.10 p10, p11, ...:
10 Except where explicitly stated otherwise, for the purposes of this subclause unnamed members
of objects of structure and union type do not participate in initialization. Unnamed members of
structure objects have indeterminate representation even after initialization.
11 If an object that has automatic storage duration is not initialized explicitly, its representation is
indeterminate. If an object that has static or thread storage duration is not initialized explicitly, or
any object is initialized with an empty initializer, then it is subject to default initialization, which
initializes an object as follows: ...

For me it reads that when the empty initializer {} is not used,
we have default initialization only for memory with static and thread local storage duration,
but not for automatic storage duration. This affects the padding.
=> This is a difference.

The empty initializer can be used for VLAs.
=> This is a difference. But certainly not one we are discussing.

But it does not matter for musl, as it does not use C23 and it could not reasonably do so.
The fact that compilers implementing {} before C23 could not in advance see what C23 would say
does not help either.
It will take some time before we can benefit from the stronger guarantees C23 provides via {}.

And I would also like to add: If no difference was intended, as we know from Jens,
there had better be no difference.

Regards Aaron Peter Bachmann





^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-03-20  7:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-19 11:18 [musl] c23 memset_explicit() Aaron Peter Bachmann
2024-03-19 13:50 ` Jₑₙₛ Gustedt
2024-03-19 14:02   ` Rich Felker
2024-03-19 14:24     ` Jₑₙₛ Gustedt
2024-03-19 15:51     ` Aaron Peter Bachmann
2024-03-19 16:19       ` Rich Felker
2024-03-20  7:42         ` [musl] Re: Potential bug in __res_msend_rc() wrt to union, initialization Aaron Peter Bachmann
2024-03-19 16:37     ` [musl] c23 memset_explicit() NRK

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).