mailing list of musl libc
* [musl] patches for C23
@ 2023-05-01 18:50 Jₑₙₛ Gustedt
  2023-05-01 19:24 ` Khem Raj
  2023-05-01 19:41 ` Rich Felker
  0 siblings, 2 replies; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-01 18:50 UTC (permalink / raw)
  To: musl


Hello,
I now have a series of patches (~40) that implement most of the missing
bits for C23 support. Nothing here is very deep, I think, but it is
probably still quite a chunk to review. They are compile-tested with
gcc versions 7 to 12 and clang versions 9 to 17, but only on
x86_64.

Shall I send them all here (I would have to refresh my git skills for
that) or does somebody want to browse them on some sort of gitlab host
first?

When implementing this I also marginally updated my page describing
C23 library changes

     https://gustedt.gitlabpages.inria.fr/c23-library/

Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::


* Re: [musl] patches for C23
  2023-05-01 18:50 [musl] patches for C23 Jₑₙₛ Gustedt
@ 2023-05-01 19:24 ` Khem Raj
  2023-05-01 19:41 ` Rich Felker
  1 sibling, 0 replies; 29+ messages in thread
From: Khem Raj @ 2023-05-01 19:24 UTC (permalink / raw)
  To: musl

Hi Jens

It would be good to share it via a branch on some git repository (I
prefer that :)) or perhaps on the ML as well.
I, and maybe others too, can give it a shot on some non-x86_64 arches.

Thanks
-Khem



* Re: [musl] patches for C23
  2023-05-01 18:50 [musl] patches for C23 Jₑₙₛ Gustedt
  2023-05-01 19:24 ` Khem Raj
@ 2023-05-01 19:41 ` Rich Felker
  2023-05-02  6:57   ` Jₑₙₛ Gustedt
  1 sibling, 1 reply; 29+ messages in thread
From: Rich Felker @ 2023-05-01 19:41 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Mon, May 01, 2023 at 08:50:37PM +0200, Jₑₙₛ Gustedt wrote:
> Hello,
> I now have a series of patches (~40) that implement most of the missing
> bits for C23 support. Nothing here is very deep, I think, but it is
> probably still quite a chunk to review. They are compile-tested with
> gcc versions 7 to 12 and clang versions 9 to 17, but only on
> x86_64.
> 
> Shall I send them all here (I would have to refresh my git skills for
> that) or does somebody want to browse them on some sort of gitlab host
> first?
> 
> When implementing this I also marginally updated my page describing
> C23 library changes
> 
>      https://gustedt.gitlabpages.inria.fr/c23-library/

If you want discussion of them, it would be most helpful to submit as
attachments to the list where the context of discussion will exist in
immutable form, rather than as a git branch that might disappear or be
replaced. ~40 is a lot for attachments to a single message, but ~40
individual posts would also be overwhelming, so it probably makes
sense to group them by areas of functionality or layering or
something. Likely there are going to be a bunch that are fairly
uncontroversial and mechanical, and some that are more invasive and
that will attract more meaningful discussion.

Rich


* Re: [musl] patches for C23
  2023-05-01 19:41 ` Rich Felker
@ 2023-05-02  6:57   ` Jₑₙₛ Gustedt
  2023-05-02 13:59     ` Jₑₙₛ Gustedt
  0 siblings, 1 reply; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-02  6:57 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl


Hi

on Mon, 1 May 2023 15:41:21 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> If you want discussion of them, it would be most helpful to submit as
> attachments to the list where the context of discussion will exist in
> immutable form, rather than as a git branch that might disappear or be
> replaced. ~40 is a lot for attachments to a single message, but ~40
> individual posts would also be overwhelming, so it probably makes
> sense to group them by areas of functionality or layering or
> something.

Ok, I'll see that I rebase such that we have meaningful groups of
patches.

> Likely there are going to be a bunch that are fairly
> uncontroversial and mechanical, and some that are more invasive and
> that will attract more meaningful discussion.

Yes, definitely.

I'll also set up a git repository for those who would be willing to
test the whole series. Just be aware that this is really for testing and
review, not yet ready for direct inclusion. So probably this will be
rebased several times.

Thanks
Jₑₙₛ


* Re: [musl] patches for C23
  2023-05-02  6:57   ` Jₑₙₛ Gustedt
@ 2023-05-02 13:59     ` Jₑₙₛ Gustedt
  2023-05-02 23:20       ` Rich Felker
  0 siblings, 1 reply; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-02 13:59 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl


on Tue, 2 May 2023 08:57:40 +0200 you (Jₑₙₛ Gustedt
<jens.gustedt@inria.fr>) wrote:

> I'll also set up a git repository for those who would be willing to
> test the whole series. Just be aware that this is really for testing and
> review, not yet ready for direct inclusion. So probably this will be
> rebased several times.

So this can now be found here

   https://icube-forge.unistra.fr/icps/musl/

with my additional branch called "c23". I also added tags for what I
think might be good groups to treat together. An overview should be
accessible here

           https://icube-forge.unistra.fr/icps/musl/-/network/c23?ref_type=heads

Let me know if you have any problems in accessing this.

I will then post the patches on the ML later; I will probably need some
time to do that right.

Thanks
Jₑₙₛ


* Re: [musl] patches for C23
  2023-05-02 13:59     ` Jₑₙₛ Gustedt
@ 2023-05-02 23:20       ` Rich Felker
  2023-05-03  0:00         ` Rich Felker
  2023-05-03  7:13         ` Jₑₙₛ Gustedt
  0 siblings, 2 replies; 29+ messages in thread
From: Rich Felker @ 2023-05-02 23:20 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

One quick find, in
https://icube-forge.unistra.fr/icps/musl/-/commit/3a2b83bf32d7c94f1bf0b2b2fd6ba8b6bf980d99

-				np = strtoul(r+9, &z, 10);
+				np = strtoul(r+9, (char**)&z, 10);

is UB. Accessing a const char * as char *. I would prefer in general
we just #undef any of the const-stuff-tg macros in files that use
them, or just have src/include/string.h always do that. (Not really
needed since musl source is written in c99 not c23, but it would be
nice to have it also compile with c11 and c23 compilers, so I think
the #undef is useful.)
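
A minimal sketch of a cast-free alternative, assuming the caller wants to keep a `const char *` cursor: let strtoul write into a temporary of exactly the type it expects, and convert the value afterwards. The helper name demo_parse_ul is hypothetical and does not come from the patch under discussion.

```c
#include <stdlib.h>

/* Hypothetical helper: parse a decimal number and report the position
 * where parsing stopped through a const-qualified pointer. */
static unsigned long demo_parse_ul(const char *s, const char **rest)
{
    char *end;                          /* exactly the type strtoul expects */
    unsigned long v = strtoul(s, &end, 10);
    *rest = end;                        /* char * to const char *: implicit, safe conversion */
    return v;
}
```

No object is accessed through an lvalue of a different pointer type here; only a pointer value is converted.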

Rich


* Re: [musl] patches for C23
  2023-05-02 23:20       ` Rich Felker
@ 2023-05-03  0:00         ` Rich Felker
  2023-05-03  9:12           ` Jₑₙₛ Gustedt
  2023-05-03  7:13         ` Jₑₙₛ Gustedt
  1 sibling, 1 reply; 29+ messages in thread
From: Rich Felker @ 2023-05-03  0:00 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Tue, May 02, 2023 at 07:20:09PM -0400, Rich Felker wrote:
> One quick find, in
> https://icube-forge.unistra.fr/icps/musl/-/commit/3a2b83bf32d7c94f1bf0b2b2fd6ba8b6bf980d99
> 
> -				np = strtoul(r+9, &z, 10);
> +				np = strtoul(r+9, (char**)&z, 10);
> 
> is UB. Accessing a const char * as char *. I would prefer in general
> we just #undef any of the const-stuff-tg macros in files that use
> them, or just have src/include/string.h always do that. (Not really
> needed since musl source is written in c99 not c23, but it would be
> nice to have it also compile with c11 and c23 compilers, so I think
> the #undef is useful.)

Some more things I noticed, some of them general:

For compiler feature stuff like __has_c_attribute, this should not be
littered all over headers. We have features.h that already abstracts
getting the ones we need. If there are others strictly needed, those
should be abstracted there too.
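
As a rough illustration of that kind of centralization (not musl's actual features.h), a single header could probe the compiler once and export a macro that the other headers use. The names DEMO_NODISCARD and demo_twice are hypothetical.

```c
/* Provide a fallback so the probe below is valid on compilers that
 * lack __has_c_attribute. */
#ifndef __has_c_attribute
#define __has_c_attribute(x) 0
#endif

/* Decide once, in one place, how "nodiscard" is spelled. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ > 201710L \
    && __has_c_attribute(nodiscard)
#define DEMO_NODISCARD [[nodiscard]]
#elif defined(__GNUC__)
#define DEMO_NODISCARD __attribute__((__warn_unused_result__))
#else
#define DEMO_NODISCARD
#endif

/* Other headers only ever see the abstracted macro. */
DEMO_NODISCARD static int demo_twice(int x)
{
    return 2 * x;
}
```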

It's probably not needed to avoid exposing new functions under older
__STDC_VERSION__ values. We generally do not aim for strict namespace
conformance to older versions of the standards.

On the flip side, it's not needed to jump through compiler-specific
hoops to get new features that can't be obtained in standard ways
without c23 mode. For example, nullptr_t does not need clang special
cases. If it's not c23, it just doesn't need to be defined (and in
fact strictly speaking it's a namespace violation to define it, but as
above we don't really care about that.)

It's not needed to make namespace-safe internal __-prefixed versions
of functions unless they're used to implement functions in a more
restrictive namespace profile. For example, POSIX functions needed to
implement STDC functions need this treatment, but STDC functions
basically never do.

Language/compiler baseline for building musl is not going to go up, so
this complicates some things, especially implementing the int128
stuff. This will need pop_arg to call out to an arch-provided asm
function that bypasses the C type system to get the nonexistent-type
argument off the va_list and store it in a pair of uint64_t.

There are not going to be different versions of scanf/strto* because
there's just no way to do that in a conforming way (the standard
allows declaring these functions yourself without including the
header, which would not get the remapping). As above, strict
conformance to outdated versions of the standard is just not a
priority. musl's claim/target is conformance to current versions only
and sometimes, on a case-by-case basis, partial support for older
ones. (Looking at it again, I don't understand how the code in your
repo was actually intended to provide different versions. __intscan,
etc. are not public interface boundaries and references to them never
appear in applications, only internally within libc. Your code seems
to be conditioning which gets used based on the STDC version in use
for building musl, which is completely decoupled from the version of
the standard that a given application is being built/linked for.)



* Re: [musl] patches for C23
  2023-05-02 23:20       ` Rich Felker
  2023-05-03  0:00         ` Rich Felker
@ 2023-05-03  7:13         ` Jₑₙₛ Gustedt
  2023-05-03 14:06           ` Rich Felker
  1 sibling, 1 reply; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-03  7:13 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl


Rich,

on Tue, 2 May 2023 19:20:09 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> One quick find, in
> https://icube-forge.unistra.fr/icps/musl/-/commit/3a2b83bf32d7c94f1bf0b2b2fd6ba8b6bf980d99
> 
> -				np = strtoul(r+9, &z, 10);
> +				np = strtoul(r+9, (char**)&z, 10);
> 
> is UB.

I think the situation is more subtle than that. If this were
application C code, the implementation of `strtoul` would provoke UB
under certain circumstances. And this UB would be happening in line 16
of strtol.c, not at the calling side. Here at the calling side, we
only have a pointer cast, which as such is well-defined because the
two pointer types have the same representation and alignment.

Spinning that further, the code would then be UB as written before
(with an unqualified `z`) because the call to `strtoul` then stores a
`char const*` value into a `char*` object. By providing a declaration
of `z` as `char const*`, the store in fact is now valid. So with that
application side notion of UB, the proposed patch changes the code
from UB to well-defined.

But since we are the C library implementation nothing of that is UB,
because the C standard enforces that the `char const*` value is stored
in a `char*` object.

So that cast as written above only calls out a special situation.

> Accessing a const char * as char *. I would prefer in general
> we just #undef any of the const-stuff-tg macros in files that use
> them, or just have src/include/string.h always do that. (Not really
> needed since musl source is written in c99 not c23, but it would be
> nice to have it also compile with c11 and c23 compilers, so I think
> the #undef is useful.)

I am not sure that I understand how you think that should work, we
have to provide these tg macros to our users, don't we?

In any case, I prefer to mark such positions explicitly with something
like `(strchr)(r, '\n')` as in line 222 of the code that you are
referring to. All of this is marked as obsolescent in 7.26.5.1

         If a macro definition of any of these generic functions is
         suppressed to access an actual function, the external
         declaration with the corresponding concrete type is
         visible.381)

         381) This is an obsolescent feature.
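
To illustrate the parenthesized form, here is a small sketch with a hypothetical function demo_find that is shadowed by a function-like macro of the same name, standing in for a C23-style type-generic strchr. It assumes a compiler with C11 `_Generic`.

```c
/* Hypothetical stand-in for strchr: the underlying function. */
static char *demo_find(const char *s, int c)
{
    for (; *s; s++)
        if (*s == (char)c) return (char *)s;
    return (char)c == 0 ? (char *)s : 0;
}

/* Const-preserving macro of the same name. Inside the expansion,
 * (demo_find) is not followed by '(' so it names the function. */
#define demo_find(s, c) _Generic((s), \
    const char *: (const char *)(demo_find)((s), (c)), \
    default:      (demo_find)((s), (c)))
```

Writing `(demo_find)(p, c)` at a call site suppresses the macro in the same way and yields the plain `char *` result.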

Jₑₙₛ


* Re: [musl] patches for C23
  2023-05-03  0:00         ` Rich Felker
@ 2023-05-03  9:12           ` Jₑₙₛ Gustedt
  2023-05-03 14:16             ` Rich Felker
  2023-05-04 15:50             ` [musl] patches for C23 Jeffrey Walton
  0 siblings, 2 replies; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-03  9:12 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl


Rich,
thanks for looking into this so quickly.

on Tue, 2 May 2023 20:00:45 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> Some more things I noticed, some of them general:
> 
> For compiler feature stuff like __has_c_attribute, this should not be
> littered all over headers. We have features.h that already abstracts
> getting the ones we need. If there are others strictly needed, those
> should be abstracted there too.

ok, I'll see that I group things that are needed together, there.

> It's probably not needed to avoid exposing new functions under older
> __STDC_VERSION__ values. We generally do not aim for strict namespace
> conformance to older versions of the standards.

ok, that's politics, if you think that this is fine I'll go for it

(Such a policy will be difficult to maintain when there are third-party
implementations of the decimal floating point stuff. This will be
some hundreds of new interfaces that had not been reserved. Also there
will be a bunch of such functions in `math.h`, even without decimal
floating point, which are not covered by these patches.)

> On the flip side, it's not needed to jump through compiler-specific
> hoops to get new features that can't be obtained in standard ways
> without c23 mode. For example, nullptr_t does not need clang special
> cases. If it's not c23, it just doesn't need to be defined (and in
> fact strictly speaking it's a namespace violation to define it, but as
> above we don't really care about that.)

I see your point. `nullptr` is special in that it might also be
provided as an extension in other standard modes by compilers, and
tendency in clang, for example, could be to base `NULL` on
that. Providing the `typedef` is then some nicety for our users.

> It's not needed to make namespace-safe internal __-prefixed versions
> of functions unless they're used to implement functions in a more
> restrictive namespace profile. For example, POSIX functions needed to
> implement STDC functions need this treatment, but STDC functions
> basically never do.

Ok, that simplifies things. And effectively this will give hard errors
for applications that already use such names, even if they are
dynamically linked.

This probably is a good thing.

> Language/compiler baseline for building musl is not going to go up, so
> this complicates some things, especially implementing the int128
> stuff. This will need pop_arg to call out to an arch-provided asm
> function that bypasses the C type system to get the nonexistent-type
> argument off the va_list and store it in a pair of uint64_t.

I don't see that. `pop_arg` just uses `va_arg` and that in turn is
fixed to `__builtin_va_arg`. The proposed patches assume that if
`__SIZEOF_INT128__` is defined by the compiler, then the compiler
provides the `__int128` types and knows how to deal with them in
`__builtin_va_arg`. Is there anything wrong with that assumption?

> There are not going to be different versions of scanf/strto* because
> there's just no way to do that in a conforming way (the standard
> allows declaring these functions yourself without including the
> header, which would not get the remapping).

Yes, sure. Applications that do that would receive the default
treatment which is delivered with a given C library, much as for other
legacy objects that had been compiled with previous versions.

> As above, strict conformance to outdated versions of the standard is
> just not a priority. musl's claim/target is conformance to current
> versions only and sometimes, on a case-by-case basis, partial
> support for older ones.

Yes. But this here is really something to consider. Legacy executables
that are linked dynamically may behave semantically differently with
this patch. This might even have security implications. E.g., within
musl itself in inet_aton.c there is a use with a base of `0` that
could perhaps be abused to do spoofy things.

> (Looking at it again, I don't understand how the code in your
> repo was actually intended to provide different versions. __intscan,
> etc. are not public interface boundaries and references to them never
> appear in applications, only internally within libc. Your code seems
> to be conditioning which gets used based on the STDC version in use
> for building musl, which is completely decoupled from the version of
> the standard that a given application is being built/linked for.)

Yes, this is an attempt to cope with the situation, which is probably
not yet final. I was hesitating to do the work of duplicating all the
externally visible interfaces before we agree upon such an approach.

The idea is that a C library provider (i.e. a Linux distribution) would
have to decide which version to provide as a default for legacy
executables. (Here this is done by using the C version with which the
library is compiled, but other mechanisms would be possible.)

Newly compiled code (except for the extremists that you mentioned above)
would get the mode, C17 or C23, under which it is compiled.

Jₑₙₛ


* Re: [musl] patches for C23
  2023-05-03  7:13         ` Jₑₙₛ Gustedt
@ 2023-05-03 14:06           ` Rich Felker
  2023-05-03 14:26             ` Jₑₙₛ Gustedt
  0 siblings, 1 reply; 29+ messages in thread
From: Rich Felker @ 2023-05-03 14:06 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Wed, May 03, 2023 at 09:13:40AM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Tue, 2 May 2023 19:20:09 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
> 
> > One quick find, in
> > https://icube-forge.unistra.fr/icps/musl/-/commit/3a2b83bf32d7c94f1bf0b2b2fd6ba8b6bf980d99
> > 
> > -				np = strtoul(r+9, &z, 10);
> > +				np = strtoul(r+9, (char**)&z, 10);
> > 
> > is UB.
> 
> I think the situation is more subtle than that. If this were
> application C code, the implementation of `strtoul` would provoke UB
> under certain circumstances. And this UB would be happening in line 16
> of strtol.c, not at the calling side. Here at the calling side, we
> only have a pointer cast, which as such is well-defined because the
> two pointer types have the same representation and alignment.

The UB on the application side is passing a pointer to an object of
the wrong type to a standard function. But internally within the
implementation, the actual UB happens inside the implementation of
strtol. In any case it's wrong/UB.

> Spinning that further, the code would then be UB as written before
> (with an unqualified `z`) because the call to `strtoul` then stores a
> `char const*` value into a `char*` object. By providing a declaration

No, it does not. It stores a value that originally had type const
char *, converted into a value of type char *, into an object of type
char *. You're confusing accessing an object with wrong lvalue type
with conversions.
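
The distinction can be sketched with a hypothetical strtoul-like routine (decimal only, no overflow handling, not musl's code). The `const char *` value is first converted to `char *`, and only then stored through an lvalue of type `char *`.

```c
/* Hypothetical, simplified strtoul-like routine. */
static unsigned long demo_strtoul10(const char *s, char **end)
{
    unsigned long v = 0;
    while (*s >= '0' && *s <= '9')
        v = v * 10 + (unsigned long)(*s++ - '0');
    /* Value conversion, then a store into a char * object through a
     * char * lvalue: no object is accessed with a mismatched type. */
    if (end) *end = (char *)s;
    return v;
}
```

By contrast, a caller that passes `(char **)&z` for a `const char *z` makes this store write to an object whose declared type is `const char *`, which is the access Rich objects to.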

> > Accessing a const char * as char *. I would prefer in general
> > we just #undef any of the const-stuff-tg macros in files that use
> > them, or just have src/include/string.h always do that. (Not really
> > needed since musl source is written in c99 not c23, but it would be
> > nice to have it also compile with c11 and c23 compilers, so I think
> > the #undef is useful.)
> 
> I am not sure that I understand how you think that should work, we
> have to provide these tg macros to our users, don't we?

Yes but we don't have to use them inside the implementation.
src/include/* are a layer on top of the public headers for
implementation-internal use.

> In any case, I prefer to mark such positions explicitly with something
> like `(strchr)(r, '\n')` as in line 222 of the code that you are
> referring to. All of this is marked as obsolescent in 7.26.5.1

I'd rather just fix it in one place (the implementation-internal
header) so we don't have to worry about it.
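
A minimal sketch of that single-place fix, assuming an internal wrapper that includes the public header and then drops any macro definition. Only strchr is shown and demo_count_lines is a hypothetical user; the real src/include/string.h may look different.

```c
#include <string.h>

/* If the public header defined strchr as a (type-generic) macro, drop
 * it so the name refers to the plain function again. */
#ifdef strchr
#undef strchr
#endif

/* Hypothetical internal user: count newlines in a const buffer. The
 * char * result of the plain function is accepted without any cast. */
static size_t demo_count_lines(const char *s)
{
    size_t n = 0;
    for (char *p; (p = strchr(s, '\n')); s = p + 1)
        n++;
    return n;
}
```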

Rich


* Re: [musl] patches for C23
  2023-05-03  9:12           ` Jₑₙₛ Gustedt
@ 2023-05-03 14:16             ` Rich Felker
  2023-05-03 15:11               ` Jₑₙₛ Gustedt
  2023-05-04 15:50             ` [musl] patches for C23 Jeffrey Walton
  1 sibling, 1 reply; 29+ messages in thread
From: Rich Felker @ 2023-05-03 14:16 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Wed, May 03, 2023 at 11:12:46AM +0200, Jₑₙₛ Gustedt wrote:
> > Language/compiler baseline for building musl is not going to go up, so
> > this complicates some things, especially implementing the int128
> > stuff. This will need pop_arg to call out to an arch-provided asm
> > function that bypasses the C type system to get the nonexistent-type
> > argument off the va_list and store it in a pair of uint64_t.
> 
> I don't see that. `pop_arg` just uses `va_arg` and that in turn is
> fixed to `__builtin_va_arg`. The proposed patches assume that if
> `__SIZEOF_INT128__` is defined by the compiler, then the compiler
> provides the `__int128` types and knows how to deal with them in
> `__builtin_va_arg`. Is there anything wrong with that assumption?

Yes. We don't require a compiler that has an __int128. The feature set
of the library is not allowed to vary depending on the compiler
version it was built with. The only non-UB way to get an __int128 out
of a va_list if the compiler has no idea there's such a thing as
__int128 is to write asm that bypasses the C type system. There can be
a "generic" version of this TU, I guess, for archs where __int128 has
always been part of the arch ABI definition, that just uses a C
function calling va_arg; this would also be suitable for folks reusing
the code in places like wasm where an asm implementation isn't
suitable and where they have more control over the tooling.
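
The "generic" C variant could look roughly like the following sketch, which only builds where the compiler provides `__int128`. The names demo_u128, demo_pop_u128 and demo_first_arg are hypothetical; the asm variants for compilers without the type are not shown.

```c
#include <stdarg.h>
#include <stdint.h>

#ifndef __SIZEOF_INT128__
#error "this generic variant needs a compiler that provides __int128"
#endif

/* The rest of the library only ever sees a pair of 64-bit halves. */
struct demo_u128 { uint64_t lo, hi; };

/* Pop one 128-bit argument off the list and split it. */
static void demo_pop_u128(struct demo_u128 *out, va_list *ap)
{
    unsigned __int128 v = va_arg(*ap, unsigned __int128);
    out->lo = (uint64_t)v;
    out->hi = (uint64_t)(v >> 64);
}

/* Small variadic driver so the helper can be exercised. */
static struct demo_u128 demo_first_arg(int count, ...)
{
    struct demo_u128 r = {0, 0};
    va_list ap;
    va_start(ap, count);
    if (count > 0) demo_pop_u128(&r, &ap);
    va_end(ap);
    return r;
}
```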

> > As above, strict conformance to outdated versions of the standard is
> > just not a priority. musl's claim/target is conformance to current
> > versions only and sometimes, on a case-by-case basis, partial
> > support for older ones.
> 
> Yes. But this here is really something to consider. Legacy executables
> that are linked dynamically may behave semantically differently with
> this patch. This might even have security implications. E.g., within
> musl itself in inet_aton.c there is a use with a base of `0` that
> could perhaps be abused to do spoofy things.

One thing that could be done here, but I'm not sure if it's useful or
appropriate, is linking an object file defining a symbol named
something like __c23_profile with value 1 into the application or
shared library built in c23 mode. This would override (via
interposition) a definition with the value zero internal to libc, and
could be used to switch on incompatible features like this. I'm
skeptical whether this kind of thing is something we should do or want
to do, but it's at least something we could consider...

It seems unfortunate that the committee did not consider this
adequately. It would have made a lot more sense to leave the behavior
of base==0 alone and add new behaviors with base=-1 or something. But
FWIW the same kind of incompatible change already happened with
floating point in the past (strtod/scanf %e/f/g consuming hex floats
rather than reading "0x..." as 0) and the world didn't explode.
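
The behavior change for base 0 can be sketched with a small self-contained parser (hypothetical, unsigned only, no overflow or sign handling). It does not call the host strtoul, so the result does not depend on which rules the host libc applies. The c23_mode flag stands in for whatever selection mechanism is used, such as the interposed symbol mentioned above.

```c
/* Hypothetical parser with base-0 prefix detection. With c23_mode set,
 * a "0b"/"0B" prefix selects base 2; otherwise the leading "0" selects
 * octal and parsing stops at the 'b'. */
static unsigned long demo_parse_base0(const char *s, int c23_mode)
{
    int base = 10;
    if (s[0] == '0') {
        if (s[1] == 'x' || s[1] == 'X') { base = 16; s += 2; }
        else if (c23_mode && (s[1] == 'b' || s[1] == 'B')) { base = 2; s += 2; }
        else base = 8;
    }
    unsigned long v = 0;
    for (;; s++) {
        int d;
        if (*s >= '0' && *s <= '9') d = *s - '0';
        else if (*s >= 'a' && *s <= 'f') d = *s - 'a' + 10;
        else if (*s >= 'A' && *s <= 'F') d = *s - 'A' + 10;
        else break;
        if (d >= base) break;           /* digit not valid in this base */
        v = v * (unsigned long)base + (unsigned long)d;
    }
    return v;
}
```

With these rules the input "0b101" parses as 0 in the old mode and as 5 in the new one, which is the kind of silent difference a legacy executable would see.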

Rich


* Re: [musl] patches for C23
  2023-05-03 14:06           ` Rich Felker
@ 2023-05-03 14:26             ` Jₑₙₛ Gustedt
  2023-05-03 14:43               ` Rich Felker
  0 siblings, 1 reply; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-03 14:26 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl


Rich,

on Wed, 3 May 2023 10:06:20 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> I'd rather just fix it in one place (the implementation-internal
> header) so we don't have to worry about it.

I still don't understand which header that is supposed to be.

Jₑₙₛ


* Re: [musl] patches for C23
  2023-05-03 14:26             ` Jₑₙₛ Gustedt
@ 2023-05-03 14:43               ` Rich Felker
  2023-05-03 15:26                 ` Jₑₙₛ Gustedt
  0 siblings, 1 reply; 29+ messages in thread
From: Rich Felker @ 2023-05-03 14:43 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Wed, May 03, 2023 at 04:26:49PM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Wed, 3 May 2023 10:06:20 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
> 
> > I'd rather just fix it in one place (the implementation-internal
> > header) so we don't have to worry about it.
> 
> I still don't understand which header that is supposed to be.

See src/include/*.h. These headers so far mostly just declare
additional __-prefixed versions of interfaces where needed, but some
of them do additional things. For example, src/include/stdio.h does
some additional things:

- suppresses the complete definition of FILE, which conflicts with
  libc-internal use where we have a real structure not a gratuitous
  fake type for pre-c11 (and POSIX) conformance reasons.

- replaces the stdin/out/err macros with ones that resolve directly to
  address-of the internal objects rather than pointer objects subject
  to copy relocations -- this makes internal codegen a lot more
  efficient for functions which implicitly use stdin/out.

Some other things I eventually intend to do in src/include/*:

- making memcpy expand to __builtin_memcpy if available, and similar
  for other string functions with builtins; the few places where that
  could be a problem would need to #undef them.

- making calls to some functions where the interposable call overhead
  is likely significant expand to direct calls to hidden aliases.

- etc.

Undefining macro definitions that are unsuitable for some reason for
the implementation-internal code in libc is another perfectly good use
for these.

Rich

* Re: [musl] patches for C23
  2023-05-03 14:16             ` Rich Felker
@ 2023-05-03 15:11               ` Jₑₙₛ Gustedt
  2023-05-03 17:28                 ` Rich Felker
  0 siblings, 1 reply; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-03 15:11 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Rich,

on Wed, 3 May 2023 10:16:19 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> On Wed, May 03, 2023 at 11:12:46AM +0200, Jₑₙₛ Gustedt wrote:
> > > Language/compiler baseline for building musl is not going to go
> > > up, so this complicates some things, especially implementing the
> > > int128 stuff. This will need pop_arg to call out to an
> > > arch-provided asm function that bypasses the C type system to get
> > > the nonexistent-type argument off the va_list and store it in a
> > > pair of uint64_t.  
> > 
> > I don't see that. `pop_arg` just uses `va_arg` and that in turn is
> > fixed to `__builtin_va_arg`. The proposed patches assume that if
> > `__SIZEOF_INT128__` is defined by the compiler that then the
> > compiler provides the `__int128` types and knows how to deal with
> > them in `__builtin_va_arg`. Is there anything wrong with that
> > assumtion?  
> 
> Yes. We don't require a compiler that has an __int128.

sure, but all the uses are protected by `__SIZEOF_INT128__`. So if the
compiler doesn't have this, it will not see that code when compiling
musl.

Also application side compilation with a different compiler that has
no `__int128` would not see these interfaces, so such application code
can never call into the library with `__int128` types.

> The feature set of the library is not allowed to vary depending on
> the compiler version it was built with.

I am not sure that I can follow. The feature set will vary, modelled
with feature macros as provided by the compiler. That is what these
things are made for, no? Basically that would mean that we can never
extend ABIs.

This will for sure occur in the future for `_BitInt` types, where both
gcc and clang intend to raise `__BITINT_MAXWIDTH__` from the current
128 to much bigger numbers. Since we have to provide
`BITINT_MAXWIDTH`, we would not be able to adapt, because that would
change the feature set.

> The only non-UB way to get an __int128 out of a va_list if the
> compiler has no idea there's such a thing as __int128 is to write
> asm that bypasses the C type system.

But nobody will want to do that because that would be completely
useless. If the compiler does not implement these types, there is no
point in, or even possibility of, feeding them in or out of the
library. (If the library was compiled with int128 support and the
actual compiler doesn't provide it, that is just a bit of dead code in
the library.)

(And relying on `__builtin_va_arg` basically means that, yes, the
feature set changes with the compiler that is used to compile the
library.)

In the other direction, when user code has `__int128` and the library
doesn't, uses of functions that have 128-bit types in their interfaces
would fail at link time. Calls to the `va_arg` functions in the
library (the `printf` and `scanf` functions) would just fail, because
the library wouldn't know how to handle `w128` format specifiers. If
we would like the latter to fail at compilation or link time, there
would certainly be a way to do that with some artificial symbol
`__needs_128_bit_types` or so.

(Application `va_arg` functions are safe because they would use the
updated `__builtin_va_arg`.)

> There can be a "generic" version of this TU, I guess, for archs
> where __int128 has always been part of the arch ABI definition, that
> just uses a C function calling va_arg; this would also be suitable
> for folks reusing the code in places like wasm where an asm
> implementation isn't suitable and where they have more control over
> the tooling.

> > > As above, strict conformance to outdated versions of the standard
> > > is just not a priority. musl's claim/target is conformance to
> > > current versions only and sometimes, on a case-by-case basis,
> > > partial support for older ones.  
> > 
> > Yes. But this here is really something to consider. Legacy
> > executables that are linked dynamically may behave semantically
> > different with this patch. This might even have security
> > implications. E.g within musl itself in inet_aton.c there is a use
> > with a base of `0` that could perhaps be abused to do spoofy
> > things.  
> 
> One thing that could be done here, but I'm not sure if it's useful or
> appropriate, is linking an object file defining a symbol named
> something like __c23_profile with value 1 into the application or
> shared library built in c23 mode. This would override (via
> interposition) a definition with the value zero internal to libc, and
> could be used to switch on incompatible features like this. I'm
> skeptical whether this kind of thing is something we should do or want
> to do, but it's at least something we could consider...
> 
> It seems unfortunate that the committee did not consider this
> adequately.

I tried my best, but this was basically brushed over without much
argument, and with a rather pitiful attitude.

> It would have made a lot more sense to leave the behavior
> of base==0 alone and add new behaviors with base=-1 or something. But
> FWIW the same kind of incompatible change already happened with
> floating point in the past (strtod/scanf %e/f/g consuming hex floats
> rather than reading "0x..." as 0) and the world didn't explode.

The integer functions are much more widely used, in particular in
security-critical situations such as translating IP or memory
addresses. If someone happens to place 0b numbers in a textual IP
address, for example, and the application parses them with base 0,
c23-compiled libraries would get the right address and c17-compiled
forgotten legacy platforms would see a default IP address of all
zeros. And as musl itself shows, it is tempting to use base 0 here,
because currently some people use decimal and some may use hexadecimal
numbers.
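
The difference can be made visible by checking how much of the input
`strtoul` consumed. The following is a minimal sketch, assuming a
hypothetical helper named `parse_component`; it is not musl code, and
whether the `0b` prefix is accepted depends on the C library the
program runs against.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (illustration only, not part of musl): parse one
 * numeric component with base 0 and accept it only if the whole string
 * was consumed. Returns 0 and stores the value in *out on success,
 * returns -1 otherwise. */
int parse_component(const char *s, unsigned long *out)
{
    char *end;
    unsigned long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtoul(s, &end, 0);
    /* Reject empty input, trailing garbage and out-of-range values. */
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;
    *out = v;
    return 0;
}
```

With a library that implements the C23 rules, `parse_component("0b11",
&v)` would be expected to succeed with the value 3. With an older
library, `strtoul` stops after the leading "0", so the helper reports
an error rather than silently returning 0.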

Jₑₙₛ

* Re: [musl] patches for C23
  2023-05-03 14:43               ` Rich Felker
@ 2023-05-03 15:26                 ` Jₑₙₛ Gustedt
  0 siblings, 0 replies; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-03 15:26 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Rich,

on Wed, 3 May 2023 10:43:46 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> Undefining macro definitions that are unsuitable for some reason to
> the implementation-internal code in libc is another perfectly good use
> for these.

Ah, right, now I see. I'll do that then, and drop the two related
patches with const issues.

This means also that I have to be more careful when changing to the
new include guards such that I do the changes in both files in
parallel.

Jₑₙₛ

* Re: [musl] patches for C23
  2023-05-03 15:11               ` Jₑₙₛ Gustedt
@ 2023-05-03 17:28                 ` Rich Felker
  2023-05-03 18:46                   ` Jₑₙₛ Gustedt
  0 siblings, 1 reply; 29+ messages in thread
From: Rich Felker @ 2023-05-03 17:28 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Wed, May 03, 2023 at 05:11:11PM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Wed, 3 May 2023 10:16:19 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
> 
> > On Wed, May 03, 2023 at 11:12:46AM +0200, Jₑₙₛ Gustedt wrote:
> > > > Language/compiler baseline for building musl is not going to go
> > > > up, so this complicates some things, especially implementing the
> > > > int128 stuff. This will need pop_arg to call out to an
> > > > arch-provided asm function that bypasses the C type system to get
> > > > the nonexistent-type argument off the va_list and store it in a
> > > > pair of uint64_t.  
> > > 
> > > I don't see that. `pop_arg` just uses `va_arg` and that in turn is
> > > fixed to `__builtin_va_arg`. The proposed patches assume that if
> > > `__SIZEOF_INT128__` is defined by the compiler that then the
> > > compiler provides the `__int128` types and knows how to deal with
> > > them in `__builtin_va_arg`. Is there anything wrong with that
> > > assumtion?  
> > 
> > Yes. We don't require a compiler that has an __int128.
> 
> sure, but all the uses are protected by `__SIZEOF_INT128__`. So if the
> compiler don't has this, they will not see that code when compiling
> musl.

Again, there are not multiple versions of musl with different features
depending on which compiler was used to compile them. There is one
unified feature set. There are not configure-time or compile-time
decisions about which features to support.

> Also application side compilation with a different compiler that has
> no `__int128` would not see these interfaces, so such application code
> can never call into the library with `__int128` types.

The compiler used to compile musl and the compiler used to compile the
application using musl have nothing to do with each other except
sharing a baseline ABI target.

Rich

* Re: [musl] patches for C23
  2023-05-03 17:28                 ` Rich Felker
@ 2023-05-03 18:46                   ` Jₑₙₛ Gustedt
  2023-05-03 19:33                     ` Rich Felker
  0 siblings, 1 reply; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-03 18:46 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Rich,

on Wed, 3 May 2023 13:28:02 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> On Wed, May 03, 2023 at 05:11:11PM +0200, Jₑₙₛ Gustedt wrote:
> > Rich,
> > 
> > on Wed, 3 May 2023 10:16:19 -0400 you (Rich Felker
> > <dalias@libc.org>) wrote:
> >   
> > > On Wed, May 03, 2023 at 11:12:46AM +0200, Jₑₙₛ Gustedt wrote:  
>  [...]  
>  [...]  
> > > 
> > > Yes. We don't require a compiler that has an __int128.  
> > 
> > sure, but all the uses are protected by `__SIZEOF_INT128__`. So if
> > the compiler don't has this, they will not see that code when
> > compiling musl.  
> 
> Again, there are not multiple versions of musl with different features
> depending on which compiler was used to compile them. There is one
> unified feature set. There are not configure-time or compile-time
> decisions about which features to support.

This sounds a bit dogmatic and also unrealistic. As I said, the
dependency on compiler builtins undermines that approach. Future
versions of gcc and clang will soon support `va_start` with only one
parameter, for example. Musl will just be dependent on that compiler
feature.

How will you deal with optional features, then? For example, decimal
floating point? Will this never be added to musl? (Probably nobody
will backport support for them to very old gcc versions, for example,
or even to more recent versions of clang.)

> > Also application side compilation with a different compiler that has
> > no `__int128` would not see these interfaces, so such application
> > code can never call into the library with `__int128` types.  
> 
> The compiler used to compile musl and the compiler used to compile the
> application using musl have nothing to do with each other except
> sharing a baseline ABI target.

Yes, exactly. And one supporting `__int128` and the other not
supporting it basically wouldn't interfere.

For the support of `__int128`: gcc has had this for ages on 64-bit
archs. Is there any such arch out there where this support changes
between versions of gcc that are still in use? So if we make the
availability of `__int128` dependent on `UINTPTR_WIDTH` being 64,
would that be acceptable for you? Or an even narrower approach that
special-cases the architectures where this has always been available?

Thanks
Jₑₙₛ

* Re: [musl] patches for C23
  2023-05-03 18:46                   ` Jₑₙₛ Gustedt
@ 2023-05-03 19:33                     ` Rich Felker
  2023-05-04  1:09                       ` Gabriel Ravier
  2023-05-04  6:48                       ` Jₑₙₛ Gustedt
  0 siblings, 2 replies; 29+ messages in thread
From: Rich Felker @ 2023-05-03 19:33 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Wed, May 03, 2023 at 08:46:56PM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Wed, 3 May 2023 13:28:02 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
> 
> > On Wed, May 03, 2023 at 05:11:11PM +0200, Jₑₙₛ Gustedt wrote:
> > > Rich,
> > > 
> > > on Wed, 3 May 2023 10:16:19 -0400 you (Rich Felker
> > > <dalias@libc.org>) wrote:
> > >   
> > > > On Wed, May 03, 2023 at 11:12:46AM +0200, Jₑₙₛ Gustedt wrote:  
> >  [...]  
> >  [...]  
> > > > 
> > > > Yes. We don't require a compiler that has an __int128.  
> > > 
> > > sure, but all the uses are protected by `__SIZEOF_INT128__`. So if
> > > the compiler don't has this, they will not see that code when
> > > compiling musl.  
> > 
> > Again, there are not multiple versions of musl with different features
> > depending on which compiler was used to compile them. There is one
> > unified feature set. There are not configure-time or compile-time
> > decisions about which features to support.
> 
> This sounds a bit dogmatic

Yes, it's one of the core principles of musl: that we don't have
build-time-selectable feature-set like uclibc did.

> and also unrealistic. As said the dependency
> on compiler builtins undermines that approach. Future versions of gcc
> and clang will soon support `va_start` with only one parameter for
> example. Musl will just be dependent on that compiler feature.

No it won't. None of the code in musl calls or needs to call va_start
with one parameter. You're confusing header-level stuff that a c23
application might depend on, with build dependencies of libc.

> How will you do with optional features, then? For example decimal
> floating point? This will never be added to musl? (Nobody will
> probably backport support for them to very old gcc versions, for
> example, or even to more recent versions of clang)

Decimal float math library will likely be left to a third-party
library implementation.

Decimal float in printf, if that becomes a thing, will be done the
same way as int128: stub to pop the arguments, and 100% integer code
to actually work with the data.

> > > Also application side compilation with a different compiler that has
> > > no `__int128` would not see these interfaces, so such application
> > > code can never call into the library with `__int128` types.  
> > 
> > The compiler used to compile musl and the compiler used to compile the
> > application using musl have nothing to do with each other except
> > sharing a baseline ABI target.
> 
> Yes, exactly. And one supporting `__int128` and the other that doesn't
> basically wouldn't interfere.

The premise here is that applications and libc are being built by
possibly different people with different tools. If I have a system
built with gcc 5.3, I can't build C23 applications, but I might get a
dynamically-linked C23 binary from someone who can. That binary needs
to run with my musl-1.2.7 (made-up number) libc.so because the C
language version the binary was generated from (or whether it was even
C at all) is irrelevant. The interface surface is just the musl ABI
surface.

> For the support of `__int128`: gcc has this since ages on 64 bit
> archs, is there any such arch out there where this support is changing
> according to versions of gcc that are still in use? So if we make the

We also support pcc, cparser+libfirm, etc. on archs they support. Not
just gcc. And gcc back to 3.x.

> availability of `__int128` dependent on `UINTPTR_WIDTH` being 64,
> would that be acceptable for you? Or an even more dependent approach
> with special casing architectures where this is available since
> always?

It's not really "special casing archs where this is available since
always". It's more like the other way around, "not special casing
archs where __int128 is a guaranteed part of the baseline psABI". For
those we can just let the default C implementation be used. For the
rest we need a (completely trivial) asm stub that pops the arg
according to the variadic argument ABI for the arch. This really isn't
that big a deal. It's a few instructions at most.

Rich

* Re: [musl] patches for C23
  2023-05-03 19:33                     ` Rich Felker
@ 2023-05-04  1:09                       ` Gabriel Ravier
  2023-05-04 14:07                         ` Rich Felker
  2023-05-04  6:48                       ` Jₑₙₛ Gustedt
  1 sibling, 1 reply; 29+ messages in thread
From: Gabriel Ravier @ 2023-05-04  1:09 UTC (permalink / raw)
  To: musl, Rich Felker, Jₑₙₛ Gustedt

On 5/3/23 21:33, Rich Felker wrote:
> On Wed, May 03, 2023 at 08:46:56PM +0200, Jₑₙₛ Gustedt wrote:
>> Rich,
>>
>> on Wed, 3 May 2023 13:28:02 -0400 you (Rich Felker <dalias@libc.org>)
>> wrote:
>>
>>> On Wed, May 03, 2023 at 05:11:11PM +0200, Jₑₙₛ Gustedt wrote:
>>>> Rich,
>>>>
>>>> on Wed, 3 May 2023 10:16:19 -0400 you (Rich Felker
>>>> <dalias@libc.org>) wrote:
>>>>    
>>>>> On Wed, May 03, 2023 at 11:12:46AM +0200, Jₑₙₛ Gustedt wrote:
>>>   [...]
>>>   [...]
>>>>> Yes. We don't require a compiler that has an __int128.
>>>> sure, but all the uses are protected by `__SIZEOF_INT128__`. So if
>>>> the compiler don't has this, they will not see that code when
>>>> compiling musl.
>>> Again, there are not multiple versions of musl with different features
>>> depending on which compiler was used to compile them. There is one
>>> unified feature set. There are not configure-time or compile-time
>>> decisions about which features to support.
>> This sounds a bit dogmatic
> Yes, it's one of the core principles of musl: that we don't have
> build-time-selectable feature-set like uclibc did.
>
>> and also unrealistic. As said the dependency
>> on compiler builtins undermines that approach. Future versions of gcc
>> and clang will soon support `va_start` with only one parameter for
>> example. Musl will just be dependent on that compiler feature.
> No it won't. None of the code in musl calls or needs to call va_start
> with one parameter. You're confusing header-level stuff that a c23
> application might depend on, with build dependencies of libc.
>
>> How will you do with optional features, then? For example decimal
>> floating point? This will never be added to musl? (Nobody will
>> probably backport support for them to very old gcc versions, for
>> example, or even to more recent versions of clang)
> Decimal float math library will likely be left to a third-party
> library implementation.
>
> Decimal float in printf, if that becomes a thing, will be done the
> same way as int128: stub to pop the arguments, and 100% integer code
> to actually work with the data.
>
>>>> Also application side compilation with a different compiler that has
>>>> no `__int128` would not see these interfaces, so such application
>>>> code can never call into the library with `__int128` types.
>>> The compiler used to compile musl and the compiler used to compile the
>>> application using musl have nothing to do with each other except
>>> sharing a baseline ABI target.
>> Yes, exactly. And one supporting `__int128` and the other that doesn't
>> basically wouldn't interfere.
> The premise here is that applications and libc are being built by
> possibly different people with different tools. If I have a system
> built with gcc 5.3, I can't build C23 applications, but I might get a
> dynamically-linked C23 binary from someone who can. That binary needs
> to run with my musl-1.2.7 (made-up number) libc.so because the C
> language version the binary was generated from (or whether it was even
> C at all) is irrelevant. The interface surface is just the musl ABI
> surface.
>
>> For the support of `__int128`: gcc has this since ages on 64 bit
>> archs, is there any such arch out there where this support is changing
>> according to versions of gcc that are still in use? So if we make the
> We also support pcc, cparser+libfirm, etc. on archs they support. Not
> just gcc. And gcc back to 3.x.
GCC 3.x??? I understand wanting backwards compatibility, but compiler
versions from barely after the turn of the century seem a bit far.
I guess it's admirable that musl works with versions of software that
are older than I am, but at some point I have to wonder whether even a
single person in the world actually finds it useful to be able to
build musl with GCC 3 in 2023...
>
>> availability of `__int128` dependent on `UINTPTR_WIDTH` being 64,
>> would that be acceptable for you? Or an even more dependent approach
>> with special casing architectures where this is available since
>> always?
> It's not really "special casing archs where this is available since
> always". It's more like the other way around, "not special casing
> archs where __int128 is a guaranteed part of the baseline psABI". For
> those we can just let the default C implementation be used. For the
> rest we need a (completely trivial) asm stub that pops the arg
> according to the variadic argument ABI for the arch. This really isn't
> that big a deal. It's a few instructions at most.
>
> Rich



* Re: [musl] patches for C23
  2023-05-03 19:33                     ` Rich Felker
  2023-05-04  1:09                       ` Gabriel Ravier
@ 2023-05-04  6:48                       ` Jₑₙₛ Gustedt
  2023-05-04 14:30                         ` Rich Felker
  1 sibling, 1 reply; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-04  6:48 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Rich,

on Wed, 3 May 2023 15:33:26 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> On Wed, May 03, 2023 at 08:46:56PM +0200, Jₑₙₛ Gustedt wrote:
> > Rich,
> > 
> > on Wed, 3 May 2023 13:28:02 -0400 you (Rich Felker
> > <dalias@libc.org>) wrote:
> >   
> > > On Wed, May 03, 2023 at 05:11:11PM +0200, Jₑₙₛ Gustedt wrote:  
>  [...]  
>  [...]  
> > >  [...]  
> > >  [...]    
>  [...]  
>  [...]  
> > > 
> > > Again, there are not multiple versions of musl with different
> > > features depending on which compiler was used to compile them.
> > > There is one unified feature set. There are not configure-time or
> > > compile-time decisions about which features to support.  
> > 
> > This sounds a bit dogmatic  
> 
> Yes, it's one of the core principles of musl: that we don't have
> build-time-selectable feature-set like uclibc did.
> 
> > and also unrealistic. As said the dependency
> > on compiler builtins undermines that approach. Future versions of
> > gcc and clang will soon support `va_start` with only one parameter
> > for example. Musl will just be dependent on that compiler feature.  
> 
> No it won't. None of the code in musl calls or needs to call va_start
> with one parameter. You're confusing

??

> header-level stuff that a c23
> application might depend on, with build dependencies of libc.
> 
> > How will you do with optional features, then? For example decimal
> > floating point? This will never be added to musl? (Nobody will
> > probably backport support for them to very old gcc versions, for
> > example, or even to more recent versions of clang)  
> 
> Decimal float math library will likely be left to a third-party
> library implementation.
> 
> Decimal float in printf, if that becomes a thing, will be done the
> same way as int128: stub to pop the arguments, and 100% integer code
> to actually work with the data.

>  [...]  
> > > 
> > > The compiler used to compile musl and the compiler used to
> > > compile the application using musl have nothing to do with each
> > > other except sharing a baseline ABI target.  
> > 
> > Yes, exactly. And one supporting `__int128` and the other that
> > doesn't basically wouldn't interfere.  
> 
> The premise here is that applications and libc are being built by
> possibly different people with different tools. If I have a system
> built with gcc 5.3, I can't build C23 applications, but I might get a
> dynamically-linked C23 binary from someone who can. That binary needs
> to run with my musl-1.2.7 (made-up number) libc.so because the C
> language version the binary was generated from (or whether it was even
> C at all) is irrelevant. The interface surface is just the musl ABI
> surface.
> 
> > For the support of `__int128`: gcc has this since ages on 64 bit
> > archs, is there any such arch out there where this support is
> > changing according to versions of gcc that are still in use? So if
> > we make the  
> 
> We also support pcc, cparser+libfirm, etc. on archs they support. Not
> just gcc. And gcc back to 3.x.
> 
> > availability of `__int128` dependent on `UINTPTR_WIDTH` being 64,
> > would that be acceptable for you? Or an even more dependent approach
> > with special casing architectures where this is available since
> > always?  
> 
> It's not really "special casing archs where this is available since
> always". It's more like the other way around, "not special casing
> archs where __int128 is a guaranteed part of the baseline psABI". For
> those we can just let the default C implementation be used. For the
> rest we need a (completely trivial) asm stub that pops the arg
> according to the variadic argument ABI for the arch. This really isn't
> that big a deal. It's a few instructions at most.

I would still prefer that on those archs where there is `__int128` or
`_BitInt(128)` (the latter on basically all C23 compilers, I think)
the default is done with that compiler support. We should leave to the
compiler people what they do best ;-)

This leaves us with fallback code to write that will probably rarely
be used. Also, I have difficulty assessing the effort that is
needed. There are the `printf`, `scanf` and the new bit-fiddling
interfaces. For the latter, the current proposal is to have them
implemented as shallow static inline functions. That would be a bit
complicated without compiler support.

All in all, this sounds to me like a substantial effort in
implementation and coordination. What is the way forward here?

Thanks
Jₑₙₛ

* Re: [musl] patches for C23
  2023-05-04  1:09                       ` Gabriel Ravier
@ 2023-05-04 14:07                         ` Rich Felker
  0 siblings, 0 replies; 29+ messages in thread
From: Rich Felker @ 2023-05-04 14:07 UTC (permalink / raw)
  To: Gabriel Ravier; +Cc: musl, Jₑₙₛ Gustedt

On Thu, May 04, 2023 at 03:09:53AM +0200, Gabriel Ravier wrote:
> On 5/3/23 21:33, Rich Felker wrote:
> >On Wed, May 03, 2023 at 08:46:56PM +0200, Jₑₙₛ Gustedt wrote:
> >>Rich,
> >>
> >>on Wed, 3 May 2023 13:28:02 -0400 you (Rich Felker <dalias@libc.org>)
> >>wrote:
> >>
> >>>On Wed, May 03, 2023 at 05:11:11PM +0200, Jₑₙₛ Gustedt wrote:
> >>>>Rich,
> >>>>
> >>>>on Wed, 3 May 2023 10:16:19 -0400 you (Rich Felker
> >>>><dalias@libc.org>) wrote:
> >>>>>On Wed, May 03, 2023 at 11:12:46AM +0200, Jₑₙₛ Gustedt wrote:
> >>>  [...]
> >>>  [...]
> >>>>>Yes. We don't require a compiler that has an __int128.
> >>>>sure, but all the uses are protected by `__SIZEOF_INT128__`. So if
> >>>>the compiler don't has this, they will not see that code when
> >>>>compiling musl.
> >>>Again, there are not multiple versions of musl with different features
> >>>depending on which compiler was used to compile them. There is one
> >>>unified feature set. There are not configure-time or compile-time
> >>>decisions about which features to support.
> >>This sounds a bit dogmatic
> >Yes, it's one of the core principles of musl: that we don't have
> >build-time-selectable feature-set like uclibc did.
> >
> >>and also unrealistic. As said the dependency
> >>on compiler builtins undermines that approach. Future versions of gcc
> >>and clang will soon support `va_start` with only one parameter for
> >>example. Musl will just be dependent on that compiler feature.
> >No it won't. None of the code in musl calls or needs to call va_start
> >with one parameter. You're confusing header-level stuff that a c23
> >application might depend on, with build dependencies of libc.
> >
> >>How will you do with optional features, then? For example decimal
> >>floating point? This will never be added to musl? (Nobody will
> >>probably backport support for them to very old gcc versions, for
> >>example, or even to more recent versions of clang)
> >Decimal float math library will likely be left to a third-party
> >library implementation.
> >
> >Decimal float in printf, if that becomes a thing, will be done the
> >same way as int128: stub to pop the arguments, and 100% integer code
> >to actually work with the data.
> >
> >>>>Also application side compilation with a different compiler that has
> >>>>no `__int128` would not see these interfaces, so such application
> >>>>code can never call into the library with `__int128` types.
> >>>The compiler used to compile musl and the compiler used to compile the
> >>>application using musl have nothing to do with each other except
> >>>sharing a baseline ABI target.
> >>Yes, exactly. And one supporting `__int128` and the other that doesn't
> >>basically wouldn't interfere.
> >The premise here is that applications and libc are being built by
> >possibly different people with different tools. If I have a system
> >built with gcc 5.3, I can't build C23 applications, but I might get a
> >dynamically-linked C23 binary from someone who can. That binary needs
> >to run with my musl-1.2.7 (made-up number) libc.so because the C
> >language version the binary was generated from (or whether it was even
> >C at all) is irrelevant. The interface surface is just the musl ABI
> >surface.
> >
> >>For the support of `__int128`: gcc has this since ages on 64 bit
> >>archs, is there any such arch out there where this support is changing
> >>according to versions of gcc that are still in use? So if we make the
> >We also support pcc, cparser+libfirm, etc. on archs they support. Not
> >just gcc. And gcc back to 3.x.
> GCC 3.x ??? I understand wanting backwards compatibility but
> compiler versions from barely after the turn of the century seem
> like a bit far, I guess it's admirable that musl works with versions
> of software that are older than I am, but at some point I have to
> wonder if even a single person in the world actually finds it useful
> to be able to build musl with GCC 3 in 2023...

Yes, at least one and possibly several musl-based distros bootstrapped
with gcc 3 last time I checked.

In any case, you don't put "vastly narrowing the body of compilers you
can use to build musl" on the table as part of implementing new
functionality. This is how you get existing users to hate new
functionality.

More abstractly, we don't have "we support gcc 3 because it's gcc 3".
We have an extremely minimal set of requirements for the compiler that
amount to "c99 plus an extremely small subset of 'gnu c' extensions"
and support any compiler that provides these, and gcc 3 happens to be
one of those. (So does pcc, etc.) The only times we've added
*anything* to the required set of extensions, to my recollection, have
been when the thing to be added was already known to be supported by
all existing compilers that already fit the requirements.

Now if there's a critical bug in a particular old compiler/version
that has it miscompiling musl, that doesn't mean we're going to go out
of our way to work around it. Likely it means anyone who wants to keep
using it needs to patch the bug. Again this is because our
requirements are based on standards/a baseline profile of what's
needed, not particular products and versions we've selected.

Rich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [musl] patches for C23
  2023-05-04  6:48                       ` Jₑₙₛ Gustedt
@ 2023-05-04 14:30                         ` Rich Felker
  2023-05-04 15:31                           ` enh
  2023-05-04 15:53                           ` Jₑₙₛ Gustedt
  0 siblings, 2 replies; 29+ messages in thread
From: Rich Felker @ 2023-05-04 14:30 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Thu, May 04, 2023 at 08:48:46AM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Wed, 3 May 2023 15:33:26 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
> 
> > On Wed, May 03, 2023 at 08:46:56PM +0200, Jₑₙₛ Gustedt wrote:
> > > Rich,
> > > 
> > > on Wed, 3 May 2023 13:28:02 -0400 you (Rich Felker
> > > <dalias@libc.org>) wrote:
> > >   
> > > > On Wed, May 03, 2023 at 05:11:11PM +0200, Jₑₙₛ Gustedt wrote:  
> >  [...]  
> >  [...]  
> > > >  [...]  
> > > >  [...]    
> >  [...]  
> >  [...]  
> > > > 
> > > > Again, there are not multiple versions of musl with different
> > > > features depending on which compiler was used to compile them.
> > > > There is one unified feature set. There are not configure-time or
> > > > compile-time decisions about which features to support.  
> > > 
> > > This sounds a bit dogmatic  
> > 
> > Yes, it's one of the core principles of musl: that we don't have
> > build-time-selectable feature-set like uclibc did.
> > 
> > > and also unrealistic. As said the dependency
> > > on compiler builtins undermines that approach. Future versions of
> > > gcc and clang will soon support `va_start` with only one parameter
> > > for example. Musl will just be dependent on that compiler feature.  
> > 
> > No it won't. None of the code in musl calls or needs to call va_start
> > with one parameter. You're confusing
> 
> ??

Either your statement that "musl will be dependent on that compiler
feature" is inaccurate or I'm misunderstanding what you mean. The code
in musl does not call va_start with only one parameter.

If you mean "in order to provide a conforming C23 compilation
environment for applications, the compiler must support a
single-parameter va_start built-in", this is true, but it's obvious
that to compile C23 applications you need a C23 compiler (or compiler
with at least the subset of C23 that you need). This is the
application depending on it, not musl depending on it.

> > > availability of `__int128` dependent on `UINTPTR_WIDTH` being 64,
> > > would that be acceptable for you? Or an even more dependent approach
> > > with special casing architectures where this is available since
> > > always?  
> > 
> > It's not really "special casing archs where this is available since
> > always". It's more like the other way around, "not special casing
> > archs where __int128 is a guaranteed part of the baseline psABI". For
> > those we can just let the default C implementation be used. For the
> > rest we need a (completely trivial) asm stub that pops the arg
> > according to the variadic argument ABI for the arch. This really isn't
> > that big a deal. It's a few instructions at most.
> 
> I would still prefer that on those archs where there is `__int128` or
> `_BitInt(128)` (for the latter basically all C23 compilers, I think)
> that the default is done with that compiler support. We should leave
> to the compiler people what they do best ;-)

Note that you can use gcc -S to generate the asm, clean up any cruft
in it, and commit the output to git, using a function like this:

#include <stdarg.h>
#include <stdint.h>

/* two 64-bit halves standing in for the 128-bit value */
struct int128_s { uint64_t a, b; };
union u { __int128 x; struct int128_s s; };

struct int128_s __pop_arg_int128(va_list *ap)
{
	return (union u){ .x = va_arg(*ap, __int128) }.s;
}
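
A minimal sketch of how a variadic caller might consume such a thunk. It assumes a compiler that does provide `__int128` (the one used to generate the asm in the first place) and a little-endian target, so that `a` holds the low half. The names `pop_arg_int128_sketch` and `low_half_of_first_arg` are made up for this illustration and are not musl identifiers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

struct int128_s { uint64_t a, b; };
union u { __int128 x; struct int128_s s; };

/* Same shape as the thunk above; in libc this would be the committed asm. */
struct int128_s pop_arg_int128_sketch(va_list *ap)
{
	return (union u){ .x = va_arg(*ap, __int128) }.s;
}

/* Hypothetical caller: pops one 128-bit argument and only ever
 * touches it as two uint64_t halves afterwards. */
uint64_t low_half_of_first_arg(int count, ...)
{
	va_list ap;
	struct int128_s v;

	assert(count >= 1);
	va_start(ap, count);
	v = pop_arg_int128_sketch(&ap);
	va_end(ap);
	return v.a; /* low half on a little-endian target */
}
```

After the pop, nothing in the caller needs the `__int128` type any more, which is the point of routing the access through one small per-arch function.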

> This leaves us with fallback code to write that will probably rarely
> be used. Also, I have difficulties to asses the effort that is
> needed.

See above.

> There are the `printf`, `scanf` and the new bit-fiddeling
> interfaces.

For scanf, no special va_list support is needed. It makes use of the
POSIX allowance to read pointer arguments as void *, and just stores
via them. All it needs to do is format the int128 in memory and memcpy
to the void *.
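
A small sketch of that store step, under the assumption that the value has already been assembled in a two-`uint64_t` struct whose in-memory layout matches the target's 128-bit integer (low half first, i.e. little-endian). `u128_pair` and `store_u128` are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: low half first, as on a little-endian target. */
struct u128_pair { uint64_t lo, hi; };

/* dest is the pointer argument that scanf read as void *.
 * No 128-bit type is needed to perform the store. */
void store_u128(void *dest, struct u128_pair v)
{
	assert(dest != NULL);
	memcpy(dest, &v, sizeof v);
}
```

On a big-endian target the two halves would have to be ordered the other way round before the copy.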

> For the latter the current proposal is to have them
> implemented as shallow static inline functions. That would a bit
> complicated without compiler support.

Do the bit-fiddling interfaces require external function definitions,
or are macro-only implementations allowed? In case of the latter, yes,
you absolutely can assume a compiler that supports whatever type is
being used, since they're compiled by the compiler that is building
the application, not the compiler that is building musl.

> In all to me this sounds like a substantial effort in implementation
> and coordination. What is the way forward, here?

I don't think it's actually all that much.

The popping thunks can be generated from the above mechanically for
all archs.

The main remaining code is writing explicit long mul/div for operating
on a struct representing int128 in two int64s which can be used in
printf and scanf/strto*. The div is only /10, so I think it can be
quite compact (vs arbitrary 128-bit division which would be nasty).
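
To illustrate why division by 10 alone stays compact, here is a minimal sketch that converts a 128-bit value held in two `uint64_t` halves to decimal using only 64-bit arithmetic. The names `u128_pair`, `u128_divmod10` and `u128_to_dec` are invented for this example; schoolbook long division over 32-bit limbs is one possible approach, not necessarily the one musl would adopt.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct u128_pair { uint64_t lo, hi; };

/* Divide *v by 10 in place and return the remainder.
 * Each step divides a value below 10 * 2^32, so 64 bits suffice. */
unsigned u128_divmod10(struct u128_pair *v)
{
	uint32_t limb[4] = {
		(uint32_t)(v->hi >> 32), (uint32_t)v->hi,
		(uint32_t)(v->lo >> 32), (uint32_t)v->lo,
	};
	uint64_t rem = 0;
	int i;

	for (i = 0; i < 4; i++) { /* most significant limb first */
		uint64_t cur = (rem << 32) | limb[i];
		limb[i] = (uint32_t)(cur / 10);
		rem = cur % 10;
	}
	v->hi = ((uint64_t)limb[0] << 32) | limb[1];
	v->lo = ((uint64_t)limb[2] << 32) | limb[3];
	return (unsigned)rem;
}

/* Write the decimal digits into buf (39 digits plus NUL at most)
 * and return a pointer to the first digit inside buf. */
char *u128_to_dec(struct u128_pair v, char buf[40])
{
	char *p = buf + 40;

	assert(buf != NULL);
	*--p = '\0';
	do {
		*--p = (char)('0' + u128_divmod10(&v));
	} while (v.hi | v.lo);
	return p;
}
```

Power-of-two bases (octal, hex) need only shifts and masks on the two halves, so this is the only real division printf would require.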

Rich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [musl] patches for C23
  2023-05-04 14:30                         ` Rich Felker
@ 2023-05-04 15:31                           ` enh
  2023-05-04 15:53                           ` Jₑₙₛ Gustedt
  1 sibling, 0 replies; 29+ messages in thread
From: enh @ 2023-05-04 15:31 UTC (permalink / raw)
  To: musl; +Cc: Jₑₙₛ Gustedt

[-- Attachment #1: Type: text/plain, Size: 5851 bytes --]

On Thu, May 4, 2023 at 7:31 AM Rich Felker <dalias@libc.org> wrote:

> On Thu, May 04, 2023 at 08:48:46AM +0200, Jₑₙₛ Gustedt wrote:
> > Rich,
> >
> > on Wed, 3 May 2023 15:33:26 -0400 you (Rich Felker <dalias@libc.org>)
> > wrote:
> >
> > > On Wed, May 03, 2023 at 08:46:56PM +0200, Jₑₙₛ Gustedt wrote:
> > > > Rich,
> > > >
> > > > on Wed, 3 May 2023 13:28:02 -0400 you (Rich Felker
> > > > <dalias@libc.org>) wrote:
> > > >
> > > > > On Wed, May 03, 2023 at 05:11:11PM +0200, Jₑₙₛ Gustedt wrote:
> > >  [...]
> > >  [...]
> > > > >  [...]
> > > > >  [...]
> > >  [...]
> > >  [...]
> > > > >
> > > > > Again, there are not multiple versions of musl with different
> > > > > features depending on which compiler was used to compile them.
> > > > > There is one unified feature set. There are not configure-time or
> > > > > compile-time decisions about which features to support.
> > > >
> > > > This sounds a bit dogmatic
> > >
> > > Yes, it's one of the core principles of musl: that we don't have
> > > build-time-selectable feature-set like uclibc did.
> > >
> > > > and also unrealistic. As said the dependency
> > > > on compiler builtins undermines that approach. Future versions of
> > > > gcc and clang will soon support `va_start` with only one parameter
> > > > for example. Musl will just be dependent on that compiler feature.
> > >
> > > No it won't. None of the code in musl calls or needs to call va_start
> > > with one parameter. You're confusing
> >
> > ??
>
> Either your statement that "musl will be dependent on that compiler
> feature" is inaccutate or I'm misunderstanding what you mean. The code
> in musl does not call va_start wth only one parameter.
>
> If you mean "in order to provide a conforming C23 compilation
> environment for applications, the compiler must support a
> single-parameter va_start built-in", this is true, but it's obvious
> that to compile C23 applications you need a C23 compiler (or compiler
> with at least the subset of C23 that you need). This is the
> application depending on it, not musl depending on it.
>
> > > > availability of `__int128` dependent on `UINTPTR_WIDTH` being 64,
> > > > would that be acceptable for you? Or an even more dependent approach
> > > > with special casing architectures where this is available since
> > > > always?
> > >
> > > It's not really "special casing archs where this is available since
> > > always". It's more like the other way around, "not special casing
> > > archs where __int128 is a guaranteed part of the baseline psABI". For
> > > those we can just let the default C implementation be used. For the
> > > rest we need a (completely trivial) asm stub that pops the arg
> > > according to the variadic argument ABI for the arch. This really isn't
> > > that big a deal. It's a few instructions at most.
> >
> > I would still prefer that on those archs where there is `__int128` or
> > `_BitInt(128)` (for the latter basically all C23 compilers, I think)
> > that the default is done with that compiler support. We should leave
> > to the compiler people what they do best ;-)
>
> Note that you can use gcc -S to generate the asm, clean up any cruft
> in it, and commit the output to git, using a function like this:
>
> struct int128_s { uint64_t a, b; };
> union u { __int128 x; struct int128_s s; };
>
> struct int128_s __pop_arg_int128(va_list *ap)
> {
>         return (union u){ .x = va_arg(*ap, __int128) }.s;
> }
>
> > This leaves us with fallback code to write that will probably rarely
> > be used. Also, I have difficulties to asses the effort that is
> > needed.
>
> See above.
>
> > There are the `printf`, `scanf` and the new bit-fiddeling
> > interfaces.
>
> For scanf, no special va_list support is needed. It makes use of the
> POSIX allowance to read pointer arguments as void *, and just stores
> via them. All it needs to do is format the int128 in memory and memcpy
> to the void *.
>
> > For the latter the current proposal is to have them
> > implemented as shallow static inline functions. That would a bit
> > complicated without compiler support.
>
> Do the bit-fiddling interfaces require external function definitions,
> or are macro-only implementations allowed?


the outcome of https://github.com/llvm/llvm-project/issues/62248 was the
former, sadly. i was hoping to just send llvm a macro-only <stdckdint.h>
and <stdbit.h> and stay out of it. that should still work for the former,
but i'm not sure what to do about the latter.

(_apart_ from the fact that it's in ISO C, <stdbit.h> seems strictly worse
than just using __builtin_foo(). even the names are less readable, and
having them as external functions seems to defeat the purpose of them
existing in the first place. i'm very tempted to just implement them as
macros and leave the actual functions as a "you're holding it wrong" case.
but probably i'll just wait to see if these ever actually get used first...)


> In case of the latter, yes,
> you absolutely can assume a compiler that supports whatever type is
> being used, since they're compiled by the compiler that is building
> the application, not the compiler that is building musl.
>
> > In all to me this sounds like a substantial effort in implementation
> > and coordination. What is the way forward, here?
>
> I don't think it's actually all that much.
>
> The popping thunks can be generated from the above mechanically for
> all archs.
>
> The main remaining code is writing explicit long mul/div for operating
> on a struct representing int128 in two int64s which can be used in
> printf and scanf/strto*. The div is only /10, so I think it can be
> quite compact (vs arbitrary 128-bit division which would be nasty).
>
> Rich
>

[-- Attachment #2: Type: text/html, Size: 7380 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [musl] patches for C23
  2023-05-03  9:12           ` Jₑₙₛ Gustedt
  2023-05-03 14:16             ` Rich Felker
@ 2023-05-04 15:50             ` Jeffrey Walton
  2023-05-04 16:05               ` Rich Felker
  1 sibling, 1 reply; 29+ messages in thread
From: Jeffrey Walton @ 2023-05-04 15:50 UTC (permalink / raw)
  To: musl; +Cc: Rich Felker

On Wed, May 3, 2023 at 5:13 AM Jₑₙₛ Gustedt <jens.gustedt@inria.fr> wrote:
>
> [...]
> > Language/compiler baseline for building musl is not going to go up, so
> > this complicates some things, especially implementing the int128
> > stuff. This will need pop_arg to call out to an arch-provided asm
> > function that bypasses the C type system to get the nonexistent-type
> > argument off the va_list and store it in a pair of uint64_t.
>
> I don't see that. `pop_arg` just uses `va_arg` and that in turn is
> fixed to `__builtin_va_arg`. The proposed patches assume that if
> `__SIZEOF_INT128__` is defined by the compiler that then the compiler
> provides the `__int128` types and knows how to deal with them in
> `__builtin_va_arg`. Is there anything wrong with that assumtion?

It may be worth mentioning the GCC folks say the test is
__SIZEOF_INT128__ >= 16, and not merely defining __SIZEOF_INT128__.[1]

And __SIZEOF_INT128__ will only show up on 64-bit platforms at the
moment. 32-bit platforms will lack the define.

Jeff

[1] 128-bit integer - nonsensical documentation?,
https://gcc.gnu.org/legacy-ml/gcc-help/2015-08/msg00176.html

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [musl] patches for C23
  2023-05-04 14:30                         ` Rich Felker
  2023-05-04 15:31                           ` enh
@ 2023-05-04 15:53                           ` Jₑₙₛ Gustedt
  2023-05-04 16:14                             ` Rich Felker
  2023-05-10 14:28                             ` [musl] stdbit.h Jₑₙₛ Gustedt
  1 sibling, 2 replies; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-04 15:53 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 4075 bytes --]

Rich,

on Thu, 4 May 2023 10:30:53 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> > > You're confusing  
> > 
> > ??  

> Note that you can use gcc -S to generate the asm,

sure

> clean up any cruft
> in it, and commit the output to git, using a function like this:
> 
> struct int128_s { uint64_t a, b; };
> union u { __int128 x; struct int128_s s; };
> 
> struct int128_s __pop_arg_int128(va_list *ap)
> {
> 	return (union u){ .x = va_arg(*ap, __int128) }.s;
> }
> 
> > This leaves us with fallback code to write that will probably rarely
> > be used. Also, I have difficulties to asses the effort that is
> > needed.  
> 
> See above.

Yes, sure. What worries me is not doing that for one architecture
that I have and know something about (I would certainly have to learn
things, but that's ok) but having to do it for all architectures, to
which I don't have access and which may need asm fiddling that I don't
know about.

> > There are the `printf`, `scanf` and the new bit-fiddeling
> > interfaces.  
> 
> For scanf, no special va_list support is needed. It makes use of the
> POSIX allowance to read pointer arguments as void *, and just stores
> via them. All it needs to do is format the int128 in memory and memcpy
> to the void *.
> 
> > For the latter the current proposal is to have them
> > implemented as shallow static inline functions. That would a bit
> > complicated without compiler support.  
> 
> Do the bit-fiddling interfaces require external function definitions,
> or are macro-only implementations allowed?


They are required for the three usual wide unsigned integer types. The
type-generic interface is supposed to work for all wide standard and
extended integer types (not including `_BitInt(N)` for weird `N`). So
the most natural here would be to add functions for the 128 bit
types. Also the generic code that just dispatches inline function
pointers is much easier and clearer. `_Generic` for function or macro
calls (in contrast to just function pointers) is much nastier, because
all branches must be valid C and should not drown us in
false-positives.
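
As an illustration of the "dispatch a function pointer" style described above, here is a minimal C11 sketch. `_Generic` only selects a function designator, and the call happens outside the selection, so the unselected branches never have to type-check against the argument. The `count_ones_*` names are hypothetical and merely modeled on the style of the new interfaces.

```c
#include <assert.h>

static inline unsigned count_ones_ull(unsigned long long x)
{
	unsigned n = 0;
	while (x) { /* clear the lowest set bit each round */
		x &= x - 1;
		n++;
	}
	return n;
}

static inline unsigned count_ones_ul(unsigned long x)
{
	return count_ones_ull(x);
}

static inline unsigned count_ones_ui(unsigned x)
{
	return count_ones_ull(x);
}

/* _Generic picks a function; the argument is applied afterwards. */
#define count_ones(x) _Generic((x), \
	unsigned: count_ones_ui, \
	unsigned long: count_ones_ul, \
	unsigned long long: count_ones_ull)(x)
```

A 128-bit type would be supported by adding one more association and one more inline function, which is harder to do cleanly when every branch is a full macro expansion.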

> In case of the latter, yes, you absolutely can assume a compiler
> that supports whatever type is being used, since they're compiled by
> the compiler that is building the application, not the compiler that
> is building musl.

A macro version is certainly doable, there is such a version in the
patches already, but it is much nastier.

But for all of this, the separation in two chunks of 64 bit and
assembling the result is already done.

> > In all to me this sounds like a substantial effort in implementation
> > and coordination. What is the way forward, here?  
> 
> I don't think it's actually all that much.
> 
> The popping thunks can be generated from the above mechanically for
> all archs.

Yes, but only for people that have access to these archs. So this is
much more effort than me writing some code and having it reviewed by
you guys.

> The main remaining code is writing explicit long mul/div for operating
> on a struct representing int128 in two int64s which can be used in
> printf and scanf/strto*. The div is only /10, so I think it can be
> quite compact (vs arbitrary 128-bit division which would be nasty).

Yes, I figured that. Some of that for bases 16 and 10 should already
be there in the floating point code, I imagine. But still this is not
so easy to read from the start, and would need good review and
testing. And our internal dispatch `__intscan` accepts bases from 2 to
36, so there is either a bit more than 10 and 16 to cover, or a
special instantiation of the function as used by `scanf` for 128-bit
types has to be created.

Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [musl] patches for C23
  2023-05-04 15:50             ` [musl] patches for C23 Jeffrey Walton
@ 2023-05-04 16:05               ` Rich Felker
  0 siblings, 0 replies; 29+ messages in thread
From: Rich Felker @ 2023-05-04 16:05 UTC (permalink / raw)
  To: Jeffrey Walton; +Cc: musl

On Thu, May 04, 2023 at 11:50:34AM -0400, Jeffrey Walton wrote:
> On Wed, May 3, 2023 at 5:13 AM Jₑₙₛ Gustedt <jens.gustedt@inria.fr> wrote:
> >
> > [...]
> > > Language/compiler baseline for building musl is not going to go up, so
> > > this complicates some things, especially implementing the int128
> > > stuff. This will need pop_arg to call out to an arch-provided asm
> > > function that bypasses the C type system to get the nonexistent-type
> > > argument off the va_list and store it in a pair of uint64_t.
> >
> > I don't see that. `pop_arg` just uses `va_arg` and that in turn is
> > fixed to `__builtin_va_arg`. The proposed patches assume that if
> > `__SIZEOF_INT128__` is defined by the compiler that then the compiler
> > provides the `__int128` types and knows how to deal with them in
> > `__builtin_va_arg`. Is there anything wrong with that assumtion?
> 
> It may be worth mentioning the GCC folks say the test is
> __SIZEOF_INT128__ >= 16, and not merely defining __SIZEOF_INT128__.[1]
> And __SIZEOF_INT128__ will only show up on 64-bit platforms at the
> moment. 32-bit platforms will lack the define.

__SIZEOF_INT128__ is necessarily 16 if the type exists. This is a
consequence of CHAR_BIT==8 and exact-size integer types not being
allowed to have padding.
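
This can be stated as a compile-time check. The sketch below assumes a C11 compiler for `_Static_assert` (musl's own build baseline is C99, so it is only an illustration), and `have_int128` is a made-up helper name.

```c
#include <assert.h>
#include <limits.h>

#ifdef __SIZEOF_INT128__
/* With 8-bit bytes and no padding bits, a 128-bit type is 16 bytes. */
_Static_assert(CHAR_BIT == 8, "8-bit bytes assumed");
_Static_assert(__SIZEOF_INT128__ == 16, "128 bits is 16 bytes");
_Static_assert(sizeof(__int128) == __SIZEOF_INT128__, "macro matches type");
#define HAVE_INT128_SKETCH 1
#else
#define HAVE_INT128_SKETCH 0
#endif

/* Reports whether this compiler advertises __int128. */
int have_int128(void)
{
	return HAVE_INT128_SKETCH;
}
```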

Rich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [musl] patches for C23
  2023-05-04 15:53                           ` Jₑₙₛ Gustedt
@ 2023-05-04 16:14                             ` Rich Felker
  2023-05-10 14:17                               ` Jₑₙₛ Gustedt
  2023-05-10 14:28                             ` [musl] stdbit.h Jₑₙₛ Gustedt
  1 sibling, 1 reply; 29+ messages in thread
From: Rich Felker @ 2023-05-04 16:14 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Thu, May 04, 2023 at 05:53:57PM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Thu, 4 May 2023 10:30:53 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
> 
> > > > You're confusing  
> > > 
> > > ??  
> 
> > Note that you can use gcc -S to generate the asm,
> 
> sure
> 
> > clean up any cruft
> > in it, and commit the output to git, using a function like this:
> > 
> > struct int128_s { uint64_t a, b; };
> > union u { __int128 x; struct int128_s s; };
> > 
> > struct int128_s __pop_arg_int128(va_list *ap)
> > {
> > 	return (union u){ .x = va_arg(*ap, __int128) }.s;
> > }
> > 
> > > This leaves us with fallback code to write that will probably rarely
> > > be used. Also, I have difficulties to asses the effort that is
> > > needed.  
> > 
> > See above.
> 
> Yes, sure, what is worrying me is not to do that for one architecture
> that I have and know some about (well certainly have to learn things,
> but that's ok) but to have that for all architectures, to which I
> don't have access and that may have asm fiddling that I don't know
> about.

I don't expect you to do this work for us. It's something I or
anyone else working on musl stuff can do and that your patches can
just assume is already present in musl by a name like __pop_arg_int128
or something.

> > The main remaining code is writing explicit long mul/div for operating
> > on a struct representing int128 in two int64s which can be used in
> > printf and scanf/strto*. The div is only /10, so I think it can be
> > quite compact (vs arbitrary 128-bit division which would be nasty).
> 
> Yes, I figured that. Some of that for bases 16 and 10 should already
> be there in the floating point code, I imagine. But still this is not
> so easy to read from the start, and would need good review and
> testing. And our internal dispatch `__intscan` accept bases from 2 to
> 36, so there is either a bit more than 10 and 16 to cover, or a
> special instantiation of the function as used by `scanf` for 128 types
> has to be created.

__intscan only needs mul, not div, and mul is the easy side. It's
printf that needs div, and 10 is the only non-power-of-two base there.

In the case of __intscan, I'd just change the signature to return an
int128 tuple struct, and switch to using it when the value no longer
fits in smaller type. The "lim" argument mechanism needs some change
too I think. No need for a different variant of the function for
int128; the whole point of the way it's implemented is not to have
multiple versions of it for different types.
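
A minimal sketch of the multiply side, assuming the value is kept in two `uint64_t` halves: each digit step is `v = v*base + digit` with a small base, done over 32-bit limbs so that only 64-bit arithmetic is needed. `u128_pair`, `u128_muladd_small` and `u128_parse` are hypothetical names, and the real `__intscan` additionally handles signs, prefixes and the "lim" logic, which are omitted here.

```c
#include <assert.h>
#include <stdint.h>

struct u128_pair { uint64_t lo, hi; };

/* v = v*base + digit. Returns 0 on success, -1 if the result
 * does not fit in 128 bits (v is left unchanged in that case). */
int u128_muladd_small(struct u128_pair *v, unsigned base, unsigned digit)
{
	uint32_t limb[4] = {
		(uint32_t)v->lo, (uint32_t)(v->lo >> 32),
		(uint32_t)v->hi, (uint32_t)(v->hi >> 32),
	};
	uint64_t carry = digit;
	int i;

	assert(base >= 2 && base <= 36 && digit < base);
	for (i = 0; i < 4; i++) { /* least significant limb first */
		uint64_t cur = (uint64_t)limb[i] * base + carry;
		limb[i] = (uint32_t)cur;
		carry = cur >> 32;
	}
	if (carry) return -1;
	v->lo = ((uint64_t)limb[1] << 32) | limb[0];
	v->hi = ((uint64_t)limb[3] << 32) | limb[2];
	return 0;
}

/* Assumes an ASCII-compatible execution character set. */
static int digit_value(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'z') return c - 'a' + 10;
	if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
	return -1;
}

/* Parse an unsigned number in the given base (2..36).
 * Returns 0 on success, -1 on bad input or overflow. */
int u128_parse(const char *s, unsigned base, struct u128_pair *out)
{
	struct u128_pair v = { 0, 0 };

	if (base < 2 || base > 36 || !*s) return -1;
	for (; *s; s++) {
		int d = digit_value(*s);
		if (d < 0 || (unsigned)d >= base) return -1;
		if (u128_muladd_small(&v, base, (unsigned)d)) return -1;
	}
	*out = v;
	return 0;
}
```

The same loop works for every base from 2 to 36, so no division and no per-base special case is needed on the scanning side.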

Rich

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [musl] patches for C23
  2023-05-04 16:14                             ` Rich Felker
@ 2023-05-10 14:17                               ` Jₑₙₛ Gustedt
  0 siblings, 0 replies; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-10 14:17 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 2548 bytes --]

Hi,
there is a new version for these patches at

        https://icube-forge.unistra.fr/icps/musl/-/network/master?extended_sha1=c23-v4

I tried to integrate the feedback that I had so far (Thanks!) in
particular concerning the support for 128 bit integer types. It should
now be that this support, as far as needed and interfaced by musl, is
unconditional.

on Thu, 4 May 2023 12:14:52 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> I don't expect you do to this work for us. It's something myself or
> anyone else working on musl stuff can do and that your patches can
> just assume is already present in musl by a name like __pop_arg_int128
> or something.

Ok, this should work now. The emulation code is found in uwide128.h
and uwide128.c, and the function that has to be provided in asm is
called `__uwide128_pop`.

If the compiler used to compile musl implements the `__int128` types,
these types are used; there is no reason to waste the knowledge that
was put into this compiler support over the years. Under this
condition, `__uwide128_pop` is also produced and just has to be
generated with -S and extracted from the asm file. It is then easy to
clean that up a bit, make the symbol weak, and provide the .s file
for the architecture, much as you indicated, Rich.

> __intscan only needs mul, not div, and mul is the easy side. It's
> printf that needs div, and 10 is the only non-power-of-two base there.

Well actually both only need mul and div with small numbers, so the
code complexity is about the same for both operations, here. But for
the whole we need also comparison, addition, subtraction, negation,
zero-test and conversion back and forth. So in all it was a bit more
complex than I thought.
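
For a sense of what those extra operations look like on a pair of 64-bit halves, here is a minimal sketch. It is not the code from uwide128.h; the type and function names (`u128_pair`, `u128_add`, and so on) are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

struct u128_pair { uint64_t lo, hi; };

/* Addition modulo 2^128: carry out of the low half feeds the high half. */
struct u128_pair u128_add(struct u128_pair a, struct u128_pair b)
{
	struct u128_pair r;
	r.lo = a.lo + b.lo;
	r.hi = a.hi + b.hi + (r.lo < a.lo);
	return r;
}

/* Two's complement negation: ~a + 1. */
struct u128_pair u128_neg(struct u128_pair a)
{
	struct u128_pair r;
	r.lo = ~a.lo + 1;
	r.hi = ~a.hi + (r.lo == 0); /* carry only when a.lo was 0 */
	return r;
}

struct u128_pair u128_sub(struct u128_pair a, struct u128_pair b)
{
	return u128_add(a, u128_neg(b));
}

/* Unsigned comparison: returns -1, 0 or 1. */
int u128_cmp(struct u128_pair a, struct u128_pair b)
{
	if (a.hi != b.hi) return a.hi < b.hi ? -1 : 1;
	if (a.lo != b.lo) return a.lo < b.lo ? -1 : 1;
	return 0;
}

int u128_is_zero(struct u128_pair a)
{
	return !(a.lo | a.hi);
}

/* Conversion from a signed 64-bit value with sign extension. */
struct u128_pair u128_from_i64(int64_t x)
{
	struct u128_pair r;
	r.lo = (uint64_t)x;
	r.hi = x < 0 ? UINT64_MAX : 0;
	return r;
}
```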

> In the case of __intscan, I'd just change the signature to return an
> int128 tuple struct, and switch to using it when the value no longer
> fits in smaller type. The "lim" argument mechanism needs some change
> too I think.

Actually not much, only that one has to watch that the min values for
signed types get sign extended when converted to the structure.

I'll comment on the bit operations in stdbit.h as a reply to a separate
mail.

Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [musl] stdbit.h
  2023-05-04 15:53                           ` Jₑₙₛ Gustedt
  2023-05-04 16:14                             ` Rich Felker
@ 2023-05-10 14:28                             ` Jₑₙₛ Gustedt
  1 sibling, 0 replies; 29+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-10 14:28 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 1841 bytes --]

now to the bit interfaces in the new header stdbit.h

on Thu, 4 May 2023 17:53:57 +0200 you (Jₑₙₛ Gustedt
<jens.gustedt@inria.fr>) wrote:

> They are required for the three usual wide unsigned integer types. The
> type-generic interface is supposed to work for all wide standard and
> extended integer types (not including `_BitInt(N)` for weird `N`). So
> the most natural here would be to add functions for the 128 bit
> types. Also the generic code that just dispatches inline function
> pointers is much easier and clearer. `_Generic` for function or macro
> calls (in contrast to just function pointers) is much nastier, because
> all branches must be valid C and should not drown us in
> false-positives.

So for the moment I kept it like that with inline function
interfaces. To support the 128 bit types I did the following:

 - added internal interfaces that work with two 64 bit integers
 - added application side interfaces for `__int128` types
 - added application side interfaces for `_BitInt(128)` types

The latter two are never part of musl itself, but only produced in the
application as `static inline` interfaces that refer to the ones that
work with two 64 bit numbers (and may tail call into these). So if the
application compiler knows how to deal with `__int128` (very likely on
64 bit archs) or `_BitInt(128)` (very likely with C23 compilers), the
application can rely on the necessary infrastructure within musl,
regardless of which compiler musl was compiled with.
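
The layering described above can be sketched as follows. The function names are hypothetical stand-ins rather than the identifiers used in the patches: one function takes two 64-bit halves and could live in libc regardless of the compiler that builds it, and one `static inline` wrapper exists only when the application's compiler provides `unsigned __int128`.

```c
#include <assert.h>
#include <stdint.h>

/* Library side: needs nothing beyond uint64_t. */
unsigned sketch_count_ones_2x64(uint64_t lo, uint64_t hi)
{
	unsigned n = 0;
	for (; lo; lo &= lo - 1) n++; /* clear lowest set bit */
	for (; hi; hi &= hi - 1) n++;
	return n;
}

#ifdef __SIZEOF_INT128__
/* Application side: compiled only by the application's compiler,
 * splits the value and forwards to the two-halves function. */
static inline unsigned sketch_count_ones_u128(unsigned __int128 x)
{
	return sketch_count_ones_2x64((uint64_t)x, (uint64_t)(x >> 64));
}
#endif
```

A `_BitInt(128)` wrapper would follow the same pattern under a C23 feature test.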

Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2023-05-10 14:28 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-01 18:50 [musl] patches for C23 Jₑₙₛ Gustedt
2023-05-01 19:24 ` Khem Raj
2023-05-01 19:41 ` Rich Felker
2023-05-02  6:57   ` Jₑₙₛ Gustedt
2023-05-02 13:59     ` Jₑₙₛ Gustedt
2023-05-02 23:20       ` Rich Felker
2023-05-03  0:00         ` Rich Felker
2023-05-03  9:12           ` Jₑₙₛ Gustedt
2023-05-03 14:16             ` Rich Felker
2023-05-03 15:11               ` Jₑₙₛ Gustedt
2023-05-03 17:28                 ` Rich Felker
2023-05-03 18:46                   ` Jₑₙₛ Gustedt
2023-05-03 19:33                     ` Rich Felker
2023-05-04  1:09                       ` Gabriel Ravier
2023-05-04 14:07                         ` Rich Felker
2023-05-04  6:48                       ` Jₑₙₛ Gustedt
2023-05-04 14:30                         ` Rich Felker
2023-05-04 15:31                           ` enh
2023-05-04 15:53                           ` Jₑₙₛ Gustedt
2023-05-04 16:14                             ` Rich Felker
2023-05-10 14:17                               ` Jₑₙₛ Gustedt
2023-05-10 14:28                             ` [musl] stdbit.h Jₑₙₛ Gustedt
2023-05-04 15:50             ` [musl] patches for C23 Jeffrey Walton
2023-05-04 16:05               ` Rich Felker
2023-05-03  7:13         ` Jₑₙₛ Gustedt
2023-05-03 14:06           ` Rich Felker
2023-05-03 14:26             ` Jₑₙₛ Gustedt
2023-05-03 14:43               ` Rich Felker
2023-05-03 15:26                 ` Jₑₙₛ Gustedt

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
