From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/14371 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: James Y Knight Newsgroups: gmane.linux.lib.musl.general Subject: Re: [PATCH] Define NULL as __null in C++ mode when using GCC or Clang. Date: Tue, 9 Jul 2019 19:04:13 -0400 Message-ID: References: <20190709193826.GR1506@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="000000000000df6ce7058d4795db" Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="41761"; mail-complaints-to="usenet@blaine.gmane.org" To: musl@lists.openwall.com Original-X-From: musl-return-14387-gllmg-musl=m.gmane.org@lists.openwall.com Wed Jul 10 01:04:58 2019 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.89) (envelope-from ) id 1hkzAP-000Air-C9 for gllmg-musl@m.gmane.org; Wed, 10 Jul 2019 01:04:57 +0200 Original-Received: (qmail 23582 invoked by uid 550); 9 Jul 2019 23:04:53 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 23561 invoked from network); 9 Jul 2019 23:04:52 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=zPZc8dzTMbAUITAFYQSM1JNnoOxnv5blUwtu2opxkAg=; b=lCAIpZOCNNN8Y4JbssuznsRPjzlk7dMJ87kmmthv3CZq+hJYKCF/1tEO+JuNsaomIs e6vtTN2ay31m3zLiuQbuZPBwSzOyiRT9Mi5sx7RSifXehyiJar/UMVckCAAY4ful/FBC 3gGPdeUAfKp7kW9kIrLKkT/iB7wmoEb9aSXHT4vq5R3H4lCoW8dmnLuf1eLVPjwJ+Iow 5Jtr20vX0c9ZbH8hGNPqVSRTEDt/7F/jgCRyzKtTvC7Az4/w2z5kQwbUPX/8TFjML334 hkM9GBq4gAn7xnm6CXuNOElSMicWBo01ZnTMuuODmULThV2rGU//lcahm1GFYq7gtXtY r0jg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=zPZc8dzTMbAUITAFYQSM1JNnoOxnv5blUwtu2opxkAg=; b=m6/I3ujjGF0mzJzEiJsDOe83WNjacVzc/FZQq4WIWiRuiyumPtjormUXgoQK2oFNDC BCk2dfO0dyzerGCD7+RtusnIqWBAvL6VqpGGYG7ueozZp7TpGi3WYijTPeyoqcR9aQ2G e63WRVN+v9AZUa76xjwmqt6dqvW+eL7fApDGPcn6q5enX8NRNsKAJSy86H2vAYEYZ8St IOOfLcqg9h6KagRWibk0N8bb6u6Y0jXTvZpRS0n8m31U26ONWPvTm62qHLIMPIqFrfo5 +nLRFpcJ6a/U0PBtJxBK6o9sJWWUO25p8i4Z7Mkv6f03NxRWMIOxakdmUgkJBaUv0Ii8 cP2g== X-Gm-Message-State: APjAAAXX6a5TXSrq7G11JYqE0odr4On6riHMoN6cLpXyovtMZWqmJPdi RyOpy3G44yFgPsU1RFnsCs1Myb4prCr3ZqnNoCul0USguDg= X-Google-Smtp-Source: APXvYqy6+cveCOHTZqUFQVK6GsouIv3d4gGktUeGD2zkJbLWs0oOl0DQVe6ooInj5qei6fWTzvaWeUrHNArReYLc1Fs= X-Received: by 2002:a67:6282:: with SMTP id w124mr12605952vsb.4.1562713480171; Tue, 09 Jul 2019 16:04:40 -0700 (PDT) In-Reply-To: <20190709193826.GR1506@brightrain.aerifal.cx> Xref: news.gmane.org gmane.linux.lib.musl.general:14371 Archived-At: --000000000000df6ce7058d4795db Content-Type: text/plain; charset="UTF-8" On Tue, Jul 9, 2019 at 3:38 PM Rich Felker wrote: > On Tue, Jul 09, 2019 at 03:19:00PM -0400, James Y Knight wrote: > > Both GCC and Clang ship their own stddef.h which does this (musl's > > stddef.h is simply ignored). But, musl also defines the macro in a > > number of other headers. > > This was intentional at the time it was done. At least in the present > version of C++ at the time, __null was not a conforming definition. > For example, conforming applications could parse the stringification > of NULL and expect to get something that's a valid null pointer > constant. This could possibly be revisited, if it's changed, and if it > can be done in a way that doesn't involve compiler-specific stuff. > The C++11 standard simply says "The macro NULL is an implementation-defined C++ null pointer constant in this International Standard." (and links to the definition of null pointer constant -- "A null pointer constant is an integral constant expression (5.19) prvalue of integer type that evaluates to zero or a prvalue of type std::nullptr_t"). If the implementation defines __null to be a null pointer constant, which is an integral constant expression of integer type that evaluates to zero (which it does), that seems entirely valid. I see nothing that restricts what it may stringify to. > Thus, depending on which header you include > > last, you'll end up with a different definition of NULL. > > musl doesn't support use of the compiler versions of these headers, > for several reasons that have nothing to do with NULL That's rather non-obvious. Usually, the compiler's builtin headers are intended to be before libc in the search path. Invoking the compiler with --sysroot= pointed to a musl install directory results in that ordering. I would've expected that the behavior you get from `clang --sysroot=X -target x86_64-linux-musl` should actually work -- and, well, it does indeed appear to work -- but uses the compiler's stddef.h. I'm also not really sure what the value is for musl to provide its own stddef.h -- can you be more specific as to why it's bad to use the compiler's copy? The compiler already has to know the correct types for ptrdiff_t, size_t, and wchar_t internally, whether or not you use its stddef.h. Having the compiler also be responsible for exposing those names seems like a good way to ensure that they're defined in a consistent manner. Of course, as long as musl does define them to the same types, it's also fine for musl to provide the header...but it seems less than ideal for the compiler and the libc to be fighting about who should be providing the header. > Mostly, getting musl's definition simply degrades warning diagnostics > in C++ slightly -- e.g. GCC can no longer emit this warning: > > warning: passing NULL to non-pointer argument 1 of 'int foo(long int)' > > [-Wconversion-null] > > The current definition however catches bugs where NULL is wrongly used > to terminate a variadic argument list where (void*)0 or (char*)0 is > actually needed. See commits 41d7c77d6a2e74294807d35062e4cd1d48ab72d3 > and c8a9c22173f485c8c053709e1dfa0a617cb6be1a. So both choices are a > tradeoff in diagnostic capability. Ehh. Actually musl's definition actually breaks the intended semantics of that warning as well. The following program: #include int main() { execl("foo", "bar", NULL); } does indeed not warn with "g++ -Wall foo.cc" when NULL is __null, and it does warn if NULL is 0L. However, it was an intentional choice, not an accident, to suppress the warning in the former case. This kind of usage is widespread,, and in practice it does not trigger buggy behavior (because __null is guaranteed to be the same size as a pointer). GCC does have a separate "-Wstrict-null-sentinel" warning flag, if you want to be more pedantic about this not-standards-compliant code. > If you're using Clang's modules support, it can also break > > compilation. In that case, the conflicting definitions may be detected > > as a module incompatibility. > > I'm not sure what this means. The conflicting definition should never > appear in code that's consistent with compiling and linking with musl The issue occurs when the compiler's stddef.h and musl's locale.h (or other such headers) are both used -- then you can observe conflicting definitions within one translation unit, which is what can trigger the issue. (All the details of what "modules' is, and when and why such conflicting definitions can trigger an error probably isn't really important here). > > A different (potentially better) fix would be to always retrieve the > > definition of NULL from the compiler's stddef.h (via #define > > __need_NULL #include ). It may also be best to delete the > > musl stddef.h entirely for clarity, since it's currently ignored, > > anyhow. > > We intentionally don't do things that way. --000000000000df6ce7058d4795db Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Tue, Jul 9, 2019 at 3:38 PM Rich Felke= r <dalias@libc.org<= /a>> wrote:
On Tue, Jul 09, 2019 at 03:19:00PM -0400, James Y= Knight wrote:
> Both GCC and Clang ship their own stddef.h which does this (musl's=
> stddef.h is simply ignored). But, musl also defines the macro in a
> number of other headers.

This was intentional at the time it was done. At least in the present
version of C++ at the time, __null was not a conforming definition.
For example, conforming applications could parse the stringification
of NULL and expect to get something that's a valid null pointer
constant. This could possibly be revisited, if it's changed, and if it<= br> can be done in a way that doesn't involve compiler-specific stuff.
<= /blockquote>

The C++11 standard simply says "The m= acro NULL is an implementation-defined C++ null pointer constant in this In= ternational Standard." (and links to the definition of null pointer co= nstant -- "A null pointer constant is an integral constant expression = (5.19) prvalue of integer type that evaluates to zero or a prvalue of type std::nullptr_t").

I= f the implementation defines __null to be a null pointer constant, which is= an integral constant expression of integer type that evaluates to zero (wh= ich it does), that seems entirely valid. I see nothing that restricts what = it may stringify to.

> Thus, depending on which header you include
> last, you'll end up with a different definition of NULL.

musl doesn't support use of the compiler versions of these headers,
for several reasons that have nothing to do with NULL

=
That's rather non-obvious. Usually, the compiler's built= in headers are intended to be before libc in the search path. Invoking the = compiler with --sysroot=3D pointed to a musl install directory results in t= hat ordering. I would've expected that the behavior you get from `clang= --sysroot=3DX -target x86_64-linux-musl` should actually work -- and, well= , it does indeed appear to work -- but uses the compiler's stddef.h.

I'm also not really sure what the valu= e is for musl to provide its own stddef.h -- can you be more specific as to= why it's bad to use the compiler's copy? The compiler already has = to know the correct types for ptrdiff_t, size_t, and wchar_t internally, wh= ether or not you use its stddef.h. Having the compiler also be responsible = for exposing those names seems like a good way to ensure that they're d= efined in a consistent manner.

Of cour= se, as long as musl does define them to the same types, it's also fine = for musl to provide the header...but it seems less than ideal for the compi= ler and the libc to be fighting about who should be providing the header.

> M= ostly, getting musl's definition simply degrades warning diagnostics
> in C++ slightly -- e.g. GCC can no longer emit this warning:
>=C2=A0 =C2=A0warning: passing NULL to non-pointer argument 1 of 'in= t foo(long int)'
> [-Wconversion-null]

The current definition however catches bugs where NULL is wrongly used
to terminate a variadic argument list where (void*)0 or (char*)0 is
actually needed. See commits 41d7c77d6a2e74294807d35062e4cd1d48ab72d3
and c8a9c22173f485c8c053709e1dfa0a617cb6be1a. So both choices are a
tradeoff in diagnostic capability.

Ehh. Act= ually musl's definition actually breaks the intended semantics of that = warning as well. The following program:
#include <unistd.h>=
int main() {
=C2=A0 execl("foo", "bar", NULL);}
does indeed not warn with "g++ -Wall foo.cc" wh= en NULL is __null, and it does warn if NULL is 0L.

However, it was an intentional choice, not an accident, to suppress the wa= rning in the former case. This kind of usage is widespread,, and in practic= e it does not trigger buggy behavior (because __null is guaranteed to be th= e same size as a pointer). GCC does have a separate "-Wstrict-null-sen= tinel" warning flag, if you want to be more pedantic about this not-st= andards-compliant code.

> If you're using Clang's modules support, it= can also break
> compilation. In that case, the conflicting definitions may be detected=
> as a module incompatibility.

I'm not sure what this means. The conflicting definition should never appear in code that's consistent with compiling and linking with musl

The issue occurs when the compile= r's stddef.h and musl's locale.h (or other such headers) are both u= sed -- then you can observe conflicting definitions within one translation = unit, which is what can trigger the issue. (All the details of what "m= odules' is, and when and why such conflicting definitions can trigger a= n error probably isn't really important here).
=C2= =A0
> A different (potentially better) fix would be to always retrieve the > definition of NULL from the compiler's stddef.h (via #define
> __need_NULL #include <stddef.h>). It may also be best to delete = the
> musl stddef.h entirely for clarity, since it's currently ignored,<= br> > anyhow.

We intentionally don't do things that way.

<= div>=C2=A0
--000000000000df6ce7058d4795db--