On Tue, Jul 9, 2019 at 3:38 PM Rich Felker <dalias@libc.org> wrote:
On Tue, Jul 09, 2019 at 03:19:00PM -0400, James Y Knight wrote:
> Both GCC and Clang ship their own stddef.h which does this (musl's
> stddef.h is simply ignored). But, musl also defines the macro in a
> number of other headers.

This was intentional at the time it was done. At least in the present
version of C++ at the time, __null was not a conforming definition.
For example, conforming applications could parse the stringification
of NULL and expect to get something that's a valid null pointer
constant. This could possibly be revisited, if it's changed, and if it
can be done in a way that doesn't involve compiler-specific stuff.

The C++11 standard simply says "The macro NULL is an implementation-defined C++ null pointer constant in this International Standard." (and links to the definition of null pointer constant -- "A null pointer constant is an integral constant expression (5.19) prvalue of integer type that evaluates to zero or a prvalue of type std::nullptr_t").

If the implementation defines __null to be a null pointer constant, which is an integral constant expression of integer type that evaluates to zero (which it does), that seems entirely valid. I see nothing that restricts what it may stringify to.

> Thus, depending on which header you include
> last, you'll end up with a different definition of NULL.

musl doesn't support use of the compiler versions of these headers,
for several reasons that have nothing to do with NULL

That's rather non-obvious. Usually, the compiler's builtin headers are intended to be before libc in the search path. Invoking the compiler with --sysroot= pointed to a musl install directory results in that ordering. I would've expected that the behavior you get from `clang --sysroot=X -target x86_64-linux-musl` should actually work -- and, well, it does indeed appear to work -- but uses the compiler's stddef.h.

I'm also not really sure what the value is for musl to provide its own stddef.h -- can you be more specific as to why it's bad to use the compiler's copy? The compiler already has to know the correct types for ptrdiff_t, size_t, and wchar_t internally, whether or not you use its stddef.h. Having the compiler also be responsible for exposing those names seems like a good way to ensure that they're defined in a consistent manner.

Of course, as long as musl does define them to the same types, it's also fine for musl to provide the header...but it seems less than ideal for the compiler and the libc to be fighting about who should be providing the header.

> Mostly, getting musl's definition simply degrades warning diagnostics
> in C++ slightly -- e.g. GCC can no longer emit this warning:
>   warning: passing NULL to non-pointer argument 1 of 'int foo(long int)'
> [-Wconversion-null]

The current definition however catches bugs where NULL is wrongly used
to terminate a variadic argument list where (void*)0 or (char*)0 is
actually needed. See commits 41d7c77d6a2e74294807d35062e4cd1d48ab72d3
and c8a9c22173f485c8c053709e1dfa0a617cb6be1a. So both choices are a
tradeoff in diagnostic capability.

Ehh. Actually musl's definition actually breaks the intended semantics of that warning as well. The following program:
#include <unistd.h>
int main() {
  execl("foo", "bar", NULL);
}
does indeed not warn with "g++ -Wall foo.cc" when NULL is __null, and it does warn if NULL is 0L.

However, it was an intentional choice, not an accident, to suppress the warning in the former case. This kind of usage is widespread,, and in practice it does not trigger buggy behavior (because __null is guaranteed to be the same size as a pointer). GCC does have a separate "-Wstrict-null-sentinel" warning flag, if you want to be more pedantic about this not-standards-compliant code.

> If you're using Clang's modules support, it can also break
> compilation. In that case, the conflicting definitions may be detected
> as a module incompatibility.

I'm not sure what this means. The conflicting definition should never
appear in code that's consistent with compiling and linking with musl

The issue occurs when the compiler's stddef.h and musl's locale.h (or other such headers) are both used -- then you can observe conflicting definitions within one translation unit, which is what can trigger the issue. (All the details of what "modules' is, and when and why such conflicting definitions can trigger an error probably isn't really important here).
 
> A different (potentially better) fix would be to always retrieve the
> definition of NULL from the compiler's stddef.h (via #define
> __need_NULL #include <stddef.h>). It may also be best to delete the
> musl stddef.h entirely for clarity, since it's currently ignored,
> anyhow.

We intentionally don't do things that way.