C++ inline functions are requred to have exact same sequence of tokens in every translation unit, but __FILE__ and __LINE__ may expand to different tokens. The ODR violatioin is usually benign, but it can lead to errors when C++20 modules are used. echo 'import B; import C; int main() { foo(); }' > A.cc cat > B.ccm <<'eof' module; #include <assert.h> export module B; export inline void foo() { assert(1); } eof cat > C.ccm <<'eof' module; #include <assert.h> export module C; export inline void foo() { assert(1); } eof clang -std=c++20 --precompile B.ccm -o B.pcm clang -std=c++20 --precompile C.ccm -o C.pcm clang -std=c++20 -fprebuilt-module-path=. A.cc B.pcm C.pcm -o A /tmp/d/C.ccm:3:37: error: 'foo' has different definitions in different modules; definition in module 'C' first difference is function body export module C; export inline void foo() { assert(1); } ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~ /tmp/d/B.ccm:3:37: note: but in 'B' found a different body export module B; export inline void foo() { assert(1); } ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~ Fix this by preferring __builtin_FILE/__builtin_LINE which do not need preprocessing. --- include/assert.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/include/assert.h b/include/assert.h index d14ec94e..b209c2ae 100644 --- a/include/assert.h +++ b/include/assert.h @@ -4,6 +4,12 @@ #ifdef NDEBUG #define assert(x) (void)0 +#elif defined(__has_builtin) +#if __has_builtin(__builtin_FILE) +#define assert(x) ((void)((x) || (__assert_fail(#x, __builtin_FILE(), __builtin_LINE(), __func__),0))) +#else +#define assert(x) ((void)((x) || (__assert_fail(#x, __FILE__, __LINE__, __func__),0))) +#endif #else #define assert(x) ((void)((x) || (__assert_fail(#x, __FILE__, __LINE__, __func__),0))) #endif -- 2.39.GIT
On Fri, Feb 17, 2023 at 05:33:33PM -0800, Fangrui Song wrote:
> C++ inline functions are requred to have exact same sequence of tokens
> in every translation unit, but __FILE__ and __LINE__ may expand to
> different tokens. The ODR violatioin is usually benign, but it can lead
> to errors when C++20 modules are used.
>
> echo 'import B; import C; int main() { foo(); }' > A.cc
> cat > B.ccm <<'eof'
> module;
> #include <assert.h>
> export module B; export inline void foo() { assert(1); }
> eof
> cat > C.ccm <<'eof'
> module;
> #include <assert.h>
> export module C; export inline void foo() { assert(1); }
> eof
> clang -std=c++20 --precompile B.ccm -o B.pcm
> clang -std=c++20 --precompile C.ccm -o C.pcm
> clang -std=c++20 -fprebuilt-module-path=. A.cc B.pcm C.pcm -o A
>
> /tmp/d/C.ccm:3:37: error: 'foo' has different definitions in different modules; definition in module 'C' first difference is function body
> export module C; export inline void foo() { assert(1); }
> ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
> /tmp/d/B.ccm:3:37: note: but in 'B' found a different body
> export module B; export inline void foo() { assert(1); }
> ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
>
> Fix this by preferring __builtin_FILE/__builtin_LINE which do not need
> preprocessing.
> ---
> include/assert.h | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/include/assert.h b/include/assert.h
> index d14ec94e..b209c2ae 100644
> --- a/include/assert.h
> +++ b/include/assert.h
> @@ -4,6 +4,12 @@
>
> #ifdef NDEBUG
> #define assert(x) (void)0
> +#elif defined(__has_builtin)
> +#if __has_builtin(__builtin_FILE)
> +#define assert(x) ((void)((x) || (__assert_fail(#x, __builtin_FILE(), __builtin_LINE(), __func__),0)))
> +#else
> +#define assert(x) ((void)((x) || (__assert_fail(#x, __FILE__, __LINE__, __func__),0)))
> +#endif
> #else
> #define assert(x) ((void)((x) || (__assert_fail(#x, __FILE__, __LINE__, __func__),0)))
> #endif
> --
> 2.39.GIT
It seems like use of assert here violates the ODR and is thus an
application error, no? In particular, it produces multiple definitions
that have differing behaviors, leaving which one actually gets used up
to the linker. Without the above change, LTO is able to diagnose the
error; with the change; it's silently deferred until runtime (where
the assertion violation message, of produced, will likely indicate the
wrong location).
Rich
On Fri, Feb 17, 2023 at 6:03 PM Rich Felker <dalias@libc.org> wrote:
>
> On Fri, Feb 17, 2023 at 05:33:33PM -0800, Fangrui Song wrote:
> > C++ inline functions are requred to have exact same sequence of tokens
> > in every translation unit, but __FILE__ and __LINE__ may expand to
> > different tokens. The ODR violatioin is usually benign, but it can lead
> > to errors when C++20 modules are used.
> >
> > echo 'import B; import C; int main() { foo(); }' > A.cc
> > cat > B.ccm <<'eof'
> > module;
> > #include <assert.h>
> > export module B; export inline void foo() { assert(1); }
> > eof
> > cat > C.ccm <<'eof'
> > module;
> > #include <assert.h>
> > export module C; export inline void foo() { assert(1); }
> > eof
> > clang -std=c++20 --precompile B.ccm -o B.pcm
> > clang -std=c++20 --precompile C.ccm -o C.pcm
> > clang -std=c++20 -fprebuilt-module-path=. A.cc B.pcm C.pcm -o A
> >
> > /tmp/d/C.ccm:3:37: error: 'foo' has different definitions in different modules; definition in module 'C' first difference is function body
> > export module C; export inline void foo() { assert(1); }
> > ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
> > /tmp/d/B.ccm:3:37: note: but in 'B' found a different body
> > export module B; export inline void foo() { assert(1); }
> > ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
> >
> > Fix this by preferring __builtin_FILE/__builtin_LINE which do not need
> > preprocessing.
> > ---
> > include/assert.h | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/include/assert.h b/include/assert.h
> > index d14ec94e..b209c2ae 100644
> > --- a/include/assert.h
> > +++ b/include/assert.h
> > @@ -4,6 +4,12 @@
> >
> > #ifdef NDEBUG
> > #define assert(x) (void)0
> > +#elif defined(__has_builtin)
> > +#if __has_builtin(__builtin_FILE)
> > +#define assert(x) ((void)((x) || (__assert_fail(#x, __builtin_FILE(), __builtin_LINE(), __func__),0)))
> > +#else
> > +#define assert(x) ((void)((x) || (__assert_fail(#x, __FILE__, __LINE__, __func__),0)))
> > +#endif
> > #else
> > #define assert(x) ((void)((x) || (__assert_fail(#x, __FILE__, __LINE__, __func__),0)))
> > #endif
> > --
> > 2.39.GIT
>
> It seems like use of assert here violates the ODR and is thus an
> application error, no? In particular, it produces multiple definitions
> that have differing behaviors, leaving which one actually gets used up
> to the linker. Without the above change, LTO is able to diagnose the
> error; with the change; it's silently deferred until runtime (where
> the assertion violation message, of produced, will likely indicate the
> wrong location).
>
> Rich
I am unsure whether it is an application error in the above example.
If it is not a good example, here is another one
where the inline function using assert is in a header:
echo 'import B; import C; int main() { foo(); }' > A.cc
cat > a.h <<'eof'
#include <assert.h>
inline void fn() { assert(1); }
eof
cat > B.ccm <<'eof'
module;
#include "a.h"
export module B; export inline void foo() { fn(); }
eof
mkdir -p ./d
cat > d/C.ccm <<'eof'
module;
#include "../a.h"
export module C; export inline void foo() { fn(); }
eof
sed 's/^ /\t/' > Makefile <<'eof'
C := clang
all:
$C -std=c++20 --precompile B.ccm -o B.pcm
$C -std=c++20 --precompile d/C.ccm -o d/C.pcm
$C -std=c++20 -fprebuilt-module-path=. -fprebuilt-module-path=d A.cc
B.pcm d/C.pcm -o A
eof
[-- Attachment #1: Type: text/plain, Size: 751 bytes --] On Sat, 18 Feb 2023, 02:54 Fangrui Song, <i@maskray.me> wrote: On Fri, Feb 17, 2023 at 6:03 PM Rich Felker <dalias@libc.org> wrote: > > On Fri, Feb 17, 2023 at 05:33:33PM -0800, Fangrui Song wrote: > > C++ inline functions are requred to have exact same sequence of tokens > > in every translation unit, but __FILE__ and __LINE__ may expand to > > different tokens. The ODR violatioin is usually benign, but it can lead > > to errors when C++20 modules are used. It is sad that C++ modules broke 'assert' but not surprising. Modules were largely created out of aversion to macros. This isn't something libc can fix though, I suggest a defect report against C++ instead. Changing the semantics of assert in C seems like a bad thing to do. Thanks [-- Attachment #2: Type: text/html, Size: 1236 bytes --]
On Sat, Feb 18, 2023 at 4:17 AM Jon Chesterfield <jonathanchesterfield@gmail.com> wrote: > > On Sat, 18 Feb 2023, 02:54 Fangrui Song, <i@maskray.me> wrote: > > On Fri, Feb 17, 2023 at 6:03 PM Rich Felker <dalias@libc.org> wrote: > > > > On Fri, Feb 17, 2023 at 05:33:33PM -0800, Fangrui Song wrote: > > > C++ inline functions are requred to have exact same sequence of tokens > > > in every translation unit, but __FILE__ and __LINE__ may expand to > > > different tokens. The ODR violatioin is usually benign, but it can lead > > > to errors when C++20 modules are used. > > > > It is sad that C++ modules broke 'assert' but not surprising. Modules were largely created out of aversion to macros. This isn't something libc can fix though, I suggest a defect report against C++ instead. > > Changing the semantics of assert in C seems like a bad thing to do. > > Thanks I disagree. This is a footgun where the right fix (or workaround, if you prefer) is on the libc side. It is fairly reasonable for a header to use assert and not expect two includes using different paths to not cause C++ module problems. The current module behavior regarding macros is a reasonable compromise. It can be evolved (e.g. https://gracicot.github.io/modules/2018/05/14/modules-macro.html).
On Sat, Feb 18, 2023 at 7:17 AM Jon Chesterfield <jonathanchesterfield@gmail.com> wrote: > > On Sat, 18 Feb 2023, 02:54 Fangrui Song, <i@maskray.me> wrote: > > On Fri, Feb 17, 2023 at 6:03 PM Rich Felker <dalias@libc.org> wrote: > > > > On Fri, Feb 17, 2023 at 05:33:33PM -0800, Fangrui Song wrote: > > > C++ inline functions are requred to have exact same sequence of tokens > > > in every translation unit, but __FILE__ and __LINE__ may expand to > > > different tokens. The ODR violatioin is usually benign, but it can lead > > > to errors when C++20 modules are used. > > It is sad that C++ modules broke 'assert' but not surprising. Modules were largely created out of aversion to macros. This isn't something libc can fix though, I suggest a defect report against C++ instead. Somewhat related, I tried contacting the C/C++ committee to have concerns addressed several years ago. I was not able to reach the committee. There's no mailing list or discussion group available to the public at large. It looks like one of those groups where the committee only talks with the companies that build compilers. Outsiders (like compiler users) are not welcomed. > Changing the semantics of assert in C seems like a bad thing to do. ++ Jeff
* Fangrui Song <i@maskray.me> [2023-02-21 11:09:14 -0800]:
> On Sat, Feb 18, 2023 at 4:17 AM Jon Chesterfield
> <jonathanchesterfield@gmail.com> wrote:
> >
> > On Sat, 18 Feb 2023, 02:54 Fangrui Song, <i@maskray.me> wrote:
> >
> > On Fri, Feb 17, 2023 at 6:03 PM Rich Felker <dalias@libc.org> wrote:
> > >
> > > On Fri, Feb 17, 2023 at 05:33:33PM -0800, Fangrui Song wrote:
> > > > C++ inline functions are requred to have exact same sequence of tokens
> > > > in every translation unit, but __FILE__ and __LINE__ may expand to
> > > > different tokens. The ODR violatioin is usually benign, but it can lead
> > > > to errors when C++20 modules are used.
> >
> >
> >
> > It is sad that C++ modules broke 'assert' but not surprising. Modules were largely created out of aversion to macros. This isn't something libc can fix though, I suggest a defect report against C++ instead.
> >
> > Changing the semantics of assert in C seems like a bad thing to do.
> >
> > Thanks
>
> I disagree. This is a footgun where the right fix (or workaround, if
> you prefer) is on the libc side. It is fairly reasonable for a header
> to use assert and not expect two includes using different paths to not
> cause C++ module problems.
>
> The current module behavior regarding macros is a reasonable
> compromise. It can be evolved (e.g.
> https://gracicot.github.io/modules/2018/05/14/modules-macro.html).
i dont see how that solves the fundamental problem:
the *behavior* of assert changes depending on which include path is
used and thus inline functions that are supposed to be equivalent
aren't. (__builtin_FILE makes the pp-token sequence the same across
the instances, but the actual code will have different paths, which
while not an odr violation per the literal words of the spec, it
clearly violates the reason the rule is there in the first place.)
libc can avoid printing the file path in the assert fail message for
c++. this makes assert less useful but it solves the conformance issue.
if c++ does not specify which path assert should print (or allow it to
be unpredictable) then it is difficult to do better than this.
it would have been more useful to have a __builtin_canonical_FILE()
or similar that gives a path that is somehow independent of include
path, but we don't have that now.
On Mon, Feb 27, 2023 at 2:27 PM Szabolcs Nagy <nsz@port70.net> wrote: > > * Fangrui Song <i@maskray.me> [2023-02-21 11:09:14 -0800]: > > > On Sat, Feb 18, 2023 at 4:17 AM Jon Chesterfield > > <jonathanchesterfield@gmail.com> wrote: > > > > > > On Sat, 18 Feb 2023, 02:54 Fangrui Song, <i@maskray.me> wrote: > > > > > > On Fri, Feb 17, 2023 at 6:03 PM Rich Felker <dalias@libc.org> wrote: > > > > > > > > On Fri, Feb 17, 2023 at 05:33:33PM -0800, Fangrui Song wrote: > > > > > C++ inline functions are requred to have exact same sequence of tokens > > > > > in every translation unit, but __FILE__ and __LINE__ may expand to > > > > > different tokens. The ODR violatioin is usually benign, but it can lead > > > > > to errors when C++20 modules are used. > > > > > > > > > > > > It is sad that C++ modules broke 'assert' but not surprising. Modules were largely created out of aversion to macros. This isn't something libc can fix though, I suggest a defect report against C++ instead. To lichray: ^^ > > > Changing the semantics of assert in C seems like a bad thing to do. > > > > > > Thanks > > > > I disagree. This is a footgun where the right fix (or workaround, if > > you prefer) is on the libc side. It is fairly reasonable for a header > > to use assert and not expect two includes using different paths to not > > cause C++ module problems. > > > > The current module behavior regarding macros is a reasonable > > compromise. It can be evolved (e.g. > > https://gracicot.github.io/modules/2018/05/14/modules-macro.html). > > i dont see how that solves the fundamental problem: > > the *behavior* of assert changes depending on which include path is > used and thus inline functions that are supposed to be equivalent > aren't. (__builtin_FILE makes the pp-token sequence the same across > the instances, but the actual code will have different paths, which > while not an odr violation per the literal words of the spec, it > clearly violates the reason the rule is there in the first place.) > libc can avoid printing the file path in the assert fail message for > c++. this makes assert less useful but it solves the conformance issue. > if c++ does not specify which path assert should print (or allow it to > be unpredictable) then it is difficult to do better than this. > > it would have been more useful to have a __builtin_canonical_FILE() > or similar that gives a path that is somehow independent of include > path, but we don't have that now. __FILE_NAME__ / __builtin_FILE_NAME just expands to the basename (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108978), which may help. I created this musl patch as I saw glibc made a similar change on 2023-02-10. Rejecting this patch is fine. It probably needs some time for standard C++ modules to become mainstream to expose this deployment problem.
[-- Attachment #1: Type: text/plain, Size: 2508 bytes --] On Thu, Mar 2, 2023 at 12:19 AM Fangrui Song <i@maskray.me> wrote: > On Mon, Feb 27, 2023 at 2:27 PM Szabolcs Nagy <nsz@port70.net> wrote: > > > > It is sad that C++ modules broke 'assert' but not surprising. > Modules were largely created out of aversion to macros. This isn't > something libc can fix though, I suggest a defect report against C++ > instead. > > To lichray: ^^ > The definition of 'assert' is mandatory only when NDEBUG is defined as a macro name. 7.2.1.1 reads: "[...], if expression (which shall have a scalar type) is false (that is, compares equal to 0), the assert macro writes information about the particular call that failed (including the text of the argument, the name of the source file, the source line number, and the name of the enclosing function — the latter are respectively the values of the preprocessing macros __FILE__ and __LINE__ and of the identifier __func__) on the standard error stream in an implementation-defined format." The text in parentheses did not mandate the use of __FILE__ and __LINE__, and C++ did not permit 'assert' not to work in modules. Therefore an implementation that fails to compile the code snippet in the original email is not conforming. > > > > > Changing the semantics of assert in C seems like a bad thing to do. > > > > > The semantics is unchanged, and people are doing it: Add custom ODR-safe assert. (!1166) · Merge requests · libeigen / eigen · GitLab <https://gitlab.com/libeigen/eigen/-/merge_requests/1166> > > > > i dont see how that solves the fundamental problem: > > > > the *behavior* of assert changes depending on which include path is > > used and thus inline functions that are supposed to be equivalent > > aren't. (__builtin_FILE makes the pp-token sequence the same across > > the instances, but the actual code will have different paths, which > > while not an odr violation per the literal words of the spec, it > > clearly violates the reason the rule is there in the first place.) > This is a different topic. In related news, CWG Issue 2678 (cplusplus.github.io) <https://cplusplus.github.io/CWG/issues/2678.html> will likely need to be revisited (i.e., what does 'odr' mean is subject to change). Regardless, __builtin_FILE is a vendor extension and serves customers' needs in implementing 'assert.' -- Zhihao Yuan, ID lichray The best way to predict the future is to invent it. _______________________________________________ [-- Attachment #2: Type: text/html, Size: 3559 bytes --]
On Sat, Mar 4, 2023 at 7:23 PM Zhihao Yuan <lichray@gmail.com> wrote:
>
> On Thu, Mar 2, 2023 at 12:19 AM Fangrui Song <i@maskray.me> wrote:
>>
>> On Mon, Feb 27, 2023 at 2:27 PM Szabolcs Nagy <nsz@port70.net> wrote:
>> > > > It is sad that C++ modules broke 'assert' but not surprising. Modules were largely created out of aversion to macros. This isn't something libc can fix though, I suggest a defect report against C++ instead.
>>
>> To lichray: ^^
>
>
> The definition of 'assert' is mandatory only when NDEBUG
> is defined as a macro name. 7.2.1.1 reads:
>
> "[...], if expression (which shall have a scalar type) is false (that is, compares equal to 0),
> the assert macro writes information about the particular call that failed (including the text of the
> argument, the name of the source file, the source line number, and the name of the enclosing function
> — the latter are respectively the values of the preprocessing macros __FILE__ and __LINE__ and of
> the identifier __func__) on the standard error stream in an implementation-defined format."
>
> The text in parentheses did not mandate the use of __FILE__
> and __LINE__, and C++ did not permit 'assert' not to work in
> modules. Therefore an implementation that fails to compile
> the code snippet in the original email is not conforming.
>>
>>
>> > > > Changing the semantics of assert in C seems like a bad thing to do.
>> > > >
>
>
> The semantics is unchanged, and people are doing it:
>
> Add custom ODR-safe assert. (!1166) · Merge requests · libeigen / eigen · GitLab
>
>>
>> >
>> > i dont see how that solves the fundamental problem:
>> >
>> > the *behavior* of assert changes depending on which include path is
>> > used and thus inline functions that are supposed to be equivalent
>> > aren't. (__builtin_FILE makes the pp-token sequence the same across
>> > the instances, but the actual code will have different paths, which
>> > while not an odr violation per the literal words of the spec, it
>> > clearly violates the reason the rule is there in the first place.)
>
>
> This is a different topic. In related news, CWG Issue 2678 (cplusplus.github.io)
> will likely need to be revisited (i.e., what does 'odr' mean
> is subject to change). Regardless, __builtin_FILE is
> a vendor extension and serves customers' needs in
> implementing 'assert.'
>
> --
> Zhihao Yuan, ID lichray
> The best way to predict the future is to invent it.
> _______________________________________________
Bump:)