The Unix Heritage Society mailing list
From: Luther Johnson <luther.johnson@makerlisp.com>
To: Nevin Liber <nevin@eviloverlord.com>, tuhs@tuhs.org
Subject: [TUHS] Re: C history question: why is signed integer overflow UB?
Date: Fri, 15 Aug 2025 11:25:32 -0700
Message-ID: <e1a46242-a9e6-a580-9f05-994e76262f50@makerlisp.com>
In-Reply-To: <CAGg_6+OZOV0+yNm-T0pR3zqkhxOuhZAbY_B5hWg5JVcnXaV5hA@mail.gmail.com>

I hear and understand what you're saying. What I'm trying to point out
is that in C, as it was originally implemented, in expressions such as
"a + b", "a >> 1", and "++a", C "does what the machine does". That's a
very different thing from having rational, safe, predictable language
semantics for operations on types - but it was also a strength, and a
simple way to describe what C would do: defer to machine semantics. I
believe one place in C89/C90 where this is stated explicitly (as
implementation-defined behavior, in effect "do what the machine does")
is "-1 >> 1", as opposed to "-1 / 2".  On most machines, this program:

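#include <stdio.h>

int main(void)
{
    /* Right shift of a negative value: the result is implementation-defined. */
    printf("%d\n", -1 >> 1);
    /* Integer division: truncates toward zero on most machines. */
    printf("%d\n", -1 / 2);

    return 0;
}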

prints:

-1
0

directly reflecting the underlying machine shift and divide instructions 
- but if you made an appeal to rational integer type semantics, you 
might decide that it should do something else.

Old C was one way. Modern C has gone another way: good tools and 
rational semantics for safer and/or higher-performance code, or some 
balance between those and other goals. Old C just did what the machine 
did, and was a high-leverage tool - but you had to understand your machine.

On 08/15/2025 11:02 AM, Nevin Liber wrote:
> On Fri, Aug 15, 2025 at 12:32 PM Luther Johnson
> <luther.johnson@makerlisp.com> wrote:
>
>     My belief is that this was done so compilers could employ
>     optimizations that did not have to consider or maintain
>     implementation-specific behavior when integers would wrap. I don't
>     agree with this, I think 2's complement behavior on integers as an
>     implementation-specific behavior can be well-specified, and
>     well-understood, machine by machine, but I think this is one of
>     the places where compilers and benchmarks conspire to subvert the
>     obvious and change the language to "language-legally" allow
>     optimizations that can break the used-to-be-expected 2's
>     complement implementation-specific behavior.
>
>
> It isn't just about optimizations.
>
> Unsigned math in C is well defined here.  The problem is that its 
> wrapping behavior is almost (but not) always a bug.  Because of that, 
> for instance, one cannot write a no-false-positive sanitizer to catch 
> this because it cannot tell the difference between an accidental bug 
> and a deliberate use.  This is a well-defined case with a very 
> reasonable definition which most of the time leads to bugs.
>
> There are times folks want the wrapping behavior.  There are times 
> folks want saturating behavior.  There are times folks want such code 
> to error out.  There are times folks want the optimizing behavior 
> because their code doesn't go anywhere near wrapping.
>
> Ultimately, one needs different functions for the different 
> behaviors, but if you only have one spelling for that operation, you 
> can only get one behavior.  A given type has to pick one of the above 
> behaviors for a given spelling of an operation.
>
> You can, of course, disagree with what C picked here (many do), but it 
> is unlikely to change in the future.
>
> Not that it hasn't been tried.  In 2018 there was a proposal for C++ 
> P0907R0 Signed Integers are Two's Complement 
> <https://wg21.link/P0907R0>, and if you look at the next revision of 
> that paper P0907R1 <https://wg21.link/P0907R1>, there was no consensus 
> for the wrapping behavior.  Quoting the paper:
>
>   * Performance concerns, whereby defining the behavior prevents
>     optimizers from assuming that overflow never occurs;
>   * Implementation leeway for tools such as sanitizers;
>   * Data from Google suggesting that over 90% of all overflow is a
>     bug, and defining wrapping behavior would not have solved the bug.
>
> Fun fact:  in C++ std::atomic<int> does wrap, so you can actually get 
> the behavior you want.  I haven't looked to see if that is also true 
> using C's _Atomic type qualifier.
>
> Full disclosure:  I am on the WG21 (C++) Committee and am starting to 
> participate on the WG14 (C) Committee.
> -- 
>  Nevin ":-)" Liber  <nevin@eviloverlord.com>  +1-847-691-1404



