From: Clem Cole <clemc@ccc.com>
Date: Sat, 16 Aug 2025 19:25:33 -0700
To: Dan Cross
CC: TUHS, Douglas McIlroy
Subject: [TUHS] Re: C history question: why is signed integer overflow UB?
List-Id: The Unix Heritage Society mailing list

below...

On Fri, Aug 15, 2025 at 10:18 AM Dan Cross wrote:
> [Note: A few folks Cc'ed directly]
>
> Starting with the 1990 ANSI/ISO C standard, and continuing on to the
> present day, C has specified that signed integer overflow is
> "undefined behavior"; unsigned integer arithmetic is defined to be
> modular, and unsigned integer operations thus cannot meaningfully
> overflow, since they're always taken mod 2^b, where b is the number of
> bits in the datum (assuming unsigned int or larger, since type
> promotion of smaller things gets weird).
>
> But why is signed overflow UB? My belief has always been that signed
> integer overflow across various machines has non-deterministic
> behavior, in part because some machines would trap on overflow while
> others used non-2's-complement representations for signed integers
> and so the results could not be precisely defined: even if it did not
> trap, overflowing a 1's complement machine yielded a different _value_
> than on 2's complement. And around the time of initial
> standardization, targeting those machines was still an important use
> case. So while 2's complement with silent wrap-around was common, it
> could not be assumed, and once machines that generated traps on
> overflow were brought into the mix, it was safer to simply declare
> behavior on overflow undefined.

We need someone like Peter Darnell, Plauger, and some of the original ANSI C weenies who had to argue this all through in those days, but I think you caught the core issues.

This was just one of the many troublesome things that had to be worked through.
Until C89, the only ``official'' C was what Dennis shipped at any given time, and that was a bit ephemeral, particularly as the C compilers implemented for PCs and microprocessors started to play fast and loose with the C syntax.

The core task of C89 was to do as little harm as possible. My memory is that Dennis was pretty cool and rarely played his trump card (far pointers are the one case I am aware of where he told the 8086 people to pound sand), but the compromise the committee tended to use to kick the can down the road and get the standard out the door was to make things UB, in the hope that later versions could find a way to tighten things up.

Truth is, other language specs (like Fortran) had used that approach, so it was not a bad idea.

So, back to your question: I cannot say what the actual cause of the conflict WRT signed integer overflow was, but I bet it was that, since so many compilers handled it in different ways, the committee did not have a way to make a formal standard that would work, and they never found one later.

FWIW: Remember, when it came to floating point, C89 tossed away a lot of systems like the PDP-11. It says, we are going to use IEEE 754. So just because an old system used a format did not guarantee it would be accepted.
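To make the distinction in the quoted question concrete, here is a minimal sketch in C. It is not from the thread: the function names and the 8-bit adder models are assumptions chosen purely for illustration, and real 1's complement hardware differed in word size and other details. It shows unsigned addition wrapping mod 2^b, a way to detect signed overflow without ever performing the overflowing addition, and how the same overflowing sum reads as a different value under 2's complement and 1's complement.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned addition is defined to wrap modulo 2^b; no UB is possible here. */
unsigned add_unsigned(unsigned a, unsigned b)
{
    return a + b;
}

/* Reports whether a + b would overflow int, without ever evaluating a + b.
 * Evaluating the overflowing sum itself would be undefined behavior. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return false;
}

/* Hypothetical 8-bit 2's complement adder: keep the low 8 bits of the sum. */
uint8_t add8_twos(uint8_t a, uint8_t b)
{
    return (uint8_t)((unsigned)a + b);
}

/* Hypothetical 8-bit 1's complement adder: the carry out of bit 7 is
 * added back into bit 0 (end-around carry). */
uint8_t add8_ones(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    sum = (sum & 0xFFu) + (sum >> 8);
    return (uint8_t)sum;
}

/* Signed value of an 8-bit pattern read as 2's complement. */
int twos_value(uint8_t bits)
{
    return bits < 0x80 ? (int)bits : (int)bits - 256;
}

/* Signed value of an 8-bit pattern read as 1's complement (0xFF is -0). */
int ones_value(uint8_t bits)
{
    return bits < 0x80 ? (int)bits : -(int)(~(unsigned)bits & 0xFFu);
}

/* Illustrative checks; all of these hold for the models above. */
void check_examples(void)
{
    assert(add_unsigned(UINT_MAX, 1u) == 0u);          /* wraps mod 2^b */
    assert(add_would_overflow(INT_MAX, 1));            /* detected, never executed */
    assert(!add_would_overflow(1, 2));
    assert(twos_value(add8_twos(0x7F, 0x01)) == -128); /* 127 + 1, 2's complement */
    assert(ones_value(add8_ones(0x7F, 0x01)) == -127); /* 127 + 1, 1's complement */
}
```

Calling check_examples() from a main() exercises all three points. For 127 + 1 both adder models produce the bit pattern 0x80, but it reads as -128 in one representation and -127 in the other, which is the "different _value_" the question refers to; a machine that traps on overflow would produce neither.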