From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <664f1cf9-ae56-11a5-1e94-f58e0ca23565@makerlisp.com> <2e3d71c9-167e-b5d7-0d68-516248d91cf3@makerlisp.com>
In-Reply-To: <2e3d71c9-167e-b5d7-0d68-516248d91cf3@makerlisp.com>
From: Warner Losh
Date: Fri, 15 Aug 2025 12:03:46 -0600
To: Luther Johnson
CC: tuhs@tuhs.org
Content-Type: multipart/alternative; boundary="000000000000c6ff3b063c6b3651"
X-MailFrom: wlosh@bsdimp.com
Subject: [TUHS] 
Re: C history question: why is signed integer overflow UB?
List-Id: The Unix Heritage Society mailing list

--000000000000c6ff3b063c6b3651
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

I suspect that it was the absence of a signed right shift. In my
Decsystem-20 OS class, one of the differences between the compiler on
the VAX and the compiler on the '20 was that -1 >> 1 was -1 on the VAX
and 2^35-1 on the '20. This was in 1985 or 1986, for a compiler that was
written in 1982 or '83 (it no longer exists today, I'm told; other '20
compilers took over). Some signed overflows / underflows / traps were
different as well, which only mattered in the '20 simulator we were
running, since all traps and interrupts reset the trap frame (whether
you wanted them to or not), so if you did it in the kernel interrupt,
you'd double trap the machine (I think this was an intentional
difference to teach about being careful in an interrupt context, but
still...). It's the lack of uniformity for signed operations across
machines generally available in the late 70s and early 80s (often based
on designs dating back to the 60s) that I always assumed drove it...
These details took up bits of three different lectures in the OS class,
and were a big source of problems for everybody...

So my specific case was a super weird edge case. But if it came up in
an undergraduate OS class at an obscure technical school in the middle
of the desert in New Mexico, I can't imagine that the design committee
didn't know about it. Since the standardization started a few years
before C89, I'm guessing that we'd see that if we had early drafts of
the standard (or maybe it was inherited from the K&R era).

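To make the shift difference concrete, here is a rough sketch that models
a 36-bit word inside a uint64_t and applies both kinds of right shift to
the all-ones pattern (-1 in two's complement). The word model and the
names WORD_BITS, WORD_MASK, shr_logical, shr_arith and
check_shift_examples are assumptions made for this illustration; they do
not come from either compiler mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 36-bit machine word held in the low bits of a uint64_t. */
#define WORD_BITS 36
#define WORD_MASK ((UINT64_C(1) << WORD_BITS) - 1)

/* Logical right shift: zeros enter from the left. Requires n < WORD_BITS. */
static uint64_t shr_logical(uint64_t w, unsigned n)
{
    return (w & WORD_MASK) >> n;
}

/* Arithmetic right shift: the sign bit is replicated. Requires n < WORD_BITS. */
static uint64_t shr_arith(uint64_t w, unsigned n)
{
    uint64_t sign = (w >> (WORD_BITS - 1)) & 1;
    uint64_t shifted = (w & WORD_MASK) >> n;

    if (sign) {
        /* Set the top n bits of the word to copies of the sign bit. */
        shifted |= WORD_MASK & ~(WORD_MASK >> n);
    }
    return shifted;
}

/* Self-check of the two results quoted in the message. */
static void check_shift_examples(void)
{
    uint64_t minus_one = WORD_MASK;  /* 36 one bits: -1 in two's complement */

    /* Logical shift gives 2^35 - 1, the '20 result described above. */
    assert(shr_logical(minus_one, 1) == UINT64_C(34359738367));

    /* Arithmetic shift leaves all 36 bits set, i.e. still -1 (the VAX result). */
    assert(shr_arith(minus_one, 1) == WORD_MASK);
}
```

In standard C the result of right-shifting a negative signed value is
implementation-defined, which is why a compiler could legitimately pick
either behavior.
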
Warner

On Fri, Aug 15, 2025 at 11:37 AM Luther Johnson <
luther.johnson@makerlisp.com> wrote:

> Or one's complement on those machines, but the idea was that this case
> is out of bounds, so you don't have to worry if munging some computation
> by substituting or rearranging expressions would change it, whatever the
> machine-specific behavior was.
>
> On 08/15/2025 10:31 AM, Luther Johnson wrote:
> > My belief is that this was done so compilers could employ
> > optimizations that did not have to consider or maintain
> > implementation-specific behavior when integers would wrap. I don't
> > agree with this, I think 2's complement behavior on integers as an
> > implementation-specific behavior can be well-specified, and
> > well-understood, machine by machine, but I think this is one of the
> > places where compilers and benchmarks conspire to subvert the obvious
> > and change the language to "language-legally" allow optimizations that
> > can break the used-to-be-expected 2's complement
> > implementation-specific behavior.
> >
> > I'm sure many people will disagree, but I think this is part of the
> > slippery slope of modern C, and part of how it stopped being more
> > usefully, directly, tied to the machine underneath.
> >
> > On 08/15/2025 10:17 AM, Dan Cross wrote:
> >> [Note: A few folks Cc'ed directly]
> >>
> >> This is not exactly a Unix history question, but given the close
> >> relationship between C's development and that of Unix, perhaps it is
> >> both topical and someone may chime in with a definitive answer.
> >>
> >> Starting with the 1990 ANSI/ISO C standard, and continuing on to the
> >> present day, C has specified that signed integer overflow is
> >> "undefined behavior"; unsigned integer arithmetic is defined to be
> >> modular, and unsigned integer operations thus cannot meaningfully
> >> overflow, since they're always taken mod 2^b, where b is the number of
> >> bits in the datum (assuming unsigned int or larger, since type
> >> promotion of smaller things gets weird).
> >>
> >> But why is signed overflow UB? My belief has always been that signed
> >> integer overflow across various machines has non-deterministic
> >> behavior, in part because some machines would trap on overflow (e.g.,
> >> Unisys 1100 series mainframes) while others used non-2's-complement
> >> representations for signed integers (again, the Unisys 1100 series,
> >> which used 1's complement), and so the results could not be precisely
> >> defined: even if it did not trap, overflowing a 1's complement machine
> >> yielded a different _value_ than on 2's complement. And around the
> >> time of initial standardization, targeting those machines was still an
> >> important use case. So while 2's complement with silent wrap-around
> >> was common, it could not be assumed, and once machines that generated
> >> traps on overflow were brought into the mix, it was safer to simply
> >> declare behavior on overflow undefined.
> >>
> >> But is that actually the case?
> >>
> >> Thanks in advance.
> >>
> >>         - Dan C.
> >>
> >
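For readers who want the distinction in the quoted question spelled out,
the following is a small sketch, not anything taken from the thread. It
shows unsigned addition wrapping mod 2^b, a signed addition that checks
for overflow before it happens, and an 8-bit toy model of why the same
overflowed bit pattern reads as a different value under one's complement
and two's complement. All function names (add_wrapping, add_checked,
as_twos_complement, as_ones_complement, check_overflow_examples) and the
8-bit width are assumptions chosen for the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned arithmetic is defined to wrap modulo 2^b. */
static unsigned add_wrapping(unsigned a, unsigned b)
{
    return a + b;
}

/* Signed add that tests the operands first, so the overflowing
 * addition (undefined behavior) is never actually evaluated. */
static bool add_checked(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;  /* would overflow */
    }
    *sum = a + b;
    return true;
}

/* Toy model: read an 8-bit pattern as a two's complement value. */
static int as_twos_complement(uint8_t bits)
{
    return (bits & 0x80) ? (int)bits - 256 : (int)bits;
}

/* Toy model: read an 8-bit pattern as a one's complement value. */
static int as_ones_complement(uint8_t bits)
{
    return (bits & 0x80) ? -(int)(uint8_t)~bits : (int)bits;
}

/* Self-check of the cases discussed in the quoted text. */
static void check_overflow_examples(void)
{
    int sum = 0;

    assert(add_wrapping(UINT_MAX, 1u) == 0u);  /* wraps mod 2^b */
    assert(!add_checked(INT_MAX, 1, &sum));    /* overflow detected */
    assert(add_checked(1, 2, &sum) && sum == 3);

    /* 0x7F + 0x01 yields the bit pattern 0x80 on a plain binary adder. */
    uint8_t pattern = (uint8_t)(0x7F + 0x01);
    assert(as_twos_complement(pattern) == -128);
    assert(as_ones_complement(pattern) == -127);
}
```

The last two assertions are the point of the toy model: even with no
trap, the overflowed result is not the same number on the two
representations.
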

--000000000000c6ff3b063c6b3651--