From: "Greg A. Woods"
To: The Unix Heritage Society mailing list
Subject: Re: [TUHS] History of popularity of C
Date: Wed, 27 May 2020 16:17:06 -0700

--pgp-sign-Multipart_Wed_May_27_16:17:00_2020-1
Content-Type: text/plain; charset=US-ASCII

At Wed, 27 May 2020 16:00:57 -0500, Nevin Liber wrote:
>
> On Wed, May 27, 2020 at 2:50 PM Greg A. Woods wrote:
> >
> > A big part of the problem is that the C Standard mandates compilation
> > will and must succeed (and allows this success to be totally silent too)
> > even if the code contains instances of undefined behaviour.
>
> No it does not.
>
> To quote C11:
>
> undefined behavior
> behavior, upon use of a nonportable or erroneous program construct or of
> erroneous data, for which this International Standard imposes no
> requirements

Sorry, I concede. Yes, "no requirements". In C99 at least.

Sadly most compilers, including GCC and Clang/LLVM, will, at best, warn
(and warnings are only treated as errors by the most macho|wise); and
compilers only do that now because they've been getting flak from
developers whenever the optimizer does something unexpected.

> Much UB cannot be detected at compile time. Much UB is too expensive to
> detect at run time.

Indeed. At best you can get a warning, or optional runtime code to
abort the program.

Now this isn't a problem when "undefined behaviour" becomes
"implementation defined behaviour" for a given implementation.
However that's obviously not portable, except for the trivial cases
where the common compilers for a given type of platform all do the same
things.

The real problems, though, arise when the optimizer takes advantage of
these rules regardless of what the un-optimized code will do on any
given platform and architecture.

The Linux kernel example I've referred to involved dereferencing a
pointer to do an assignment in a local variable definition, then a few
lines later testing if the pointer was NULL before using the local
variable. Unoptimised, the code will dereference a NULL pointer and
load junk from location zero into the variable (because it's kernel
code), then the NULL test will trigger and all will be good.

The optimizer rips out the NULL check because "obviously" the
programmer has assumed the pointer is always a valid non-NULL pointer
since they've explicitly dereferenced it before checking it and they
wouldn't want to waste even a single jump-on-zero instruction checking
it again.

(It's also quite possible the code was written "correctly" at first,
then someone mushed all the variable initialisations up onto their
definitions.)

In any case there's now a GCC option: -fno-delete-null-pointer-checks
(to go along with -fno-strict-aliasing and -fno-strict-overflow, and
-fno-strict-enums, all of which MUST be used, and sometimes
-fno-strict-volatile-bitfields too, on all legacy code that you don't
want to break)

It's even worse when you have to write bare-metal code that must
explicitly dereference a NULL pointer (a not-so-real example: you want
to use location zero in the CPU zero-page (e.g. on a 6502 or 6800, or
PDP-8, etc.) as a pointer) -- it is now impossible to do that in strict
Standard C even though trivially it "should just work" despite the
silly rules. As far as I can tell it always did just work in "plain
old" C.

The crazy thing about modern optimizers is that they're way more
persistent and often somewhat more clever than your average programmer.
They follow all the paths. They apply all the rules at every turn.

> Take strlen(const char* s) for example. s must be a valid pointer that
> points to a '\0'-terminated string. How would you detect that at compile
> time? How would you set up your run time to detect that and error out?

My premise is that you shouldn't try to detect this problem, AND in any
case where the optimizer might be able to prove the pointed at object
isn't a valid string it should not, and must not, abuse that knowledge
to rip out code or cause other even worse mis-behaviour.

I.e. this should not be "undefined", but rather "implementation defined
and without any recourse to allowing optimizer abuses".

-- 
Greg A. Woods

Kelowna, BC     +1 250 762-7675     RoboHack
Planix, Inc.    Avoncote Farms

--pgp-sign-Multipart_Wed_May_27_16:17:00_2020-1
Content-Type: application/pgp-signature
Content-Transfer-Encoding: 7bit
Content-Description: OpenPGP Digital Signature

-----BEGIN PGP SIGNATURE-----

iF0EABECAB0WIQTWEnAIIlcZX4oAawJie18UwlnHhQUCXs707wAKCRBie18UwlnH
hZtuAKDErh696qgpuTi9dTG4lRzA3/KwFACfR1+TD6frbp4iUEPhQN3joG0vQvM=
=hk8K
-----END PGP SIGNATURE-----

--pgp-sign-Multipart_Wed_May_27_16:17:00_2020-1--
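
The dereference-before-test pattern described in the message for the Linux kernel case can be sketched in a few lines of C. This is only an illustration of the shape of the problem, not the kernel code itself: `struct device`, `get_flags_unsafe` and `get_flags_checked` are hypothetical names invented for the sketch. Whether a given compiler actually removes the test depends on its version and optimization flags, such as the -fno-delete-null-pointer-checks option mentioned in the message.

```c
#include <stddef.h>

/* Hypothetical structure, used only for illustration. */
struct device {
    int flags;
};

/*
 * The pattern described in the message: the pointer is dereferenced in
 * the initialiser of a local variable, and only tested afterwards.
 * Calling this with dev == NULL is undefined behaviour, so an optimizer
 * is permitted to assume dev != NULL and delete the test.
 */
int get_flags_unsafe(struct device *dev)
{
    int flags = dev->flags;     /* dereference first ...            */

    if (dev == NULL)            /* ... test second: may be removed  */
        return -1;
    return flags;
}

/*
 * Reordered version: the test comes before any dereference, so there is
 * no undefined behaviour for the optimizer to reason from.
 */
int get_flags_checked(struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->flags;
}
```

With a valid pointer both functions return the same value. Only `get_flags_checked` has defined behaviour when passed NULL, in which case it returns -1.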