From mboxrd@z Thu Jan 1 00:00:00 1970
From: ron minnich
Date: Mon, 30 Jan 2023 19:59:43 -0800
To: Dan Cross
CC: Alejandro Colomar, segaloco, The Eunuchs Hysterical Society
Subject: [TUHS] Re: yet another C discussion (YACD) and: Rust is not C++
List-Id: The Unix Heritage Society mailing list
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="00000000000044c2a005f3875df0"

--00000000000044c2a005f3875df0
Content-Type: text/plain; charset="UTF-8"

That example was a simplified bit of code from a widely used code base. All I need to do is change the function g to a pointer to function, or have it be provided by a .so, and all bets are off.

In any event, the important thing here is not that y should be initialized, or should not; it's that it is not possible to get a consistent answer on the question, from people who have been writing in C for decades.

ron

On Mon, Jan 30, 2023 at 6:56 PM Dan Cross <crossd@gmail.com> wrote:
On Mon, Jan 30, 2023 at 8:49 PM Alejandro Colomar
<alx.manpages@gmail.com> wrote:
> Hello Ron,
>
> On 1/30/23 20:35, ron minnich wrote:
> > I don't know how many ways there are to say this, but Rust and C/C++ are
> > fundamentally different at the lowest level.
> >
> > If you are just looking at Rust syntax in a superficial way, you might be
> > excused for thinking it's "C with features / C++ with differences."
> >
> > But that's not how it is. It's like saying C is "just like assembly" because
> > labels have a ':' in them; or that Unix is "just like RSX" because they have
> > vaguely similar commands.
> >
> > Here's a real question that came up where I work: should the code shown below be
> > accepted (this is abstracted from a real example that is in use ... everywhere)?
> > We had one code analyzer that said, emphatically, NO; one person said YES,
> > another MAYBE. One piece of code, 3 answers :-)
> >
> > char f() {
> >     char *y;
> >     g(&y);
> >     return *y;
> > }
> >
> >
> > A specific question: should y be initialized to NULL?
>
> No.  At least not if you don't want to use the value NULL in your program.
> Using NULL as something to avoid Undefined Behavior is wrong, and it will
> contribute to hide programmer errors.

Sorry, I think this misses the point: how do you meaningfully tell
that `g` did something to `y` so that it's safe to indirect in the
`return`?

On the other hand, one could write,

char f() {
    char *y = NULL;
    g(&y);
    if (y == NULL)
        panic("g failed");
    return *y;
}

C, of course, can't tell in the original. And while you can now tell that `g` did _something_ to `y`, you still really don't know that `y` points to something valid.

> These days, compilers and static analyzers are smart enough to detect
> uninitialized variables, even across Translation Units, and throw an error,
> letting the programmer fix such bugs, when they occur.

In many cases, yes, but not in all. Catching every case would be equivalent to
solving the halting problem.

> The practice of initializing always to NULL and 0 provides no value, and
> silences all of those warnings, thus creating silent bugs, that will bite some
> cold winter night.
>
> I know some static analyzers (e.g., clang-tidy(1)) do warn when you don't
> initialize variables and especially pointers (well, you need to enable the
> warning that does that, but it can warn).  That warning is there due to some
> coding style or certifications that require it.  I recommend disabling those
> bogus warnings, and forgetting about the bogus coding style or certification
> that requires you to write bogus code.

Oh my.

> > The case to set y to NULL: otherwise it has an unknown value and it's unsafe.
>
> Is an undefined value less safe than an unexpected one?  I don't think so.  At
> least compilers can detect the former, but not the latter.
>
> > The case against setting y to NULL: it is pointless, as it slows the code down
> > slightly and g is going to change it anyway.
>
> Performance is a very minor thing.  But it's a nice side-effect that doing the
> right thing has performance advantages.  Readability is a good reason (and in
> fact, the compiler suffers that readability too, which is the cause of the
> silencing of the wanted warnings).
>
> > The case maybe: Why do you trust g() to always set it? Why don't you trust g()?
> > convince me.
>
> Well, it depends on the contract of g().  If the contract is that it may not
> initialize the variable, then sure, initialize it yourself, or even better,
> check for g()'s errors, and react when it fails and doesn't initialize it.
>
> If the contract is that it should always initialize it, then trust it blindly.
> The compiler will tell you when it doesn't happen (that is, when g() has a bug).

The number of situations where the compiler can't tell whether `g` has
a bug is unbounded.
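
For contrast, here is a minimal Rust sketch of the same f/g pair in which
g's contract is part of its signature rather than a convention. Everything
in it is an assumption made for illustration: the `table` and `idx`
parameters, the Option return, and the error string are hypothetical and
do not come from the code base under discussion.

```rust
// Hypothetical counterpart of g(): instead of writing through an
// out-parameter, it returns the reference, or None when it has nothing
// to hand back.
fn g(table: &[u8], idx: usize) -> Option<&u8> {
    table.get(idx)
}

// Hypothetical counterpart of f(): the byte is reachable only after the
// caller has decided what happens in the None case.
fn f(table: &[u8], idx: usize) -> Result<u8, &'static str> {
    match g(table, idx) {
        Some(y) => Ok(*y),
        None => Err("g failed"),
    }
}
```

With this shape there is no "did g set y?" question left to argue about:
f cannot read the byte without stating what happens when g returns None,
even if that statement is only a panic via unwrap().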

> > You can't write this in Rust with this ambiguity. It won't compile. In fact, &
> > doesn't mean in Rust what it does in C.
>
> I don't know Rust.  Does it force NULL initialization?  If so, I guess it's a
> bad design choice.  Unless Rust is so different that it can detect such
> programmer errors even having defined default initialization, but I can't
> imagine how that is.

Rust enforces that all variables must be initialized prior to use.
Whether they're initialized with a zero value or something else is up to the programmer; but not initializing is a compile-time error.

For example:

| fn main() {
|     let x;
|     if thing_is_true() {
|         x = 5;
|     } else {
|         x = 3;
|     }
|     println!("x={x}");
| }
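
A small variation makes the compile-time nature of the check easier to
see. This is only a sketch: `pick` and its `flag` parameter are
hypothetical stand-ins for main() and thing_is_true() above.

```rust
// Hypothetical helper; `flag` plays the role of thing_is_true().
fn pick(flag: bool) -> i32 {
    let x; // declared here, deliberately not initialized yet
    if flag {
        x = 5;
    } else {
        x = 3;
    }
    // Every path to this point has assigned `x` exactly once, so the read
    // is accepted. Delete the `else` arm and rustc rejects the read with
    // error E0381 (possibly-uninitialized binding); nothing is checked at
    // run time.
    x
}
```

No default value is involved: the compiler tracks assignment along each
control-flow path and refuses the program when any path reaches a read
without one.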

In fact, this restriction is good; it allows us to employ a technique called
"Type-Driven Development", whereby we can create some type that encodes an invariant about the object. An object of that type is
written in such a way that once it has been initialized, the mere
existence of the object is sufficient to prove that the invariant
holds, and need not be retested whenever the object is used. For
example:

| #[repr(transparent)]
| struct PageFrameAddr(u64);
| impl PageFrameAddr {
|     fn new_round_down(addr: u64) -> PageFrameAddr {
|         PageFrameAddr(addr & !0xFFF)
|     }
| }

Here, "PageFrameAddr" contains a 4KiB-aligned page address.  Since the
only way to create one of these is by the `new_round_down` associated
method that masks off the low bits, we can be sure that if we get one
of these, the contained address is properly aligned.  In C, we'd pretty much have to test at the site of use.
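
To show what "need not be retested" buys at the site of use, here is a
slightly extended sketch of the same type. The `PAGE_MASK` constant, the
`try_new` and `addr` functions, and the `page_number` consumer are
hypothetical additions for illustration, not part of the example above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct PageFrameAddr(u64);

impl PageFrameAddr {
    // Low 12 bits: the offset within a 4KiB page.
    const PAGE_MASK: u64 = 0xFFF;

    /// Rounds any address down to the start of its page.
    pub fn new_round_down(addr: u64) -> PageFrameAddr {
        PageFrameAddr(addr & !Self::PAGE_MASK)
    }

    /// Hypothetical strict constructor: accepts only aligned addresses.
    pub fn try_new(addr: u64) -> Option<PageFrameAddr> {
        if (addr & Self::PAGE_MASK) == 0 {
            Some(PageFrameAddr(addr))
        } else {
            None
        }
    }

    /// Read-only access to the wrapped address.
    pub fn addr(self) -> u64 {
        self.0
    }
}

/// Hypothetical consumer: no alignment test here, because holding a
/// PageFrameAddr already implies the low 12 bits are zero.
pub fn page_number(pfa: PageFrameAddr) -> u64 {
    pfa.addr() >> 12
}
```

The guarantee holds for code outside the module that defines the type,
because the tuple field is private there; inside that module the
invariant is still maintained by discipline.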

This is an extremely powerful technique; cf Alexis King's blog post, "Parse Don't Validate"
(https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/)
and Cliff Biffle's talk on the Hubris embedded RTOS
(https://talks.osfc.io/osfc2021/talk/JTWYEH/).

> > Sorry to be such a pedant, but I was concerned that we not fall into the "Rust
> > is C++ all over again" trap.
> >
> > As for replacing C, the velocity of Rust is just astonishing. I think folks have
> > been waiting for something to replace C for a long time, and Rust, with all its
> > headaches and issues, is likely to be that thing.
>
> Modern C is receiving a lot of improvements from C++ and other languages.  It's
> getting really good in fixing the small issues it had in the past (and GNU C
> provides even more good things).  GNU C2x is quite safe and readable, compared
> to say ISO C99 versions.

C23 looks like it will be a better language than C11, but I don't know
that even JeanHeyd would suggest it's "quite safe". :-/

        - Dan C.


> I don't think C will ever be replaced.  And I hope it doesn't.
>
> Possibly, something like with Plan9 and Unix/Linux will happen.  The good things
> from other languages will come back in one form or another to C.  The
> not-so-good ones will be discarded.
>
> >
> > Personally, I still prefer Go, but I can also see which way the wind is blowing,
> > especially when I see Rust use exploding in firmware and user mode, and now even
> > in the Linux kernel.
>
> Cheers,
>
> Alex
--00000000000044c2a005f3875df0--