The Unix Heritage Society mailing list
From: ron minnich <rminnich@gmail.com>
To: Dan Cross <crossd@gmail.com>
Cc: Alejandro Colomar <alx.manpages@gmail.com>,
	segaloco <segaloco@protonmail.com>,
	The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: [TUHS] Re: yet another C discussion (YACD) and: Rust is not C++
Date: Mon, 30 Jan 2023 19:59:43 -0800
Message-ID: <CAP6exY+Qz2Oe4gC4D1Fqy22JKKDaanTOYpc0gxugBv485JUknQ@mail.gmail.com>
In-Reply-To: <CAEoi9W4nT-cwe2GJreO_xTt=YGfrgTDa5bHfiGf1Fo4bYocziw@mail.gmail.com>


That example was a simplified bit of code from a widely used code base. All
I need to do is change the function g to a pointer to function, or have it
be provided by a .so, and all bets are off.
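
A minimal sketch of that indirect-call case, under stated assumptions: the
names fill_fn, g_sets, g_forgets and f_checked are hypothetical and do not
come from the code base mentioned above. It only shows that once g arrives
as a function pointer, nothing in f itself says whether y gets set, so any
check has to happen at run time. It does not settle whether y should be
initialized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee type: expected to store a valid pointer in *out. */
typedef void (*fill_fn)(char **out);

static char buf[] = "x";

/* A well-behaved callee: always stores a valid pointer. */
static void g_sets(char **out) { *out = buf; }

/* A misbehaving callee: returns without touching *out. */
static void g_forgets(char **out) { (void)out; }

/*
 * Same shape as f(), but g is supplied by the caller, so the compiler
 * cannot see its body here. Returns 0 and stores the character in
 * *result if g set y; returns -1 if g left y as NULL.
 */
static int f_checked(fill_fn g, char *result)
{
    char *y = NULL;

    assert(g != NULL);
    g(&y);
    if (y == NULL)
        return -1;      /* g did nothing to y */
    *result = *y;       /* y was set, but its validity is still unproven */
    return 0;
}
```

Calling f_checked(g_sets, &c) yields 0 and f_checked(g_forgets, &c) yields
-1. The NULL sentinel only shows that g stored something, not that the
stored pointer is valid.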

In any event, the important thing here is not that y should be initialized,
or should not; it's that it is not possible to get a consistent answer on
the question, from people who have been writing in C for decades.

ron

On Mon, Jan 30, 2023 at 6:56 PM Dan Cross <crossd@gmail.com> wrote:

> On Mon, Jan 30, 2023 at 8:49 PM Alejandro Colomar
> <alx.manpages@gmail.com> wrote:
> > Hello Ron,
> >
> > On 1/30/23 20:35, ron minnich wrote:
> > > I don't know how many ways there are to say this, but Rust and C/C++ are
> > > fundamentally different at the lowest level.
> > >
> > > If you are just looking at Rust syntax in a superficial way, you might be
> > > excused for thinking it's "C with features / C++ with differences."
> > >
> > > But that's not how it is. It's like saying C is "just like assembly"
> > > because labels have a ':' in them; or that Unix is "just like RSX"
> > > because they have vaguely similar commands.
> > >
> > > Here's a real question that came up where I work: should the code shown
> > > below be accepted (this is abstracted from a real example that is in use
> > > ... everywhere)? We had one code analyzer that said, emphatically, NO;
> > > one person said YES, another MAYBE. One piece of code, 3 answers :-)
> > >
> > > char f() {
> > >     char *y;
> > >     g(&y);
> > >     return *y;
> > > }
> > >
> > >
> > > A specific question: should y be initialized to NULL?
> >
> > No.  At least not if you don't want to use the value NULL in your program.
> > Using NULL as something to avoid Undefined Behavior is wrong, and it will
> > contribute to hide programmer errors.
>
> Sorry, I think this misses the point: how do you meaningfully tell
> that `g` did something to `y` so that it's safe to indirect in the
> `return`?
>
> On the other hand, one could write,
>
> char f() {
>     char *y = NULL;
>     g(&y);
>     if (y == NULL)
>         panic("g failed");
>     return *y;
> }
>
> C, of course, can't tell in the original. And while you can now tell
> that `g` did _something_ to `y`, you still really don't know that `y`
> points to something valid.
>
> > These days, compilers and static analyzers are smart enough to detect
> > uninitialized variables, even across Translation Units, and throw an error,
> > letting the programmer fix such bugs, when they occur.
>
> In many cases, yes, but not in all. That would be equivalent to
> solving the halting problem.
>
> > The practice of initializing always to NULL and 0 provides no value, and
> > silences all of those warnings, thus creating silent bugs, that will bite
> > some cold winter night.
> >
> > I know some static analyzers (e.g., clang-tidy(1)) do warn when you don't
> > initialize variables and especially pointers (well, you need to enable the
> > warning that does that, but it can warn).  That warning is there due to
> > some coding style or certifications that require it.  I recommend disabling
> > those bogus warnings, and forgetting about the bogus coding style or
> > certification that requires you to write bogus code.
>
> Oh my.
>
> > > The case to set y to NULL: otherwise it has an unknown value and it's
> > > unsafe.
> >
> > Is an undefined value less safe than an unexpected one?  I don't think so.
> > At least compilers can detect the former, but not the latter.
> >
> > > The case against setting y to NULL: it is pointless, as it slows the code
> > > down slightly and g is going to change it anyway.
> >
> > Performance is a very minor thing.  But it's a nice side-effect that doing
> > the right thing has performance advantages.  Readability is a good reason
> > (and in fact, the compiler suffers that readability too, which is the cause
> > of the silencing of the wanted warnings).
> >
> > > The case maybe: Why do you trust g() to always set it? Why don't you
> > > trust g()? convince me.
> >
> > Well, it depends on the contract of g().  If the contract is that it may
> > not initialize the variable, then sure, initialize it yourself, or even
> > better, check for g()'s errors, and react when it fails and doesn't
> > initialize it.
> >
> > If the contract is that it should always initialize it, then trust it
> > blindly. The compiler will tell you when it doesn't happen (that is, when
> > g() has a bug).
>
> The number of situations where the compiler can't tell whether `g` has
> a bug is unbounded.
>
> > > You can't write this in Rust with this ambiguity. It won't compile. In
> > > fact, & doesn't mean in Rust what it does in C.
> >
> > I don't know Rust.  Does it force NULL initialization?  If so, I guess it's
> > a bad design choice.  Unless Rust is so different that it can detect such
> > programmer errors even having defined default initialization, but I can't
> > imagine how that is.
>
> Rust enforces that all variables must be initialized prior to use.
> Whether they're initialized with a zero value or something else is up
> to the programmer; but not initializing is a compile-time error.
>
> For example:
>
> | fn main() {
> |     let x;
> |     if thing_is_true() {
> |         x = 5;
> |     } else {
> |         x = 3;
> |     }
> |     println!("x={x}");
> | }
>
> In fact, this is good; this allows us to employ a technique called
> "Type-Driven Development", whereby we can create some type that
> encodes an invariant about the object. An object of that type is
> written in such a way that once it has been initialized, the mere
> existence of the object is sufficient to prove that the invariant
> holds, and need not be retested whenever the object is used. For
> example:
>
> | #[repr(transparent)]
> | struct PageFrameAddr(u64);
> | impl PageFrameAddr {
> |     fn new_round_down(addr: u64) -> PageFrameAddr {
> |         PageFrameAddr(addr & !0xFFF)
> |     }
> | }
>
> Here, "PageFrameAddr" contains a 4KiB-aligned page address.  Since the
> only way to create one of these is by the `new_round_down` associated
> method that masks off the low bits, we can be sure that if we get one
> of these, the contained address is properly aligned.  In C, we'd
> pretty much have to test at the site of use.
>
> This is an extremely powerful technique; cf Alexis King's blog post,
> "Parse Don't Validate"
> (https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/)
> and Cliff Biffle's talk on the Hubris embedded RTOS
> (https://talks.osfc.io/osfc2021/talk/JTWYEH/).
>
> > > Sorry to be such a pedant, but I was concerned that we not fall into the
> > > "Rust is C++ all over again" trap.
> > >
> > > As for replacing C, the velocity of Rust is just astonishing. I think
> > > folks have been waiting for something to replace C for a long time, and
> > > Rust, with all its headaches and issues, is likely to be that thing.
> >
> > Modern C is receiving a lot of improvements from C++ and other languages.
> > It's getting really good in fixing the small issues it had in the past (and
> > GNU C provides even more good things).  GNU C2x is quite safe and readable,
> > compared to say ISO C99 versions.
>
> C23 looks like it will be a better language than C11, but I don't know
> that even JeanHeyd would suggest it's "quite safe". :-/
>
>         - Dan C.
>
>
> > I don't think C will ever be replaced.  And I hope it doesn't.
> >
> > Possibly, something like with Plan9 and Unix/Linux will happen.  The good
> > things from other languages will come back in one form or another to C.
> > The not-so-good ones will be discarded.
> >
> > >
> > > Personally, I still prefer Go, but I can also see which way the wind is
> > > blowing, especially when I see Rust use exploding in firmware and user
> > > mode, and now even in the Linux kernel.
> >
> > Cheers,
> >
> > Alex
>

