From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
	DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,MAILING_LIST_MULTI,
	PDS_OTHER_BAD_TLD,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no
	version=3.4.4
Received: (qmail 23419 invoked from network); 4 Feb 2022 23:19:51 -0000
Received: from minnie.tuhs.org (45.79.103.53)
  by inbox.vuxu.org with ESMTPUTF8; 4 Feb 2022 23:19:51 -0000
Received: by minnie.tuhs.org (Postfix, from userid 112)
	id 5FA899BFC9; Sat,  5 Feb 2022 09:19:50 +1000 (AEST)
Received: from minnie.tuhs.org (localhost [127.0.0.1])
	by minnie.tuhs.org (Postfix) with ESMTP id 16DFC9B92F;
	Sat,  5 Feb 2022 09:18:52 +1000 (AEST)
Authentication-Results: minnie.tuhs.org;
	dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="jXB16iVO";
	dkim-atps=neutral
Received: by minnie.tuhs.org (Postfix, from userid 112)
 id 048AE9B7AF; Sat,  5 Feb 2022 09:18:49 +1000 (AEST)
Received: from mail-oi1-f175.google.com (mail-oi1-f175.google.com
 [209.85.167.175])
 by minnie.tuhs.org (Postfix) with ESMTPS id BAA9B95192
 for <tuhs@minnie.tuhs.org>; Sat,  5 Feb 2022 09:18:46 +1000 (AEST)
Received: by mail-oi1-f175.google.com with SMTP id i5so10345895oih.1
 for <tuhs@minnie.tuhs.org>; Fri, 04 Feb 2022 15:18:46 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112;
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc; bh=ZlHyhpk+5xzIgC1UEgSfuLHLyFhqdzUOq5NEOYzxfJY=;
 b=jXB16iVOfgCe3u8IBL+hgsktFbGHKpylwSINK6AbHXsRVliHjYipTuwMkW6AkK8BiV
 qVdFzo/sIN/JvXqiXggZ7piCUbca+8e7HbjJf84PMgp1M0AP0Nz9KmdeQj5dwYX2nIGZ
 TmlM8ctoSao/aKj3S5JLgjwoLU7x77WCVTT/Vd3/AaT/w5zNOHavKmAiL5m19kSnLXOt
 kH9yCXTxamTQKcoP6nK7XDWzK27LKLAjotyZIk9MHqjsqaeq02MSqr7ha3KnMiK6UCkK
 BXkn2gsSYnuGqkNh5BAdLLu6X8GmtWf+lRHlF4OF7FxtwLjIw11jnn/9quZtTqaqaXgw
 SM5Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=ZlHyhpk+5xzIgC1UEgSfuLHLyFhqdzUOq5NEOYzxfJY=;
 b=IOMLbPbTCmsqLGTVM+7p2MedQPh5hGQEVjyvBbdqXfxGTqs6JFGldJCgCL1WFgHMeT
 3wVYdKbOREU9u0xUPgHTsq1cH2A+n3zLhb3SnjvZSPomylZQ3XkvXeOxoEja5SE48Vg6
 NJvjfT6hDs/zpFyf1qxUKOxDxfHdkNUbQiiFzVVZJ9VLICXTQipSfCc6eU9R6cCyOncO
 LC7kuSSSh+MrXXwrlsAG1ck2m0xC/yQnvFlIt+WaNL0OZ0J7w4CwvfQ3NpIhqKQ3vZjk
 gdhcl9GTePcM0/CEjBR8k2LuLEV0zSu+rJU0yyQSokvswKRE0+QAA1gOIbrm1yTp1KBH
 dExQ==
X-Gm-Message-State: AOAM533NG9T1hfnqwNPH212uScSUeCyFVK6BqtNy5hkBllyiwRir2XK2
 J2JpoqJphgaa2P1dZ2BG8qiMi//QEhYu2hTlzH6p5wiXHQk=
X-Google-Smtp-Source: ABdhPJysXzaPanOSMgEv3hDOrwSyqPRn4OSx4mfGZdYHPAzkPJgJaNwzQdQ6ZskKZKAhwFmkJJzrkeSWCBR3fAbo5zM=
X-Received: by 2002:a05:6808:230d:: with SMTP id
 bn13mr662843oib.74.1644016725773; 
 Fri, 04 Feb 2022 15:18:45 -0800 (PST)
MIME-Version: 1.0
References: <c0fd1aec-55a8-54a4-94dc-748068d9e15d@gmail.com>
 <202202011537.211FbYSe017204@freefriends.org>
 <20220201155225.5A9541FB21@orac.inputplus.co.uk>
 <202202020747.2127lTTh005669@freefriends.org>
 <7C19F93B-4F21-4BB1-A064-0307D3568DB7@cfcl.com>
 <1nFWmo-1Gn-00@marmaro.de>
 <CAP2nic37m2aSCffgr8o_J+BkAbnrzFDUaHX9V0brd+L8+PqdPA@mail.gmail.com>
 <202202040234.2142YeKN3307556@darkstar.fourwinds.com>
In-Reply-To: <202202040234.2142YeKN3307556@darkstar.fourwinds.com>
From: Dan Cross <crossd@gmail.com>
Date: Fri, 4 Feb 2022 18:18:09 -0500
Message-ID: <CAEoi9W4U7PpBGPT5U3k_XLoxBrceUDAHuB7ykZYQb9TxAD0x+Q@mail.gmail.com>
To: Jon Steinhart <jon@fourwinds.com>
Content-Type: multipart/alternative; boundary="000000000000f1d02a05d7397823"
Subject: Re: [TUHS] more about Brian... [really Rust]
X-BeenThere: tuhs@minnie.tuhs.org
X-Mailman-Version: 2.1.26
Precedence: list
List-Id: The Unix Heritage Society mailing list <tuhs.minnie.tuhs.org>
List-Unsubscribe: <https://minnie.tuhs.org/cgi-bin/mailman/options/tuhs>,
 <mailto:tuhs-request@minnie.tuhs.org?subject=unsubscribe>
List-Archive: <http://minnie.tuhs.org/pipermail/tuhs/>
List-Post: <mailto:tuhs@minnie.tuhs.org>
List-Help: <mailto:tuhs-request@minnie.tuhs.org?subject=help>
List-Subscribe: <https://minnie.tuhs.org/cgi-bin/mailman/listinfo/tuhs>,
 <mailto:tuhs-request@minnie.tuhs.org?subject=subscribe>
Cc: COFF <coff@minnie.tuhs.org>
Errors-To: tuhs-bounces@minnie.tuhs.org
Sender: "TUHS" <tuhs-bounces@minnie.tuhs.org>

--000000000000f1d02a05d7397823
Content-Type: text/plain; charset="UTF-8"

[TUHS to Bcc, +COFF <coff@minnie.tuhs.org> ]

This isn't exactly COFF material, but I don't know what list is more
appropriate.

On Thu, Feb 3, 2022 at 9:41 PM Jon Steinhart <jon@fourwinds.com> wrote:

> Adam Thornton writes:
> > Do the august personages on this list have opinions about Rust?
> > People who generally have tastes consonant with mine tell me I'd like
> Rust.
>
> Well, I'm not an august personage and am not a Rust programmer.  I did
> spend a while trying to learn rust a while ago and wasn't impressed.
>
> Now, I'm heavily biased in that I think that it doesn't add value to keep
> inventing new languages to do the same old things, and I didn't see
> anything
> in Rust that I couldn't do in a myriad of other languages.
>

I'm a Rust programmer, mostly using it for bare-metal kernel programming
(though in my current gig, I find myself mostly in Rust
userspace...ironically, it's back to C for the kernel). That said, I'm not
a fan-boy for the language: it's not perfect.

I've written basically four kernels in Rust now, to varying degrees of
complexity from, "turn the computer on, spit hello-world out of the UART,
and halt" to most of a v6 clone (which I really need to get around to
finishing) to two rather more complex ones. I've done one ersatz kernel in
C, and worked on a bunch more in C over the years. Between the two
languages, I'd pick Rust over C for similar projects.

Why? Because it really doesn't just do the same old things: it adds new
stuff. Honest!

Further, the sad reality (and the tie-in with TUHS/COFF) is that modern C
has strayed far from its roots as a vehicle for systems programming, in
particular, for implementing operating system kernels (
https://arxiv.org/pdf/2201.07845.pdf). C _implementations_ target the
abstract machine defined in the C standard, not hardware, and they use
"undefined behavior" as an excuse to make aggressive optimizations that
change the semantics of one's program in such a way that some of the tricks
you really do have to do when implementing an OS are just not easily done.
For example, consider this code:

uint16_t mul(uint16_t a, uint16_t b) { return a * b; }

Does that code ever exhibit undefined behavior? The answer is that "it
depends, but on most platforms, yes." Why? Because most often uint16_t is a
typedef for `unsigned short int`, and because `short int` is of lesser
"rank" than `int` and usually not as wide, the "usual arithmetic
conversions" will apply before the multiplication. This means that the
unsigned shorts will be converted to (signed) int. But on many
platforms `int` will be a 32-bit integer (even 64-bit platforms!). However,
the range of an unsigned 16-bit integer is such that the product of two
uint16_t's can include values whose product is larger than whatever is
representable in a signed 32-bit int, leading to overflow, and signed
integer overflow is undefined overflow is undefined behavior. But does that
_matter_ in practice? Potentially: since signed int overflow is UB, the
compiler can decide it would never happen. And so if the compiler decides,
for whatever reason, that (say) a saturating multiplication is the best way
to implement that multiplication, then that simple single-expression
function will yield results that (I'm pretty sure...) the programmer did
not anticipate for some subset of inputs. How do you fix this?

uint16_t mul(uint16_t a, uint16_t b) { unsigned int aa = a, bb = b; return
aa * bb; }

That may sound very hypothetical, but similar things have shown up in the
wild: https://people.csail.mit.edu/nickolai/papers/wang-undef-2012-08-21.pdf

In practice, this one is unlikely. But it's not impossible: the compiler
would be right, the programmer would be wrong. One thing I've realized
about C is that successive generations of compilers have tightened the
noose on UB so that code that has worked for *years* all of a sudden breaks
one day. There be dragons in our code.

After being bit one too many times by such issues in C I decided to
investigate alternatives. The choices at the time were either Rust or Go:
for the latter, one gets a nice, relatively simple language, but a big
complex runtime. For the former, you get a big, ugly language, but a
minimal runtime akin to C: to get it going, you really don't have to do
much more than set up a stack and join to a function. While people have
built systems running Go at the kernel level (
https://pdos.csail.mit.edu/papers/biscuit.pdf), that seemed like a pretty
heavy lift. On the other hand, if Rust could deliver on a quarter of the
promises it made, I'd be ahead of the game. That was sometime in the latter
half of 2018 and since then I've generally been pleasantly surprised at how
much it really does deliver.

For the above example, integer overflow is defined to trap. If you want
wrapping (or saturating!) semantics, you request those explicitly:

fn mul(a: u16, b: u16) -> u16 { a.wrapping_mul(b) }

This is perfectly well-defined, and guaranteed to work pretty much forever.

But, my real issue came from some of the tutorials that I perused.  Rust is
> being sold as "safer".  As near as I can tell from the tutorials, the model
> is that nothing works unless you enable it.  Want to be able to write a
> variable?  Turn that on.  So it seemed like the general style was to write
> code and then turn various things on until it ran.
>

That's one way to look at it, but I don't think that's the intent: the
model is rather, "immutable by default."

Rust forces you to think about mutability, ownership, and the semantics of
taking references, because the compiler enforces invariants on all of those
things in a way that pretty much no other language does. It is opinionated,
and not shy about sharing those opinions.

To me, this implies a mindset that programming errors are more important
> than thinking errors, and that one should hack on things until they work
> instead of thinking about what one is doing.  I know that that's the
> modern definition of programming, but will never be for me.


It's funny, I've had the exact opposite experience.

I have found that it actually forces you to invest a _lot_ more in-up front
thought about what you're doing. Writing code first, and then sprinkling in
`mut` and `unsafe` until it compiles is a symptom of writing what we called
"crust" on my last project at Google: that is, "C in Rust syntax." When I
convinced our team to switch from C(++) to Rust, but none of us were really
particularly adept at the language, and all hit similar walls of
frustration; at one point, an engineer quipped, "this language has a
near-vertical learning curve." And it's true that we took a multi-week
productivity hit, but once we reached a certain level of familiarity,
something equally curious happened: our debugging load went way, _way_ down
and we started moving much faster.

It turned out it was harder to get a Rust program to build at first,
particularly with the bad habits we'd built up over decades of whatever
languages we came from, but once it did those programs very often ran
correctly the first time. You had to think _really hard_ about what data
structures to use, their ownership semantics, their visibility, locking,
etc. A lot of us had to absorb an emotional gut punch when the compiler
showed us things that we _knew_ were correct were, in fact, not correct.
But once code compiled, it tended not to have the kinds of errors that were
insta-panics or triple faults (or worse, silent corruption you only noticed
a million instructions later): no dangling pointers, no use-after-free
bugs, no data races, no integer overflow, no out-of-bounds array
references, etc. Simply put, the language _forced_ a level of discipline on
us that even veteran C programmers didn't have.

It also let us program at a moderately higher level of abstraction;
off-by-one errors were gone because we had things like iterators. ADTs and
a "Maybe" monad (the `Result<T,E>` type) greatly improved our error
handling. `match` statements have to be exhaustive so you can't add a
variant to an enum and forget to update code to account in just that one
place (the compiler squawks at you). It's a small point, but the `?`
operator removed a lot of tedious boilerplate from our code, making things
clearer without sacrificing robust failure handling. Tuples for multiple
return values instead of using pointers for output arguments (that have to
be manually checked for validity!) are really useful. Pattern matching and
destructuring in a fast systems language? Good to go.

In contrast, I ran into a "bug" of sorts with KVM due to code I wrote that
manifested itself as an "x86 emulation error" when it was anything but: I
was turning on paging very early in boot, and I had manually set up an
identity mapping for the low 4GiB of address space for the jump from 32-bit
to 64-bit mode. I used gigabyte pages since it was easy, and I figured it
would be supported, but I foolishly didn't check the CPU features when
running this under virtualization for testing and got that weird KVM error.
What was going on? It turned out KVM in this case didn't support gig pages,
but the hardware did; the software worked just fine until the first time
the kernel went to do IO. Then, when the hypervisor went to fetch the
instruction bytes to emulate the IO instruction, it saw the gig-sized pages
and errored. Since the incompatibility was manifest deep in the bowels of
the instruction emulation code, that was the error that returned, even
though it had nothing to do with instruction emulation. It would have been
nice to plumb through some kind of meaningful error message, but in C
that's annoying at best. In Rust, it's trivial.
https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/

70% of CVEs out of Microsoft over the last 15 years have been memory safety
issues, and while we may poo-poo MSFT, they've got some really great
engineers and let's be honest: Unix and Linux aren't that much better in
this department. Our best and brightest C programmers continue to turn out
highly buggy programs despite 50 years of experience.

But it's not perfect. The allocator interface was a pain (it's defined to
panic on allocation failure; I'm cool with a NULL return), though work is
ongoing in this area. There's no ergonomic way to initialize an object
'in-place' (https://mcyoung.xyz/2021/04/26/move-ctors/), and there's no
great way to say, essentially, "this points at RAM; even though I haven't
initialized it, just trust me don't poison it" (
https://users.rust-lang.org/t/is-it-possible-to-read-uninitialized-memory-without-invoking-ub/63092
-- we really need a `freeze` operation). However, right now? I think it
sits at a local maxima for systems languages targeting bare-metal.

        - Dan C.

--000000000000f1d02a05d7397823
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div>[TUHS to Bcc, <a class=3D"gmail_plusreply" id=3D"plus=
ReplyChip-1" href=3D"mailto:coff@minnie.tuhs.org" tabindex=3D"-1">+COFF</a>=
=C2=A0]</div><div><br></div><div>This isn&#39;t exactly COFF material, but =
I don&#39;t know what list is more appropriate.</div><div><br></div><div di=
r=3D"ltr">On Thu, Feb 3, 2022 at 9:41 PM Jon Steinhart &lt;<a href=3D"mailt=
o:jon@fourwinds.com" target=3D"_blank">jon@fourwinds.com</a>&gt; wrote:<br>=
</div><div class=3D"gmail_quote"><blockquote class=3D"gmail_quote" style=3D=
"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-le=
ft:1ex">Adam Thornton writes:<br>
&gt; Do the august personages on this list have opinions about Rust?<br>
&gt; People who generally have tastes consonant with mine tell me I&#39;d l=
ike Rust.<br>
<br>
Well, I&#39;m not an august personage and am not a Rust programmer.=C2=A0 I=
 did<br>
spend a while trying to learn rust a while ago and wasn&#39;t impressed.<br=
>
<br>
Now, I&#39;m heavily biased in that I think that it doesn&#39;t add value t=
o keep<br>
inventing new languages to do the same old things, and I didn&#39;t see any=
thing<br>
in Rust that I couldn&#39;t do in a myriad of other languages.=C2=A0<br></b=
lockquote><div><br></div><div>I&#39;m a Rust programmer, mostly using it fo=
r bare-metal kernel programming (though in my current gig, I find myself mo=
stly in Rust userspace...ironically, it&#39;s back to C for the kernel). Th=
at said, I&#39;m not a fan-boy for the language: it&#39;s not perfect.</div=
><div><br></div><div>I&#39;ve written basically four kernels in Rust now, t=
o varying degrees of complexity from, &quot;turn the computer on, spit hell=
o-world out of the UART, and halt&quot; to most of a v6 clone (which I real=
ly need to get around to finishing) to two rather more complex ones. I&#39;=
ve done one ersatz kernel in C, and worked on a bunch more in C over the ye=
ars. Between the two languages, I&#39;d pick Rust over C for similar projec=
ts.</div><div><br></div><div>Why? Because it really doesn&#39;t just do the=
 same old things: it adds new stuff. Honest!</div><div><br></div><div>Furth=
er, the sad reality (and the tie-in with TUHS/COFF) is that modern C has st=
rayed far from its roots as a vehicle for systems programming, in particula=
r, for implementing operating system kernels (<a href=3D"https://arxiv.org/=
pdf/2201.07845.pdf">https://arxiv.org/pdf/2201.07845.pdf</a>). C _implement=
ations_ target the abstract machine defined in the C standard, not hardware=
, and they use &quot;undefined behavior&quot; as an excuse to make aggressi=
ve optimizations that change the semantics of one&#39;s program in such a w=
ay that some of the tricks you really do have to do when implementing an OS=
 are just not easily done. For example, consider this code:</div><div><br><=
/div><div>uint16_t mul(uint16_t a, uint16_t b) { return a * b; }</div><div>=
<br></div><div>Does that code ever exhibit undefined behavior? The answer i=
s that &quot;it depends, but on most platforms, yes.&quot; Why? Because mos=
t often uint16_t is a typedef for `unsigned short int`, and because `short =
int` is of lesser &quot;rank&quot; than `int` and usually not as wide, the =
&quot;usual arithmetic conversions&quot; will apply before the multiplicati=
on. This means that the unsigned shorts will be converted to (signed) int. =
But on many platforms=C2=A0`int` will be a 32-bit integer (even 64-bit plat=
forms!). However, the range of an unsigned 16-bit integer is such that the =
product of two uint16_t&#39;s can include values whose product is larger th=
an whatever is representable in a signed 32-bit int, leading to overflow, a=
nd signed integer overflow is undefined overflow is undefined behavior. But=
 does that _matter_ in practice? Potentially: since signed int overflow is =
UB, the compiler can decide it would never happen. And so if the compiler d=
ecides, for whatever reason, that (say) a saturating multiplication is the =
best way to implement that multiplication, then that simple single-expressi=
on function will yield results that (I&#39;m pretty sure...) the programmer=
 did not anticipate for some subset of inputs. How do you fix this?</div><d=
iv><br></div><div>uint16_t mul(uint16_t a, uint16_t b) { unsigned int aa =
=3D a, bb =3D b; return aa * bb; }</div><div><br></div><div>That may sound =
very hypothetical, but similar things have shown up in the wild: <a href=3D=
"https://people.csail.mit.edu/nickolai/papers/wang-undef-2012-08-21.pdf">ht=
tps://people.csail.mit.edu/nickolai/papers/wang-undef-2012-08-21.pdf</a></d=
iv><div><br></div><div>In practice, this one is unlikely. But it&#39;s not =
impossible: the compiler would be right, the programmer would be wrong. One=
 thing I&#39;ve realized about C is that successive generations of compiler=
s have tightened the noose on UB so that code that has worked for *years* a=
ll of a sudden breaks one day. There be dragons in our code.</div><div><br>=
</div><div><div>After being bit one too many times by such issues in C I de=
cided to investigate alternatives. The choices at the time were either Rust=
 or Go: for the latter, one gets a nice, relatively simple language, but a =
big complex runtime. For the former, you get a big, ugly language, but a mi=
nimal runtime akin to C: to get it going, you really don&#39;t have to do m=
uch more than set up a stack and join to a function. While people have buil=
t systems running Go at the kernel level (<a href=3D"https://pdos.csail.mit=
.edu/papers/biscuit.pdf">https://pdos.csail.mit.edu/papers/biscuit.pdf</a>)=
, that seemed like a pretty heavy lift. On the other hand, if Rust could de=
liver on a quarter of the promises it made, I&#39;d be ahead of the game. T=
hat was sometime in the latter half of 2018 and since then I&#39;ve general=
ly been pleasantly surprised at how much it really does deliver.</div></div=
><div><br></div><div>For the above example, integer overflow is defined to =
trap. If you want wrapping (or saturating!) semantics, you request those ex=
plicitly:</div><div><br></div><div>fn mul(a: u16, b: u16) -&gt; u16 { a.wra=
pping_mul(b) }</div><div><br></div><div>This is perfectly well-defined, and=
 guaranteed to work pretty much forever.</div><div><br></div><blockquote cl=
ass=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid=
 rgb(204,204,204);padding-left:1ex">
But, my real issue came from some of the tutorials that I perused.=C2=A0 Ru=
st is<br>
being sold as &quot;safer&quot;.=C2=A0 As near as I can tell from the tutor=
ials, the model<br>
is that nothing works unless you enable it.=C2=A0 Want to be able to write =
a<br>
variable?=C2=A0 Turn that on.=C2=A0 So it seemed like the general style was=
 to write<br>
code and then turn various things on until it ran.<br></blockquote><div><br=
></div><div>That&#39;s one way to look at it, but I don&#39;t think that=
9;s the intent: the model is rather, &quot;immutable by default.&quot;</div=
><div><br></div><div>Rust forces you to think about mutability, ownership, =
and the semantics of taking references, because the compiler enforces invar=
iants on all of those things in a way that pretty much no other language do=
es. It is opinionated, and not shy about sharing those opinions.</div><div>=
<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8=
ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
To me, this implies a mindset that programming errors are more important<br=
>
than thinking errors, and that one should hack on things until they work<br=
>
instead of thinking about what one is doing.=C2=A0 I know that that&#39;s t=
he<br>
modern definition of programming, but will never be for me.</blockquote><di=
v><br></div><div>It&#39;s funny, I&#39;ve had the exact opposite experience=
.</div><div><br></div><div><div>I have found that it actually forces you to=
 invest a _lot_ more in-up front thought about what you&#39;re doing. Writi=
ng code first, and then sprinkling in `mut` and `unsafe` until it compiles =
is a symptom of writing what we called &quot;crust&quot; on my last project=
 at Google: that is, &quot;C in Rust syntax.&quot; When I convinced our tea=
m to switch from C(++) to Rust, but none of us were really particularly ade=
pt at the language, and all hit similar walls of frustration; at one point,=
 an engineer quipped, &quot;this language has a near-vertical learning curv=
e.&quot; And it&#39;s true that we took a multi-week productivity hit, but =
once we reached a certain level of familiarity, something equally curious h=
appened: our debugging load went way, _way_ down and we started moving much=
 faster.</div><div><br></div><div>It turned out it was harder to get a Rust=
 program to build at first, particularly with the bad habits we&#39;d built=
 up over decades of whatever languages we came from, but once it did those =
programs very often ran correctly the first time. You had to think _really =
hard_ about what data structures to use, their ownership=C2=A0semantics, th=
eir visibility, locking, etc. A lot of us had to absorb an emotional=C2=A0g=
ut punch when the compiler showed us things that we _knew_ were correct wer=
e, in fact, not correct. But once code compiled, it tended not to have the =
kinds of errors that were insta-panics or triple faults (or worse, silent c=
orruption you only noticed a million instructions later): no dangling point=
ers, no use-after-free bugs, no data races, no integer overflow, no out-of-=
bounds array references, etc. Simply put, the language _forced_ a level of =
discipline on us that even veteran C programmers didn&#39;t have.</div></di=
v><div><br></div><div>It also let us program at a moderately higher level o=
f abstraction; off-by-one errors were gone because we had things like itera=
tors. ADTs and a &quot;Maybe&quot; monad (the `Result&lt;T,E&gt;` type) gre=
atly improved our error handling. `match` statements have to be exhaustive =
so you can&#39;t add a variant to an enum and forget to update code to acco=
unt in just that one place (the compiler squawks at you). It&#39;s a small =
point, but the `?` operator removed a lot of tedious boilerplate from our c=
ode, making things clearer without sacrificing robust failure handling. Tup=
les for multiple return values instead of using pointers for output argumen=
ts (that have to be manually checked for validity!) are really useful. Patt=
ern matching and destructuring in a fast systems language? Good to go.</div=
><div><br></div><div>In contrast, I ran into a &quot;bug&quot; of sorts wit=
h KVM due to code I wrote that manifested itself as an &quot;x86 emulation =
error&quot; when it was anything but: I was turning on paging very early in=
 boot, and I had manually set up an identity mapping for the low 4GiB of ad=
dress space for the jump from 32-bit to 64-bit mode. I used gigabyte pages =
since it was easy, and I figured it would be supported, but I foolishly did=
n&#39;t check the CPU features when running this under virtualization for t=
esting and got that weird KVM error. What was going on? It turned out KVM i=
n this case didn&#39;t support gig pages, but the hardware did; the softwar=
e worked just fine until the first time the kernel went to do IO. Then, whe=
n the hypervisor went to fetch the instruction bytes to emulate the IO inst=
ruction, it saw the gig-sized pages and errored. Since the incompatibility =
was manifest deep in the bowels of the instruction emulation code, that was=
 the error that returned, even though it had nothing to do with instruction=
 emulation. It would have been nice to plumb through some kind of meaningfu=
l error message, but in C that&#39;s annoying at best. In Rust, it&#39;s tr=
ivial.=C2=A0<a href=3D"https://lexi-lambda.github.io/blog/2019/11/05/parse-=
don-t-validate/">https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-=
validate/</a></div><div><br></div><div>70% of CVEs out of Microsoft over th=
e last 15 years have been memory safety issues, and while we may poo-poo MS=
FT, they&#39;ve got some really great engineers and let&#39;s be honest: Un=
ix and Linux aren&#39;t that much better in this department. Our best and b=
rightest C programmers continue to turn out highly buggy programs despite 5=
0 years of experience.</div><div><br></div><div>But it&#39;s not perfect. T=
he allocator interface was a pain (it&#39;s defined to panic on allocation =
failure; I&#39;m cool with a NULL return), though work is ongoing in this a=
rea. There&#39;s no ergonomic way to initialize an object &#39;in-place&#39=
; (<a href=3D"https://mcyoung.xyz/2021/04/26/move-ctors/">https://mcyoung.x=
yz/2021/04/26/move-ctors/</a>), and there&#39;s no great way to say, essent=
ially, &quot;this points at RAM; even though I haven&#39;t initialized it, =
just trust me don&#39;t poison it&quot; (<a href=3D"https://users.rust-lang=
.org/t/is-it-possible-to-read-uninitialized-memory-without-invoking-ub/6309=
2">https://users.rust-lang.org/t/is-it-possible-to-read-uninitialized-memor=
y-without-invoking-ub/63092</a> -- we really need a `freeze` operation). Ho=
wever, right now? I think it sits at a local maxima for systems languages t=
argeting bare-metal.</div><div><br></div><div>=C2=A0 =C2=A0 =C2=A0 =C2=A0 -=
 Dan C.</div><div><br></div></div></div>

--000000000000f1d02a05d7397823--