From mboxrd@z Thu Jan  1 00:00:00 1970
From: crossd at gmail.com (Dan Cross)
Date: Thu, 4 Feb 2021 10:26:31 -0500
Subject: [COFF] [TUHS] 68k prototypes & microcode
In-Reply-To: <20210204013356.GA16541@mcvoy.com>
References: <E1l5RL3-0002iv-Qv@tanda>
 <abf50209-5730-f5a0-0fd6-aec13ee68440@e-bbes.com>
 <202102030759.1137x7C2013543@freefriends.org>
 <CAHTagfHdykiYmqPCkhQkUQTU8fLqJBukPOyj-k1ef=Ur9rqH+Q@mail.gmail.com>
 <202102030858.1138wuqd011051@freefriends.org>
 <CAHTagfGOC7vgE2Os+kuP4oGzvot2kG3MERpQdLb2EoEhUoFpyg@mail.gmail.com>
 <CAC20D2M_33YdQyuHdb7EM-UVNcdM0TXz9eJXpTftHekWxK0=Dg@mail.gmail.com>
 <27567.1612399305@hop.toad.com>
 <bce2c77e-dd8d-a0f2-5b27-0f9239c76738@kilonet.net>
 <20210204013356.GA16541@mcvoy.com>
Message-ID: <CAEoi9W5qLQSY25UT5szf1s4JPD6uf7_=JYny-8jMym+p5aMuAA@mail.gmail.com>

On Wed, Feb 3, 2021 at 8:34 PM Larry McVoy <lm at mcvoy.com> wrote:

> I have to admit that I haven't looked at ARM assembler, the M1 is making
> me rethink that.  Anyone have an opinion on where ARM lies in the pleasant
> to unpleasant scale?
>

Redirecting to "COFF" as this is drifting away from Unix.

I have a soft spot for ARM, but I wonder if I should. At first blush, it's
a pleasant RISC-ish design: loads and stores for dealing with memory,
arithmetic and logic instructions work on registers and/or immediate
operands, etc. As others have mentioned, there's an inline barrel shifter
in the ALU that a lot of instructions can take advantage of in their second
operand; you can rotate, shift, etc, an immediate or register operand while
executing an instruction. Here's code for setting up page table entries for
an identity mapping for the low part of the physical address space (the
root page table pointer is at phys 0x40000000):

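        MOV     r1, #0x0000             @ low half of the table address
        MOVT    r1, #0x4000             @ r1 = 0x40000000, the root page table
        MOV     r0, #0                  @ r0 = section index
.Lpti:  MOV     r2, r0, LSL #20         @ r2 = index << 20 (1MiB section base)
        ORR     r2, r2, r3              @ OR in r3 (presumably attribute bits set up elsewhere)
        STR     r2, [r1], #4            @ store the entry, then advance r1 by 4
        ADD     r0, r0, #1
        CMP     r0, #2048               @ 2048 x 1MiB = the low 2GiB
        BNE     .Lpti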

(Note the `LSL #20` in the `MOV` instruction.)
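
For anyone who would rather read that loop in C, here is a minimal sketch of
the same thing. The function name and the `attrs` parameter are my own
inventions for illustration; `attrs` stands in for whatever was in r3, which
the snippet above doesn't show being set up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Each first-level "section" entry covers 1MiB, so the section base
 * address is the index shifted left by 20 bits. */
#define SECTION_SHIFT 20

/* Fill `count` identity-mapped section entries starting at table[0].
 * `attrs` is a hypothetical stand-in for the bits held in r3. */
void fill_identity_sections(uint32_t *table, size_t count, uint32_t attrs)
{
    assert(count <= 4096);  /* 4GiB / 1MiB: the most a 32-bit space needs */
    for (size_t i = 0; i < count; i++) {
        /* MOV r2, r0, LSL #20 ; ORR r2, r2, r3 ; STR r2, [r1], #4 */
        table[i] = ((uint32_t)i << SECTION_SHIFT) | attrs;
    }
}
```

The shift and the OR are two instructions on ARM, with the shift folded into
the `MOV`; that folding is the barrel shifter at work.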

32-bit ARM also has some niceness for conditionally executing instructions
based on currently set condition codes in the PSW, so you might see
something like:

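1:      CMP     r0, #0                  @ set flags from r0
        ADDNE   r1, r1, #1              @ executed only if r0 != 0
        SUBNE   r0, r0, #1              @ likewise; flags are left alone
        BNE     1b                      @ loop while r0 was non-zero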
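
In C terms that fragment is just a while loop. A rough sketch, with a
function name I made up for illustration and parameters named after the
registers:

```c
#include <stdint.h>

/* Mirrors the conditional-execution loop: while r0 is non-zero,
 * bump r1 and decrement r0. Returns the final value of r1. */
uint32_t count_down_into(uint32_t r0, uint32_t r1)
{
    while (r0 != 0) {   /* CMP r0, #0 ... BNE 1b */
        r1 += 1;        /* ADDNE r1, r1, #1 */
        r0 -= 1;        /* SUBNE r0, r0, #1 */
    }
    return r1;
}
```

The point of the ARM version is that the loop body needs no branch around it;
the `NE` suffixes do the test.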

The architecture tends to map nicely to C and similar languages (e.g.
Rust). There is a rich set of instructions for various kinds of arithmetic;
for instance, they support saturating instructions for DSP-style code. You
can push multiple registers onto the stack at once, which is a little odd
for a RISC ISA, but works ok in practice.
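
To make "saturating" concrete: instead of wrapping around on overflow, the
result clamps to the largest or smallest representable value. Below is a
minimal C sketch of a signed 32-bit saturating add, which is my understanding
of what an instruction like `QADD` computes; the function name is made up for
illustration.

```c
#include <stdint.h>

/* Signed 32-bit add that clamps to INT32_MIN/INT32_MAX instead of
 * wrapping. The sum is done in 64 bits so it cannot overflow. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;
    if (wide > INT32_MAX)
        return INT32_MAX;
    if (wide < INT32_MIN)
        return INT32_MIN;
    return (int32_t)wide;
}
```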

The supervisor instruction set is pretty nice. IO is memory-mapped, etc.
There's a co-processor interface for working with MMUs and things like it.
Memory mapping is a little weird, in that the first-level page table isn't
the same as the second-level tables: the first-level page table maps the 32-bit
address space into 1MiB "sections", each of which is described by a 32-bit
section descriptor; thus, to map the entire 4GiB space, you need 4096 of
those in 16KiB of physically contiguous RAM. At the second level, page
table entries map pages into the 1MiB section at different granularities; I
think the smallest is 1KiB (thus, you need 1024 32-bit entries). To map a 4KiB
virtual page to a 4KiB page frame, you repeat the relevant entry 4 times in the
second-level table. It ends up being kind of annoying. I did a little toy
kernel for ARM32 and ended up deciding to use 16KiB pages (basically, I map
4x4KiB contiguous pages) so I could allocate a single sized structure for
the page tables themselves.
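
The arithmetic in that paragraph can be sketched in C as follows. This only
illustrates the sizes and index calculations as I described them, assuming
the 1KiB second-level granularity; all of the names are hypothetical and
`entry` is an opaque value, not a real descriptor encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    L1_ENTRIES = 4096,            /* 4GiB / 1MiB sections */
    L1_BYTES   = L1_ENTRIES * 4,  /* 16KiB of 32-bit entries */
    L2_ENTRIES = 1024,            /* 1MiB / 1KiB at the finest granularity */
    L2_PER_4K  = 4                /* entries repeated per 4KiB page */
};

/* Which 1MiB section a virtual address falls in. */
uint32_t l1_index(uint32_t va) { return va >> 20; }

/* Which 1KiB slot within that section. */
uint32_t l2_index(uint32_t va) { return (va >> 10) & (L2_ENTRIES - 1); }

/* Map the 4KiB page containing `va` by writing the same (opaque,
 * hypothetical) entry into the four consecutive slots that cover it. */
void map_4k(uint32_t *l2, uint32_t va, uint32_t entry)
{
    assert(l2 != NULL);
    uint32_t first = l2_index(va) & ~(uint32_t)(L2_PER_4K - 1);
    for (uint32_t i = 0; i < L2_PER_4K; i++)
        l2[first + i] = entry;
}
```

Having to write every entry four times is the annoying part.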

Starting with the ARMv8 architecture, it's been split into 32-bit aarch32
(basically the above) and 64-bit aarch64. The latter has expanded the
number and width of general purpose registers; one is a zero register in
some contexts (and I think a stack pointer in others? I forget the
details). I haven't played around with it too much, but looked at it when
it came out and thought "this is reasonable, with some concessions for
backwards compatibility." They cleaned up the paging weirdness mentioned
above. The multiple push instruction has been retired and replaced with a
"push a pair of registers" instruction; I viewed that as a
compromise between code size and instruction set orthogonality.

So... overall quite pleasant, and far better than x86_64, but with some
oddities.

        - Dan C.