The Unix Heritage Society mailing list
* [TUHS] Re: Clever code
@ 2022-12-13  3:30 Rudi Blom
  2022-12-13  3:41 ` Warner Losh
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Rudi Blom @ 2022-12-13  3:30 UTC (permalink / raw)
  To: tuhs

I vaguely remember having read here about 'clever code' which took into
account the time a magnetic drum needed to rotate in order to optimise
access.

Similarly, I can imagine that with resource constraints you sometimes need
to be clever in order to get your program to fit. Of course, any such
cleverness needs extra documentation.

I have only ever programmed in user space, but even there, without lots of
comments in my code, I may start wondering what I did after only a few
months.

Cheers,
uncle rubl
-- 
The more I learn the better I understand I know nothing.

* [TUHS] Re: Clever code
  2022-12-13  3:30 [TUHS] Re: Clever code Rudi Blom
@ 2022-12-13  3:41 ` Warner Losh
  2022-12-13  3:53 ` Dave Horsfall
  2022-12-13 15:52 ` [TUHS] Re: Clever code Bakul Shah
  2 siblings, 0 replies; 31+ messages in thread
From: Warner Losh @ 2022-12-13  3:41 UTC (permalink / raw)
  To: Rudi Blom; +Cc: The Eunuchs Hysterical Society

On Mon, Dec 12, 2022, 8:32 PM Rudi Blom <rudi.j.blom@gmail.com> wrote:

>
> I vaguely remember having read here about 'clever code' which took into
> account the time a magnetic drum needed to rotate in order to optimise
> access.
>

Yes, this was done in many ways. The biggest ones were interleaving and
striding. Interleaving allowed a little processing time for each sector
while the disk kept spinning, so the next logical sector isn't the next
physical one. The sectors on adjacent tracks were also numbered to take
rotation and seek times into account. There is a lot of research here...
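
A minimal sketch of the interleaving and per-track offset described
above, in Python. The function names, the 17-sector track and the 3:1
factor are assumptions picked for illustration; they are not taken from
any particular controller or formatter.

```python
def interleaved_layout(sectors_per_track, interleave):
    """Return a list where entry p is the logical sector held in
    physical slot p for the given interleave factor."""
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        # If the slot is already taken (the factor shares a divisor with
        # the track size), slide forward to the next free slot.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return layout


def skew_track(layout, track, skew):
    """Rotate a track's layout by track * skew slots, so that logical
    sector 0 of the next track arrives after the head has moved."""
    n = len(layout)
    shift = (track * skew) % n
    return [layout[(p - shift) % n] for p in range(n)]


track0 = interleaved_layout(17, 3)
print(track0)
print(skew_track(track0, 1, 2))
```

With a 3:1 interleave the host gets two sector-times of processing
between consecutive logical sectors before the next one reaches the head.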

Warner

* [TUHS] Re: Clever code
  2022-12-13  3:30 [TUHS] Re: Clever code Rudi Blom
  2022-12-13  3:41 ` Warner Losh
@ 2022-12-13  3:53 ` Dave Horsfall
  2022-12-13  4:03   ` George Michaelson
                     ` (2 more replies)
  2022-12-13 15:52 ` [TUHS] Re: Clever code Bakul Shah
  2 siblings, 3 replies; 31+ messages in thread
From: Dave Horsfall @ 2022-12-13  3:53 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Tue, 13 Dec 2022, Rudi Blom wrote:

> I vaguely remember having read here about 'clever code' which took into
> account the time a magnetic drum needed to rotate in order to optimise
> access.

Sounds like you're referring to SOAP (Symbolic Optimal Assembly Program) 
on the IBM 650; the programmer wrote the code "straight down" and SOAP 
reordered it for rotational latency.
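
The idea of placing code to suit rotational latency can be sketched in
Python as below. This is not SOAP's actual algorithm; it is a toy model
under stated assumptions: a drum with 50 word positions per revolution
(`WORDS_PER_BAND`), an execution time given in word-times, and a
hypothetical helper `best_address` that picks the free address arriving
under the heads soonest after the current instruction finishes.

```python
WORDS_PER_BAND = 50  # assumed number of word positions per drum revolution


def best_address(free, current_addr, exec_word_times):
    """Pick (and claim) the free address that loses the least drum time.

    free            -- set of unused drum addresses (modified in place)
    current_addr    -- address of the instruction being executed
    exec_word_times -- assumed execution time, in word-times
    Returns (chosen address, word-times spent waiting for it).
    """
    # Angular position under the heads once the instruction completes:
    # one word-time to read it, then its execution time.
    ready = (current_addr + 1 + exec_word_times) % WORDS_PER_BAND

    def wait(addr):
        # Word-times lost until addr rotates round to the heads.
        return (addr - ready) % WORDS_PER_BAND

    choice = min(free, key=lambda addr: (wait(addr), addr))
    free.remove(choice)
    return choice, wait(choice)


free = set(range(1, 100))   # address 0 holds the first instruction
addr = 0
for _ in range(3):
    addr, lost = best_address(free, addr, 4)
    print(addr, lost)
```

Written "straight down" at consecutive addresses, each instruction in
this model would instead wait almost a full revolution for its successor.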

> Similarly I can imagine that with resource restraints you sometimes need to
> be clever in order to get your program to fit. Of course, any such
> cleverness needs extra documentation.

Try writing a bootstrap program in 512 bytes :-)  Self-modifying code was
the order of the day...

> I only ever programmed in user space but even then without lots of comment
> in my code I may already start wondering what I did after only a few months
> past.

You could be clever in kernel space too, such as taking advantage of
the DATIP/DATO cycles on DEC's Unibus when updating a memory word i.e. 
read/modify/write.

-- Dave

* [TUHS] Re: Clever code
  2022-12-13  3:53 ` Dave Horsfall
@ 2022-12-13  4:03   ` George Michaelson
  2022-12-13  8:05     ` Ralph Corderoy
  2022-12-13  7:47   ` Ralph Corderoy
  2022-12-13 11:46   ` John P. Linderman
  2 siblings, 1 reply; 31+ messages in thread
From: George Michaelson @ 2022-12-13  4:03 UTC (permalink / raw)
  To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society

The "sticky" bit was quite clever: tell the OS to keep something
memory resident, so you can binary patch the back end without
worrying.

Copy on Write was enormously clever.  Keeping FDs open across fork/exec
was fantastically clever.

Making a null pointer be a valid address was probably not clever
(PDP11) but I expect somebody will explain to me why it was clever.

* [TUHS] Re: Clever code
  2022-12-13  3:53 ` Dave Horsfall
  2022-12-13  4:03   ` George Michaelson
@ 2022-12-13  7:47   ` Ralph Corderoy
  2022-12-13 19:56     ` Dave Horsfall
  2022-12-13 11:46   ` John P. Linderman
  2 siblings, 1 reply; 31+ messages in thread
From: Ralph Corderoy @ 2022-12-13  7:47 UTC (permalink / raw)
  To: tuhs

Hi Dave,

> Rudi Blom wrote:
> > I vaguely remember having read here about 'clever code' which took
> > into account the time a magnetic drum needed to rotate in order to
> > optimise access.
>
> Sounds like you're referring to SOAP (Symbolic Optimal Assembly
> Program) 

I'd have thought the most widespread tale of drum-rotation time is the
wonderful prose version of ‘The Story of Mel’.

-- 
Cheers, Ralph.

* [TUHS] Re: Clever code
  2022-12-13  4:03   ` George Michaelson
@ 2022-12-13  8:05     ` Ralph Corderoy
  2022-12-13  9:45       ` Dagobert Michelsen
  0 siblings, 1 reply; 31+ messages in thread
From: Ralph Corderoy @ 2022-12-13  8:05 UTC (permalink / raw)
  To: tuhs

Hi George,

> The "sticky" bit was quite clever. tell the OS to keep something
> memory resident

Was it ever more than a hint?  I've not heard of it locking or pinning
an executable to memory.

> so you can binary patch the back end without worrying.

I think it altered the worrying.  :-)

If not sticky, ‘chmod -x’ the file, ensure no process is running it from
before the chmod, patch, ‘chmod +x’.

If sticky then ‘chmod -tx’, ensure no process is running it from before
the chmod, ‘chmod +x’, run it and exit to remove any stuck copy,
‘chmod -x’, patch, ‘chmod +tx’.

I may have got the detail wrong but avoiding the unpatched copy
lingering in memory is the aim.
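
The two procedures above can be sketched as a sequence of permission-bit
changes. This Python fragment only computes the modes and touches no
file. The function name `patch_sequence` and the step wording are
illustrative, and the sketch inherits any mistake in the procedure
itself.

```python
import stat

EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def patch_sequence(mode):
    """Return (step, mode) pairs for patching an executable whose
    permission bits are `mode`, following the procedure above."""
    sticky = mode & stat.S_ISVTX
    execs = mode & EXEC_BITS
    bare = mode & ~(EXEC_BITS | stat.S_ISVTX)   # no exec, no sticky

    steps = [("chmod -tx" if sticky else "chmod -x", bare),
             ("wait for processes started earlier to exit", bare)]
    if sticky:
        # Running the binary once and exiting is meant to release the
        # copy the kernel kept because of the sticky bit.
        steps += [("chmod +x, run it and exit to drop the stuck copy",
                   bare | execs),
                  ("chmod -x", bare)]
    steps += [("patch the file", bare),
              ("chmod +tx" if sticky else "chmod +x", mode)]
    return steps


for step, mode in patch_sequence(0o1755):
    print(f"{stat.filemode(stat.S_IFREG | mode)}  {step}")
```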

-- 
Cheers, Ralph.

* [TUHS] Re: Clever code
  2022-12-13  8:05     ` Ralph Corderoy
@ 2022-12-13  9:45       ` Dagobert Michelsen
  0 siblings, 0 replies; 31+ messages in thread
From: Dagobert Michelsen @ 2022-12-13  9:45 UTC (permalink / raw)
  To: Ralph Corderoy; +Cc: segaloco via TUHS

Hi Ralph,

> On 13.12.2022 at 09:05, Ralph Corderoy <ralph@inputplus.co.uk> wrote:
> 
> Hi George,
> 
>> The "sticky" bit was quite clever. tell the OS to keep something
>> memory resident
> 
> Was it ever more than a hint?  I've not heard of it locking or pinning
> an executable to memory.

Funny thing is that it meant the opposite on Solaris: files with the
sticky bit set bypassed the page cache and were therefore never held in
memory. This was set, e.g., for swap files. Obviously it is
counterproductive to cache stuff in memory which is going to be swapped
out...


Best regards

  — Dago

-- 
"You don't become great by trying to be great, you become great by wanting to do something,
and then doing it so hard that you become great in the process." - xkcd #896


* [TUHS] Re: Clever code
  2022-12-13  3:53 ` Dave Horsfall
  2022-12-13  4:03   ` George Michaelson
  2022-12-13  7:47   ` Ralph Corderoy
@ 2022-12-13 11:46   ` John P. Linderman
  2022-12-13 14:07     ` Douglas McIlroy
  2 siblings, 1 reply; 31+ messages in thread
From: John P. Linderman @ 2022-12-13 11:46 UTC (permalink / raw)
  To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society

There was a story that old hands would torment newcomers to the IBM 650
by tinkering with the optimizer to make it as slow as possible (and, with
rotating drums, that could be VERY slow). Then they'd look at the
newcomer's code, make a trivial change, run it with the real optimizer,
and get dazzling improvements.

I also recall punched card bootstrap programs for the IBM 7094 that
would load column binary when run column binary, and load row binary
when run row binary. -- jpl

* [TUHS] Re: Clever code
  2022-12-13 11:46   ` John P. Linderman
@ 2022-12-13 14:07     ` Douglas McIlroy
  2022-12-13 14:31       ` arnold
  0 siblings, 1 reply; 31+ messages in thread
From: Douglas McIlroy @ 2022-12-13 14:07 UTC (permalink / raw)
  To: John P. Linderman; +Cc: The Eunuchs Hysterical Society

Apropos of accessing rotating storage, John Kelly used to describe the
Packard-Bell 250, which had a delay-line memory, as a machine where
addresses refer to time rather than space.

The PB 250 had two instruction-sequencing modes. In one mode, each
instruction included the address of its successor. In the other mode,
whatever popped out the delay line when the current instruction
completed would be executed next.

Doug

* [TUHS] Re: Clever code
  2022-12-13 14:07     ` Douglas McIlroy
@ 2022-12-13 14:31       ` arnold
  2022-12-13 14:48         ` Ralph Corderoy
                           ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: arnold @ 2022-12-13 14:31 UTC (permalink / raw)
  To: jpl.jpl, douglas.mcilroy; +Cc: tuhs

Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

> Apropos of accessing rotating storage, John Kelly used to describe the
> Packard-Bell 250, which had a delay-line memory, as a machine where
> addresses refer to time rather than space.
>
> The PB 250 had two instruction-sequencing modes. In one mode, each
> instruction included the address of its successor. In the other mode,
> whatever popped out the delay line when the current instruction
> completed would be executed next.
>
> Doug

For us (relative) youngsters, can you explain some more how delay
line memory worked? The second mode you describe sounds like it
would be impossible to use if you wanted repeatable, reproducible
runs of your program.

Thanks,

Arnold

* [TUHS] Re: Clever code
  2022-12-13 14:31       ` arnold
@ 2022-12-13 14:48         ` Ralph Corderoy
  2022-12-13 15:10         ` Douglas McIlroy
  2022-12-15  6:30         ` [TUHS] Delay line memory (was: Clever code) Greg 'groggy' Lehey
  2 siblings, 0 replies; 31+ messages in thread
From: Ralph Corderoy @ 2022-12-13 14:48 UTC (permalink / raw)
  To: tuhs

Hi Arnold,

> > The PB 250 had two instruction-sequencing modes.  In one mode, each
> > instruction included the address of its successor.  In the other
> > mode, whatever popped out the delay line when the current
> > instruction completed would be executed next.
...
> The second mode you describe sounds like it would be impossible to use
> if you wanted repeatable, reproducible runs of your program.

How so?  As long as the time taken by the current instruction was
constant then it would be known what's popping out of the delay line
when it's done.  And if the time varied, say because of an operand or
the need to carry, then that could be used to choose between addresses.
Either way, it's repeatable.

It is as if the PC register on today's CPU was steadily trundling
through program memory in parallel to the execution of the current
instruction and when the fetch cycle started, it got whatever PC was
pointing at.

BTW, there's https://en.wikipedia.org/wiki/Delay-line_memory if you
don't know much about them, though I don't think it covers how the
line's content was modified other than a simple block diagram showing
taps for input and output.
https://en.wikipedia.org/wiki/Delay-line_memory#/media/File:SEACComputer_010.png

-- 
Cheers, Ralph.

* [TUHS] Re: Clever code
  2022-12-13 14:31       ` arnold
  2022-12-13 14:48         ` Ralph Corderoy
@ 2022-12-13 15:10         ` Douglas McIlroy
  2022-12-13 15:34           ` Stuff Received
                             ` (3 more replies)
  2022-12-15  6:30         ` [TUHS] Delay line memory (was: Clever code) Greg 'groggy' Lehey
  2 siblings, 4 replies; 31+ messages in thread
From: Douglas McIlroy @ 2022-12-13 15:10 UTC (permalink / raw)
  To: arnold; +Cc: tuhs

A delay line is logically like a drum, with circulating data that is
accessible only at one point on the circle. A delay line was
effectively a linear channel along which a train of data pulses was
sent. Pulses received at the far end were reshaped electronically and
reinjected at the sending end. One kind of delay line was a mercury
column that carried acoustic pulses. The PB 250 delay line was
magnetostrictive (a technology I know nothing about).

If instruction timing is known, then the next instruction to appear is
predictable. The only caveat is that instruction times should not be
data-dependent. You can lay out sequential code along the circle as
long as no instruction steps on one already placed. When that happens
you must switch modes to jump to an open spot, or perhaps insert nops
to jiggle the layout.
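
The layout rule above can be sketched in Python. This is a toy model,
not the PB 250's real timing: `durations[i]` is an assumed, fixed number
of word-times from instruction i's own slot to the slot that emerges
when it completes, and `lay_out` is a hypothetical helper. When the
natural successor slot is already occupied, the sketch falls back to the
explicit-successor mode and uses the next open slot rather than
inserting nops.

```python
def lay_out(durations, line_length):
    """Place instructions on a circular delay line.

    Returns one (slot, reached) pair per instruction, where `reached` is
    "sequential" if the instruction sits exactly where the line will be
    when its predecessor completes, or "explicit" if the predecessor has
    to name this slot as its successor.
    """
    if len(durations) > line_length:
        raise ValueError("program does not fit on the line")
    occupied = [False] * line_length
    placement = []
    slot = 0
    for duration in durations:
        reached = "sequential"
        # Stepping on an instruction already placed: move on to an open
        # slot and switch sequencing modes to get there.
        while occupied[slot]:
            slot = (slot + 1) % line_length
            reached = "explicit"
        occupied[slot] = True
        placement.append((slot, reached))
        slot = (slot + duration) % line_length
    return placement


print(lay_out([3, 3, 3, 3], 8))
print(lay_out([4, 4, 4], 8))
```

The placement is only meaningful if the durations are constants; a
data-dependent instruction time would make the successor slot
unpredictable at layout time.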

Doug

* [TUHS] Re: Clever code
  2022-12-13 15:10         ` Douglas McIlroy
@ 2022-12-13 15:34           ` Stuff Received
  2022-12-13 15:56             ` Ralph Corderoy
  2022-12-13 23:02           ` Harald Arnesen
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 31+ messages in thread
From: Stuff Received @ 2022-12-13 15:34 UTC (permalink / raw)
  To: tuhs

On 2022-12-13 10:10, Douglas McIlroy wrote:
> A delay line is logically like a drum, with circulating data that is
> accessible only at one point on the circle. A delay line was
> effectively a linear channel along which a train of data pulses was
> sent. Pulses received at the far end were reshaped electronically and
> reinjected at the sending end.

I had always thought of a delay line as a precursor to a register (or 
stack) for storing intermediate results.  Is this not an accurate way of 
thinking about it?

N.


* [TUHS] Re: Clever code
  2022-12-13  3:30 [TUHS] Re: Clever code Rudi Blom
  2022-12-13  3:41 ` Warner Losh
  2022-12-13  3:53 ` Dave Horsfall
@ 2022-12-13 15:52 ` Bakul Shah
  2022-12-13 16:14   ` Ralph Corderoy
  2022-12-15  6:39   ` [TUHS] Sector interleaving (was: Clever code) Greg 'groggy' Lehey
  2 siblings, 2 replies; 31+ messages in thread
From: Bakul Shah @ 2022-12-13 15:52 UTC (permalink / raw)
  To: Rudi Blom; +Cc: tuhs

On Dec 12, 2022, at 7:30 PM, Rudi Blom <rudi.j.blom@gmail.com> wrote:
> 
> I vaguely remember having read here about 'clever code' which took into account the time a magnetic drum needed to rotate in order to optimise access.

Similar considerations applied in the early days of Unix workstations.
The Fortune 32:16 was a 5.6 MHz machine and couldn't process 1020 KB/sec
(17 sectors/track of early ST412/ST506 disks) fast enough. As Warner
said, one dealt with it by formatting the disk so that the logical
blocks N & N+1 (from the OS PoV) were physically more than 1 sector
apart. No clever coding needed!

The "clever" coding we did was to use all 17 sectors rather than 16
+ 1 spare as was intended. Since our first disks were only 5 MB in
size, we wanted to use all the space, and the typical error rate is
much less than the roughly 6% (1 sector in 17) that the spare set
aside. This complicated handling bad blocks and slowed things
down as blocks with soft read errors were copied to blocks at the
very end of the disk. I don't recall whose idea it was but I was the
one who implemented it. I had an especially bad disk for testing on
which I used to build V7 kernels....

> Similarly I can imagine that with resource restraints you sometimes need to be clever in order to get your program to fit. Of course, any such cleverness needs extra documentation.

One has to be careful here as resource constraints change over time.
An optimization for rev N h/w can be a *pessimization* for later revs.
Even if you document code, people tend to leave "working code" alone.

> I only ever programmed in user space but even then without lots of comment in my code I may already start wondering what I did after only a few months past.

Over time comments tend to turn into fake news! Always go to the
primary sources!

* [TUHS] Re: Clever code
  2022-12-13 15:34           ` Stuff Received
@ 2022-12-13 15:56             ` Ralph Corderoy
  0 siblings, 0 replies; 31+ messages in thread
From: Ralph Corderoy @ 2022-12-13 15:56 UTC (permalink / raw)
  To: tuhs

Hi N.,

> I had always thought of a delay line as a precursor to a register (or
> stack) for storing intermediate results.  Is this not an accurate way
> of thinking about it?

As an example, https://en.wikipedia.org/wiki/EDVAC#Technical_description
says

    Physically, the computer comprised the following components:

        - a magnetic tape reader-recorder (Wilkes 1956:36 describes this
          as a wire recorder.)
        ...
        - a dual memory unit consisting of two sets of 64 mercury
          acoustic delay lines of eight words capacity on each line
          [1 Ki words]
        - three temporary delay-line tanks each holding a single word

It looks like the three temporaries were more akin to a stack or
registers with the main delay lines providing working memory distinct
from tape storage.

Another analogy for a delay line might be a steadily turning Rolodex
where the card on display can be read and then written, perhaps with a
different value, before it disappears.
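
The Rolodex analogy can be modelled in a few lines of Python. The class
name `DelayLine` and its methods are assumptions made for illustration:
words circulate one position per tick, and only the word currently at
the head can be read or rewritten, so every access reports how many
word-times it had to wait.

```python
from collections import deque


class DelayLine:
    """Toy recirculating store: only the word at the head is accessible."""

    def __init__(self, words):
        self.line = deque(words)
        self.time = 0          # word-times elapsed since start

    def tick(self):
        # The word leaving the far end is reinjected at the sending end.
        self.line.rotate(-1)
        self.time += 1

    def _wait_for(self, addr):
        if not 0 <= addr < len(self.line):
            raise IndexError("no such word on the line")
        waited = 0
        # Word `addr` is at the head whenever time % length == addr.
        while self.time % len(self.line) != addr:
            self.tick()
            waited += 1
        return waited

    def read(self, addr):
        waited = self._wait_for(addr)
        return self.line[0], waited

    def write(self, addr, value):
        waited = self._wait_for(addr)
        self.line[0] = value
        return waited


dl = DelayLine([10, 20, 30, 40])
print(dl.read(2))
print(dl.read(1))
```

In this model an address is really a time: the cost of an access depends
only on when it is made relative to the circulation.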

-- 
Cheers, Ralph.

* [TUHS] Re: Clever code
  2022-12-13 15:52 ` [TUHS] Re: Clever code Bakul Shah
@ 2022-12-13 16:14   ` Ralph Corderoy
  2022-12-13 16:30     ` Bakul Shah
  2022-12-15  6:39   ` [TUHS] Sector interleaving (was: Clever code) Greg 'groggy' Lehey
  1 sibling, 1 reply; 31+ messages in thread
From: Ralph Corderoy @ 2022-12-13 16:14 UTC (permalink / raw)
  To: tuhs

Hi Bakul,

> Fortune 32:16 was a 5.6 MHz machine and couldn't process 1020 KB/sec
> (17 sectors/track of early ST412/ST506 disks) fast enough.  As Warner
> said, one dealt with it by formatting the disk so that the logical
> blocks N & N+1 (from the OS PoV) were physically more than 1 sector
> apart.

Sticking with ST506 hard drives, by the time the 8 MHz ARM2 from Acorn
was reading a 56 MB Rodime, it was the drive which couldn't keep up so
executables were stored compressed on disk so the CPU had something to
do, uncompressing the sector's content, while it waited for the next
sector to arrive.  :-)

-- 
Cheers, Ralph.

* [TUHS] Re: Clever code
  2022-12-13 16:14   ` Ralph Corderoy
@ 2022-12-13 16:30     ` Bakul Shah
  0 siblings, 0 replies; 31+ messages in thread
From: Bakul Shah @ 2022-12-13 16:30 UTC (permalink / raw)
  To: Ralph Corderoy; +Cc: tuhs

On Dec 13, 2022, at 8:14 AM, Ralph Corderoy <ralph@inputplus.co.uk> wrote:
> 
> Hi Bakul,
> 
>> Fortune 32:16 was a 5.6 MHz machine and couldn't process 1020 KB/sec
>> (17 sectors/track of early ST412/ST506 disks) fast enough.  As Warner
>> said, one dealt with it by formatting the disk so that the logical
>> blocks N & N+1 (from the OS PoV) were physically more than 1 sector
>> apart.
> 
> Sticking with ST506 hard drives, by the time the 8 MHz ARM2 from Acorn
> was reading a 56 MB Rodime, it was the drive which couldn't keep up so
> executables were stored compressed on disk so the CPU had something to
> do, uncompressing the sector's content, while it waited for the next
> sector to arrive.  :-)

IIRC the slow part was due to running some common apps! Not much buffering
was possible on a 256 KB machine, so by the time the app asked for the next
block, on a 1:1 interleave that block would already be past the read head
and you had to spend an extra revolution to grab it!
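
The cost of that extra revolution can be estimated with a small Python
model. The function `track_read_time` and its parameters are assumptions
for illustration: time is counted in sector-times, the host is busy for
a fixed `busy` sector-times after each sector, and logical sector i sits
at physical slot (i * interleave) mod sectors.

```python
import math


def track_read_time(sectors, interleave, busy):
    """Sector-times needed to read one track in logical order."""
    if math.gcd(sectors, interleave) != 1:
        raise ValueError("this simple model needs gcd(sectors, interleave) == 1")
    t = 0  # the head is at the start of physical slot t % sectors
    for logical in range(sectors):
        slot = (logical * interleave) % sectors
        t += (slot - t) % sectors   # wait for the sector to come round
        t += 1                      # transfer it
        if logical < sectors - 1:
            t += busy               # host processes before asking again
    return t


for factor in (1, 2, 3):
    revs = track_read_time(17, factor, 1) / 17
    print(f"{factor}:1 interleave: {revs:.2f} revolutions per track")
```

Under these assumptions a host that needs one sector-time of processing
spends about 17 revolutions per track at 1:1 but about 2 at 2:1.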

* [TUHS] Re: Clever code
  2022-12-13  7:47   ` Ralph Corderoy
@ 2022-12-13 19:56     ` Dave Horsfall
  0 siblings, 0 replies; 31+ messages in thread
From: Dave Horsfall @ 2022-12-13 19:56 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Tue, 13 Dec 2022, Ralph Corderoy wrote:

> I'd have thought the most widespread tale of drum-rotation time is the 
> wonderful prose version of ‘The Story of Mel’.

Ah yes, a classic!

-- Dave

* [TUHS] Re: Clever code
  2022-12-13 15:10         ` Douglas McIlroy
  2022-12-13 15:34           ` Stuff Received
@ 2022-12-13 23:02           ` Harald Arnesen
  2022-12-14  7:31           ` arnold
  2022-12-15 18:06           ` Marc Donner
  3 siblings, 0 replies; 31+ messages in thread
From: Harald Arnesen @ 2022-12-13 23:02 UTC (permalink / raw)
  To: tuhs

Douglas McIlroy [13/12/2022 16.10]:

> A delay line is logically like a drum, with circulating data that is
> accessible only at one point on the circle. A delay line was
> effectively a linear channel along which a train of data pulses was
> sent. Pulses received at the far end were reshaped electronically and
> reinjected at the sending end. One kind of delay line was a mercury
> column that carried acoustic pulses.

And according to Alan Turing, it might have been implemented with Gin 
and Tonic.

* [TUHS] Re: Clever code
  2022-12-13 15:10         ` Douglas McIlroy
  2022-12-13 15:34           ` Stuff Received
  2022-12-13 23:02           ` Harald Arnesen
@ 2022-12-14  7:31           ` arnold
  2022-12-15 18:06           ` Marc Donner
  3 siblings, 0 replies; 31+ messages in thread
From: arnold @ 2022-12-14  7:31 UTC (permalink / raw)
  To: douglas.mcilroy, arnold; +Cc: tuhs

Thank you for the explanation.  The skill of the programmers who had to
write code for such machines amazes me. I might have been able to hold
all that kind of info in my head when I was younger, but certainly not
today...

Thanks,

Arnold

* [TUHS] Delay line memory (was: Clever code)
  2022-12-13 14:31       ` arnold
  2022-12-13 14:48         ` Ralph Corderoy
  2022-12-13 15:10         ` Douglas McIlroy
@ 2022-12-15  6:30         ` Greg 'groggy' Lehey
  2 siblings, 0 replies; 31+ messages in thread
From: Greg 'groggy' Lehey @ 2022-12-15  6:30 UTC (permalink / raw)
  To: arnold; +Cc: douglas.mcilroy, tuhs

On Tuesday, 13 December 2022 at  7:31:12 -0700, arnold@skeeve.com wrote:
> For us (relative) youngsters, can you explain some more how delay
> line memory worked?

At a slight tangent, if you're ever in Melbourne (Australia, of
course), you should take a look at CSIRAC, allegedly the oldest intact
computer still in existence.  It had delay line memory.  I took some
photos of it years ago:
http://www.lemis.com/grog/photos/Photos.php?dirdate=20040904

Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA.php


* [TUHS] Sector interleaving (was: Clever code)
  2022-12-13 15:52 ` [TUHS] Re: Clever code Bakul Shah
  2022-12-13 16:14   ` Ralph Corderoy
@ 2022-12-15  6:39   ` Greg 'groggy' Lehey
  1 sibling, 0 replies; 31+ messages in thread
From: Greg 'groggy' Lehey @ 2022-12-15  6:39 UTC (permalink / raw)
  To: Bakul Shah; +Cc: Rudi Blom, tuhs

On Tuesday, 13 December 2022 at  7:52:49 -0800, Bakul Shah wrote:
> On Dec 12, 2022, at 7:30 PM, Rudi Blom <rudi.j.blom@gmail.com> wrote:
>>
>> I vaguely remember having read here about 'clever code' which took
>> into account the time a magnetic drum needed to rotate in order to
>> optimise access.
>
> Similar consideration applied in the early days of unix workstations.
> Fortune 32:16 was a 5.6Mhz machine and couldn't process 1020KB/sec
> (17 sectors/track of early ST412/ST506 disks) fast enough. As Warner
> said, one dealt with it by formatting the disk so that the logical
> blocks N & N+1 (from the OS PoV) were physically more than 1 sector
> apart. No clever coding needed!

CP/M did something similar with floppy disks.  It imposed a six-fold
software interleave between sectors (logical sectors 1, 2, 3, ... were
"physical" sectors 1, 7, 13, ...).

On soft-sectored floppies, the "physical" sectors were really just the
numbers in the sector header.  By the time I got involved, computers
were so fast that they spent a lot of time just waiting for
the next sector.  I wrote a format program that positioned the
"physical" sectors so that there was only one sector between
"physical" 1, 7, 13 and so on.  It made an amazing difference to the disk
speed.
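
A rough sketch of how a skew table like that can be generated.  The 26
sectors per track is an assumption (the 8-inch single-density format),
and skew_table is only an illustrative name, not anything taken from
the CP/M sources.

```python
def skew_table(sectors_per_track=26, skew=6):
    """Map logical sector order to 1-based "physical" sector numbers.

    Step `skew` sectors forward for each logical sector; if that slot
    is already taken, slide forward to the next free one.
    """
    used = [False] * sectors_per_track
    table = []
    pos = 0  # 0-based position on the track
    for _ in range(sectors_per_track):
        while used[pos]:
            pos = (pos + 1) % sectors_per_track
        used[pos] = True
        table.append(pos + 1)
        pos = (pos + skew) % sectors_per_track
    return table


table = skew_table()
```

With those assumed parameters the table starts 1, 7, 13, 19, 25 and
wraps round to 5, so every sector number is used exactly once.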

Greg
--
Sent from my desktop computer.
Finger grog@lemis.com for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA.php


* [TUHS] Re: Clever code
  2022-12-13 15:10         ` Douglas McIlroy
                             ` (2 preceding siblings ...)
  2022-12-14  7:31           ` arnold
@ 2022-12-15 18:06           ` Marc Donner
  2022-12-15 18:08             ` Marc Donner
  3 siblings, 1 reply; 31+ messages in thread
From: Marc Donner @ 2022-12-15 18:06 UTC (permalink / raw)
  To: Douglas McIlroy; +Cc: tuhs

Further on delay line storage:

Physically one of the most common ones was a cylinder of liquid mercury.
There was a device at one end for introducing pressure waves into the
mercury (think loudspeaker) and a device at the other end for measuring the
pressure waves arriving (think microphone).  The pulses that came off the
microphone end were then fed back to the loudspeaker end, after being
cleaned up.
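
The recirculation can be modelled in a few lines.  This is only a toy
sketch, not a description of any real machine: the 8-bit line length,
the class name DelayLine and the tick() method are all made up for
illustration.

```python
from collections import deque


class DelayLine:
    """Toy recirculating delay line: bits enter at one end, leave at the other."""

    def __init__(self, length):
        self.line = deque([0] * length)  # pulses currently in flight
        self.ticks = 0                   # elapsed bit times

    def tick(self, write=None):
        """Advance one bit time and return the bit at the receiving end.

        The received bit is fed back in at the sending end unless
        `write` supplies a replacement bit.
        """
        out = self.line.popleft()
        self.line.append(out if write is None else write)
        self.ticks += 1
        return out


line = DelayLine(8)
for bit in [1, 0, 1, 1]:
    line.tick(write=bit)      # inject four bits at the sending end
for _ in range(4):
    line.tick()               # the bits are still in transit
readback = [line.tick() for _ in range(4)]
```

The four bits only become readable once they have travelled the whole
length of the line, which is the "addresses refer to time" point: the
same data comes round again every 8 ticks whether you want it or not.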
=====
nygeek.net
mindthegapdialogs.com/home <https://www.mindthegapdialogs.com/home>


On Tue, Dec 13, 2022 at 10:12 AM Douglas McIlroy <
douglas.mcilroy@dartmouth.edu> wrote:

> A delay line is logically like a drum, with circulating data that is
> accessible only at one point on the circle. A delay line was
> effectively a linear channel along which a train of data pulses was
> sent. Pulses received at the far end were reshaped electronically and
> reinjected at the sending end. One kind of delay line was a mercury
> column that carried acoustic pulses. The PB 250 delay line was
> magnetostrictive (a technology I know nothing about).
>
> If instruction timing is known, then the next instruction to appear is
> predictable. The only caveat is that instruction times should not be
> data-dependent. You can lay out sequential code along the circle as
> long as no instruction steps on one already placed. When that happens
> you must switch modes to jump to an open spot, or perhaps insert nops
> to jiggle the layout.
>
> Doug
>
> On Tue, Dec 13, 2022 at 9:31 AM <arnold@skeeve.com> wrote:
> >
> > Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
> >
> > > Apropos of accessing rotating storage, John Kelly used to describe the
> > > Packard-Bell 250, which had a delay-line memory, as a machine where
> > > addresses refer to time rather than space.
> > >
> > > The PB 250 had two instruction-sequencing modes. In one mode, each
> > > instruction included the address of its successor. In the other mode,
> > > whatever popped out the delay line when the current instruction
> > > completed would be executed next.
> > >
> > > Doug
> >
> > For us (relative) youngsters, can you explain some more how delay
> > line memory worked? The second mode you describe sounds like it
> > would be impossible to use if you wanted repeatable, reproducible
> > runs of your program.
> >
> > Thanks,
> >
> > Arnold
>


* [TUHS] Re: Clever code
  2022-12-15 18:06           ` Marc Donner
@ 2022-12-15 18:08             ` Marc Donner
  0 siblings, 0 replies; 31+ messages in thread
From: Marc Donner @ 2022-12-15 18:08 UTC (permalink / raw)
  To: Douglas McIlroy; +Cc: tuhs

Here is a page from the Computer History Museum on the topic:
https://www.computerhistory.org/revolution/memory-storage/8/309

about halfway down the page is a nice schematic.
=====
nygeek.net
mindthegapdialogs.com/home <https://www.mindthegapdialogs.com/home>


On Thu, Dec 15, 2022 at 1:06 PM Marc Donner <marc.donner@gmail.com> wrote:

> Further on delay line storage:
>
> Physically one of the most common ones was a cylinder of liquid mercury.
> There was a device at one end for introducing pressure waves into the
> mercury (think loudspeaker) and a device at the other end for measuring the
> pressure waves arriving (think microphone).  The pulses that came off the
> microphone end were then fed back to the loudspeaker end, after being
> cleaned up.
> =====
> nygeek.net
> mindthegapdialogs.com/home <https://www.mindthegapdialogs.com/home>
>
>
> On Tue, Dec 13, 2022 at 10:12 AM Douglas McIlroy <
> douglas.mcilroy@dartmouth.edu> wrote:
>
>> A delay line is logically like a drum, with circulating data that is
>> accessible only at one point on the circle. A delay line was
>> effectively a linear channel along which a train of data pulses was
>> sent. Pulses received at the far end were reshaped electronically and
>> reinjected at the sending end. One kind of delay line was a mercury
>> column that carried acoustic pulses. The PB 250 delay line was
>> magnetostrictive (a technology I know nothing about).
>>
>> If instruction timing is known, then the next instruction to appear is
>> predictable. The only caveat is that instruction times should not be
>> data-dependent. You can lay out sequential code along the circle as
>> long as no instruction steps on one already placed. When that happens
>> you must switch modes to jump to an open spot, or perhaps insert nops
>> to jiggle the layout.
>>
>> Doug
>>
>> On Tue, Dec 13, 2022 at 9:31 AM <arnold@skeeve.com> wrote:
>> >
>> > Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
>> >
>> > > Apropos of accessing rotating storage, John Kelly used to describe the
>> > > Packard-Bell 250, which had a delay-line memory, as a machine where
>> > > addresses refer to time rather than space.
>> > >
>> > > The PB 250 had two instruction-sequencing modes. In one mode, each
>> > > instruction included the address of its successor. In the other mode,
>> > > whatever popped out the delay line when the current instruction
>> > > completed would be executed next.
>> > >
>> > > Doug
>> >
>> > For us (relative) youngsters, can you explain some more how delay
>> > line memory worked? The second mode you describe sounds like it
>> > would be impossible to use if you wanted repeatable, reproducible
>> > runs of your program.
>> >
>> > Thanks,
>> >
>> > Arnold
>>
>


* [TUHS] Re: Clever code
  2022-12-13 20:14   ` segaloco via TUHS
  2022-12-13 20:58     ` Warren Toomey via TUHS
@ 2022-12-14  2:28     ` Luther Johnson
  1 sibling, 0 replies; 31+ messages in thread
From: Luther Johnson @ 2022-12-14  2:28 UTC (permalink / raw)
  To: tuhs

The CPU I have designed and implemented (it's running in a Lattice FPGA
right now) has three general purpose registers, a frame pointer, and a
stack pointer. But the encoding problem you mention is real. So instead
of designing a scheme where the instruction word is split up into
fields, I have the first byte as the instruction type, and then however
many immediate data bytes (in this instruction set, 1 or 3) are
necessary following. The first byte, after it is fetched, is simply fed
to a lookup table, which then results in a 12 bit value, 6 bits for 
operation, and two 3 bit fields for the a and b registers - these 12 
bits go to the execution stage. This is a two register address, 24 bit 
machine - I designed it as a replacement for the Zilog eZ80, which has 
become hard to get. Anyhow, I get good code density and I've got lots of 
spare codes left. I've attached the ISA description.
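
As a rough illustration of that lookup (in Python rather than HDL, so
not the actual implementation): the function names pack and unpack are
invented here, and the three table rows are copied from the attached
ISA listing.

```python
def pack(op, a, b):
    """Build the 12-bit control word: 6-bit op, 3-bit a, 3-bit b."""
    assert 0 <= op < 64 and 0 <= a < 8 and 0 <= b < 8
    return (op << 6) | (a << 3) | b


def unpack(word):
    """Split a 12-bit control word back into (op, a, b)."""
    return (word >> 6) & 0x3F, (word >> 3) & 0x7, word & 0x7


# First instruction byte -> 12-bit word sent to the execution stage.
DECODE = {
    0x00: pack(0x00, 0, 0),  # add r0,r0
    0x09: pack(0x01, 0, 7),  # add r0,dd
    0x69: pack(0x11, 4, 3),  # mov sp,fp
}

op, a, b = unpack(DECODE[0x69])
```

Because the first byte is only an index into the table, the 3-bit
register fields never have to fit inside the instruction byte itself.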

I did go through lots of design alternatives to reach the parameters of
this ISA - the key one was that if I wanted to have the operations I
needed available across all the general purpose registers, that limited
how many general purpose registers I could have and keep all the
enumerations in less than 256 codes, with some to spare. Another set of
choices relates to how I wanted to implement C on this machine, and
that I did not intend for it to support all possible styles of assembly 
language programming - it is meant to support code generated by the C 
compiler, with a minimum of assembly required.

This machine is called "COR24". I can describe the machine in further 
detail, or show you some sample code, if you're interested.

Luther

On 12/13/2022 01:14 PM, segaloco via TUHS wrote:
> Where RISC-V is very intentional on this, my reading has led me to understand that many previous CPU architectures simply passed pieces of the opcode to further hardware in the microarchitecture, so it wasn't so much a matter of designing a register system to fit in a specific bit width as of having bits 3-5 and 7-9 connected directly to the two inputs of the ALU internally, or something to that effect.  Hearsay of course, I wasn't there, but that's the explanation I've heard in the past.
>
> Now how much settling on a bit width for the register field of opcodes influences the number of registers or vice versa, hard to say.  Did Motorola want a 3 bit register field in opcodes or a resolution of 8 registers per addressing mode in the 68k first for instance, and which decision then followed?  I don't know, maybe someone does?  In fact, that makes me now wonder if there are CPUs with non-power-of-two register counts outside of the early days.  Anything else would waste values in a bitfield.
>
> - Matt G.
>
> ------- Original Message -------
> On Tuesday, December 13th, 2022 at 10:51 AM, G. Branden Robinson <g.branden.robinson@gmail.com> wrote:
>
>
>> At 2022-12-13T12:58:11-0500, Noel Chiappa wrote:
>>
>>> ... registers used to have two aspects - one now gone (and maybe
>>> the second too). The first was that the technology used to implement
>>> them (latches built out of tubes, then transistors) was faster than
>>> main memory - a distinction now mostly gone, especially since caches
>>> blur the speed distinction between today's main memory and registers.
>>> The second was that registers, being smaller in numbers, could be
>>> named with a few bits, allowing them to be named with a small share of
>>> the bits in an instruction. (This one still remains, although
>>> instructions are now so long it's probably less important.)
>>
>> Maybe less important on x86, but the amount of space in the instruction
>> for encoding registers seems to me to have played a major role in the
>> design of the RV32I/E and C (compressed) extension instruction formats
>> of RISC-V.
>>
>> https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf
>>
>> Regards,
>> Branden


[-- Attachment #2: isa.txt --]
[-- Type: text/plain, Size: 14196 bytes --]

Inst    Description             Op      a,b     6+3+3 binary      Hex
----    -----------             --      ---     ---------------   ---
00      add   r0,r0             00      0,0     00 0000 000 000 | 000
01      add   r0,r1             00      0,1     00 0000 000 001 | 001
02      add   r0,r2             00      0,2     00 0000 000 010 | 002
03      add   r1,r0             00      1,0     00 0000 001 000 | 008
04      add   r1,r1             00      1,1     00 0000 001 001 | 009
05      add   r1,r2             00      1,2     00 0000 001 010 | 00a
06      add   r2,r0             00      2,0     00 0000 010 000 | 010
07      add   r2,r1             00      2,1     00 0000 010 001 | 011
08      add   r2,r2             00      2,2     00 0000 010 010 | 012

09      add   r0,dd             01      0,7     00 0001 000 111 | 047
0a      add   r1,dd             01      1,7     00 0001 001 111 | 04f
0b      add   r2,dd             01      2,7     00 0001 010 111 | 057
0c      add   sp,dd             01      4,7     00 0001 100 111 | 067

0d      and   r0,r1             02      0,1     00 0010 000 001 | 081
0e      and   r0,r2             02      0,2     00 0010 000 010 | 082
0f      and   r1,r0             02      1,0     00 0010 001 000 | 088
10      and   r1,r2             02      1,2     00 0010 001 010 | 08a
11      and   r2,r0             02      2,0     00 0010 010 000 | 090
12      and   r2,r1             02      2,1     00 0010 010 001 | 091

13      bra   dd                03      7,7     00 0011 111 111 | 0ff

14      brf   dd                04      7,7     00 0100 111 111 | 13f

15      brt   dd                05      7,7     00 0101 111 111 | 17f

16      ceq   r0,r1             06      0,1     00 0110 000 001 | 181
17      ceq   r0,r2             06      0,2     00 0110 000 010 | 182
18      ceq   r1,r2             06      1,2     00 0110 001 010 | 18a

19      cls   r0,r1             07      0,1     00 0111 000 001 | 1c1
1a      cls   r0,r2             07      0,2     00 0111 000 010 | 1c2
1b      cls   r1,r0             07      1,0     00 0111 001 000 | 1c8
1c      cls   r1,r2             07      1,2     00 0111 001 010 | 1ca
1d      cls   r2,r0             07      2,0     00 0111 010 000 | 1d0
1e      cls   r2,r1             07      2,1     00 0111 010 001 | 1d1

1f      clu   r0,r1             08      0,1     00 1000 000 001 | 201
20      clu   r0,r2             08      0,2     00 1000 000 010 | 202
21      clu   r1,r0             08      1,0     00 1000 001 000 | 208
22      clu   r1,r2             08      1,2     00 1000 001 010 | 20a
23      clu   r2,r0             08      2,0     00 1000 010 000 | 210
24      clu   r2,r1             08      2,1     00 1000 010 001 | 211

25      jal   r1,(r0)           09      1,0     00 1001 001 000 | 248

26      jmp   (r0)              0a      0,7     00 1010 000 111 | 287
27      jmp   (r1)              0a      1,7     00 1010 001 111 | 28f
28      jmp   (r2)              0a      2,7     00 1010 010 111 | 297

29      la    r0,dddddd         0b      0,7     00 1011 000 111 | 2c7
2a      la    r1,dddddd         0b      1,7     00 1011 001 111 | 2cf
2b      la    r2,dddddd         0b      2,7     00 1011 010 111 | 2d7

2c      lb    r0,dd(r0)         0c      0,0     00 1100 000 000 | 300
2d      lb    r0,dd(r1)         0c      0,1     00 1100 000 001 | 301
2e      lb    r0,dd(r2)         0c      0,2     00 1100 000 010 | 302
2f      lb    r0,dd(fp)         0c      0,3     00 1100 000 011 | 303
30      lb    r1,dd(r0)         0c      1,0     00 1100 001 000 | 308
31      lb    r1,dd(r1)         0c      1,1     00 1100 001 001 | 309
32      lb    r1,dd(r2)         0c      1,2     00 1100 001 010 | 30a
33      lb    r1,dd(fp)         0c      1,3     00 1100 001 011 | 30b
34      lb    r2,dd(r0)         0c      2,0     00 1100 010 000 | 310
35      lb    r2,dd(r1)         0c      2,1     00 1100 010 001 | 311
36      lb    r2,dd(r2)         0c      2,2     00 1100 010 010 | 312
37      lb    r2,dd(fp)         0c      2,3     00 1100 010 011 | 313

38      lbu   r0,dd(r0)         0d      0,0     00 1101 000 000 | 340
39      lbu   r0,dd(r1)         0d      0,1     00 1101 000 001 | 341
3a      lbu   r0,dd(r2)         0d      0,2     00 1101 000 010 | 342
3b      lbu   r0,dd(fp)         0d      0,3     00 1101 000 011 | 343
3c      lbu   r1,dd(r0)         0d      1,0     00 1101 001 000 | 348
3d      lbu   r1,dd(r1)         0d      1,1     00 1101 001 001 | 349
3e      lbu   r1,dd(r2)         0d      1,2     00 1101 001 010 | 34a
3f      lbu   r1,dd(fp)         0d      1,3     00 1101 001 011 | 34b
40      lbu   r2,dd(r0)         0d      2,0     00 1101 010 000 | 350
41      lbu   r2,dd(r1)         0d      2,1     00 1101 010 001 | 351
42      lbu   r2,dd(r2)         0d      2,2     00 1101 010 010 | 352
43      lbu   r2,dd(fp)         0d      2,3     00 1101 010 011 | 353

44      lc    r0,dd             0e      0,7     00 1110 000 111 | 387
45      lc    r1,dd             0e      1,7     00 1110 001 111 | 38f
46      lc    r2,dd             0e      2,7     00 1110 010 111 | 397

47      lcu   r0,dd             0f      0,7     00 1111 000 111 | 3c7
48      lcu   r1,dd             0f      1,7     00 1111 001 111 | 3cf
49      lcu   r2,dd             0f      2,7     00 1111 010 111 | 3d7

4a      lw    r0,dd(r0)         10      0,0     01 0000 000 000 | 400
4b      lw    r0,dd(r1)         10      0,1     01 0000 000 001 | 401
4c      lw    r0,dd(r2)         10      0,2     01 0000 000 010 | 402
4d      lw    r0,dd(fp)         10      0,3     01 0000 000 011 | 403
4e      lw    r1,dd(r0)         10      1,0     01 0000 001 000 | 408
4f      lw    r1,dd(r1)         10      1,1     01 0000 001 001 | 409
50      lw    r1,dd(r2)         10      1,2     01 0000 001 010 | 40a
51      lw    r1,dd(fp)         10      1,3     01 0000 001 011 | 40b
52      lw    r2,dd(r0)         10      2,0     01 0000 010 000 | 410
53      lw    r2,dd(r1)         10      2,1     01 0000 010 001 | 411
54      lw    r2,dd(r2)         10      2,2     01 0000 010 010 | 412
55      lw    r2,dd(fp)         10      2,3     01 0000 010 011 | 413

56      mov   r0,r1             11      0,1     01 0001 000 001 | 441
57      mov   r0,r2             11      0,2     01 0001 000 010 | 442
58      mov   r0,fp             11      0,3     01 0001 000 011 | 443
59      mov   r0,sp             11      0,4     01 0001 000 100 | 444
5a      mov   r1,r0             11      1,0     01 0001 001 000 | 448
5b      mov   r1,r2             11      1,2     01 0001 001 010 | 44a
5c      mov   r1,fp             11      1,3     01 0001 001 011 | 44b
5d      mov   r1,sp             11      1,4     01 0001 001 100 | 44c
5e      mov   r2,r0             11      2,0     01 0001 010 000 | 450
5f      mov   r2,r1             11      2,1     01 0001 010 001 | 451
60      mov   r2,fp             11      2,3     01 0001 010 011 | 453
61      mov   r2,sp             11      2,4     01 0001 010 100 | 454
62      mov   r0,c              11      0,5     01 0001 000 101 | 445
63      mov   r1,c              11      1,5     01 0001 001 101 | 44d
64      mov   r2,c              11      2,5     01 0001 010 101 | 455
65      mov   fp,sp             11      3,4     01 0001 011 100 | 45c
66      mov   sp,r0             11      4,0     01 0001 100 000 | 460
67      mov   sp,r1             11      4,1     01 0001 100 001 | 461
68      mov   sp,r2             11      4,2     01 0001 100 010 | 462
69      mov   sp,fp             11      4,3     01 0001 100 011 | 463

6a      mul   r0,r0             12      0,0     01 0010 000 000 | 480
6b      mul   r0,r1             12      0,1     01 0010 000 001 | 481
6c      mul   r0,r2             12      0,2     01 0010 000 010 | 482
6d      mul   r1,r0             12      1,0     01 0010 001 000 | 488
6e      mul   r1,r1             12      1,1     01 0010 001 001 | 489
6f      mul   r1,r2             12      1,2     01 0010 001 010 | 48a
70      mul   r2,r0             12      2,0     01 0010 010 000 | 490
71      mul   r2,r1             12      2,1     01 0010 010 001 | 491
72      mul   r2,r2             12      2,2     01 0010 010 010 | 492

73      or    r0,r1             13      0,1     01 0011 000 001 | 4c1
74      or    r0,r2             13      0,2     01 0011 000 010 | 4c2
75      or    r1,r0             13      1,0     01 0011 001 000 | 4c8
76      or    r1,r2             13      1,2     01 0011 001 010 | 4ca
77      or    r2,r0             13      2,0     01 0011 010 000 | 4d0
78      or    r2,r1             13      2,1     01 0011 010 001 | 4d1

79      pop     r0              14      0,4     01 0100 000 100 | 504
7a      pop     r1              14      1,4     01 0100 001 100 | 50c
7b      pop     r2              14      2,4     01 0100 010 100 | 514
7c      pop     fp              14      3,4     01 0100 011 100 | 51c

7d      push    r0              15      0,4     01 0101 000 100 | 544
7e      push    r1              15      1,4     01 0101 001 100 | 54c
7f      push    r2              15      2,4     01 0101 010 100 | 554
80      push    fp              15      3,4     01 0101 011 100 | 55c

81      sb    r0,dd(r1)         16      0,1     01 0110 000 001 | 581
82      sb    r0,dd(r2)         16      0,2     01 0110 000 010 | 582
83      sb    r0,dd(fp)         16      0,3     01 0110 000 011 | 583
84      sb    r1,dd(r0)         16      1,0     01 0110 001 000 | 588
85      sb    r1,dd(r2)         16      1,2     01 0110 001 010 | 58a
86      sb    r1,dd(fp)         16      1,3     01 0110 001 011 | 58b
87      sb    r2,dd(r0)         16      2,0     01 0110 010 000 | 590
88      sb    r2,dd(r1)         16      2,1     01 0110 010 001 | 591
89      sb    r2,dd(fp)         16      2,3     01 0110 010 011 | 593

8a      shl   r0,r1             17      0,1     01 0111 000 001 | 5c1
8b      shl   r0,r2             17      0,2     01 0111 000 010 | 5c2
8c      shl   r1,r0             17      1,0     01 0111 001 000 | 5c8
8d      shl   r1,r2             17      1,2     01 0111 001 010 | 5ca
8e      shl   r2,r0             17      2,0     01 0111 010 000 | 5d0
8f      shl   r2,r1             17      2,1     01 0111 010 001 | 5d1

90      sra   r0,r1             18      0,1     01 1000 000 001 | 601
91      sra   r0,r2             18      0,2     01 1000 000 010 | 602
92      sra   r1,r0             18      1,0     01 1000 001 000 | 608
93      sra   r1,r2             18      1,2     01 1000 001 010 | 60a
94      sra   r2,r0             18      2,0     01 1000 010 000 | 610
95      sra   r2,r1             18      2,1     01 1000 010 001 | 611

96      srl   r0,r1             19      0,1     01 1001 000 001 | 641
97      srl   r0,r2             19      0,2     01 1001 000 010 | 642
98      srl   r1,r0             19      1,0     01 1001 001 000 | 648
99      srl   r1,r2             19      1,2     01 1001 001 010 | 64a
9a      srl   r2,r0             19      2,0     01 1001 010 000 | 650
9b      srl   r2,r1             19      2,1     01 1001 010 001 | 651

9c      sub   r0,r1             1a      0,1     01 1010 000 001 | 681
9d      sub   r0,r2             1a      0,2     01 1010 000 010 | 682
9e      sub   r1,r0             1a      1,0     01 1010 001 000 | 688
9f      sub   r1,r2             1a      1,2     01 1010 001 010 | 68a
a0      sub   r2,r0             1a      2,0     01 1010 010 000 | 690
a1      sub   r2,r1             1a      2,1     01 1010 010 001 | 691

a2      sub   sp,dddddd         1b      4,7     01 1011 100 111 | 6e7

a3      sw    r0,dd(r0)         1c      0,0     01 1100 000 000 | 700
a4      sw    r0,dd(r1)         1c      0,1     01 1100 000 001 | 701
a5      sw    r0,dd(r2)         1c      0,2     01 1100 000 010 | 702
a6      sw    r0,dd(fp)         1c      0,3     01 1100 000 011 | 703
a7      sw    r1,dd(r0)         1c      1,0     01 1100 001 000 | 708
a8      sw    r1,dd(r1)         1c      1,1     01 1100 001 001 | 709
a9      sw    r1,dd(r2)         1c      1,2     01 1100 001 010 | 70a
aa      sw    r1,dd(fp)         1c      1,3     01 1100 001 011 | 70b
ab      sw    r2,dd(r0)         1c      2,0     01 1100 010 000 | 710
ac      sw    r2,dd(r1)         1c      2,1     01 1100 010 001 | 711
ad      sw    r2,dd(r2)         1c      2,2     01 1100 010 010 | 712
ae      sw    r2,dd(fp)         1c      2,3     01 1100 010 011 | 713

af      sxt   r0,r0             1d      0,0     01 1101 000 000 | 740
b0      sxt   r0,r1             1d      0,1     01 1101 000 001 | 741
b1      sxt   r0,r2             1d      0,2     01 1101 000 010 | 742
b2      sxt   r1,r0             1d      1,0     01 1101 001 000 | 748
b3      sxt   r1,r1             1d      1,1     01 1101 001 001 | 749
b4      sxt   r1,r2             1d      1,2     01 1101 001 010 | 74a
b5      sxt   r2,r0             1d      2,0     01 1101 010 000 | 750
b6      sxt   r2,r1             1d      2,1     01 1101 010 001 | 751
b7      sxt   r2,r2             1d      2,2     01 1101 010 010 | 752

b8      xor   r0,r1             1e      0,1     01 1110 000 001 | 781
b9      xor   r0,r2             1e      0,2     01 1110 000 010 | 782
ba      xor   r1,r0             1e      1,0     01 1110 001 000 | 788
bb      xor   r1,r2             1e      1,2     01 1110 001 010 | 78a
bc      xor   r2,r0             1e      2,0     01 1110 010 000 | 790
bd      xor   r2,r1             1e      2,1     01 1110 010 001 | 791

be      zxt   r0,r0             1f      0,0     01 1111 000 000 | 7c0
bf      zxt   r0,r1             1f      0,1     01 1111 000 001 | 7c1
c0      zxt   r0,r2             1f      0,2     01 1111 000 010 | 7c2
c1      zxt   r1,r0             1f      1,0     01 1111 001 000 | 7c8
c2      zxt   r1,r1             1f      1,1     01 1111 001 001 | 7c9
c3      zxt   r1,r2             1f      1,2     01 1111 001 010 | 7ca
c4      zxt   r2,r0             1f      2,0     01 1111 010 000 | 7d0
c5      zxt   r2,r1             1f      2,1     01 1111 010 001 | 7d1
c6      zxt   r2,r2             1f      2,2     01 1111 010 010 | 7d2

=== Late additions ===

c7      jmp   dddddd            0b      7,7     00 1011 111 111 | 2ff


* [TUHS] Re: Clever code
  2022-12-13 20:14   ` segaloco via TUHS
@ 2022-12-13 20:58     ` Warren Toomey via TUHS
  2022-12-14  2:28     ` Luther Johnson
  1 sibling, 0 replies; 31+ messages in thread
From: Warren Toomey via TUHS @ 2022-12-13 20:58 UTC (permalink / raw)
  To: tuhs; +Cc: coff

I think we might move the discussion on memory technologies, CPU 
architectures etc. to COFF 😁

Thanks, Warren




* [TUHS] Re: Clever code
  2022-12-13 18:51 ` G. Branden Robinson
@ 2022-12-13 20:14   ` segaloco via TUHS
  2022-12-13 20:58     ` Warren Toomey via TUHS
  2022-12-14  2:28     ` Luther Johnson
  0 siblings, 2 replies; 31+ messages in thread
From: segaloco via TUHS @ 2022-12-13 20:14 UTC (permalink / raw)
  To: G. Branden Robinson; +Cc: tuhs

Where RISC-V is very intentional on this, my reading has led me to understand that many previous CPU architectures simply passed pieces of the opcode to further hardware in the microarchitecture, so it wasn't so much a matter of designing a register system to fit in a specific bit width as of having bits 3-5 and 7-9 connected directly to the two inputs of the ALU internally, or something to that effect.  Hearsay of course, I wasn't there, but that's the explanation I've heard in the past.

Now how much settling on a bit width for the register field of opcodes influences the number of registers or vice versa, hard to say.  Did Motorola want a 3 bit register field in opcodes or a resolution of 8 registers per addressing mode in the 68k first for instance, and which decision then followed?  I don't know, maybe someone does?  In fact, that makes me now wonder if there are CPUs with non-power-of-two register counts outside of the early days.  Anything else would waste values in a bitfield.

- Matt G.

------- Original Message -------
On Tuesday, December 13th, 2022 at 10:51 AM, G. Branden Robinson <g.branden.robinson@gmail.com> wrote:


> At 2022-12-13T12:58:11-0500, Noel Chiappa wrote:
> 
> > ... registers used to have two aspects - one now gone (and maybe
> > the second too). The first was that the technology used to implement
> > them (latches built out of tubes, then transistors) was faster than
> > main memory - a distinction now mostly gone, especially since caches
> > blur the speed distinction between today's main memory and registers.
> > The second was that registers, being smaller in numbers, could be
> > named with a few bits, allowing them to be named with a small share of
> > the bits in an instruction. (This one still remains, although
> > instructions are now so long it's probably less important.)
> 
> 
> Maybe less important on x86, but the amount of space in the instruction
> for encoding registers seems to me to have played a major role in the
> design of the RV32I/E and C (compressed) extension instruction formats
> of RISC-V.
> 
> https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf
> 
> Regards,
> Branden


* [TUHS] Re: Clever code
  2022-12-13 17:58 Noel Chiappa
@ 2022-12-13 18:51 ` G. Branden Robinson
  2022-12-13 20:14   ` segaloco via TUHS
  0 siblings, 1 reply; 31+ messages in thread
From: G. Branden Robinson @ 2022-12-13 18:51 UTC (permalink / raw)
  To: tuhs

At 2022-12-13T12:58:11-0500, Noel Chiappa wrote:
> ... registers used to have two aspects - one now gone (and maybe
> the second too). The first was that the _technology_ used to implement
> them (latches built out of tubes, then transistors) was faster than
> main memory - a distinction now mostly gone, especially since caches
> blur the speed distinction between today's main memory and registers.
> The second was that registers, being smaller in numbers, could be
> named with a few bits, allowing them to be named with a small share of
> the bits in an instruction. (This one still remains, although
> instructions are now so long it's probably less important.)

Maybe less important on x86, but the amount of space in the instruction
for encoding registers seems to me to have played a major role in the
design of the RV32I/E and C (compressed) extension instruction formats
of RISC-V.

https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf

Regards,
Branden


* [TUHS] Re: Clever code
@ 2022-12-13 18:02 Noel Chiappa
  0 siblings, 0 replies; 31+ messages in thread
From: Noel Chiappa @ 2022-12-13 18:02 UTC (permalink / raw)
  To: tuhs; +Cc: jnc

    > From: Bakul Shah

    > one dealt with it by formatting the disk so that the logical blocks N &
    > N+1 (from the OS PoV) were physically more than 1 sector apart. No
    > clever coding needed!

An old hack. ('Nothing new', and all that.) DEC RX01/02 floppies used the
same thing, circa 1976.
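
The layout described in the quoted text can be sketched roughly as below. This is a minimal illustration, not the formatter of any particular drive: the function name, the sector count and the interleave factor are all assumptions chosen for the example.

```python
def interleave_layout(sectors: int, interleave: int) -> list[int]:
    """Return layout[physical_slot] = logical sector number.

    Consecutive logical sectors are placed `interleave` physical slots
    apart, so the host gets some processing time between them.
    """
    layout = [None] * sectors
    slot = 0
    for logical in range(sectors):
        # If the target slot is already taken, slide forward to a free one.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors
        layout[slot] = logical
        slot = (slot + interleave) % sectors
    return layout


# Hypothetical track with 8 sectors and a 2:1 interleave.
print(interleave_layout(8, 2))
```

With an interleave of 1 the layout is simply sequential; larger factors trade peak transfer rate for more time per sector.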

	Noel

* [TUHS] Re: Clever code
@ 2022-12-13 17:58 Noel Chiappa
  2022-12-13 18:51 ` G. Branden Robinson
  0 siblings, 1 reply; 31+ messages in thread
From: Noel Chiappa @ 2022-12-13 17:58 UTC (permalink / raw)
  To: tuhs; +Cc: jnc

    > From: Stuff Received

    > I had always thought of a delay line as a precursor to a register (or
    > stack) for storing intermediate results. Is this not an accurate way of
    > thinking about it?

No, not at all.

First: delay lines were a memory _technology_ (one that was inherently
serial, not random-access). They preceded all others.

Second: registers used to have two aspects - one now gone (and maybe the
second too). The first was that the _technology_ used to implement them
(latches built out of tubes, then transistors) was faster than main memory -
a distinction now mostly gone, especially since caches blur the speed
distinction between today's main memory and registers. The second was that
registers, being few in number, could be named with a few bits, allowing
them to be named with a small share of the bits in an instruction. (This one
still remains, although instructions are now so long it's probably less
important.)

Some delay-line machines had two different delay line sizes (since a line's
length sets its average access time) - what one might consider 'registers' were
kept in the small ones, for fast access at all times, whereas main memory
used the longer ones.
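
To make the size/latency relation concrete, here is a small sketch. It assumes a simple model in which a wanted word is equally likely to be any number of word-times away from the read point; the line lengths and word time are invented for illustration and do not describe any specific machine.

```python
def average_wait(words_in_line: int, word_time_us: float) -> float:
    """Mean wait, in microseconds, for a random word in a recirculating line.

    The wanted word may be 0, 1, ..., words_in_line - 1 word-times away
    from the read point, each equally likely under this model.
    """
    waits = [k * word_time_us for k in range(words_in_line)]
    return sum(waits) / words_in_line


# Hypothetical sizes: a one-word 'register' line and a 32-word 'main' line.
print(average_wait(1, 1.0))
print(average_wait(32, 1.0))
```

Under this model the average wait grows linearly with the length of the line, so a short line behaves like a fast register and a long line like slower bulk storage.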

	Noel

* [TUHS] Re: Clever code
  2022-12-12  2:15 ` [TUHS] Clever code (was " Bakul Shah
@ 2022-12-12  9:48   ` Michael Kjörling
  0 siblings, 0 replies; 31+ messages in thread
From: Michael Kjörling @ 2022-12-12  9:48 UTC (permalink / raw)
  To: tuhs

On 11 Dec 2022 18:15 -0800, from bakul@iitbombay.org (Bakul Shah):
> Agree that clear code is preferable to complicated code. But in practice
> people sacrifice clarity for performance improvement all the time. Look
> at the kernel code of any modern os. Everybody pays lip service to this
> but most anything other than toy programs ends up getting needlessly
> complicated over time.

Performant code does not need to stand in opposition to clear code.
And if you are writing performance-critical or size-critical code
(such as the QNX microkernel that someone brought up), then of course
you might end up needing to do things in non-obvious ways; that's not
the point here. Clever is fine IMO _where the cleverness provides an
actual benefit in a real-world scenario_, and an operating system
kernel (and a language standard library) is a rather special type of
program where sometimes you have to do things in somewhat non-obvious
ways because of the environment the code is meant to execute in. But a
significant portion of the time, where I see "clever" code there is
_no_ significant benefit to the cleverness; it's often more about
"showing off", or saving a few source code characters at the expense
of at-a-glance readability, than it is about actual usefulness and
necessity.

Necessary complexity in order to solve the problem, combined with
things like performance requirements, may sometimes require clever
code. At that point, there is a benefit to the cleverness. But one can
still aim to write the clever code _clearly_, with everything from
comments to good variable and function names to formatting that
enhances readability, for example by grouping related operations
together in the source code and keeping them separate from other
code. One simple example: I often add extra parentheses over and
beyond what's actually required by operator precedence, because doing
so makes it clearer what goes together, and unless the compiler is
severely braindead by modern standards, doing so costs _nothing_ past
the compilation stage.
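
A small sketch of the parenthesis point, written in Python for convenience (for the three operators used here, the relative precedence happens to match C). The function names and the byte-packing task are made up for the example; the comparison at the end checks that both spellings parse to the same syntax tree.

```python
import ast


def pack_terse(hi: int, lo: int) -> int:
    # Correct, but the reader has to recall that << binds tighter than &,
    # and & tighter than |.
    return hi << 8 | lo & 0xFF


def pack_clear(hi: int, lo: int) -> int:
    # Same expression with the grouping spelled out.
    return (hi << 8) | (lo & 0xFF)


# The redundant parentheses disappear at parse time: both spellings
# produce an identical syntax tree.
terse_tree = ast.dump(ast.parse("hi << 8 | lo & 0xFF"))
clear_tree = ast.dump(ast.parse("(hi << 8) | (lo & 0xFF)"))
same_tree = terse_tree == clear_tree

print(pack_terse(0x12, 0x34) == pack_clear(0x12, 0x34), same_tree)
```

The extra characters only exist in the source text, which is where the reader benefits from them.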

-- 
✍  Michael Kjörling                  🏡 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”

