The Unix Heritage Society mailing list
* [TUHS] Discuss of style and design of computer programs from a
@ 2017-05-07  0:51 Nemo
  2017-05-08 13:39 ` Tony Finch
  0 siblings, 1 reply; 14+ messages in thread
From: Nemo @ 2017-05-07  0:51 UTC (permalink / raw)


On 6 May 2017 at 11:23, ron minnich <rminnich at gmail.com> wrote (in part):
[...]
> Lest you think things are better now, Linux uses self modifying code to
> optimize certain critical operations, and at one talk I heard the speaker
> say that he'd like to put more self modifying code into Linux, "because it's
> fun". Oh boy.

Fun, indeed!  Even self-modifying chips are being touted -- Yikes!

N.



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-07  0:51 [TUHS] Discuss of style and design of computer programs from a Nemo
@ 2017-05-08 13:39 ` Tony Finch
  2017-05-08 16:21   ` Steve Johnson
  0 siblings, 1 reply; 14+ messages in thread
From: Tony Finch @ 2017-05-08 13:39 UTC (permalink / raw)


Nemo <cym224 at gmail.com> wrote:
> On 6 May 2017 at 11:23, ron minnich <rminnich at gmail.com> wrote (in part):
> [...]
> > Lest you think things are better now, Linux uses self modifying code to
> > optimize certain critical operations, and at one talk I heard the speaker
> > say that he'd like to put more self modifying code into Linux, "because it's
> > fun". Oh boy.
>
> Fun, indeed!  Even self-modifying chips are being touted -- Yikes!

You reminded me of these comments on a bug in NVidia's Tegra
"Project Denver" dynamic JIT firmware:

https://twitter.com/FioraAeterna/status/855445075341398017
>
> small brain: bug in your code
> big brain:  bug in the compiler
> cosmic brain: bug in the cpu's on-chip recompiler
> https://github.com/golang/go/issues/19809#issuecomment-290804472

https://twitter.com/eqe/status/855533948931252224
>
> This happened with TransMeta back in the day, and now with Tegra. I
> wonder if NVidia has a update deployment strategy...

(The marginal topical relevance is that Linus Torvalds worked for Transmeta.)

Tony.
-- 
f.anthony.n.finch  <dot at dotat.at>  http://dotat.at/  -  I xn--zr8h punycode
Lundy, Fastnet, Irish Sea, Shannon, Rockall, Malin, South Hebrides: Easterly
or northeasterly 4 or 5, occasionally 6 at first, becoming variable 3 at times
later. Slight or moderate. Fair. Good.



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-08 13:39 ` Tony Finch
@ 2017-05-08 16:21   ` Steve Johnson
  2017-05-08 17:01     ` Dan Cross
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Johnson @ 2017-05-08 16:21 UTC (permalink / raw)



That's a very interesting bug, and Fiora is a genius...

I too worked at Transmeta, although our recompiler was not on chip...

We designed the chip originally based on SPEC benchmarks.  This turned
out to be a mistake, because Windows doesn't look a bit like SPEC.

For one thing, when booting Windows, 50% of all the instructions were
executed only once.  When we found all of these instructions and JITted
their basic blocks, it ran like a turtle.  The solution was to put in an
x86 interpreter and only JIT once an instruction had been executed 5
times.  The resulting performance was more than acceptable.  But there
was a problem demonstrating this with a number of benchmarks, namely
those that:
  1. Ran a small problem and timed it.
  2. Based on that, made a bigger problem scaled by the result of step 1
     -- the slower the small problem, the smaller the big problem.
  3. Added 1 and 2 to get the score.

The small problem often ran mostly in the interpreter, so the "big"
problem was really pretty small.  That ran fast, but overall, step 1
dominated.

If we actually ran the big problem from the get-go, we looked pretty
decent.
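
A toy C sketch of the count-then-compile policy described above:
interpret a block until it has executed five times, then switch to the
JITted translation.  The block structure, counters and printouts here
are illustrative stand-ins, not Transmeta's actual code morphing
software.

    #include <stdio.h>

    /* Toy interpret-then-JIT dispatcher: below the threshold we
     * "interpret" the block; once it has run five times we "translate"
     * it and run the translation from then on. */
    #define HOT_THRESHOLD 5

    struct block {
        int id;
        int exec_count;
        int translated;          /* 1 once the block has been "JITted" */
    };

    static void run_block(struct block *b)
    {
        if (b->translated) {
            printf("block %d: run native translation\n", b->id);
            return;
        }
        printf("block %d: interpret (execution %d)\n",
               b->id, b->exec_count + 1);
        if (++b->exec_count >= HOT_THRESHOLD) {
            printf("block %d: hot, handing to the JIT\n", b->id);
            b->translated = 1;
        }
    }

    int main(void)
    {
        struct block boot_block = { 1, 0, 0 };
        for (int i = 0; i < 8; i++)
            run_block(&boot_block);  /* interpreted 5 times, then "native" */
        return 0;
    }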

Among the interesting facts we learned: Windows executed a billion
instructions between detecting a reason to crash and putting up the
blue screen...

We also ran into problems with self-modifying code.  To their credit,
Microsoft quickly fixed the things we pointed out (most were in
third-party drivers).

Transmeta actually had some of the most interesting technology I've
ever worked with as a compiler writer, including hardware
checkpoint/restart that let the JIT do out-of-order execution and then
do it over serially if a fault appeared.
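
A toy C sketch of that rollback idea, with a made-up two-operation
block and an explicit aliasing check standing in for the hardware
fault; the real mechanism lived in silicon and was far more general.

    #include <stdio.h>

    /* Checkpoint/rollback sketch.  The "block" in program order is:
     *     mem[r1] = 7;        store
     *     r2 = mem[2];        load
     * The reordered translation hoists the load above the store, which
     * is only safe if the store does not hit mem[2]; if it does, it
     * reports a fault so the caller can roll back and redo the block
     * serially. */
    struct state { int mem[4]; int r1; int r2; };

    static int run_reordered(struct state *s)
    {
        int loaded = s->mem[2];        /* speculative early load */
        s->mem[s->r1] = 7;
        if (s->r1 == 2)
            return 1;                  /* alias detected: fault */
        s->r2 = loaded;
        return 0;
    }

    static void run_serial(struct state *s)
    {
        s->mem[s->r1] = 7;             /* original program order */
        s->r2 = s->mem[2];
    }

    static void run_translated(struct state *s)
    {
        struct state checkpoint = *s;  /* hardware-style snapshot */
        if (run_reordered(s) != 0) {   /* speculation failed */
            *s = checkpoint;           /* roll every effect back */
            run_serial(s);             /* do it over serially */
        }
    }

    int main(void)
    {
        struct state s = { {1, 2, 3, 4}, 2, 0 };  /* r1 == 2, so it aliases */
        run_translated(&s);
        printf("mem[2]=%d r2=%d\n", s.mem[2], s.r2);  /* 7 and 7 */
        return 0;
    }
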
Steve




* [TUHS] Discuss of style and design of computer programs from a
  2017-05-08 16:21   ` Steve Johnson
@ 2017-05-08 17:01     ` Dan Cross
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Cross @ 2017-05-08 17:01 UTC (permalink / raw)


On Mon, May 8, 2017 at 12:21 PM, Steve Johnson <scj at yaccman.com> wrote:

> That's a very interesting bug, and Fiora is a genius...
>

That bug actually came from Dave Chase at Google. He mentioned it to me in
passing when he found it, "Hey, want to hear about this crazy hardware bug
we found?" I remember just sort of looking at him with this, "you've got to
be kidding me..." expression.

        - Dan C.



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 21:45       ` Michael Kjörling
@ 2017-05-07  7:42         ` Stephen Kitt
  0 siblings, 0 replies; 14+ messages in thread
From: Stephen Kitt @ 2017-05-07  7:42 UTC (permalink / raw)



On Sat, 6 May 2017 21:45:19 +0000, Michael Kjörling <michael at kjorling.se>
wrote:
> On 6 May 2017 16:00 -0400, from usotsuki at buric.co (Steve Nickolas):
> > In 6502 code, it's not uncommon to do something like
> > 
> > foo1:     lda      #$00
> >           .byte    $2C       ; 3-byte BIT
> > foo2:     lda      #$01
> >            .
> >            .
> >            .
> > 
> > to save a byte (and probably still done for the few who write in
> > ASM). The "2C" operand would cause it to disassemble as something
> > like...
> > 
> > 1000-     LDA      #$00
> > 1002-     BIT      $01A9
> > 
> > which is the route you'd go down if you called "foo1".  Apart
> > diddling a few CPU flags, and an unneeded read on $01A9, harmless.  
> 
> You could do something quite similar on the 8086, which I am somewhat
> more familiar with.
[...]

One possible 16-bit x86 equivalent is

	foo1:	mov al, 0
		db 035h		; XOR AX, imm16
	foo2:	mov al, 1

which implements a branchless fall-through. (But you’d probably just use
SALC...)

> Or, something slightly more "useful"
> 
>         jmp short $+3
>         int 0 ; hex: cd 00
>         db 0
>         jz somewhere
> 
> which, again IIRC, would clear the accumulator (AX) register because
> 0000 hex is XOR AX,AX,

According to my disassembler 0000 is "add byte ptr [bx+si], al"; "xor ax, ax"
is either 31c0 or 33c0.

> which as a side effect sets the zero flag
> because the result is zero, and then executes a conditional jump if
> the zero flag is set (which it will be). In other words, it _executes_
> as
> 
>         jmp short l1
>         db 0cdh ; dummy byte
>     l1: xor ax,ax ; hex: 00 00
>         jz somewhere
> 
> but a naiive decompiler will see the CDh byte after the jump and take
> that as the first byte of the two-byte interrupt instruction. I don't
> know what 00 followed by the first byte of the JZ instruction would
> be, but probably a register/register XOR with a register store. Make
> it a far jump and you have a few extra bytes to play with, to befuddle
> anyone trying to figure out just what on Earth your code is really
> doing.

My memory of disassembling x86 code in the early 90s, at least, is that,
because of all this, disassemblers were pretty good at restarting streams
from jump destinations, so it never really was an issue. You need to get
creative with single-stepping mode and self-decrypting code to really annoy
people trying to understand your code ;-).

> The above, of course, is really just scratching the surface. With how
> many variable-length instructions the x86 has, with careful selection
> of opcodes and possibly use of undocumented but functionally valid
> combinations, I wouldn't be the least bit surprised if it's
> technically possible to write a 8086 program that does one thing when
> started normally, and something utterly different but still useful
> when started at an offset that results in execution beginning in the
> middle of an instruction. Bonus points if the first thing that program
> does is jump into itself _at that offset_. Now, the _work involved in
> doing so_, never mind maintaining it...

I have seen some short sections of code which do this. The x86 ISA is so rich
that you can make it adapt to many different constraints — one I rather liked
is assembling code which is valid ASCII (amaze your friends by typing in
a .COM using COPY CON!). There’s a C compiler which does that on GitHub
somewhere.

> Of course, I wouldn't do anything like the above in any real-world
> code base meant to run on modern systems unless obfuscation really
> _was_ a design goal. Honest. If I was tasked to write something that
> really needed to do something non-trivial in 16 KiB RAM on an original
> IBM 5150 PC or clone? I'd probably spend quite a bit of time in front
> of the whiteboard and with reference manuals before writing even one
> line of code...

Yes, and combine that with assemblers which output all possible encodings of
a given instruction!

Regards,

Stephen



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 20:00     ` Steve Nickolas
@ 2017-05-06 21:45       ` Michael Kjörling
  2017-05-07  7:42         ` Stephen Kitt
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Kjörling @ 2017-05-06 21:45 UTC (permalink / raw)



On 6 May 2017 16:00 -0400, from usotsuki at buric.co (Steve Nickolas):
> In 6502 code, it's not uncommon to do something like
> 
> foo1:     lda      #$00
>           .byte    $2C       ; 3-byte BIT
> foo2:     lda      #$01
>            .
>            .
>            .
> 
> to save a byte (and probably still done for the few who write in
> ASM). The "2C" operand would cause it to disassemble as something
> like...
> 
> 1000-     LDA      #$00
> 1002-     BIT      $01A9
> 
> which is the route you'd go down if you called "foo1".  Apart
> diddling a few CPU flags, and an unneeded read on $01A9, harmless.

You could do something quite similar on the 8086, which I am somewhat
more familiar with. For example, in some kind of pseudo 8086
assembler, with $ denoting the value of the instruction pointer at the
beginning of the current instruction:

        jmp short $+3
        int 90h ; hex: cd 90

would almost certainly decompile to the above, if the decompiler
doesn't barf when it can't create a jump target label mid-instruction,
but it would certainly _execute_ as

        jmp short l1
        db 0cdh ; dummy byte
    l1: nop ; hex: 90

resulting in a jump over one byte followed by a no-op instruction. Of
course, invoking interrupt 90h would be a perfectly legal thing to do,
assuming that your interrupt tables are set up correctly. If you want
to mess with someone trying to figure out what the code is doing,
write a (nonsense or meaningful) value to interrupt vector 90h before
you do that. For example, assuming that it isn't being used, you could
point interrupt 90h at the reboot jump location, which IIRC on the IBM
PC would mean point it at FFFFh:FFF0h or absolute address FFFF0h.
Anyone trying to actually execute the INT 90h instruction would see
their computer reboot, but anyone actually executing the code would
see little more than a prefetch-queue flush due to the jump.

Or, something slightly more "useful"

        jmp short $+3
        int 0 ; hex: cd 00
        db 0
        jz somewhere

which, again IIRC, would clear the accumulator (AX) register because
0000 hex is XOR AX,AX, which as a side effect sets the zero flag
because the result is zero, and then executes a conditional jump if
the zero flag is set (which it will be). In other words, it _executes_
as

        jmp short l1
        db 0cdh ; dummy byte
    l1: xor ax,ax ; hex: 00 00
        jz somewhere

but a naive decompiler will see the CDh byte after the jump and take
that as the first byte of the two-byte interrupt instruction. I don't
know what 00 followed by the first byte of the JZ instruction would
be, but probably a register/register XOR with a register store. Make
it a far jump and you have a few extra bytes to play with, to befuddle
anyone trying to figure out just what on Earth your code is really
doing.

The above, of course, is really just scratching the surface. With how
many variable-length instructions the x86 has, with careful selection
of opcodes and possibly use of undocumented but functionally valid
combinations, I wouldn't be the least bit surprised if it's
technically possible to write an 8086 program that does one thing when
started normally, and something utterly different but still useful
when started at an offset that results in execution beginning in the
middle of an instruction. Bonus points if the first thing that program
does is jump into itself _at that offset_. Now, the _work involved in
doing so_, never mind maintaining it...

Of course, I wouldn't do anything like the above in any real-world
code base meant to run on modern systems unless obfuscation really
_was_ a design goal. Honest. If I was tasked to write something that
really needed to do something non-trivial in 16 KiB RAM on an original
IBM 5150 PC or clone? I'd probably spend quite a bit of time in front
of the whiteboard and with reference manuals before writing even one
line of code...

-- 
Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se
                 “People who think they know everything really annoy
                 those of us who know we don’t.” (Bjarne Stroustrup)



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 15:20   ` Michael Kjörling
  2017-05-06 15:24     ` Larry McVoy
@ 2017-05-06 20:00     ` Steve Nickolas
  2017-05-06 21:45       ` Michael Kjörling
  1 sibling, 1 reply; 14+ messages in thread
From: Steve Nickolas @ 2017-05-06 20:00 UTC (permalink / raw)



On Sat, 6 May 2017, Michael Kjörling wrote:

> On 6 May 2017 08:09 -0700, from corey at lod.com (Corey Lindsly):
>> Anyway, I reached one point in the assembly code that I simply could not
>> understand. It seemed like a mistake, and I went through it again and
>> again until I finally realized what it was doing. There was a branch/loop
>> that jumped to the middle of a multi-byte machine instruction, so that
>> branch had to be disassembled and stepped separately until it "synced" up
>> with the other branch again. Maybe this is standard practice in
>> programming (I don't know) but at the time I thought, what kind of evil
>> genius devised this to save a few bytes of memory?
>
> IIRC, that _was_ a common trick at least on machines of that class. It
> did have the potential to save a few bytes, yes (more if the
> instructions were such that you'd get some _other, desired_, behavior
> by jumping into the middle of one with some specific state), but it
> also foiled lots of disassemblers: Simply disassembling a binary from
> start to finish would yield nonsense in those locations, as you
> experienced. It thus basically forced you to single-step those
> instructions to figure out what was going on from the binary.
>
> I'm pretty sure it works on every architecture with variable-length
> instructions and arbitrary jump capability, as long as you have
> control over the specific machine instructions generated (such as if
> you are programming in assembler). Of course, it _is_ also a total
> nightmare to maintain such code.
>
> I would absolutely not say that doing something like that is standard
> practice in modern programming. Even in microcontrollers, where
> program and data memory can be scarce even today, I would argue that
> the costs would not outweigh the benefits by a long shot.

In 6502 code, it's not uncommon to do something like

foo1:     lda      #$00
           .byte    $2C       ; 3-byte BIT
foo2:     lda      #$01
            .
            .
            .

to save a byte (and probably still done for the few who write in ASM). 
The $2C byte would cause it to disassemble as something like...

1000-     LDA      #$00
1002-     BIT      $01A9

which is the route you'd go down if you called "foo1".  Apart from 
diddling a few CPU flags, and an unneeded read on $01A9, harmless.

(Most 6502 programmers would probably see a strange BIT instruction as an 
attempt to do this.)

It's probably not a good idea to still do this unless you're really REALLY 
crunched for space.

-uso.



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 15:51       ` Michael Kjörling
@ 2017-05-06 15:53         ` Larry McVoy
  0 siblings, 0 replies; 14+ messages in thread
From: Larry McVoy @ 2017-05-06 15:53 UTC (permalink / raw)


On Sat, May 06, 2017 at 03:51:24PM +0000, Michael Kjörling wrote:
> On 6 May 2017 08:24 -0700, from lm at mcvoy.com (Larry McVoy):
> >> I would absolutely not say that doing something like that is standard
> >> practice in modern programming. Even in microcontrollers, where
> >> program and data memory can be scarce even today, I would argue that
> >> the costs would not outweigh the benefits by a long shot.
> > 
> > It strikes as being similar to Duff's device (1).  Which is a niche thing
> > but I still use that from time to time.  Not to save memory, just because
> > as a C programmer it seems pretty natural to do it.
> > 
> > --lm
> > 
> > (1) https://en.wikipedia.org/wiki/Duff's_device
> 
> I disagree; loop unrolling and jumping to the beginning of some
> instruction inside that unrolled loop is not at all the same thing as
> jumping _into the middle of a machine language instruction_.

That's fine, I feel no need to argue about it.  Seemed similar to me but
I'm not the sharpest tool in the shed :)



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 15:24     ` Larry McVoy
@ 2017-05-06 15:51       ` Michael Kjörling
  2017-05-06 15:53         ` Larry McVoy
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Kjörling @ 2017-05-06 15:51 UTC (permalink / raw)



On 6 May 2017 08:24 -0700, from lm at mcvoy.com (Larry McVoy):
>> I would absolutely not say that doing something like that is standard
>> practice in modern programming. Even in microcontrollers, where
>> program and data memory can be scarce even today, I would argue that
>> the costs would not outweigh the benefits by a long shot.
> 
> It strikes as being similar to Duff's device (1).  Which is a niche thing
> but I still use that from time to time.  Not to save memory, just because
> as a C programmer it seems pretty natural to do it.
> 
> --lm
> 
> (1) https://en.wikipedia.org/wiki/Duff's_device

I disagree; loop unrolling and jumping to the beginning of some
instruction inside that unrolled loop is not at all the same thing as
jumping _into the middle of a machine language instruction_.

-- 
Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se
                 “People who think they know everything really annoy
                 those of us who know we don’t.” (Bjarne Stroustrup)



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 15:23   ` ron minnich
@ 2017-05-06 15:44     ` Michael Kjörling
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Kjörling @ 2017-05-06 15:44 UTC (permalink / raw)



On 6 May 2017 15:23 +0000, from rminnich at gmail.com (ron minnich):
> This is why the things like the Therac 25 happened ...
> https://en.wikipedia.org/wiki/Therac-25

Or the Ariane 5 flight 501, which according to Wikipedia cost more
than 370 million dollars (and resulted in the total loss of the
spacecraft). I would say a few million or even a few tens of millions
of dollars to double-check the software, or even write new software
specifically designed for the Ariane 5 rather than reuse software
designed for the Ariane 4 outside of that software's design limits,
might not have been a bad way to spend money there.

And that was in 1996. Hardly that long ago.

https://en.wikipedia.org/wiki/Ariane_5#Notable_launches

https://en.wikipedia.org/wiki/Cluster_%28spacecraft%29

-- 
Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se
                 “People who think they know everything really annoy
                 those of us who know we don’t.” (Bjarne Stroustrup)



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 15:20   ` Michael Kjörling
@ 2017-05-06 15:24     ` Larry McVoy
  2017-05-06 15:51       ` Michael Kjörling
  2017-05-06 20:00     ` Steve Nickolas
  1 sibling, 1 reply; 14+ messages in thread
From: Larry McVoy @ 2017-05-06 15:24 UTC (permalink / raw)


On Sat, May 06, 2017 at 03:20:53PM +0000, Michael Kjörling wrote:
> I would absolutely not say that doing something like that is standard
> practice in modern programming. Even in microcontrollers, where
> program and data memory can be scarce even today, I would argue that
> the costs would not outweigh the benefits by a long shot.

It strikes me as being similar to Duff's device (1).  Which is a niche thing
but I still use that from time to time.  Not to save memory, just because
as a C programmer it seems pretty natural to do it.

--lm

(1) https://en.wikipedia.org/wiki/Duff's_device
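
For reference, a minimal C rendering of Duff's device, essentially the
classic form with the old register declarations dropped.  send() writes
to a single memory-mapped output register, which is why "to" is never
incremented; for an ordinary copy you would write *to++ = *from++
instead.

    /* Duff's device: the switch jumps into the unrolled do-while body
     * to handle the count % 8 leftover items, then the loop continues
     * eight items at a time.  Assumes count > 0. */
    void send(short *to, short *from, int count)
    {
        int n = (count + 7) / 8;

        switch (count % 8) {
        case 0: do { *to = *from++;
        case 7:      *to = *from++;
        case 6:      *to = *from++;
        case 5:      *to = *from++;
        case 4:      *to = *from++;
        case 3:      *to = *from++;
        case 2:      *to = *from++;
        case 1:      *to = *from++;
                } while (--n > 0);
        }
    }

Every case label lands on a statement boundary, which is the
distinction Michael Kjörling draws in his reply: the control flow is
unusual, but it never jumps into the middle of an encoded machine
instruction.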



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 15:09 ` [TUHS] Discuss of style and design of computer programs from a Corey Lindsly
  2017-05-06 15:20   ` Michael Kjörling
@ 2017-05-06 15:23   ` ron minnich
  2017-05-06 15:44     ` Michael Kjörling
  1 sibling, 1 reply; 14+ messages in thread
From: ron minnich @ 2017-05-06 15:23 UTC (permalink / raw)


On Sat, May 6, 2017 at 8:09 AM Corey Lindsly <corey at lod.com> wrote:

> There was a branch/loop
> that jumped to the middle of a multi-byte machine instruction, so that
> branch had to be disassembled and stepped separately until it "synced" up
> with the other branch again. Maybe this is standard practice in
> programming (I don't know) but at the time I thought, what kind of evil
> genius devised this to save a few bytes of memory?
>
>
This was extremely common back then. I had a friend who worked on a gas
chromatograph project, names redacted here. It had a very advanced idea, a
thermal printer. It would print a banner when it started.  Getting that
print to work, in the ROM they had, was a nightmare that involved all these
tricks.

When the "A" version came out, they asked my friend to have it print the A
after the product number. His response: "NO". There was no way he could
ever pick apart the crazy code that had printed out the startup banner so
he could add an "A". The startup banner remained the same.

Executing code as data in the early startup was also common in those days,
and modifying that data and then rerunning it happened all the time.

This is why the things like the Therac 25 happened ...
https://en.wikipedia.org/wiki/Therac-25

Note the reference to "Cargo coding", reusing code you don't understand. In
modern terms we call this software reuse and it's taught at all the best
uni's. Google a package, pull it down, compile it in, done.

Lest you think things are better now, Linux uses self modifying code to
optimize certain critical operations, and at one talk I heard the speaker
say that he'd like to put more self modifying code into Linux, "because
it's fun". Oh boy.
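
For a flavour of what run-time code patching looks like, here is a
small user-space C toy.  It assumes Linux on x86-64 with gcc or clang,
a system that still permits writable-and-executable pages, and that
the six patched bytes do not straddle a page boundary; it only
illustrates the general idea, and the kernel's own mechanisms
(alternatives, jump labels) are far more careful than this.

    /* Overwrite the start of answer() with "mov eax, 42; ret"
     * (b8 2a 00 00 00 c3) and call it again. */
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/mman.h>

    __attribute__((noinline)) static int answer(void) { return 1; }

    int main(void)
    {
        unsigned char patch[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
        int (*volatile fn)(void) = answer;   /* defeat constant folding */
        long pagesize = sysconf(_SC_PAGESIZE);
        uintptr_t addr = (uintptr_t)answer;
        void *page = (void *)(addr & ~(uintptr_t)(pagesize - 1));

        printf("before patching: %d\n", fn());

        /* Make the page holding answer() writable as well as executable. */
        if (mprotect(page, (size_t)pagesize,
                     PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
            perror("mprotect");
            return 1;
        }
        memcpy((void *)addr, patch, sizeof patch);   /* self-modification */

        printf("after patching:  %d\n", fn());       /* now prints 42 */
        return 0;
    }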

ron



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 15:09 ` [TUHS] Discuss of style and design of computer programs from a Corey Lindsly
@ 2017-05-06 15:20   ` Michael Kjörling
  2017-05-06 15:24     ` Larry McVoy
  2017-05-06 20:00     ` Steve Nickolas
  2017-05-06 15:23   ` ron minnich
  1 sibling, 2 replies; 14+ messages in thread
From: Michael Kjörling @ 2017-05-06 15:20 UTC (permalink / raw)



On 6 May 2017 08:09 -0700, from corey at lod.com (Corey Lindsly):
> Anyway, I reached one point in the assembly code that I simply could not 
> understand. It seemed like a mistake, and I went through it again and 
> again until I finally realized what it was doing. There was a branch/loop 
> that jumped to the middle of a multi-byte machine instruction, so that 
> branch had to be disassembled and stepped separately until it "synced" up 
> with the other branch again. Maybe this is standard practice in 
> programming (I don't know) but at the time I thought, what kind of evil 
> genius devised this to save a few bytes of memory?

IIRC, that _was_ a common trick at least on machines of that class. It
did have the potential to save a few bytes, yes (more if the
instructions were such that you'd get some _other, desired_, behavior
by jumping into the middle of one with some specific state), but it
also foiled lots of disassemblers: Simply disassembling a binary from
start to finish would yield nonsense in those locations, as you
experienced. It thus basically forced you to single-step those
instructions to figure out what was going on from the binary.

I'm pretty sure it works on every architecture with variable-length
instructions and arbitrary jump capability, as long as you have
control over the specific machine instructions generated (such as if
you are programming in assembler). Of course, it _is_ also a total
nightmare to maintain such code.

I would absolutely not say that doing something like that is standard
practice in modern programming. Even in microcontrollers, where
program and data memory can be scarce even today, I would argue that
the benefits would not outweigh the costs by a long shot.

-- 
Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se
                 “People who think they know everything really annoy
                 those of us who know we don’t.” (Bjarne Stroustrup)



* [TUHS] Discuss of style and design of computer programs from a
  2017-05-06 14:40 [TUHS] Discuss of style and design of computer programs from a user stand point Larry McVoy
@ 2017-05-06 15:09 ` Corey Lindsly
  2017-05-06 15:20   ` Michael Kjörling
  2017-05-06 15:23   ` ron minnich
  0 siblings, 2 replies; 14+ messages in thread
From: Corey Lindsly @ 2017-05-06 15:09 UTC (permalink / raw)



> Personally, I find code that is clean, straightforward, obvious to be
> beautiful.  The clever stuff usually strikes an odd note, not a good
> one.
> 
> --lm

I am not a programmer. Almost four decades ago, my first computer was a 
TRS80 Model 1 with 16KB RAM. I spent one month disassembling and stepping 
through the Z80 code for the resident Microsoft BASIC interpreter. The 
entire thing fit in a 12KB PROM so it was originally written in assembly 
and tightly optimized. It was fascinating and extremely instructive. All 
these years later, I could probably still slap together a Z80 program if I 
needed to. 

Anyway, I reached one point in the assembly code that I simply could not 
understand. It seemed like a mistake, and I went through it again and 
again until I finally realized what it was doing. There was a branch/loop 
that jumped to the middle of a multi-byte machine instruction, so that 
branch had to be disassembled and stepped separately until it "synced" up 
with the other branch again. Maybe this is standard practice in 
programming (I don't know) but at the time I thought, what kind of evil 
genius devised this to save a few bytes of memory?

--corey



end of thread, other threads:[~2017-05-08 17:01 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-07  0:51 [TUHS] Discuss of style and design of computer programs from a Nemo
2017-05-08 13:39 ` Tony Finch
2017-05-08 16:21   ` Steve Johnson
2017-05-08 17:01     ` Dan Cross
  -- strict thread matches above, loose matches on Subject: below --
2017-05-06 14:40 [TUHS] Discuss of style and design of computer programs from a user stand point Larry McVoy
2017-05-06 15:09 ` [TUHS] Discuss of style and design of computer programs from a Corey Lindsly
2017-05-06 15:20   ` Michael Kjörling
2017-05-06 15:24     ` Larry McVoy
2017-05-06 15:51       ` Michael Kjörling
2017-05-06 15:53         ` Larry McVoy
2017-05-06 20:00     ` Steve Nickolas
2017-05-06 21:45       ` Michael Kjörling
2017-05-07  7:42         ` Stephen Kitt
2017-05-06 15:23   ` ron minnich
2017-05-06 15:44     ` Michael Kjörling
