The Unix Heritage Society mailing list
* [TUHS] Harvard and Von Neumann Architectures and Unix
From: Noel Chiappa @ 2017-11-27 16:11 UTC


    > From: Doug McIlroy

    > But if that had been in D space, it couldn't have been executed.

Along those lines, I was wondering about modern OS's, which I gather for
security reasons prevent execution of data, and prevent writing to code.

Programs which emit these little 'custom code fragments' (I prefer that term,
since they aren't really 'self-modifying code' - which I define as 'a program
which _changes_ _existing_ instructions') must have some way of having a chunk
of memory into which they can write, but which can also be executed.
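
A minimal sketch of one way such a chunk of memory can be obtained, assuming
Linux on x86-64 and a policy that lets a page be writable first and executable
afterwards, never both at once. The function name run_generated_fragment is
hypothetical; mmap and mprotect are the standard calls. Where the platform
does not match, or the OS refuses the permission change, the sketch returns
None rather than guessing.

```python
import ctypes
import mmap
import platform


def run_generated_fragment(value):
    """Emit 'mov eax, value; ret' into fresh memory and call it.

    Returns None where this sketch does not apply (not Linux/x86-64) or
    where the OS refuses to make the page executable.
    """
    if platform.system() != "Linux" or platform.machine() != "x86_64":
        return None
    if not 0 <= value < 2**31:
        raise ValueError("value must fit in a non-negative 32-bit int")

    # x86-64 machine code: B8 imm32 = mov eax, imm32 ; C3 = ret
    code = b"\xb8" + value.to_bytes(4, "little") + b"\xc3"

    # Step 1: an anonymous private page that is writable but not executable.
    page = mmap.mmap(
        -1,
        mmap.PAGESIZE,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    page.write(code)
    address = ctypes.addressof(ctypes.c_char.from_buffer(page))

    # Step 2: flip the page to executable and no longer writable.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
    libc.mprotect.restype = ctypes.c_int
    if libc.mprotect(address, mmap.PAGESIZE,
                     mmap.PROT_READ | mmap.PROT_EXEC) != 0:
        return None  # local policy forbids executable data pages

    # Step 3: call the fragment as an ordinary C function returning int.
    fragment = ctypes.CFUNCTYPE(ctypes.c_int)(address)
    return fragment()
```

Note that nothing here changes an existing instruction: the fragment is
written once into fresh memory and only then made executable.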


    > Where is the boundary between changing one instruction and changing them
    > all? Or is this boundary a figment of imagination?

Well, the exec() call only overwrites existing instruction memory because of
the semantics of process address space in Unix - there's only one, so it has
to be over-written. An OS operating in a large segmented single-level memory
could implement an exec() as a jump....

BTW, note that although exec() in a single address-space OS is conventionally
something the OS does, this functionality _could_ be moved into the user
space, provided the right address space primitives were provided by the OS,
e.g. 'expand instruction space'. So the exec() code in user space would i)
find the executable, ii) see how much of each kind of memory it needs, iii)
get the OS to give it a block of memory/address space where the exec() code
can live while it's reading in the new code, iv) move itself there, v) use
standard read() calls to read the new image in, and then vi) jump to it.

Yes, it's probably simpler to implement it in the OS, but if one's goal is to
minimize the functionality in the kernel...

	 Noel


* [TUHS] Harvard and Von Neumann Architectures and Unix
From: Noel Chiappa @ 2017-11-27 17:11 UTC


    > From: Larry McVoy

    >> they aren't really 'self-modifying code' - which I define as 'a program
    >> which _changes_ _existing_ instructions'

    > Isn't that how dtrace works?

I'm not familiar with dtrace(), but if it modifies some other routine's code,
then it would not be "self" modifying, right?


Oh, another category, sort of like biological viruses (which are in a grey
zone between 'alive' and not): the PDP-11 paper tape bootstrap:

  http://ana-3.lcs.mit.edu/~jnc/tech/pdp11/bootloader.mac

in which the program's own code _is_ modified - but not by program
instructions, but by data on the paper tape it is reading in. It's
entertainingly convoluted (the copy above should be well-enough commented to
make it pretty easy to understand what's going on).

     Noel


* [TUHS] Harvard and Von Neumann Architectures and Unix
From: Doug McIlroy @ 2017-11-25 17:34 UTC


From the discussion of self-modifying code:

>> Optimal code for bitblt (raster block transfers) in the Blit
>
> Interesting case. I'm not familiar with BitBLT codes, do they actually modify
> the existing program, or rather do they build small custom ones? Only the
> former is what I was thinking of.
> 
It built small custom fragments of code. But if that had been in D
space, it couldn't have been executed. 
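
The idea of building a small custom fragment per operation can be sketched in
Python. This is not the Blit's code, only an illustration of the technique:
the hypothetical helper make_row_blit pastes one raster operation into
generated source text, so the resulting inner loop contains no per-pixel test
of which operation was requested.

```python
def make_row_blit(op):
    """Build a row-transfer function specialised for one raster operation."""
    expressions = {
        "copy": "s",
        "or": "d | s",
        "and": "d & s",
        "xor": "d ^ s",
    }
    source = (
        "def row_blit(dst, src):\n"
        f"    return [{expressions[op]} for d, s in zip(dst, src)]\n"
    )
    namespace = {}
    # The string above is plain data until this point; exec() turns it into code.
    exec(compile(source, f"<blit-{op}>", "exec"), namespace)
    return namespace["row_blit"]
```

Each call such as make_row_blit("xor") yields a new function; no existing
function is altered, which is the distinction being drawn in this exchange.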
 
>> Surely JIT compiling must count as self-modifying code.
>
> If it does, then my computer just runs one program from when I turn it
> on.  It switches memory formats and then is forever extending itself and
> throwing chunks away.

Exactly. That is the essence of stored-program computers. The exec
system call is self-modification with a vengeance.

Fill-memory-and-execute is the grandest coercion I know. What is
data one instant is code the next.

It's all a matter of viewpoint and scale. Where is the boundary
between changing one instruction and changing them all? Or is
this boundary a figment of imagination?

Doug


* [TUHS] Harvard and Von Neumann Architectures and Unix
From: Noel Chiappa @ 2017-11-25 14:24 UTC


    > From: Doug McIlroy

    > Optimal code for bitblt (raster block transfers) in the Blit

Interesting case. I'm not familiar with BitBLT codes, do they actually modify
the existing program, or rather do they build small custom ones? Only the
former is what I was thinking of.

       Noel



* [TUHS] Harvard and Von Neumann Architectures and Unix
From: Doug McIlroy @ 2017-11-25  3:14 UTC


> The thing is that self-modifying code is pretty much an artifact of the dawn
> of computers, [...]
>
> It's just a Bad Idea.

Surely JIT compiling must count as self-modifying code.

Optimal code for bitblt (raster block transfers) in the Blit


* [TUHS] Harvard and Von Neumann Architectures and Unix
From: Noel Chiappa @ 2017-11-24 21:43 UTC


    > From: Will Senn <will.senn at gmail.com>

    > I am curious about how the Harvard Architecture relates to Unix,
    > historically. If the Harvard Architecture is predicated on the
    > separation of code from data in order to prevent self-modifying code (my
    > interpretation)

That's not the 'dictionary' definition, which is 'separate paths for
instructions and data'. But let's go with the 'no self-modifying code' one for
the moment.

The thing is that self-modifying code is pretty much an artifact of the dawn
of computers, before the economics of gates moved from that of tubes, to
transistors, and also before people understood how important good support for
subroutines was. (This latter is a reference to how Whirlwind did subroutines,
with self-modifying code.) Once people had index registers, and lots of
registers in general, self-modifying code (except for a few small, special
hacks like bootstraps which had to fit in tiny spaces) became as dead as the
dodo.
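
To make the subroutine point concrete, here is a toy interpreter in which the
call sequence works by patching the jump at the end of the subroutine. The
instruction set (CALL, ADD, JMP, HALT) is invented for illustration and is not
Whirlwind's; it only shows the general shape of a return implemented by
rewriting an instruction, which a machine with a spare register or a stack
does not need.

```python
def run(program):
    """Interpret a toy machine whose CALL patches the callee's final jump."""
    memory = [list(instruction) for instruction in program]  # mutable copy
    pc, acc = 0, 0
    while True:
        op, arg = memory[pc]
        if op == "HALT":
            return acc, memory
        if op == "ADD":
            acc += arg
            pc += 1
        elif op == "JMP":
            pc = arg
        elif op == "CALL":
            # arg is (entry, exit_slot): rewrite the operand of the JMP
            # stored at exit_slot so that it returns to the caller.
            entry, exit_slot = arg
            memory[exit_slot][1] = pc + 1
            pc = entry
        else:
            raise ValueError(f"unknown opcode {op!r}")


PROGRAM = [
    ("CALL", (3, 4)),  # 0: first call
    ("CALL", (3, 4)),  # 1: second call
    ("HALT", None),    # 2
    ("ADD", 10),       # 3: subroutine body
    ("JMP", 0),        # 4: subroutine exit; operand rewritten by each CALL
]
```

Running run(PROGRAM) leaves the accumulator at 20, and the instruction at
address 4 ends up different from what was loaded, so the text could not be
shared or write-protected.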

It's just a Bad Idea.

    > then it would seem to me to be somewhat at odds with a Unix philosophy
    > of extreme abstraction (code, data, it's all 0's and 1's, after all).

The people who built Unix were fundamentally very practical. Self-modifying
code is not 'practical'. (And note that Unix from V4:

  http://minnie.tuhs.org/cgi-bin/utree.pl?file=V4/nsys/ken/text.c

onward has support for pure text - for practical reasons).

    > the PDP-11 itself, with the Unibus and apparently agnostic ISA seems to
    > summarily reject the Harvard Architecture...

You could say that of a zillion computers. The only recent computer I can
think of offhand with separate instruction and data paths was the AMD 29K
(nice chip, I used it in a product we built at Proteon). They had separate
ports for instructions and data purely for performance reasons. (Our card had
a pathway which allowed the CPU to write the instruction memory, needed during
booting, obviously; the details as to how we did it escape me now.)


    > From: Jon Steinhart

    > For all intents and purposes instructions were separate from data from
    > the PDP 11/70 on.

s/70/45/.

And the other -11 memory management (as on the /40, /23, etc) does allow for
execute-only 'segments' (they call them 'pages' in the later versions of the
manual, but they're not) - again, separating code from data. Unix used this
for shared pure texts.

And note that those machines with separate I+D space don't meet the dictionary
definition either, because they only have one bus from the CPU to memory,
shared between data and instruction fetches.
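
A toy model of what separate I and D spaces mean to a program, assuming
nothing about the real memory management hardware: the class name
SplitSpaceMachine and its methods are hypothetical. The same address names two
different words depending on whether the access is an instruction fetch or a
data reference, so an ordinary store can never change what gets executed.

```python
class SplitSpaceMachine:
    """Two address spaces of equal size: one for fetches, one for data."""

    def __init__(self, size):
        self.i_space = [0] * size  # reached only by instruction fetch
        self.d_space = [0] * size  # reached only by loads and stores

    def load_text(self, words):
        """Privileged path (e.g. the OS at exec time) that fills I space."""
        self.i_space[: len(words)] = words

    def store(self, address, word):
        """An ordinary user-mode store always lands in D space."""
        self.d_space[address] = word

    def load(self, address):
        return self.d_space[address]

    def fetch(self, address):
        return self.i_space[address]
```

For example, after load_text([0o000240]) (a PDP-11 NOP) a store of 0 (HALT) to
address 0 leaves fetch(0) unchanged. In this model the two spaces are separate
lists; on the real machines both are views onto one physical memory over one
bus, which is why they miss the dictionary definition.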

       Noel


* [TUHS] Harvard and Von Neumann Architectures and Unix
From: Will Senn @ 2017-11-24 19:25 UTC


I am curious about how the Harvard Architecture relates to Unix, 
historically. If the Harvard Architecture is predicated on the 
separation of code from data in order to prevent self-modifying code (my 
interpretation), then it would seem to me to be somewhat at odds with a 
Unix philosophy of extreme abstraction (code, data, it's all 0's and 
1's, after all). In my naive understanding, the PDP-11 itself, with the 
Unibus and apparently agnostic ISA, seems to summarily reject the Harvard 
Architecture...

My question is - was there tension around Harvard and Von Neumann 
architectures in Unix circles and if so, how was it resolved?

Thanks,

Will

-- 
GPG Fingerprint: 68F4 B3BD 1730 555A 4462  7D45 3EAA 5B6D A982 BAAF



