The Unix Heritage Society mailing list
 help / color / mirror / Atom feed
* [TUHS] Paging code in SysV R2
@ 2022-03-29 11:21 Paul Ruizendaal via TUHS
  2022-03-29 12:38 ` Rob Pike
  2022-03-29 14:05 ` Clem Cole
  0 siblings, 2 replies; 6+ messages in thread
From: Paul Ruizendaal via TUHS @ 2022-03-29 11:21 UTC (permalink / raw)
  To: TUHS main list


I did not have a lot of time to work on documenting the evolution of paging / virtual memory code in 32V, Sys III and early SysV in the past months, but I did get some more background information that seems worth sharing.

My understanding of the virtual memory story at USG is now as follows:

Somewhere in 1981/82 a project plan for Unix 5 / System V was made and evolving John Reiser’s virtual memory code for 32V-r3 was part of that plan. “Evolving” in this context meant making it more maintainable and more hardware independent. John’s code assumed a memory page, a disk block and a file block all to be the same size, and it needed to be more general. It was also designed around the VAX MMU and this too needed to be generalised. The person assigned to that job was Bob (Robert) Baron, reporting to Tom Raleigh. The project involved quite a bit of re-architecting and progress was slowish. On top of that Bob left for CMU to work on Mach. Tom Raleigh tried to pick up where Bob had left off, but progress remained slowish.

In parallel, Keith Kelleman and Steve Burroff were working on Unix for the 3B20 Unix. They did paging code from scratch around the 3B20 MMU (which used a more or less ‘modern’ page table design) and developed their idea for the “regions” abstraction to support large, non-contiguous address spaces. It seems that they built on the main working set ideas/concepts in the Reiser/Baron/Raleigh code base, combined these with their “regions” idea, made it multi-processor capable and made it all work on the 3B20. Around that time Tom Raleigh seems to have transferred to Bellcore, and the VAX code base got orphaned.

Two young engineers appear to have picked up the work on the VAX code base: Dean Jagels and Jim McCormick. My understanding is that they essentially back ported the 3B20 work to the VAX, falling back on the Reiser/Baron/Raleigh work where necessary. They got it working, and as far as I can tell, this is what got released in 1984 as part of SysV R2.4 for the VAX (the oldest surviving source code for this that I could find).

This somewhat tortuous birth may in part explain why Research chose to use the 4BSD virtual memory code for 8th edition.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [TUHS] Paging code in SysV R2
  2022-03-29 11:21 [TUHS] Paging code in SysV R2 Paul Ruizendaal via TUHS
@ 2022-03-29 12:38 ` Rob Pike
  2022-03-29 14:05 ` Clem Cole
  1 sibling, 0 replies; 6+ messages in thread
From: Rob Pike @ 2022-03-29 12:38 UTC (permalink / raw)
  To: Paul Ruizendaal; +Cc: TUHS main list

"This somewhat tortuous birth may in part explain why Research chose
to use the 4BSD virtual memory code for 8th edition."

No.

There was a short but intense debate about Reiser's vs. BSD's kernel
for our 780 in 1980 to '81, before the story you tell gets started.

I was not a central player in the decision, far from it, and there
were legitimate reasons to go either way, but I think it was mostly
due to a stronger feeling of connection with Berkeley than with
Holmdel coupled with some personality conflicts at the management
level. There were also issues around expected support, and discomfort
around the way Reiser's model made some Unix things work a little
differently.

I don't believe I ever used a computer that felt as fast, for its
time, as the Holmdel VAX running Reiser and London's kernel, and I was
definitely rooting for that to win. When BSD landed it was not in the
same league, not at all. But it had other advantages.

-rob

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [TUHS] Paging code in SysV R2
  2022-03-29 11:21 [TUHS] Paging code in SysV R2 Paul Ruizendaal via TUHS
  2022-03-29 12:38 ` Rob Pike
@ 2022-03-29 14:05 ` Clem Cole
  2022-03-29 15:24   ` Paul Ruizendaal
  1 sibling, 1 reply; 6+ messages in thread
From: Clem Cole @ 2022-03-29 14:05 UTC (permalink / raw)
  To: Paul Ruizendaal; +Cc: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 2637 bytes --]

Fascinating - thank you.

Have you figured out that path from here to the SVR4 code base that was
used for the x86 [which I think also went through a few more generations
after the SVR4 release]?

Clem

On Tue, Mar 29, 2022 at 7:22 AM Paul Ruizendaal via TUHS <
tuhs@minnie.tuhs.org> wrote:

>
> I did not have a lot of time to work on documenting the evolution of
> paging / virtual memory code in 32V, Sys III and early SysV in the past
> months, but I did get some more background information that seems worth
> sharing.
>
> My understanding of the virtual memory story at USG is now as follows:
>
> Somewhere in 1981/82 a project plan for Unix 5 / System V was made and
> evolving John Reiser’s virtual memory code for 32V-r3 was part of that
> plan. “Evolving” in this context meant making it more maintainable and more
> hardware independent. John’s code assumed a memory page, a disk block and a
> file block all to be the same size, and it needed to be more general. It
> was also designed around the VAX MMU and this too needed to be generalised.
> The person assigned to that job was Bob (Robert) Baron, reporting to Tom
> Raleigh. The project involved quite a bit of re-architecting and progress
> was slowish. On top of that Bob left for CMU to work on Mach. Tom Raleigh
> tried to pick up where Bob had left off, but progress remained slowish.
>
> In parallel, Keith Kelleman and Steve Burroff were working on Unix for the
> 3B20 Unix. They did paging code from scratch around the 3B20 MMU (which
> used a more or less ‘modern’ page table design) and developed their idea
> for the “regions” abstraction to support large, non-contiguous address
> spaces. It seems that they built on the main working set ideas/concepts in
> the Reiser/Baron/Raleigh code base, combined these with their “regions”
> idea, made it multi-processor capable and made it all work on the 3B20.
> Around that time Tom Raleigh seems to have transferred to Bellcore, and the
> VAX code base got orphaned.
>
> Two young engineers appear to have picked up the work on the VAX code
> base: Dean Jagels and Jim McCormick. My understanding is that they
> essentially back ported the 3B20 work to the VAX, falling back on the
> Reiser/Baron/Raleigh work where necessary. They got it working, and as far
> as I can tell, this is what got released in 1984 as part of SysV R2.4 for
> the VAX (the oldest surviving source code for this that I could find).
>
> This somewhat tortuous birth may in part explain why Research chose to use
> the 4BSD virtual memory code for 8th edition.
>
>
>

[-- Attachment #2: Type: text/html, Size: 3312 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [TUHS] Paging code in SysV R2
  2022-03-29 14:05 ` Clem Cole
@ 2022-03-29 15:24   ` Paul Ruizendaal
  2022-03-29 15:57     ` Dan Cross
  2022-03-29 16:01     ` Larry McVoy
  0 siblings, 2 replies; 6+ messages in thread
From: Paul Ruizendaal @ 2022-03-29 15:24 UTC (permalink / raw)
  To: Clem Cole; +Cc: TUHS main list

No, sorry, my scope of interest is mostly 1975-1985.

I did read the Mach virtual memory paper from 1988 - from that paper I gather that the data structures used are totally different from those in Sys V or BSD.

There is also the VM implementation that Richard Miller did on SysV r1 in 1983.  Interestingly, his design seems to parallel the choices made by Reiser a few years before, but it is lighter touch. Both Reiser and Miller refer to Denning and Tenex as prior art. Miller's 1984 Usenix paper about this project argues that doing approximated LRU from the page table data results in a process local working set view, which he argued was preferable to the system global working set view generated in the BSD clock algorithm.

Paul

> On 29 Mar 2022, at 16:05, Clem Cole <clemc@ccc.com> wrote:
> 
> Fascinating - thank you.
> 
> Have you figured out that path from here to the SVR4 code base that was used for the x86 [which I think also went through a few more generations after the SVR4 release]?   
> 
> Clem
> 
> On Tue, Mar 29, 2022 at 7:22 AM Paul Ruizendaal via TUHS <tuhs@minnie.tuhs.org> wrote:
> 
> I did not have a lot of time to work on documenting the evolution of paging / virtual memory code in 32V, Sys III and early SysV in the past months, but I did get some more background information that seems worth sharing.
> 
> My understanding of the virtual memory story at USG is now as follows:
> 
> Somewhere in 1981/82 a project plan for Unix 5 / System V was made and evolving John Reiser’s virtual memory code for 32V-r3 was part of that plan. “Evolving” in this context meant making it more maintainable and more hardware independent. John’s code assumed a memory page, a disk block and a file block all to be the same size, and it needed to be more general. It was also designed around the VAX MMU and this too needed to be generalised. The person assigned to that job was Bob (Robert) Baron, reporting to Tom Raleigh. The project involved quite a bit of re-architecting and progress was slowish. On top of that Bob left for CMU to work on Mach. Tom Raleigh tried to pick up where Bob had left off, but progress remained slowish.
> 
> In parallel, Keith Kelleman and Steve Burroff were working on Unix for the 3B20 Unix. They did paging code from scratch around the 3B20 MMU (which used a more or less ‘modern’ page table design) and developed their idea for the “regions” abstraction to support large, non-contiguous address spaces. It seems that they built on the main working set ideas/concepts in the Reiser/Baron/Raleigh code base, combined these with their “regions” idea, made it multi-processor capable and made it all work on the 3B20. Around that time Tom Raleigh seems to have transferred to Bellcore, and the VAX code base got orphaned.
> 
> Two young engineers appear to have picked up the work on the VAX code base: Dean Jagels and Jim McCormick. My understanding is that they essentially back ported the 3B20 work to the VAX, falling back on the Reiser/Baron/Raleigh work where necessary. They got it working, and as far as I can tell, this is what got released in 1984 as part of SysV R2.4 for the VAX (the oldest surviving source code for this that I could find).
> 
> This somewhat tortuous birth may in part explain why Research chose to use the 4BSD virtual memory code for 8th edition.
> 
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [TUHS] Paging code in SysV R2
  2022-03-29 15:24   ` Paul Ruizendaal
@ 2022-03-29 15:57     ` Dan Cross
  2022-03-29 16:01     ` Larry McVoy
  1 sibling, 0 replies; 6+ messages in thread
From: Dan Cross @ 2022-03-29 15:57 UTC (permalink / raw)
  To: Paul Ruizendaal; +Cc: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 3953 bytes --]

On Tue, Mar 29, 2022 at 11:25 AM Paul Ruizendaal <pnr@planet.nl> wrote:

> No, sorry, my scope of interest is mostly 1975-1985.
>
> I did read the Mach virtual memory paper from 1988 - from that paper I
> gather that the data structures used are totally different from those in
> Sys V or BSD.
>
> There is also the VM implementation that Richard Miller did on SysV r1 in
> 1983.  Interestingly, his design seems to parallel the choices made by
> Reiser a few years before, but it is lighter touch. Both Reiser and Miller
> refer to Denning and Tenex as prior art. Miller's 1984 Usenix paper about
> this project argues that doing approximated LRU from the page table data
> results in a process local working set view, which he argued was preferable
> to the system global working set view generated in the BSD clock algorithm.
>

I brought this up on this list back in 2017, but a few years later, Charles
Forsyth in the UK did a VM system for SunOS 4 based on the EMAS system.
https://www.terzarima.net/doc/taste.pdf

I'm not quite sure when this was written, but it cites papers from 1989, so
sometime that year or 1990, would be my guess.

        - Dan C.

> On 29 Mar 2022, at 16:05, Clem Cole <clemc@ccc.com> wrote:
> >
> > Fascinating - thank you.
> >
> > Have you figured out that path from here to the SVR4 code base that was
> used for the x86 [which I think also went through a few more generations
> after the SVR4 release]?
> >
> > Clem
> >
> > On Tue, Mar 29, 2022 at 7:22 AM Paul Ruizendaal via TUHS <
> tuhs@minnie.tuhs.org> wrote:
> >
> > I did not have a lot of time to work on documenting the evolution of
> paging / virtual memory code in 32V, Sys III and early SysV in the past
> months, but I did get some more background information that seems worth
> sharing.
> >
> > My understanding of the virtual memory story at USG is now as follows:
> >
> > Somewhere in 1981/82 a project plan for Unix 5 / System V was made and
> evolving John Reiser’s virtual memory code for 32V-r3 was part of that
> plan. “Evolving” in this context meant making it more maintainable and more
> hardware independent. John’s code assumed a memory page, a disk block and a
> file block all to be the same size, and it needed to be more general. It
> was also designed around the VAX MMU and this too needed to be generalised.
> The person assigned to that job was Bob (Robert) Baron, reporting to Tom
> Raleigh. The project involved quite a bit of re-architecting and progress
> was slowish. On top of that Bob left for CMU to work on Mach. Tom Raleigh
> tried to pick up where Bob had left off, but progress remained slowish.
> >
> > In parallel, Keith Kelleman and Steve Burroff were working on Unix for
> the 3B20 Unix. They did paging code from scratch around the 3B20 MMU (which
> used a more or less ‘modern’ page table design) and developed their idea
> for the “regions” abstraction to support large, non-contiguous address
> spaces. It seems that they built on the main working set ideas/concepts in
> the Reiser/Baron/Raleigh code base, combined these with their “regions”
> idea, made it multi-processor capable and made it all work on the 3B20.
> Around that time Tom Raleigh seems to have transferred to Bellcore, and the
> VAX code base got orphaned.
> >
> > Two young engineers appear to have picked up the work on the VAX code
> base: Dean Jagels and Jim McCormick. My understanding is that they
> essentially back ported the 3B20 work to the VAX, falling back on the
> Reiser/Baron/Raleigh work where necessary. They got it working, and as far
> as I can tell, this is what got released in 1984 as part of SysV R2.4 for
> the VAX (the oldest surviving source code for this that I could find).
> >
> > This somewhat tortuous birth may in part explain why Research chose to
> use the 4BSD virtual memory code for 8th edition.
> >
> >
>
>

[-- Attachment #2: Type: text/html, Size: 4631 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [TUHS] Paging code in SysV R2
  2022-03-29 15:24   ` Paul Ruizendaal
  2022-03-29 15:57     ` Dan Cross
@ 2022-03-29 16:01     ` Larry McVoy
  1 sibling, 0 replies; 6+ messages in thread
From: Larry McVoy @ 2022-03-29 16:01 UTC (permalink / raw)
  To: Paul Ruizendaal; +Cc: TUHS main list

On Tue, Mar 29, 2022 at 05:24:17PM +0200, Paul Ruizendaal wrote:
> No, sorry, my scope of interest is mostly 1975-1985.
> 
> I did read the Mach virtual memory paper from 1988 - from that paper I gather that the data structures used are totally different from those in Sys V or BSD.
> 
> There is also the VM implementation that Richard Miller did on SysV r1
> in 1983.  Interestingly, his design seems to parallel the choices made
> by Reiser a few years before, but it is lighter touch. Both Reiser and
> Miller refer to Denning and Tenex as prior art. Miller's 1984 Usenix
> paper about this project argues that doing approximated LRU from the
> page table data results in a process local working set view, which he
> argued was preferable to the system global working set view generated
> in the BSD clock algorithm.

I have HUGE respect for the BSD pageout code.  I was the guy who first
got rid of the rotational delay in UFS which doubled the speed at which
you could read data off the disk.  Which was nice, but it filled memory
with pages and the pageout code could not free them as fast as UFS 
was making more.

So I wrote 13 different attempts at a better pageout system (no, I didn't
save the code, it belonged to Sun, I think it is lost at this point).
While some of those attempts were better in very specific cases,
none of them, not one of them, were as good, in all cases, as the BSD
pageout code.

That said, it couldn't keep up now that UFS was faster (it couldn't 
keep up with a single disk, it had no hope of keeping up with multiple
disks).  So I did an awful hack, that last I checked was still there:

    if (we are doing sequential reads AND we are about to start paging out) {
    	start freeing pages a window behind where we are reading
    }

The effect was that if you were about to be out of free memory, sequential
I/O became a sliding 256K window of pages, the vnode that was using up all
the memory became the one that started freeing memory.  It was much faster
and simpler than having the pageout code scan through every page.

It was also an ugly hack, it was just the best I could come up with at the
time.

The whole effort left me in awe of the BSD pageout code.  It's pretty
simple but I'd challenge anyone to do better than that code in all
use cases.  Not saying it can't be done, but when I was at my coding
prime, I couldn't do it.

I did start down the path of doing a vnode based working set design.
I gave up when I realized that the swap vnode was by far the biggest
vnode, everyone's brk() came from that and it was all jumbled together.
Trying to do a vnode per process got really messy around fork().

I had many a late night design discussion at Antonio's Nut House in
Palo Alto with my office mate, Howard Chartok, trying to come up with
a design that couldn't be shot down.  Perhaps too much beer because we
never got to one.  

I still suspect there is something to be said for a vnode based working
set approach.  Especially if you brought back the hack I did that put
the basename of the vnode in the vnode (Shannon took it out because of
"hard links").  Too bad that didn't survive, I wrote a topvn (like top
but for vnodes) that was really good at giving you some insight into
what was going on.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-03-29 16:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-29 11:21 [TUHS] Paging code in SysV R2 Paul Ruizendaal via TUHS
2022-03-29 12:38 ` Rob Pike
2022-03-29 14:05 ` Clem Cole
2022-03-29 15:24   ` Paul Ruizendaal
2022-03-29 15:57     ` Dan Cross
2022-03-29 16:01     ` Larry McVoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).