The Unix Heritage Society mailing list
* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
@ 2023-02-23 15:13 Noel Chiappa
  0 siblings, 0 replies; 23+ messages in thread
From: Noel Chiappa @ 2023-02-23 15:13 UTC (permalink / raw)
  To: tuhs; +Cc: jnc

    > From: Clem Cole

    > MIT had a modified a.out format for the NU machine ports - that might
    > have been called b.out.

Yes. Here's the man page output:

  http://ana-3.lcs.mit.edu/~jnc/tech/unix/help/b.out.lpt

(I don't have the source for that, alas.) It's basically just a.out with
32-bit fields instead of 16-bit. I have a .h file for the format too, if
anyone has any interest in it. It's all part of the MIT 68K workbench that
used PCC (the source for all of which I do have).
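
The widening can be sketched roughly as below. The eight field names are
those of the V7 PDP-11 a.out header; the 32-bit layout (same fields, same
order, big-endian as on a 68K) is only an assumption made for illustration,
since the real b.out definition is in the .h file mentioned above and may
differ.

```python
import struct

# Field names of the V7 PDP-11 a.out header (eight 16-bit words).
FIELDS = ("magic", "text", "data", "bss", "syms", "entry", "unused", "flag")

# PDP-11 a.out: eight little-endian 16-bit words, 16 bytes in total.
AOUT16 = struct.Struct("<8H")

# ASSUMPTION: a widened header with the same eight fields as 32-bit
# big-endian values. The actual b.out field order and byte order may differ.
BOUT32 = struct.Struct(">8I")


def parse_header(raw, layout):
    """Unpack a header with the given layout into a dict of named fields."""
    return dict(zip(FIELDS, layout.unpack(raw[:layout.size])))


# A 16-bit size field tops out at 65535 bytes; the widened one does not.
hdr16 = parse_header(AOUT16.pack(0o407, 1024, 256, 64, 0, 0, 0, 0), AOUT16)
hdr32 = parse_header(BOUT32.pack(0o407, 100000, 256, 64, 0, 0, 0, 0), BOUT32)
print(hdr16["text"], hdr32["text"])
```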

	 Noel


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
@ 2023-02-26 15:51 Paul Winalski
  0 siblings, 0 replies; 23+ messages in thread
From: Paul Winalski @ 2023-02-26 15:51 UTC (permalink / raw)
  To: TUHS

On 2/25/23, Brian Walden <tuhs@cuzuco.com> wrote:
> It was originally 205. See A.OUT(V) (the first page) at
> https://www.bell-labs.com/usr/dmr/www/man51.pdf, where the reason is
> documented.
>
>
>      The header always contains 6 words:
>           1 "br .+14" instruction (205(8))
>           2 The size of the program text
>           3 The size of the symbol table
>           4 The size of the relocation bits area
>           5 The size of a data area
>           6 A zero word (unused at present)
>
> I always found this so elegant in its simplicity. Just load and start
> execution at the start (simplifies exec(2) in the kernel). I always wondered
> if this had been done anywhere else before, or was invented first in Unix.

IBM's Basic Program Support (BPS) for System/360 was a set of
stand-alone utilities for developing and running stand-alone programs.
BPS/360 wasn't really an operating system because there wasn't any
resident kernel.  You just IPLed (Initial Program Load; IBM-speak for
"boot") your application directly.  So the executable format for BPS
had a bootstrap loader as the "program header".  Not quite the same
thing as a.out's 205(8) magic number, but similar in concept.

I don't know of any other OS ABI that uses this trick to transfer
control to application programs.

Microsoft uses something similar in PECOFF.  A PECOFF executable for
x86 or x86-64 starts with a bit of code in MS-DOS MZ executable format
that prints the message "This program cannot be run in DOS mode".
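
A minimal sketch of how a reader of such a file gets past the DOS stub,
assuming the documented PE convention that the 32-bit little-endian field at
offset 0x3C of the MZ header holds the file offset of the "PE\0\0" signature.
The function name and the synthetic image are made up for this example; the
stub bytes here are placeholder text, not real DOS code.

```python
import struct


def pe_header_offset(image):
    """Return the offset of the PE signature, or None if this is not a PE image."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return None
    # e_lfanew: file offset of the PE signature, stored at offset 0x3C.
    (offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[offset:offset + 4] != b"PE\0\0":
        return None
    return offset


# Synthetic image: a 64-byte MZ header, a placeholder stub, then the signature.
stub = b"This program cannot be run in DOS mode.\r\n$"
header = bytearray(0x40)
header[:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x40 + len(stub))
image = bytes(header) + stub + b"PE\0\0"

print(pe_header_offset(image))
```

A DOS loader sees only the MZ part and runs the stub; a PE-aware loader
follows the offset and ignores the stub entirely.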

-Paul W.


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
@ 2023-02-25 20:14 Brian Walden
  0 siblings, 0 replies; 23+ messages in thread
From: Brian Walden @ 2023-02-25 20:14 UTC (permalink / raw)
  To: tuhs

It was originally 205. See A.OUT(V) (the first page) at https://www.bell-labs.com/usr/dmr/www/man51.pdf, where the reason is documented.


     The header always contains 6 words:
          1 "br .+14" instruction (205(8))
          2 The size of the program text
          3 The size of the symbol table
          4 The size of the relocation bits area
          5 The size of a data area
          6 A zero word (unused at present)

I always found this so elegant in its simplicity. Just load and start
execution at the start (simplifies exec(2) in the kernel). I always wondered
if this had been done anywhere else before, or was invented first in Unix.
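
The arithmetic behind a branch that doubles as a magic number can be
sketched as follows. This assumes the standard PDP-11 BR encoding: opcode
000400 plus a signed 8-bit word offset, taken relative to the address just
past the instruction. The name encode_br is made up for this illustration.

```python
def encode_br(byte_distance):
    """Encode PDP-11 'br .+byte_distance' as a 16-bit instruction word."""
    # The PC already points 2 bytes past the BR when the offset is applied,
    # and the offset field counts words, not bytes.
    word_offset = (byte_distance - 2) // 2
    if byte_distance % 2 or not -128 <= word_offset <= 127:
        raise ValueError("branch target odd or out of range")
    return 0o400 | (word_offset & 0o377)


six_word_header = encode_br(0o14)    # branch over a 12-byte (6-word) header
eight_word_header = encode_br(0o20)  # branch over a 16-byte (8-word) header
print(oct(six_word_header), oct(eight_word_header))  # 0o405 0o407
```

Under that encoding a branch over a 6-word header comes out as 0405 and a
branch over an 8-word header as 0407. The manual text quoted above prints
205(8); this sketch does not settle why.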

There was also a recent discussion of ar(1). That PDF also explains its magic
number a few pages later. It was simply chosen because it seemed unique.

     A file produced by ar has a "magic number" at the start,
     followed by the constituent files, each preceded by a file
     header. The magic number is -147(10), or 177555(8) (it was
     chosen to be unlikely to occur anywhere else).
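
The two spellings of the magic number are the same 16-bit word, read once
as a signed decimal and once as unsigned octal. A small check, with helper
names invented for the example:

```python
def as_word(value):
    """Reduce an integer to its 16-bit two's-complement bit pattern."""
    return value & 0xFFFF


def as_signed(word):
    """Interpret a 16-bit pattern as a signed integer."""
    return word - 0x10000 if word & 0x8000 else word


magic = as_word(-147)
print(oct(magic), as_signed(0o177555))  # 0o177555 -147
```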

-Brian

On Sat, 25 Feb 2023, Dave Horsfall wrote:

> On Thu, 23 Feb 2023, Paul Winalski wrote:
>
> > a.out was, as object file formats go, a throwback to the stone age from
> > the get-go.  Even the most primitive of IBM's link editors for
> > System/360 supported arbitrary naming of object file sections and the
> > ability for the programmer to arrange them in whatever order they
> > wished.  a.out's restriction to three sections (.text, .data, .bss) did
> > manage to get the job done, and even (with ZMAGIC) could support
> > demand-paged virtual memory, but only just.
>
> That may be so, but those guys didn't exactly have the resources of
> IBM behind them...
>
> And I wonder how many people here know the significance of the "407" magic
> number?
>
> -- Dave


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-25 19:28         ` arnold
@ 2023-02-25 19:34           ` Steffen Nurpmeso
  0 siblings, 0 replies; 23+ messages in thread
From: Steffen Nurpmeso @ 2023-02-25 19:34 UTC (permalink / raw)
  To: arnold; +Cc: tuhs

arnold@skeeve.com wrote in
 <202302251928.31PJSXc1004140@freefriends.org>:
 |Arno Griffioen via TUHS <tuhs@tuhs.org> wrote:
 |> On Fri, Feb 24, 2023 at 05:45:16AM -0700, arnold@skeeve.com wrote:
 |>> With tar and cpio, ar apparently fell out of use as a general
 |>> archiver, and today it's only used for libraries of relocatable
 |>> object files.
 |>
 |> 'ar' is alive and well as the archive format for .deb files though, \
 |> so it 
 |> could be argued that it's actually hugely in use as an archiver all \
 |> across 
 |> the world :)
 |>
 |>        Bye, Arno.
 |
 |I wasn't aware of that. Thanks.

It is also used during the build phase of some programs (groff
last I looked).  I.e., they build their support objects, and instead
of having make rules with $(OBJ) they simply link against the
library.  (I have not looked at whether they build with -fPIC
/ -fPIE, to rule out your "relocatable", that is.)

 --End of <202302251928.31PJSXc1004140@freefriends.org>

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-24 13:13       ` Arno Griffioen via TUHS
@ 2023-02-25 19:28         ` arnold
  2023-02-25 19:34           ` Steffen Nurpmeso
  0 siblings, 1 reply; 23+ messages in thread
From: arnold @ 2023-02-25 19:28 UTC (permalink / raw)
  To: tuhs, arno.griffioen

Arno Griffioen via TUHS <tuhs@tuhs.org> wrote:

> On Fri, Feb 24, 2023 at 05:45:16AM -0700, arnold@skeeve.com wrote:
> > With tar and cpio, ar apparently fell out of use as a general
> > archiver, and today it's only used for libraries of relocatable
> > object files.
>
> 'ar' is alive and well as the archive format for .deb files though, so it 
> could be argued that it's actually hugely in use as an archiver all across 
> the world :)
>
> 							Bye, Arno.

I wasn't aware of that. Thanks.


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-25 15:30       ` Clem Cole
@ 2023-02-25 17:29         ` Paul Winalski
  0 siblings, 0 replies; 23+ messages in thread
From: Paul Winalski @ 2023-02-25 17:29 UTC (permalink / raw)
  To: Clem Cole; +Cc: The Eunuchs Hysterical Society

On 2/25/23, Clem Cole <clemc@ccc.com> wrote:
>
> The IBM link
> editors needed all that back in the day.

One of the complications of executable images was the address space
layout for OS/360.  There was no virtual memory.  The low part of the
address space was where the operating system kernel (supervisor, in
IBM-speak) lived.  Each of what we now call processes was assigned a
contiguous portion of the remaining address space (this was called a
partition).  An executable program retained the relocations that had
been applied by the linker so that the program loader could adjust the
addresses in the executable depending on which partition it was being
loaded into.  This made things more complicated for the link editor.
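
The load-time adjustment can be illustrated with a deliberately simplified
sketch. It is not the OS/360 load module format: it assumes an image linked
as if loaded at address 0, a plain list of offsets at which 32-bit
big-endian address words live, and a partition base chosen by the loader.
All names and values are hypothetical.

```python
import struct


def relocate(image, reloc_offsets, load_base):
    """Return a copy of image with load_base added to each listed address word."""
    out = bytearray(image)
    for off in reloc_offsets:
        (addr,) = struct.unpack_from(">I", out, off)
        struct.pack_into(">I", out, off, (addr + load_base) & 0xFFFFFFFF)
    return bytes(out)


# Program linked at address 0: address constants sit at offsets 4 and 12,
# the other two words are ordinary data and must not be touched.
linked = struct.pack(">4I", 0x11111111, 0x00000010, 0x22222222, 0x00000020)
loaded = relocate(linked, [4, 12], 0x8000)
print([hex(w) for w in struct.unpack(">4I", loaded)])
```

Because the relocation list travels with the executable, the same file can
be loaded into any partition.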

The OS/360 link editor also doubled as a patch tool.  The link editor
was capable of taking an executable and a set of new versions for some
object modules, un-linking those modules, and replacing them with the
new versions, adjusting all relocations accordingly.

> Please correct me if I'm misinformed, but Paul, of course, had to support
> the DEC language tools on Unix, which had come from systems that had a more
> flexible format (the solution for Ultrix IIRC was to move a flavor of the
> VMS linker to UNIX for object file and just a.out for execution).

VAX Fortran for Ultrix was a port of the VAX/VMS Fortran compiler and
runtime to Ultrix.  There were two big problems, object-file-wise.
First, VAX Fortran used many of the advanced features of the VMS
object language.  To generate a.out directly, those chores would have
to be done in the compiler's code generator.  The runtime library had
an even worse problem.  One innovative feature of VMS was that it had
a very robust ABI that could support every programming language then
known, and the compiler development teams were required to adhere to
this ABI and to provide language extensions where necessary to support
all of the ABI's features.  The result was that calling subroutines
written in a different language was dead easy.  It didn't matter which
programming language you used.  The developers of the Fortran runtime
took full advantage of that, and it contained routines written in
several languages.  This meant that we would either need to add a.out
support to the code generators of all of those compilers (there was no
common back end in those days) or write a VMS .obj-to-a.out
translator.  In the end we decided that the easiest solution to both
problems was to add a.out support to the VMS linker and to port it to
Ultrix.  The result was lk(1), a linker that accepted either VMS .obj
or a.out as input and generated an a.out executable.

> My point, the UNIX developers built what they needed and built that well.

Amen to that.  That observation applies to a lot of the design of Unix.

> Their format worked with their primary development language/tool (a.k.a. C)
> and they even made it work with an f77 implementation. So it is a little
> hard to knock them too much -- it was not part of the original design spec.

They did what was right given their circumstances and resources.  I
know of two major operating system development efforts at DEC, OFIS
and MICA, that illustrate what happens if you try to take the opposite
tack.  Their thinking was, "Gee, we have a lot to do and a short
schedule.  I know--let's invent a new programming language and write a
compiler for it."  Both ended up being disastrous examples of Second
System Syndrome.

> And I wonder how many people here know the significance of the "407" magic
>> number?
>>
> Today, fewer and fewer, I fear.  For those who do not, please see page 4-33 of the
> 1975/76 DEC PDP-11 Processor Handbook and think about boot blocks.  🍺

Cute.

-Paul W.


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-25  2:07     ` Dave Horsfall
@ 2023-02-25 15:30       ` Clem Cole
  2023-02-25 17:29         ` Paul Winalski
  0 siblings, 1 reply; 23+ messages in thread
From: Clem Cole @ 2023-02-25 15:30 UTC (permalink / raw)
  To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society


On Fri, Feb 24, 2023 at 9:07 PM Dave Horsfall <dave@horsfall.org> wrote:

> On Thu, 23 Feb 2023, Paul Winalski wrote:
>
> > a.out was, as object file formats go, a throwback to the stone age from
> > the get-go.  Even the most primitive of IBM's link editors for
> > System/360 supported arbitrary naming of object file sections and the
> > ability for the programmer to arrange them in whatever order they
> > wished.  a.out's restriction to three sections (.text, .data, .bss) did
> > manage to get the job done, and even (with ZMAGIC) could support
> > demand-paged virtual memory, but only just.
>
> That may be so, but those guys didn't exactly have the resources of
> IBM behind them...
>
A reasonable point, but it was more that Ken, Dennis, and the team did what
>>they needed<< and it was good enough for a long time.  The IBM link
editors needed all that back in the day.  As more and more "modern"
languages came into being, it was not until about the 6th Edition that the
difficulties of not having an expandable object format and a better linker
began to show, and, as Paul says, not until the support for demand paging
that a.out was really stressed.

Please correct me if I'm misinformed, but Paul, of course, had to support
the DEC language tools on Unix, which had come from systems that had a more
flexible format (the solution for Ultrix, IIRC, was to move a flavor of the
VMS linker to UNIX for object files and just use a.out for execution).   So he
lived the difficulties/shortcomings.  A valid argument is that Tanenbaum's
compiler toolkit survived with a.out, and he supported many of the same
language targets that DEC did.  Andy and crew did their own assemblers,
does anyone remember if they supplied a new linker and object format? That
would make Paul's point more powerful -> the languages people wanted
something more.

My point, the UNIX developers built what they needed and built that well.
Their format worked with their primary development language/tool (a.k.a. C)
and they even made it work with an f77 implementation. So it is a little
hard to knock them too much -- it was not part of the original design spec.

But as Matt has discussed in his digging through things, it does look like
as the AT&T languages team started to run into some of the same barriers,
they started to move to a new format.


And I wonder how many people here know the significance of the "407" magic
> number?
>
Today, fewer and fewer, I fear.  For those who do not, please see page 4-33 of the
1975/76 DEC PDP-11 Processor Handbook and think about boot blocks.  🍺


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-23 16:49   ` Paul Winalski
  2023-02-23 18:38     ` segaloco via TUHS
  2023-02-24 12:45     ` arnold
@ 2023-02-25  2:07     ` Dave Horsfall
  2023-02-25 15:30       ` Clem Cole
  2 siblings, 1 reply; 23+ messages in thread
From: Dave Horsfall @ 2023-02-25  2:07 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Thu, 23 Feb 2023, Paul Winalski wrote:

> a.out was, as object file formats go, a throwback to the stone age from 
> the get-go.  Even the most primitive of IBM's link editors for 
> System/360 supported arbitrary naming of object file sections and the 
> ability for the programmer to arrange them in whatever order they 
> wished.  a.out's restriction to three sections (.text, .data, .bss) did 
> manage to get the job done, and even (with ZMAGIC) could support 
> demand-paged virtual memory, but only just.

That may be so, but those guys didn't exactly have the resources of
IBM behind them...

And I wonder how many people here know the significance of the "407" magic 
number?

-- Dave


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-23 19:37     ` Warner Losh
@ 2023-02-24 17:01       ` Rich Salz
  0 siblings, 0 replies; 23+ messages in thread
From: Rich Salz @ 2023-02-24 17:01 UTC (permalink / raw)
  To: Warner Losh; +Cc: The Eunuchs Hysterical Society


> MIT's PCC ports to various micros, especially x86, also used b.out that
> was quite similar to a.out.
>

That was a cross-compiler, right? If so, I remember going over to MIT-LCS
and picking up a copy of the sources for the compiler and the MIT PC/IP
implementation. I had to show a copy of my ATT source license to the
secretary, who made a copy of a 9-track tape for me. She was a hoot: dressed
like a punk, and we talked about not working for defense companies. We were
then able to build and install various TCP/IP utilities on our PC-AT
machines running early Windows releases.  This was probably in the early
1980s. I think the PC/IP code was based on David Clark's upcalls idea[1]. It
later became the basis for FTP Software, which then died when Microsoft
added TCP to Windows.

[1]
https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=e7c6da568d15a3ea50854dbd951c1d6394502a15


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-24 12:45     ` arnold
  2023-02-24 13:13       ` Arno Griffioen via TUHS
@ 2023-02-24 14:01       ` Harald Arnesen
  1 sibling, 0 replies; 23+ messages in thread
From: Harald Arnesen @ 2023-02-24 14:01 UTC (permalink / raw)
  To: tuhs

arnold@skeeve.com [24/02/2023 13.45]:

> With tar and cpio, ar apparently fell out of use as a general
> archiver, and today it's only used for libraries of relocatable
> object files.

And Debian/Devuan .deb packages.
-- 
Hilsen Harald
Слава Україні!



* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-24 12:45     ` arnold
@ 2023-02-24 13:13       ` Arno Griffioen via TUHS
  2023-02-25 19:28         ` arnold
  2023-02-24 14:01       ` Harald Arnesen
  1 sibling, 1 reply; 23+ messages in thread
From: Arno Griffioen via TUHS @ 2023-02-24 13:13 UTC (permalink / raw)
  To: tuhs

On Fri, Feb 24, 2023 at 05:45:16AM -0700, arnold@skeeve.com wrote:
> With tar and cpio, ar apparently fell out of use as a general
> archiver, and today it's only used for libraries of relocatable
> object files.

'ar' is alive and well as the archive format for .deb files though, so it 
could be argued that it's actually hugely in use as an archiver all across 
the world :)
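
For anyone who wants to poke at one, here is a rough sketch of walking the
members of an archive in the common "!<arch>" text-header format (8-byte
global magic, then a 60-byte header per member, data padded to an even
length). A .deb starts with a member called debian-binary. The archive
below is built in memory for the demonstration; make_member is a helper
invented for this example, and GNU long-name tables are not handled.

```python
def ar_members(blob):
    """Yield (name, data) for each member of a common-format ar archive."""
    if blob[:8] != b"!<arch>\n":
        raise ValueError("not an ar archive")
    pos = 8
    while pos + 60 <= len(blob):
        header = blob[pos:pos + 60]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header")
        name = header[:16].decode("ascii").rstrip()
        # GNU ar terminates short names with '/'; strip it, but leave the
        # special members '/' and '//' alone.
        if name.endswith("/") and name.strip("/"):
            name = name[:-1]
        size = int(header[48:58].decode("ascii"))
        start = pos + 60
        yield name, blob[start:start + size]
        pos = start + size + (size & 1)  # members start on even offsets


def make_member(name, data):
    """Build one member: 60-byte header, data, and a pad byte if needed."""
    header = "%-16s%-12d%-6d%-6d%-8o%-10d`\n" % (name, 0, 0, 0, 0o644, len(data))
    return header.encode("ascii") + data + (b"\n" if len(data) & 1 else b"")


archive = (b"!<arch>\n"
           + make_member("debian-binary", b"2.0\n")
           + make_member("odd", b"abc"))
print([name for name, _ in ar_members(archive)])
```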

							Bye, Arno.


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-23 16:49   ` Paul Winalski
  2023-02-23 18:38     ` segaloco via TUHS
@ 2023-02-24 12:45     ` arnold
  2023-02-24 13:13       ` Arno Griffioen via TUHS
  2023-02-24 14:01       ` Harald Arnesen
  2023-02-25  2:07     ` Dave Horsfall
  2 siblings, 2 replies; 23+ messages in thread
From: arnold @ 2023-02-24 12:45 UTC (permalink / raw)
  To: paul.winalski, clemc; +Cc: tuhs, segaloco

Paul Winalski <paul.winalski@gmail.com> wrote:

> I can't think of any reason why ar(1) would care about the file format
> or internal contents of any of the modules it archives.  ar(1) is a
> general archiving tool and can archive anything.  It happens that the
> designers of ld(1) decided to use ar(1) to provide searchable object
> file libraries.
>
> ranlib(1) is a different matter.  In order to index global symbols it
> has to understand the object file format(s) of the modules it is
> indexing.  ranlib(1) most certainly would have to be taught to
> understand COFF.  But not ar(1).

You are correct that ar(1) was originally just an archiver. However,
the System V people built ranlib into it; the .a file for a library
has a sort of hidden extra member that is the list of symbols in
the archive.

With tar and cpio, ar apparently fell out of use as a general
archiver, and today it's only used for libraries of relocatable
object files.

Arnold


* [TUHS] Re: Origins of the SGS (System Generation Software)  and COFF (Common Object File Format)
  2023-02-23 22:11 ` segaloco via TUHS
@ 2023-02-24  0:07   ` segaloco via TUHS
  0 siblings, 0 replies; 23+ messages in thread
From: segaloco via TUHS @ 2023-02-24  0:07 UTC (permalink / raw)
  To: segaloco; +Cc: Paul Ruizendaal, tuhs

Correction, I was able to track it down, this is what I was thinking of, not a Basic-16 board: https://www.ebay.com/itm/225277136492?hash=item3473904e6c:g:Mx8AAOSwxl9jiPUr&amdata=enc%3AAQAHAAAAoD7DdgbGGFS1yXoVSRPovN62X0RChnzorSRATDa223OYNsLVsFrfS6CoAzjlT18M5o6A4V2IGayHcmBqXNMyu8Y3vmiprgMCbbc%2BnNocVmkV6Z2qy83ys05tFWKt2GONSWjerKYUdn1l8n%2BhjfD2sCK0qT6Yk02OMv8jr0YZJN22ghLXovR5IC8q%2BqmgcJWXgjK0jH9H%2FGwMHRVAyTTXC9A%3D%7Ctkp%3ABk9SR8ie24vQYQ

I didn't expect to see that still up, for those who don't want to follow the link, this is a link to a Bell Labs MAC-8 "Mactutor".  Still tempting, if it's still bumping around on eBay after my move I might just have to spring for it.

- Matt G.

------- Original Message -------
On Thursday, February 23rd, 2023 at 2:11 PM, segaloco via TUHS <tuhs@tuhs.org> wrote:


> Basic-16......augh I feel like I actually saw a Basic-16 eval board of some kind pop up in auctions in my documentation search the past few years. I thought about bidding but I didn't, could've had some cool hardware to reply back with pictures of. Lesson learned, if something catches my attention enough I should probably research it more closely.
> 
> Thanks for the article link, that pretty much captures the sort of "origin story" I was seeking out on both the tools and format. I now realize I could've known this already but didn't read far enough in the '84 Bell journal, I've got copies of that and the '78 one, I forget how many juicy details are in there that didn't make it into manuals and technical reports. All the more reason to go back through and take some notes...
> 
> - Matt G.
> 
> ------- Original Message -------
> On Thursday, February 23rd, 2023 at 1:37 PM, Paul Ruizendaal pnr@planet.nl wrote:
> 
> 
> 
> > > Date: Thu, 23 Feb 2023 18:38:25 +0000
> > > Subject: [TUHS] Re: Origins of the SGS (System Generation Software)
> > > and COFF (Common Object File Format)
> > > 
> > > For the sake of timelines:
> > > 
> > > June 1980 - Publication date on the front page of the 3.0 manual in which the utilities are still very much research for PDP-11 and 32V-ish for VAX where distinctions matter.
> > > 
> > > June 1981 - Publication date on the front page of the 4.1 manual in which the man-pages very much refer to all of this as the "3B-20 object format"
> > > 
> > > June 1982 - Publication date on the front page of the 5.0 manual by which point these same pages had been edited and extended to describe the "common object file format"
> > > 
> > > Additions at the 1981 release include dump(1), list(1), and the ld-prefixed library routines for managing these object files. These likewise persist in 5.0, SysV, and beyond as COFF-related tools.
> > > 
> > > So this puts the backstop of what would become COFF at at least '81.
> > > 
> > > - Matt G.
> > 
> > The surviving source code for SysV R2 supports this timeline:
> > - The header files (start from https://github.com/ryanwoodsmall/oldsysv/blob/master/sysvr2-vax/src/head/a.out.h) have dates of late ’82, early ’83.
> > - The source for exec() has a comment that refers to the 4xx magic formats as “pre 5.0 stuff”.
> > - The COFF format headers are #ifdef’ed for the 3B series.
> > 
> > Interestingly, the lowest magic numbers in the 5xx series are not for the 3B, but for the “Basic-16” and for the “x86”. That led me to this paper:
> > 
> > https://www.bell-labs.com/usr/dmr/www/otherports/newp.pdf
> > 
> > It seems that the roots of COFF go back to the initial portability effort for V7 and in particular the 8086 port (which was done in 1978 according to the paper).


* [TUHS] Re: Origins of the SGS (System Generation Software)  and COFF (Common Object File Format)
  2023-02-23 21:37 Paul Ruizendaal
@ 2023-02-23 22:11 ` segaloco via TUHS
  2023-02-24  0:07   ` segaloco via TUHS
  0 siblings, 1 reply; 23+ messages in thread
From: segaloco via TUHS @ 2023-02-23 22:11 UTC (permalink / raw)
  To: Paul Ruizendaal; +Cc: tuhs

Basic-16......augh I feel like I actually saw a Basic-16 eval board of some kind pop up in auctions in my documentation search the past few years.  I thought about bidding but I didn't, could've had some cool hardware to reply back with pictures of.  Lesson learned, if something catches my attention enough I should probably research it more closely.

Thanks for the article link, that pretty much captures the sort of "origin story" I was seeking out on both the tools and format.  I now realize I could've known this already but didn't read far enough in the '84 Bell journal, I've got copies of that and the '78 one, I forget how many juicy details are in there that didn't make it into manuals and technical reports.  All the more reason to go back through and take some notes...

- Matt G.

------- Original Message -------
On Thursday, February 23rd, 2023 at 1:37 PM, Paul Ruizendaal <pnr@planet.nl> wrote:


> > Date: Thu, 23 Feb 2023 18:38:25 +0000
> > Subject: [TUHS] Re: Origins of the SGS (System Generation Software)
> > and COFF (Common Object File Format)
> > 
> > For the sake of timelines:
> > 
> > June 1980 - Publication date on the front page of the 3.0 manual in which the utilities are still very much research for PDP-11 and 32V-ish for VAX where distinctions matter.
> > 
> > June 1981 - Publication date on the front page of the 4.1 manual in which the man-pages very much refer to all of this as the "3B-20 object format"
> > 
> > June 1982 - Publication date on the front page of the 5.0 manual by which point these same pages had been edited and extended to describe the "common object file format"
> > 
> > Additions at the 1981 release include dump(1), list(1), and the ld-prefixed library routines for managing these object files. These likewise persist in 5.0, SysV, and beyond as COFF-related tools.
> > 
> > So this puts the backstop of what would become COFF at at least '81.
> > 
> > - Matt G.
> 
> 
> 
> The surviving source code for SysV R2 supports this timeline:
> - The header files (start from https://github.com/ryanwoodsmall/oldsysv/blob/master/sysvr2-vax/src/head/a.out.h) have dates of late ’82, early ’83.
> - The source for exec() has a comment that refers to the 4xx magic formats as “pre 5.0 stuff”.
> - The COFF format headers are #ifdef’ed for the 3B series.
> 
> Interestingly, the lowest magic numbers in the 5xx series are not for the 3B, but for the “Basic-16” and for the “x86”. That led me to this paper:
> 
> https://www.bell-labs.com/usr/dmr/www/otherports/newp.pdf
> 
> It seems that the roots of COFF go back to the initial portability effort for V7 and in particular the 8086 port (which was done in 1978 according to the paper).


* [TUHS] Re: Origins of the SGS (System Generation Software)  and COFF (Common Object File Format)
@ 2023-02-23 21:37 Paul Ruizendaal
  2023-02-23 22:11 ` segaloco via TUHS
  0 siblings, 1 reply; 23+ messages in thread
From: Paul Ruizendaal @ 2023-02-23 21:37 UTC (permalink / raw)
  To: tuhs


> Date: Thu, 23 Feb 2023 18:38:25 +0000
> Subject: [TUHS] Re: Origins of the SGS (System Generation Software)
> 	and COFF (Common Object File Format)
> 
> For the sake of timelines:
> 
> June 1980 - Publication date on the front page of the 3.0 manual in which the utilities are still very much research for PDP-11 and 32V-ish for VAX where distinctions matter.
> 
> June 1981 - Publication date on the front page of the 4.1 manual in which the man-pages very much refer to all of this as the "3B-20 object format"
> 
> June 1982 - Publication date on the front page of the 5.0 manual by which point these same pages had been edited and extended to describe the "common object file format"
> 
> Additions at the 1981 release include dump(1), list(1), and the ld-prefixed library routines for managing these object files.  These likewise persist in 5.0, SysV, and beyond as COFF-related tools.
> 
> So this puts the backstop of what would become COFF at at least '81.
> 
> - Matt G.


The surviving source code for SysV R2 supports this timeline:
- The header files (start from https://github.com/ryanwoodsmall/oldsysv/blob/master/sysvr2-vax/src/head/a.out.h) have dates of late ’82, early ’83.
- The source for exec() has a comment that refers to the 4xx magic formats as “pre 5.0 stuff”.
- The COFF format headers are #ifdef’ed for the 3B series.

Interestingly, the lowest magic numbers in the 5xx series are not for the 3B, but for the “Basic-16” and for the “x86”. That led me to this paper:

https://www.bell-labs.com/usr/dmr/www/otherports/newp.pdf

It seems that the roots of COFF go back to the initial portability effort for V7 and in particular the 8086 port (which was done in 1978 according to the paper).




* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-23 18:38     ` segaloco via TUHS
@ 2023-02-23 20:40       ` Paul Winalski
  0 siblings, 0 replies; 23+ messages in thread
From: Paul Winalski @ 2023-02-23 20:40 UTC (permalink / raw)
  To: segaloco; +Cc: The Eunuchs Hysterical Society

One property of a.out is that the format of an executable image is
identical to that of an object file.  One could write an assembly
program that was self-contained and did not need to be linked to any
other modules, run that through as(1).  The resulting a.out file would
be executable.  No need to involve ld(1).

Mach-O retained this concept of supporting self-contained object files
as executables.  COFF departed from that tradition.  A COFF (or
PECOFF) executable requires a data structure in the file called the
optional header that contains important instructions to the program
loader.  This second header is "optional" only in the sense that
non-executable object files do not have one.  The optional header is
created by ld(1).  I'm almost certain that versions of as(1) that
generate COFF output do not generate an optional header.
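
The distinction is visible in the 20-byte COFF file header, whose f_opthdr
field gives the size of the optional header that follows it. A sketch,
assuming the classic filehdr layout and little-endian byte order; the magic
number, section count and optional-header size in the sample headers are
illustrative values only.

```python
import struct

# COFF file header fields, in order:
# f_magic, f_nscns, f_timdat, f_symptr, f_nsyms, f_opthdr, f_flags
FILEHDR = struct.Struct("<HHiiiHH")  # 20 bytes; little-endian ASSUMED here


def has_optional_header(raw):
    """True if the COFF file header declares a non-empty optional header."""
    f_opthdr = FILEHDR.unpack_from(raw)[5]
    return f_opthdr != 0


# Sample headers (made-up contents): an object file as an assembler might
# emit it, and an executable as a linker might emit it.
obj = FILEHDR.pack(0o514, 3, 0, 0, 0, 0, 0)
exe = FILEHDR.pack(0o514, 3, 0, 0, 0, 28, 0o2)
print(has_optional_header(obj), has_optional_header(exe))  # False True
```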

In ELF, executable images have several sections such as the program
header table section that contain the information present in COFF's
optional header.
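
In ELF the loader finds that information through three fields of the file
header: e_phoff, e_phentsize and e_phnum. The sketch below assumes a
little-endian ELF64 header; the two sample headers are synthetic, with
values picked only to show the difference between a relocatable object and
an executable.

```python
import struct

# ELF64 file header, little-endian: e_ident[16], e_type, e_machine,
# e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize,
# e_phnum, e_shentsize, e_shnum, e_shstrndx
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes


def program_header_table(raw):
    """Return (offset, entry_size, entry_count) of the program header table."""
    fields = ELF64_EHDR.unpack_from(raw)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 header")
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    return e_phoff, e_phentsize, e_phnum


IDENT = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # class 64, LSB, version 1
# Synthetic headers: type 1 is a relocatable object, type 2 an executable.
obj = ELF64_EHDR.pack(IDENT, 1, 62, 1, 0, 0, 0, 0, 64, 0, 0, 64, 0, 0)
exe = ELF64_EHDR.pack(IDENT, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 4, 64, 0, 0)
print(program_header_table(obj), program_header_table(exe))
```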

Traditionally, operating systems such as IBM's OS for System/360/370/Z
and DEC's VMS have very different formats for object files and
executable programs.  OpenVMS for Itanium and x86-64 use ELF for both
objects and executables--a departure from the practice used on VAX and
Alpha.

-Paul W.


* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-23  6:30   ` Lars Brinkhoff
  2023-02-23 14:25     ` KenUnix
@ 2023-02-23 19:37     ` Warner Losh
  2023-02-24 17:01       ` Rich Salz
  1 sibling, 1 reply; 23+ messages in thread
From: Warner Losh @ 2023-02-23 19:37 UTC (permalink / raw)
  To: Lars Brinkhoff; +Cc: segaloco, The Eunuchs Hysterical Society


On Wed, Feb 22, 2023 at 11:30 PM Lars Brinkhoff <lars@nocrew.org> wrote:

> Clem Cole wrote:
> > MIT had a modified a.out format for the NU machine ports - that might
> > have been called b.out.  CMU had macho which again was an extended
> > a.out but even more flexible.
>
> Digital Research's GEMDOS also used a modified a.out format (at least as
> found on the Atari ST), pretty much the original PDP-11 format with
> 32-bit addresses.
>

MIT's PCC ports to various micros, especially x86, also used b.out that was
quite similar to a.out.

Warner

[-- Attachment #2: Type: text/html, Size: 941 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-23 16:49   ` Paul Winalski
@ 2023-02-23 18:38     ` segaloco via TUHS
  2023-02-23 20:40       ` Paul Winalski
  2023-02-24 12:45     ` arnold
  2023-02-25  2:07     ` Dave Horsfall
  2 siblings, 1 reply; 23+ messages in thread
From: segaloco via TUHS @ 2023-02-23 18:38 UTC (permalink / raw)
  To: Paul Winalski; +Cc: The Eunuchs Hysterical Society

For the sake of timelines:

June 1980 - Publication date on the front page of the 3.0 manual in which the utilities are still very much research for PDP-11 and 32V-ish for VAX where distinctions matter.

June 1981 - Publication date on the front page of the 4.1 manual in which the man-pages very much refer to all of this as the "3B-20 object format"

June 1982 - Publication date on the front page of the 5.0 manual by which point these same pages had been edited and extended to describe the "common object file format"

Additions at the 1981 release include dump(1), list(1), and the ld-prefixed library routines for managing these object files.  These likewise persist in 5.0, SysV, and beyond as COFF-related tools.

So this puts the backstop of what would become COFF at at least '81.

- Matt G.

------- Original Message -------
On Thursday, February 23rd, 2023 at 8:49 AM, Paul Winalski <paul.winalski@gmail.com> wrote:


> On 2/22/23, Clem Cole clemc@ccc.com wrote:
> 
> > > - The System V manual has both this ar(1) version as well as the new
> > > COFF-supporting version.
> > > 
> > > Why would ar(1) care?
> > > 
> > > - Not sure if this implies the VAX ar format was expanded to support
> > > the COFF stuff for a little while until they decided on a new one or
> > > what.
> 
> 
> I can't think of any reason why ar(1) would care about the file format
> or internal contents of any of the modules it archives. ar(1) is a
> general archiving tool and can archive anything. It happens that the
> designers of ld(1) decided to use ar(1) to provide searchable object
> file libraries.
> 
> ranlib(1) is a different matter. In order to index global symbols it
> has to understand the object file format(s) of the modules it is
> indexing. ranlib(1) most certainly would have to be taught to
> understand COFF. But not ar(1).
> 
> > > and development software stuff until ELF comes along some time later.
> > 
> > Yep - never quite understood what the push for ELF was over COFF after all
> > the effort to drive COFF down people's throat. Note Microsoft "embraced
> > and extended" COFF as their format -- originally because of Xenix I
> > believe.
> > Someone like Paul W may have some insights on this and that was before
> > the 3B20.
> 
> 
> a.out was, as object file formats go, a throwback to the stone age
> from the get-go. Even the most primitive of IBM's link editors for
> System/360 supported arbitrary naming of object file sections and the
> ability for the programmer to arrange them in whatever order they
> wished. a.out's restriction to three sections (.text, .data, .bss)
> did manage to get the job done, and even (with ZMAGIC) could support
> demand-paged virtual memory, but only just.
> 
> It became pretty clear in the 1980s that an object file format more
> powerful and flexible than a.out was needed. CMU developed their own
> object file format (MACH-O) for their MACH microkernel-based OS. It
> had up to 8 object file sections, and the section properties (e.g.,
> > read vs. read/write; executable vs. data) were not tied to the section
> name as in a.out. A big step forward, although still primitive
> compared to the object formats of VAX/VMS and the IBM S/370 OSes.
> Apple MacOS X still uses MACH-O for object files and executables.
> 
> Whatever its origins, what we now know as COFF (Common Object File
> Format) is, as its name implies, intended to be OS- and
> machine-independent. It still has a relatively small number of
> sections, albeit more than MACH-O. When Microsoft developed Windows
> NT, they needed to replace their own MZ executable format with
> something that could support shareable images and they decided to go
> with COFF for both object files and for executables. In typical
> Microsoft embrace-and-extend fashion, their Portable Executable and
> Common Object File Format (PECOFF) is a heavily modified version of
> COFF with lots of MS-specific extensions. When DEC's GEM back end was
> chosen as the optimizer and code generator for Microsoft C/C++ on
> Windows NT for the DEC Alpha chip, I had to add PECOFF support to
> GEM's existing COFF support (which was used by DEC's commercially sold
> compilers for Ultrix). My original idea was to put the PECOFF support
> under conditional compilation (#ifdef PECOFF), but the two formats
> > were sufficiently different that I abandoned that idea, cloned the
> existing COFF module, and then modified that to create a separate
> PECOFF module.
> 
> > ELF is far more flexible than COFF, PECOFF, or MACH-O. Those
> three make a distinction between sections (the bits that eventually
> end up in memory) and the metadata pieces of an object file or
> executable (program headers, symbol table, debug information, etc.).
> In ELF, everything is a section, including the symbol table and the
> tables that direct the program loader in mapping shareable images into
> a process's memory. ELF was originally limited to 64K sections
> (section numbers were unsigned 16-bit), but there is now a scheme for
> 32-bit section numbers. The essentially unlimited number of sections
> is a big boon to languages such as C++, where grouped sections with a
> name-decoration convention provide a convenient way to support sharing
> of class definitions without requiring language-specific tweaks to the
> software development toolset. Contrast this with the Ada
> implementations I'm aware of, which have their own software
> development library systems layered on top of the conventional
> > compiler/linker/archiver to ensure that program modules are compiled
> and linked in the correct order.
> 
> I don't know what the timeline for the invention of COFF was. It was
> already called COFF and in widespread use by the time I encountered it
> when we added Ultrix support to GEM. I think MACH-O predated COFF;
> it's certainly more primitive than COFF. MACH-O was probably early to
> mid-1980s. OS kernel bloat was a recognized problem at the time and
> microkernel-based OSes were all the rage. At DEC, Dave Cutler wrote a
> microkernel-based OS called VAXeln to replace VAX/VMS for real-time
> applications. A lot of concepts from VAXeln found their way into
> Windows NT when Cutler left DEC for Microsoft.
> 
> -Paul W.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-22 22:20 ` [TUHS] " Clem Cole
  2023-02-23  0:17   ` segaloco via TUHS
  2023-02-23  6:30   ` Lars Brinkhoff
@ 2023-02-23 16:49   ` Paul Winalski
  2023-02-23 18:38     ` segaloco via TUHS
                       ` (2 more replies)
  2 siblings, 3 replies; 23+ messages in thread
From: Paul Winalski @ 2023-02-23 16:49 UTC (permalink / raw)
  To: Clem Cole; +Cc: segaloco, The Eunuchs Hysterical Society

On 2/22/23, Clem Cole <clemc@ccc.com> wrote:
>>
>>    - The System V manual has both this ar(1) version as well as the new
>>    COFF-supporting version.
>>
>> Why would ar(1) care?
>>
>>    - Not sure if this implies the VAX ar format was expanded to support
>>    the COFF stuff for a little while until they decided on a new one or
>> what.

I can't think of any reason why ar(1) would care about the file format
or internal contents of any of the modules it archives.  ar(1) is a
general archiving tool and can archive anything.  It happens that the
designers of ld(1) decided to use ar(1) to provide searchable object
file libraries.

ranlib(1) is a different matter.  In order to index global symbols it
has to understand the object file format(s) of the modules it is
indexing.  ranlib(1) most certainly would have to be taught to
understand COFF.  But not ar(1).
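
To make the distinction concrete, here is a minimal sketch of walking a
V7-style archive in memory.  It needs only the magic word and each
member's ar_size; the member contents are skipped without being looked
at.  The function names are invented for the example, and the 26-byte
header size, the PDP-11 word order for longs, and the padding of
odd-sized members to an even boundary are assumptions about how the V7
<ar.h> layout appears on disk.

```c
#include <stddef.h>
#include <stdint.h>

#define V7_ARMAG    0177545u /* magic word from the V7 <ar.h> */
#define V7_AR_HDRSZ 26       /* 14 + 4 + 1 + 1 + 2 + 4 bytes (assumed) */

/* PDP-11 16-bit word: low byte first. */
static uint16_t pdp_u16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* PDP-11 C long: high-order word first, each word low byte first. */
static uint32_t pdp_u32(const unsigned char *p)
{
    return ((uint32_t)pdp_u16(p) << 16) | pdp_u16(p + 2);
}

/*
 * Count the members of an in-memory archive.  Returns -1 if the magic
 * word is wrong or a member runs past the end of the buffer.  Member
 * contents are skipped, never interpreted.
 */
long v7_ar_count(const unsigned char *buf, size_t len)
{
    size_t off = 2;
    long n = 0;

    if (len < 2 || pdp_u16(buf) != V7_ARMAG)
        return -1;
    while (off < len) {
        uint32_t size;

        if (len - off < V7_AR_HDRSZ)
            return -1;
        size = pdp_u32(buf + off + 22);  /* ar_size field */
        off += V7_AR_HDRSZ;
        if (size > len - off)
            return -1;
        off += size;
        if ((size & 1) && off < len)     /* assumed pad to even */
            off++;
        n++;
    }
    return n;
}
```

An index builder in the style of ranlib(1) would have to go one step
further and parse each member's symbol table, which is where knowledge
of a.out or COFF becomes unavoidable.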

>> and development software stuff until ELF comes along some time later.
>>
> Yep - never quite understood what the push for ELF was over COFF after all
> the effort to drive COFF down people's throat.   Note Microsoft "embraced
> and extended" COFF as their format -- originally because of Xenix I
> believe.
>    Someone like Paul W may have some insights on this and that was before
> the 3B20.

a.out was, as object file formats go, a throwback to the stone age
from the get-go.  Even the most primitive of IBM's link editors for
System/360 supported arbitrary naming of object file sections and the
ability for the programmer to arrange them in whatever order they
wished.  a.out's restriction to three sections (.text, .data, .bss)
did manage to get the job done, and even (with ZMAGIC) could support
demand-paged virtual memory, but only just.

It became pretty clear in the 1980s that an object file format more
powerful and flexible than a.out was needed.  CMU developed their own
object file format (MACH-O) for their MACH microkernel-based OS.  It
had up to 8 object file sections, and the section properties (e.g.,
read vs. read/write; executable vs. data) were not tied to the section
name as in a.out.  A big step forward, although still primitive
compared to the object formats of VAX/VMS and the IBM S/370 OSes.
Apple MacOS X still uses MACH-O for object files and executables.

Whatever its origins, what we now know as COFF (Common Object File
Format) is, as its name implies, intended to be OS- and
machine-independent.  It still has a relatively small number of
sections, albeit more than MACH-O.  When Microsoft developed Windows
NT, they needed to replace their own MZ executable format with
something that could support shareable images and they decided to go
with COFF for both object files and for executables.  In typical
Microsoft embrace-and-extend fashion, their Portable Executable and
Common Object File Format (PECOFF) is a heavily modified version of
COFF with lots of MS-specific extensions.  When DEC's GEM back end was
chosen as the optimizer and code generator for Microsoft C/C++ on
Windows NT for the DEC Alpha chip, I had to add PECOFF support to
GEM's existing COFF support (which was used by DEC's commercially sold
compilers for Ultrix).  My original idea was to put the PECOFF support
under conditional compilation (#ifdef PECOFF), but the two formats
were sufficiently different that I abandoned that idea, cloned the
existing COFF module, and then modified that to create a separate
PECOFF module.

ELF is far more flexible than COFF, PECOFF, or MACH-O.  Those
three make a distinction between sections (the bits that eventually
end up in memory) and the metadata pieces of an object file or
executable (program headers, symbol table, debug information, etc.).
In ELF, everything is a section, including the symbol table and the
tables that direct the program loader in mapping shareable images into
a process's memory.  ELF was originally limited to 64K sections
(section numbers were unsigned 16-bit), but there is now a scheme for
32-bit section numbers.  The essentially unlimited number of sections
is a big boon to languages such as C++, where grouped sections with a
name-decoration convention provide a convenient way to support sharing
of class definitions without requiring language-specific tweaks to the
software development toolset.  Contrast this with the Ada
implementations I'm aware of, which have their own software
development library systems layered on top of the conventional
compiler/linker/archiver to ensure that program modules are compiled
and linked in the correct order.
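
For readers who have not met the extended numbering scheme, the sketch
below shows how a reader of ELF files can recover the real section
count.  In the scheme as I understand it from the gABI, a file with
0xff00 (SHN_LORESERVE) or more sections stores zero in the 16-bit
e_shnum field and puts the true count in the sh_size field of section
header 0.  The function names and the has_shdrs parameter are made up
for this illustration.

```c
#include <stdint.h>

#define SHN_LORESERVE 0xff00u /* first reserved section index */

/* Nonzero if a count of n sections cannot be stored in e_shnum. */
int elf_needs_extended_count(uint64_t n)
{
    return n >= SHN_LORESERVE;
}

/*
 * Real number of section headers in an ELF file.
 *   e_shnum     the 16-bit count from the ELF header
 *   shdr0_size  sh_size of section header 0 (only used if e_shnum is 0)
 *   has_shdrs   nonzero if e_shoff != 0, i.e. a section header table
 *               exists at all
 */
uint64_t elf_section_count(uint16_t e_shnum, uint64_t shdr0_size,
                           int has_shdrs)
{
    if (!has_shdrs)
        return 0;
    if (e_shnum != 0)
        return e_shnum;
    return shdr0_size;
}
```

Section indexes stored in symbols and in e_shstrndx use a similar
escape value, which this sketch does not cover.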

I don't know what the timeline for the invention of COFF was.  It was
already called COFF and in widespread use by the time I encountered it
when we added Ultrix support to GEM.  I think MACH-O predated COFF;
it's certainly more primitive than COFF.  MACH-O was probably early to
mid-1980s.  OS kernel bloat was a recognized problem at the time and
microkernel-based OSes were all the rage.  At DEC, Dave Cutler wrote a
microkernel-based OS called VAXeln to replace VAX/VMS for real-time
applications.  A lot of concepts from VAXeln found their way into
Windows NT when Cutler left DEC for Microsoft.

-Paul W.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-23  6:30   ` Lars Brinkhoff
@ 2023-02-23 14:25     ` KenUnix
  2023-02-23 19:37     ` Warner Losh
  1 sibling, 0 replies; 23+ messages in thread
From: KenUnix @ 2023-02-23 14:25 UTC (permalink / raw)
  To: Lars Brinkhoff; +Cc: segaloco, The Eunuchs Hysterical Society

[-- Attachment #1: Type: text/plain, Size: 1321 bytes --]

Point of interest.  In the mid 80's Northern Telecom had a product called
the DV-1. It was a
Motorola-based system on the 68010. At the end of its life it was able to
run GEM with a floppy.
The DV-1 terminal was 68000 based. See some snaps at:

https://www.flickr.com/photos/9479603@N02/1814557731/in/album-72157602824219250/
https://www.flickr.com/photos/9479603@N02/1814548075/in/album-72157602824219250/
https://www.flickr.com/photos/9479603@N02/1814557771/in/album-72157602824219250/
https://www.flickr.com/photos/9479603@N02/1814557653/in/album-72157602824219250/
https://www.flickr.com/photos/9479603@N02/albums/72157602824219250/page2

My complete photo/movie history is at:
https://www.flickr.com/photos/9479603@N02/albums

If you have any to add please pass them along.

Ken


On Thu, Feb 23, 2023 at 1:30 AM Lars Brinkhoff <lars@nocrew.org> wrote:

> Clem Cole wrote:
> > MIT had a modified a.out format for the NU machine ports - that might
> > have been called b.out.  CMU had macho which again was an extended
> > a.out but even more flexible.
>
> Digital Research's GEMDOS also used a modified a.out format (at least as
> found on the Atari ST), pretty much the original PDP-11 format with
> 32-bit addresses.
>
> [Shouldn't this go to the COFF mailing list... oh sorry.]
>


-- 
End of line
JOB TERMINATED

[-- Attachment #2: Type: text/html, Size: 2493 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-22 22:20 ` [TUHS] " Clem Cole
  2023-02-23  0:17   ` segaloco via TUHS
@ 2023-02-23  6:30   ` Lars Brinkhoff
  2023-02-23 14:25     ` KenUnix
  2023-02-23 19:37     ` Warner Losh
  2023-02-23 16:49   ` Paul Winalski
  2 siblings, 2 replies; 23+ messages in thread
From: Lars Brinkhoff @ 2023-02-23  6:30 UTC (permalink / raw)
  To: Clem Cole; +Cc: segaloco, The Eunuchs Hysterical Society

Clem Cole wrote:
> MIT had a modified a.out format for the NU machine ports - that might
> have been called b.out.  CMU had macho which again was an extended
> a.out but even more flexible.

Digital Research's GEMDOS also used a modified a.out format (at least as
found on the Atari ST), pretty much the original PDP-11 format with
32-bit addresses.

[Shouldn't this go to the COFF mailing list... oh sorry.]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-22 22:20 ` [TUHS] " Clem Cole
@ 2023-02-23  0:17   ` segaloco via TUHS
  2023-02-23  6:30   ` Lars Brinkhoff
  2023-02-23 16:49   ` Paul Winalski
  2 siblings, 0 replies; 23+ messages in thread
From: segaloco via TUHS @ 2023-02-23  0:17 UTC (permalink / raw)
  To: Clem Cole; +Cc: The Eunuchs Hysterical Society

[-- Attachment #1: Type: text/plain, Size: 14843 bytes --]

Thanks for the insights as always Clem! Mea culpa on not looking at the ar situation a little more broadly, was pretty hyperfocused on 3B-20 stuff.

As for compatibility between development tools, I was mainly referring to available option switches (the kind of thing that could potentially trip up someone's scripting). Once SGS hits it seems like everything moving forward from that endeavored to keep comparable command-line switches, but just to further compare:

- PDP as(1) uniquely has the '-' option to treat all undefined labels as globals (as opposed to the much more common "this arg is stdin" that a lone '-' would give elsewhere.) VAX as(1) drops this and adds -dN to define the size reservation for undefined symbols. SGS as(1) keeps neither of these, opting to drop '-' entirely and implement the -dN functionality as -b, -w, and -l instead.

- Things are a bit better for ld(1). SGS ld(1) drops the -X option of the earlier version (pertains to cc behavior regarding internal labels, maybe irrelevant with pcc in the picture). Also presumably the -n and -i options are dropped as their actions are already default on VAX or otherwise only pertain to PDP. Old ld(1) had a -V option to store a version string in the resulting object. This becomes -VS in SGS ld(1) to accommodate -V being a standard "report my version" flag. SGS ld(1) then goes on to add -e to explicitly denote the entry point, -f to provide a short int fill value for sections needing it. We also pick up the now common -L for adding library paths. So all in all, more commonality with pre-SGS ld(1) but still technically some breaking option changes.

- Looks like nm(1) may have some appreciable changes. The -g (only print globals), -p (print in symbol table order), -r (print in reverse order), and -s (sort by size) option values are removed. A few are replaced by different options: -n was originally sort numerically instead of alphabetically (presumably by value rather than name), but in the SGS version, this is reversed, -n being the print by name order option instead (alphabetical is default in old nm(1)). The -o option morphs from meaning to include the name of the source file in the output to print the symbol value as octal. For SGS nm(1), we see the addition of -x (print in hex), -h (suppress headers), -v (sort by value, presumably replaces the old -n meaning), -e (only print statics and externals), -f ("full" output), and -V (version). This presents breaking changes for all but one of the switches to the earlier version of nm(1). For the record, V7, 32V, and System III all appear to have a comparable version. This utility is particularly interesting because a perusal of the current SUS https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/nm.html shows a mishmash of options, with -g and -u surviving all the way from V7, whereas -e, -f, -o, -v, -x derive from the SGS behavior. Then there are the -A, -P, and -t options which are explained in the rationale section of the standard; basically these are POSIX additions to avoid using conflicting option types where possible.

- As for size(1) and strip(1), the SGS versions only add options.

- Finally, a few utilities are added. System III features a dump(1) command that is some sort of tape dump utility, but this name is repurposed at least as of 4.1 into an object file section dumper, a role it retains. The list(1) utility is also added in 4.1.

So in detailed review, as(1), ld(1), and nm(1) had the most changes between the research versions and what eventually landed in SGS, with nm(1) especially being wildly incompatible from a command-line option standpoint. As for as(1), one PDP-11 option drops and one VAX option changes switches (but not functionality as far as I can tell). Finally, ld(1) seems to drop a few options that aren't needed in a non-PDP world and adjusts the version-assignment option to allow -V to be a universal version request of the various object utilities.

- Matt G.

------- Original Message -------
On Wednesday, February 22nd, 2023 at 2:20 PM, Clem Cole <clemc@ccc.com> wrote:

> below are some thoughts/hopefully answers to your questions....
>
> On Wed, Feb 22, 2023 at 3:16 PM segaloco via TUHS <tuhs@tuhs.org> wrote:
>
>> Good day all, figured I'd start a thread on this matter as I'm starting to piece enough together to articulate the questions arising in my research.
>>
>> So based on my analysis of the 3B20S UNIX 4.1 manual I've been working through, all evidence points to the formalized SGS package and COFF originating tightly coupled to the 3B-20 line, then growing legs to support VAX, but never quite absorbing PDP-11 in entirety. That said, there are bits and pieces of the manual pages for the object format libraries that suggest there was some provenance for PDP-11 in the development of COFF as well.
>>
>> Where this has landed though is a growing curiosity regarding:
>>
>> - Whether SGS and COFF were tightly coupled to one another from the outset, with SGS being supported by the general library routines being developed for the COFF format
>
> @scj - any enlightenment -- your team in USG must have been part of all that.
>
>> - Whether COFF was envisioned as a one-size-fits-all object format from its inception or started as an experiment in 3B-20 development that wound up being general enough for other platforms
>
> That I can not say, but I can say that to the UNIX source licensees (i.e. not the Universities in the Research system or inside of the Bell Systems) - it was used in the "consider it standard" campaign that AT&T marketing in NC was starting to push. This was around the time that PCC2 was coming out to replace the original PCC but I remember getting PCC2 was extra cost.
>
> Most of the BSD based kernels (DEC, HP, etc..) were originally using a modified a.out of their own flavor but I think almost all them switched to COFF post the System III license. What I have forgotten, and it may have been a requirement/mixed up in the license.
>
> I do remember this was right around when gcc first starts coming out, and they had a tool called robitussin to "cure coffs" as they were using a.out when they could.
>
>> - If, prior to this format, there were any other efforts to produce a unifying binary format and set of development tools, or if COFF was a happy accident from what were a myriad of different architectural toolset streams
>
> MIT had a modified a.out format for the NU machine ports - that might have been called b.out.
> CMU had macho which again was an extended a.out but even more flexible.
>
>> - One of the curious things is how VAX for a brief moment did have its own set of tools and a.out particulars before SGS/COFF.
>
> Why is that curious - all original Vax development was just using the original PCC stream from V7 (and pre-Judge Green more in a minute).
>
> What I don't remember is if PCC2 was COFF when introduced, or COFF came first but I think they were separate things - again someone like scj would be authoritative.
>
> The three tools that have to care are the assembler (as), the linker (ld), and the program loading code in the kernel itself.
>
>> For instance, many of the VAX-targeted utilities in 3.0/System III bear little in common option/manual-wise with the general common SGS utilities in System V. The "not on PDP-11" pages for various SGS components in System V much more closely resemble the 3B-20 utilities in 4.1 than any of the non PDP-11/VAX-only bits in System III.
>>
>> Some examples:
>>
>> - The VAX assembler in System III contains a -dN option indicating the number of bytes to set aside for forward/external references for the linker to fill in.
>> - The VAX assembler in System V contains among others the -n and -m options from 4.1 which indicate to disable address optimization and use m4 respectively
>> - The System V assembler goes on to also include -R (remove input file after completion) -r (VAX only, add .data contents to .text instead) and options -b, -w, and -l to replace the -d1, -d2, and -d4 options indicated in the previous VAX assembler
>> - System V further adds a -V to all the SGS software indicating the version of the software. This is new circa 5.0, absent from the 4.1 manual like the R, r, b, w, and l options
>>
>> - The 4.1 manual's singular ar(1) entry still agrees with the System III version. No arcv(1) is listed, implying the old ar format never made it to 3B-20
>
> Hmm this is confusing old v[456] ar format to new ar format was during Research V6 to Research V7. By the time of any Vax development the old format had pretty much been killed. I'd check what PWB 1.0 and 2.0 used. The new ar format was independent of what was in it.
>
> i.e. V7: man 5 ar
>
> [AR(5)](http://man.cat-v.org/unix_7th/5/AR)
> NAME
>           ar - archive (library) file format
>
>      SYNOPSIS
>           #include <ar.h>
>
>      DESCRIPTION
>           The archive command ar is used to combine several files into
>           one.  Archives are used mainly as libraries to be searched
>           by the link-editor ld.
>
>           A file produced by ar has a magic number at the start, fol-
>           lowed by the constituent files, each preceded by a file
>           header.  The magic number and header layout as described in
>           the include file are:
>
>           #define ARMAG 0177545
>
>           struct ar_hdr {
>                   char ar_name[14];
>                   long ar_date;
>                   char ar_uid;
>                   char ar_gid;
>                   int  ar_mode;
>                   long ar_size;
>           };
>
>> - The System V manual has both this ar(1) version as well as the new COFF-supporting version.
>
> Why would ar(1) care?
>
>> - Not sure if this implies the VAX ar format was expanded to support the COFF stuff for a little while until they decided on a new one or what.
>>
>> - The System III ld (which is implied to support PDP and VAX) survives in System V, but is cut down to supporting PDP-11 only
>> - The COFF-ish ld shows up in 4.1, is then extended to VAX presumably in the same breath as the other COFF-supporting bits by Sys V, leading to two copies like many others, PDP-11-specific stuff and then COFF-specific stuff
>>
>> The picture that starts to form in the context of all of this is, for a little while in the late 70s/early 80s, the software development environments for PDP-11, VAX-11, and 3B-20 were interplaying with each other in often times inconsistent ways. Taking a peek at the 32V manuals, the VAX tools in System III appear to originate with that project, which makes sense. If I'm understanding the timeline, COFF starts to emerge from the 3B-20 project and USG probably decides that's the way to go, a unified format, but with PDP-11 pretty much out the door support wise already, there was little reason to apply that to PDP-11 as well, so the PDP-11 tools get their swan song in System V, original VAX-11 tools from 32V are likely killed off in 4.x, and the stuff that started with the 3B-20 group goes on to dominate the object file format
>
> That makes sense - but be careful - the 3B and WE32000 ISA may have been the driver but I would expect that compiler folk in Summit were more in the driver seat. The 3B20 kernel would use what they were getting from the tools team and core kernel team in USG.
>
> Remember the politics at the time: Judge Green has unleashed AT&T and they are now allowed to be in the biz, and the sales/marketing folks at AT&T were pushing the 3B20 and the WE32000 - so there are big forces behind the scenes that are not obvious/clear.
>
>> and development software stuff until ELF comes along some time later.
>
> Yep - never quite understood what the push for ELF was over COFF after all the effort to drive COFF down people's throat. Note Microsoft "embraced and extended" COFF as their format -- originally because of Xenix I believe.  Someone like Paul W may have some insights on this and that was before the 3B20.
>
> What was the format that the original Xenix used - when it was targeting PDP-11, 68000, x86 and Z8000? Again I'm fuzzy on the details here. But I do remember during the license discussions that would lead to System III, that one of things the Microsoft team was worried about -- IIRC it was Bob Greenberg pushing all that.  I lost contact with Bob a few years ago, but if we can find him, I would expect Bob to know what Xenix was doing. And again that negotiation>>starts<< all pre-Judge Green, but finishes up soon afterwards.
>
>> I guess other questions this raises are:
>>
>> - Were the original VAX tools built with any attention to compatibility with the PDP-11 bits Ken and Dennis wrote many years prior (based on some option discrepancies, possibly not?)
>
> hrmph... folks started with the PDP-11 tools and changed them as needed. I'm not sure compatibility is the right term. They were retargeted and moved forward by people trying to support a new machine they got and did not want to run DEC's OS.
>
>> - Do the VAX utilities derive from the Interdata 8/32 work or if there was actually another stream of tools as part of that project?
>
> I guess I don't understand the question. The original V7 tools were retargeted. When useful features were added, they might be offered/returned to other folks, but remember, Research is not "supporting" UNIX. USG is where things start to think in terms of multiple targets >>before Judge Green<< and then after Judge Green, there was a push to stop using non-AT&T based equipment or chips in the Bell System and make what Western Electric was selling be attractive [which sometimes was a little bit of putting lipstick on porcine as it were]. For instance, Rob and Bart's original JERQ is 68000 based, but by the time it becomes a product as 5620 it has to be refactored as a WE32000.
>
>> - Was there any interplay between the existing tool streams (original PDP-11, 32V's VAX utilities, possibly Interdata 8/32) and the eventual COFF/SGS stuff, or was the latter pretty well siloed in 3B-20 land until deployment with 4.1?
>
> I think you are putting too much on the 3B program itself. The 3B was the task at hand at the time and a solid opportunity to bring to bear business choices being made. You need to look at the greater business to understand a lot of the choices. A lot of things were happening in parallel in the market that had other impacts on technology and how it was delivered -- the 3B program was the "technology train" leaving the station that some of them got attached to/delivered using.
>
> But, as I said to you when we chatted, you really can not underestimate what was happening (or not happening) as AT&T changed its business focus - pre/post-Judge Green. It was a large company with lots of different spheres of interest (read - different executives), each being measured with different things that they might value.

[-- Attachment #2: Type: text/html, Size: 28275 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
  2023-02-22 20:16 [TUHS] " segaloco via TUHS
@ 2023-02-22 22:20 ` Clem Cole
  2023-02-23  0:17   ` segaloco via TUHS
                     ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Clem Cole @ 2023-02-22 22:20 UTC (permalink / raw)
  To: segaloco; +Cc: The Eunuchs Hysterical Society

[-- Attachment #1: Type: text/plain, Size: 10926 bytes --]

below are some thoughts/hopefully answers to your questions....

On Wed, Feb 22, 2023 at 3:16 PM segaloco via TUHS <tuhs@tuhs.org> wrote:

> Good day all, figured I'd start a thread on this matter as I'm starting to
> piece enough together to articulate the questions arising in my research.
>
> So based on my analysis of the 3B20S UNIX 4.1 manual I've been working
> through, all evidence points to the formalized SGS package and COFF
> originating tightly coupled to the 3B-20 line, then growing legs to support
> VAX, but never quite absorbing PDP-11 in entirety. That said, there are
> bits and pieces of the manual pages for the object format libraries that
> suggest there was some provision for PDP-11 in the development of COFF as
> well.
>
> Where this has landed though is a growing curiosity regarding:
>
>    1. Whether SGS and COFF were tightly coupled to one another from the
>    outset, with SGS being supported by the general library routines being
>    developed for the COFF format
>
@scj - any enlightenment? Your team in USG must have been part of all
that.

>
>    1. Whether COFF was envisioned as a one-size-fits-all object format
>    from its inception or started as an experiment in 3B-20 development that
>    wound up being general enough for other platforms
>
That I cannot say, but I can say that for the UNIX source licensees (i.e.,
not the universities in the Research system or groups inside the Bell
System), it was used in the "consider it standard" campaign that AT&T
marketing in NC was starting to push. This was around the time that PCC2
was coming out to replace the original PCC, but I remember that getting
PCC2 cost extra.

Most of the BSD-based kernels (DEC, HP, etc.) were originally using a
modified a.out of their own flavor, but I think almost all of them switched
to COFF after the System III license. Why, I have forgotten; it may have
been a requirement of, or otherwise mixed up in, the license.

I do remember this was right around when gcc first started coming out, and
they had a tool called robitussin to "cure coffs", as they were using a.out
when they could.


>    1. If, prior to this format, there were any other efforts to produce a
>    unifying binary format and set of development tools, or if COFF was a happy
>    accident from what were a myriad of different architectural toolset streams
>
MIT had a modified a.out format for the NU machine ports - that might
have been called b.out.
CMU had Mach-O, which again was an extended a.out but even more flexible.

>
>    1. One of the curious things is how VAX for a brief moment did have
>    its own set of tools and a.out particulars before SGS/COFF.
>
Why is that curious? All original VAX development was just using the
original PCC stream from V7 (and was pre-Judge Greene; more on that in a
minute).

What I don't remember is whether PCC2 was COFF when introduced, or whether
COFF came first, but I think they were separate things. Again, someone like
scj would be authoritative.

The three tools that have to care are the assembler (as), the linker (ld),
and the program-loading code in the kernel itself.
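
To make that shared contract a little more concrete, here is a minimal
sketch of the loader's side of it: looking at the first 16-bit word of a
file and deciding what kind of object it is. The three a.out magic words are
the ones documented in V7 a.out(5). Everything else is an assumption for
illustration: the function name sniff_format, the extra_magics hook, and
the byteorder parameter are hypothetical, and no COFF magic numbers are
built in.

```python
import struct

# Magic words from V7 a.out(5): the first 16-bit word of the header.
AOUT_MAGICS = {
    0o407: "a.out, normal",
    0o410: "a.out, read-only text",
    0o411: "a.out, separate I&D",
}


def sniff_format(header, extra_magics=None, byteorder="<"):
    """Classify an object file by its leading 16-bit magic word.

    extra_magics is a hypothetical hook for other formats (e.g. COFF);
    byteorder is "<" for little-endian machines, ">" for big-endian ones.
    """
    if len(header) < 2:
        return "too short"
    (magic,) = struct.unpack(byteorder + "H", header[:2])
    table = dict(AOUT_MAGICS)          # copy so the defaults stay untouched
    table.update(extra_magics or {})   # caller-supplied formats win
    return table.get(magic, "unknown (magic %#o)" % magic)
```

The assembler and linker have to write that word (and the header behind
it) in exactly the form the kernel expects to read, which is why a format
change touches all three at once.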





> For instance, many of the VAX-targeted utilities in 3.0/System III bear
> little in common option/manual-wise with the general common SGS utilities
> in System V. The "not on PDP-11" pages for various SGS components in System
> V much more closely resemble the 3B-20 utilities in 4.1 than any of the non
> PDP-11/VAX-only bits in System III.
>
> Some examples:
>
>    - The VAX assembler in System III contains a -dN option indicating the
>    number of bytes to set aside for forward/external references for the linker
>    to fill in.
>    - The VAX assembler in System V contains among others the -n and -m
>    options from 4.1 which indicate to disable address optimization and use m4
>    respectively
>    - The System V assembler goes on to also include -R (remove input file
>    after completion) -r (VAX only, add .data contents to .text instead) and
>    options -b, -w, and -l to replace the -d1, -d2, and -d4 options indicated
>    in the previous VAX assembler
>    - System V further adds a -V to all the SGS software indicating the
>    version of the software. This is new circa 5.0, absent from the 4.1 manual
>    like the R, r, b, w, and l options
>
>
>
>    - The 4.1 manual's singular ar(1) entry still agrees with the System
>    III version. No arcv(1) is listed, implying the old ar format never made it
>    to 3B-20
>
Hmm, this is confusing. The change from the old V[456] ar format to the new
ar format happened between Research V6 and Research V7. By the time of any
VAX development the old format had pretty much been killed. I'd check what
PWB 1.0 and 2.0 used. The new ar format was independent of what was in it.

e.g., V7: man 5 ar <http://man.cat-v.org/unix_7th/5/AR>

     AR(5)                                                        AR(5)

     NAME
          ar - archive (library) file format

     SYNOPSIS
          #include <ar.h>

     DESCRIPTION
          The archive command ar is used to combine several files into
          one.  Archives are used mainly as libraries to be searched
          by the link-editor ld.

          A file produced by ar has a magic number at the start, fol-
          lowed by the constituent files, each preceded by a file
          header.  The magic number and header layout as described in
          the include file are:


          #define ARMAG 0177545

          struct ar_hdr {
               char ar_name[14];
               long ar_date;
               char ar_uid;
               char ar_gid;
               int  ar_mode;
               long ar_size;
          };
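
As a rough illustration of why the archive format is independent of its
contents, the header above can be decoded without knowing anything about
the members themselves. The sketch below is not from the man page. It
assumes the PDP-11 layout (16-bit int, 32-bit long stored high word first,
giving a 26-byte header) and assumes members are padded to an even length.
The names pdp_long, parse_ar_hdr and list_members are hypothetical.

```python
import struct

ARMAG = 0o177545            # magic number from the V7 ar(5) excerpt
HDR_FMT = "<14sHHBBHHH"     # assumption: PDP-11 layout, each long as two words
HDR_SIZE = struct.calcsize(HDR_FMT)   # 14 + 4 + 1 + 1 + 2 + 4 = 26 bytes


def pdp_long(hi, lo):
    """Combine two 16-bit words into a long (assumed order: high word first)."""
    return (hi << 16) | lo


def parse_ar_hdr(buf):
    """Decode one ar_hdr from the first HDR_SIZE bytes of buf."""
    name, d_hi, d_lo, uid, gid, mode, s_hi, s_lo = struct.unpack(
        HDR_FMT, buf[:HDR_SIZE])
    return {
        "name": name.split(b"\0", 1)[0].decode("ascii"),
        "date": pdp_long(d_hi, d_lo),
        "uid": uid,
        "gid": gid,
        "mode": mode,
        "size": pdp_long(s_hi, s_lo),
    }


def list_members(data):
    """Return the headers of all members in a V7-style archive image."""
    (magic,) = struct.unpack("<H", data[:2])
    if magic != ARMAG:
        raise ValueError("not a V7 archive")
    members, off = [], 2
    while off + HDR_SIZE <= len(data):
        hdr = parse_ar_hdr(data[off:])
        members.append(hdr)
        # Skip the member body; assumed to be padded to an even length.
        off += HDR_SIZE + hdr["size"] + (hdr["size"] & 1)
    return members
```

Nothing here inspects the member bodies, so the same walk works whether
they hold a.out objects, COFF objects, or plain text.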






>
>    - The System V manual has both this ar(1) version as well as the new
>    COFF-supporting version.
>
Why would ar(1) care?



>
>    - Not sure if this implies the VAX ar format was expanded to support
>    the COFF stuff for a little while until they decided on a new one or what.
>
>
>
>    - The System III ld (which is implied to support PDP and VAX) survives
>    in System V, but is cut down to supporting PDP-11 only
>    - The COFF-ish ld shows up in 4.1, is then extended to VAX presumably
>    in the same breath as the other COFF-supporting bits by Sys V, leading to
>    two copies like many others, PDP-11-specific stuff and then COFF-specific
>    stuff
>
>
> The picture that starts to form in the context of all of this is, for a
> little while in the late 70s/early 80s, the software development
> environments for PDP-11, VAX-11, and 3B-20 were interplaying with each
> other in often times inconsistent ways. Taking a peek at the 32V manuals,
> the VAX tools in System III appear to originate with that project, which
> makes sense. If I'm understanding the timeline, COFF starts to emerge from
> the 3B-20 project and USG probably decides that's the way to go, a unified
> format, but with PDP-11 pretty much out the door support wise already,
> there was little reason to apply that to PDP-11 as well, so the PDP-11
> tools get their swan song in System V, original VAX-11 tools from 32V are
> likely killed off in 4.x, and the stuff that started with the 3B-20 group
> goes on to dominate the object file format
>
That makes sense, but be careful: the 3B and WE32000 ISA may have been
the driver, but I would expect that the compiler folks in Summit were more
in the driver's seat. The 3B20 kernel would use what they were getting from
the tools team and core kernel team in USG.

Remember the politics at the time: Judge Greene has unleashed AT&T, and
they are now allowed to be in the computer business, and the sales/marketing
folks at AT&T were pushing the 3B20 and the WE32000. So there are big forces
behind the scenes that are not obvious.



> and development software stuff until ELF comes along some time later.
>
Yep - I never quite understood what the push for ELF over COFF was, after
all the effort to drive COFF down people's throats. Note that Microsoft
"embraced and extended" COFF as their format -- originally because of Xenix,
I believe. Someone like Paul W may have some insights on this, and that was
before the 3B20.

What was the format that the original Xenix used, when it was targeting
PDP-11, 68000, x86 and Z8000? Again, I'm fuzzy on the details here. But I
do remember that during the license discussions that would lead to System
III, this was one of the things the Microsoft team was worried about -- IIRC
it was Bob Greenberg pushing all that. I lost contact with Bob a few years
ago, but if we can find him, I would expect Bob to know what Xenix was
doing. And again, that negotiation >>starts<< pre-Judge Greene, but
finishes up soon afterwards.

>
> I guess other questions this raises are:
>
>    1. Were the original VAX tools built with any attention to
>    compatibility with the PDP-11 bits Ken and Dennis wrote many years prior
>    (based on some option discrepancies, possibly not?)
>
Hrmph... folks started with the PDP-11 tools and changed them as needed.
I'm not sure compatibility is the right term. They were retargeted and
moved forward by people trying to support a new machine they had gotten and
did not want to run DEC's OS on.

>
>    1. Do the VAX utilities derive from the Interdata 8/32 work or if
>    there was actually another stream of tools as part of that project?
>
I guess I don't understand the question. The original V7 tools were
retargeted. When useful features were added, they might be
offered/returned to other folks, but remember, Research is not "supporting"
UNIX. USG is where people start to think in terms of multiple targets,
and that is >>before Judge Greene<<. Then, after Judge Greene, there was a
push to stop using non-AT&T equipment or chips in the Bell System and to
make what Western Electric was selling attractive [which sometimes was a
little bit of putting lipstick on the porcine, as it were]. For instance,
Rob and Bart's original JERQ is 68000-based, but by the time it becomes a
product as the 5620 it has to be reworked for the WE32000.



>
>    1. Was there any interplay between the existing tool streams (original
>    PDP-11, 32V's VAX utilities, possibly Interdata 8/32) and the eventual
>    COFF/SGS stuff, or was the latter pretty well siloed in 3B-20 land until
>    deployment with 4.1?
>
I think you are putting too much on the 3B program itself. The 3B was the
task at hand at the time and a solid opportunity to bring to bear business
choices being made. You need to look at the greater business to understand
a lot of the choices. A lot of things were happening in parallel in the
market that had other impacts on technology and how it was delivered -- the
3B program was the "technology train" leaving the station that some of them
got attached to/delivered using.

But, as I said to you when we chatted, you really should not
underestimate what was happening (or not happening) as AT&T changed its
business focus - pre/post-Judge Greene. It was a large company with lots of
different spheres of interest (read - different executives), each being
measured with different things that they might value.


end of thread, other threads:[~2023-02-26 15:51 UTC | newest]

Thread overview: 23+ messages
2023-02-23 15:13 [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format) Noel Chiappa
  -- strict thread matches above, loose matches on Subject: below --
2023-02-26 15:51 Paul Winalski
2023-02-25 20:14 Brian Walden
2023-02-23 21:37 Paul Ruizendaal
2023-02-23 22:11 ` segaloco via TUHS
2023-02-24  0:07   ` segaloco via TUHS
2023-02-22 20:16 [TUHS] " segaloco via TUHS
2023-02-22 22:20 ` [TUHS] " Clem Cole
2023-02-23  0:17   ` segaloco via TUHS
2023-02-23  6:30   ` Lars Brinkhoff
2023-02-23 14:25     ` KenUnix
2023-02-23 19:37     ` Warner Losh
2023-02-24 17:01       ` Rich Salz
2023-02-23 16:49   ` Paul Winalski
2023-02-23 18:38     ` segaloco via TUHS
2023-02-23 20:40       ` Paul Winalski
2023-02-24 12:45     ` arnold
2023-02-24 13:13       ` Arno Griffioen via TUHS
2023-02-25 19:28         ` arnold
2023-02-25 19:34           ` Steffen Nurpmeso
2023-02-24 14:01       ` Harald Arnesen
2023-02-25  2:07     ` Dave Horsfall
2023-02-25 15:30       ` Clem Cole
2023-02-25 17:29         ` Paul Winalski
