The Unix Heritage Society mailing list
From: Paul Winalski <paul.winalski@gmail.com>
To: Clem Cole <clemc@ccc.com>
Cc: segaloco <segaloco@protonmail.com>,
	The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: [TUHS] Re: Origins of the SGS (System Generation Software) and COFF (Common Object File Format)
Date: Thu, 23 Feb 2023 11:49:29 -0500
Message-ID: <CABH=_VRWFcpSPuC3CL9K5XO9AEYnOucV4k5+rp=koTEjn-RSFA@mail.gmail.com>
In-Reply-To: <CAC20D2NHvSPvJR81d4hPM9F1X0tV08Z9_g9EVV9MB-uA0vg_Mw@mail.gmail.com>

On 2/22/23, Clem Cole <clemc@ccc.com> wrote:
>>
>>    - The System V manual has both this ar(1) version as well as the new
>>    COFF-supporting version.
>>
>> Why would ar(1) care?
>>
>>    - Not sure if this implies the VAX ar format was expanded to support
>>    the COFF stuff for a little while until they decided on a new one or
>> what.

I can't think of any reason why ar(1) would care about the file format
or internal contents of any of the modules it archives.  ar(1) is a
general archiving tool and can archive anything.  It happens that the
designers of ld(1) decided to use ar(1) to provide searchable object
file libraries.

ranlib(1) is a different matter.  In order to index global symbols it
has to understand the object file format(s) of the modules it is
indexing.  ranlib(1) most certainly would have to be taught to
understand COFF.  But not ar(1).
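
To illustrate the distinction, here is a minimal Python sketch of
listing an archive's members without ever looking inside them.  It
assumes the common text-header ar layout (8-byte "!<arch>\n" magic,
then a 60-byte header per member), not the older binary formats
discussed in this thread; make_member and list_members are
hypothetical helper names.

```python
AR_MAGIC = b"!<arch>\n"   # 8-byte global header of the common ar format
HEADER_LEN = 60           # fixed size of each member header


def make_member(name, payload):
    """Build one member: 60-byte text header, payload, pad to even length."""
    # Field widths: name 16, mtime 12, uid 6, gid 6, mode 8, size 10, magic 2.
    header = "{:<16}{:<12}{:<6}{:<6}{:<8}{:<10}".format(
        name, 0, 0, 0, "644", len(payload)).encode("ascii") + b"`\n"
    return header + payload + (b"\n" if len(payload) % 2 else b"")


def list_members(archive):
    """Return (name, size) pairs; payload bytes are skipped, never parsed."""
    if archive[:8] != AR_MAGIC:
        raise ValueError("not an ar archive")
    members = []
    pos = 8
    while pos + HEADER_LEN <= len(archive):
        header = archive[pos:pos + HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header")
        name = header[0:16].decode("ascii").rstrip()
        size = int(header[48:58].decode("ascii"))
        members.append((name, size))
        # Members start on even offsets, so odd sizes carry one pad byte.
        pos += HEADER_LEN + size + (size & 1)
    return members


archive = AR_MAGIC + make_member("notes.txt", b"hi!") + make_member("blob.bin", bytes(4))
print(list_members(archive))  # [('notes.txt', 3), ('blob.bin', 4)]
```

Nothing here depends on what the members contain.  A symbol index, by
contrast, would have to parse each member's symbol table, which is
where knowledge of the object file format comes in.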

>> and development software stuff until ELF comes along some time later.
>>
> Yep - never quite understood what the push for ELF was over COFF after all
> the effort to drive COFF down people's throat.   Note Microsoft "embraced
> and extended" COFF as their format -- originally because of Xenix I
> believe.
>    Someone like Paul W may have some insights on this and that was before
> the 3B20.

a.out was, as object file formats go, a throwback to the stone age
from the get-go.  Even the most primitive of IBM's link editors for
System/360 supported arbitrary naming of object file sections and the
ability for the programmer to arrange them in whatever order they
wished.  a.out's restriction to three sections (.text, .data, .bss)
did manage to get the job done, and even (with ZMAGIC) could support
demand-paged virtual memory, but only just.
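
For a sense of how little there is to a.out, here is a rough Python
sketch that decodes the fixed header.  It assumes the classic 32-bit
BSD layout of eight 32-bit words and the usual octal magic numbers
(0407, 0410, 0413); masking the magic to its low 16 bits is an
assumption to tolerate variants that pack a machine ID into the upper
bits.  parse_aout_header is a hypothetical name.

```python
import struct

# Classic a.out magic numbers (octal).
OMAGIC = 0o407   # impure: text is not write-protected
NMAGIC = 0o410   # pure: read-only, shareable text
ZMAGIC = 0o413   # demand-paged

FIELDS = ("a_magic", "a_text", "a_data", "a_bss",
          "a_syms", "a_entry", "a_trsize", "a_drsize")
KINDS = {OMAGIC: "OMAGIC", NMAGIC: "NMAGIC", ZMAGIC: "ZMAGIC"}


def parse_aout_header(data, byteorder="<"):
    """Decode the 32-byte header; byteorder is '<' (little) or '>' (big)."""
    if len(data) < 32:
        raise ValueError("header too short")
    hdr = dict(zip(FIELDS, struct.unpack(byteorder + "8I", data[:32])))
    # Assumption: only the low 16 bits carry the magic number.
    hdr["kind"] = KINDS.get(hdr["a_magic"] & 0xFFFF, "unknown")
    return hdr


# Synthetic header: 4096 bytes of text, 1024 of data, 512 of bss.
sample = struct.pack("<8I", ZMAGIC, 4096, 1024, 512, 0, 0, 0, 0)
hdr = parse_aout_header(sample)
print(hdr["kind"], hdr["a_text"], hdr["a_data"], hdr["a_bss"])  # ZMAGIC 4096 1024 512
```

The three size fields are the whole section table: there is no way to
name a fourth section or to give one different properties.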

It became pretty clear in the 1980s that an object file format more
powerful and flexible than a.out was needed.  CMU developed their own
object file format (MACH-O) for their MACH microkernel-based OS.  It
had up to 8 object file sections, and the section properties (e.g.,
read vs. read/write; executable vs. data) were not tied to the section
name as in a.out.  A big step forward, although still primitive
compared to the object formats of VAX/VMS and the IBM S/370 OSes.
Apple MacOS X still uses MACH-O for object files and executables.

Whatever its origins, what we now know as COFF (Common Object File
Format) is, as its name implies, intended to be OS- and
machine-independent.  It still has a relatively small number of
sections, albeit more than MACH-O.  When Microsoft developed Windows
NT, they needed to replace their own MZ executable format with
something that could support shareable images and they decided to go
with COFF for both object files and for executables.  In typical
Microsoft embrace-and-extend fashion, their Portable Executable and
Common Object File Format (PECOFF) is a heavily modified version of
COFF with lots of MS-specific extensions.  When DEC's GEM back end was
chosen as the optimizer and code generator for Microsoft C/C++ on
Windows NT for the DEC Alpha chip, I had to add PECOFF support to
GEM's existing COFF support (which was used by DEC's commercially sold
compilers for Ultrix).  My original idea was to put the PECOFF support
under conditional compilation (#ifdef PECOFF), but the two formats
were sufficiently different that I abandoned that idea, cloned the
existing COFF module, and then modified that to create a separate
PECOFF module.

ELF is far more flexible than COFF, PECOFF, or MACH-O.  Those
three make a distinction between sections (the bits that eventually
end up in memory) and the metadata pieces of an object file or
executable (program headers, symbol table, debug information, etc.).
In ELF, everything is a section, including the symbol table and the
tables that direct the program loader in mapping shareable images into
a process's memory.  ELF was originally limited to 64K sections
(section numbers were unsigned 16-bit), but there is now a scheme for
32-bit section numbers.  The essentially unlimited number of sections
is a big boon to languages such as C++, where grouped sections with a
name-decoration convention provide a convenient way to support sharing
of class definitions without requiring language-specific tweaks to the
software development toolset.  Contrast this with the Ada
implementations I'm aware of, which have their own software
development library systems layered on top of the conventional
compiler/linker/archiver to ensure that program modules are compiled
and linked in the correct order.
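
The 32-bit section-number scheme mentioned above can be sketched in
Python as follows.  As I understand the ELF specification, when the
real count does not fit in the 16-bit e_shnum field, e_shnum is set
to 0 and the count is stored in the sh_size field of section header 0.
This sketch handles only 64-bit little-endian files; fake_elf is a
hypothetical helper that builds a synthetic header for the demo.

```python
import struct


def elf64_section_count(data):
    """Return the number of section headers in a 64-bit little-endian ELF."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("sketch handles only ELFCLASS64, little-endian")
    e_shoff = struct.unpack_from("<Q", data, 0x28)[0]
    e_shnum = struct.unpack_from("<H", data, 0x3C)[0]
    if e_shnum != 0 or e_shoff == 0:
        return e_shnum  # ordinary case, or no section header table at all
    # Extended numbering: the count is in sh_size of section header 0,
    # which sits 0x20 bytes into a 64-byte ELF64 section header.
    return struct.unpack_from("<Q", data, e_shoff + 0x20)[0]


def fake_elf(e_shnum, sh0_size):
    """Hypothetical helper: a 64-byte ELF64 header plus section header 0."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    header = ident + struct.pack("<HHIQQQIHHHHHH",
                                 1, 62, 1, 0, 0, 64, 0, 64, 0, 0, 64, e_shnum, 0)
    sh0 = struct.pack("<IIQQQQIIQQ", 0, 0, 0, 0, 0, sh0_size, 0, 0, 0, 0)
    return header + sh0


print(elf64_section_count(fake_elf(3, 0)))        # 3
print(elf64_section_count(fake_elf(0, 100000)))   # 100000
```

The demo checks only the header arithmetic; a real reader would also
validate e_shentsize and the file length before trusting the offsets.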

I don't know what the timeline for the invention of COFF was.  It was
already called COFF and in widespread use by the time I encountered it
when we added Ultrix support to GEM.  I think MACH-O predated COFF;
it's certainly more primitive than COFF.  MACH-O was probably early to
mid-1980s.  OS kernel bloat was a recognized problem at the time and
microkernel-based OSes were all the rage.  At DEC, Dave Cutler wrote a
microkernel-based OS called VAXeln to replace VAX/VMS for real-time
applications.  A lot of concepts from VAXeln found their way into
Windows NT when Cutler left DEC for Microsoft.

-Paul W.
