The Unix Heritage Society mailing list
From: Paul Winalski <paul.winalski@gmail.com>
To: Dan Cross <crossd@gmail.com>
Cc: The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: Re: [TUHS] Question about early C behavior.
Date: Fri, 10 Jan 2020 15:24:30 -0500
Message-ID: <CABH=_VRjeENbmsQNntrue8r3omewE43JJj38mM+x+cTNZTUE7A@mail.gmail.com>
In-Reply-To: <CAEoi9W4pONAu4QRKnvQ79pRip5LkqQMq=rXgw4YB5bqYL3XNqQ@mail.gmail.com>

On 1/10/20, Dan Cross <crossd@gmail.com> wrote:
>
> Given the definition `int x;` (without an initializer) in a source file the
> corresponding object contains `x` in a "common" section. What this means is
> that, at link time, if some object file explicitly allocates an 'x' (e.g.,
> by specifying an initializer, so that 'x' appears in the data section for
> that object file), use that; otherwise, allocate space for it at link time,
> possibly in the BSS. If several source files contain such a declaration,
> the linker allocates exactly one 'x' (or whatever identifier) as
> appropriate. We've verified that this behavior was present as early as 6th
> edition.

I think the situation you describe (common sections) is how this is
done in ELF.  a.out and COFF, as used on Unix, don't have common
sections.  Instead 'int x;' (without an initializer) becomes symbol
'x' in the object file's symbol table, with both the "external" and
"undefined" attribute bits set, and with the symbol's value being the
size of 'x' (typically 4 bytes, in your example).  It is the non-zero
symbol value that distinguishes common symbols from ordinary external
references, e.g., 'extern int x;' (without an initializer).
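
A minimal sketch of that encoding, for illustration only.  The struct
layout, the flag names (SYM_UNDEFINED, SYM_EXTERNAL), and the helper
functions are all hypothetical; real a.out and COFF headers use their
own field names and bit values.  The only point is the rule itself:
external + undefined + non-zero value means "common", while external +
undefined + zero value means an ordinary external reference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits, chosen for illustration only; real a.out and
 * COFF symbol tables use their own encodings. */
enum {
    SYM_UNDEFINED = 0x1,   /* symbol has no defining section in this object */
    SYM_EXTERNAL  = 0x2    /* symbol is visible to the linker across objects */
};

/* Simplified stand-in for one symbol table entry. */
struct sym {
    const char   *name;
    unsigned      flags;
    unsigned long value;   /* for a common symbol: requested size in bytes */
};

/* External + undefined + non-zero value => common symbol. */
bool sym_is_common(const struct sym *s)
{
    return (s->flags & SYM_EXTERNAL) != 0
        && (s->flags & SYM_UNDEFINED) != 0
        && s->value != 0;
}

/* External + undefined + zero value => ordinary external reference. */
bool sym_is_extern_ref(const struct sym *s)
{
    return (s->flags & SYM_EXTERNAL) != 0
        && (s->flags & SYM_UNDEFINED) != 0
        && s->value == 0;
}

/* Entry a compiler might emit for a file-scope `int x;` (no initializer). */
struct sym sym_for_tentative_int(const char *name)
{
    struct sym s = { name, SYM_EXTERNAL | SYM_UNDEFINED, sizeof(int) };
    return s;
}

/* Entry a compiler might emit for `extern int x;`. */
struct sym sym_for_extern_decl(const char *name)
{
    struct sym s = { name, SYM_EXTERNAL | SYM_UNDEFINED, 0 };
    return s;
}
```

With these helpers, sym_is_common() is true for the entry built by
sym_for_tentative_int("x") and false for the one built by
sym_for_extern_decl("x"), even though both carry the same two flag bits.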

At link time, common symbols are handled differently from ordinary
external references:

[1] When the linker is searching libraries, an ordinary external
reference to 'x' will cause the linker to load an object that contains
an external definition for 'x'.  Common symbols do not trigger the
loading of an object from a library.

[2] After the linker has processed all of the files and libraries on
the command line, if there is an external definition for 'x', all
common symbol references to 'x' are treated as ordinary external
references to 'x' and resolved against the definition.  If no external
definition is found, the linker allocates 'x' in BSS, using the
maximum allocation size seen in any common symbol references to 'x'.
All common symbol references and ordinary external references to 'x'
are resolved to the newly-allocated space.
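
The two rules above can be sketched roughly as follows.  This is not
taken from any real linker; the types (sym_rec, result), the enum names,
and the functions triggers_library_load() and resolve_symbol() are all
assumptions made up for the example.  It looks at every record the
linker has collected for a single name such as 'x' and decides what to
do with it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of one record seen for a given name. */
enum sym_kind { SYM_DEFINED, SYM_EXTERN_REF, SYM_COMMON };

struct sym_rec {
    enum sym_kind kind;
    unsigned long value;   /* DEFINED: address; COMMON: size; EXTERN_REF: 0 */
};

enum resolution { RES_UNRESOLVED, RES_USE_DEFINITION, RES_ALLOC_BSS };

struct result {
    enum resolution how;
    unsigned long   address;    /* meaningful for RES_USE_DEFINITION */
    unsigned long   bss_size;   /* meaningful for RES_ALLOC_BSS */
};

/* Rule [1]: only an ordinary external reference pulls an object out of
 * a library; a common symbol does not. */
bool triggers_library_load(const struct sym_rec *s)
{
    return s->kind == SYM_EXTERN_REF;
}

/* Rule [2]: after all inputs are processed, an external definition wins;
 * otherwise, if any common symbol was seen, allocate BSS space using the
 * largest requested size.  A real linker would also diagnose multiple
 * definitions; this sketch simply takes the first one it finds. */
struct result resolve_symbol(const struct sym_rec *recs, size_t n)
{
    struct result r = { RES_UNRESOLVED, 0, 0 };
    unsigned long max_common = 0;
    bool saw_common = false;
    size_t i;

    for (i = 0; i < n; i++) {
        switch (recs[i].kind) {
        case SYM_DEFINED:
            r.how = RES_USE_DEFINITION;
            r.address = recs[i].value;
            return r;
        case SYM_COMMON:
            saw_common = true;
            if (recs[i].value > max_common)
                max_common = recs[i].value;
            break;
        case SYM_EXTERN_REF:
            break;
        }
    }

    if (saw_common) {
        r.how = RES_ALLOC_BSS;
        r.bss_size = max_common;
    }
    return r;
}
```

For example, given one common record of size 4, one of size 8, and one
ordinary external reference, resolve_symbol() reports RES_ALLOC_BSS with
a size of 8.  Adding a single SYM_DEFINED record to the same set makes it
report RES_USE_DEFINITION instead.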

> The question is, what is the origin of this concept and nomenclature?
> FORTRAN, of course, has "common blocks": was that an inspiration for the
> name? Where did the idea for the implicit behavior come from (FORTRAN
> common blocks are explicit).

Yes, the concept, nomenclature, and semantics come from FORTRAN, and
they were included in a.out and COFF to support FORTRAN and other
languages (such as PL/I) that have COMMON block-type semantics.  I
don't know why 'int x;' (without an initializer) in C was implemented
as a common symbol.  I suspect it was done to allow C and FORTRAN
object modules linked together in the same executable to share
external data.

-Paul W.
