The Unix Heritage Society mailing list
From: "G. Branden Robinson" <g.branden.robinson@gmail.com>
To: groff@gnu.org
Cc: tuhs@tuhs.org
Subject: [TUHS] Why groff ms doesn't completely support historical documents
Date: Sun, 6 Oct 2024 00:53:20 -0500
Message-ID: <20241006055320.wlo3syvcbma6wzk2@illithid>
In-Reply-To: <CAC20D2NgmzDxhQu5P5hjrZ3ciSv=KayiUg8GwsFRpu0wPasprw@mail.gmail.com>

Someone on the TUHS list mailed me privately, prompting me to
write this lengthy apology (in the classical sense) of why groff doesn't
make a certain application easier.  I have slightly revised my response.

This message also may serve as a summary of the challenges that need to
be overcome if someone else wants to tackle the job, and potentially
contribute it to groff.

[person creates PDFs of historical Unix documents (many of which are
written using the ms macros) and wishes groff ms made the task easier]

I sympathize.  I sometimes render historical documents, so I prescribed
in groff ms's documentation the approach that I take myself.  I decided
against trying to support a "-matt" or "-msatt" option in groff because
it's flatly impossible to know which definition of `UX` to use.  Even a
date declaration in the document sheds little light, as we then have to
consider the question of whether we want fidelity to the actual state of
the mark at the time of that declared date, or to what would have been
rendered in the author's environment--and they may have been using an ms
that wasn't "up to date" in the same respect.  That information, too, is
not recorded in the document.[1]

Providing all the macros _except_ `UX` didn't seem likely to satisfy
users since that's the most important one!  It shows up in body text
whereas all the others seldom do--if you can live without the cover page
then, often, you're golden.  Except for `UX`.

Finally there is the name collision problem with Berkeley.  4.2BSD and
later ms defined `CT` and `TM` macros (aspects of their "thesis mode")
and once again there's no declarator within the document to tell you
which dialect of ms is in use.  This one can be heuristically figured
out with pretty good odds, I suspect, but troff works as a filter--what
was I going to do, write a preprocessor just for this?

(Hmm, maybe grog(1) could do it, and that would be in its wheelhouse.
But there's no point until and unless we reimplement support for
Berkeley thesis mode in the first place [so that grog has an option
argument to report], and that is an undertaking at which I have
demurred.[2])
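
As a rough illustration of the first step such a tool might take, here
is a minimal Python sketch that only reports which of the ambiguous
macros (`UX`, `CT`, `TM`) a document calls, and on which lines.  It does
not try to decide which dialect of ms is in use.  The function name
`ambiguous_calls` and the regular expression are assumptions made for
this example; they are not part of groff or grog, and the pattern does
not handle a name followed directly by an argument with no space.

```python
import re

# Macro names the message singles out as ambiguous: `UX` (its definition
# changed over time) and `CT`/`TM` (also defined by 4.2BSD and later ms).
AMBIGUOUS = ("UX", "CT", "TM")

# A macro call is a control line: a leading `.` or `'`, optional spaces
# or tabs, then a one- or two-character name ending at whitespace or
# end of line.  (Hypothetical pattern, for illustration only.)
CALL = re.compile(r"^[.'][ \t]*([A-Za-z0-9]{1,2})(?=[ \t]|$)")


def ambiguous_calls(text):
    """Return {macro name: [line numbers]} for ambiguous macros called."""
    found = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = CALL.match(line)
        if match and match.group(1) in AMBIGUOUS:
            found.setdefault(match.group(1), []).append(lineno)
    return found


if __name__ == "__main__":
    sample = ".TL\nA Title\n.TM 1 2 3\n.PP\nThe\n.UX\nsystem.\n.UX\n"
    print(ambiguous_calls(sample))
```

A report like this leaves the judgment to a person: it shows where the
colliding names appear, not what the author meant by them.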

It seemed like a moderate amount of work for almost zero upside.  It's
also hard to validate/verify my work.  The only historical troffs to
which I have access are Seventh Edition Unix troff (1979, before
Kernighan) and DWB 3.3 (early 1990s).  It's a right pain in the butt to
inspect typesetter output on V7 because I have nothing that emulates a
C/A/T or translates it to device-independent troff output for a
"ditroff"-style device description that Kernighan troff, DWB/Heirloom
Doctools troff, or GNU troff could use.

And even if I had either of those, they'd have to be vetted to a _high_
degree of quality before they'd be fit for purpose; else I wouldn't know
whether I was chasing bugs in the groff ms macros or the C/A/T
emulator/translator.

So, to summarize, I confine my compatibility efforts to _nroff_ output,
and rule the Bell Labs "site" macros out of scope.  I feel there is not
much more I can do, with any confidence in my results, without the
resources that I'm lacking.

I hope this sheds some light on my reasoning.

Regards,
Branden

[1] Still, if someone wants to start, I'd start here.

    https://minnie.tuhs.org/cgi-bin/utree.pl?file=V10/vol2/ms/tmac.s

[2] One person, ever, has requested it, 20 years ago.  And I have no
    specimens of input or corresponding model output rendered by an
    "authentic" BSD troff [formatter executable PLUS support files]
    against which to develop a reconstruction.  (On the bright side, the
    Berkeley modifications to the once-encumbered AT&T "tmac.s" are, of
    themselves, presumably BSD-licensed.)

    https://savannah.gnu.org/bugs/?64455
