The Unix Heritage Society mailing list
From: "G. Branden Robinson" <g.branden.robinson@gmail.com>
To: tuhs@minnie.tuhs.org
Subject: Re: [TUHS] Proliferation of options is great simplification of pipes, really?
Date: Mon, 22 Feb 2021 14:32:19 +1100
Message-ID: <20210222033217.dkqavclp22sa77ln@localhost.localdomain>
In-Reply-To: <f039c049-add5-ee79-eb5a-84c56a6fc2d2@gmail.com>

At 2021-02-21T20:34:55-0600, Will Senn wrote:
> All,
> 
> So, we've been talking low-level design for a while. I thought I would
> ask a fundamental question. In days of old, we built small
> single-purpose utilities and used pipes to pipeline the data and
> transformations. Even back in the day, it seemed that there was
> tension to add yet another option to every utility. Today, as I was
> marveling at groff's abilities with regard to printing my man pages
> directly to my printer in 2021, I read the groff(1) page:
> 
> example here: https://linux.die.net/man/1/groff

A more up-to-date copy is available at the Linux man-pages site.

https://man7.org/linux/man-pages/man1/groff.1.html

> What struck me (the wrong way) was the second paragraph of the
> description:
> 
> The groff program allows to control the whole groff system by command
> line options. This is a great simplification in comparison to the
> classical case (which uses pipes only).

What strikes _me_ about the above is the awful Denglish in it.  I fixed
this back in 2017 and the correction shipped as part of groff 1.22.4 in
December 2018.

> Here is the current plethora of options:
> groff [-abcegilpstzCEGNRSUVXZ] [-d cs] [-f fam] [-F dir] [-I dir] [-L arg]
> [-m name] [-M dir] [-n num] [-o list] [-P arg] [-r cn] [-T dev] [-w name]
> [-W name] [file ...]
> 
> Now, I appreciate groff, don't get me wrong, but my sensibilities were
> offended by the idea that a kazillion options was in any way simpler
> than pipelining single-purpose utilities. What say you? Is this the
> perfected logical extension of the unix pioneers' work, or have we
> gone horribly off the trail.

I'd say it's neither, and reflects (1) the limitations of the Unix
filter model, or at least the linear topology of Unix pipelines[1]; and
(2) an arbitrary set of rules determined by convention and common
practice with respect to sequencing.

Consider first the question of which *roff preprocessor languages
should be embeddable in another preprocessor's language.  Should you be
able to embed equations in tables?  What about tables inside equations
(not too insane an idea--consider matrix literals)?  Nothing in the Unix
filter model favors one of these choices over the other, but an ordering
decision must be made.

V7 Unix tbl(1)'s man page[3] took a moderately strong position on
preprocessor ordering based on more practical concerns (I suppose
loading on shared systems).

	When it is used with
	.I eqn
	or
	.I neqn
	the
	.I tbl
	command should be first, to minimize the volume
	of data passed through
	pipes.

Another factor is ergonomics.  As the number of preprocessors expands,
the number of potential orderings of a document processing pipeline also
grows--combinatorially.  Here's the chunk of the groff front-end
program that determines the ordering of the pipeline it constructs for
the user.

	// grap, chem, and ideal must come before pic;
	// tbl must come before eqn
	const int PRECONV_INDEX = 0;
	const int SOELIM_INDEX = PRECONV_INDEX + 1;
	const int REFER_INDEX = SOELIM_INDEX + 1;
	const int GRAP_INDEX = REFER_INDEX + 1;
	const int CHEM_INDEX = GRAP_INDEX + 1;
	const int IDEAL_INDEX = CHEM_INDEX + 1;
	const int PIC_INDEX = IDEAL_INDEX + 1;
	const int TBL_INDEX = PIC_INDEX + 1;
	const int GRN_INDEX = TBL_INDEX + 1;
	const int EQN_INDEX = GRN_INDEX + 1;
	const int TROFF_INDEX = EQN_INDEX + 1;
	const int POST_INDEX = TROFF_INDEX + 1;
	const int SPOOL_INDEX = POST_INDEX + 1;

Sure, you could have a piece of paper with the above ordering taped to
the wall near your terminal, but why?  Isn't it better to have a tool to
keep track of these arbitrary complexities instead?
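
As a rough illustration of what "keeping track" means here (a minimal
sketch, not groff's actual implementation): the stage names below mirror
the index constants quoted above, while build_pipeline, join, and
count_orderings are hypothetical helpers invented for this example.  The
sketch also assumes that troff always runs and that unrecognized stage
names are silently dropped.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Canonical stage order, taken from the index constants quoted above.
// Everything else in this sketch is hypothetical.
static const std::vector<std::string> kStageOrder = {
    "preconv", "soelim", "refer", "grap", "chem", "ideal", "pic",
    "tbl", "grn", "eqn", "troff", "post", "spool",
};

// Return the requested stages arranged in canonical order.
// Assumption: troff is always part of the pipeline; names that are
// not in kStageOrder are ignored.
std::vector<std::string> build_pipeline(const std::vector<std::string>& requested) {
    std::vector<std::string> pipeline;
    for (const auto& stage : kStageOrder) {
        const bool wanted =
            stage == "troff" ||
            std::find(requested.begin(), requested.end(), stage) != requested.end();
        if (wanted) {
            pipeline.push_back(stage);
        }
    }
    return pipeline;
}

// Render a pipeline the way one would type it at a shell prompt.
std::string join(const std::vector<std::string>& stages) {
    std::string out;
    for (std::size_t i = 0; i < stages.size(); ++i) {
        if (i != 0) {
            out += " | ";
        }
        out += stages[i];
    }
    return out;
}

// Number of ways n stages could be sequenced if nothing constrained
// their order (n factorial).
unsigned long long count_orderings(unsigned n) {
    unsigned long long result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}
```

With this sketch, join(build_pipeline({"eqn", "tbl"})) yields
"tbl | eqn | troff" no matter which order the caller listed the
preprocessors in, and count_orderings(10) gives 3628800, the number of
ways the ten stages ahead of troff could be sequenced if no convention
fixed their order.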

groff, as a front-end and pipeline manager, is much smaller than the
actual formatter.  According to sloccount, it's 1,195 lines to troff's
23,023 (measurements taken on groff Git HEAD, where I spend much of my
time).

If you need to alter the pipeline or truncate it, to debug an input
document or resequence the processing order, you can, and groff supplies
the -V flag to help you do so.

A traditionalist need never type the groff command if it offends their
sensibilities--it would be a welcome change from people grousing about
copyleft.  All the pieces of the pipeline are still there and can be
directly invoked.

For an alternative approach to *roff document interpretation and
rendering, albeit in a limited domain, see the mandoc project[4].  It
interprets the man(7) and mdoc(7) macro languages, a subset of *roff,
and tbl(1)'s mini-language with, as I understand it, a single parser.

Regards,
Branden

[1] Tom Duff noted this a long time ago in his paper presenting the rc
shell[2]; see §9.

[2] https://archive.org/details/rc-shell/page/n2/mode/1up
[3] https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/man/man1/tbl.1
[4] https://mandoc.bsd.lv/
