Date: Mon, 8 Jan 2024 01:11:09 -0600
From: "G. Branden Robinson"
To: tuhs@tuhs.org, groff@gnu.org
Message-ID: <20240108071109.ykg42tw2gjeacs5f@illithid>
References: <20240108032428.co3ozmlneoop6sa2@illithid>
 <20240108051049.7643537404E9@freecalypso.org>
In-Reply-To: <20240108051049.7643537404E9@freecalypso.org>
Subject: [TUHS] Re: Original print of V7 manual?
 / My own version of troff
List-Id: The Unix Heritage Society mailing list

At 2024-01-07T21:10:38-0800, Mychaela Falconia wrote:
> G. Branden Robinson wrote:
>
> > This sort of broad, nonspecific, reflexive derogation of groff (or
> > GNU generally) is unproductive and frequently indicative of
> > ignorance.
>
> I don't have enough spoons to engage in political fights any more, so
> I'll just focus on technical aspects.

That may be a wise choice. A good supplement would be, when expressing
a negative opinion of GNU or any software project to which people
contribute their volunteer labor, to briefly state your grounds for not
using it. "I just can't go along with the copyleft thing" or "I refuse
to use anything written in C++" might or might not strike people as
rational, but such frankness places the responsibility for starting an
argument squarely on _their_ shoulders.

Any issues people have with groff's implementation quality should be
submitted to its bug tracker. (One can do so anonymously, or create an
account to be emailed when subsequent activity happens.) There are
plenty of defects demanding repair and features needing implementation.
I wish there were fewer. I do what I can.

https://savannah.gnu.org/bugs/?group=groff

> > But if you are going for pixel-perfect reproduction of documents
> > that used fonts you don't have, you're going to need to recreate the
> > fonts somehow--perfectly (at least for the glyphs that a given
> > document uses).
>
> The problem you are describing is one which I am *not* actively
> working on presently. I am _contemplating_ this problem, but not
> actively working on it.
> In my current stage of 4.3BSD document set
> reprinting, I am willing to accept that hyphenations, line breaks and
> page breaks will be different from the original because of slightly
> different font metrics, and accept the use of only fi and fl ligatures
> (in running text, outside of explicit demonstrations) because Adobe's
> version dropped ff, ffi and ffl. (In places where original troff docs
> explicitly demonstrate the use of all 5 ligatures, I have a hack that
> pulls the missing ligs from a different, not-really-matching font.)
>
> I am willing to accept this imperfection because it is fundamentally
> no different from what UCB/Usenix themselves did in 1986: they took
> Bell Labs docs that were originally written for CAT and troffed them
> on their APS-5 ditroff setup - but those two typesetters also had
> slight diffs in their font metrics, causing line and page breaks to
> move around!

Right. I think this is a reasonable place to erect a threshold of
"fidelity" in document rendering, for two reasons: (1) when you don't
have control over the fonts in use, it's likely the best you can do
anyway, and (2) as a document author you might want to leave yourself
room to change your mind about the typeface you use, particularly for
running text (which will have the greatest impact on the locations of
line and page breaks for most documents).

That I was able to get the breaks in "Typesetting Mathematics" almost
all the same as the published version even though the Times I used was
certainly not the C/A/T's was due to a combination of (a) good fortune
and (b) the power of binary search when selecting values for the LL and
PO registers.

> OTOH I am very willing to entertain, as an intellectual exercise, what
> would it take to produce a new font set that would *truly* replicate
> the CAT font set at Bell Labs.
> The spacing widths of the original
> fonts (the key determinant of where breaks will land) are known, right
> here:
>
> https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/troff/tab3.c

Right. Nowadays we call these (and other measurements besides width)
the "font metrics".

> Back in 2004 in one afternoon I threw together a quick-hack program
> that takes the output of original troff (CAT binary codes) and prints
> it in PostScript, using standard Adobe fonts. The character
> positioning is that of original troff, but because the actual font
> characters don't perfectly match these metrics, the result is not
> pretty - but the non-pretty result does show *exactly* where every
> line and page break lands per original intent!

Nice! A tool I'd like to get added to groff someday is a modern
"cat2dit". It's come up on these mailing lists before; apparently
Adobe had a proprietary one back in the 1980s, and, as I recall,
polymath wizard Henry Spencer wrote one but it's long since become a
relic. John Gardner wrote yet another but it's in JavaScript so not
maximally convenient for a Unix command line grognard.

But best of all would be a "cat2dit" in Seventh Edition Unix-compatible
C, because that would be super convenient for running on a PDP-11 under
SIMH using Ossanna troff. The output would be easy to export because
the device-independent troff output format is plain text (and not too
strict about whitespace), and SIMH of course runs in a terminal window
so it's easy to copy and paste. This would make it much easier to use
Ossanna troff as a regression test bed for groff (or other modern
formatters).

> So what would it take to do such a re-creation properly? My feeling
> is that the task would require hiring a professional typeface designer
> to produce a modified version of Times font family: modify the fonts
> to produce good visual results (change actual characters as needed) to
> fit the prescribed, unchangeable metrics as in spacing widths.
> And design all 5 f-ligatures while at it.

Another approach would be to obtain the C/A/T font plates and describe
them numerically. Since the only means of scaling was via an optical
lens (from 6 to 36 points), we can conclude that they weren't "hinted"
as digital fonts often are.

Since those plates are presumably nearly all in landfills these days I
suppose the same could be accomplished with sufficiently
high-resolution scans of the copy of CSTR #54 in the Seventh Edition
Unix manual (because it depicts all possible glyphs).

And of course if a person wants a gratuitous thing to put on their
résumé/CV, they could obtain a large number of Times roman faces from a
variety of foundries, render a huge volume of text using them in every
possible combination and at a large number of sizes, and then use those
renderings to train an LLM to generate an "archetypal" Times face for
rendering C/A/T-produced documents. You then unleash it on the world
and wait for the lawsuits to roll in, which should get a person enough
notoriety to land a day job at someplace where the buzzword "AI"
excites hard-charging middle managers.

> I have no slightest idea how much it would cost to hire a professional
> typeface designer to do what I just described, hence I have no idea
> whether or not it is something that the hobbyist community could
> potentially afford, even collectively. But it is an interesting idea
> to ponder nonetheless - which is where I leave it for now.

Hobbyist font designers do exist. Some may lurk on one or both of
these lists. I would ask them if it's more or less a solved problem
already.

> > There is a third problem, whose resolution is in progress, when
> > producing PDF output from this document; slanted Greek symbols are
> > present but "not quite right". This is because unlike PostScript,
> > PDF font repertoires generally don't provide a "slanted symbol"
> > face.
>
> Can you please elaborate?
> I personally hate PDF with a passion, but I
> concede that in order to make my documents readable by people other
> than me, I have to rcp my .ps file from the 4.3BSD machine to a
> semi-modern-ish (Slackware) Linux box and run ps2pdf on the file.

Doug McIlroy still does this.[1]

> But what "slanted symbol" font are you talking about that exists in
> PostScript but not in PDF? The only PostScript fonts whose existence
> I take as a given (as opposed to downloading the font explicitly) are
> the standard 14: 4 Times family fonts, 4 Helvetica family fonts, 4
> Courier family fonts, Symbol and ZapfDingbats. Which of these 14 is
> missing in PDF, and how does "standard" ps2pdf (Ghostscript) handle
> it?

Sorry, I elided too much from my response on this point. I should not
have implied that "slanted symbol" is a standard PostScript font; it is
not, per my copy of the _PostScript Language Reference Manual_ (3e)
[see Appendix E].

"Slanted symbol", a.k.a. "SS", is a supplemental face in groff...of old
provenance--it goes back to groff 1.06 (September 1992) at least.

It exists to solve a problem that can be observed when you compare two
documents already referenced above.

1.  Adobe's _PostScript Language Reference Manual_, p. 794.
    Table E.13, "Symbol Encoding Vector"

2.  CSTR #54 "Nroff/Troff User's Manual" (1976), p. 226*.
    Table I, "Font Style Examples"

    * using the page numbering in the HRW reprint of Volume 2 recently
      discussed on TUHS

You will quickly observe that the C/A/T's "Special Mathematical Font",
bearing the pellucid name "S" in the Ossanna/Thompson naming convention
popular at Bell Labs, renders all its lowercase Greek letters in italic
form. PostScript's Symbol font does not.

A problem for any post-C/A/T typesetting is how to get upright versions
of lowercase Greek letters. AT&T troff was engineered around the
assumption that the lowercase Greek letters typically used for
mathematical and scientific typesetting are slanted/italic rather than
upright.
This assumption is baked into the semantics of special character
names *a, *b, *g, and so forth. (Except when using nroff, of course,
where one "naturally" expects upright glyphs instead, just like the
good old Greek box on the Teletype Model 37.) The eqn preprocessor
furthermore--and consequently--assumes it doesn't need to do anything
special for these special characters to show up in italics (making its
rendering to terminals inconsistent with troff output).

If you couldn't guess, I plan to change this in groff. It won't break
eqn documents because what I "take away" in the semantics of the
special characters (an implied font style, which doesn't belong there),
I will "put back" via updated eqn character definitions, so people who
say

    sin ( 2 theta ) ~ = ~ 2 ~ sin theta cos theta

will continue to get what they expect. eqn users who bust down to
*roff special characters to get Greek will, unfortunately, need to
adapt. But GNU eqn has features to support doing so with minimal
pain.[2]

I have read that modern standards of mathematical typography mandate
that constants, like every non-mathematician's favorite, π, should be
set upright, not italicized as people of my generation (and I guess
older ones) are accustomed to seeing it. The idea is that only
_variables_ get italics. But I cannot speak further to this point, as
it's well out of my wheelhouse. If it's true, I hope the increased
flexibility I plan for groff and its eqn will make life easier for
those who typeset math.

https://savannah.gnu.org/bugs/index.php?64231
https://savannah.gnu.org/bugs/index.php?64232

gropdf(1) has not to date supported a slanted symbol font. But it
needs to for the reasons explored on the groff list last June in a
lengthy thread, the relevant portion of which starts here.

https://lists.gnu.org/archive/html/groff/2023-06/msg00088.html

> I also wanted my troff to run under 4.3BSD, using only K&R C, which I
> reason would probably be impossible with groff.
> (I recall reading
> somewhere that groff is written in C++ - so it is completely out of
> consideration for something that needs to run under 4.3BSD.)

Probably, unless someone wants to resurrect cfront...

C is not my favorite programming language, and C++ even less so. In a
better universe, by my lights, James Clark would have written groff in
Ada. I acknowledge that a lot of people would characterize such a
universe as a variety of Hell.

> My software is written BY a pirate (me) FOR other pirates. If you are
> not a pirate, my sw is not for you.

Arrrrrr. I believe I take your meaning. Piracy is an occupational
hazard of rentierism.

Regards,
Branden

[1] https://lists.gnu.org/archive/html/groff/2023-08/msg00028.html
[2] See eqn(1), subsection "Spacing and typeface".