From mboxrd@z Thu Jan 1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level:
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=ham
	autolearn_force=no version=3.4.4
Received: (qmail 8808 invoked from network); 18 Oct 2023 00:04:51 -0000
Received: from bsd.lv (HELO mandoc.bsd.lv) (66.111.2.12)
	by inbox.vuxu.org with ESMTPUTF8; 18 Oct 2023 00:04:51 -0000
Received: from fantadrom.bsd.lv (localhost [127.0.0.1])
	by mandoc.bsd.lv (OpenSMTPD) with ESMTP id e578304b for ;
	Wed, 18 Oct 2023 00:04:49 +0000 (UTC)
Received: from scc-mailout-kit-02.scc.kit.edu (scc-mailout-kit-02.scc.kit.edu [129.13.231.82])
	by mandoc.bsd.lv (OpenSMTPD) with ESMTP id 252672fa for ;
	Wed, 18 Oct 2023 00:04:49 +0000 (UTC)
Received: from hekate.asta.kit.edu ([2a00:1398:5:f401::77])
	by scc-mailout-kit-02.scc.kit.edu with esmtps
	(TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256)
	(envelope-from ) id 1qsu3L-009sJD-2W; Wed, 18 Oct 2023 02:04:48 +0200
Received: from login-1.asta.kit.edu ([2a00:1398:5:f400::72])
	by hekate.asta.kit.edu with esmtp (Exim 4.94.2)
	(envelope-from ) id 1qsu3K-000CVu-Op; Wed, 18 Oct 2023 02:04:46 +0200
Received: from schwarze by login-1.asta.kit.edu with local (Exim 4.94.2)
	(envelope-from ) id 1qsu3K-000dZF-2S; Wed, 18 Oct 2023 02:04:46 +0200
Date: Wed, 18 Oct 2023 02:04:46 +0200
From: Ingo Schwarze
To: Alejandro Colomar
Cc: tech@mandoc.bsd.lv
Subject: Re: mandoc -man -Thtml: unwanted line break after bullet (.IP)
Message-ID:
References:
X-Mailinglist: mandoc-tech
Reply-To: tech@mandoc.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:

Hi Alejandro,

Alejandro Colomar wrote on Tue, Oct 17, 2023 at 11:39:23PM +0200:

> Here's what I see in the bookworm one:
> And buster:

Ah, thank you for pointing me to the specific place where you
found a difference. Yes, you are right, there is a difference there.
However, that difference is caused by a difference in the manual page
source code, not by a difference in the formatter.

The bookworm manual page contains:

  First, though, a summary of a few details for the impatient:
  .IP \[bu] 3
  The macros that you most likely need to use in modern source code are

mandoc(1) renders that as:

First, though, a summary of a few details for the impatient:

The macros that you most likely need to use in modern source...

that is, as a tagged list, because it is not (yet) smart enough
to figure out that ".IP \[bu]" is intended as a bullet list.

By contrast, the buster manual page contains:

  First, though a summary of a few details for the impatient:
  .IP * 3
  The macros that you most likely need to use in modern source code are

mandoc(1) renders that as:

First, though a summary of a few details for the impatient:

  • The macros that you most likely need to use in modern source...

which is obviously much more visually pleasing and also makes more
sense semantically.

The problem here is that just like it is *significantly* more difficult
to write a good man(7) page than a good mdoc(7) page - due to the fact
that the man(7) language is totally inadequate, by its fundamental
design, to express semantic markup - it is *massively* more difficult
to produce good HTML output from man(7) than from mdoc(7), and for
exactly the same reason that makes writing man(7) so difficult for
humans in the first place: The man(7) language is a purely
presentational language with no feature whatsoever to convey anything
semantic, whereas the HTML language is a purely semantic language
with no feature whatsoever for expressing anything presentational.

Consequently, the mdoc(7) HTML formatter is very straightforward:

  .Bl -tag becomes <dl>
  .Bl -bullet becomes <ul>
  .Bl -enum becomes <ol>

and we are *certain* that we have met the manual page author's
intention. End of the story, everybody is happy now.

For man(7), almost anything we do involves crude guesswork.
Both .TP and .IP can become any of <dl>, <ul>, or <ol>
and which is the right one has to be *guessed* from the content.
In addition, while mdoc(7) makes it explicit where a list begins
and where that list ends, in man(7), we have to *guess* whether
any given .TP/.IP macro begins, continues, or ends a list.

Look at

  https://cvsweb.bsd.lv/mandoc/man_html.c?rev=HEAD ,

function man_IP_pre(). Right now, it maps

  .IP *          to  .Bl -bullet
  .IP \(bu       to  .Bl -bullet
  .IP -          to  .Bl -dash
  anything else  to  .Bl -tag

together with its helper function list_continues().

Now, the function list_continues() already recognizes \(bu as a
potential marker for a bullet list. But that bookworm manual page
uses \[bu] instead. I'm not saying that is wrong, mind you.
What i am saying is that parsing and formatting the man(7) language
is a nightmare and fragile as hell. The fundamental design of that
language is totally 1970ies-style, and it shows in each and every
corner. I'm not blaming Doug; at the time man(7) was designed, it
was a monumental step forward, and nobody could have been expected
to do better in the 1970ies. But after Cynthia Livingston invented
mdoc(7) in 1989/90 and Tim Berners-Lee invented HTML in 1989/90,
the fundamental concept of man(7) was totally outdated with no
redeeming qualities and no hope for healing, and it should have been
completely abandoned by 1995 at the latest. A documentation language
that is essentially presentational obviously has no place in this
millennium.

All the same, this conversation has been useful, and i need to
change three aspects of mandoc. So, thanks for reporting!

 1. change the Makefile to always install mandoc.css
 2. better document what mandoc.css is needed for,
    what the embedded default CSS does and does not provide,
    and that using a custom CSS file requires a high level of
    proficiency in both the CSS and the mdoc(7) languages
 3. teach list_continues() that \[bu] is the same as \(bu.

But this is really an uphill battle.
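
To make the guesswork concrete, here is a rough sketch of the kind of
mapping described above, including the \[bu] spelling from item 3.
It is an illustration only and not the code in man_html.c: the enum,
the function name guess_ip_list(), and the treatment of a missing
argument as a tagged list are all assumptions made up for this example.

```c
#include <string.h>

/* Hypothetical list types, named after the mdoc(7) equivalents. */
enum list_type {
	LIST_BULLET,	/* like .Bl -bullet */
	LIST_DASH,	/* like .Bl -dash */
	LIST_TAG	/* like .Bl -tag, the fallback guess */
};

/*
 * Guess the list type from the first argument of an .IP macro.
 * Illustrative sketch only; the real logic lives in man_IP_pre()
 * and list_continues() and may differ in every detail.
 */
enum list_type
guess_ip_list(const char *head)
{
	/* Assumption: no argument at all is treated like a tag list. */
	if (head == NULL)
		return LIST_TAG;

	/* Three spellings that all mean "this is a bullet". */
	if (strcmp(head, "*") == 0 ||
	    strcmp(head, "\\(bu") == 0 ||
	    strcmp(head, "\\[bu]") == 0)
		return LIST_BULLET;

	if (strcmp(head, "-") == 0)
		return LIST_DASH;

	/* Anything else cannot be told apart from a tag. */
	return LIST_TAG;
}
```

With such a function, guess_ip_list("\\[bu]") and guess_ip_list("\\(bu")
give the same answer, which is all that item 3 asks for. Any other
spelling of a bullet still falls through to the tag case, so the
guessing does not go away.
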
As long as you rely on man(7), the fundamental design of that language
implies that you will again and again bump into similar formatting
issues - even with mandoc.

Still, as long as man(7) is used in the wild, please keep reporting
such issues. I rarely look at man(7) pages, so i need reports from
the field to be able to make progress.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv