Date: Tue, 17 Oct 2023 21:02:55 +0200
From: Ingo Schwarze
To: Alejandro Colomar
Cc: tech@mandoc.bsd.lv
Subject: Re: mandoc -man -Thtml: unwanted line break after bullet (.IP)

Hi Alejandro,

Alejandro Colomar wrote on Mon, 16 Oct 2023 19:10:22 +0200:

> I could reproduce it with the test file you sent.
> $ mandoc -Thtml test.1 > test.html

Well, that isn't surprising.
As i explained, if you do not use CSS, you cannot expect any particular
white space behaviour - nor any particular fonts, font sizes, colours
etc. etc. for that matter, simply because the HTML language (at least
the modern version, HTML 5) is not designed to contain any such
information.

Historical versions of HTML, i.e. HTML 4 (standardized in 1997) and
earlier, provided limited physical formatting capabilities, but those
didn't really work well. Nowadays, using anything but HTML 5 wouldn't
really make sense. It has been standardized for almost a decade now
(since 2014).

All the above doesn't apply to mandoc only, but it applies to *any*
HTML code and to *any* web site: remove the CSS, and you remove almost
all of the formatting and web design.

If you ask me, HTML 5 still isn't a very good language, typical design
by committee, but certainly better than HTML 4. Oh well, what can we
do... I don't think i'm the guy who will save the web.

>> mandoc -Thtml -Ostyle=mandoc.css test.1 > test.html

> where's this mandoc.css? Is it a file you have locally?

  https://cvsweb.bsd.lv/mandoc/mandoc.css
  https://man.bsd.lv/mandoc.css
  https://cvsweb.openbsd.org/src/usr.bin/mandoc/mandoc.css
  https://man.openbsd.org/mandoc.css
  https://man.voidlinux.org/mandoc.css

So it's kind of all over the place. :-)

It is also contained in the release tarballs:

  schwarze@fantadrom $ tar -tzvf mandoc-1.14.6.tar.gz | grep css
  -rw-r--r-- 1 schwarze schwarze 8906 Sep 23 2021 mandoc-1.14.6/mandoc.css
  schwarze@fantadrom $ tar -tzvf mdocml-1.14.1.tar.gz | grep css
  -rw-r--r-- 1 schwarze wsrc 3932 Feb 21 2017 mdocml-1.14.1/mandoc.css

> The Debian package doesn't provide any CSS file, which seems like a
> packaging bug:
>
> $ apt-file show mandoc | grep css
> $ apt-file find mandoc.css
> $

Hmmm...

  https://salsa.debian.org/debian/mdocml/-/blob/master/debian/patches/configure.local.patch

tells me that the Debian port doesn't include man.cgi(8).
Now unfortunately, my upstream Makefile,

  https://cvsweb.openbsd.org/src/usr.bin/mandoc/Makefile

which Debian appears to use, installs mandoc.css only with
"make cgi-install", not with plain "make install", and i suspect
that's the reason why the *.deb package ends up without it.

It would probably be better if "make install" in my Makefile also
installed mandoc.css. The original motivation for only installing it
together with man.cgi was that back in the day, i thought using man.cgi
might be the most common way of using mandoc HTML output. Thinking
about it right now, that's probably not even true: there are only a
handful of mandoc-based man.cgi servers worldwide, so there are almost
certainly more people who use mandoc HTML output in different ways.
And as you found out the hard way, when you care about minute
formatting details, CSS is essential, even if you are not running
man.cgi.

On OpenBSD, we actually install *two* copies of mandoc.css. One copy
is installed by default in /usr/share/misc/mandoc.css. That copy is
intended for users who run mandoc -T html manually. The other copy is
not installed by default, but it is installed when users manually run
"make installcgi" in /usr/src/usr.bin/mandoc/, and it is installed to
/var/www/htdocs/mandoc.css, which is inside the default HTTP server
chroot on OpenBSD such that man.cgi can use it.

Portable mandoc probably ought to do something similar.

So arguably, the packaging issue on Debian was caused by questionable
upstream defaults.

> Hmm, in the bookworm page there's the bug, but not on buster. They
> probably format the pages with the corresponding system.

I doubt that.
I have talked to Michael Stapelberg several times, and we discussed
various details and various ways in which his setup is unavoidably
complicated, but i don't recall that he ever mentioned
manpages.debian.org transparently - and invisibly for the user -
redirected to several different servers for several different OS
versions, each running a different OS version itself. Getting such a
system to work would be quite complicated, and maintaining it highly
inconvenient, in particular considering all the other non-trivial
tasks connected to the server that Michael had to take care of.

Frankly, it also wouldn't make sense. If you serve manual pages for
old Debian versions with the newest software, you get better
formatting quality and more reliable manual page parsing for users.
Why on earth would you expose users who want to look up manual pages
for old versions to formatting bugs that have already been fixed?

Besides, at first sight, i don't see which difference between

  https://manpages.debian.org/bookworm/manpages/ftm.7.en.html

and

  https://manpages.debian.org/buster/manpages/ftm.7.en.html

you mean. The date and version number in the page footer differ, but
at least at first sight, spacing looks similar to me.

Alejandro Colomar wrote on Mon, Oct 16, 2023 at 07:28:40PM +0200:

> My bad here; I was testing both my command and your command, and
> accidentally mixed the resulting files. I've re-tested, and if I don't
> specify -Ostyle=mandoc.css, it embeds CSS. However, that CSS seems to
> be defective,

The embedded style sheet is not "defective"; it is "simple".
Here is what the mandoc(1) manual page says:

  If a style-sheet is not specified with -O style, -T html defaults
  to simple output (via an embedded style-sheet) readable in any
  graphical or text-based web browser.

All it claims is that the embedded style sheet is simple and that the
output is readable. It does *not* claim that the output perfectly
matches -T utf8 terminal output.
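
To make the two modes concrete, here is a minimal shell sketch; it is
an illustration, not something shipped with mandoc. The helper names
pick_css and render_html are hypothetical, and the candidate paths
(the current directory and the OpenBSD default location
/usr/share/misc/mandoc.css mentioned in this thread) are assumptions
that need adjusting on other systems. Note that -O style only records
the given path as a link in the HTML output, so the browser viewing
the page must be able to reach that path.

```shell
#!/bin/sh
# Sketch only: pick_css and render_html are hypothetical helpers,
# not part of mandoc.

# Print the first readable style sheet among the candidate paths given
# as arguments; return non-zero if none of them is readable.
pick_css() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Format a manual page as HTML.
# $1 = manual page source, $2 = HTML output file
# Link an external mandoc.css when one is found (assumed locations),
# otherwise fall back to the simple embedded style sheet.
render_html() {
    if css=$(pick_css ./mandoc.css /usr/share/misc/mandoc.css); then
        mandoc -Thtml -Ostyle="$css" "$1" > "$2"
    else
        mandoc -Thtml "$1" > "$2"
    fi
}
```

After sourcing this, something like "render_html test.1 test.html"
would use the external style sheet where one is available and the
embedded one elsewhere.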
Actually, perfectly matching terminal output is hard even with the
complicated mandoc.css stylesheet.

It is intentional that the embedded stylesheet only deals with the
most fundamental formatting tasks, in particular selecting adequate
font-styles and font-weights for the various macros and making sure
that the page header doesn't look too bad. Beyond that, it doesn't
bother with whitespace.

I don't think embedding the complete mandoc.css into each and every
output file would be a reasonable choice. When embedded CSS is used
as a fallback, keeping it minimal makes sense, i think.

Consequently, unless i'm missing something, with respect to what you
reported in the thread "unwanted line break", it seems to me
everything is working as intended.

Maybe i could clarify the mandoc(1) manual page a bit. Essentially
calling mandoc.css an "example style sheet" may have been adequate
when this text was originally written, but over the years, the file
mandoc.css has been polished so much that nowadays, calling it "the
standard style sheet" would make more sense. There should probably be
a warning that using a different style sheet isn't really recommended
unless the user has an above-average understanding of both CSS and of
the custom classes used in mandoc HTML output. Otherwise, using a
different stylesheet is more likely to degrade the user experience
than to customize formatting according to the wishes of the person
writing their own CSS.

Yours,
  Ingo