tech@mandoc.bsd.lv
From: Ingo Schwarze <schwarze@usta.de>
To: Alejandro Colomar <alx@kernel.org>
Cc: tech@mandoc.bsd.lv
Subject: Re: mandoc -man -Thtml: unwanted line break after bullet (.IP)
Date: Tue, 17 Oct 2023 21:02:55 +0200
Message-ID: <ZS7aX6oMomG7D3xe@asta-kit.de>
In-Reply-To: <ZS1yyDgLlsfW0j-J@debian>

Hi Alejandro,

Alejandro Colomar wrote on Mon, 16 Oct 2023 19:10:22 +0200:

> I could reproduce it with the test file you sent.
>    $ mandoc -Thtml test.1 > test.html

Well, that isn't surprising.  As i explained, if you do not use CSS,
you cannot expect any particular white space behaviour - nor any
particular fonts, font sizes, colours etc. etc. for that matter, simply
because the HTML language (at least the modern version, HTML 5) is
not designed to contain any such information.  Historical versions of
HTML, i.e. HTML 4 (standardized in 1997) and earlier, provided limited
physical formatting capabilities, but those didn't really work well.
Nowadays, using anything but HTML 5 wouldn't really make sense.
It has been standardized for almost a decade now (since 2014).

All the above doesn't apply to mandoc only, but it applies to *any*
HTML code and to *any* web site: remove the CSS, and you remove almost
all of the formatting and web design.

If you ask me, HTML 5 still isn't a very good language, typical
design by committee, but certainly better than HTML 4.  Oh well,
what can we do...  I don't think i'm the guy who will save the web.

>>   mandoc -Thtml -Ostyle=mandoc.css test.1 > test.html
> where's this mandoc.css?  Is it a file you have locally?

https://cvsweb.bsd.lv/mandoc/mandoc.css
https://man.bsd.lv/mandoc.css
https://cvsweb.openbsd.org/src/usr.bin/mandoc/mandoc.css
https://man.openbsd.org/mandoc.css
https://man.voidlinux.org/mandoc.css

So it's kind of all over the place.  :-)

It is also contained in the release tarballs:

schwarze@fantadrom $ tar -tzvf mandoc-1.14.6.tar.gz | grep css
-rw-r--r--  1 schwarze schwarze 8906 Sep 23 2021 mandoc-1.14.6/mandoc.css

schwarze@fantadrom $ tar -tzvf mdocml-1.14.1.tar.gz | grep css 
-rw-r--r--  1 schwarze wsrc     3932 Feb 21 2017 mdocml-1.14.1/mandoc.css
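
For readers who only want the style sheet out of such a tarball, here
is a minimal sketch.  The function name extract_mandoc_css is
hypothetical and not part of mandoc; the sketch assumes a tar(1) that
accepts -z and a gzip-compressed archive laid out like the listings
above.

```shell
# Hypothetical helper, not part of mandoc: extract only mandoc.css
# from a release tarball and print the path of the extracted file.
extract_mandoc_css() {
    tarball=$1
    # Look for the archive member whose name ends in /mandoc.css.
    member=$(tar -tzf "$tarball" | grep '/mandoc\.css$' | head -n 1)
    # Give up if the tarball contains no such member.
    [ -n "$member" ] || return 1
    # Extract just that one member, keeping its directory prefix.
    tar -xzf "$tarball" "$member" && printf '%s\n' "$member"
}
```

Calling "extract_mandoc_css mandoc-1.14.6.tar.gz" should leave the
file in mandoc-1.14.6/mandoc.css without unpacking the rest of the
tree.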

> The Debian package doesn't provide any CSS file, which seems like a
> packaging bug:
>
>       $ apt-file show mandoc | grep css
>       $ apt-file find mandoc.css
>       $

Hmmm...

https://salsa.debian.org/debian/mdocml/-/blob/master/debian/patches/configure.local.patch

tells me that the Debian port doesn't include man.cgi(8).
Now unfortunately, my upstream Makefile,

  https://cvsweb.openbsd.org/src/usr.bin/mandoc/Makefile

which Debian appears to use, installs mandoc.css only with "make
cgi-install", not with plain "make install", and i suspect that's
the reason why the *.deb package ends up without it.

It would probably be better if "make install" in my Makefile also
installed mandoc.css.  The original motivation for only installing it
together with man.cgi was that back in the day, i thought using man.cgi
might be the most common way of using mandoc HTML output.  Thinking
about it right now, that's probably not even true: there are only a
handful of mandoc-based man.cgi servers worldwide, so there are almost
certainly more people who use mandoc HTML output in different ways.
And as you found out the hard way, when you care about minute
formatting details, CSS is essential, even if you are not running
man.cgi.

On OpenBSD, we actually install *two* copies of mandoc.css.  One copy
is installed by default in /usr/share/misc/mandoc.css.  That copy is
intended for users who run mandoc -T html manually.  The other copy is
not installed by default, but it is installed when users manually run
"make installcgi" in /usr/src/usr.bin/mandoc/, and it is installed
to /var/www/htdocs/mandoc.css, which is inside the default HTTP server
chroot on OpenBSD such that man.cgi can use it.

Portable mandoc probably ought to do something similar.
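
Until then, a script that has to work across systems could probe the
locations mentioned above.  The following is only a sketch: the
function name find_mandoc_css is hypothetical, and the candidate
paths are the OpenBSD ones from this message, which other systems
may not use.

```shell
# Hypothetical helper, not part of mandoc: print the first readable
# file among the candidate paths passed as arguments.
find_mandoc_css() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # None of the candidates exists or is readable.
    return 1
}
```

For example, the result of

  find_mandoc_css /usr/share/misc/mandoc.css /var/www/htdocs/mandoc.css

could be passed to -O style=.  Since the argument ends up as a link
target in the HTML output, a local path only helps when the page is
viewed on the same machine.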

So arguably, the packaging issue on Debian was caused by questionable
upstream defaults.

> Hmm, in the bookworm page there's the bug, but not on buster.  They
> probably format the pages with the corresponding system.

I doubt that.  I have talked to Michael Stapelberg several times, and
we discussed various details and various ways in which his setup is
unavoidably complicated, but i don't recall that he ever mentioned
manpages.debian.org transparently - and invisibly for the user -
redirected to several different servers for several different OS
versions running different OS versions themselves.  Getting such a
system to work would be quite complicated, and maintaining it highly
inconvenient, in particular considering all the other non-trivial
tasks connected to the server that Michael had to take care of.

Frankly, it also wouldn't make sense.  If you serve manual pages
for old Debian versions with the newest software, you get better
formatting quality and more reliable manual page parsing for users.
Why on earth would you expose users who want to look up manual
pages for old versions to formatting bugs that have already been
fixed?

Besides, at first sight, i don't see which difference between

  https://manpages.debian.org/bookworm/manpages/ftm.7.en.html   and
  https://manpages.debian.org/buster/manpages/ftm.7.en.html

you mean.  The date and version number in the page footer differ,
but the spacing looks similar to me.


Alejandro Colomar wrote on Mon, Oct 16, 2023 at 07:28:40PM +0200:

> My bad here; I was testing both my command and your command, and
> accidentally mixed the resulting files.  I've re-tested, and if I don't
> specify -Ostyle=mandoc.css, it embeds CSS.  However, that CSS seems to
> be defective,

The embedded style sheet is not "defective"; it is "simple".
Here is what the mandoc(1) manual page says:

  If a style-sheet is not specified with -O style, -T html defaults
  to simple output (via an embedded style-sheet) readable in any
  graphical or text-based web browser.

All it claims is that the embedded style sheet is simple and that
the output is readable.  It does *not* claim that the output
perfectly matches -T utf8 terminal output.  Actually, perfectly
matching terminal output is hard even with the complicated mandoc.css
stylesheet.

It is intentional that the embedded stylesheet only deals with the
most fundamental formatting tasks, in particular selecting
adequate font-styles and font-weights for the various macros
and making sure that the page header doesn't look too bad.
Beyond that, it doesn't bother with whitespace.

I don't think embedding the complete mandoc.css into each and every
output file would be a reasonable choice.  When embedded CSS is
used as a fallback, keeping it minimal makes sense, i think.
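
To avoid mixing up output files as happened above, one can check
which kind of styling a given HTML file carries.  This is a rough
sketch only: the function name html_style_mode is hypothetical, and
it assumes that output produced with -O style contains a
<link rel="stylesheet" ...> element while the fallback output
contains a <style> element.

```shell
# Hypothetical helper, not part of mandoc: report whether an HTML
# file links to an external style sheet, embeds one, or has neither.
html_style_mode() {
    if grep -qi 'rel="stylesheet"' "$1"; then
        echo linked
    elif grep -qi '<style' "$1"; then
        echo embedded
    else
        echo none
    fi
}
```

Running "html_style_mode test.html" after each mandoc invocation
makes it obvious which of the two commands produced the file.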

Consequently, unless i'm missing something, with respect to what
you reported in the thread "unwanted line break", it seems to me
everything is working as intended.

Maybe i could clarify the mandoc(1) manual page a bit.  Essentially
calling mandoc.css an "example style sheet" may have been adequate
when this text was originally written, but over the years, the file
mandoc.css has been polished so much that nowadays, calling it "the
standard style sheet" would make more sense.  There should probably be
a warning that using a different style sheet isn't really recommended
unless the user has an above-average understanding of both CSS and
of the custom classes used in mandoc HTML output.  Otherwise, using
a different stylesheet is more likely to degrade the user experience
than to customize formatting according to the wishes of the person
writing their own CSS.

Yours,
  Ingo