From: Alejandro Colomar <alx@kernel.org>
To: Ingo Schwarze <schwarze@usta.de>
Cc: tech@mandoc.bsd.lv, branden@debian.org
Subject: Re: mandoc -man -Thtml bug: inconsistent vertical space before .TP
Date: Thu, 19 Oct 2023 23:32:42 +0200
Message-ID: <ZTGggTYHc-vKcv4e@debian>
In-Reply-To: <ZTFXC0LDDyS0+X/7@asta-kit.de>
On Thu, Oct 19, 2023 at 06:19:23PM +0200, Ingo Schwarze wrote:
> Hi Alejandro,
>
> Alejandro Colomar wrote on Thu, Oct 19, 2023 at 05:17:10PM +0200:
>
> > I had this gripe with man(7) some years ago. I thought of using the
> > following instead, which slightly complicates the source code, but makes
> > it more logical.
> >
> > $ cat nested_indent.man
> > .TH nested_indent 7 2023-10-19 experiments
> > .SH Ingo said:
> > .TP
> > Todo
> > Currently, when formatting .TP or .IP with a non-empty head,
> > [yada yada]
> > .RS
> > .PP
> > When formatting .IP or .RS with an empty head, mandoc needs
> > [yada yada]
> > .RE
> >
> > As you can see, here the indentation is controlled by a single RS/RE
> > pair, and everything within it uses PP as a normal paragraph separator.
>
> While that also generates correct terminal and typographical (PS, PDF)
> output in the same purely presentational sense as .TP .IP .TP, it
> does not help with respect to the semantic problem we are discussing
> here.
>
> Look at the AST generated by mandoc(1):
>
> $ mandoc -T tree nested_indent.man
> title = "nested_indent"
> sec = "7"
> vol = "Miscellaneous Information Manual"
> os = "experiments"
> date = "2023-10-19"
>
> SH (block) *2:2
>   SH (head) 2:2 ID=HREF=Ingo_said:
>       Ingo (text) 2:5
>       said: (text) 2:10
>   SH (body) 2:2
>       TP (block) *3:2
>         TP (head) 3:2 ID=HREF
>             Todo (text) *4:1
>         TP (body) 4:1
>             Currently, when formatting .TP or .IP \
>               with a nonempty head, (text) *5:1
>             [yada yada] (text) *6:1
>       RS (block) *7:2
>         RS (head) 7:2
>         RS (body) 7:2
>             PP (block) *8:2
>               PP (head) 8:2
>               PP (body) 8:2
>                   When formatting .IP or .RS with an empty head,
>                     mandoc needs (text) *9:1
>                   [yada yada] (text) *10:1
>       TP (block) *12:2
>         TP (head) 12:2 ID=HREF=final
>             final tag (text) *13:1
>         TP (body) 13:1
>             final body (text) *14:1
>
> You see that the first .TP, the .RS, and the second .TP are all child
> nodes of the top-level .SH. The .RS is not a child of the .TP but
> a sibling. The two .TP nodes still aren't siblings of each other.
>
> Now on first sight, you might blame me for that and call it a mandoc
> artifact, arguing that mandoc instead ought to treat the .RS as a
> child of the first .TP. But no, that would be incorrect parsing
> for the following reason: the .TP implies an indentation, and
> the .RS also implies an indentation. If the .RS were a child of
> the .TP, we would get double indentation. You can make that
> argument even more convincing by adding a width argument to .RS
> and varying that argument. That way, you see that the .RS is
> indented relative to the .SH, not relative to the .TP.
>
> There are some cases where it is not completely clear whether one
> man(7) node following another man(7) node is a child or a sibling.
> mandoc(1) makes arbitrary choices in such ambiguous cases, usually
> opting for sibling relations where possible and avoiding unnecessary
> child relationships. But this is not an ambiguous case. Just like
> the .IP, the .RS is definitely a sibling and not a child of the .TP.
> As i said, no block can nest inside .TP.
>
> That's why i brought up .RS in my reply and developed rules
> for handling it in a similar way as .IP, even though you did
> not mention .RS before.
>
> > You could put the RS before the first paragraph, but then an unwanted
> > line break appears after the tag.
>
> No matter where you put the .RS, it will never be a child of .TP.
>
> > (Maybe man(7) could be tweaked so
> > that RS doesn't insert the line break after a TP.)
>
> Not really a useful idea because .RS doesn't help with the actual
> problem in the first place.
>
> > In the end I didn't switch to that scheme, because IP just worked, but
> > I might consider it if it proves to be useful. What do you think?
>
> As i said, i am not aware of a better solution than .TP .IP .TP.
> In particular, .RS is not better because it causes exactly the
> same trouble and potentially more trouble besides.
>
> But i also said that trying to define "good style" for man(7)
> is a fool's errand. Because man(7) code is so exceedingly difficult
> to write, man(7) code that is very clearly bad style is very often
> found in the wild, so there is ample opportunity for saying "this
> is bad style." In some cases, it is also possible to point out
> better style, for example
>
> .BR "some word" .
>
> is clearly better style than
>
> .B some word\c
> \&.
>
> even though both are correct man(7) code and even though there are
> situations in man(7) where \c is unavoidable.
>
> But very frequently, situations arise where man(7) doesn't really
> allow any good solution, and the best you can do is not making
> the source gratuitously worse than it needs to be.
>
> The .TP .IP .TP idiom is such an example. It's definitely ugly from
> both semantic and stylistic points of view, but no good solution
> is available. I'm willing to go further and claim that no better
> solution can be designed even if you are willing to introduce a new
> macro or change the way the .TP API is defined, even in incompatible
> ways, because it's not this particular macro that is broken. What is
> broken is the fundamental design of the language: the language not
> only predates the concept of semantic markup, but it also predates
> the concept of block nesting in markup languages. Yes, that is hard
> to believe for people born after 1970 because those people have
> essentially grown up with HTML and LaTeX and those two markup
> languages have defined their concept of what a markup language is,
> but let's face it, man(7) predates those fundamental concepts,
> and it shows all over the place.
>
> As long as people are using the language, mandoc(1) needs to somehow
> deal with the mess. I'm not happy with that because it is wasting a
> lot of development time which could be spent in more productive ways,
> but what can i do...
>
> Yours,
> Ingo
Hi Ingo,
Hmmm. You convinced me (about the problems of man(7)), I think.
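
For reference, a minimal sketch of the .TP .IP .TP idiom discussed
above.  The file name and all of the text in it are made up for
illustration and are not taken from any real page:

$ cat tp_ip_tp.man
.TH tp_ip_tp 7 2023-10-19 experiments
.SH Example
.TP
first tag
First paragraph of the first entry.
.IP
Second paragraph of the first entry,
kept at the same indentation by the empty-head IP.
.TP
second tag
Body of the second entry.

Going by the analysis quoted above, running "mandoc -T tree" on this
file should show the .IP as a sibling of the two .TP blocks under the
.SH body, not as a child of the first .TP.
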
Cheers,
Alex
--
<https://www.alejandro-colomar.es/>