tech@mandoc.bsd.lv
From: Alejandro Colomar <alx@kernel.org>
To: Ingo Schwarze <schwarze@usta.de>
Cc: tech@mandoc.bsd.lv, branden@debian.org
Subject: Re: mandoc -man -Thtml bug: inconsistent vertical space before .TP
Date: Thu, 19 Oct 2023 23:32:42 +0200
Message-ID: <ZTGggTYHc-vKcv4e@debian>
In-Reply-To: <ZTFXC0LDDyS0+X/7@asta-kit.de>

On Thu, Oct 19, 2023 at 06:19:23PM +0200, Ingo Schwarze wrote:
> Hi Alejandro,
> 
> Alejandro Colomar wrote on Thu, Oct 19, 2023 at 05:17:10PM +0200:
> 
> > I had this gripe with man(7) some years ago.  I thought of using the
> > following instead, which slightly complicates the source code, but makes
> > it more logical.
> > 
> > 	$ cat nested_indent.man 
> > 	.TH nested_indent 7 2023-10-19 experiments
> > 	.SH Ingo said:
> > 	.TP
> > 	Todo
> > 	Currently, when formatting .TP or .IP with a non-empty head,
> > 	[yada yada]
> > 	.RS
> > 	.PP
> > 	When formatting .IP or .RS with an empty head, mandoc needs
> > 	[yada yada]
> > 	.RE
> > 
> > As you can see, here the indentation is controlled by a single RS/RE
> > pair, and everything within it uses PP as a normal paragraph separator.
> 
> While that also generates correct terminal and typographical (PS, PDF)
> output in the same purely presentational sense as .TP .IP .TP, it
> does not help with respect to the semantic problem we are discussing
> here.
> 
> Look at the AST generated by mandoc(1):
> 
>   $ mandoc -T tree nested_indent.man
>   title = "nested_indent"
>   sec   = "7"
>   vol   = "Miscellaneous Information Manual"
>   os    = "experiments"
>   date  = "2023-10-19"
>   
>   SH (block) *2:2
>     SH (head) 2:2 ID=HREF=Ingo_said:
>         Ingo (text) 2:5
>         said: (text) 2:10
>     SH (body) 2:2
>         TP (block) *3:2
>           TP (head) 3:2 ID=HREF
>               Todo (text) *4:1
>           TP (body) 4:1
>               Currently, when formatting .TP or .IP \
>                   with a nonempty head, (text) *5:1
>               [yada yada] (text) *6:1
>         RS (block) *7:2
>           RS (head) 7:2
>           RS (body) 7:2
>               PP (block) *8:2
>                 PP (head) 8:2
>                 PP (body) 8:2
>                     When formatting .IP or .RS with an empty head,
>                         mandoc needs (text) *9:1
>                     [yada yada] (text) *10:1
>         TP (block) *12:2
>           TP (head) 12:2 ID=HREF=final
>               final tag (text) *13:1
>           TP (body) 13:1
>               final body (text) *14:1
> 
> You see that the first .TP, the .RS, and the second .TP are all child
> nodes of the top-level .SH.  The .RS is not a child of the .TP but
> a sibling.  The two .TP nodes still aren't siblings of each other.
> 
> Now at first sight, you might blame me for that and call it a mandoc
> artifact, arguing that mandoc instead ought to treat the .RS as a
> child of the first .TP.  But no, that would be incorrect parsing
> for the following reason: the .TP implies an indentation, and
> the .RS also implies an indentation.  If the .RS were a child of
> the .TP, we would get double indentation.  You can make that
> argument even more convincing by adding a width argument to .RS
> and varying that argument.  That way, you see that the .RS is
> indented relative to the .SH, not relative to the .TP.
> 
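
A minimal sketch of the width experiment described above, for anyone
who wants to try it.  The file name rs_width.man, the section title,
and the widths 4n and 12n are illustrative assumptions, not taken
from the thread:

```roff
.\" rs_width.man: hypothetical test file for the .RS width argument.
.TH rs_width 7 2023-10-19 experiments
.SH Width experiment
.TP
tag
Body of the tagged paragraph.
.\" Change 4n to 12n and format again to compare the two results.
.RS 4n
.PP
Paragraph inside the RS block.
.RE
```

Format it with "mandoc rs_width.man", change the argument, and format
it again.  If the description above holds, the paragraph inside .RS
moves with the width argument and is positioned relative to the .SH
margin, independently of the indentation of the .TP body.
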
> There are some cases where it is not completely clear whether one
> man(7) node following another man(7) node is a child or a sibling.
> mandoc(1) makes arbitrary choices in such ambiguous cases, usually
> opting for sibling relations where possible and avoiding unnecessary
> child relationships.  But this is not an ambiguous case.  Just like
> the .IP, the .RS is definitely a sibling and not a child of the .TP.
> As i said, no block can nest inside .TP.
> 
> That's why i brought up .RS in my reply and developed rules
> for handling it in a similar way as .IP, even though you did
> not mention .RS before.
> 
> > You could put the RS before the first paragraph, but then an unwanted
> > line break appears after the tag.
> 
> No matter where you put the .RS, it will never be a child of .TP.
> 
> > (Maybe man(7) could be tweaked so
> > that RS doesn't insert the line break after a TP.)
> 
> Not really a useful idea because .RS doesn't help with the actual
> problem in the first place.
> 
> > In the end I didn't switch to that scheme, because IP just worked, but
> > I might consider it if it proves to be useful.  What do you think?
> 
> As i said, i am not aware of a better solution than .TP .IP .TP.
> In particular, .RS is not better because it causes exactly the
> same trouble and potentially more trouble besides.
> 
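
For reference, a minimal sketch of what the .TP .IP .TP idiom looks
like in source form.  The file name, tags, and paragraph texts are
made up for illustration:

```roff
.\" tp_ip_tp.man: hypothetical two-item tagged list.
.TH tp_ip_tp 7 2023-10-19 experiments
.SH Idiom
.TP
first tag
First paragraph of the first item.
.\" No block can nest inside .TP, so the item is continued with .IP.
.IP
Second paragraph of the first item.
.TP
second tag
Only paragraph of the second item.
```

Running "mandoc -T tree" on such a file is a quick way to check how
the .IP block is placed relative to the two .TP blocks.  According to
the explanation quoted above, it should appear as a sibling of the
.TP blocks rather than as a child of the first one.
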
> But i also said that trying to define "good style" for man(7)
> is a fool's errand.  Because man(7) code is so exceedingly difficult
> to write, man(7) code that is very clearly bad style is very often
> found in the wild, so there is ample opportunity for saying "this
> is bad style."  In some cases, it is also possible to point out
> better style, for example
> 
>   .BR "some word" .
> 
> is clearly better style than
> 
>   .B some word\c
>   \&.
> 
> even though both are correct man(7) code and even though there are
> situations in man(7) where \c is unavoidable.
> 
> But very frequently, situations arise where man(7) doesn't really
> allow any good solution, and the best you can do is not making
> the source gratuitously worse than it needs to be.
> 
> The .TP .IP .TP idiom is such an example.  It's definitely ugly from
> both semantic and stylistic points of view, but no good solution
> is available.  I'm willing to go further and claim that no better
> solution can be designed even if you are willing to introduce a new
> macro or change the way the .TP API is defined, even in incompatible
> ways, because it's not this particular macro that is broken.  What is
> broken is the fundamental design of the language: the language not
> only predates the concept of semantic markup, but it also predates
> the concept of block nesting in markup languages.  Yes, that is hard
> to believe for people born after 1970 because those people have
> essentially grown up with HTML and LaTeX and those two markup
> languages have defined their concept of what a markup language is,
> but let's face it, man(7) predates those fundamental concepts,
> and it shows all over the place.
> 
> As long as people are using the language, mandoc(1) needs to somehow
> deal with the mess.  I'm not happy with that because it is wasting a
> lot of development time which could be spent in more productive ways,
> but what can i do...
> 
> Yours,
>   Ingo

Hi Ingo,

Hmmm.  You convinced me (about the problems of man(7)), I think.


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>
