On Thu, Oct 19, 2023 at 06:19:23PM +0200, Ingo Schwarze wrote:
> Hi Alejandro,
>
> Alejandro Colomar wrote on Thu, Oct 19, 2023 at 05:17:10PM +0200:
>
> > I had this gripe with man(7) some years ago. I thought of using the
> > following instead, which slightly complicates the source code, but makes
> > it more logical.
> >
> > $ cat nested_indent.man
> > .TH nested_indent 7 2023-10-19 experiments
> > .SH Ingo said:
> > .TP
> > Todo
> > Currently, when formatting .TP or .IP with a non-empty head,
> > [yada yada]
> > .RS
> > .PP
> > When formatting .IP or .RS with an empty head, mandoc needs
> > [yada yada]
> > .RE
> >
> > As you can see, here the indentation is controlled by a single RS/RE
> > pair, and everything within it uses PP as a normal paragraph separator.
>
> While that also generates correct terminal and typographical (PS, PDF)
> output in the same purely presentational sense as .TP .IP .TP, it
> does not help with respect to the semantic problem we are discussing
> here.
>
> Look at the AST generated by mandoc(1):
>
> $ mandoc -T tree nested_indent.man
> title = "nested_indent"
> sec = "7"
> vol = "Miscellaneous Information Manual"
> os = "experiments"
> date = "2023-10-19"
>
> SH (block) *2:2
>   SH (head) 2:2 ID=HREF=Ingo_said:
>       Ingo (text) 2:5
>       said: (text) 2:10
>   SH (body) 2:2
>       TP (block) *3:2
>         TP (head) 3:2 ID=HREF
>             Todo (text) *4:1
>         TP (body) 4:1
>             Currently, when formatting .TP or .IP \
>               with a nonempty head, (text) *5:1
>             [yada yada] (text) *6:1
>       RS (block) *7:2
>         RS (head) 7:2
>         RS (body) 7:2
>             PP (block) *8:2
>               PP (head) 8:2
>               PP (body) 8:2
>                   When formatting .IP or .RS with an empty head,
>                     mandoc needs (text) *9:1
>                   [yada yada] (text) *10:1
>       TP (block) *12:2
>         TP (head) 12:2 ID=HREF=final
>             final tag (text) *13:1
>         TP (body) 13:1
>             final body (text) *14:1
>
> You see that the first .TP, the .RS, and the second .TP are all child
> nodes of the top-level .SH. The .RS is not a child of the .TP but
> a sibling.
> The two .TP nodes still aren't siblings of each other.
>
> Now at first sight, you might blame me for that and call it a mandoc
> artifact, arguing that mandoc instead ought to treat the .RS as a
> child of the first .TP. But no, that would be incorrect parsing
> for the following reason: the .TP implies an indentation, and
> the .RS also implies an indentation. If the .RS were a child of
> the .TP, we would get double indentation. You can make that
> argument even more convincing by adding a width argument to .RS
> and varying that argument. That way, you see that the .RS is
> indented relative to the .SH, not relative to the .TP.
>
> There are some cases where it is not completely clear whether one
> man(7) node following another man(7) node is a child or a sibling.
> mandoc(1) makes arbitrary choices in such ambiguous cases, usually
> opting for sibling relations where possible and avoiding unnecessary
> child relationships. But this is not an ambiguous case. Just like
> the .IP, the .RS is definitely a sibling and not a child of the .TP.
> As i said, no block can nest inside .TP.
>
> That's why i brought up .RS in my reply and developed rules
> for handling it in a similar way as .IP, even though you did
> not mention .RS before.
>
> > You could put the RS before the first paragraph, but then an unwanted
> > line break appears after the tag.
>
> No matter where you put the .RS, it will never be a child of .TP.
>
> > (Maybe man(7) could be tweaked so
> > that RS doesn't insert the line break after a TP.)
>
> Not really a useful idea because .RS doesn't help with the actual
> problem in the first place.
>
> > In the end I didn't switch to that scheme, because IP just worked, but
> > I might consider it if it proves to be useful. What do you think?
>
> As i said, i am not aware of a better solution than .TP .IP .TP.
> In particular, .RS is not better because it causes exactly the
> same trouble and potentially more trouble besides.
>
> But i also said that trying to define "good style" for man(7)
> is a fool's errand. Because man(7) code is so exceedingly difficult
> to write, man(7) code that is very clearly bad style is very often
> found in the wild, so there is ample opportunity for saying "this
> is bad style." In some cases, it is also possible to point out
> better style, for example
>
>   .BR "some word" .
>
> is clearly better style than
>
>   .B some word\c
>   \&.
>
> even though both are correct man(7) code and even though there are
> situations in man(7) where \c is unavoidable.
>
> But very frequently, situations arise where man(7) doesn't really
> allow any good solution, and the best you can do is not making
> the source gratuitously worse than it needs to be.
>
> The .TP .IP .TP idiom is such an example. It's definitely ugly from
> both semantic and stylistic points of view, but no good solution
> is available. I'm willing to go further and claim that no better
> solution can be designed even if you are willing to introduce a new
> macro or change the way the .TP API is defined, even in incompatible
> ways, because it's not this particular macro that is broken. What is
> broken is the fundamental design of the language: the language not
> only predates the concept of semantic markup, but it also predates
> the concept of block nesting in markup languages. Yes, that is hard
> to believe for people born after 1970 because those people have
> essentially grown up with HTML and LaTeX and those two markup
> languages have defined their concept of what a markup language is,
> but let's face it, man(7) predates those fundamental concepts,
> and it shows all over the place.
>
> As long as people are using the language, mandoc(1) needs to somehow
> deal with the mess. I'm not happy with that because it is wasting a
> lot of development time which could be spent in more productive ways,
> but what can i do...
>
> Yours,
> Ingo

Hi Ingo,

Hmmm.
You convinced me (about the problems of man(7)), I think.

Cheers,
Alex

--