Date: Sun, 22 Sep 2013 14:29:22 +0200
From: Ingo Schwarze
To: discuss@mdocml.bsd.lv
Cc: "Anthony J. Bentley"
Subject: Re: Docbook bulleted lists
Message-ID: <20130922122922.GD20512@iris.usta.de>
References: <26482.1379824174@cathet.us>
In-Reply-To: <26482.1379824174@cathet.us>

Hi Anthony,

time for a rant.

This got long, but there is something to learn about roff(7),
mdoc(7), man(7), mandoc(1), groff(1) and DocBook inside.
I wanted to make sure that i really understand and am not bashing
DocBook crap without a good reason. But i *do* want to make sure
that i *am* bashing DocBook hard, until it bleeds and runs away
as fast and as far as it can and never comes back!

Anthony J.
Bentley wrote on Sat, Sep 21, 2013 at 10:29:34PM -0600:

> Stuart Henderson pointed out this formatting problem
> in the cclive(1) manual:
>> The bullet-point lists in the (docbook-xsl generated) manual
>> are slightly broken with mandoc:

Everything remotely related to DocBook is more than slightly broken.
It suffers from utter braindeadness in almost any respect.
When you look at DocBook output, there is hardly ever a line that
couldn't be easily improved in at least one respect, often more
than one.

To reiterate:

:: [...]
:: Of course, ports not matching these criteria might work as well,
:: so there is nothing wrong with checking if you like to. However,
::
::  * if a port is using DocBook, checking mandoc compatibility will
::    likely turn out to be a waste of time.

Quoted from: http://www.openbsd.org/faq/ports/specialtopics.html#Mandoc

It is good that from time to time somebody does report one of the
many stupidities that DocBook is made up of. So i see that my harsh
remark in the FAQ, listing it as the number ONE reason why stupid,
broken, unportable man(7) code exists in the world, still rings true.

>> - oa regular expression pattern
>> + o a regular expression pattern
>>
>> - oformat (media stream) to download
>> + o format (media stream) to download

Like, huh, how can *such* a thing happen? =:-O

>> (manpage source at http://junkpile.org/cclive.1)

> Source of the above section (note that the problem shows up
> throughout the manual):
>
> .RS 4
> .ie n \{\
> \h'-04'\(bu\h'+03'\c
> .\}
> .el \{\
> .sp -1
> .IP \(bu 2.3
> .\}
> a regular expression pattern
> .RE

Yeah, sure, it's obvious to you why one would write a bulleted list
this way in man(7) code, right? What, you still have doubt?
Then let me translate this for people not *that* fluent in low-level
roff request syntax, and while doing so, check my claim that each
line of DocBook output is wrong or stupid in at least one way:

 - Begin a high level indented paragraph, indenting by four times
   the width of the letter 'n' in the current font. (RS)
   (One might hope that this makes using low-level indentation
   macros useless inside the paragraph, because why would anybody
   use a high-level indentation macro if he is going to use
   low-level indentation requests anyway. However, tough luck,
   that's not what RS is for.)
 - Begin a roff(7) conditional instruction. (ie)
    - If we are in nroff mode, i.e. formatting for the terminal: (n)
       - Shift back left by four times the width of the letter 'n'. (\h)
         (Like, huh, isn't that right where we came from?
         Besides, this is exactly why the man(7) language has
         the TP, IP and HP macros. Either of these works quite well
         for bullet lists but i guess DocBook people think they
         were not invented there.)
       - Print a bullet. (\(bu)
       - Shift right by three times the width of the letter 'n'
         in the current font. (\h)
       - Append the following text without inter-word spacing. (\c)
         (Yes, this complication arises from the fact that we were
         mixing in low-level formatting instead of relying on the
         normal flow of text. But wait, are we at the right place
         at all? Usually, "nnnn" is not the same width as "nnno",
         is it? Oh, we are in nroff mode, so we have a fixed-width
         font and this fine detail doesn't actually matter. Easy to
         forget that when you see so much low level stuff which is
         normally only required for delicate variable-width
         typesetting. So, the spacing is actually correct, even
         though it looks wrong. But don't ask me why it must be so
         complicated, in particular in nroff mode.)
    - If we are in troff mode, e.g. formatting for PostScript: (el)
       - Go back up the height of one text line. (sp)
         (They do that because the IP macro below is going to emit
         a blank line.
         Needless to say, resorting to low-level roff is rather
         pointless here, this is exactly what the man(7) PD macro
         is for.)
       - Start an indented paragraph, using the bullet as a paragraph
         head and an additional indentation of 2.3 times the width
         of the letter 'n' in the current font.
         (Like, huh? I thought the DocBook people didn't know IP
         existed, or why would they abuse RS for it? And if they
         choose RS, why don't they stick to it, but go for IP later?
         They could have chosen IP right from the start, you know!
         Besides, this is typographically wrong. We want a bullet
         list, aligned with the enclosing text. Alignment of the
         bullet with the text is correct in nroff, check with
         `nroff -mandoc -c cclive.1 | less`. But in troff mode, the
         bullet goes much too far to the right, check with
         `groff -mandoc cclive.1 > cclive.ps`. Besides, the 2.3n
         is too much, the PostScript output looks ugly.)
 - End of conditional.
 - The actual text.
 - End of indented block. (RE)

So, these idiots have made the output exceedingly complicated -
only to produce very poor results. For nroff mode, portability
is poor, rendering is rather ugly (the 4n indent of the RS is
excessive). For troff mode, indentation is completely botched.

> For good measure, the original manual:
> https://github.com/legatvs/cclive/blob/maint-0.7/doc/man1/cclive.1.txt

See, all they want is:

:: The 'string pair' consists of:
::  * a regular expression pattern
::  * format (media stream) to download

I felt i had to check whether this is really as simple as it looks.
Small surprise - it indeed is as simple as it looks.

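For comparison, here is roughly what the same list could look like
when written by hand in plain man(7), using only the IP and PD macros
mentioned above. This is a sketch, not code taken from the cclive
sources: the 2n indentation is an arbitrary assumption, and it has not
been run through mandoc(1) or groff(1) here.

```roff
.\" Hypothetical hand-written man(7) sketch; the 2n width is an assumption.
The \(oqstring pair\(cq consists of:
.\" Suppress the vertical space between the following paragraphs.
.PD 0
.\" Use the bullet as the paragraph tag, with the body indented by 2n.
.IP \(bu 2n
a regular expression pattern
.IP \(bu 2n
format (media stream) to download
.\" Restore the default inter-paragraph distance.
.PD
```

No RS, no conditionals and no \h or \c are involved, so the text after
each bullet relies on the normal flow of text in both nroff and troff
mode.
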
In mdoc(7), the code would look like so:

  The
  .Sq string pair
  consists of:
  .Bl -bullet -width 1n -compact
  .It
  a regular expression pattern
  .It
  format (media stream) to download
  .El

The mandoc -Tman converter converts this to the following man(7) code:

  The \(oqstring pair\(cq consists of:
  .PD 0
  .TP 3n
  \fBo\fR
  a regular expression pattern
  .TP 3n
  \fBo\fR
  format (media stream) to download

Admittedly, man(7) code (in general) is uglier than mdoc(7) code.
But really, even autogenerated man(7) code must by no means look
even remotely like DocBook crap.

With this code, `mandoc test.1` and `mandoc -Tman test.1 | mandoc -Omdoc`
give identical, nice output:

  The `string pair' consists of:
  o a regular expression pattern
  o format (media stream) to download

The output of `nroff -c -mandoc test.1` is nearly identical
(except that groff outputs uglier bullets in mdoc(7)).
The output of `mandoc -Tman test.1 | nroff -c -mandoc` is identical
to `mandoc -Tman test.1 | mandoc`.

Even the `mandoc -Tps *.1` and the `groff -mandoc *.1` PostScript
output of both the mdoc(7) and the generated man(7) versions look
good and quite similar to each other.

So, it *is* easy to write short, clean code that can cleanly be
translated to short, clean man(7) code that works compatibly for
everything. How DocBook regularly screws up so badly, in more
and weirder respects than i could make up, is beyond me.

Now, would this improve *your* motivation to implement \h in
mandoc(1)? I fear at some point i will have to do it - even if
the main point is to partially offset DocBook stupidity.

End of rant, finally.

For now, i have added an entry to our TODO list.

Yours,
  Ingo


Index: TODO
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/TODO,v
retrieving revision 1.154
diff -u -r1.154 TODO
--- TODO 14 Jul 2013 11:57:38 -0000 1.154
+++ TODO 22 Sep 2013 11:45:45 -0000
@@ -55,6 +55,11 @@
 - \c (interrupted text) should prevent the line break
   even inside .Bd literal; that occurs in chat(8)
+  also found in cclive(1) - DocBook output
+
+- \h horizontal move
+  found in cclive(1) DocBook output
+  Anthony J. Bentley on discuss@ Sat, 21 Sep 2013 22:29:34 -0600

 - using undefined strings or macros defines them to be empty
   wl@ Mon, 14 Nov 2011 14:37:01 +0000