discuss@mandoc.bsd.lv
From: Ingo Schwarze <schwarze@usta.de>
To: discuss@mdocml.bsd.lv
Cc: "Anthony J. Bentley" <anthony@cathet.us>
Subject: Re: Docbook bulleted lists
Date: Sun, 22 Sep 2013 14:29:22 +0200
Message-ID: <20130922122922.GD20512@iris.usta.de>
In-Reply-To: <26482.1379824174@cathet.us>

Hi Anthony,

time for a rant.  This got long, but there is something to learn
about roff(7), mdoc(7), man(7), mandoc(1), groff(1) and DocBook
inside.  I wanted to make sure that i really understand and am
not bashing DocBook crap without a good reason.  But i *do* want
to make sure that i *am* bashing DocBook hard, until it bleeds
and runs away as fast and as far as it can and never comes back!


Anthony J. Bentley wrote on Sat, Sep 21, 2013 at 10:29:34PM -0600:

> Stuart Henderson pointed out this formatting problem
> in the cclive(1) manual:

>> The bullet-point lists in the (docbook-xsl generated) manual
>> are slightly broken with mandoc:

Everything remotely related to DocBook is more than slightly broken.
It suffers from utter braindeadness in almost every respect.  When
you look at DocBook output, there is hardly ever a line that couldn't
be easily improved in at least one respect, often more than one.
To reiterate:

:: [...]
:: Of course, ports not matching these criteria might work as well,
:: so there is nothing wrong with checking if you like to. However,
::
::  * if a port is using DocBook, checking mandoc compatibility will
::    likely turn out to be a waste of time. 

Quoted from: http://www.openbsd.org/faq/ports/specialtopics.html#Mandoc

It is good that from time to time somebody does report one of the
many stupidities that DocBook is made up of.  So i see that my harsh
remark in the FAQ, listing it as the number ONE reason why stupid,
broken, unportable man(7) code exists in the world, still rings true.

>> -               oa regular expression pattern
>> +           o   a regular expression pattern
>>
>> -               oformat (media stream) to download
>> +           o   format (media stream) to download

Like, huh, how can *such* a thing happen?  =:-O

>> (manpage source at http://junkpile.org/cclive.1)
 
> Source of the above section (note that the problem shows up
> throughout the manual):
> 
> .RS 4
> .ie n \{\
> \h'-04'\(bu\h'+03'\c
> .\}
> .el \{\
> .sp -1
> .IP \(bu 2.3
> .\}
> a regular expression pattern
> .RE

Yeah, sure, it's obvious to you why one would write a bulleted
list this way in man(7) code, right?

What, you still have doubts?  Then let me translate this for people
not *that* fluent in low-level roff request syntax, and while doing
so, check my claim that each line of DocBook output is wrong or
stupid in at least one way:

 - Begin a high level indented paragraph, indenting by four times
   the width of the letter 'n' in the current font.  (RS)
    (One might hope that this makes using low-level indentation
     macros useless inside the paragraph, because why would
     anybody use a high-level indentation macro if he is going
     to use low-level indentation requests anyway.
     However, tough luck, that's not what RS is for.)
 - Begin a roff(7) conditional instruction.  (ie)
    - If we are in nroff mode, i.e. formatting for the terminal: (n)
       - Shift back left by four times the width of the letter 'n'.  (\h)
          (Like, huh, isn't that right where we came from?
           Besides, this is exactly why the man(7) language
           has the TP, IP and HP macros.  Any of these works quite
           well for bullet lists, but i guess DocBook people think they
           were not invented there.)
       - Print a bullet.  (\(bu)
       - Shift right by three times the width of the letter 'n'
         in the current font.  (\h)
       - Append the following text without inter-word spacing.  (\c)
          (Yes, this complication arises from the fact that we were
           mixing in low-level formatting instead of relying on the
           normal flow of text.  But wait, are we at the right place
           at all?  Usually, "nnnn" is not the same width as "nnno",
           is it?  Oh, we are in nroff mode, so we have a fixed-width
           font and this fine detail doesn't actually matter.
           Easy to forget that when you see so much low level stuff
           which is normally only required for delicate variable-width
           typesetting.  So, the spacing is actually correct, even though
           it looks wrong.  But don't ask me why it must be so complicated,
           in particular in nroff mode.)
    - If we are in troff mode, e.g. formatting for PostScript: (el)
       - Go back up the height of one text line.  (sp)
          (They do that because the IP macro below is going to emit
           a blank line.  Needless to say, resorting to low-level roff
           is rather pointless here, this is exactly what the man(7) PD
           macro is for.)
       - Start an indented paragraph, using the bullet as a paragraph
         head and an additional indentation of 2.3 times the width of the
         letter 'n' in the current font.  (IP)
          (Like, huh?  I thought the DocBook people didn't know IP
           existed, or why would they abuse RS for it?  And if they
           choose RS, why don't they stick to it, but go for IP later?
           They could have chosen IP right from the start, you know!
           Besides, this is typographically wrong.
           We want a bullet list, aligned with the enclosing text.
           Alignment of the bullet with the text is correct in nroff,
           check with `nroff -mandoc -c cclive.1 | less`.
           But in troff mode, the bullet goes much too far to the
           right, check with `groff -mandoc cclive.1 > cclive.ps`.
           Besides, the 2.3n is too much, the PostScript output
           looks ugly.)
    - End of conditional.
 - The actual text.
 - End of indented block.  (RE)
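
Stripped of its payload, the conditional skeleton dissected above can
be sketched like this (the two text lines are placeholders, not taken
from cclive(1)):

```roff
.\" Two-way roff conditional: the condition 'n' is true in nroff mode.
.ie n \{\
Formatted only in nroff mode, e.g. for the terminal.
.\}
.el \{\
Formatted only in troff mode, e.g. for PostScript.
.\}
```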

So, these idiots have made the output exceedingly complicated -
only to produce very poor results.  For nroff mode, portability is
poor, rendering is rather ugly (the 4n indent of the RS is excessive).
For troff mode, indentation is completely botched.
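
For comparison, a minimal sketch of the same list written with IP and
PD alone, with no conditional and no \h.  The 3n width is an assumption
chosen to mirror the TP width in the mandoc -Tman output shown further
down; it is not taken from cclive(1):

```roff
.\" Suppress the blank line between list items.
.PD 0
.IP \(bu 3n
a regular expression pattern
.IP \(bu 3n
format (media stream) to download
.\" Restore the default inter-paragraph distance.
.PD
```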

> For good measure, the original manual:
> https://github.com/legatvs/cclive/blob/maint-0.7/doc/man1/cclive.1.txt

See, all they want is:

:: The 'string pair' consists of:
::  * a regular expression pattern
::  * format (media stream) to download

I felt i had to check whether this is really as simple as it looks.
Small surprise - it indeed is as simple as it looks.

In mdoc(7), the code would look like so:

The
.Sq string pair
consists of:
.Bl -bullet -width 1n -compact
.It
a regular expression pattern
.It
format (media stream) to download
.El

The mandoc -Tman converter converts this to the following man(7) code:

The
\(oqstring pair\(cq
consists of:
.PD 0
.TP 3n
\fBo\fR
a regular expression pattern
.TP 3n
\fBo\fR
format (media stream) to download

Admittedly, man(7) code (in general) is uglier than mdoc(7) code.
But really, even autogenerated man(7) code has no excuse for looking
even remotely like DocBook crap.

With this code, `mandoc test.1`
and `mandoc -Tman test.1 | mandoc -Omdoc` give
identical, nice output:

     The `string pair' consists of:
     o  a regular expression pattern
     o  format (media stream) to download

The output of `nroff -c -mandoc test.1` is nearly identical (except
that groff outputs uglier bullets in mdoc(7)).
The output of `mandoc -Tman test.1 | nroff -c -mandoc` is identical
to `mandoc -Tman test.1 | mandoc`.

Even the `mandoc -Tps *.1` and the `groff -mandoc *.1`
PostScript output of both the mdoc(7) and the generated man(7)
versions look good and quite similar to each other.

So, it *is* easy to write short, clean code that can cleanly be
translated to short, clean man(7) code that works compatibly
for everything.  How DocBook regularly screws up so badly,
in more and weirder respects than i could make up, is beyond me.

Now, would this improve *your* motivation to implement \h in mandoc(1)?
I fear at some point i will have to do it - even if the main point
is to partially offset DocBook stupidity.

End of rant, finally.

For now, i have added an entry to our TODO list.

Yours,
  Ingo


Index: TODO
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/TODO,v
retrieving revision 1.154
diff -u -r1.154 TODO
--- TODO	14 Jul 2013 11:57:38 -0000	1.154
+++ TODO	22 Sep 2013 11:45:45 -0000
@@ -55,6 +55,11 @@
 
 - \c (interrupted text) should prevent the line break
   even inside .Bd literal; that occurs in chat(8)
+  also found in cclive(1) - DocBook output
+
+- \h horizontal move
+  found in cclive(1) DocBook output
+  Anthony J. Bentley on discuss@  Sat, 21 Sep 2013 22:29:34 -0600
 
 - using undefined strings or macros defines them to be empty
   wl@  Mon, 14 Nov 2011 14:37:01 +0000
--
 To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv

Thread overview: 3+ messages
2013-09-22  4:29 Anthony J. Bentley
2013-09-22 12:29 ` Ingo Schwarze [this message]
2013-09-23  3:12   ` Anthony J. Bentley