discuss@mandoc.bsd.lv
* Dashes and strange markup
@ 2017-03-06 11:17 Dag-Erling Smørgrav
  2017-03-06 14:55 ` Anthony J. Bentley
  2017-03-06 15:57 ` Ingo Schwarze
  0 siblings, 2 replies; 6+ messages in thread
From: Dag-Erling Smørgrav @ 2017-03-06 11:17 UTC (permalink / raw)
  To: discuss

I came across a FreeBSD man page that left me scratching my head, and I
was hoping you could explain what is going on and how to fix it.  Let me
show you the output first:

     The file that goes by the name of beastie.4th is a set of commands
     designed to draw the ASCII art FreeBSD mascot — known simply as beastie —
     to the right of the boot loader menu.  The commands of beastie.4th by

Nicely formatted with Unicode em-dashes, both in FreeBSD 10 (which uses
groff) and FreeBSD 11 (which uses mandoc).  But whatis tells a different
story:

| % uname -r
| 10.3-RELEASE-p11
| % PAGER=cat whatis beastie
| beastie.4th(8)           - FreeBSD ASCII art boot module

fine, but:

| des@hive ~% uname -r
| 11.0-RELEASE-p2
| des@hive ~% PAGER=cat whatis beastie
| beastie.4th(8) - FreeBSD ASCII art boot module known simply as beastie to the right of the boot loader menu. [...]

I cut it short, but it prints out the entire DESCRIPTION section instead
of just the document description.

Here's an excerpt from the source:

| .Sh NAME
| .Nm beastie.4th
| .Nd FreeBSD ASCII art boot module
| .Sh DESCRIPTION
| The file that goes by the name of
| .Nm
| is a set of commands designed to draw the ASCII art FreeBSD mascot
| .Nd known simply as
| .Ic beastie
| .Nd to the right of the boot loader menu.

So apparently Nd was (ab)used for an en- or em-dash, and I can sort of
understand it because it is documented to "print a dash followed by its
arguments", and mandoc's makewhatis is less forgiving than groff.

The next issue is that I tried to replace .Nd with either \(en or \(em,
but the former is rendered as a single hyphen and the latter as two
hyphens, instead of actual en- or em-dashes.  Is that intentional?

(I also replaced .Ic with .Em, but that's completely orthogonal)

The third issue is that according to conventional English typography[*],
the correct usage is an em-dash with no surrounding spaces, and I can't
figure out how to suppress the spaces, short of gluing everything
together (like\(em\&this).  Then again, I'm not sure how this will
affect line wrapping, so perhaps I should follow Oxford's example and
switch from an em-dash without spaces to an en-dash with spaces...  but
I'd still like to know if there's a better way to suppress the spaces.

[*] see for instance the Chicago Manual of Style and older editions of
the Oxford Style Manual.

DES
-- 
Dag-Erling Smørgrav - des@des.no
--
 To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv


* Re: Dashes and strange markup
  2017-03-06 11:17 Dashes and strange markup Dag-Erling Smørgrav
@ 2017-03-06 14:55 ` Anthony J. Bentley
  2017-03-06 16:52   ` Dag-Erling Smørgrav
  2017-03-06 15:57 ` Ingo Schwarze
  1 sibling, 1 reply; 6+ messages in thread
From: Anthony J. Bentley @ 2017-03-06 14:55 UTC (permalink / raw)
  To: discuss

Dag-Erling Smørgrav writes:
> | .Sh NAME
> | .Nm beastie.4th
> | .Nd FreeBSD ASCII art boot module
> | .Sh DESCRIPTION
> | The file that goes by the name of
> | .Nm
> | is a set of commands designed to draw the ASCII art FreeBSD mascot
> | .Nd known simply as
> | .Ic beastie
> | .Nd to the right of the boot loader menu.

Mandoc's behavior aside, this manual should be using \(em or \(en instead
of .Nd here.
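
As a sketch of that replacement (an illustration only, not necessarily
the change that was made to the FreeBSD tree), the DESCRIPTION excerpt
quoted above could be written with \(em on ordinary text lines, leaving
the other macros untouched:

```roff
.\" Sketch only: the quoted excerpt with \(em in place of the stray .Nd macros.
.Sh DESCRIPTION
The file that goes by the name of
.Nm
is a set of commands designed to draw the ASCII art FreeBSD mascot
\(em known simply as
.Ic beastie
\(em to the right of the boot loader menu.
```

With the dashes written as plain text rather than as .Nd blocks, the
only description line left in the page is the one in the NAME section.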

> The next issue is that I tried to replace .Nd with either \(en or \(em,
> but the former is rendered as a single hyphen and the latter as two
> hyphens, instead of actual en- or em-dashes.  Is that intentional?

\(en and \(em become en and em dashes in a Unicode locale. In an ASCII
locale, they become one and two hyphens, respectively.

> The third issue is that according to conventional English typography[*],
> the correct usage is an em-dash with no surrounding spaces, and I can't
> figure out how to suppress the spaces, short of gluing everything
> together (like\(em\&this).

I don't really understand the question here. If you put spaces in,
there will be spaces. If you don't, there won't be. It's common usage
to just leave out the spaces. For example, with double quotes:

  \(lqquotation here\(rq
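
The same approach applies to dashes.  A minimal sketch, assuming the
sentence from the beastie.4th excerpt: on a text line the escape is
written with no blanks around it, and where the dash has to follow a
macro, the Ns macro documented in mdoc(7) suppresses the space that
would otherwise be inserted after the macro's output:

```roff
.\" Sketch only: unspaced em-dashes, in text and directly after a macro.
is a set of commands designed to draw the ASCII art FreeBSD
mascot\(emknown simply as
.Ic beastie Ns \(emto the right of the boot loader menu.
```

This should render with both dashes set tight against the neighbouring
words; text following Ns is treated as normal text, so only "beastie"
is affected by .Ic.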

-- 
Anthony J. Bentley

* Re: Dashes and strange markup
  2017-03-06 11:17 Dashes and strange markup Dag-Erling Smørgrav
  2017-03-06 14:55 ` Anthony J. Bentley
@ 2017-03-06 15:57 ` Ingo Schwarze
  2017-03-06 16:39   ` Dag-Erling Smørgrav
  1 sibling, 1 reply; 6+ messages in thread
From: Ingo Schwarze @ 2017-03-06 15:57 UTC (permalink / raw)
  To: Dag-Erling Smørgrav; +Cc: discuss

Hi,

Dag-Erling Smoergrav wrote on Mon, Mar 06, 2017 at 12:17:30PM +0100:

>> des@hive ~% uname -r
>> 11.0-RELEASE-p2
>> des@hive ~% PAGER=cat whatis beastie
>> beastie.4th(8) - FreeBSD ASCII art boot module known simply as beastie
>>   to the right of the boot loader menu. [...]

> I cut it short, but it prints out the entire DESCRIPTION section instead
> of just the document description.

That's correct behaviour.

If you specify a one-line description, it is supposed to appear in whatis(1).

> So apparently Nd was (ab)used

"Abused" is indeed the correct word.  I have never seen such horrible
abuse of .Nd before.

> for an en- or em-dash, and I can sort of understand it because it is
> documented to "print a dash followed by its arguments",

Well, the old mdoc.samples(7) was very imprecise in many places.
The newer groff_mdoc(7) is now much better, and it clearly says
that the usage in question is incorrect:

        The NAME section consists of at least three items.  The
        first is the .Nm name macro naming the subject of the man
        page.  The second is the name description macro, .Nd, which
        separates the subject name from the third item, which is
        the description.  The description should be the most terse
        and lucid possible, as the space available is small.

That clearly says that what follows .Nd is the description.
It says that it is intended for use in the NAME section; admittedly,
it does not explicitly say that it should not be used outside the
NAME section, but given the content of the above paragraph, that
more or less goes without saying.

The mdoc(7) manual is more explicit:

   Nd
     A one line description of the manual's content.  This is the
     mandatory last macro of the NAME section and not appropriate
     for other sections.

     Examples:
           .Nd mdoc language reference
           .Nd format and display UNIX manuals

     The Nd macro technically accepts child macros and terminates
     with a subsequent Sh invocation.  Do not assume this behaviour:
     some whatis(1) database generators are not smart enough to
     parse more than the line arguments and will display macros
     verbatim.

> and mandoc's makewhatis is less forgiving than groff.

Well, abusing .Nd outside NAME is so bizarre that being forgiving
about it would be a bad idea.  The much more common abuse that
needs to be handled is multi-line .Nd blocks containing macros
roughly as follows:

  .Sh NAME
  .Nm ugliness
  .Nd this is a
  .Em very
  stupid idea, but somewhat widespread
  .Sh DESCRIPTION

People would bitterly complain about

  $ whatis ugliness
  ugliness(1) - this is a

Anthony correctly answered the rest, so my impression is that no
code or documentation changes are required.

Yours,
  Ingo

* Re: Dashes and strange markup
  2017-03-06 15:57 ` Ingo Schwarze
@ 2017-03-06 16:39   ` Dag-Erling Smørgrav
  2017-03-06 17:34     ` Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Dag-Erling Smørgrav @ 2017-03-06 16:39 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: discuss

Ingo Schwarze <schwarze@usta.de> writes:
> Well, abusing .Nd outside NAME is so bizarre that being forgiving
> about it would be a bad idea.

Technically, mandoc *is* forgiving about it: it renders it the way the
author (incorrectly) expected instead of throwing an error, which is
probably why it has gone unnoticed until now.

> Anthony correctly answered the rest, so my impression is that no
> code or documentation changes are required.

I asked for advice, not changes.

DES
-- 
Dag-Erling Smørgrav - des@des.no

* Re: Dashes and strange markup
  2017-03-06 14:55 ` Anthony J. Bentley
@ 2017-03-06 16:52   ` Dag-Erling Smørgrav
  0 siblings, 0 replies; 6+ messages in thread
From: Dag-Erling Smørgrav @ 2017-03-06 16:52 UTC (permalink / raw)
  To: Anthony J. Bentley; +Cc: discuss

"Anthony J. Bentley" <anthony@anjbe.name> writes:
> Dag-Erling Smørgrav <des@des.no> writes:
> > The next issue is that I tried to replace .Nd with either \(en or
> > \(em, but the former is rendered as a single hyphen and the latter
> > as two hyphens, instead of actual en- or em-dashes.  Is that
> > intentional?
> \(en and \(em become en and em dashes in a Unicode locale. In an ASCII
> locale, they become one and two hyphens, respectively.

My locale is en_GB.UTF-8, but it turns out I was looking at the page
with two different versions of mandoc, and while the older version
(1.12.1) didn't format the dashes properly, the newer one does.

> > The third issue is that [...] I can't figure out how to suppress the
> > spaces, short of gluing everything together (like\(em\&this).
> I don't really understand the question here.

That's OK, it was meant exactly as you interpreted it :)

DES
-- 
Dag-Erling Smørgrav - des@des.no

* Re: Dashes and strange markup
  2017-03-06 16:39   ` Dag-Erling Smørgrav
@ 2017-03-06 17:34     ` Ingo Schwarze
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Schwarze @ 2017-03-06 17:34 UTC (permalink / raw)
  To: Dag-Erling Smørgrav; +Cc: discuss

Hi,

Dag-Erling Smoergrav wrote on Mon, Mar 06, 2017 at 05:39:55PM +0100:
> Ingo Schwarze <schwarze@usta.de> writes:

>> Well, abusing .Nd outside NAME is so bizarre that being forgiving
>> about it would be a bad idea.

> Technically, mandoc *is* forgiving about it: it renders it the way the
> author (incorrectly) expected instead of throwing an error, which is
> probably why it has gone unnoticed until now.

Good point!

Now, mandoc(1) never errors out, which is an important feature.

But it should indeed throw a warning.

I just committed the patch below, implementing the missing warning.
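
A minimal input that should exercise the new check might look as
follows.  This is a hypothetical test page, not the broken.in file
from the regression suite; the page name and text are made up for
illustration.

```roff
.\" Hypothetical page: the second .Nd sits outside the NAME section.
.Dd March 6, 2017
.Dt EXAMPLE 1
.Os
.Sh NAME
.Nm example
.Nd hypothetical page for the new warning
.Sh DESCRIPTION
This sentence is followed by a misplaced description line
.Nd which is outside the NAME section.
```

Running a mandoc build that includes this commit with -T lint on such
a file should report "description line outside NAME section" for the
final line, while the page itself is still rendered.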

>> Anthony correctly answered the rest, so my impression is that no
>> code or documentation changes are required.

> I asked for advice, not changes.

Fair enough.

Then again, when intelligent people ask for advice, there is often
something that can be improved.  :)

Thanks for the report,
  Ingo


Log Message:
-----------
Using .Nd only makes sense in the NAME section.
Warn if that macro occurs elsewhere.
Triggered by a question from Dag-Erling Smoergrav <des @ FreeBSD>.

Modified Files:
--------------
    mdocml:
        mandoc.1
        mandoc.h
        mdoc_validate.c
        read.c
    mdocml/regress/mdoc/Nd:
        broken.out_lint

Revision Data
-------------
Index: mandoc.1
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mandoc.1,v
retrieving revision 1.176
retrieving revision 1.177
diff -Lmandoc.1 -Lmandoc.1 -u -p -r1.176 -r1.177
--- mandoc.1
+++ mandoc.1
@@ -908,6 +908,14 @@ The
 .Ic \&Nd
 macro lacks the required argument.
 The title line of the manual will end after the dash.
+.It Sy "description line outside NAME section"
+.Pq mdoc
+An
+.Ic \&Nd
+macro appears outside the NAME section.
+The arguments are printed anyway and the following text is used for
+.Xr apropos 1 ,
+but none of that behaviour is portable.
 .It Sy "sections out of conventional order"
 .Pq mdoc
 A standard section occurs after another section it usually precedes.
Index: read.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/read.c,v
retrieving revision 1.161
retrieving revision 1.162
diff -Lread.c -Lread.c -u -p -r1.161 -r1.162
--- read.c
+++ read.c
@@ -113,6 +113,7 @@ static	const char * const	mandocerrs[MAN
 	"bad NAME section content",
 	"missing comma before name",
 	"missing description line, using \"\"",
+	"description line outside NAME section",
 	"sections out of conventional order",
 	"duplicate section title",
 	"unexpected section",
Index: mandoc.h
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mandoc.h,v
retrieving revision 1.214
retrieving revision 1.215
diff -Lmandoc.h -Lmandoc.h -u -p -r1.214 -r1.215
--- mandoc.h
+++ mandoc.h
@@ -71,6 +71,7 @@ enum	mandocerr {
 	MANDOCERR_NAMESEC_BAD, /* bad NAME section content: macro */
 	MANDOCERR_NAMESEC_PUNCT, /* missing comma before name: Nm name */
 	MANDOCERR_ND_EMPTY, /* missing description line, using "" */
+	MANDOCERR_ND_LATE, /* description line outside NAME section */
 	MANDOCERR_SEC_ORDER, /* sections out of conventional order: Sh title */
 	MANDOCERR_SEC_REP, /* duplicate section title: Sh title */
 	MANDOCERR_SEC_MSEC, /* unexpected section: Sh title for ... only */
Index: mdoc_validate.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mdoc_validate.c,v
retrieving revision 1.318
retrieving revision 1.319
diff -Lmdoc_validate.c -Lmdoc_validate.c -u -p -r1.318 -r1.319
--- mdoc_validate.c
+++ mdoc_validate.c
@@ -1035,6 +1035,10 @@ post_nd(POST_ARGS)
 	if (n->type != ROFFT_BODY)
 		return;
 
+	if (n->sec != SEC_NAME)
+		mandoc_msg(MANDOCERR_ND_LATE, mdoc->parse,
+		    n->line, n->pos, "Nd");
+
 	if (n->child == NULL)
 		mandoc_msg(MANDOCERR_ND_EMPTY, mdoc->parse,
 		    n->line, n->pos, "Nd");
Index: broken.out_lint
===================================================================
RCS file: /home/cvs/mdocml/mdocml/regress/mdoc/Nd/broken.out_lint,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lregress/mdoc/Nd/broken.out_lint -Lregress/mdoc/Nd/broken.out_lint -u -p -r1.2 -r1.3
--- regress/mdoc/Nd/broken.out_lint
+++ regress/mdoc/Nd/broken.out_lint
@@ -3,5 +3,7 @@ mandoc: broken.in:5:2: WARNING: bad NAME
 mandoc: broken.in:9:1: WARNING: bad NAME section content: text
 mandoc: broken.in:4:2: WARNING: NAME section without Nm before Nd
 mandoc: broken.in:4:2: WARNING: NAME section without description
+mandoc: broken.in:16:2: WARNING: description line outside NAME section: Nd
 mandoc: broken.in:13:2: WARNING: moving content out of list: Bl
 mandoc: broken.in:18:1: WARNING: moving content out of list: text
+mandoc: broken.in:27:2: WARNING: description line outside NAME section: Nd
