discuss@mandoc.bsd.lv
* Handling .%P ranges with hyphen in mdoc(7); pp./p.
From: Fabio Scotoni @ 2019-04-23 16:22 UTC
  To: discuss

Summary:
a. .%P does not insert "pp." or "p." and this may need to be
   documented or changed;
b. A hyphen in .%P should be converted to an en dash in the output.



In mdoc(7), .%P specifies a "book or journal page of an Rs block".
It leaves open how exactly this is meant to be specified.

Regarding the question of whether inserting "pp."/"p." is the
responsibility of -mdoc or the user:

As an example, there seem to be five different variants of how .%P is
invoked in the OpenBSD tree:
- lib/libc/db/man/btree.3: .%P pp 121-138 ("pp" with no dot, range with
  hyphen)
- lib/libc/hash/rmd160.3: .%P pp. 24-28 ("pp." with dot, range with
  hyphen)
- lib/libc/stdlib/qsort.3: .%P pp. 114\-123, 145\-149 ("pp." with dot,
  range with explicit ASCII hyphen)
- share/man/man4/kate.4: .%P pp. 21--23 and pp. 179--184 ("pp." with
  dot, range with double hyphen)
- share/man/man7/eqn.7: .%P 151\(en157 (no "pp.", range with \(en)
These probably cover just about everything that could be found in the wild.
While most of the pages seem to specify "pp."/"p." themselves,
it may be worthwhile to insert "pp."/"p." if the first character of
the first argument is a number ("pp." if a hyphen is found, "p." otherwise).
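
A minimal sketch of that heuristic, written in Python for brevity rather
than in mandoc's C. The function name add_page_prefix is hypothetical and
not part of mandoc or groff, and treating \(en as a range marker in
addition to the hyphen is my own assumption:

```python
def add_page_prefix(arg: str) -> str:
    """Hypothetical helper: prepend "p." or "pp." to a .%P argument.

    Only arguments that start with a digit are touched, so arguments
    that already carry their own "p."/"pp." are returned unchanged.
    """
    if not arg[:1].isdigit():
        return arg
    # Assumption: any hyphen ("-", "\-", "--") or "\(en" marks a range.
    is_range = "-" in arg or r"\(en" in arg
    return ("pp. " if is_range else "p. ") + arg


for example in ("121-138", "42", "pp. 24-28", r"151\(en157"):
    print(repr(example), "->", repr(add_page_prefix(example)))
```

Note that a comma-separated list of single pages contains no hyphen and
would get "p." under this rule, which is one of the cases where the
heuristic guesses wrong.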

The .%P macro seems inspired by refer(1), as it mostly follows the
format of .[ .] references except for specifiers having to begin with a dot.

Checking with the GNU refer(1) man page, it says that "A range of pages
can be specified as m-n", using \- in the troff source of refer.1,
apparently without specifying "p." or "pp.".
The example for UNIX refer(1) %P in "Some Applications of Inverted
Indexes on the UNIX System" by M. E. Lesk in Vol. 2A of the Seventh
Edition UNIX Programmer's Manual also suggests using a hyphen for a
range of pages with no leading "p." or "pp.".
Seventh Edition refer(1) would indeed prepend "p." and "pp.",
as do GNU refer(1) and heirloom-doctools refer(1).

Neither groff -mdoc nor mandoc do this, however.
The cause is probably tradition in 4.4BSD, where .%P arguments
consistently specified "p."/"pp.".



Regarding the question of what to do with the hyphen:

It seems that common typographical wisdom suggests that ranges of
numbers are specified with an en dash (see the section on hyphens and
dashes in Matthew Butterick's Practical Typography).
However, neither GNU refer nor heirloom-doctools refer does this.
Some man pages do this manually (e.g. eqn.7 in the mandoc distribution).

To improve the quality of output, mandoc should convert hyphens in .%P
to en dashes in HTML, PostScript and PDF output, possibly also in UTF-8
output.
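
As a rough illustration of the conversion, again in Python and not taken
from mandoc: the name hyphens_to_en_dash is hypothetical, and restricting
the replacement to a single plain hyphen between two digits is an
assumption made here to avoid touching anything that is not a page range.

```python
import re

EN_DASH = "\u2013"

# Assumption: only a single plain "-" directly between two digits counts.
_DIGIT_HYPHEN_DIGIT = re.compile(r"(?<=\d)-(?=\d)")


def hyphens_to_en_dash(arg: str) -> str:
    """Hypothetical helper: turn digit-hyphen-digit into an en dash."""
    return _DIGIT_HYPHEN_DIGIT.sub(EN_DASH, arg)


for example in ("pp. 24-28", "pp. 114-123, 145-149", "pp. 21--23", "p. 42"):
    print(repr(example), "->", repr(hyphens_to_en_dash(example)))
```

With this pattern, the "--" and "\-" spellings are left untouched, so
they would still have to be fixed in the manual page source.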



The benefit of changing the behavior of .%P would be fairly minimal
(typographically correct dashes), but it may be worth considering
nonetheless.

Fabio
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


* Re: Handling .%P ranges with hyphen in mdoc(7); pp./p.
From: Ingo Schwarze @ 2019-04-23 19:12 UTC
  To: Fabio Scotoni; +Cc: discuss

Hi Fabio,

Fabio Scotoni wrote on Tue, Apr 23, 2019 at 06:22:17PM +0200:

> In mdoc(7), .%P specifies a "book or journal page of an Rs block".
> It leaves open how exactly this is meant to be specified.
> 
> Regarding the question of whether inserting "pp."/"p." is the
> responsibility of -mdoc or the user:
> 
> As an example, there seem to be five different variants of how .%P is
> invoked in the OpenBSD tree:
> - lib/libc/db/man/btree.3: .%P pp 121-138 ("pp" with no dot, range with
>   hyphen)

I consider the lack of the dot an error and fixed it in the OpenBSD tree.

> - lib/libc/hash/rmd160.3: .%P pp. 24-28 ("pp." with dot, range with
>   hyphen)

Clearly, "\(en" is better than a hyphen for numeric ranges.
Not sure i want to fix that right now, though; it is tedious work.

> - lib/libc/stdlib/qsort.3: .%P pp. 114\-123, 145\-149 ("pp." with dot,
>   range with explicit ASCII hyphen)

That's not an "explicit ASCII hyphen".  Quite to the contrary, it's
a minus sign, which is clearly wrong (though understandable in very
old manual pages because \(en appeared later in roff history than \-).
Again, tedious work to fix that.

> - share/man/man4/kate.4: .%P pp. 21--23 and pp. 179--184 ("pp." with
>   dot, range with double hyphen)

That's an outright markup error, and i fixed it in the OpenBSD tree.

> - share/man/man7/eqn.7: .%P 151\(en157 (no "pp.", range with \(en)

I fixed the omission of "pp." both in OpenBSD and bsd.lv.

> These probably cover just about everything that could be found in the wild.
> While most of the pages seem to specify "pp."/"p." themselves,
> it may be worthwhile to insert "pp."/"p." if the first character of
> the first argument is a number ("pp." if a hyphen is found, "p." otherwise).
> 
> The .%P macro seems inspired by refer(1), as it mostly follows the
> format of .[ .] references except for specifiers having to begin with a dot.
> 
> Checking with the GNU refer(1) man page, it says that "A range of pages
> can be specified as m-n", using \- in the troff source of refer.1,
> apparently without specifying "p." or "pp.".
> The example for UNIX refer(1) %P in "Some Applications of Inverted
> Indexes on the UNIX System" by M. E. Lesk in Vol. 2A of the Seventh
> Edition UNIX Programmer's Manual also suggests using a hyphen for a
> range of pages with no leading "p." or "pp.".
> Seventh Edition refer(1) would indeed prepend "p." and "pp.",
> as do GNU refer(1) and heirloom-doctools refer(1).

I consider refer(1) of little relevance because manual pages
are never generated with refer.  Even outside the domain of manual
pages, refer(1) is no longer all that widely used.

> Neither groff -mdoc nor mandoc do this, however.

Groff behaviour is very relevant; it is the quasi-standard for
the mdoc(7) language unless we explicitly decide otherwise.

> The cause is probably tradition in 4.4BSD, where .%P arguments
> consistently specified "p."/"pp.".

Sure, it looks like Cynthia decided that, and i see little reason
to change it now.  The .%P macro is relatively rarely used, so
changing its definition would not have large impact, yet it would
cause work.  I think sticking to Cynthia's convention is good
enough.

So i documented the convention in the mdoc(7) manual page,
see the commit below.

> Regarding the question of what to do with the hyphen:
> 
> It seems that common typographical wisdom suggests that ranges of
> numbers are specified with an en dash (see the section on hyphens and
> dashes in Matthew Butterick's Practical Typography).

Yes, see also mandoc_char(7), subsection "Dashes and Hyphens".

> However, neither GNU refer nor heirloom-doctools refer does this.
> Some man pages do this manually (e.g. eqn.7 in the mandoc distribution).
> 
> To improve the quality of output, mandoc should convert hyphens in .%P
> to en dashes in HTML, PostScript and PDF output, possibly also in UTF-8
> output.

I'd call that "too much magic" for two reasons.

Most importantly, .%P is not the only context, and probably not
even the most common context, asking for this.  Authors are already
invited to use \(en for numerical ranges in general, in manual pages
just like in any other roff(7) document.

So it would add code for something that is rare in the first
place (the .%P macro).  The second reason is one you already
provided yourself:

> The benefit of changing the behavior of .%P would be fairly minimal
> (typographically correct dashes), but it may be worth considering
> nonetheless.

And even that only for some cases where the author was negligent,
and even in those cases, not consistently...

Yours,
  Ingo


Log Message:
-----------
clarify how .%P is conventionally used;
triggered by a question from Fabio Scotoni <fabio at esse dot ch>

Modified Files:
--------------
    mandoc:
        mdoc.7

Revision Data
-------------
Index: mdoc.7
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mdoc.7,v
retrieving revision 1.276
retrieving revision 1.277
diff -Lmdoc.7 -Lmdoc.7 -u -p -r1.276 -r1.277
--- mdoc.7
+++ mdoc.7
@@ -596,6 +596,13 @@ block.
 Book or journal page number of an
 .Ic \&Rs
 block.
+Conventionally, the argument starts with
+.Ql p.\&
+for a single page or
+.Ql pp.\&
+for a range of pages, for example:
+.Pp
+.Dl .%P pp. 42\e(en47
 .It Ic \&%Q Ar name
 Institutional author (school, government, etc.) of an
 .Ic \&Rs
@@ -2969,7 +2976,7 @@ exclamation mark
 Note that even a period preceded by a backslash
 .Pq Sq \e.\&
 gets this special handling; use
-.Sq \e&.
+.Sq \e&.\&
 to prevent that.
 .Pp
 Many in-line macros interrupt their scope when they encounter