From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout.scc.kit.edu (scc-mailout.scc.kit.edu [129.13.185.202])
    by krisdoz.my.domain (8.14.3/8.14.3) with ESMTP id p6VMCBJm031205
    for ; Sun, 31 Jul 2011 18:12:13 -0400 (EDT)
Received: from hekate.usta.de (asta-nat.asta.uni-karlsruhe.de [172.22.63.82])
    by scc-mailout-02.scc.kit.edu with esmtp (Exim 4.72 #1)
    id 1QneEy-0002dK-TM; Mon, 01 Aug 2011 00:12:08 +0200
Received: from donnerwolke.usta.de ([172.24.96.3])
    by hekate.usta.de with esmtp (Exim 4.72) (envelope-from )
    id 1QneEy-0002yO-QU; Mon, 01 Aug 2011 00:12:08 +0200
Received: from iris.usta.de ([172.24.96.5] helo=usta.de)
    by donnerwolke.usta.de with esmtp (Exim 4.69) (envelope-from )
    id 1QneEy-0000jP-OJ; Mon, 01 Aug 2011 00:12:08 +0200
Received: from schwarze by usta.de with local (Exim 4.72) (envelope-from )
    id 1QneEy-0000L6-GY; Mon, 01 Aug 2011 00:12:08 +0200
Date: Mon, 1 Aug 2011 00:12:08 +0200
From: Ingo Schwarze
To: tech@mdocml.bsd.lv
Cc: jmc@openbsd.org
Subject: document delimiter handling in mdoc(7)
Message-ID: <20110731221208.GJ1831@iris.usta.de>
X-Mailinglist: mdocml-tech
Reply-To: tech@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)

Hi,

some time ago, i did not merge Kristaps' change of "reserved characters"
to "reserved terms" to mdoc.7 in OpenBSD because "reserved characters"
is already too generic as a term in this context, and "reserved terms"
is even worse.

A better term is "delimiters".
It is already used a lot in the code.

Right before release is a good time to polish manuals, so i had
another look - and noticed more issues:

The text explains how to escape these characters "for general use",
but fails to explain what they are escaped from, i.e. in which way
they receive special handling in the first place.

The whole subsection is located below "LANGUAGE SYNTAX", but it's
only concerned with macro arguments.
So it should be moved after explaining macro syntax, and into the
"MACRO SYNTAX" section.  As a nice side effect, this very complicated
subject is no longer one of the first subjects covered in the manual.

When reading the patch, you will notice that it is not yet perfect.
It explains the special handling of delimiters, but is very imprecise
as to which macros exactly all this applies to.  There are two reasons
why i didn't try to address that yet:  The patch is already big, and
figuring out the exact lists of macros showing the various behaviours
is a lot of work - yes, there are many exceptions.

Note that i removed the lie that delimiters terminate any macros.
In block partial-implicit, only *trailing* *closing* delimiters
terminate the macro, so that's right before the macro scope ends
anyway, and that's already covered by the new "Delimiters" subsection.
For in-line macros, the situation is quite complicated:  Some are
terminated by trailing closing delimiters just like block
partial-implicit macros, some are only *interrupted* by delimiters
and resume afterwards, and in_line_eoln macros don't do any special
delimiter handling at all.

The following patch is against OpenBSD; in case you like it,
i will of course resolve the (physical and logical) conflicts
and commit to bsd.lv as well.

Yours,
  Ingo


Index: mdoc.7
===================================================================
RCS file: /cvs/src/share/man/man7/mdoc.7,v
retrieving revision 1.75
diff -u -r1.75 mdoc.7
--- mdoc.7 31 Jul 2011 17:12:29 -0000 1.75
+++ mdoc.7 31 Jul 2011 22:06:02 -0000
@@ -65,43 +65,6 @@
 is also ignored.
 Macro lines with only a control character and optional whitespace are
 stripped from input.
-.Ss Reserved Characters
-Within a macro line, the following characters are reserved:
-.Pp
-.Bl -tag -width Ds -offset indent -compact
-.It \&.
-.Pq period
-.It \e.
-.Pq escaped period
-.It \&,
-.Pq comma
-.It \&:
-.Pq colon
-.It \&;
-.Pq semicolon
-.It \&(
-.Pq left-parenthesis
-.It \&)
-.Pq right-parenthesis
-.It \&[
-.Pq left-bracket
-.It \&]
-.Pq right-bracket
-.It \&?
-.Pq question
-.It \&!
-.Pq exclamation
-.It \&|
-.Pq vertical bar
-.It \e*(Ba
-.Pq reserved-word vertical bar
-.El
-.Pp
-For general use in macro lines, these characters can either be escaped
-with a non-breaking space
-.Pq Sq \e&
-or, if applicable, an appropriate escape sequence can be used.
-In text lines, these may be used as normal punctuation.
 .Ss Special Characters
 Special characters may occur in both macro and text lines.
 Sequences begin with the escape character
@@ -729,9 +692,8 @@
 .It Sx \&Xo Ta Yes Ta Yes Ta closed by Sx \&Xc
 .El
 .Ss Block partial-implicit
-Like block full-implicit, but with single-line scope closed by
-.Sx Reserved Characters
-or end of line.
+Like block full-implicit, but with single-line scope closed by the
+end of the line.
 .Bd -literal -offset indent
 \&.Yo \(lB\-arg \(lBval...\(rB\(rB \(lBbody...\(rB \(lBres...\(rB
 .Ed
@@ -777,9 +739,8 @@
 .It Sx \&Ta Ta Yes Ta Yes Ta closed by Sx \&Ta , Sx \&It
 .El
 .Ss In-line
-Closed by
-.Sx Reserved Characters ,
-end of line, fixed argument lengths, and/or subsequent macros.
+Closed by the end of the line, fixed argument lengths,
+and/or subsequent macros.
 In-line macros have only text children.
 If a number (or inequality) of arguments is
 .Pq n ,
@@ -869,6 +830,90 @@
 .It Sx \&br Ta \&No Ta \&No Ta 0
 .It Sx \&sp Ta \&No Ta \&No Ta 1
 .El
+.Ss Delimiters
+When a macro argument consists of one single input character
+considered as a delimiter, the argument gets special handling.
+This does not apply when delimiters appear in arguments containing
+more than one character.
+Consequently, to prevent special handling and just handle it
+like any other argument, a delimiter can be escaped by prepending
+a non-breaking space
+.Pq Sq \e& .
+In text lines, delimiters never need escaping, but may be used
+as normal punctuation.
+.Pp
+For many macros, when the leading arguments are opening delimiters,
+these delimiters are put before the macro scope,
+and when the trailing arguments are closing delimiters,
+these delimiters are put after the macro scope.
+For example,
+.Pp
+.D1 Pf \. \&Aq "( [ word ] ) ."
+.Pp
+renders as:
+.Pp
+.D1 Aq ( [ word ] ) .
+.Pp
+Opening delimiters are:
+.Pp
+.Bl -tag -width Ds -offset indent -compact
+.It \&(
+.Pq left-parenthesis
+.It \&[
+.Pq left-bracket
+.El
+.Pp
+Closing delimiters are:
+.Pp
+.Bl -tag -width Ds -offset indent -compact
+.It \&.
+.Pq period
+.It \&,
+.Pq comma
+.It \&:
+.Pq colon
+.It \&;
+.Pq semicolon
+.It \&)
+.Pq right-parenthesis
+.It \&]
+.Pq right-bracket
+.It \&?
+.Pq question
+.It \&!
+.Pq exclamation
+.El
+.Pp
+Note that even a period preceded by a backslash
+.Pq Sq \e.\&
+gets this special handling; use
+.Sq \e&.
+to prevent that.
+.Pp
+Many in-line macros interrupt their scope when they encounter
+delimiters, and resume their scope when more arguments follow that
+are not delimiters.
+For example,
+.Pp
+.D1 Pf \. \&Fl "a ( b | c \e*(Ba d ) e"
+.Pp
+renders as:
+.Pp
+.D1 Fl a ( b | c \*(Ba d ) e
+.Pp
+This applies to both opening and closing delimiters,
+and also to the middle delimiter:
+.Pp
+.Bl -tag -width Ds -offset indent -compact
+.It \&|
+.Pq vertical bar
+.El
+.Pp
+As a special case, the predefined string \e*(Ba is handled and rendered
+in the same way as a plain
+.Sq \&|
+character.
+Using this predefined string is not recommended in new manuals.
 .Sh REFERENCE
 This section is a canonical reference of all macros, arranged
 alphabetically.
@@ -2883,7 +2928,7 @@
 lists will restart the sequence only for the sub-list.
 .It
 .Sx \&Li
-followed by a reserved character is incorrectly used in some manuals
+followed by a delimiter is incorrectly used in some manuals
 instead of properly quoting that character, which sometimes works with
 historic groff.
 .It
--
To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv
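
As a rough illustration of the leading/trailing rule in the new
"Delimiters" subsection of the patch, the following Python sketch splits
a macro's argument list into leading opening delimiters, the remaining
arguments, and trailing closing delimiters.  It is not taken from the
mandoc code; the name split_delimiters and the two set names are
hypothetical, and the sketch ignores both the per-macro exceptions
mentioned in the mail and the interruption behaviour of in-line macros.

```python
# Delimiter sets as listed in the patch text.  The set names are
# assumptions made for this sketch, not identifiers from mandoc.
OPENING = {"(", "["}
# The patch notes that a period preceded by a backslash is also
# treated as a delimiter, so "\." is included here.
CLOSING = {".", ",", ":", ";", ")", "]", "?", "!", "\\."}


def split_delimiters(args):
    """Split macro arguments into (leading, body, trailing).

    Only an argument that is exactly one delimiter counts; "(word)"
    or an escaped "\\&." is treated like any other argument.
    """
    start = 0
    while start < len(args) and args[start] in OPENING:
        start += 1
    end = len(args)
    while end > start and args[end - 1] in CLOSING:
        end -= 1
    return args[:start], args[start:end], args[end:]


if __name__ == "__main__":
    # Arguments of the example macro line:  .Aq ( [ word ] ) .
    leading, body, trailing = split_delimiters("( [ word ] ) .".split())
    print(leading, body, trailing)
```

For the ".Aq ( [ word ] ) ." example from the patch, the two opening
delimiters land before the macro scope, "word" stays inside it, and the
three closing delimiters land after it.  An argument such as "\&." is
left in the body because it is longer than one character.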