From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-02.scc.kit.edu (scc-mailout-kit-02.scc.kit.edu [129.13.231.82]);
    by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id 870b04d3;
    for ; Sat, 21 Mar 2015 15:01:15 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
    by scc-mailout-kit-02.scc.kit.edu with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256)
    (envelope-from )
    id 1YZPa2-0007NW-VZ
    for tech@mdocml.bsd.lv; Sat, 21 Mar 2015 21:01:12 +0100
Received: from donnerwolke.usta.de ([172.24.96.3])
    by hekate.usta.de with esmtp (Exim 4.77)
    (envelope-from )
    id 1YZPa2-00060V-QP
    for tech@mdocml.bsd.lv; Sat, 21 Mar 2015 21:01:10 +0100
Received: from athene.usta.de ([172.24.96.10])
    by donnerwolke.usta.de with esmtp (Exim 4.80)
    (envelope-from )
    id 1YZPa2-0003Bf-LM
    for tech@mdocml.bsd.lv; Sat, 21 Mar 2015 21:01:10 +0100
Received: from localhost (1031@localhost [local]);
    by localhost (OpenSMTPD) with ESMTPA id c15668ec;
    for ; Sat, 21 Mar 2015 21:01:10 +0100 (CET)
Date: Sat, 21 Mar 2015 21:01:10 +0100
From: Ingo Schwarze
To: tech@mdocml.bsd.lv
Subject: Re: manpages patch: escape apostrophe, grave, tilde
Message-ID: <20150321200110.GA32119@athene.usta.de>
References: <20198.1426930253@CATHET.us>
X-Mailinglist: mdocml-tech
Reply-To: tech@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20198.1426930253@CATHET.us>
User-Agent: Mutt/1.5.23 (2014-03-12)

Hi Anthony,

Anthony J. Bentley wrote on Sat, Mar 21, 2015 at 03:30:53AM -0600:

> Both groff and Heirloom troff transform the following characters in
> typeset output such as -Tpdf:
>
> transforms ` into U+2018 (left single quote)
> transforms ' into U+2019 (right single quote)
> transforms ~ into U+02DC (small tilde)
>
> (Plan 9 and mandoc only do the first two.)
>
> These, particularly the second one, are desirable when used in prose.
> But when the original ASCII character is meant (such as when listing
> command-line input, or describing how to escape characters), it must
> be escaped to provide correct output in all formats.

I hate this.
It is backwards, i consider it a bug in troff.
ASCII input should produce ASCII output.

If people want fancy special characters in the output, it is OK
that fancy character escape sequences are required in the input.
But it is not OK to require escape sequences in the input to get
ASCII output - except in cases where the language syntax requires
it, like for \e or for \(dq at the beginning of macro arguments.

As long as the problem is limited to PostScript and PDF output,
i'd rather ignore it than pester authors with such madness.

Yours,
  Ingo


> Index: apropos.1
> ===================================================================
> RCS file: /cvs/mdocml/apropos.1,v
> retrieving revision 1.37
> diff -u -p -u -p -r1.37 apropos.1
> --- apropos.1 16 Feb 2015 16:23:54 -0000 1.37
> +++ apropos.1 21 Mar 2015 09:11:56 -0000
> @@ -210,7 +210,7 @@ This has syntax
>  .Sm off
>  .Oo
>  .Op Ar key Op , Ar key ...
> -.Pq Cm = | ~
> +.Pq Cm = | \(ti
>  .Oc
>  .Ar val ,
>  .Sm on
> @@ -227,7 +227,7 @@ for a list of available keys.
>  Operator
>  .Cm =
>  evaluates a substring, while
> -.Cm ~
> +.Cm \(ti
>  evaluates a regular expression.
>  .It Fl i Ar term
>  If
> @@ -398,7 +398,7 @@ as well:
>  .Pp
>  Search in names and descriptions using a regular expression:
>  .Pp
> -.Dl $ apropos '~set.?[ug]id'
> +.Dl $ apropos \(aq\(tiset.?[ug]id\(aq
>  .Pp
>  Search for manuals in the library section mentioning both the
>  .Qq optind
> @@ -413,15 +413,15 @@ Do exactly the same as calling
>  with the argument
>  .Qq ssh :
>  .Pp
> -.Dl $ apropos \-\- \-i 'Nm~[[:<:]]ssh[[:>:]]'
> +.Dl $ apropos \-\- \-i \(aqNm\(ti[[:<:]]ssh[[:>:]]\(aq
>  .Pp
>  The following two invocations are equivalent:
>  .Pp
>  .D1 Li $ apropos -S Ar arch Li -s Ar section expression
>  .Bd -ragged -offset indent
>  .Li $ apropos \e( Ar expression Li \e)
> -.Li -a arch~^( Ns Ar arch Ns Li |any)$
> -.Li -a sec~^ Ns Ar section Ns Li $
> +.Li -a arch\(ti^( Ns Ar arch Ns Li |any)$
> +.Li -a sec\(ti^ Ns Ar section Ns Li $
>  .Ed
>  .Sh SEE ALSO
>  .Xr man 1 ,
> Index: eqn.7
> ===================================================================
> RCS file: /cvs/mdocml/eqn.7,v
> retrieving revision 1.34
> diff -u -p -u -p -r1.34 eqn.7
> --- eqn.7 9 Mar 2015 20:17:23 -0000 1.34
> +++ eqn.7 21 Mar 2015 09:11:56 -0000
> @@ -146,7 +146,7 @@ is used as the delimiter for the value
>  .Ar val .
>  This allows for arbitrary enclosure of terms (not just quotes), such as
>  .Pp
> -.D1 Cm define Ar foo 'bar baz'
> +.D1 Cm define Ar foo \(aqbar baz\(aq
>  .D1 Cm define Ar foo cbar bazc
>  .Pp
>  It is an error to have an empty
> @@ -166,8 +166,8 @@ created.
>  Definitions can create arbitrary strings, for example, the following is
>  a legal construction.
>  .Bd -literal -offset indent
> -define foo 'define'
> -foo bar 'baz'
> +define foo \(aqdefine\(aq
> +foo bar \(aqbaz\(aq
>  .Ed
>  .Pp
>  Self-referencing definitions will raise an error.
> Index: mandoc.1
> ===================================================================
> RCS file: /cvs/mdocml/mandoc.1,v
> retrieving revision 1.155
> diff -u -p -u -p -r1.155 mandoc.1
> --- mandoc.1 23 Feb 2015 13:31:03 -0000 1.155
> +++ mandoc.1 21 Mar 2015 09:11:56 -0000
> @@ -570,7 +570,7 @@ as the style-sheet:
>  .Pp
>  To check over a large set of manuals:
>  .Pp
> -.Dl $ mandoc \-Tlint `find /usr/src -name \e*\e.[1-9]`
> +.Dl $ mandoc \-Tlint \`find /usr/src -name \e*\e.[1-9]\`
>  .Pp
>  To produce a series of PostScript manuals for A4 paper:
>  .Pp
> Index: mandoc_char.7
> ===================================================================
> RCS file: /cvs/mdocml/mandoc_char.7,v
> retrieving revision 1.59
> diff -u -p -u -p -r1.59 mandoc_char.7
> --- mandoc_char.7 20 Jan 2015 19:39:34 -0000 1.59
> +++ mandoc_char.7 21 Mar 2015 09:11:56 -0000
> @@ -196,7 +196,7 @@ Spacing:
>  .Bl -column "Input" "Description" -offset indent -compact
>  .It Em Input Ta Em Description
>  .It Sq \e\ \& Ta unpaddable non-breaking space
> -.It \e~ Ta paddable non-breaking space
> +.It \e\(ti Ta paddable non-breaking space
>  .It \e0 Ta unpaddable, breaking digit-width space
>  .It \e| Ta one-sixth \e(em narrow space, zero width in nroff mode
>  .It \e^ Ta one-twelfth \e(em half-narrow space, zero width in nroff
> @@ -371,9 +371,9 @@ Mathematical:
>  .It \e(ne Ta \(ne Ta not equivalent
>  .It \e(ap Ta \(ap Ta tilde operator
>  .It \e(|= Ta \(|= Ta asymptotically equal
> -.It \e(=~ Ta \(=~ Ta approximately equal
> -.It \e(~~ Ta \(~~ Ta almost equal
> -.It \e(~= Ta \(~= Ta almost equal
> +.It \e(=\(ti Ta \(=~ Ta approximately equal
> +.It \e(\(ti\(ti Ta \(~~ Ta almost equal
> +.It \e(\(ti= Ta \(~= Ta almost equal
>  .It \e(pt Ta \(pt Ta proportionate
>  .It \e(es Ta \(es Ta empty set
>  .It \e(mo Ta \(mo Ta element
> @@ -436,15 +436,15 @@ Accents:
>  .It \e(a. Ta \(a. Ta dotted
>  .It \e(a^ Ta \(a^ Ta circumflex
>  .It \e(aa Ta \(aa Ta acute
> -.It \e' Ta \' Ta acute
> +.It \e\(aq Ta \' Ta acute
>  .It \e(ga Ta \(ga Ta grave
> -.It \e` Ta \` Ta grave
> +.It \e\` Ta \` Ta grave
>  .It \e(ab Ta \(ab Ta breve
>  .It \e(ac Ta \(ac Ta cedilla
>  .It \e(ad Ta \(ad Ta dieresis
>  .It \e(ah Ta \(ah Ta caron
>  .It \e(ao Ta \(ao Ta ring
> -.It \e(a~ Ta \(a~ Ta tilde
> +.It \e(a\(ti Ta \(a~ Ta tilde
>  .It \e(ho Ta \(ho Ta ogonek
>  .It \e(ha Ta \(ha Ta hat (text)
>  .It \e(ti Ta \(ti Ta tilde (text)
> @@ -453,32 +453,32 @@ Accents:
>  Accented letters:
>  .Bl -column "Input" "Rendered" "Description" -offset indent -compact
>  .It Em Input Ta Em Rendered Ta Em Description
> -.It \e('A Ta \('A Ta acute A
> -.It \e('E Ta \('E Ta acute E
> -.It \e('I Ta \('I Ta acute I
> -.It \e('O Ta \('O Ta acute O
> -.It \e('U Ta \('U Ta acute U
> -.It \e('a Ta \('a Ta acute a
> -.It \e('e Ta \('e Ta acute e
> -.It \e('i Ta \('i Ta acute i
> -.It \e('o Ta \('o Ta acute o
> -.It \e('u Ta \('u Ta acute u
> -.It \e(`A Ta \(`A Ta grave A
> -.It \e(`E Ta \(`E Ta grave E
> -.It \e(`I Ta \(`I Ta grave I
> -.It \e(`O Ta \(`O Ta grave O
> -.It \e(`U Ta \(`U Ta grave U
> -.It \e(`a Ta \(`a Ta grave a
> -.It \e(`e Ta \(`e Ta grave e
> -.It \e(`i Ta \(`i Ta grave i
> -.It \e(`o Ta \(`i Ta grave o
> -.It \e(`u Ta \(`u Ta grave u
> -.It \e(~A Ta \(~A Ta tilde A
> -.It \e(~N Ta \(~N Ta tilde N
> -.It \e(~O Ta \(~O Ta tilde O
> -.It \e(~a Ta \(~a Ta tilde a
> -.It \e(~n Ta \(~n Ta tilde n
> -.It \e(~o Ta \(~o Ta tilde o
> +.It \e(\(aqA Ta \('A Ta acute A
> +.It \e(\(aqE Ta \('E Ta acute E
> +.It \e(\(aqI Ta \('I Ta acute I
> +.It \e(\(aqO Ta \('O Ta acute O
> +.It \e(\(aqU Ta \('U Ta acute U
> +.It \e(\(aqa Ta \('a Ta acute a
> +.It \e(\(aqe Ta \('e Ta acute e
> +.It \e(\(aqi Ta \('i Ta acute i
> +.It \e(\(aqo Ta \('o Ta acute o
> +.It \e(\(aqu Ta \('u Ta acute u
> +.It \e(\`A Ta \(`A Ta grave A
> +.It \e(\`E Ta \(`E Ta grave E
> +.It \e(\`I Ta \(`I Ta grave I
> +.It \e(\`O Ta \(`O Ta grave O
> +.It \e(\`U Ta \(`U Ta grave U
> +.It \e(\`a Ta \(`a Ta grave a
> +.It \e(\`e Ta \(`e Ta grave e
> +.It \e(\`i Ta \(`i Ta grave i
> +.It \e(\`o Ta \(`i Ta grave o
> +.It \e(\`u Ta \(`u Ta grave u
> +.It \e(\(tiA Ta \(~A Ta tilde A
> +.It \e(\(tiN Ta \(~N Ta tilde N
> +.It \e(\(tiO Ta \(~O Ta tilde O
> +.It \e(\(tia Ta \(~a Ta tilde a
> +.It \e(\(tin Ta \(~n Ta tilde n
> +.It \e(\(tio Ta \(~o Ta tilde o
>  .It \e(:A Ta \(:A Ta dieresis A
>  .It \e(:E Ta \(:E Ta dieresis E
>  .It \e(:I Ta \(:I Ta dieresis I
> @@ -657,7 +657,7 @@ manual.
>  .Sh UNICODE CHARACTERS
>  The escape sequences
>  .Pp
> -.Dl \e[uXXXX] and \eC'uXXXX'
> +.Dl \e[uXXXX] and \eC\(aquXXXX\(aq
>  .Pp
>  are interpreted as Unicode codepoints.
>  The codepoint must be in the range above U+0080 and less than U+10FFFF.
> @@ -685,7 +685,7 @@ escape sequence, inserting the character
>  from the current character set into the output.
>  Of course, this is inherently non-portable and is already marked
>  as deprecated in the Heirloom roff manual.
> -For example, do not use \eN'34', use \e(dq, or even the plain
> +For example, do not use \eN\(aq34\(aq, use \e(dq, or even the plain
>  .Sq \(dq
>  character where possible.
>  .Sh COMPATIBILITY
> @@ -709,7 +709,7 @@ In
>  .Fl T Ns Cm html
>  and
>  .Fl T Ns Cm xhtml ,
> -the \e(~=, \e(nb, and \e(nc special characters render differently
> +the \e(\(ti=, \e(nb, and \e(nc special characters render differently
>  between mandoc and groff.
>  .It
>  The
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv
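
For readers following the quoted patch, the substitutions it makes by hand can be sketched as a small Python helper. This is only an illustration and is not part of the patch or of mandoc: the names ROFF_ESCAPES and escape_literal_ascii are hypothetical, and the mapping is simply the one visible in the diff (' becomes \(aq, ` becomes \`, ~ becomes \(ti).

```python
# Mapping copied from the substitutions visible in the quoted diff.
# The names below are hypothetical and exist only for this sketch.
ROFF_ESCAPES = {
    "'": r"\(aq",  # apostrophe
    "`": r"\`",    # grave accent / backtick
    "~": r"\(ti",  # tilde
}


def escape_literal_ascii(text: str) -> str:
    """Replace ', ` and ~ by the escape sequences the patch uses.

    Assumption: `text` is plain literal input (for example a shell
    command to be shown verbatim) and contains no roff escapes yet.
    """
    return "".join(ROFF_ESCAPES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(escape_literal_ascii("$ apropos '~set.?[ug]id'"))
```

Applied to the first apropos example from the diff, the helper yields the same text as the corresponding "+" line. It must not be run over text that already contains special character escapes such as \(~=, because it would rewrite the tilde inside the escape name.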