From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-01.scc.kit.edu (scc-mailout-kit-01.scc.kit.edu [129.13.231.81])
    by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id 8ec9a48d
    for ; Wed, 11 Jan 2017 13:53:15 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
    by scc-mailout-kit-01.scc.kit.edu with esmtps
    (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (envelope-from )
    id 1cRO1H-0004tX-W8; Wed, 11 Jan 2017 19:53:14 +0100
Received: from donnerwolke.usta.de ([172.24.96.3])
    by hekate.usta.de with esmtp (Exim 4.77) (envelope-from )
    id 1cRO1F-0004Yy-MP; Wed, 11 Jan 2017 19:53:09 +0100
Received: from athene.usta.de ([172.24.96.10])
    by donnerwolke.usta.de with esmtp (Exim 4.84_2) (envelope-from )
    id 1cRO1F-0004wH-G1; Wed, 11 Jan 2017 19:53:09 +0100
Received: from localhost (athene.usta.de [local])
    by athene.usta.de (OpenSMTPD) with ESMTPA id 3738b8e3;
    Wed, 11 Jan 2017 19:53:09 +0100 (CET)
Date: Wed, 11 Jan 2017 19:53:09 +0100
From: Ingo Schwarze
To: Abhinav Upadhyay
Cc: discuss@mdocml.bsd.lv
Subject: Re: Parsing of .Ex with -std argument
Message-ID: <20170111185309.GA40572@athene.usta.de>
References:
X-Mailinglist: mdocml-discuss
Reply-To: discuss@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.6.2 (2016-07-01)

Hi Abhinav,

sorry for not answering this one in a timely fashion.
I was occupied with other matters back then and forgot
to send a preliminary answer.

Abhinav Upadhyay wrote on Sun, Oct 02, 2016 at 11:13:00PM +0530:

> I'm having issues parsing man pages which use the .Ex macro in the
> EXIT STATUS section.
> For example mandoc -Ttree shows following output
> for a man page using it:
>
> Sh (block) *138:2
>   Sh (head) 138:2
>       EXIT STATUS (text) 138:5
>   Sh (body) 138:2
>       Ex (elem) -std *139:2
>           nbperf (text) 139:2
>
> This is causing makemandb(8) (NetBSD's indexing tool) parse the EXIT
> STATUS section just as "nbperf" instead of the complete expanded text.

This was a symptom of a more general problem.

The text production macros - .At, .Bsx, .Bx, .Ex, .Fx, .Lb, .Nx,
.Ox, .Rv, .St, .Ux and the obsolete .Bt and .Ud - handled their
task inconsistently.  Some produced text at the validation stage
and added it to the syntax tree, notably .Lb and .St.  Others did
little at the validation stage and left text production to the
formatters, notably .Ex and .Rv.

Both approaches had downsides.  Text production at the validation
stage obscured the content of the original source document, making
it impossible to recover the source from the finalized syntax tree.
Text production in the formatters implied code duplication and the
problem you describe for tools using the library.

I have now committed substantial changes cleaning this up.  They
introduce a flag NODE_NOPRT such that content that is required in
the source code but not intended to be printed can remain in the
tree, for example the argument of .St, solving the first problem.
They also introduce a flag NODE_NOSRC such that generated content
not contained in the source document can be marked, which makes it
possible to solve the second problem by doing text generation at
the validation stage for all macros, even for .Ex and .Rv.

This work required a number of commits; i'm appending the commit
message of the final one at the end.  The resulting changes will
be in mandoc-1.13.5 and in mandoc-1.14.x when released.

> I compared this with the previous version of mandoc. The -Ttree output
> then didn't use to show the Ex macro in the output and makemandb(8)
> also indexed the EXIT STATUS section as blank strings (I am noticing
> it just now).

Yes, i think this was always inconsistent in the way described above.

> I guess, I could fix this in makemandb(8) by adding a handler for .Ex
> macro but it would probably be better if mandoc(3) provided some way
> to handle it?

I agree with that suggestion, hence this series of commits.

Here is the result:

schwarze@isnote $ echo .Ex nbperf | mandoc -mdoc -Ttree
Ex (elem) *1:2
    The (text) 1:2 NOSRC
    Nm (elem) 1:5 NOSRC
        nbperf (text) 1:5
    utility exits\~0 (text) 1:2 NOSRC
    on success, and\~>0 if an error occurs. (text) 1:2. NOSRC

schwarze@isnote $ echo .Lb libc | ./mandoc -mdoc -Ttree
Lb (elem) *1:2
    Standard C Library (libc, \-lc) (text) 1:2 NOSRC
    libc (text) 1:5 NOPRT

schwarze@isnote $ echo .Nx 7.0.2 | mandoc -mdoc -Ttree
Nx (elem) *1:2
    NetBSD (text) 1:2 NOSRC
    7.0.2 (text) 1:5

schwarze@isnote $ echo .St -p1003.1-2008 | mandoc -mdoc -Ttree
St (elem) *1:2
    IEEE Std 1003.1-2008 (\(LqPOSIX.1\(Rq) (text) 1:5 NOSRC
    -p1003.1-2008 (text) 1:5 NOPRT

schwarze@isnote $ echo .Ar | mandoc -mdoc -Ttree
Ar (elem) *1:2
    file (text) 1:2 NOSRC
    ... (text) 1:2 NOSRC

Thanks for making me aware of the issue,
  Ingo


CVSROOT:        /cvs
Module name:    src
Changes by:     schwarze@cvs.openbsd.org        2017/01/11 10:39:45

Modified files:
        usr.bin/mandoc : mandocdb.c mdoc_html.c mdoc_man.c mdoc_term.c
                         mdoc_validate.c
        regress/usr.bin/mandoc/mdoc: Makefile
Added files:
        regress/usr.bin/mandoc/mdoc/Ud: Makefile arg.in arg.out_ascii
                                        arg.out_lint

Log message:
Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters.  Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St.
Results in a more rigorous syntax tree and in 135 lines less code.

This work was triggered by a question from Abhinav Upadhyay (NetBSD)
on discuss@.
--
 To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv
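
As an illustration of how the NOSRC and NOPRT markers in the -Ttree dumps above separate generated text from source text, here is a minimal sketch in Python. It only parses the textual dump quoted in this message; the regular expression and the names parse_tree, printed_text and source_text are assumptions made for this example and are not part of mandoc. A library consumer such as makemandb(8) would presumably test the NODE_NOSRC and NODE_NOPRT flags on the syntax tree nodes directly instead of parsing text.

```python
import re

# One node line of the `mandoc -Ttree` dumps quoted in this message, e.g.
#     "    libc (text) 1:5 NOPRT"
# The pattern is an assumption derived from those examples only.
NODE_RE = re.compile(
    r"^(?P<indent>\s*)(?P<name>.*) \((?P<type>\w+)\)"
    r"(?P<args>(?: -\S+)*) \*?(?P<line>\d+):(?P<col>\d+)\.?"
    r"(?P<flags>(?: [A-Z]+)*)\s*$"
)

# The .Ex dump from the message, copied verbatim.
EX_DUMP = r"""Ex (elem) *1:2
    The (text) 1:2 NOSRC
    Nm (elem) 1:5 NOSRC
        nbperf (text) 1:5
    utility exits\~0 (text) 1:2 NOSRC
    on success, and\~>0 if an error occurs. (text) 1:2. NOSRC
"""


def parse_tree(dump):
    """Turn a tree dump into a flat list of node dictionaries."""
    nodes = []
    for raw in dump.splitlines():
        m = NODE_RE.match(raw)
        if m is None:
            continue  # not a node line (for example a shell prompt)
        nodes.append({
            "depth": len(m.group("indent")),
            "name": m.group("name"),
            "type": m.group("type"),
            "flags": set(m.group("flags").split()),
        })
    return nodes


def printed_text(nodes):
    """Text a formatter would show: every text node not marked NOPRT."""
    return " ".join(n["name"] for n in nodes
                    if n["type"] == "text" and "NOPRT" not in n["flags"])


def source_text(nodes):
    """Text present in the source document: text nodes not marked NOSRC."""
    return " ".join(n["name"] for n in nodes
                    if n["type"] == "text" and "NOSRC" not in n["flags"])


if __name__ == "__main__":
    nodes = parse_tree(EX_DUMP)
    print(printed_text(nodes))
    print(source_text(nodes))
```

For the .Ex dump, printed_text yields the complete expanded sentence, which is what an indexing tool wants, while source_text yields only "nbperf", the argument actually written in the manual page source.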