From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from scc-mailout-kit-02.scc.kit.edu (scc-mailout-kit-02.scc.kit.edu [129.13.231.82]) by mandoc.bsd.lv (OpenSMTPD) with ESMTP id c8ae8017 for ; Sun, 1 Mar 2020 18:05:12 -0500 (EST) Received: from hekate.asta.kit.edu ([141.3.145.153] helo=hekate.usta.de) by scc-mailout-kit-02.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (envelope-from ) id 1j8Xe1-000716-Kn; Mon, 02 Mar 2020 00:05:11 +0100 Received: from donnerwolke.asta.kit.edu ([141.3.145.61] helo=donnerwolke.usta.de) by hekate.usta.de with esmtp (Exim 4.92.2) (envelope-from ) id 1j8Xdz-0001Fx-TU; Mon, 02 Mar 2020 00:05:07 +0100 Received: from athene.asta.kit.edu ([141.3.145.60] helo=athene.usta.de) by donnerwolke.usta.de with esmtp (Exim 4.84_2) (envelope-from ) id 1j8Xdz-0008L3-Pb; Mon, 02 Mar 2020 00:05:07 +0100 Received: from localhost (athene.usta.de [local]) by athene.usta.de (OpenSMTPD) with ESMTPA id 0522def3; Mon, 2 Mar 2020 00:05:07 +0100 (CET) Date: Mon, 2 Mar 2020 00:05:07 +0100 From: Ingo Schwarze To: Robert Mustacchi Cc: discuss@mandoc.bsd.lv Subject: Re: man_validate.c PP/LP macro confusion Message-ID: <20200301230507.GF66902@athene.usta.de> References: X-Mailinglist: mandoc-discuss Reply-To: discuss@mandoc.bsd.lv MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.12.2 (2019-09-21) Hi Robert, Robert Mustacchi wrote on Sat, Feb 29, 2020 at 10:57:18AM -0800: > While dealing with a number of old manual pages that haven't been > converted to mdoc yet, I noticed that when asking mandoc to lint them > (mandoc -Tline ), the error message it generates has confused a > few folks. The following is the warning: > > mandoc: abs.3c:10:2: WARNING: skipping paragraph macro: PP after SH > > The warning itself is clear; however, the source actually uses a .LP > macro and not a .PP macro. Most of the time folks know to map the two > together, Indeed, in particular since mandoc -T lint reports the line and column number and since the man(7) manual page explicitly calls .LP and .P synonyms for .PP. > but it would be a little clearer if it listed the actual macro > that was present in the document that was invalid. Granted. > I believe this happens because man_validate() in man_validate.c > explicitly converts the .LP macro internally to be treated as a .PP. > When it then triggers this check in post_SH(), it uses the modified name > to print out the warning. The same appears to be true check_par(). Correct. Note that while it isn't important at which time the rewriting gets done in the validator, it is important that it ultimately happens, for two reasons. First, the formatters shouldn't need to bother with obsolete macros. Currently, only one formatter tests for MAN_PP: man_html.c in man_PP_pre(). But even though that's the only place right now, i'd hate to have to test (n->tok == MAN_PP | n->tok == MAN_LP | n->tok == MAN_P) there. Besides, there is the possibility that someday i might write a generic man(7) output module, such that you could do automatic cleaning of the source code of a manual page by running it through "mandoc -T man" (right now, that merely copies the source code without cleaning it up). In that case, the replacement .LP -> .PP would become a real feature. > While it's possible to modify things such that the warning generates > the right token name, In general, i think that paying attention to the details of messages is useful. But in this case, my impression is it is hard to fix. During the recursive walking of the tree, parent nodes need to be validated after child nodes or features like removing parent nodes that become empty after their empty children are removed wouldn't work. So by the time .SH sees the .LP, it has already been translated, even if we would move the translation into check_par(). This is a general problem, not specific to the case of .LP, not specific to the process of macro translation, and not even specific to man(7). When you validate a tree, any modifications you apply necessarily influence later reporting. It is often hard enough to find an order of operations that yields the desired modifications of the tree, even without trying to tweak the order of operations to get ideal reporting. For example, after nodes are deleted, what adjacent nodes report about their neighbours may seem surprising to users, and many similar effects. > I wasn't sure if such a change would be desired or accepted given > the explicit consolidation and transformation for this being done > in man_validate(). I'd appreciate hearing what others thought. I considered how to do this and found no simpler way to achieve it than iterating the whole tree twice, first doing everything else and then finally translating the macro names. But that feels like a massive complication, quite excessive just for marginally improving a message. Even worse, it is not at clear to me that all kinds of reporting oddities could be fixed in similar ways. So what is special about this one, that would make it merit such a special effort? If you have a simple patch to improve it, i'm not necessarily opposed to that, even though i doubt that all such reporting oddities can be fixed. But i'd be surprised if a simple patch were possible in this case. Yours, Ingo -- To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv