discuss@mandoc.bsd.lv
 help / color / mirror / Atom feed
From: Ingo Schwarze <schwarze@usta.de>
To: Robert Mustacchi <rm@fingolfin.org>
Cc: discuss@mandoc.bsd.lv
Subject: Re: man_validate.c PP/LP macro confusion
Date: Mon, 2 Mar 2020 00:05:07 +0100	[thread overview]
Message-ID: <20200301230507.GF66902@athene.usta.de> (raw)
In-Reply-To: <faabab23-3397-d1d3-0834-4a408c85737c@fingolfin.org>

Hi Robert,

Robert Mustacchi wrote on Sat, Feb 29, 2020 at 10:57:18AM -0800:

> While dealing with a number of old manual pages that haven't been
> converted to mdoc yet, I noticed that when asking mandoc to lint them
> (mandoc -Tline <file>), the error message it generates has confused a
> few folks. The following is the warning:
> 
> mandoc: abs.3c:10:2: WARNING: skipping paragraph macro: PP after SH
> 
> The warning itself is clear; however, the source actually uses a .LP
> macro and not a .PP macro. Most of the time folks know to map the two
> together,

Indeed, in particular since mandoc -T lint reports the line and column
number and since the man(7) manual page explicitly calls .LP and .P
synonyms for .PP.

> but it would be a little clearer if it listed the actual macro
> that was present in the document that was invalid.

Granted.

> I believe this happens because man_validate() in man_validate.c
> explicitly converts the .LP macro internally to be treated as a .PP.
> When it then triggers this check in post_SH(), it uses the modified name
> to print out the warning. The same appears to be true check_par().

Correct.

Note that while it isn't important at which time the rewriting gets
done in the validator, it is important that it ultimately happens,
for two reasons.  First, the formatters shouldn't need to bother
with obsolete macros.  Currently, only one formatter tests for
MAN_PP: man_html.c in man_PP_pre().  But even though that's the
only place right now, i'd hate to have to test (n->tok == MAN_PP |
n->tok == MAN_LP | n->tok == MAN_P) there.
Besides, there is the possibility that someday i might write a
generic man(7) output module, such that you could do automatic
cleaning of the source code of a manual page by running it through
"mandoc -T man" (right now, that merely copies the source code
without cleaning it up).  In that case, the replacement .LP -> .PP
would become a real feature.

> While it's possible to modify things such that the warning generates
> the right token name,

In general, i think that paying attention to the details of messages
is useful.  But in this case, my impression is it is hard to fix.
During the recursive walking of the tree, parent nodes need to be
validated after child nodes or features like removing parent nodes
that become empty after their empty children are removed wouldn't
work.  So by the time .SH sees the .LP, it has already been
translated, even if we would move the translation into check_par().

This is a general problem, not specific to the case of .LP, not
specific to the process of macro translation, and not even specific
to man(7).  When you validate a tree, any modifications you apply
necessarily influence later reporting.  It is often hard enough to
find an order of operations that yields the desired modifications
of the tree, even without trying to tweak the order of operations
to get ideal reporting.  For example, after nodes are deleted, what
adjacent nodes report about their neighbours may seem surprising
to users, and many similar effects.

> I wasn't sure if such a change would be desired or accepted given
> the explicit consolidation and transformation for this being done
> in man_validate(). I'd appreciate hearing what others thought.

I considered how to do this and found no simpler way to achieve it
than iterating the whole tree twice, first doing everything else
and then finally translating the macro names.  But that feels like
a massive complication, quite excessive just for marginally
improving a message.

Even worse, it is not at clear to me that all kinds of reporting
oddities could be fixed in similar ways.  So what is special about
this one, that would make it merit such a special effort?

If you have a simple patch to improve it, i'm not necessarily opposed
to that, even though i doubt that all such reporting oddities can
be fixed.  But i'd be surprised if a simple patch were possible in
this case.

Yours,
  Ingo
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv

      reply	other threads:[~2020-03-01 23:05 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-29 18:57 Robert Mustacchi
2020-03-01 23:05 ` Ingo Schwarze [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200301230507.GF66902@athene.usta.de \
    --to=schwarze@usta.de \
    --cc=discuss@mandoc.bsd.lv \
    --cc=rm@fingolfin.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).