discuss@mandoc.bsd.lv
* No-arg .Nm (also .Bx?) breaks .Bd -literal?
@ 2022-06-05 16:30 наб
       [not found] ` <Ypz026hcl/USRzsr@www.stare.cz>
  0 siblings, 1 reply; 4+ messages in thread
From: наб @ 2022-06-05 16:30 UTC
  To: discuss


[-- Attachment #1.1: Type: text/plain, Size: 634 bytes --]

Hi!

Attaching two offending manuals (the first breaks on the no-arg .Nm,
the second on .Bx) w/fragment outputs. The minimal example is
-- >8 --
.Sh EXAMPLES
.Bd -literal -compact -offset 4n
.Li $ Nm Ar form
.Ed
-- >8 --
which produces
-- >8 --
<section class="Sh">
<h1 class="Sh" id="EXAMPLES"><a class="permalink" href="#EXAMPLES">EXAMPLES</a></h1>
<div class="Bd Bd-indent Li">
<pre><code class="Li">$</code> <code class="Nm"></code></pre>
tr <var class="Ar">form</var></div>
</section>
-- >8 --

Same fresh CVS checkout as my last post;
1.14.5-1 off Debian makes an even worse hash of this.

Best,
наб

[-- Attachment #1.2: rep.2 --]
[-- Type: text/plain, Size: 885 bytes --]

.\" SPDX-License-Identifier: 0BSD
.\"
.Dd
.Dt TR 1
.Os
.
.Sh NAME
.Nm tr
.Nd transliterate bytes
.
.Sh EXAMPLES
Extract all words (maximal runs of letters) from
.Ar form :
.Bd -literal -compact -offset 4n
.Li $ Nm cat Ar form
Groceries for February:
    Bananas    3.5kg     $4.51
    Kiwis      2kg       $3.19     Call Siegfried to explain short!
    Bread                $20.21
.Li $ Nm Fl cs Li \&"[:alpha:]" \&"\en" < Ar form Li "    #" Only compatible with the Bx !
.Li $ Nm Fl cs Li \&"[:alpha:]" \&"[\en*]" < Ar form
Groceries
for
February
Bananas
kg
Kiwis
kg
Call
Siegfried
to
explain
short
Bread
.Ed
.
.Pp
Capitalise that same form:
.Bd -literal -compact -offset 4n
.Li $ Nm Li \&"[:lower:]" \&"[:upper:]" < Ar form
GROCERIES FOR FEBRUARY:
    BANANAS    3.5KG     $4.51
    KIWIS      2KG       $3.19     CALL SIEGFRIED TO EXPLAIN SHORT!
    BREAD                $20.21
.Ed

[-- Attachment #1.3: rep.3 --]
[-- Type: text/plain, Size: 320 bytes --]

.\" SPDX-License-Identifier: 0BSD
.\"
.Dd
.Dt TR 1
.Os
.
.Sh NAME
.Nm tr
.Nd transliterate bytes
.
.Sh EXAMPLES
.Bd -literal -compact -offset 4n
.Li $ Nm cat Ar form
.Li $ Nm tr Fl cs Li \&"[:alpha:]" \&"\en" < Ar form Li "    #" Only compatible with the Bx !
.Li $ Nm tr Fl cs Li \&"[:alpha:]" \&"[\en*]" < Ar form
.Ed

[-- Attachment #1.4: rep.2.html --]
[-- Type: text/html, Size: 1923 bytes --]

[-- Attachment #1.5: rep.3.html --]
[-- Type: text/html, Size: 1230 bytes --]


* Re: No-arg .Nm (also .Bx?) breaks .Bd -literal?
       [not found] ` <Ypz026hcl/USRzsr@www.stare.cz>
@ 2022-06-05 18:54   ` наб
  2022-06-06 10:44     ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: наб @ 2022-06-05 18:54 UTC
  To: Jan Stary; +Cc: discuss

[-- Attachment #1: Type: text/plain, Size: 2068 bytes --]

On Sun, Jun 05, 2022 at 08:24:27PM +0200, Jan Stary wrote:
> Seriously, where do these horrendous examples come from?
> Are these real manual pages?
What's it to ya?

> Please provide a _minimal_ complete example.
> This calls Nm a couple of times: which of these
> do you consider broken and why?
The ones without an argument? As opposed to the ones with one?
Is my Subject and 4+6-line reduced example /really/ too long?

> What does the
> "[:lower:]" and "[:upper:]" have to do with it?
Nothing, obviously. You can tell because I didn't mention them in either
the subject, the reiteration of the subject in the body, or the reduced
example.

> In short, don't make it artificially harder to help you.
lol. It's difficult to be more clear than "No-arg .Nm breaks .Bd
-literal", "the first breaks on the no-arg .Nm, the second on .Bx",
"{example that very obviously killed code and pre and output the Nm text
  after}". However:

This calls Nm exactly twice:
-- >8 --
$ ./mandoc -Thtml -Ofragment -mdoc
.Nm tr
.Bd -literal
.Nm
.Ed
<table class="head">
  <tr>
    <td class="head-ltitle">UNTITLED</td>
    <td class="head-vol">LOCAL</td>
    <td class="head-rtitle">UNTITLED</td>
  </tr>
</table>
<div class="manual-text"><code class="Nm">tr</code>
<div class="Bd Pp Li">
<pre><code class="Nm"></code></pre>
tr</div>
</div>
<table class="foot">
  <tr>
    <td class="foot-date"></td>
    <td class="foot-os"></td>
  </tr>
</table>
-- >8 --

And Bx exactly once:
-- >8 --
$ ./mandoc -Thtml -Ofragment -mdoc
.Bd -literal
.Bx
.Ed
<table class="head">
  <tr>
    <td class="head-ltitle">UNTITLED</td>
    <td class="head-vol">LOCAL</td>
    <td class="head-rtitle">UNTITLED</td>
  </tr>
</table>
<div class="manual-text">
<div class="Bd Pp Li">
<pre><span class="Ux"></span></pre>
BSD</div>
</div>
<table class="foot">
  <tr>
    <td class="foot-date"></td>
    <td class="foot-os"></td>
  </tr>
</table>
-- >8 --

Is this minimal enough for your taste?

наб

You forgot to CC the list.


* Re: No-arg .Nm (also .Bx?) breaks .Bd -literal?
  2022-06-05 18:54   ` наб
@ 2022-06-06 10:44     ` Ingo Schwarze
  2022-06-06 15:39       ` наб
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Schwarze @ 2022-06-06 10:44 UTC
  To: nabijaczleweli; +Cc: Jan Stary, discuss

Hi,

Nab wrote on Sun, Jun 05, 2022 at 08:54:21PM +0200:
> On Sun, Jun 05, 2022 at 08:24:27PM +0200, Jan Stary wrote:

>> Please provide a _minimal_ complete example.

Actually, both complete real-world manual pages demonstrating a mandoc
bug and minimal examples help.

If you send fuzzing results, for example from afl(1), then minimizing
them is very useful because it can be quite hard to isolate the source
of a problem from a full-blown fuzzer-generated input file.

If you send a bug report involving a real-world manual page, minimizing
is much less important because isolating the problem is usually not
too difficult and time-consuming when dealing with code written by a
human or even by an automatic mdoc(7) or man(7) code generator.
In this case, having the full page helps to judge the severity
of the impact in context.

>> In short, don't make it artificially harder to help you.

I don't consider Nab's reports as hard to understand, they seem
clear and to the point to me.

> This calls Nm exactly twice:
> -- >8 --
> $ ./mandoc -Thtml -Ofragment -mdoc
> .Nm tr
> .Bd -literal
> .Nm
> .Ed
[....]
> <div class="Bd Pp Li">
> <pre><code class="Nm"></code></pre>
> tr</div>

The .Nm macro is usually an in-line macro, but in the SYNOPSIS
section, it can be a block macro.  I suspect the mandoc bug might
be related to -T html mistreating .Nm as a block macro here,
but I'm not sure yet and didn't look at the details.

> $ ./mandoc -Thtml -Ofragment -mdoc
> .Bd -literal
> .Bx
> .Ed
[...]
> <div class="Bd Pp Li">
> <pre><span class="Ux"></span></pre>
> BSD</div>

That's even more surprising; .Bx is never a block macro, so maybe
there is another bug or the bug has a different reason after all.

For now, I added this TODO entry:

  --- HTML issues ----------------------------------------------

  - .Nm without an argument and .Bx cause premature </pre>
     Nab Sun, 5 Jun 2022 18:30:09 +0200

Thanks for the report,
  Ingo

* Re: No-arg .Nm (also .Bx?) breaks .Bd -literal?
  2022-06-06 10:44     ` Ingo Schwarze
@ 2022-06-06 15:39       ` наб
  0 siblings, 0 replies; 4+ messages in thread
From: наб @ 2022-06-06 15:39 UTC
  To: Ingo Schwarze; +Cc: Jan Stary, discuss

[-- Attachment #1: Type: text/plain, Size: 5781 bytes --]

Hi!

On Mon, Jun 06, 2022 at 12:44:44PM +0200, Ingo Schwarze wrote:
> That's even more surprising; .Bx is never a block macro, so maybe
> there is another bug or the bug has a different reason after all.

Encouraged by your simple diagnosis, I had a minor go:

Given:
-- >8 --
.Sh NAME
.Nm tr
.Nd trupsko
.Sh DESCRIPTION
.Bd -literal
.Nm
.Ed
.Sh DESCRIPTION2
.Bd -literal
.Nm tr2
.Ed
.Sh DESCRIPTION3
.Bd -literal
.Bx
.Ed
.Sh DESCRIPTION3
.Bd -literal
trupsko
.Ed
-- >8 --

-Ttree gives
-- >8 --
title = "UNTITLED"
name  = "tr"
vol   = "LOCAL"
os    = ""
date  = ""

Sh (block) *1:2
  Sh (head) 1:2 ID=HREF
      NAME (text) 1:5
  Sh (body) 1:2
      Nm (elem) *2:2
          tr (text) 2:5
      Nd (block) *3:2
        Nd (head) 3:2
        Nd (body) 3:2
            trupsko (text) 3:5
Sh (block) *4:2
  Sh (head) 4:2 ID=HREF
      DESCRIPTION (text) 4:5
  Sh (body) 4:2
      Bd (block) -literal *5:2
        Bd (head) 5:2
        Bd (body) 5:2
            Nm (elem) *6:2 NOFILL
                tr (text) 6:2 NOSRC
Sh (block) *8:2
  Sh (head) 8:2 ID=HREF
      DESCRIPTION2 (text) 8:5
  Sh (body) 8:2
      Bd (block) -literal *9:2
        Bd (head) 9:2
        Bd (body) 9:2
            Nm (elem) *10:2 NOFILL
                tr2 (text) 10:5 NOFILL
Sh (block) *12:2
  Sh (head) 12:2 ID=HREF
      DESCRIPTION3 (text) 12:5
  Sh (body) 12:2
      Bd (block) -literal *13:2
        Bd (head) 13:2
        Bd (body) 13:2
            Bx (elem) *14:2 NOFILL
                BSD (text) 14:2 NOSRC
Sh (block) *16:2
  Sh (head) 16:2 ID=HREF
      DESCRIPTION3 (text) 16:5
  Sh (body) 16:2
      Bd (block) -literal *17:2
        Bd (head) 17:2
        Bd (body) 17:2
            trupsko (text) *18:1 NOFILL
-- >8 --

But -Ttree -Onoval gives:
-- >8 --
Sh (block) *1:2
  Sh (head) 1:2
      NAME (text) 1:5
  Sh (body) 1:2
      Nm (elem) *2:2
          tr (text) 2:5
      Nd (block) *3:2
        Nd (head) 3:2
        Nd (body) 3:2
            trupsko (text) 3:5
Sh (block) *4:2
  Sh (head) 4:2
      DESCRIPTION (text) 4:5
  Sh (body) 4:2
      Bd (block) -literal *5:2
        Bd (head) 5:2
        Bd (body) 5:2
            Nm (elem) *6:2 NOFILL
Sh (block) *8:2
  Sh (head) 8:2
      DESCRIPTION2 (text) 8:5
  Sh (body) 8:2
      Bd (block) -literal *9:2
        Bd (head) 9:2
        Bd (body) 9:2
            Nm (elem) *10:2 NOFILL
                tr2 (text) 10:5 NOFILL
Sh (block) *12:2
  Sh (head) 12:2
      DESCRIPTION3 (text) 12:5
  Sh (body) 12:2
      Bd (block) -literal *13:2
        Bd (head) 13:2
        Bd (body) 13:2
            Bx (elem) *14:2 NOFILL
Sh (block) *16:2
  Sh (head) 16:2
      DESCRIPTION3 (text) 16:5
  Sh (body) 16:2
      Bd (block) -literal *17:2
        Bd (head) 17:2
        Bd (body) 17:2
            trupsko (text) *18:1 NOFILL
-- >8 --

For a diff of
-- >8 --
diff --git a/tr b/nova
index aa4b73c..b177b2e 100644
--- a/tr
+++ b/nova
@@ -1,5 +0,0 @@
-title = "UNTITLED"
-name  = "tr"
-vol   = "LOCAL"
-os    = ""
-date  = ""
@@ -8 +3 @@ Sh (block) *1:2
-  Sh (head) 1:2 ID=HREF
+  Sh (head) 1:2
@@ -18 +13 @@ Sh (block) *4:2
-  Sh (head) 4:2 ID=HREF
+  Sh (head) 4:2
@@ -25 +19,0 @@ Sh (block) *4:2
-                tr (text) 6:2 NOSRC
@@ -27 +21 @@ Sh (block) *8:2
-  Sh (head) 8:2 ID=HREF
+  Sh (head) 8:2
@@ -36 +30 @@ Sh (block) *12:2
-  Sh (head) 12:2 ID=HREF
+  Sh (head) 12:2
@@ -43 +36,0 @@ Sh (block) *12:2
-                BSD (text) 14:2 NOSRC
@@ -45 +38 @@ Sh (block) *16:2
-  Sh (head) 16:2 ID=HREF
+  Sh (head) 16:2
-- >8 --

Note how the text injected by the validator is NOSRC but lacks NOFILL, while
the "raw" text and the text in .Nm tr2 are both NOFILL.

I (naively) replaced all NODE_NOSRC with NODE_NOSRC | NODE_NOFILL in
mdoc_validate.c (none of them had it previously), to great success!
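
As a rough illustration of why a missing NOFILL on generated text could
produce the premature </pre>, here is a toy model in C. It is not
mandoc's code: the TOY_* flag values, struct toy_node and toy_render()
are all made up for this sketch, and it merely assumes that the
formatter re-derives the fill mode from each node's own flags and
closes every open element up to <pre> when it leaves no-fill mode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag bits; the values are NOT taken from mandoc's headers. */
#define TOY_NOFILL (1 << 0)	/* node sits in a literal (no-fill) context */
#define TOY_NOSRC  (1 << 1)	/* node was generated, not read from the source */

/* Hypothetical, heavily simplified syntax tree node. */
struct toy_node {
	const char *text;		/* text content, or NULL for an element */
	int flags;
	const struct toy_node *child;	/* at most one child in this model */
};

/* Append s to the NUL-terminated string in buf, which holds cap bytes. */
static void
put(char *buf, size_t cap, const char *s)
{
	size_t used = strlen(buf);

	assert(used + strlen(s) < cap);
	memcpy(buf + used, s, strlen(s) + 1);
}

/*
 * Render one inline element inside a literal display.
 * Before each node is printed, the fill mode is taken from that node's
 * own flags; leaving no-fill mode closes the open element and <pre>.
 */
void
toy_render(const struct toy_node *elem, char *buf, size_t cap)
{
	const struct toy_node *n;
	int in_pre = 0, elem_open = 0;

	buf[0] = '\0';
	put(buf, cap, "<div>");
	for (n = elem; n != NULL; n = n->child) {
		if ((n->flags & TOY_NOFILL) && !in_pre) {
			put(buf, cap, "<pre>");
			in_pre = 1;
		} else if (!(n->flags & TOY_NOFILL) && in_pre) {
			if (elem_open) {
				put(buf, cap, "</code>");
				elem_open = 0;
			}
			put(buf, cap, "</pre>");
			in_pre = 0;
		}
		if (n->text != NULL) {
			put(buf, cap, n->text);
		} else {
			put(buf, cap, "<code>");
			elem_open = 1;
		}
	}
	if (elem_open)
		put(buf, cap, "</code>");
	if (in_pre)
		put(buf, cap, "</pre>");
	put(buf, cap, "</div>");
}
```

Rendering an element flagged TOY_NOFILL whose generated text child
carries only TOY_NOSRC leaves the text outside <pre> in this model,
which has the same shape as the broken output above; giving the child
TOY_NOSRC | TOY_NOFILL keeps it inside.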

-Ttree now yielded:
-- >8 --
title = "UNTITLED"
name  = "tr"
vol   = "LOCAL"
os    = ""
date  = ""

Sh (block) *1:2
  Sh (head) 1:2 ID=HREF
      NAME (text) 1:5
  Sh (body) 1:2
      Nm (elem) *2:2
          tr (text) 2:5
      Nd (block) *3:2
        Nd (head) 3:2
        Nd (body) 3:2
            trupsko (text) 3:5
Sh (block) *4:2
  Sh (head) 4:2 ID=HREF
      DESCRIPTION (text) 4:5
  Sh (body) 4:2
      Bd (block) -literal *5:2
        Bd (head) 5:2
        Bd (body) 5:2
            Nm (elem) *6:2 NOFILL
                tr (text) 6:2 NOFILL NOSRC
Sh (block) *8:2
  Sh (head) 8:2 ID=HREF
      DESCRIPTION2 (text) 8:5
  Sh (body) 8:2
      Bd (block) -literal *9:2
        Bd (head) 9:2
        Bd (body) 9:2
            Nm (elem) *10:2 NOFILL
                tr2 (text) 10:5 NOFILL
Sh (block) *12:2
  Sh (head) 12:2 ID=HREF
      DESCRIPTION3 (text) 12:5
  Sh (body) 12:2
      Bd (block) -literal *13:2
        Bd (head) 13:2
        Bd (body) 13:2
            Bx (elem) *14:2 NOFILL
                BSD (text) 14:2 NOFILL NOSRC
Sh (block) *16:2
  Sh (head) 16:2 ID=HREF
      DESCRIPTION3 (text) 16:5
  Sh (body) 16:2
      Bd (block) -literal *17:2
        Bd (head) 17:2
        Bd (body) 17:2
            trupsko (text) *18:1 NOFILL
-- >8 --

And the .Bd -literals in -Thtml were all fixed, too!
-- >8 --
<div class="Bd Li">
<pre><code class="Nm">tr</code></pre>
</div>
<div class="Bd Li">
<pre><code class="Nm">tr2</code></pre>
</div>
<div class="Bd Li">
<pre><span class="Ux">BSD</span></pre>
</div>
<div class="Bd Li">
<pre>trupsko</pre>
</div>
-- >8 --

There are a lot of elements that generate NOSRC text and I don't know
which of them genuinely shouldn't have NOFILL (and there's some
conditional ones like you said), but as PoC I think this narrows the
problem down.
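
One possible narrowing of the blanket change, sketched in C under the
assumption that generated text should simply follow the fill mode of
the node it is attached to. The TOY_* flags, struct toy_node and
toy_generated_flags() are hypothetical names for this sketch, not
mandoc identifiers, and this is not claimed to be the right fix.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits; the values are NOT taken from mandoc's headers. */
#define TOY_NOFILL (1 << 0)	/* node sits in a literal (no-fill) context */
#define TOY_NOSRC  (1 << 1)	/* node was generated, not read from the source */

/* Hypothetical node carrying only the flags this sketch needs. */
struct toy_node {
	int flags;
};

/*
 * Flags for a node generated below parent: always TOY_NOSRC,
 * plus TOY_NOFILL only if the parent itself is in no-fill mode.
 */
int
toy_generated_flags(const struct toy_node *parent)
{
	assert(parent != NULL);
	return TOY_NOSRC | (parent->flags & TOY_NOFILL);
}
```

With this, text generated under a node inside .Bd -literal would get
both flags, while text generated in ordinary filled text would stay
NOSRC only.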

Best,
наб
