discuss@mandoc.bsd.lv
* mandoc HTML output: minor issues
From: Jackson Pauls @ 2017-08-25  9:52 UTC
  To: discuss

Hi,

I use mandoc 1.14.3 to generate HTML from mdoc sources. I've noticed a
few issues with the HTML output, in no particular order:


1. Duplicate IDs (invalid HTML). E.g. dash.1 gets two elements with
id="HISTSIZE":
...
1138       <dt class="It-tag" style="margin-left: -12.60ex;"><a
class="selflink" href="#HISTSIZE"><code class="Ev" title="Ev"
id="HISTSIZE">HISTSIZE</code></a></dt>
...
1921   <dt class="It-tag" style="margin-left: -13.80ex;"><a
class="selflink" href="#HISTSIZE"><code class="Ev" title="Ev"
id="HISTSIZE">HISTSIZE</code></a></dt>
...
(1138 and 1921 are the line numbers as per my version.)
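
A check for this can be scripted outside mandoc. The following is a
minimal sketch in Python, not anything mandoc itself provides; the names
IdCollector and duplicate_ids are made up for illustration, and the
sample markup is trimmed down from the dash.1 output above:

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Count every id attribute value seen in the document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids[value] += 1


def duplicate_ids(html_text):
    """Return the sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html_text)
    collector.close()
    return sorted(i for i, n in collector.ids.items() if n > 1)


# Trimmed-down version of the two <dt> elements quoted above.
sample = (
    '<dt><a href="#HISTSIZE"><code id="HISTSIZE">HISTSIZE</code></a></dt>'
    '<dt><a href="#HISTSIZE"><code id="HISTSIZE">HISTSIZE</code></a></dt>'
)
print(duplicate_ids(sample))  # ['HISTSIZE']
```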


2. Broken in-page links. E.g. in dash.1, there is an <h2> with
id="White_Space_Splitting_(Field_Splitting)", but the link to that
heading is missing the bit in parentheses
(href="#White_Space_Splitting"):
...
 867 <h2 class="Ss" title="Ss"
id="White_Space_Splitting_(Field_Splitting)"><a class="selflink"
href="#White_Space_Splitting_(Field_Splitting)">White
 868   Space Splitting (Field Splitting)</a></h2>
...
1915       See the <a class="Sx" title="Sx"
href="#White_Space_Splitting">White Space
1916       Splitting</a> section for more details.</dd>
...

Perhaps this is an issue with the mdoc source, but maybe links
shouldn't be created in this case:
...
1014 .Ss White Space Splitting (Field Splitting)
...
2300 .Sx White Space Splitting
...
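
Links like this can be found mechanically by comparing every
href="#..." against the set of ids on the page. A rough sketch in
Python, assuming the fragment is compared to the id after
percent-decoding; LinkChecker and dangling_fragments are hypothetical
names, not part of mandoc:

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class LinkChecker(HTMLParser):
    """Record all ids and all same-page link targets."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.fragments = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        href = attrs.get("href") or ""
        # Only links of the form "#something" point into this page.
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.fragments.append(unquote(href[1:]))


def dangling_fragments(html_text):
    """Return sorted fragment targets that match no id on the page."""
    checker = LinkChecker()
    checker.feed(html_text)
    checker.close()
    return sorted({f for f in checker.fragments if f not in checker.ids})


# Trimmed-down version of the heading and the .Sx link quoted above.
sample = (
    '<h2 id="White_Space_Splitting_(Field_Splitting)">'
    '<a href="#White_Space_Splitting_(Field_Splitting)">'
    "White Space Splitting (Field Splitting)</a></h2>"
    '<a href="#White_Space_Splitting">White Space Splitting</a>'
)
print(dangling_fragments(sample))  # ['White_Space_Splitting']
```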


3. In babel.1, for the -: flag, the colon appears outside the .Fl element:
...
 63       <b class="Fl" title="Fl">-</b>:
...

I expected: <b class="Fl" title="Fl">-:</b>

mdoc source line:
...
 21 .Op Fl i Ar input-type | Fl : Qo Ar SMILES-string Qc
...


4. Running v.Nu (https://validator.github.io/validator/) on a
collection of HTML files generated by mandoc picks up duplicate IDs,
and a bunch of other issues:
* unescaped characters in href attributes (%, "),
* unescaped characters in URL fragments (\, {, }, #, ^, [, ], <, |),
* <div>s appearing inside <pre>s, and
* mismatches between column count in <colgroup> and table rows.
I can gather a bunch of examples if of interest.


5. Finally, I see mandoc adds today's date to the footer if it can't
parse one from the source file. I think this can be misleading, making
it appear a man page has been updated more recently than it actually
has. It would be nice to have an option to disable this behaviour, and
output an empty string or "UNDATED" instead.


mandoc is a pleasure to use btw, I hope the above is useful.


Cheers, Jackson

www.mankier.com
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


* Re: mandoc HTML output: minor issues
From: Anthony J. Bentley @ 2017-08-29  8:47 UTC
  To: discuss

Hi Jackson,

Jackson Pauls writes:
> 1. Duplicate IDs (invalid HTML). E.g. dash.1 gets two elements with
> id="HISTSIZE":

Indeed a problem with mandoc(1), which should not generate duplicate IDs
in HTML output.

> 2. Broken in-page links. E.g. in dash.1, there is an <h2> with
> id="White_Space_Splitting_(Field_Splitting)", but the link to that
> heading is missing the bit in parentheses
> (href="#White_Space_Splitting"):

This is a problem with the manual; if it names a section it should
refer to the section by that name.

> 3. In babel.1, for the -: flag, the colon appears outside the .Fl element:
> ...
>  21 .Op Fl i Ar input-type | Fl : Qo Ar SMILES-string Qc

Problem with the manual.
Should be: .Op Fl i Ar input-type | Fl \&: Qo Ar SMILES-string Qc

> 4. Running v.Nu (https://validator.github.io/validator/) on a
> collection of HTML files generated by mandoc picks up duplicate IDs,
> and a bunch of other issues:
> * unescaped characters in href attributes (%, "),
> * unescaped characters in URL fragments (\, {, }, #, ^, [, ], <, |),
> * <div>s appearing inside <pre>s, and
> * mismatches between column count in <colgroup> and table rows.
> I can gather a bunch of examples if of interest.

Please do. mandoc(1) could use some improvement when it comes to
producing valid HTML.

-- 
Anthony J. Bentley

* Re: mandoc HTML output: minor issues
From: Jackson Pauls @ 2017-08-29 15:56 UTC
  To: discuss

Hi Anthony,

On 29 August 2017 at 09:47, Anthony J. Bentley <anthony@anjbe.name> wrote:
>> 4. Running v.Nu (https://validator.github.io/validator/) on a
>> collection of HTML files generated by mandoc picks up duplicate IDs,
>> and a bunch of other issues:
>> * unescaped characters in href attributes (%, "),
>> * unescaped characters in URL fragments (\, {, }, #, ^, [, ], <, |),
>> * <div>s appearing inside <pre>s, and
>> * mismatches between column count in <colgroup> and table rows.
>> I can gather a bunch of examples if of interest.
>
> Please do. mandoc(1) could use some improvement when it comes to
> producing valid HTML.

Ok, here goes:


Example of unescaped % in sudo.8:
 403 .It Li %H
becomes
375       [...]<a class="selflink" href="#%H">[...]
should be
<a class="selflink" href="#%25H">


Example of wrongly escaped double quote in sshd.8:
475 .It Cm command="command"
becomes
337   [...]<a class="selflink" href="#command=&quot;command&quot;">[...]
should be
<a class="selflink" href="#command=%22command%22">


Example of unescaped space character in URL path in sudoers.5:
3378 For more information see
3379 .Xr "GROUP PROVIDER PLUGINS" .
becomes
2305     For more information see <a class="Xr" title="Xr"
href="/1/GROUP PROVIDER
2306       PLUGINS">GROUP PROVIDER PLUGINS</a>.</dd>
the href should be "/1/GROUP%20PROVIDER%20PLUGINS"


Example of unescaped { and } characters in fragment in visudo.8:
347 .It Li Warning: unused {User,Runas,Host,Cmnd}_Alias
becomes
242 [...]<a class="selflink"
href="#Warning:_unused_{User,Runas,Host,Cmnd}_Alias">[...]
should be
<a class="selflink" href="#Warning:_unused_%7BUser,Runas,Host,Cmnd%7D_Alias">

There are other issues with unescaped characters in URL fragments,
which can only contain a limited set of characters:
https://stackoverflow.com/a/26119120
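
To make the expected values above concrete, here is a small Python
sketch of the escaping I have in mind. It is only an illustration, not
how mandoc does or should implement it: FRAGMENT_SAFE is my reading of
the characters RFC 3986 allows unescaped in a fragment, and the function
names are made up. The result is percent-encoded first and then
HTML-escaped, since characters like & are legal in a URL but still need
escaping inside an attribute value.

```python
import html
from urllib.parse import quote

# Assumption: besides letters, digits and "-._~" (which quote() never
# escapes), a fragment may contain the RFC 3986 sub-delims plus ":@/?".
FRAGMENT_SAFE = "!$&'()*+,;=:@/?"


def encode_fragment(text):
    """Percent-encode text for use after the '#' in a URL."""
    return quote(text, safe=FRAGMENT_SAFE)


def encode_path_segment(text):
    """Percent-encode one path segment; even '/' is escaped here."""
    return quote(text, safe="")


def href_attr(fragment):
    """Build an href attribute value: URL-encode, then HTML-escape."""
    return html.escape("#" + encode_fragment(fragment), quote=True)


print(encode_fragment("%H"))                  # %25H
print(encode_fragment('command="command"'))   # command=%22command%22
print(encode_fragment("Warning:_unused_{User,Runas,Host,Cmnd}_Alias"))
print(encode_path_segment("GROUP PROVIDER PLUGINS"))
```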


Example of a <div> inside a <pre> (caused by a missing </pre>) in rpld.conf.5:
 96 .Bd -literal -ffset indent
 97         foo = something;
 98 or
 99 .Bd -literal -ffset indent
100         bar;
101
102 .Ed
becomes
 52 <div class="Bd" style="margin-left: 0.00ex;">
 53 <pre class="Li">
 54         foo = something;
 55 or
 56
 57 <div class="Pp"></div>
 58 <div class="Bd" style="margin-left: 0.00ex;">
 59 <pre class="Li">
 60         bar;
 61
 62 </pre>
 63 </div>
Maybe that's just an issue with the manual (missing .Ed), but maybe on
hitting the second .Bd mandoc(1) could assume an .Ed for the prior
.Bd.
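
On the manual side, unbalanced displays like this are easy to spot with
a script. A minimal Python sketch, assuming macros start at column 0
with no space after the dot and that .Bd/.Ed simply nest like brackets;
unclosed_displays is a hypothetical helper, not a mandoc feature:

```python
def unclosed_displays(lines):
    """Return 1-based line numbers of .Bd macros never closed by an .Ed."""
    open_blocks = []
    for number, line in enumerate(lines, start=1):
        words = line.split()
        if not words:
            continue
        if words[0] == ".Bd":
            open_blocks.append(number)
        elif words[0] == ".Ed" and open_blocks:
            # An .Ed closes the most recently opened display.
            open_blocks.pop()
    return open_blocks


# The rpld.conf.5 excerpt quoted above, without the line numbers.
source = [
    ".Bd -literal -ffset indent",
    "        foo = something;",
    "or",
    ".Bd -literal -ffset indent",
    "        bar;",
    "",
    ".Ed",
]
print(unclosed_displays(source))  # [1]
```

Here the single .Ed is matched to the second .Bd, so the first one is
reported as never closed.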


Example of column count issue in tmux.1:
 475 .Bl -column "XXXXXXXXXX" "X"
 476 .It Sy "Token" Ta Sy "" Ta Sy "Meaning"
becomes
 372 <table class="Bl-column">
 373   <colgroup>
 374     <col style="width: 15.00ex;"/>
 375     <col style="min-width: 1.00ex;"/>
 376   </colgroup>
 377   <tr class="It-column">
 378     <td class="It-column"><b class="Sy" title="Sy">Token</b></td>
 379     <td class="It-column"><b class="Sy" title="Sy"></b></td>
 380     <td class="It-column"><b class="Sy" title="Sy">Meaning</b></td>
 381   </tr>
v.Nu says "error: A table row was 3 columns wide and exceeded the
column count established using column markup (2)," but in practice I'm
not sure how important this is.
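
For what it's worth, the condition v.Nu complains about can be
reproduced with a few lines of Python. This is a simplified sketch that
handles one table at a time and ignores span/colspan attributes;
TableShape and column_mismatches are names I made up:

```python
from html.parser import HTMLParser


class TableShape(HTMLParser):
    """Count declared <col> elements and the cells in each <tr>."""

    def __init__(self):
        super().__init__()
        self.declared = 0
        self.row_widths = []

    def handle_starttag(self, tag, attrs):
        if tag == "col":
            self.declared += 1
        elif tag == "tr":
            self.row_widths.append(0)
        elif tag in ("td", "th") and self.row_widths:
            self.row_widths[-1] += 1


def column_mismatches(table_html):
    """Return (row number, width) for rows wider than the <colgroup>."""
    shape = TableShape()
    shape.feed(table_html)
    shape.close()
    return [
        (row, width)
        for row, width in enumerate(shape.row_widths, start=1)
        if width > shape.declared
    ]


# Trimmed-down version of the tmux.1 table quoted above.
sample = (
    '<table class="Bl-column"><colgroup>'
    '<col style="width: 15.00ex;"/><col style="min-width: 1.00ex;"/>'
    "</colgroup><tr>"
    "<td>Token</td><td></td><td>Meaning</td>"
    "</tr></table>"
)
print(column_mismatches(sample))  # [(1, 3)]
```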


Finally, one bonus oddity I just noticed, an opening paren "(" going
stray in dnstop.8:
165 .It *
166 show sources + 8th level query names
167 .It (
168 show sources + 9th level query names
becomes
178   <dt class="It-tag">*</dt>
179   <dd class="It-tag">show sources + 8th level query names</dd>
180   (
181   <dt class="It-tag"></dt>
182   <dd class="It-tag">show sources + 9th level query names</dd>


Cheers, Jackson

www.mankier.com