tech@mandoc.bsd.lv
* ".TS H" macro
From: Kristaps Dzonsons @ 2021-03-28 11:23 UTC (permalink / raw)
  To: tech

Hi!

A recent issue for lowdown[1] noted the existence of ".TS H", which 
allows TS/TE tables to span multiple pages in PDF output.  This doesn't 
affect mandoc, where PDF tables aren't great anyway (that's my fault 
from long ago): the "H" is ignored, so ".TS H" behaves like ".TS".

The problem is that this introduces a new macro to signal the end of the 
table header, which groff will re-issue at the start of a new page.  And 
of course, the macro is ".TH".

.TH "Untitled article" "7" ""
.TS H
tab(|) expand allbox;
lb lb
l l.
T{
app
T}|T{
quality
T}
.TH
T{
foo
T}|T{
bar
T}
.TE

In mandoc, the subsequent ".TH" overrides the main document's ".TH", so 
the title, section, and date show up as empty.

I've included a patch that makes the TH parser return early if a title 
has already been set.  This passes "make regress", but it may be more 
appropriate to check whether the ".TH" is inside a TS/TE block.

Best,

Kristaps

[1]
https://github.com/kristapsdz/lowdown/issues/65

[-- Attachment #2: mandoc.TS.patch --]
[-- Type: text/x-patch, Size: 496 bytes --]

Index: man_validate.c
===================================================================
RCS file: /home/cvs/mdocml/mandoc/man_validate.c,v
retrieving revision 1.155
diff -u -p -u -r1.155 man_validate.c
--- man_validate.c	30 Oct 2020 13:24:33 -0000	1.155
+++ man_validate.c	28 Mar 2021 11:21:33 -0000
@@ -475,6 +475,9 @@ post_TH(CHKARGS)
 	struct roff_node *nb;
 	const char	*p;
 
+	if (man->meta.title != NULL)
+		return;
+
 	free(man->meta.title);
 	free(man->meta.vol);
 	free(man->meta.os);

* Re: ".TS H" macro
From: Kristaps Dzonsons @ 2021-03-28 11:30 UTC (permalink / raw)
  To: tech

On second glance, this is only supported by the -ms parser in groff(1) 
and tables with groff's -man don't support the `.TS H` invocation:

% groff -tk -man -Tpdf foo.man  >/dev/null
error: page 2: table will not fit on one page; use .TS H/.TH with a 
supporting macro package

So this shouldn't affect mandoc because it doesn't accept -ms and thus 
the `.TS H` shouldn't be used.

Sorry for the noise!

* Re: ".TS H" macro
From: Ingo Schwarze @ 2021-03-28 18:46 UTC (permalink / raw)
  To: Kristaps Dzonsons; +Cc: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Sun, Mar 28, 2021 at 01:30:41PM +0200:

> On second glance, this is only supported by the -ms parser in groff(1) 
> and tables with groff's -man don't support the `.TS H` invocation:
> 
> % groff -tk -man -Tpdf foo.man  >/dev/null
> error: page 2: table will not fit on one page; use .TS H/.TH with a 
> supporting macro package

Indeed, i agree with your assessment.  ".TS H" is groff_ms(7) syntax
and invalid in groff_man(7) and hence in man(7).

Here is the result of running your sample file (without the "expand"
table option that groff uses but mandoc ignores) through gmdiff:

   $ gmdiff tmp.man         
   ========== tmp.man ========== 
  mandoc: tmp.man:11:2: WARNING: missing manual title, using "": TH
  mandoc: tmp.man:11:2: WARNING: missing manual section, using "": TH 
  mandoc: tmp.man:11:2: WARNING: missing date, using "": TH
  mandoc: tmp.man: STYLE: RCS id missing: (OpenBSD)
  mandoc: see above the output for WARNING messages
  --- /tmp/roff.out       Sun Mar 28 20:09:57 2021
  +++ /tmp/mandoc.out     Sun Mar 28 20:09:57 2021
  @@ -1,17 +1,11 @@
  -UNTITLED(7)       Miscellaneous Information Manual       UNTITLED(7)
  +()                                                                ()
   
   
   
  +
   +----+---------+
   |app | quality |
   +----+---------+
  -|    |         |
  -|    |         |
  -|    |         |
  -()   |         |                                                  ()
  -|    |         |
  -|    |         |
  -|    |         |
   |foo | bar     |
   +----+---------+

So the differences are as follows:

 1. Groff prints the page header line as soon as it sees the first .TH.
    As you wisely decided, mandoc parses first and only starts rendering
    when the parsing is complete.  Hence the difference in the first line
    of output.

 2. *Both* groff and mandoc treat the second .TH as a man(7) .TH macro,
    not as a tbl(7) instruction (we probably shouldn't call tbl(7)
    features like "T{" and ".TH" "macros" because processing of tbl(7)
    and eqn(7) input differs substantially from normal roff(7) processing).
    But both treat this second man(7) .TH macro differently.  Groff goes
    ahead and prints a second header, surrounded by the usual three blank
    lines, overlapping the table.  Mandoc just collects the information,
    leaving the table undisturbed.

> So this shouldn't affect mandoc because it doesn't accept -ms and thus 
> the `.TS H` shouldn't be used.

Agreed.

> Sorry for the noise!

I don't deny your first shot went slightly astray, but it wasn't all
noise.  Consider the following test document:

  .TH ONE 1 2001-01-01
  .TH TWO 2 2002-02-02
  .SH NAME
  name \- name
  .SH DESCRIPTION
  text

Groff prints both headers, which is hardly useful and not a worthy target
for making mandoc(1) bug-compatible.  But mandoc(1) silently ignores the
first header and only prints the second header.

Using the second rather than the first contradicts man(7) documentation:

     Each man document starts with the TH macro specifying the document's
     name and section, ...

It also contradicts groff_man(7) documentation:

     A man page should contain exactly one .TH call at or near the
     beginning of the file, prior to any other macro calls.

So both manuals prefer seeing the first .TH win, not the last one.

Hence, your patch isn't completely wrong.  But even more important than
fixing which macro wins is emitting an ERROR level diagnostic when
encountering more than one .TH macro in the same document.

So i added a TODO entry:

  - report double .TH in man(7) as an ERROR and let the first win
    kristaps@  28 Mar 2021 13:30:41 +0200
    loc *  exist *  algo *  size *  imp *
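
The behaviour that TODO entry asks for can be sketched in isolation as
follows.  This is only an illustration under stated assumptions, not
mandoc code: struct th_meta, dup_or_null() and th_record() are
hypothetical names, and the wording of the diagnostic is invented.  The
first .TH stores its arguments; every later .TH is counted, reported,
and otherwise ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parser's metadata; not mandoc's struct. */
struct th_meta {
	char	*title;
	char	*section;
	int	 errors;	/* number of duplicate .TH lines seen */
};

/* Duplicate a string, or return NULL for NULL input. */
char *
dup_or_null(const char *s)
{
	char	*p;

	if (s == NULL)
		return NULL;
	if ((p = malloc(strlen(s) + 1)) == NULL)
		abort();
	strcpy(p, s);
	return p;
}

/*
 * Record one .TH line.  The first call stores its arguments, using ""
 * for missing ones so that even a bare ".TH" counts as "seen".  Any
 * later call is counted and reported and changes nothing.
 * Returns 1 if the metadata was stored, 0 if the line was rejected.
 */
int
th_record(struct th_meta *m, const char *title, const char *section,
    int line)
{
	assert(m != NULL);
	if (m->title != NULL) {
		m->errors++;
		fprintf(stderr, "line %d: ERROR: duplicate TH, ignoring\n",
		    line);
		return 0;
	}
	m->title = dup_or_null(title != NULL ? title : "");
	m->section = dup_or_null(section != NULL ? section : "");
	return 1;
}
```

Feeding the two-header test document above through th_record(), with a
zero-initialized struct th_meta, would leave the title at "ONE" and the
error count at 1.  Storing "" rather than NULL for a missing argument
matters here: a NULL title doubles as the "no .TH seen yet" marker, so a
first .TH without arguments must not leave it NULL.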

Thanks,
  Ingo