tech@mandoc.bsd.lv
 help / color / mirror / Atom feed
* Improper formatting of table with T&
@ 2021-03-06  4:54 Anthony J. Bentley
  2021-03-28 20:36 ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Anthony J. Bentley @ 2021-03-06  4:54 UTC (permalink / raw)
  To: tech

Hi,

mairix(1) contains this table:

.TS
box tab(&);
lb | lb | lb | lb.
letter & short for & example & meaning
=
.T&
l | l | l | l.
d & days   & 3d & 3 days
w & weeks  & 2w & 2 weeks (14 days)
m & months & 5m & 5 months (150 days)
y & years  & 4y & 4 years (4*365 days)
.TE

groff renders it like so:

+--------+-------------+-----------+-----------------------+
|letter  |  short for  |  example  |  meaning              |
+--------+-------------+-----------+-----------------------+
|d       |  days       |  3d       |  3 days               |
|w       |  weeks      |  2w       |  2 weeks (14 days)    |
|m       |  months     |  5m       |  5 months (150 days)  |
|y       |  years      |  4y       |  4 years (4*365 days) |
+--------+-------------+-----------+-----------------------+

mandoc renders it similarly, but bolds both the header's bottom
border and the first line following (the 'days' line):

+--------+-------------+-----------+-----------------------+
|letter  |  short for  |  example  |  meaning              |
+========+=============+===========+=======================+
|d       |  days       |  3d       |  3 days               |
|w       |  weeks      |  2w       |  2 weeks (14 days)    |
|m       |  months     |  5m       |  5 months (150 days)  |
|y       |  years      |  4y       |  4 years (4*365 days) |
+--------+-------------+-----------+-----------------------+

I couldn't find any explanation in our tbl(7) page of what T& does,
although it's mentioned briefly in the "Layout" section. We should
probably document it somewhere.

-- 
Anthony J. Bentley
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Improper formatting of table with T&
  2021-03-06  4:54 Improper formatting of table with T& Anthony J. Bentley
@ 2021-03-28 20:36 ` Ingo Schwarze
  2021-03-29  0:33   ` Ingo Schwarze
  2021-04-07  9:34   ` Anthony J. Bentley
  0 siblings, 2 replies; 4+ messages in thread
From: Ingo Schwarze @ 2021-03-28 20:36 UTC (permalink / raw)
  To: Anthony J. Bentley; +Cc: tech

Hi Anthony,

Anthony J. Bentley wrote on Fri, Mar 05, 2021 at 09:54:13PM -0700:

> mairix(1) contains this table:
> 
> .TS
> box tab(&);
> lb | lb | lb | lb.
> letter & short for & example & meaning
> =
> .T&
> l | l | l | l.
> d & days   & 3d & 3 days
> w & weeks  & 2w & 2 weeks (14 days)
> m & months & 5m & 5 months (150 days)
> y & years  & 4y & 4 years (4*365 days)
> .TE
> 
> groff renders it like so:
> 
> +--------+-------------+-----------+-----------------------+
> |letter  |  short for  |  example  |  meaning              |
> +--------+-------------+-----------+-----------------------+

I consider this line a bug in groff; it should be

  +========+=============+===========+=======================+

like with mandoc.

The groff tbl(1) manual page says:

  If a data line consists of only ‘_’ or ‘=’, a single or double
  line, respectively, is drawn across the table at that point; ...

If you actually wanted "+--------+------ ...", you would have
to specify

  letter & short for & example & meaning
  -

i.e. a dash rather than an equal sign after the header data.

Do you agree?

> |d       |  days       |  3d       |  3 days               |
> |w       |  weeks      |  2w       |  2 weeks (14 days)    |
> |m       |  months     |  5m       |  5 months (150 days)  |
> |y       |  years      |  4y       |  4 years (4*365 days) |
> +--------+-------------+-----------+-----------------------+
> 
> mandoc renders it similarly, but bolds both the header's bottom
> border and the first line following (the 'days' line):
> 
> +--------+-------------+-----------+-----------------------+
> |letter  |  short for  |  example  |  meaning              |
> +========+=============+===========+=======================+
> |d       |  days       |  3d       |  3 days               |

The text in this line being bold looks like a mandoc bug to me.
The author's intention clearly is to use the "l | l | ..." format
without the b modifiers.

> |w       |  weeks      |  2w       |  2 weeks (14 days)    |
> |m       |  months     |  5m       |  5 months (150 days)  |
> |y       |  years      |  4y       |  4 years (4*365 days) |
> +--------+-------------+-----------+-----------------------+
> 
> I couldn't find any explanation in our tbl(7) page of what T& does,
> although it's mentioned briefly in the "Layout" section. We should
> probably document it somewhere.

I admit the documentation is terse, it says in our tbl(7):

  The table layout follows an Options line or a roff(7) TS or T& macro.

So "l | l | l | l" is not data but a layout line even though it occurs
after the first data line.

How do you think should the wording be improved for additional clarity,
such that it doesn't remain too terse but doesn't turn wordy and
cumbersome either?  I guess an improvement is likely possible.

Writing the table as

  .TS
  box tab(&);
  lb | lb | lb | lb
  l | l | l | l.
  letter & short for & example & meaning
  =
  d & days   & 3d & 3 days
  [...]

would be equivalent, shorter and simpler, but that's no excuse for
mandoc misrendering.

So, here is a candidate patch.  When looking for the last layout row
used, we need to look at the layout row used for the previous data line
containing data, not at the previous data line outright, which might be
a horizontal ruler (as it is in this case).  If it is, do not restart
from the first layout row but still proceed to the next data row, which
in this case was just read from T&.

Does it work for you?

If so, is it currently OK to commit to OpenBSD, or is there something
i should pay attention to, such as release preparations or anything
else that might be relevant?  I must admit i'm a bit out of the loop
right now - my fault, sorry for that.

Some many insects - winter must be fading.

Yours,
  Ingo


Index: tbl_data.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/tbl_data.c,v
retrieving revision 1.40
diff -u -p -r1.40 tbl_data.c
--- tbl_data.c	11 Jan 2020 20:48:13 -0000	1.40
+++ tbl_data.c	28 Mar 2021 20:25:35 -0000
@@ -242,10 +242,11 @@ tbl_data(struct tbl_node *tbl, int ln, c
 	struct tbl_cell	*cp;
 	struct tbl_span	*sp;
 
-	rp = (sp = tbl->last_span) == NULL ? tbl->first_row :
-	    sp->pos == TBL_SPAN_DATA && sp->layout->next != NULL ?
-	    sp->layout->next : sp->layout;
-
+	for (sp = tbl->last_span; sp != NULL; sp = sp->prev)
+		if (sp->pos == TBL_SPAN_DATA)
+			break;
+	rp = sp == NULL ? tbl->first_row :
+	    sp->layout->next == NULL ? sp->layout : sp->layout->next;
 	assert(rp != NULL);
 
 	if (p[1] == '\0') {
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Improper formatting of table with T&
  2021-03-28 20:36 ` Ingo Schwarze
@ 2021-03-29  0:33   ` Ingo Schwarze
  2021-04-07  9:34   ` Anthony J. Bentley
  1 sibling, 0 replies; 4+ messages in thread
From: Ingo Schwarze @ 2021-03-29  0:33 UTC (permalink / raw)
  To: Anthony J. Bentley; +Cc: tech

Hi,

.. and by the way, after fixing the bug, i got worried there might
be something i'm missing and there might be more problems hiding here
because i wondered how it could possibly happen that before the patch i
just sent to you, your (Anthony's) version of the test file (with the T&)
exhibited the bug and my changed version (without the T& and the final
layout line pulled up to the main layout block) worked.  The *final*
layout data structure ought to be just identical on both cases...
So what was going on?

The key is a few lines down in tbl_data.c, function tbl_data(),
the top of which i patched.  That function is actually used for
parsing *any* data line, both those containing data and those
containing only horizontal rulers.  So consider what happens
when the *ruler* is processed right after the title line of the
table.

Without my patch and with Anthony's version of the test file,
sp->layout->next is NULL (because the T& has not yet been reached)
so rp gets set to sp->layout (the layout line for the title),
and then that gets used for initializing the span struture for
the ruler, a few lines down in the switch statement.  So the
ruler gets associated with the *header* layout line and that
doesn't change when the T& is later parsed.

On the other hand, with my changed test file, the whole layout structure
including the layout for data lines is already present when the ruler
is parsed, so sp->layout->next is *not* NULL but the final layout line.
Hence the ruler gets associated with the layout line intended for data.
Of course, it doesn't matter at all which layout line is associated with
a ruler, no layout is needed for rendering a ruler in the first place,
so the difference doesn't matter for rendering the ruler itself.

But then later, when parsing the first data line, the one for "days",
the bug caused the meaningless layout pointer stored with the ruler
to actually get used: the incorrect condition "sp->pos == TBL_SPAN_DATA"
in the rp assignment line was false in this case, so whatever garbage
pointer was there got used.  That resulted in correct formatting (by
chance) for my test file but blew up for Anthony's test file.

I'm no longer convinced storing such meaningless pointers is a smart move,
precisely because there is a risk something might use them down the line,
causing erratic behaviour that is hard to disentangle.  But at least i'm
now confident i understand what was going on, and short of redesigning
the invariants of struct tbl_span (which might improve rigour at the
risk of causing regressions) i'm now confident my patch is correct.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Improper formatting of table with T&
  2021-03-28 20:36 ` Ingo Schwarze
  2021-03-29  0:33   ` Ingo Schwarze
@ 2021-04-07  9:34   ` Anthony J. Bentley
  1 sibling, 0 replies; 4+ messages in thread
From: Anthony J. Bentley @ 2021-04-07  9:34 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: tech

Hi Ingo,

Ingo Schwarze writes:
> I consider this line a bug in groff; it should be
>
>   +========+=============+===========+=======================+
>
> like with mandoc.
> ...
> If you actually wanted "+--------+------ ...", you would have
> to specify
>
>   letter & short for & example & meaning
>   -
>
> i.e. a dash rather than an equal sign after the header data.
>
> Do you agree?

Hm, I guess so. I'm not intimately familiar with all the tbl formatting
options though.

> So, here is a candidate patch.  When looking for the last layout row
> used, we need to look at the layout row used for the previous data line
> containing data, not at the previous data line outright, which might be
> a horizontal ruler (as it is in this case).  If it is, do not restart
> from the first layout row but still proceed to the next data row, which
> in this case was just read from T&.
>
> Does it work for you?

Yes, and thanks for the analysis. 

> If so, is it currently OK to commit to OpenBSD, or is there something
> i should pay attention to, such as release preparations or anything
> else that might be relevant?

No lock yet, although it's probably happening soon. Feel free to commit
this with my ok.

-- 
Anthony J. Bentley
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-04-07  9:35 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-06  4:54 Improper formatting of table with T& Anthony J. Bentley
2021-03-28 20:36 ` Ingo Schwarze
2021-03-29  0:33   ` Ingo Schwarze
2021-04-07  9:34   ` Anthony J. Bentley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).