tech@mandoc.bsd.lv
* table borders don't span entire width
@ 2019-02-08 21:43 Anthony J. Bentley
  2019-02-08 22:02 ` Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Anthony J. Bentley @ 2019-02-08 21:43 UTC
  To: tech

Hi,

In response to a mailing list question I tried out an example from
The Awk Programming Language. The tbl(7) source it generates looks
in part like this:

.TS
center;
n n n n.
_	_	_	_
2173	77.1	13765	53.6
=	=	=	=
.TE

groff creates solid lines as in the book:

                   ---------------------------
                   2173   77.1   13765   53.6
                   ---------------------------

But mandoc breaks the lines between cells:

                          ----   ----   -----   ----
                          2173   77.1   13765   53.6
                          ====   ====   =====   ====

Side note: while using = for the double line is an improvement over
groff, in UTF-8 the double line is output with U+2501. Wouldn't U+2550
be more appropriate? The typeset output displays a double line, not a
heavy line.
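
For reference, the two code points differ only in their last two UTF-8 bytes. A minimal sketch of the encoding follows; utf8_encode_bmp3() is a hypothetical helper written for this note, not a function taken from mandoc:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode a code point in the range U+0800..U+FFFF (surrogates excluded)
 * as three UTF-8 bytes.  Returns 3 on success, 0 if cp is out of range.
 * Hypothetical helper for illustration only.
 */
size_t
utf8_encode_bmp3(uint32_t cp, unsigned char out[3])
{
	if (cp < 0x0800 || cp > 0xFFFF)
		return 0;
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	out[0] = (unsigned char)(0xE0 | (cp >> 12));		/* top 4 bits */
	out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));	/* middle 6 bits */
	out[2] = (unsigned char)(0x80 | (cp & 0x3F));		/* low 6 bits */
	return 3;
}
```

With this, U+2501 (heavy horizontal) encodes as E2 94 81 and U+2550 (double horizontal) as E2 95 90.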

-- 
Anthony J. Bentley
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


* Re: table borders don't span entire width
  2019-02-08 21:43 table borders don't span entire width Anthony J. Bentley
@ 2019-02-08 22:02 ` Ingo Schwarze
  2019-02-08 22:39   ` Ingo Schwarze
  2019-02-09 21:06   ` Ingo Schwarze
  0 siblings, 2 replies; 6+ messages in thread
From: Ingo Schwarze @ 2019-02-08 22:02 UTC
  To: Anthony J. Bentley; +Cc: tech

Hi Anthony,

Anthony J. Bentley wrote on Fri, Feb 08, 2019 at 02:43:45PM -0700:

> In response to a mailing list question

I marvel that you understood that question.
I looked at it and didn't understand a word...

> I tried out an example from The Awk Programming Language.
> The tbl(7) source it generates looks in part like this:
> 
> .TS
> center;
> n n n n.
> _	_	_	_
> 2173	77.1	13765	53.6
> =	=	=	=
> .TE
> 
> groff creates solid lines as in the book:
> 
>                    ---------------------------
>                    2173   77.1   13765   53.6
>                    ---------------------------
> 
> But mandoc breaks the lines between cells:
> 
>                           ----   ----   -----   ----
>                           2173   77.1   13765   53.6
>                           ====   ====   =====   ====

That looks like a bug.
Here is what the tbl(7) manual page says:

     If a data cell contains only the single character '_' or '=',
     a single or double horizontal line is drawn across the cell,
     joining its neighbours.  If a data cell contains only the two
     character sequence '\_' or '\=', a single or double horizontal
     line is drawn inside the cell, not joining its neighbours.
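
The four cases in that paragraph can be sketched as a small classifier. This is only an illustration of the documented rule; the enum and function names are made up for the example and are not the ones used in the mandoc sources:

```c
#include <string.h>

/* Hypothetical names for this sketch; mandoc's own enum differs. */
enum cell_rule {
	RULE_NONE,		/* ordinary data, no line */
	RULE_SINGLE_JOIN,	/* "_"  : single line, joins neighbours */
	RULE_DOUBLE_JOIN,	/* "="  : double line, joins neighbours */
	RULE_SINGLE_INSIDE,	/* "\_" : single line, inside the cell only */
	RULE_DOUBLE_INSIDE	/* "\=" : double line, inside the cell only */
};

/* Classify the complete text of one data cell. */
enum cell_rule
classify_cell(const char *text)
{
	if (strcmp(text, "_") == 0)
		return RULE_SINGLE_JOIN;
	if (strcmp(text, "=") == 0)
		return RULE_DOUBLE_JOIN;
	if (strcmp(text, "\\_") == 0)
		return RULE_SINGLE_INSIDE;
	if (strcmp(text, "\\=") == 0)
		return RULE_DOUBLE_INSIDE;
	return RULE_NONE;
}
```

Under this rule, every cell in the "_" and "=" rows of the example table falls into one of the two joining cases, so the lines should run through the column gaps.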

> Side note: while using = for the double line is an improvement over
> groff, in UTF-8 the double line is output with U+2501. Wouldn't U+2550
> be more appropriate? The typeset output displays a double line, not a
> heavy line.

I think i did that because the heavy lines looked better to me,
but it's not the first time someone asks, so maybe i should change it.
Adjusting borders_utf8[] in tbl_term.c is probably all that is needed.

Yours,
  Ingo

* Re: table borders don't span entire width
  2019-02-08 22:02 ` Ingo Schwarze
@ 2019-02-08 22:39   ` Ingo Schwarze
  2019-02-08 23:18     ` Anthony J. Bentley
  2019-02-09 21:06   ` Ingo Schwarze
  1 sibling, 1 reply; 6+ messages in thread
From: Ingo Schwarze @ 2019-02-08 22:39 UTC
  To: Anthony J. Bentley; +Cc: tech

Hi,

Ingo Schwarze wrote on Fri, Feb 08, 2019 at 11:02:21PM +0100:
> Anthony J. Bentley wrote on Fri, Feb 08, 2019 at 02:43:45PM -0700:

>> Side note: while using = for the double line is an improvement over
>> groff, in UTF-8 the double line is output with U+2501. Wouldn't U+2550
>> be more appropriate? The typeset output displays a double line, not a
>> heavy line.

> I think i did that because the heavy lines looked better to me,
> but it's not the first time someone asks, so maybe i should change it.
> Adjusting borders_utf8[] in tbl_term.c is probably all that is needed.

Oops, it is not that simple.  Box drawing with double lines is simply
incomplete in Unicode.  I wouldn't have thought it possible that
Unicode is actually missing characters...   :-o

I deleted all entries containing heavy lines from the table
and inserted all characters containing double lines that i was
able to find, and here is what i got:

static  const int borders_utf8[81] = {
        0x0020, 0x2576, 0x    ,  /* 000 right */
        0x2577, 0x250c, 0x2552,  /* 001 down */
        0x    , 0x2553, 0x2554,  /* 002 */
        0x2574, 0x2500, 0x    ,  /* 010 left */
        0x2510, 0x252c, 0x    ,  /* 011 left down */
        0x2556, 0x2565, 0x    ,  /* 012 */
        0x    , 0x    , 0x2550,  /* 020 left */
        0x2555, 0x    , 0x2564,  /* 021 left down */
        0x2557, 0x    , 0x2566,  /* 022 */
        0x2575, 0x2514, 0x2558,  /* 100 up */
        0x2502, 0x251c, 0x255e,  /* 101 up down */
        0x    , 0x    , 0x    ,  /* 102 */
        0x2518, 0x2534, 0x    ,  /* 110 up left */
        0x2524, 0x253c, 0x    ,  /* 111 all */
        0x    , 0x    , 0x    ,  /* 112 */
        0x255b, 0x    , 0x2567,  /* 120 up left */
        0x2561, 0x    , 0x256a,  /* 121 all */
        0x    , 0x    , 0x    ,  /* 122 */
        0x    , 0x2559, 0x255a,  /* 200 up */
        0x    , 0x    , 0x    ,  /* 201 up down */
        0x2551, 0x255f, 0x2560,  /* 202 */
        0x255c, 0x2568, 0x    ,  /* 210 up left */
        0x    , 0x    , 0x    ,  /* 211 all */
        0x2562, 0x256b, 0x    ,  /* 212 */
        0x255d, 0x    , 0x2569,  /* 220 up left */
        0x    , 0x    , 0x    ,  /* 221 all */
        0x2563, 0x    , 0x256c,  /* 222 */
};

So in particular, the following are missing:

 * double right
 * double down
 * single left with double right
 * single left down with double right
 * single left with double right down
 * double left
 * double left with single right
 * double left with single right down
 * double left down with single right

and so on...

With heavy instead of double, such combinations exist.
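
For readers following along, here is a minimal sketch of how the 81 slots appear to be indexed. The weights are an assumption inferred from the table layout above (81 = 3^4 entries, row comments giving up/left/down, the column within a row giving right); they are not quoted from tbl_term.c, and the names are hypothetical:

```c
/*
 * Assumed direction weights, inferred from the layout of the table
 * above.  Each direction takes a base-3 digit:
 * 0 = no line, 1 = single line, 2 = double line.
 */
enum { W_RIGHT = 1, W_DOWN = 3, W_LEFT = 9, W_UP = 27 };

/* Combine four per-direction line weights into a table index 0..80. */
int
border_index(int up, int left, int down, int right)
{
	return W_UP * up + W_LEFT * left + W_DOWN * down + W_RIGHT * right;
}

/* Recover one direction's line weight (0, 1 or 2) from a table index. */
int
border_weight(int index, int weight)
{
	return index / weight % 3;
}
```

Read this way, index 20 (double left plus double right) is the 0x2550 slot, while index 2 (double right alone) and index 6 (double down alone) are among the empty ones, matching the first two items in the list.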

So, should i leave this untouched, or am i missing something?

Yours,
  Ingo

* Re: table borders don't span entire width
  2019-02-08 22:39   ` Ingo Schwarze
@ 2019-02-08 23:18     ` Anthony J. Bentley
  2019-02-09 16:55       ` Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Anthony J. Bentley @ 2019-02-08 23:18 UTC
  To: Ingo Schwarze; +Cc: tech

Hi Ingo,

The missing characters ring a bell. I'm pretty sure I investigated this
before, came to the same conclusion (that needed box drawing characters
don't exist), and then must have forgotten all about it.

Ingo Schwarze writes:
> So in particular, the following are missing:
>
>  * double right
>  * double down
>  * single left with double right
>  * single left down with double right
>  * single left with double right down
>  * double left
>  * double left with single right
>  * double left with single right down
>  * double left down with single right
>
> and so on...
>
> With heavy instead of double, such combinations exist.

That's a real shame. I looked in the Unicode standard but couldn't find
any rationale, only a comment that the box drawing section of Unicode
exists for compatibility with historic systems. I guess there wasn't
anything out there that used double boxes in this way.

https://www.unicode.org/versions/Unicode11.0.0/ch22.pdf

> So, should i leave this untouched, or am i missing something?

Leave it as is, I guess. But we should document this limitation in
tbl(7) since it's non-obvious and seems to have come up multiple times.

Or just draw twice as many lines, like groff attempts poorly with
doublebox...

-- 
Anthony J. Bentley

* Re: table borders don't span entire width
  2019-02-08 23:18     ` Anthony J. Bentley
@ 2019-02-09 16:55       ` Ingo Schwarze
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Schwarze @ 2019-02-09 16:55 UTC
  To: Anthony J. Bentley; +Cc: tech

Hi Anthony,

Anthony J. Bentley wrote on Fri, Feb 08, 2019 at 04:18:37PM -0700:

> The missing characters ring a bell. I'm pretty sure I investigated this
> before, came to the same conclusion (that needed box drawing characters
> don't exist), and then must have forgotten all about it.

> Ingo Schwarze writes:

>> So in particular, the following are missing:
>>
>>  * double right
>>  * double down
>>  * single left with double right
>>  * single left down with double right
>>  * single left with double right down
>>  * double left
>>  * double left with single right
>>  * double left with single right down
>>  * double left down with single right
>>
>> and so on...
>>
>> With heavy instead of double, such combinations exist.

> That's a real shame. I looked in the Unicode standard but couldn't find
> any rationale, only a comment that the box drawing section of Unicode
> exists for compatibility with historic systems. I guess there wasn't
> anything out there that used double boxes in this way.
> 
> https://www.unicode.org/versions/Unicode11.0.0/ch22.pdf

>> So, should i leave this untouched, or am i missing something?

> Leave it as is, I guess. But we should document this limitation in
> tbl(7) since it's non-obvious and seems to have come up multiple times.

Done, see the commit below.

> Or just draw twice as many lines, like groff attempts poorly with
> doublebox...

No, that would look horrible and waste vertical screen space for
double horizontal lines, and less importantly, horizontal screen
space for double vertical lines.  In particular in tables, screen
space is often a scarce resource.

Besides, representing double lines as heavy lines is *logically*
just fine.  It only mismatches the intent of the author in a minor
presentational detail - or maybe even not at all because the tbl(7)
language does not provide any syntax for "heavy", so the assumption
that authors use "double" to express that they want a line emphasized
seems quite safe.

Yours,
  Ingo


Log Message:
-----------
add a BUGS section explaining the situation with box and line drawing
in UTF-8 output; suggested by bentley@

Modified Files:
--------------
    mandoc:
        tbl.7

Revision Data
-------------
Index: tbl.7
===================================================================
RCS file: /home/cvs/mandoc/mandoc/tbl.7,v
retrieving revision 1.32
retrieving revision 1.33
diff -Ltbl.7 -Ltbl.7 -u -p -r1.32 -r1.33
--- tbl.7
+++ tbl.7
@@ -1,7 +1,7 @@
 .\"	$Id$
 .\"
 .\" Copyright (c) 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
-.\" Copyright (c) 2014, 2015, 2017, 2018 Ingo Schwarze <schwarze@openbsd.org>
+.\" Copyright (c) 2014,2015,2017,2018,2019 Ingo Schwarze <schwarze@openbsd.org>
 .\"
 .\" Permission to use, copy, modify, and distribute this software for any
 .\" purpose with or without fee is hereby granted, provided that the above
@@ -438,3 +438,17 @@ reference was written by
 .An Kristaps Dzonsons Aq Mt kristaps@bsd.lv
 and
 .An Ingo Schwarze Aq Mt schwarze@openbsd.org .
+.Sh BUGS
+In
+.Fl T
+.Cm utf8
+output mode, heavy lines are drawn instead of double lines.
+This cannot be improved because the Unicode standard only provides
+an incomplete set of box drawing characters with double lines,
+whereas it provides a full set of box drawing characters
+with heavy lines.
+It is unlikely this can be improved in the future because the box
+drawing characters are already marked in Unicode as characters
+intended only for backward compatibility with legacy systems,
+and their use is not encouraged.
+So it seems unlikely that the missing ones might get added in the future.

* Re: table borders don't span entire width
  2019-02-08 22:02 ` Ingo Schwarze
  2019-02-08 22:39   ` Ingo Schwarze
@ 2019-02-09 21:06   ` Ingo Schwarze
  1 sibling, 0 replies; 6+ messages in thread
From: Ingo Schwarze @ 2019-02-09 21:06 UTC
  To: Anthony J. Bentley; +Cc: tech

Hi,

Ingo Schwarze wrote on Fri, Feb 08, 2019 at 11:02:21PM +0100:
> Anthony J. Bentley wrote on Fri, Feb 08, 2019 at 02:43:45PM -0700:

>> I tried out an example from The Awk Programming Language.
>> The tbl(7) source it generates looks in part like this:
>> 
>> .TS
>> center;
>> n n n n.
>> _	_	_	_
>> 2173	77.1	13765	53.6
>> =	=	=	=
>> .TE
>> 
>> groff creates solid lines as in the book:
>> 
>>                    ---------------------------
>>                    2173   77.1   13765   53.6
>>                    ---------------------------
>> 
>> But mandoc breaks the lines between cells:
>> 
>>                           ----   ----   -----   ----
>>                           2173   77.1   13765   53.6
>>                           ====   ====   =====   ====

> That looks like a bug.
> Here is what the tbl(7) manual page says:
> 
>      If a data cell contains only the single character '_' or '=',
>      a single or double horizontal line is drawn across the cell,
>      joining its neighbours.  If a data cell contains only the two
>      character sequence '\_' or '\=', a single or double horizontal
>      line is drawn inside the cell, not joining its neighbours.

Fixed with the commit below.
More testing is welcome.

Yours,
  Ingo


Log Message:
-----------
The horizontal line in a data cell containing only "_" or "="
connects to the horizontally adjacent vertical line or cell;
fixing a bug reported by bentley@.

Modified Files:
--------------
    mandoc:
        tbl_term.c

Revision Data
-------------
Index: tbl_term.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/tbl_term.c,v
retrieving revision 1.67
retrieving revision 1.68
diff -Ltbl_term.c -Ltbl_term.c -u -p -r1.67 -r1.68
--- tbl_term.c
+++ tbl_term.c
@@ -166,7 +166,7 @@ term_tbl(struct termp *tp, const struct 
 	size_t		 	 save_offset;
 	size_t			 coloff, tsz;
 	int			 hspans, ic, more;
-	int			 dvert, fc, horiz, line, uvert;
+	int			 dvert, fc, horiz, lhori, rhori, uvert;
 
 	/* Inhibit printing of spaces: we do padding ourselves. */
 
@@ -325,11 +325,13 @@ term_tbl(struct termp *tp, const struct 
 		    (horiz || (IS_HORIZ(sp->layout->first) &&
 		      !IS_HORIZ(sp->prev->layout->first))))
 			uvert = sp->prev->layout->vert;
-		line = sp->pos == TBL_SPAN_DHORIZ ||
+		rhori = sp->pos == TBL_SPAN_DHORIZ ||
+		    (sp->first != NULL && sp->first->pos == TBL_DATA_DHORIZ) ||
 		    sp->layout->first->pos == TBL_CELL_DHORIZ ? 2 :
 		    sp->pos == TBL_SPAN_HORIZ ||
+		    (sp->first != NULL && sp->first->pos == TBL_DATA_HORIZ) ||
 		    sp->layout->first->pos == TBL_CELL_HORIZ ? 1 : 0;
-		fc = BUP * uvert + BDOWN * dvert + BRIGHT * line;
+		fc = BUP * uvert + BDOWN * dvert + BRIGHT * rhori;
 		if (uvert > 0 || dvert > 0 || (horiz && sp->opts->lvert)) {
 			(*tp->advance)(tp, tp->tcols->offset);
 			tp->viscol = tp->tcol->offset;
@@ -402,6 +404,15 @@ term_tbl(struct termp *tp, const struct 
 					cpn = cpn->next;
 				}
 
+				lhori = (cp != NULL &&
+				     cp->pos == TBL_CELL_DHORIZ) ||
+				    (dp != NULL &&
+				     dp->pos == TBL_DATA_DHORIZ) ? 2 :
+				    (cp != NULL &&
+				     cp->pos == TBL_CELL_HORIZ) ||
+				    (dp != NULL &&
+				     dp->pos == TBL_DATA_HORIZ) ? 1 : 0;
+
 				/*
 				 * Skip later cells in a span,
 				 * figure out whether to start a span,
@@ -454,57 +465,36 @@ term_tbl(struct termp *tp, const struct 
 				}
 				while (tp->viscol < tp->tcol->rmargin +
 				    tp->tbl.cols[ic].spacing / 2)
-					tbl_direct_border(tp, fc, 1);
+					tbl_direct_border(tp,
+					    BHORIZ * lhori, 1);
 
 				if (tp->tcol + 1 == tp->tcols + tp->lasttcol)
 					continue;
 
-				if (cp != NULL) {
-					switch (cp->pos) {
-					case TBL_CELL_HORIZ:
-						fc = BLEFT;
-						break;
-					case TBL_CELL_DHORIZ:
-						fc = BLEFT * 2;
-						break;
-					default:
-						fc = 0;
-						break;
-					}
+				if (cp != NULL)
 					cp = cp->next;
-				}
-				if (cp != NULL) {
-					switch (cp->pos) {
-					case TBL_CELL_HORIZ:
-						fc += BRIGHT;
-						break;
-					case TBL_CELL_DHORIZ:
-						fc += BRIGHT * 2;
-						break;
-					default:
-						break;
-					}
-				}
+
+				rhori = (cp != NULL &&
+				     cp->pos == TBL_CELL_DHORIZ) ||
+				    (dp != NULL &&
+				     dp->pos == TBL_DATA_DHORIZ) ? 2 :
+				    (cp != NULL &&
+				     cp->pos == TBL_CELL_HORIZ) ||
+				    (dp != NULL &&
+				     dp->pos == TBL_DATA_HORIZ) ? 1 : 0;
+
 				if (tp->tbl.cols[ic].spacing)
-					tbl_direct_border(tp, fc +
+					tbl_direct_border(tp,
+					    BLEFT * lhori + BRIGHT * rhori +
 					    BUP * uvert + BDOWN * dvert, 1);
 
 				if (tp->enc == TERMENC_UTF8)
 					uvert = dvert = 0;
 
-				if (fc != 0) {
-					if (cp != NULL &&
-					    cp->pos == TBL_CELL_HORIZ)
-						fc = BHORIZ;
-					else if (cp != NULL &&
-					    cp->pos == TBL_CELL_DHORIZ)
-						fc = BHORIZ * 2;
-					else
-						fc = 0;
-				}
 				if (tp->tbl.cols[ic].spacing > 2 &&
-				    (uvert > 1 || dvert > 1 || fc != 0))
-					tbl_direct_border(tp, fc +
+				    (uvert > 1 || dvert > 1 || rhori))
+					tbl_direct_border(tp,
+					    BHORIZ * rhori +
 					    BUP * (uvert > 1) +
 					    BDOWN * (dvert > 1), 1);
 			}
@@ -528,20 +518,27 @@ term_tbl(struct termp *tp, const struct 
 		    (horiz || (IS_HORIZ(sp->layout->last) &&
 		     !IS_HORIZ(sp->prev->layout->last))))
 			uvert = sp->prev->layout->last->vert;
-		line = sp->pos == TBL_SPAN_DHORIZ ||
+		lhori = sp->pos == TBL_SPAN_DHORIZ ||
+		    (sp->last != NULL &&
+		     sp->last->pos == TBL_DATA_DHORIZ &&
+		     sp->last->layout->col + 1 == sp->opts->cols) ||
 		    (sp->layout->last->pos == TBL_CELL_DHORIZ &&
 		     sp->layout->last->col + 1 == sp->opts->cols) ? 2 :
 		    sp->pos == TBL_SPAN_HORIZ ||
+		    (sp->last != NULL &&
+		     sp->last->pos == TBL_DATA_HORIZ &&
+		     sp->last->layout->col + 1 == sp->opts->cols) ||
 		    (sp->layout->last->pos == TBL_CELL_HORIZ &&
 		     sp->layout->last->col + 1 == sp->opts->cols) ? 1 : 0;
-		fc = BUP * uvert + BDOWN * dvert + BLEFT * line;
+		fc = BUP * uvert + BDOWN * dvert + BLEFT * lhori;
 		if (uvert > 0 || dvert > 0 || (horiz && sp->opts->rvert)) {
 			if (horiz == 0 && (IS_HORIZ(sp->layout->last) == 0 ||
 			    sp->layout->last->col + 1 < sp->opts->cols)) {
 				tp->tcol++;
-				(*tp->advance)(tp,
-				    tp->tcol->offset > tp->viscol ?
-				    tp->tcol->offset - tp->viscol : 1);
+				do {
+					tbl_direct_border(tp,
+					    BHORIZ * lhori, 1);
+				} while (tp->viscol < tp->tcol->offset);
 			}
 			tbl_direct_border(tp, fc, 1);
 		}
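
The lhori and rhori expressions in this diff share one pattern: a horizontal line may come either from the layout cell or from the data cell, and a double line from either source wins over a single one. A minimal standalone sketch of that pattern, using simplified stand-in types (the sk_ names are hypothetical and much smaller than the real mandoc structures):

```c
#include <stddef.h>

/* Simplified stand-ins for the layout and data cell types. */
enum sk_cell_pos { SK_CELL_OTHER, SK_CELL_HORIZ, SK_CELL_DHORIZ };
enum sk_data_pos { SK_DATA_OTHER, SK_DATA_HORIZ, SK_DATA_DHORIZ };

struct sk_cell { enum sk_cell_pos pos; };
struct sk_data { enum sk_data_pos pos; };

/*
 * Return 2 for a double line, 1 for a single line, 0 for no line.
 * Either pointer may be NULL; double is checked before single.
 */
int
horiz_weight(const struct sk_cell *cp, const struct sk_data *dp)
{
	if ((cp != NULL && cp->pos == SK_CELL_DHORIZ) ||
	    (dp != NULL && dp->pos == SK_DATA_DHORIZ))
		return 2;
	if ((cp != NULL && cp->pos == SK_CELL_HORIZ) ||
	    (dp != NULL && dp->pos == SK_DATA_HORIZ))
		return 1;
	return 0;
}
```

The result is the kind of 0/1/2 weight that the diff multiplies by BLEFT, BRIGHT or BHORIZ before passing it to tbl_direct_border().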
