From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-01.scc.kit.edu (scc-mailout-kit-01.scc.kit.edu [129.13.231.81])
	by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id 08c9e726
	for ; Wed, 28 Nov 2018 21:15:08 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
	by scc-mailout-kit-01.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(envelope-from ) id 1gSBr7-0002xz-Vw; Thu, 29 Nov 2018 03:15:08 +0100
Received: from donnerwolke.usta.de ([172.24.96.3])
	by hekate.usta.de with esmtp (Exim 4.77)
	(envelope-from ) id 1gSBr7-0001sx-Ep; Thu, 29 Nov 2018 03:15:05 +0100
Received: from athene.usta.de ([172.24.96.10])
	by donnerwolke.usta.de with esmtp (Exim 4.84_2)
	(envelope-from ) id 1gSBr7-0001e6-5y; Thu, 29 Nov 2018 03:15:05 +0100
Received: from localhost (athene.usta.de [local])
	by athene.usta.de (OpenSMTPD) with ESMTPA id ffa3c6d1;
	Thu, 29 Nov 2018 03:15:05 +0100 (CET)
Date: Thu, 29 Nov 2018 03:15:05 +0100
From: Ingo Schwarze
To: Pali Rohar
Cc: discuss@mandoc.bsd.lv
Subject: Re: Broken tables in HTML output
Message-ID: <20181129021505.GB28147@athene.usta.de>
References: <20180716110335.uusqzhscwdgp5qaa@pali>
	<20180716152919.GB85992@athene.usta.de>
	<20181126212728.GG82448@athene.usta.de>
X-Mailinglist: mandoc-discuss
Reply-To: discuss@mandoc.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20181126212728.GG82448@athene.usta.de>
User-Agent: Mutt/1.8.0 (2017-02-23)

Hi Pali,

i'm only keeping the unresolved items, deleting the resolved ones.

Ingo Schwarze wrote on Mon, Nov 26, 2018 at 10:27:28PM +0100:
> Ingo Schwarze wrote on Mon, Jul 16, 2018 at 05:29:19PM +0200:
>> Pali Rohar wrote on Mon, Jul 16, 2018 at 01:03:35PM +0200:

[...]
>>> Second problem is with text alignment in table. When cell spanning is
>>> used (e.g. via s or via \^) then text is not correctly aligned and it
>>> looks "ugly". This problem is in both HTML and ASCII output.
>> [...]
>>> Alignment is wrong. "Name" should be centered and not on top.
>>> Same for "value2".

> Vertical alignment in ASCII output is a tough problem and not likely
> to get fixed soon; the rest should be fixed now, i think.

Still open.

> [...]
>>> Third thing which I observed is that mandoc is in UTF-8 output does
>>> not use Unicode Box Drawing characters, but rather ugly ASCII.

> That is still open.

Now, it is resolved, see my earlier mail - which you already
replied to, thanks for that, i'll look at the feedback later.

>>> Column for val1 is enormously wide
>>
>> That's an important known issue listed in the TODO file:
>>
>> - the "s" layout column specifier is used for placement of data
>>   into columns, but ignored during column width calculations
>>   synaptics(4) found by tedu@ Mon, 17 Aug 2015 21:17:42 -0400
>>   loc * exist ** algo *** size * imp **
>>
>> Priority is only moderate because solving it will require
>> quite some work.

> That is still open, too.

That's now implemented, too, see the commit below.

So, according to my bookkeeping, the only item remaining is the
tough and relatively low-priority vertical alignment in terminal
output.

Yours,
  Ingo


Log Message:
-----------
Better handle automatic column width assignments in the presence of
horizontal spans, by implementing a moderately difficult iterative
algorithm. The benefit is that spans containing long text no longer
cause an excessive width of their starting column.

The result is likely not optimal, in particular in the presence of
many spans overlapping in complicated ways or when spans interact
with equalizing or maximizing columns. But i doubt the practical
usefulness of making this more complicated.

Issue originally reported in synaptics(4), which now looks better,
by tedu@ three years ago, and reminded by Pali Rohar this summer.
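
The iterative idea described in the log message can be illustrated
with a much simplified sketch. This is not the code committed to
out.c: it handles one span only, knows nothing about overlapping
groups, equalizing or maximizing columns, and the function name
widen_for_span() and its parameters are invented for illustration.
It repeatedly raises the narrowest columns under the span, never
past the next wider column, until the spanning text fits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of mandoc.
 * Widen the columns start..end (inclusive) until their total
 * width, including "spacing" between adjacent columns, reaches
 * "wanted".  The narrowest columns always grow first.
 */
void
widen_for_span(size_t *width, int start, int end,
    size_t spacing, size_t wanted)
{
	size_t	 total, missing, min1, min2, step;
	int	 i, nmin;

	for (;;) {
		/* How much width does the span still lack? */
		total = (size_t)(end - start) * spacing;
		for (i = start; i <= end; i++)
			total += width[i];
		if (total >= wanted)
			return;
		missing = wanted - total;

		/* Find the smallest and second smallest width. */
		min1 = min2 = SIZE_MAX;
		nmin = 0;
		for (i = start; i <= end; i++) {
			if (width[i] < min1) {
				min2 = min1;
				min1 = width[i];
				nmin = 1;
			} else if (width[i] == min1)
				nmin++;
			else if (width[i] < min2)
				min2 = width[i];
		}

		/*
		 * Share the missing width among the narrowest
		 * columns, rounding up, but do not grow them
		 * beyond the second smallest width in one step.
		 */
		step = (missing - 1) / nmin + 1;
		if (min2 != SIZE_MAX && step > min2 - min1)
			step = min2 - min1;
		for (i = start; i <= end; i++)
			if (width[i] == min1)
				width[i] += step;
	}
}
```

For example, with column widths {2, 10, 4}, a spacing of 3, and a
span across all three columns wanting 30, the first pass raises the
first column to 4, the second pass raises both narrow columns to 7,
and the result is {7, 10, 7}. The wide middle column is left alone,
which is the effect the log message describes for the starting
column of a span.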

Modified Files:
--------------
    mandoc:
        TODO
        out.c

Revision Data
-------------
Index: TODO
===================================================================
RCS file: /home/cvs/mandoc/mandoc/TODO,v
retrieving revision 1.279
retrieving revision 1.280
diff -LTODO -LTODO -u -p -r1.279 -r1.280
--- TODO
+++ TODO
@@ -168,11 +168,6 @@ are mere guesses, and some may be wrong.
 
 --- missing tbl features -----------------------------------------------
 
-- the "s" layout column specifier is used for placement of data
-  into columns, but ignored during column width calculations
-  synaptics(4) found by tedu@ Mon, 17 Aug 2015 21:17:42 -0400
-  loc * exist ** algo *** size * imp **
-
 - vertical centering in cells vertically spanned with ^
   pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
   loc * exist *** algo *** size ** imp *
Index: out.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/out.c,v
retrieving revision 1.74
retrieving revision 1.75
diff -Lout.c -Lout.c -u -p -r1.74 -r1.75
--- out.c
+++ out.c
@@ -30,12 +30,19 @@
 #include "mandoc.h"
 #include "out.h"
 
-static	void	tblcalc_data(struct rofftbl *, struct roffcol *,
+struct	tbl_colgroup {
+	struct tbl_colgroup	*next;
+	size_t			 wanted;
+	int			 startcol;
+	int			 endcol;
+};
+
+static	size_t	tblcalc_data(struct rofftbl *, struct roffcol *,
 			const struct tbl_opts *,
 			const struct tbl_dat *, size_t);
-static	void	tblcalc_literal(struct rofftbl *, struct roffcol *,
+static	size_t	tblcalc_literal(struct rofftbl *, struct roffcol *,
 			const struct tbl_dat *, size_t);
-static	void	tblcalc_number(struct rofftbl *, struct roffcol *,
+static	size_t	tblcalc_number(struct rofftbl *, struct roffcol *,
 			const struct tbl_opts *,
 			const struct tbl_dat *);
 
@@ -104,16 +111,18 @@ a2roffsu(const char *src, struct roffsu
  * used for the actual width calculations.
  */
 void
-tblcalc(struct rofftbl *tbl, const struct tbl_span *sp,
+tblcalc(struct rofftbl *tbl, const struct tbl_span *sp_first,
     size_t offset, size_t rmargin)
 {
 	struct roffsu		 su;
 	const struct tbl_opts	*opts;
+	const struct tbl_span	*sp;
 	const struct tbl_dat	*dp;
 	struct roffcol		*col;
-	size_t			 ewidth, xwidth;
-	int			 hspans;
-	int			 icol, maxcol, necol, nxcol, quirkcol;
+	struct tbl_colgroup	*first_group, **gp, *g;
+	size_t			*colwidth;
+	size_t			 ewidth, min1, min2, wanted, width, xwidth;
+	int			 done, icol, maxcol, necol, nxcol, quirkcol;
 
 	/*
 	 * Allocate the master column specifiers. These will hold the
@@ -121,33 +130,34 @@ tblcalc(struct rofftbl *tbl, const struc
 	 * must be freed and nullified by the caller.
 	 */
 
-	assert(NULL == tbl->cols);
-	tbl->cols = mandoc_calloc((size_t)sp->opts->cols,
+	assert(tbl->cols == NULL);
+	tbl->cols = mandoc_calloc((size_t)sp_first->opts->cols,
 	    sizeof(struct roffcol));
-	opts = sp->opts;
+	opts = sp_first->opts;
 
-	for (maxcol = -1; sp; sp = sp->next) {
-		if (TBL_SPAN_DATA != sp->pos)
+	maxcol = -1;
+	first_group = NULL;
+	for (sp = sp_first; sp != NULL; sp = sp->next) {
+		if (sp->pos != TBL_SPAN_DATA)
 			continue;
-		hspans = 1;
+
 		/*
 		 * Account for the data cells in the layout, matching it
 		 * to data cells in the data section.
 		 */
-		for (dp = sp->first; dp; dp = dp->next) {
-			/* Do not used spanned cells in the calculation. */
-			if (0 < --hspans)
-				continue;
-			hspans = dp->hspans;
-			if (1 < hspans)
-				continue;
+
+		gp = &first_group;
+		for (dp = sp->first; dp != NULL; dp = dp->next) {
 			icol = dp->layout->col;
-			while (maxcol < icol)
+			while (icol > maxcol)
 				tbl->cols[++maxcol].spacing = SIZE_MAX;
 			col = tbl->cols + icol;
 			col->flags |= dp->layout->flags;
 			if (dp->layout->flags & TBL_CELL_WIGN)
 				continue;
+
+			/* Handle explicit width specifications. */
+
 			if (dp->layout->wstr != NULL &&
 			    dp->layout->width == 0 &&
 			    a2roffsu(dp->layout->wstr, &su, SCALE_EN)
@@ -160,15 +170,164 @@ tblcalc(struct rofftbl *tbl, const struc
 			    (col->spacing == SIZE_MAX ||
 			     col->spacing < dp->layout->spacing))
 				col->spacing = dp->layout->spacing;
-			tblcalc_data(tbl, col, opts, dp,
+
+			/*
+			 * Calculate an automatic width.
+			 * Except for spanning cells, apply it.
+			 */
+
+			width = tblcalc_data(tbl,
+			    dp->hspans == 0 ? col : NULL,
+			    opts, dp,
 			    dp->block == 0 ? 0 :
 			    dp->layout->width ? dp->layout->width :
 			    rmargin ? (rmargin + sp->opts->cols / 2)
 			    / (sp->opts->cols + 1) : 0);
+			if (dp->hspans == 0)
+				continue;
+
+			/*
+			 * Build an ordered, singly linked list
+			 * of all groups of columns joined by spans,
+			 * recording the minimum width for each group.
+			 */
+
+			while (*gp != NULL && ((*gp)->startcol < icol ||
+			    (*gp)->endcol < icol + dp->hspans))
+				gp = &(*gp)->next;
+			if (*gp == NULL || (*gp)->startcol > icol ||
+			    (*gp)->endcol > icol + dp->hspans) {
+				g = mandoc_malloc(sizeof(*g));
+				g->next = *gp;
+				g->wanted = width;
+				g->startcol = icol;
+				g->endcol = icol + dp->hspans;
+				*gp = g;
+			} else if ((*gp)->wanted < width)
+				(*gp)->wanted = width;
 		}
 	}
 
 	/*
+	 * Column spacings are needed for span width calculations,
+	 * so set the default values now.
+	 */
+
+	for (icol = 0; icol <= maxcol; icol++)
+		if (tbl->cols[icol].spacing == SIZE_MAX || icol == maxcol)
+			tbl->cols[icol].spacing = 3;
+
+	/*
+	 * Replace the minimum widths with the missing widths,
+	 * and dismiss groups that are already wide enough.
+	 */
+
+	gp = &first_group;
+	while ((g = *gp) != NULL) {
+		done = 0;
+		for (icol = g->startcol; icol <= g->endcol; icol++) {
+			width = tbl->cols[icol].width;
+			if (icol < g->endcol)
+				width += tbl->cols[icol].spacing;
+			if (g->wanted <= width) {
+				done = 1;
+				break;
+			} else
+				(*gp)->wanted -= width;
+		}
+		if (done) {
+			*gp = g->next;
+			free(g);
+		} else
+			gp = &(*gp)->next;
+	}
+
+	colwidth = mandoc_reallocarray(NULL, maxcol + 1, sizeof(*colwidth));
+	while (first_group != NULL) {
+
+		/*
+		 * Rebuild the array of the widths of all columns
+		 * participating in spans that require expansion.
+		 */
+
+		for (icol = 0; icol <= maxcol; icol++)
+			colwidth[icol] = SIZE_MAX;
+		for (g = first_group; g != NULL; g = g->next)
+			for (icol = g->startcol; icol <= g->endcol; icol++)
+				colwidth[icol] = tbl->cols[icol].width;
+
+		/*
+		 * Find the smallest and second smallest column width
+		 * among the columns which may need expamsion.
+		 */
+
+		min1 = min2 = SIZE_MAX;
+		for (icol = 0; icol <= maxcol; icol++) {
+			if (min1 > colwidth[icol]) {
+				min2 = min1;
+				min1 = colwidth[icol];
+			} else if (min1 < colwidth[icol] &&
+			    min2 > colwidth[icol])
+				min2 = colwidth[icol];
+		}
+
+		/*
+		 * Find the minimum wanted width
+		 * for any one of the narrowest columns,
+		 * and mark the columns wanting that width.
+		 */
+
+		wanted = min2;
+		for (g = first_group; g != NULL; g = g->next) {
+			necol = 0;
+			for (icol = g->startcol; icol <= g->endcol; icol++)
+				if (tbl->cols[icol].width == min1)
+					necol++;
+			if (necol == 0)
+				continue;
+			width = min1 + (g->wanted - 1) / necol + 1;
+			if (width > min2)
+				width = min2;
+			if (wanted > width)
+				wanted = width;
+			for (icol = g->startcol; icol <= g->endcol; icol++)
+				if (colwidth[icol] == min1 ||
+				    (colwidth[icol] < min2 &&
+				     colwidth[icol] > width))
+					colwidth[icol] = width;
+		}
+
+		/* Record the effect of the widening on the group list. */
+
+		gp = &first_group;
+		while ((g = *gp) != NULL) {
+			done = 0;
+			for (icol = g->startcol; icol <= g->endcol; icol++) {
+				if (colwidth[icol] != wanted ||
+				    tbl->cols[icol].width == wanted)
+					continue;
+				if (g->wanted <= wanted - min1) {
+					done = 1;
+					break;
+				}
+				g->wanted -= wanted - min1;
+			}
+			if (done) {
+				*gp = g->next;
+				free(g);
+			} else
+				gp = &(*gp)->next;
+		}
+
+		/* Record the effect of the widening on the columns. */
+
+		for (icol = 0; icol <= maxcol; icol++)
+			if (colwidth[icol] == wanted)
+				tbl->cols[icol].width = wanted;
+	}
+	free(colwidth);
+
+	/*
 	 * Align numbers with text.
 	 * Count columns to equalize and columns to maximize.
 	 * Find maximum width of the columns to equalize.
@@ -183,8 +342,6 @@ tblcalc(struct rofftbl *tbl, const struc
 			col->decimal += (col->width - col->nwidth) / 2;
 		else
 			col->width = col->nwidth;
-		if (col->spacing == SIZE_MAX || icol == maxcol)
-			col->spacing = 3;
 		if (col->flags & TBL_CELL_EQUAL) {
 			necol++;
 			if (ewidth < col->width)
@@ -257,7 +414,7 @@ tblcalc(struct rofftbl *tbl, const struc
 	}
 }
 
-static void
+static size_t
 tblcalc_data(struct rofftbl *tbl, struct roffcol *col,
     const struct tbl_opts *opts, const struct tbl_dat *dp, size_t mw)
 {
@@ -269,26 +426,24 @@ tblcalc_data(struct rofftbl *tbl, struct
 	case TBL_CELL_HORIZ:
 	case TBL_CELL_DHORIZ:
 		sz = (*tbl->len)(1, tbl->arg);
-		if (col->width < sz)
+		if (col != NULL && col->width < sz)
 			col->width = sz;
-		break;
+		return sz;
 	case TBL_CELL_LONG:
 	case TBL_CELL_CENTRE:
 	case TBL_CELL_LEFT:
 	case TBL_CELL_RIGHT:
-		tblcalc_literal(tbl, col, dp, mw);
-		break;
+		return tblcalc_literal(tbl, col, dp, mw);
 	case TBL_CELL_NUMBER:
-		tblcalc_number(tbl, col, opts, dp);
-		break;
+		return tblcalc_number(tbl, col, opts, dp);
 	case TBL_CELL_DOWN:
-		break;
+		return 0;
 	default:
 		abort();
 	}
 }
 
-static void
+static size_t
 tblcalc_literal(struct rofftbl *tbl, struct roffcol *col,
     const struct tbl_dat *dp, size_t mw)
 {
@@ -297,11 +452,12 @@ tblcalc_literal(struct rofftbl *tbl, str
 	char		*end;	/* End of the current line. */
 	size_t		 lsz;	/* Length of the current line. */
 	size_t		 wsz;	/* Length of the current word. */
+	size_t		 msz;	/* Length of the longest line. */
 
 	if (dp->string == NULL || *dp->string == '\0')
-		return;
+		return 0;
 	str = mw ? mandoc_strdup(dp->string) : dp->string;
-	lsz = 0;
+	msz = lsz = 0;
 	for (beg = str; beg != NULL && *beg != '\0'; beg = end) {
 		end = mw ? strchr(beg, ' ') : NULL;
 		if (end != NULL) {
@@ -314,14 +470,17 @@ tblcalc_literal(struct rofftbl *tbl, str
 			lsz += 1 + wsz;
 		else
 			lsz = wsz;
-		if (col->width < lsz)
-			col->width = lsz;
+		if (msz < lsz)
+			msz = lsz;
 	}
 	if (mw)
 		free((void *)str);
+	if (col != NULL && col->width < msz)
+		col->width = msz;
+	return msz;
 }
 
-static void
+static size_t
 tblcalc_number(struct rofftbl *tbl, struct roffcol *col,
     const struct tbl_opts *opts, const struct tbl_dat *dp)
 {
@@ -330,7 +489,11 @@ tblcalc_number(struct rofftbl *tbl, stru
 	char		 buf[2];
 
 	if (dp->string == NULL || *dp->string == '\0')
-		return;
+		return 0;
+
+	totsz = (*tbl->slen)(dp->string, tbl->arg);
+	if (col == NULL)
+		return totsz;
 
 	/*
 	 * Find the last digit and
@@ -353,11 +516,10 @@ tblcalc_number(struct rofftbl *tbl, stru
 
 	/* Not a number, treat as a literal string. */
 
-	totsz = (*tbl->slen)(dp->string, tbl->arg);
 	if (lastdigit == NULL) {
-		if (col->width < totsz)
+		if (col != NULL && col->width < totsz)
 			col->width = totsz;
-		return;
+		return totsz;
 	}
 
 	/* Measure the width of the integer part. */
@@ -387,4 +549,5 @@ tblcalc_number(struct rofftbl *tbl, stru
 
 	if (totsz > col->nwidth)
 		col->nwidth = totsz;
+	return totsz;
 }
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv