From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-01.scc.kit.edu (scc-mailout-kit-01.scc.kit.edu [129.13.231.81])
    by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id 3dbc5e5b
    for ; Mon, 16 Jul 2018 10:29:23 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
    by scc-mailout-kit-01.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
    (envelope-from ) id 1ff5RA-0001xs-AT; Mon, 16 Jul 2018 17:29:22 +0200
Received: from donnerwolke.usta.de ([172.24.96.3])
    by hekate.usta.de with esmtp (Exim 4.77)
    (envelope-from ) id 1ff5RA-0007HI-6f; Mon, 16 Jul 2018 17:29:20 +0200
Received: from athene.usta.de ([172.24.96.10])
    by donnerwolke.usta.de with esmtp (Exim 4.84_2)
    (envelope-from ) id 1ff5RB-0000mw-Ne; Mon, 16 Jul 2018 17:29:21 +0200
Received: from localhost (athene.usta.de [local])
    by athene.usta.de (OpenSMTPD) with ESMTPA id 3eaed3d9;
    Mon, 16 Jul 2018 17:29:20 +0200 (CEST)
Date: Mon, 16 Jul 2018 17:29:19 +0200
From: Ingo Schwarze
To: Pali Rohár
Cc: discuss@mandoc.bsd.lv
Subject: Re: Broken tables in HTML output
Message-ID: <20180716152919.GB85992@athene.usta.de>
References: <20180716110335.uusqzhscwdgp5qaa@pali>
X-Mailinglist: mandoc-discuss
Reply-To: discuss@mandoc.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180716110335.uusqzhscwdgp5qaa@pali>
User-Agent: Mutt/1.8.0 (2017-02-23)

Hi Pali,

i'm rearranging your text while quoting because you address
most of the issues twice.

Pali Rohar wrote on Mon, Jul 16, 2018 at 01:03:35PM +0200:

> It seems that mandoc is not able to format tables in HTML output
> correctly. Output is rather ugly which makes it less readable.

You have designed a very complicated table for testing, exercising
many advanced features of the tbl(7) language: horizontal and
vertical table spanning with a mixture of option, layout, and
data borders.
So it is not very surprising that some aspects of the
rendering are not perfect yet. In particular, the tbl(7) HTML
formatter is still relatively simplistic.

> First thing is that in HTML output is fully missing specified border
> even when in table section is box or | specified. This makes hard to
> understand meaning of some table when borders are important.
>
> Note that in ASCII output borders are rendered by '-', '+', '|' and '='
> characters, so seems that mandoc already support borders, just HTML
> generator is buggy or does not support them at all.

Indeed, the tbl(7) parser fully supports borders and the terminal
formatter supports them in most respects. But you are right that
they are not yet implemented in the HTML formatter. So i have added
an entry to the TODO file:

- implement table borders in HTML output
  pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
  loc * exist * algo ** size ** imp **

I'm not completely sure that fully implementing all aspects of table
borders in conjunction with cell spanning is possible in HTML output,
but i think it is likely that it can be done in some way. Priority
of the basic parts of the task (borders at all) seems moderate,
priority of the advanced parts low.

> cell spanning is broken,

More precisely, it is not implemented. I added an entry to the
TODO file, priority is moderate:

- implement cell spanning in HTML output
  pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
  loc * exist * algo ** size ** imp **

> text is not aligned at all.

I added an entry to the TODO file, priority is relatively high:

- implement horizontal and vertical alignment in HTML output
  pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
  loc * exist * algo * size * imp ***

> Basically nothing is working and it hard to read and understood
> what this table means.

That seems like a serious overstatement.
The parser is fully functional
even for all the very advanced features you are using, and even
though the HTML formatter is relatively simplistic compared to the
terminal formatter, i think the meaning can still be understood.

> So can you fix HTML generator in mandoc to produce better formatted
> HTML text.

Probably, i can fix this, but it may take some time. It will
certainly not be completed during the next few months. Most of the
features you are asking for are not quite easy to implement, and
there are many other things to do in mandoc, some of equal or higher
priority.

Definitely, some of the features you are asking for will appear more
quickly than others. For example, vertical alignment on the terminal
is particularly hard and may get delayed for a long time.

> Because now ASCII version is better then what produce HTML.

Absolutely. ASCII output is always better than HTML. We consider
terminal output by far the most important output mode in OpenBSD.
But lately, HTML has also been improved in many respects, and that
work will continue. The importance of HTML output is increasing.

> Second problem is with text alignment in table. When cell spanning is
> used (e.g. via s or via \^) then text is not correctly aligned and it
> looks "ugly". This problem is in both HTML and ASCII output.
[...]
> Alignment is wrong. "Name" should be centered and not on top.
> Same for "value2".

Horizontal alignment is correct in terminal output, i think you are
exclusively talking about vertical alignment here.

In the past, i chose to not implement vertical alignment in the
tbl(1) terminal formatter because it is hard to implement and
rather unimportant.
But it was not yet mentioned in the TODO file,
so i just added this entry:

- vertical centering in cells vertically spanned with ^
  pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
  loc * exist *** algo *** size ** imp *

> Third thing which I observed is that mandoc is in UTF-8 output does
> not use Unicode Box Drawing characters, but rather ugly ASCII.
[...]
> Output from mandoc in UTF-8 mode is ugly:
>
> $ mandoc -Tutf8 ./test.man
> +----------------------+---------------------------------+
> |    Very long text    |     Another very long text      |
> +---------------+------+------------------------+--------+
> |     Short     | shrt |          val1          |  val2  |
> +===============+======+========================+========+
> |Name           |  1   |         value1         | value2 |
> |               |  2   |         value3         |        |
> |               |  3   |         value4         |        |
> +---------------+------+------------------------+--------+
> |Name2          |  1   |           v1           |   v2   |
> +---------------+------+------------------------+--------+
> |Name3          |  1   |          vv1           |  vv2   |
> |               |  2   |                        |  vv4   |
> +---------------+------+------------------------+--------+
>
> It looks like it is no UTF-8, but rather ASCII.

Well, if you use non-ASCII characters in the input, you will see
that it is indeed UTF-8 output, not ASCII output.

The TODO file already has the following entry of moderate priority:

- use Unicode U+2500 to U+256C for table borders in tbl(7) -Tutf-8 output
  suggested by bentley@ Tue, 14 Oct 2014 04:10:55 -0600
  loc * exist ** algo * size * imp **

> Column for val1 is enormously wide

That's an important known issue listed in the TODO file:

- the "s" layout column specifier is used for placement of data
  into columns, but ignored during column width calculations
  synaptics(4) found by tedu@ Mon, 17 Aug 2015 21:17:42 -0400
  loc * exist ** algo *** size * imp **

Priority is only moderate because solving it will require
quite some work.

> and val2 is too short without any reason.
> Both val1 and val2 columns have members of same sizes...

The val2 column is not too short, all values fit into the column.
The reason it is wider with GNU tbl(1) is that the width of "Another
very long text" is greater than the width needed for both val* columns
together, so GNU tbl(1) distributes some additional width across both
val* columns. This extension is part of what makes column width
calculations non-trivial in the presence of spanned columns.

Thanks for all your input,
  Ingo


Log Message:
-----------
new todos, mostly from Pali Rohar, mostly tbl(7)

Modified Files:
--------------
    mandoc:
        TODO

Revision Data
-------------
Index: TODO
===================================================================
RCS file: /home/cvs/mandoc/mandoc/TODO,v
retrieving revision 1.254
retrieving revision 1.255
diff -LTODO -LTODO -u -p -r1.254 -r1.255
--- TODO
+++ TODO
@@ -180,6 +180,10 @@ are mere guesses, and some may be wrong.
   synaptics(4) found by tedu@ Mon, 17 Aug 2015 21:17:42 -0400
   loc * exist ** algo *** size * imp **
 
+- vertical centering in cells vertically spanned with ^
+  pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
+  loc * exist *** algo *** size ** imp *
+
 - support mdoc(7) and man(7) macros inside tbl(7) code;
   probably requires the parser reorg and letting tbl(7) use roff_node
   such that macro sets can mix;
@@ -198,6 +202,18 @@ are mere guesses, and some may be wrong.
   suggested by bentley@ Tue, 14 Oct 2014 04:10:55 -0600
   loc * exist ** algo * size * imp **
 
+- implement horizontal and vertical alignment in HTNL output
+  pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
+  loc * exist * algo * size * imp ***
+
+- implement cell spanning in HTML output
+  pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
+  loc * exist * algo ** size ** imp **
+
+- implement table borders in HTML output
+  pali dot rohar at gmail dot com 16 Jul 2018 13:03:35 +0200
+  loc * exist * algo ** size ** imp **
+
 --- missing eqn features -----------------------------------------------
 
 - In a matrix, break the output line after each matrix line.
@@ -228,6 +244,8 @@ are mere guesses, and some may be wrong.
   bentley@ Thu, 13 Jul 2017 23:14:20 -0600
 
 --- missing misc features ----------------------------------------------
+
+- man -ks 1,8 route; kn@ Jul 13, 2018 orally
 
 - italic correction (\/) in PostScript mode
   Werner LEMBERG on groff at gnu dot org Sun, 10 Nov 2013 12:47:46
--
To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv
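
Editorial illustration, not part of the original messages: the reply
above says that GNU tbl(1) distributes additional width across the
val* columns when a spanned cell is wider than the columns it covers.
The following is a minimal Python sketch of that general idea only. It
is not the algorithm used by GNU tbl(1) or mandoc; the function name,
the even split with the remainder going to the leftmost columns, and
the one-character separator are all assumptions made for the example.

```python
def distribute_span_widths(col_widths, spans, sep=1):
    """Widen columns so that every spanned cell fits (hypothetical helper).

    col_widths: widths required by the non-spanned cells of each column
    spans:      iterable of (first_col, last_col, text_width), inclusive
    sep:        assumed width of the separator between adjacent columns
    """
    widths = list(col_widths)
    for first, last, text_width in spans:
        ncols = last - first + 1
        # Room the spanned cell already has: the covered columns plus
        # the separators between them.
        available = sum(widths[first:last + 1]) + sep * (ncols - 1)
        extra = text_width - available
        if extra <= 0:
            continue  # the spanned text already fits
        share, remainder = divmod(extra, ncols)
        for i in range(first, last + 1):
            # Assumption: split evenly, leftmost columns take the remainder.
            widths[i] += share + (1 if i - first < remainder else 0)
    return widths


# Text widths taken from the table quoted in the thread:
# "Short"/"Name3" -> 5, "shrt" -> 4, "value1" -> 6, "value2" -> 6,
# "Very long text" -> 14 spanning columns 0-1,
# "Another very long text" -> 22 spanning columns 2-3.
widths = distribute_span_widths([5, 4, 6, 6], [(0, 1, 14), (2, 3, 22)])
print(widths)
```

With these inputs both val* columns grow, instead of one column
absorbing the whole width of the spanned heading. Spans are processed
in the order given, so overlapping spans could produce order-dependent
results in this sketch.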