discuss@mandoc.bsd.lv
* HTML5
@ 2014-08-09 23:33 Kristaps Dzonsons
  2014-08-10  2:23 ` HTML5 Ingo Schwarze
  2014-08-10  4:33 ` HTML5 Anthony J. Bentley
  0 siblings, 2 replies; 7+ messages in thread
From: Kristaps Dzonsons @ 2014-08-09 23:33 UTC (permalink / raw)
  To: discuss

[-- Attachment #1: Type: text/plain, Size: 968 bytes --]

Hi,

Most everybody supports HTML5 these days.  Do we really need to knock 
around with XHTML and HTML-4.01?  Does anybody have a pressing need to 
use one or the other?

The enclosed ten-minute patch adds HTML5 support and makes it the 
default for both modes.  It also adds a default CSS style (if one isn't 
passed on the command line) identical to OpenBSD's man.cgi CSS.  I don't 
like this--I think online manpages should take more advantage of 
online-ness--but I'm just putting up the bikeshed so it's ready to 
paint.  This patch also rips out some chunks (embedded widths, etc.) 
that didn't HTML5 validate, so be warned...

If it looks useful, we could rip out a decent chunk of code that 
switches between the two existing modes.  (Including some attributes and 
elements in there.)  (Yes, I'd document it better, if useful, and 
probably tidy up the *html.c files as well.)

(Note this doesn't include the earlier patch for SCALE_BU.)

Thoughts?

Kristaps

[-- Attachment #2: html5.diff --]
[-- Type: text/plain, Size: 10557 bytes --]

Index: html.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.c,v
retrieving revision 1.159
diff -u -p -r1.159 html.c
--- html.c	23 Jul 2014 15:00:08 -0000	1.159
+++ html.c	9 Aug 2014 23:31:38 -0000
@@ -70,12 +70,13 @@ static	const struct htmldata htmltags[TA
 	{"dt",		HTML_CLRLINE}, /* TAG_DT */
 	{"dd",		HTML_CLRLINE}, /* TAG_DD */
 	{"blockquote",	HTML_CLRLINE}, /* TAG_BLOCKQUOTE */
-	{"p",		HTML_CLRLINE | HTML_NOSTACK | HTML_AUTOCLOSE}, /* TAG_P */
+	{"p",		HTML_CLRLINE | HTML_NOSTACK }, /* TAG_P */
 	{"pre",		HTML_CLRLINE }, /* TAG_PRE */
 	{"b",		0 }, /* TAG_B */
 	{"i",		0 }, /* TAG_I */
 	{"code",	0 }, /* TAG_CODE */
 	{"small",	0 }, /* TAG_SMALL */
+	{"style",	HTML_CLRLINE }, /* TAG_STYLE */
 };
 
 static	const char	*const htmlattrs[ATTR_MAX] = {
@@ -93,6 +94,7 @@ static	const char	*const htmlattrs[ATTR_
 	"summary", /* ATTR_SUMMARY */
 	"align", /* ATTR_ALIGN */
 	"colspan", /* ATTR_COLSPAN */
+	"charset", /* ATTR_CHARSET */
 };
 
 static	const char	*const roffscales[SCALE_MAX] = {
@@ -161,14 +163,14 @@ void *
 html_alloc(char *outopts)
 {
 
-	return(ml_alloc(outopts, HTML_HTML_4_01_STRICT));
+	return(ml_alloc(outopts, HTML_HTML5));
 }
 
 void *
 xhtml_alloc(char *outopts)
 {
 
-	return(ml_alloc(outopts, HTML_XHTML_1_0_STRICT));
+	return(ml_alloc(outopts, HTML_HTML5));
 }
 
 void
@@ -194,20 +196,26 @@ void
 print_gen_head(struct html *h)
 {
 	struct htmlpair	 tag[4];
+	struct tag	*t;
 
-	tag[0].key = ATTR_HTTPEQUIV;
-	tag[0].val = "Content-Type";
-	tag[1].key = ATTR_CONTENT;
-	tag[1].val = "text/html; charset=utf-8";
-	print_otag(h, TAG_META, 2, tag);
-
-	tag[0].key = ATTR_NAME;
-	tag[0].val = "resource-type";
-	tag[1].key = ATTR_CONTENT;
-	tag[1].val = "document";
-	print_otag(h, TAG_META, 2, tag);
+	if (HTML_HTML5 == h->type) {
+		tag[0].key = ATTR_CHARSET;
+		tag[0].val = "utf-8";
+		print_otag(h, TAG_META, 1, tag);
+	} else {
+		tag[0].key = ATTR_HTTPEQUIV;
+		tag[0].val = "Content-Type";
+		tag[1].key = ATTR_CONTENT;
+		tag[1].val = "text/html; charset=utf-8";
+		print_otag(h, TAG_META, 2, tag);
+		tag[0].key = ATTR_NAME;
+		tag[0].val = "resource-type";
+		tag[1].key = ATTR_CONTENT;
+		tag[1].val = "document";
+		print_otag(h, TAG_META, 2, tag);
+	}
 
-	if (h->style) {
+	if (NULL != h->style) {
 		tag[0].key = ATTR_REL;
 		tag[0].val = "stylesheet";
 		tag[1].key = ATTR_HREF;
@@ -217,7 +225,66 @@ print_gen_head(struct html *h)
 		tag[3].key = ATTR_MEDIA;
 		tag[3].val = "all";
 		print_otag(h, TAG_LINK, 4, tag);
+		return;
 	}
+
+	/*
+	 * If we don't have a stylesheet, print some nice defaults here.
+	 * This looks like a terminal.
+	 */
+	if (HTML_HTML5 != h->type) {
+		PAIR_INIT(&tag[0], ATTR_TYPE, "text/css");
+		t = print_otag(h, TAG_STYLE, 1, tag);
+	} else 
+		t = print_otag(h, TAG_STYLE, 0, NULL);
+
+	print_text(h, "div.mandoc { min-width: 102ex; "
+		"width: 102ex; font-family: monospace; }\n");
+	print_text(h, "div.mandoc h1, div.mandoc h2 "
+		"{ margin-bottom: 0ex; font-size: inherit; }\n");
+	print_text(h, "div.mandoc h1 { margin-left: -4ex; }\n");
+	print_text(h, "div.mandoc h2 { margin-left: -2ex; }\n");
+       	print_text(h, "div.mandoc div.section { margin: 1ex 5ex; }\n");
+       	print_text(h, "div.mandoc table { width: 100%; }\n");
+	print_text(h, "div.mandoc td { vertical-align: top; }\n");
+       	print_text(h, "div.mandoc blockquote { margin: 0 5ex; }\n");
+       	print_text(h, "div.mandoc table.head.foot "
+		"td:last-child { text-align: right; }\n");
+       	print_text(h, "div.mandoc table.foot td { width: 50%; }\n");
+       	print_text(h, "div.mandoc table.head td { width: 10%; }\n");
+       	print_text(h, "div.mandoc table.head td.head-vol "
+		"{ width: 80%; text-align: center; }\n");
+       	print_text(h, "div.mandoc .emph { font-style: italic; "
+		"font-weight: normal; }\n");
+       	print_text(h, "div.mandoc .symb { font-style: normal; "
+		"font-weight: bold; }\n");
+       	print_text(h, "div.mandoc .symb { font-style: normal; "
+		"font-weight: normal; }\n");
+       	print_text(h, "div.mandoc i.addr.arg.farg.file.ftype."
+		"link-sec.ref-book.ref-issue.ref-jrnl "
+		"{ font-weight: normal; }\n");
+       	print_text(h, "div.mandoc b.cmd.config.diag.flag.fname."
+		"includes.macro.name.var { font-style: normal; }\n");
+       	print_text(h, "div.mandoc span.ref-title "
+		"{ text-decoration: underline; }\n");
+       	print_text(h, "div.mandoc span.type "
+		"{ font-style: italic; font-weight: normal; }\n");
+       	print_text(h, "div.mandoc dd.list-ohang "
+		"{ margin-left: 0ex; }\n");
+	print_text(h, "div.mandoc ul.list-bul "
+		"{ list-style-type: disc; padding-left: 1em; }\n");
+	print_text(h, "div.mandoc ul.list-dash "
+		"{ list-style-type: none; padding-left: 0em; }\n");
+	print_text(h, "div.mandoc li.list-dash:before "
+		"{ content: \'\\\\2014  \'; }\n");
+	print_text(h, "div.mandoc ul.list-hyph "
+		"{ list-style-type: none; padding-left: 0em; }\n");
+	print_text(h, "div.mandoc li.list-hyph:before "
+		"{ content: \'\\\\2013  \'; }\n");
+	print_text(h, "div.mandoc ul.list-item "
+		"{ list-style-type: none; padding-left: 0em; }\n");
+	print_text(h, "div.mandoc ol.list-enum { padding-left: 2em; }\n");
+	print_tagq(h, t);
 }
 
 static void
@@ -508,6 +575,7 @@ print_otag(struct html *h, enum htmltag 
 	if (HTML_AUTOCLOSE & htmltags[tag].flags)
 		switch (h->type) {
 		case HTML_XHTML_1_0_STRICT:
+		case HTML_HTML5:
 			putchar('/');
 			break;
 		default:
@@ -543,6 +611,9 @@ print_gen_decls(struct html *h)
 	const char	*name;
 
 	switch (h->type) {
+	case HTML_HTML5:
+		printf("<!DOCTYPE html>\n");
+		return;
 	case HTML_HTML_4_01_STRICT:
 		name = "HTML";
 		doctype = "-//W3C//DTD HTML 4.01//EN";
Index: html.h
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.h,v
retrieving revision 1.51
diff -u -p -r1.51 html.h
--- html.h	20 Apr 2014 16:46:04 -0000	1.51
+++ html.h	9 Aug 2014 23:31:38 -0000
@@ -50,6 +50,7 @@ enum	htmltag {
 	TAG_I,
 	TAG_CODE,
 	TAG_SMALL,
+	TAG_STYLE,
 	TAG_MAX
 };
 
@@ -68,6 +69,7 @@ enum	htmlattr {
 	ATTR_SUMMARY,
 	ATTR_ALIGN,
 	ATTR_COLSPAN,
+	ATTR_CHARSET,
 	ATTR_MAX
 };
 
@@ -107,7 +109,8 @@ struct	htmlpair {
 
 enum	htmltype {
 	HTML_HTML_4_01_STRICT,
-	HTML_XHTML_1_0_STRICT
+	HTML_XHTML_1_0_STRICT,
+	HTML_HTML5
 };
 
 struct	html {
Index: man_html.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/man_html.c,v
retrieving revision 1.96
diff -u -p -r1.96 man_html.c
--- man_html.c	1 Aug 2014 19:25:52 -0000	1.96
+++ man_html.c	9 Aug 2014 23:31:38 -0000
@@ -309,14 +309,8 @@ man_root_pre(MAN_ARGS)
 	assert(man->msec);
 	mandoc_asprintf(&title, "%s(%s)", man->title, man->msec);
 
-	PAIR_SUMMARY_INIT(&tag[0], "Document Header");
-	PAIR_CLASS_INIT(&tag[1], "head");
-	PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
-	t = print_otag(h, TAG_TABLE, 3, tag);
-	PAIR_INIT(&tag[0], ATTR_WIDTH, "30%");
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
+	PAIR_CLASS_INIT(&tag[0], "head");
+	t = print_otag(h, TAG_TABLE, 1, tag);
 
 	print_otag(h, TAG_TBODY, 0, NULL);
 
@@ -328,15 +322,13 @@ man_root_pre(MAN_ARGS)
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "head-vol");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "center");
-	print_otag(h, TAG_TD, 2, tag);
+	print_otag(h, TAG_TD, 1, tag);
 	if (NULL != man->vol)
 		print_text(h, man->vol);
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "head-rtitle");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
-	print_otag(h, TAG_TD, 2, tag);
+	print_otag(h, TAG_TD, 1, tag);
 	print_text(h, title);
 	print_tagq(h, t);
 	free(title);
@@ -348,13 +340,8 @@ man_root_post(MAN_ARGS)
 	struct htmlpair	 tag[3];
 	struct tag	*t, *tt;
 
-	PAIR_SUMMARY_INIT(&tag[0], "Document Footer");
-	PAIR_CLASS_INIT(&tag[1], "foot");
-	PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
-	t = print_otag(h, TAG_TABLE, 3, tag);
-	PAIR_INIT(&tag[0], ATTR_WIDTH, "50%");
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
+	PAIR_CLASS_INIT(&tag[0], "foot");
+	t = print_otag(h, TAG_TABLE, 1, tag);
 
 	tt = print_otag(h, TAG_TR, 0, NULL);
 
@@ -366,8 +353,7 @@ man_root_post(MAN_ARGS)
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "foot-os");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
-	print_otag(h, TAG_TD, 2, tag);
+	print_otag(h, TAG_TD, 1, tag);
 
 	if (man->source)
 		print_text(h, man->source);
Index: mdoc_html.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mdoc_html.c,v
retrieving revision 1.195
diff -u -p -r1.195 mdoc_html.c
--- mdoc_html.c	6 Aug 2014 15:09:05 -0000	1.195
+++ mdoc_html.c	9 Aug 2014 23:31:38 -0000
@@ -489,13 +489,8 @@ mdoc_root_post(MDOC_ARGS)
 	struct htmlpair	 tag[3];
 	struct tag	*t, *tt;
 
-	PAIR_SUMMARY_INIT(&tag[0], "Document Footer");
-	PAIR_CLASS_INIT(&tag[1], "foot");
-	PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
-	t = print_otag(h, TAG_TABLE, 3, tag);
-	PAIR_INIT(&tag[0], ATTR_WIDTH, "50%");
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
+	PAIR_CLASS_INIT(&tag[0], "foot");
+	t = print_otag(h, TAG_TABLE, 1, tag);
 
 	print_otag(h, TAG_TBODY, 0, NULL);
 
@@ -507,8 +502,7 @@ mdoc_root_post(MDOC_ARGS)
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "foot-os");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
-	print_otag(h, TAG_TD, 2, tag);
+	print_otag(h, TAG_TD, 1, tag);
 	print_text(h, meta->os);
 	print_tagq(h, t);
 }
@@ -532,14 +526,8 @@ mdoc_root_pre(MDOC_ARGS)
 		mandoc_asprintf(&title, "%s(%s)",
 		    meta->title, meta->msec);
 
-	PAIR_SUMMARY_INIT(&tag[0], "Document Header");
-	PAIR_CLASS_INIT(&tag[1], "head");
-	PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
-	t = print_otag(h, TAG_TABLE, 3, tag);
-	PAIR_INIT(&tag[0], ATTR_WIDTH, "30%");
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
+	PAIR_CLASS_INIT(&tag[0], "head");
+	t = print_otag(h, TAG_TABLE, 1, tag);
 
 	print_otag(h, TAG_TBODY, 0, NULL);
 
@@ -551,14 +539,12 @@ mdoc_root_pre(MDOC_ARGS)
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "head-vol");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "center");
-	print_otag(h, TAG_TD, 2, tag);
+	print_otag(h, TAG_TD, 1, tag);
 	print_text(h, volume);
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "head-rtitle");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
-	print_otag(h, TAG_TD, 2, tag);
+	print_otag(h, TAG_TD, 1, tag);
 	print_text(h, title);
 	print_tagq(h, t);
 

* Re: HTML5
  2014-08-09 23:33 HTML5 Kristaps Dzonsons
@ 2014-08-10  2:23 ` Ingo Schwarze
  2014-08-10 11:39   ` HTML5 Kristaps Dzonsons
  2014-08-10  4:33 ` HTML5 Anthony J. Bentley
  1 sibling, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2014-08-10  2:23 UTC (permalink / raw)
  To: Kristaps Dzonsons; +Cc: discuss

Hi Kristaps,

Kristaps Dzonsons wrote on Sun, Aug 10, 2014 at 01:33:49AM +0200:

> Most everybody supports HTML5 these days.  Do we really need to
> knock around with XHTML and HTML-4.01?

Let me put it this way:  We should not use any fancy features.
If somebody has a browser that doesn't know about HTML versions
and just assumes everything is HTML 4 without looking at any
document types and stuff, then the pages should render cleanly.
Whether or not they validate as HTML 5 seems irrelevant to me.

Supporting multiple HTML variants seems pointless to me and just
complicates the code.

> Does anybody have a pressing need to use one or the other?

Wrong question, i'd say.  Let's use the smallest common denominator
and be done with it.  I certainly don't want any HTML 5 only features
or syntax getting used.  If that is impossible, then i'd rather
stick with HTML 4 than switch to HTML 5; HTML 4 is usable for all
real work and HTML 5 just looks like bloatware in general.  However,
your patch makes it look easy to have a document that is both valid
HTML 4 and valid HTML 5, so there probably isn't an issue here.

I do like the cutting down on meta-tags in your patch.  Even if
we switch to HTML 5 and validate against that, we should continue
to validate against HTML 4.01 as well, i think, to make sure
no HTML 5 only stuff sneaks in.

> The enclosed ten-minute patch adds HTML5 support and makes it the
> default for both modes.

At first sight the patch looks harmless, it doesn't appear to
change anything structural.

> It also adds a default CSS style (if one isn't passed on the command
> line) identical to OpenBSD's man.cgi CSS.

Gah.  Can't we just make up our mind and ship one single CSS file
covering all we need?  The proliferation of CSS files in our tree
is disgusting.  I don't see the point in having more than one,
and if you add another, then we have *four* of them.

Also, i'm not sure about the embedded style sheet.  That should
certainly be kept minimal.  Is it needed at all?  If yes, why?
If people set their browser to not download and use CSS, then
they shouldn't get CSS but default rendering, i think.

> If it looks useful, we could rip out a decent chunk of code that
> switches between the two existing modes.  (Including some attributes
> and elements in there.)  (Yes, I'd document it better, if useful,
> and probably tidy up the *html.c files as well.)

Ideally, i'd like to have the switch and the garbage collection
in separate commits - and please don't commit before the release
is out of the door...  :-)

> (Note this doesn't include the earlier patch for SCALE_BU.)

Yes, that should also go in after the release, not before, i think,
and we should decide on the umpteen other broken width calculations
i found in that area.  ;-)

> Thoughts?

Thumbs up, in general.
  Ingo
--
 To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv

* Re: HTML5
  2014-08-09 23:33 HTML5 Kristaps Dzonsons
  2014-08-10  2:23 ` HTML5 Ingo Schwarze
@ 2014-08-10  4:33 ` Anthony J. Bentley
  2014-08-10 15:17   ` HTML5 Ingo Schwarze
  1 sibling, 1 reply; 7+ messages in thread
From: Anthony J. Bentley @ 2014-08-10  4:33 UTC (permalink / raw)
  To: discuss

Kristaps Dzonsons writes:
> Hi,
> 
> Most everybody supports HTML5 these days.  Do we really need to knock 
> around with XHTML and HTML-4.01?  Does anybody have a pressing need to 
> use one or the other?

I actively use HTML 4.01. I don't have a real objection against emitting
a HTML5 doctype, but I do think that it's a good idea for HTML output of
mandoc to validate against both HTML 4.01 and HTML5. And if we do that,
is there any point to switching doctypes?

> The enclosed ten-minute patch adds HTML5 support and makes it the 
> default for both modes.  It also adds a default CSS style (if one isn't 
> passed on the command line) identical to OpenBSD's man.cgi CSS.  I don't 
> like this--I think online manpages should take more advantage of 
> online-ness--but I'm just putting up the bikeshed so it's ready to 
> paint.

IMO, the status quo makes more sense here. We already separate content
and presentation by only making use of external stylesheets. Since the
output of mandoc -Thtml is something people are likely to try to parse,
we shouldn't emit CSS in the default case. If people do ask for CSS,
emitting a link tag instead of embedding it is the right thing to do.

I did talk to Ingo recently about man.cgi stylesheets. The stylesheet
visible on mdocml.bsd.lv is a lot more representative of mandoc's
abilities and I would like to push to remove the pseudo-man.cgi style
completely...

-- 
Anthony J. Bentley

* Re: HTML5
  2014-08-10  2:23 ` HTML5 Ingo Schwarze
@ 2014-08-10 11:39   ` Kristaps Dzonsons
  2014-08-10 16:12     ` HTML5 Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: Kristaps Dzonsons @ 2014-08-10 11:39 UTC (permalink / raw)
  To: discuss

Ingo, Anthony,

> Let me put it this way:  We should not use any fancy features.
> If somebody has a browser that doesn't know about HTML versions
> and just assumes everything is HTML 4 without looking at any
> document types and stuff, then the pages should render cleanly.
> Whether or not they validate as HTML 5 seems irrelevant to me.
>
> Supporting multiple HTML variants seems pointless to me and just
> complicates the code.

I agree.  In short, unifying under HTML5 will simplify the code (no 
switching)--that much is clear.  I don't care whether it's HTML5 or 
asciidoc so long as it gets the job done.  And for browsers, it's 
between flavours of HTML.  So let's consider why HTML5 instead of just 
HTML4 or XHTML1 as-is.

First, note that the patch's HTML5 is called "polyglot" HTML5, which is 
to say, HTML5 with XML syntax.

(Link: <http://dev.w3.org/html5/html-polyglot>.)

A "pro" is that polyglot HTML5 has the same doctype *and content type* 
for its XHTML and HTML modes.  So we can create well-formed, parseable 
HTML5 mark-up using strict XML syntax, then serve it with text/html and 
be happily standards-compliant.  As it is, we put a burden on the agency 
serving -Thtml or -Txhtml pages to know the difference.
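
To make the serving burden concrete, here is a minimal sketch, not code from the patch.  The enum names follow the patch; doctype_for(), content_type_for() and same_content_type() are hypothetical helpers, and the assumption that XHTML 1.0 is served as application/xhtml+xml is mine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Output flavours; the names follow enum htmltype in the patch. */
enum htmltype {
	HTML_HTML_4_01_STRICT,
	HTML_XHTML_1_0_STRICT,
	HTML_HTML5
};

/* Hypothetical helper: the doctype line for a given flavour. */
const char *
doctype_for(enum htmltype type)
{
	switch (type) {
	case HTML_HTML_4_01_STRICT:
		return "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\" "
		    "\"http://www.w3.org/TR/html4/strict.dtd\">";
	case HTML_XHTML_1_0_STRICT:
		return "<!DOCTYPE html PUBLIC "
		    "\"-//W3C//DTD XHTML 1.0 Strict//EN\" "
		    "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">";
	default:
		/* HTML5, including the polyglot form. */
		return "<!DOCTYPE html>";
	}
}

/* Hypothetical helper: the content type a server would pair with it. */
const char *
content_type_for(enum htmltype type)
{
	/* Assumption: XHTML 1.0 goes out as XML, the others as text/html. */
	if (type == HTML_XHTML_1_0_STRICT)
		return "application/xhtml+xml";
	return "text/html";
}

/* Nonzero if both flavours can be served under one Content-Type. */
int
same_content_type(enum htmltype a, enum htmltype b)
{
	return strcmp(content_type_for(a), content_type_for(b)) == 0;
}
```

With a single HTML5 flavour, the server no longer has to consult such a table at all.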

The "con" is that by unifying -Thtml/xhtml as HTML5--or anything, 
really--we lose strict HTML4 callers of -Ofragment (XHTML1 callers would 
be fine).  The only caller right now is cgi.c, which stipulates HTML4. 
This can be fixed easily: remove the HTML4 parts of cgi.c's DOCTYPE 
and close the void img, meta, and link elements.  See the "pro" above 
for why that's also a smart idea.

Another "pro" is that we get eqn.  Ingo, this is the "feature" you're 
worried about.  And it's a pretty big issue for me (so this is a "for 
me"): all of my equations (in eqn, or really in DocBook--which I use 
with docbook2mdoc for some scientific applications--which I'd like to 
convert into eqn) are lost.  Also lost are LAPACK manuals, OpenGL, and a 
host of other eqn systems.  If we stick to HTML4, we'd need to cripple 
ourselves with table-based equations.  If we stick with XHTML1, we need 
to jump through namespace hoops.  But with HTML5, we get embedded MathML.
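
As a rough illustration of the difference, assuming nothing about how eqn support would actually be written: format_frac() below is a hypothetical helper that formats "a over b" as MathML.  The xmlns flag shows the extra namespace declaration an XML parser needs, which the HTML5 parser can do without.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MATHML_NS "http://www.w3.org/1998/Math/MathML"

/*
 * Hypothetical helper: format the fraction num/den as MathML in buf.
 * If xmlns is nonzero, spell out the MathML namespace on <math>.
 * num and den are assumed to be plain identifiers that need no
 * escaping.  Returns what snprintf(3) returns.
 */
int
format_frac(char *buf, size_t len, const char *num, const char *den,
    int xmlns)
{
	assert(buf != NULL && num != NULL && den != NULL);
	return snprintf(buf, len,
	    "<math%s%s%s><mfrac><mi>%s</mi><mi>%s</mi></mfrac></math>",
	    xmlns ? " xmlns=\"" : "",
	    xmlns ? MATHML_NS : "",
	    xmlns ? "\"" : "",
	    num, den);
}
```

Calling format_frac(buf, sizeof(buf), "a", "b", 0) gives the bare form that can be dropped straight into an HTML5 body.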

The "con", yes, is that MathML is a scary feature, and it doesn't exist 
yet.  YET.

At the end of the day, the browser doesn't really care whether it's 
HTML4, XHTML1, or HTML5.  It'll render regardless.  Callers of 
-Ofragment will care, but right now that's just us.  Ingo, you mentioned 
non-conforming browsers.  Care to point me to one that will puke on the 
HTML5 output from the patch?  Even lynx(1) can read that!

If we're really at loggerheads over it, /adding/ HTML5 is as easy as 
another switch statement or two.  I think it's a good idea regardless of 
whether it's added or replacing for the reasons above.

...on to other matters: the style-sheet.

The status quo bugs me because of the header and footer table.  These 
have hard-coded widths and alignments to make them look decent without a 
stylesheet.  This is inadvisable in any modern flavour of HTML.  At the 
very least, we should replace the "width" and "height" with embedded 
styles.  Unfortunately, that's a problem: inline styles can't be 
overridden without the "!important" qualifier in CSS, which is annoying. 
(That's why I used the width attributes in the first place.)  So I 
think that putting just the table styles *before* the <link stylesheet 
/> is a good idea.  The question goes: if we're going to do that, is 
there anything else that should go there?
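
The ordering can be sketched as follows.  This is only an illustration: format_head_styles() is a hypothetical helper, and the single table rule is a stand-in for whatever minimal defaults we settle on.  Because the <link> comes after the <style> block, a rule of equal specificity in the external sheet wins without "!important".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the style-related part of <head> to buf.
 * href may be NULL when no external stylesheet was requested; it is
 * assumed to need no escaping.  Returns the number of bytes the full
 * output needs, like snprintf(3), or a negative value on error.
 */
int
format_head_styles(char *buf, size_t len, const char *href)
{
	int n, m;

	assert(buf != NULL);
	/* Embedded defaults first: only the header and footer tables. */
	n = snprintf(buf, len,
	    "<style>\n"
	    "table.head, table.foot { width: 100%%; }\n"
	    "</style>\n");
	if (n < 0 || (size_t)n >= len || href == NULL)
		return n;
	/* External sheet second, so its rules come later in the cascade. */
	m = snprintf(buf + n, len - (size_t)n,
	    "<link rel=\"stylesheet\" href=\"%s\" type=\"text/css\" "
	    "media=\"all\"/>\n", href);
	return m < 0 ? m : n + m;
}
```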

The stylesheet I put in place does serve an important purpose: it 
prevents overriding styles.  In man(7) and mdoc(7), for example, you 
can't have overlapping styles.  E.g.,

.Bf Sy
Hi
.Ar there
.Ef

The "there" doesn't have both styles: they reset when they're nested. 
(There are probably better ways to do it in CSS, but it needs to be done 
one way or another.)  Without some sort of style-sheet, font modes will 
be nested.  Yes, I'm at conflict with myself: on the one hand, mdoc(7) 
does this because consoles generally haven't supported overlapping 
fonts, and I don't like console holdovers in HTML output.  Or should we 
discard that convention?  (groff's -Tps does it too!)

And yes, I agree that we should just have a single stylesheet in the 
source.  I don't particularly care what it shows.

Best,

Kristaps

* Re: HTML5
  2014-08-10  4:33 ` HTML5 Anthony J. Bentley
@ 2014-08-10 15:17   ` Ingo Schwarze
  2014-08-10 17:38     ` HTML5 Anthony J. Bentley
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2014-08-10 15:17 UTC (permalink / raw)
  To: Anthony J. Bentley; +Cc: discuss

Hi,

please take all i say in this thread with a grain of salt,
i'm not specialized in this area and may be wrong.
Hope you will figure it out and correct me.

Anthony J. Bentley wrote on Sat, Aug 09, 2014 at 10:33:48PM -0600:
> Kristaps Dzonsons writes:

>> Most everybody supports HTML5 these days.  Do we really need to knock 
>> around with XHTML and HTML-4.01?  Does anybody have a pressing need to 
>> use one or the other?

> I actively use HTML 4.01. I don't have a real objection against emitting
> a HTML5 doctype, but I do think that it's a good idea for HTML output of
> mandoc to validate against both HTML 4.01 and HTML5.

Why not?  What downside do you see?

> And if we do that, is there any point to switching doctypes?

Yes, one minor and one major.

The minor is less doctype/content-type clutter.
The major is to get MathML for eqn.

That also explains why i think validating against both 5 and 4.01
(for non-eqn content) does have some merit.  Again, if that is
possible, but as far as i understand, it is.

>> The enclosed ten-minute patch adds HTML5 support and makes it the 
>> default for both modes.  It also adds a default CSS style (if one isn't 
>> passed on the command line) identical to OpenBSD's man.cgi CSS.  I don't 
>> like this--I think online manpages should take more advantage of 
>> online-ness--but I'm just putting up the bikeshed so it's ready to 
>> paint.

> IMO, the status quo makes more sense here. We already separate content
> and presentation by only making use of external stylesheets. Since the
> output of mandoc -Thtml is something people are likely to try to parse,
> we shouldn't emit CSS in the default case. If people do ask for CSS,
> emitting a link tag instead of embedding it is the right thing to do.

That makes sense to me; i guess the only exception is cases where
CSS is the only way to make stuff render at all, like for the
header and footer tables, as mentioned by Kristaps.

> I did talk to Ingo recently about man.cgi stylesheets. The stylesheet
> visible on mdocml.bsd.lv is a lot more representative of mandoc's
> abilities and I would like to push to remove the pseudo-man.cgi style
> completely...

That's fine with me.  I don't worry about style changes to the OpenBSD
online man pages.  The style used should be friendly on the eye and
make the semantic markup stand out well, at least as well as for
terminal output, that's all that matters from my perspective.

Yours,
  Ingo

* Re: HTML5
  2014-08-10 11:39   ` HTML5 Kristaps Dzonsons
@ 2014-08-10 16:12     ` Ingo Schwarze
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Schwarze @ 2014-08-10 16:12 UTC (permalink / raw)
  To: Kristaps Dzonsons; +Cc: discuss

Hi Kristaps,

Kristaps Dzonsons wrote on Sun, Aug 10, 2014 at 01:39:51PM +0200:

> I agree.  In short, unifying under HTML5 will simplify the code (no
> switching)--that much is clear.  I don't care whether it's HTML5 or
> asciidoc

I care that we stay away from asciidoc.  :-)

> so long as it gets the job done.  And for browsers, it's
> between flavours of HTML.  So let's consider why HTML5 instead of
> just HTML4 or XHTML1 as-is.
> 
> First, note that the patch's HTML5 is called "polyglot" HTML5, which
> is to say, HTML5 with XML syntax.
> 
> (Link: <http://dev.w3.org/html5/html-polyglot>.)
> 
> A "pro" is that polyglot HTML5 has the same doctype *and content
> type* for its XHTML and HTML modes.  So we can create well-formed,
> parseable HTML5 mark-up using strict XML syntax, then serve it with
> text/html and be happily standards-compliant.  As it is, we put a
> burden on the agency serving -Thtml or -Txhtml pages to know the
> difference.
> 
> The "con" is that by unifying -Thtml/xhtml as HTML5--or anything,
> really--we lose strict HTML4 callers of -Ofragment (XHTML1 callers
> would be fine).  The only caller right now is cgi.c, which
> stipulates HTML4. This can be fixed easily: remove the HTML4 parts
> of cgi.c's DOCTYPE and close the void img, meta, and link
> elements.  See the "pro" above for why that's also a smart idea.

Fixing that sounds easy indeed and does not seem to have a downside.

As far as i see, the only thing that's really messed up between
HTML 4.01 and XHTML (and hence polyglot HTML 5) is void elements.
Void elements have to be <br/> in polyglot.  That parses as
<br>&gt; in HTML 4.01 if i understand correctly, so strictly
speaking, a document that is supposed to be both valid polyglot 
and valid HTML 4.01 cannot use any void elements.  But i think
ignoring that detail and just shrugging our shoulders with respect
to the extra &gt; that HTML 4.01 parsing would output seems the
best we can do, and good enough.  It is not likely to become a
problem in practice, i think.
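
For illustration only, here is how the choice looks when emitting a void element.  The enum names follow the patch and the switch mirrors the one it extends in print_otag(); format_void() itself is a hypothetical helper, not mandoc code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Output flavours; the names follow enum htmltype in the patch. */
enum htmltype {
	HTML_HTML_4_01_STRICT,
	HTML_XHTML_1_0_STRICT,
	HTML_HTML5
};

/*
 * Hypothetical helper: write a void element such as "br" to buf.
 * XML-syntax flavours get the self-closing form, HTML 4.01 does not.
 * Returns what snprintf(3) returns.
 */
int
format_void(char *buf, size_t len, const char *name, enum htmltype type)
{
	assert(buf != NULL && name != NULL);
	switch (type) {
	case HTML_XHTML_1_0_STRICT:
	case HTML_HTML5:
		return snprintf(buf, len, "<%s/>", name);
	default:
		return snprintf(buf, len, "<%s>", name);
	}
}
```

Dropping the HTML 4.01 flavour would leave only the first branch.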

> Another "pro" is that we get eqn.  Ingo, this is the "feature"
> you're worried about.

Not really.  I didn't think about that.  What i meant is HTML 5
only syntax creeping in for stuff that can be rendered in HTML 4
because it looks a bit better or even just because it's more modern.

There is no sane way to render eqn in HTML 4, it's beyond the scope.
So a document using eqn will look crappy with a HTML 4 browser in
any case.  Whether that is because the document doesn't attempt to
render the eqn content at all, or renders it in terminal-style,
or emits MathML that the browser cannot handle makes no difference.
So just go ahead emitting MathML for eqn, i have no problem with that.

> And it's a pretty big issue for me (so this
> is a "for me"): all of my equations (in eqn, or really in
> DocBook--which I use with docbook2mdoc for some scientific
> applications--which I'd like to convert into eqn) are lost.  Also
> lost are LAPACK manuals, OpenGL, and a host of other eqn systems.
> If we stick to HTML4, we'd need to cripple ourselves with
> table-based equations.  If we stick with XHTML1, we need to jump
> through namespace hoops.  But with HTML5, we get embedded MathML.

Makes sense to me.

> The "con", yes, is that MathML is a scary feature, and it doesn't
> exist yet.  YET.
> 
> At the end of the day, the browser doesn't really care whether it's
> HTML4, XHTML1, or HTML5.  It'll render regardless.  Callers of
> -Ofragment will care, but right now that's just us.  Ingo, you
> mentioned non-conforming browsers.  Care to point me to one that
> will puke on the HTML5 output from the patch?  Even lynx(1) can read
> that!

I have no specific browser in mind.  My remark was more about
syntax bloat, that is, using elaborate syntax because we can,
as opposed to because it is useful (like for eqn).

> If we're really at loggerheads over it, /adding/ HTML5 is as easy as
> another switch statement or two.  I think it's a good idea
> regardless of whether it's added or replacing for the reasons above.

If Anthony puts forward a good argument why he needs strict HTML 4.01 -
so far, i don't understand why - i guess that's the way to go.
Otherwise, just drop 4.01 and XHTML and call the polyglot output
close enough.

> ...on to other matters: the style-sheet.
> 
> The status quo bugs me because of the header and footer table.
> These have hard-coded widths and alignments to make them look decent
> without a stylesheet.  This is inadvisable in any modern flavour of
> HTML.  At the very least, we should replace the "width" and "height"
> with embedded styles.  Unfortunately, that's a problem: inline styles
> can't be overridden without the "!important" qualifier in CSS, which
> is annoying.  (That's why I used the width attributes in the first
> place.)  So I think that putting just the table styles *before* the
> <link stylesheet /> is a good idea.

Makes perfect sense to me.

> The question goes: if we're
> going to do that, is there anything else that should go there?

Right now, i don't see anything.  If anything crops up later,
with a reasoning as good as the above, it can be added later.

> The stylesheet I put in place does serve an important purpose: it
> prevents overriding styles.  In man(7) and mdoc(7), for example, you
> can't have overlapping styles.  E.g.,
> 
> .Bf Sy
> Hi
> .Ar there
> .Ef
> 
> The "there" doesn't have both styles: they reset when they're
> nested. (There are probably better ways to do it in CSS, but it
> needs to be done one way or another.)  Without some sort of
> style-sheet, font modes will be nested.  Yes, I'm at conflict with
> myself: on the one hand, mdoc(7) does this because consoles
> generally haven't supported overlapping fonts, and I don't like
> console holdovers in HTML output.  Or should we discard that
> convention?  (groff's -Tps does it too!)

Font-changing blocks that can contain elements or other blocks
are rare in mdoc(7) and do not exist in man(7) in the first place:

  .Sh .Ss HEAD (discouraged to contain elements)
  .Dl .Bd -literal
  .Bf 

For .Dl and .Bd, we definitely *want* embedded elements to be
both literal (fixed-width) and italic or bold, respectively.

The .Bf block is almost never useful in the first place; when
do you ever need to embolden or italicise a whole block of text?
And if you do, embedded markup should probably add up, so the
"there" above *should* be bold and italic.

For -Tascii, i'll stay bug-compatible with groff.  But for -Thtml,
let's just do what makes sense.

Yours,
  Ingo

* Re: HTML5
  2014-08-10 15:17   ` HTML5 Ingo Schwarze
@ 2014-08-10 17:38     ` Anthony J. Bentley
  0 siblings, 0 replies; 7+ messages in thread
From: Anthony J. Bentley @ 2014-08-10 17:38 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: discuss

Hi Ingo and Kristaps,

Ingo Schwarze writes:
> Why not?  What downside do you see?

I tried to come up with a technical justification to use HTML4 but
could only come up with aesthetic reasons. And along the way realized
two truths about HTML5 that make me uncomfortable in my decision: first,
that parsing HTML5 really is both simpler and more well-defined (HTML4
still had delusions of SGML-grandeur), and second, that there really is
no sane way to format equations in HTML4; I had not considered the
possibility of eqn at all. So consider my objection to HTML5 output
withdrawn.

> > And if we do that, is there any point to switching doctypes?
> 
> Yes, one minor and one major.
> 
> The minor is less doctype/content-type clutter.
> The major is to get MathML for eqn.
> 
> That also explains why i think validating against both 5 and 4.01
> (for non-eqn content) does have some merit.  Again, if that is
> possible, but as far as i understand, it is.

I don't think this is possible; the whole reason Kristaps' patch
can unify things is because it uses some XML-like syntax that is valid
in HTML5 and invalid in HTML4. Despite that, per above I have no
objections to doing that at this point.

-- 
Anthony J. Bentley