tech@mandoc.bsd.lv
* HTML5 redux
@ 2014-08-13 22:23 Kristaps Dzonsons
  2014-08-14  1:25 ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Kristaps Dzonsons @ 2014-08-13 22:23 UTC
  To: tech

[-- Attachment #1: Type: text/plain, Size: 1071 bytes --]

Hi folks,

Enclosed is another patch for HTML5.  This one is designed to *add* 
HTML5.  So it creates an extra shim.  Subsequent patches would then 
remove the existing functionality.  I think this is a good step: first 
we add it, then we can start to decrease the complexity.  It consists of 
the following:

  - a single <meta charset="utf-8" />
  - removed summary, align, and width attributes from header/footer
  - remove <col /> from header/footer
  - add a <style /> shim *before* the <link /> to provide a default 
layout of the header/footer (instead of the hard-coded crap)

(The <style /> shim would work with HTML4 as well, but eh.)

With this patch, both -Thtml and -Txhtml will produce HTML5.  This can 
be disabled in html.c by replacing the HTML_HTML5 allocation with the 
respective type.  The resulting HTML5 validates just fine, as do the 
existing modes (if replaced in html.c).
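
As a rough illustration of the head layout described above, here is a minimal C sketch.  It is not the code from the attached patch: sketch_head(), sketch_append() and enum sketch_type are hypothetical names, output goes to a caller-supplied buffer instead of stdout, and only one of the default CSS rules is shown.  The point is the ordering: the <style> defaults come before the <link>, so an external stylesheet can override them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the output types selected in html.c. */
enum sketch_type { SKETCH_LEGACY, SKETCH_HTML5 };

/* Append s at offset off; the sketch aborts instead of truncating. */
size_t
sketch_append(char *buf, size_t sz, size_t off, const char *s)
{
	size_t len = strlen(s);

	assert(off + len < sz);
	memcpy(buf + off, s, len + 1);
	return off + len;
}

/*
 * Write the generated part of <head> into buf and return its length.
 * css names the optional external stylesheet and may be NULL.
 */
size_t
sketch_head(char *buf, size_t sz, enum sketch_type type, const char *css)
{
	size_t off = 0;

	assert(buf != NULL && sz > 0);
	buf[0] = '\0';

	if (type == SKETCH_HTML5) {
		off = sketch_append(buf, sz, off,
		    "<meta charset=\"utf-8\"/>\n");
		/* Built-in defaults first, so a stylesheet can override. */
		off = sketch_append(buf, sz, off,
		    "<style>table.head, table.foot "
		    "{ width: 100%; }</style>\n");
	} else {
		/* Older http-equiv form; tag ending simplified here. */
		off = sketch_append(buf, sz, off,
		    "<meta http-equiv=\"Content-Type\" "
		    "content=\"text/html; charset=utf-8\">\n");
	}

	if (css != NULL) {
		off = sketch_append(buf, sz, off,
		    "<link rel=\"stylesheet\" href=\"");
		off = sketch_append(buf, sz, off, css);
		off = sketch_append(buf, sz, off,
		    "\" type=\"text/css\"/>\n");
	}
	return off;
}
```

Calling sketch_head(buf, sizeof(buf), SKETCH_HTML5, "man.css") leaves the <meta>, <style> and <link> lines in buf in that order; with a NULL stylesheet the <link> line is omitted.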

I've also added an initial patch for man.cgi to be HTML5.  Easy, no? 
The tags will also need to be lowercased, but that's purely mechanical.

Thoughts?

Best,

Kristaps

[-- Attachment #2: html5_test2.diff --]
[-- Type: text/plain, Size: 11103 bytes --]

? Makefile.depend.patch
? apropos
? article-template.xml
? article1.html
? article1.xml
? cgi.h
? config.h
? config.log
? demandoc
? foo.1
? foo.man
? foo.ps
? gluPerspective.3
? gluPerspective.html
? hspaces.diff
? html5.diff
? html5_test2.diff
? makewhatis
? mandoc
? mandoc.1.html
? mandoc.html
? mandocdb
? patch
? preconv
? roff_res_charwidth.patch
? scale.diff
? test-dirent-namlen.dSYM
? test-fgetln.dSYM
? test-fts.dSYM
? test-getsubopt.dSYM
? test-mmap.dSYM
? test-strcasestr.dSYM
? test-strlcat.dSYM
? test-strlcpy.dSYM
? test-strptime.dSYM
? test-strsep.dSYM
? test.1
? test.1.html
? test.1.ps
? test.2
? test.2.ps
? test.ps
? testm.ps
? testn.ps
? unit_charwidth.patch
Index: html.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.c,v
retrieving revision 1.162
diff -u -p -r1.162 html.c
--- html.c	13 Aug 2014 20:34:29 -0000	1.162
+++ html.c	13 Aug 2014 21:29:43 -0000
@@ -68,13 +68,14 @@ static	const struct htmldata htmltags[TA
 	{"dt",		HTML_CLRLINE}, /* TAG_DT */
 	{"dd",		HTML_CLRLINE}, /* TAG_DD */
 	{"blockquote",	HTML_CLRLINE}, /* TAG_BLOCKQUOTE */
-	{"p",		HTML_CLRLINE | HTML_NOSTACK | HTML_AUTOCLOSE}, /* TAG_P */
+	{"p",		HTML_CLRLINE}, /* TAG_P */
 	{"pre",		HTML_CLRLINE }, /* TAG_PRE */
 	{"b",		0 }, /* TAG_B */
 	{"i",		0 }, /* TAG_I */
 	{"code",	0 }, /* TAG_CODE */
 	{"small",	0 }, /* TAG_SMALL */
 	{"em",		0 }, /* TAG_EM */
+	{"style",	HTML_CLRLINE }, /* TAG_STYLE */
 };
 
 static	const char	*const htmlattrs[ATTR_MAX] = {
@@ -92,6 +93,7 @@ static	const char	*const htmlattrs[ATTR_
 	"summary", /* ATTR_SUMMARY */
 	"align", /* ATTR_ALIGN */
 	"colspan", /* ATTR_COLSPAN */
+	"charset", /* ATTR_CHARSET */
 };
 
 static	const char	*const roffscales[SCALE_MAX] = {
@@ -167,7 +169,7 @@ void *
 xhtml_alloc(char *outopts)
 {
 
-	return(ml_alloc(outopts, HTML_XHTML_1_0_STRICT));
+	return(ml_alloc(outopts, HTML_HTML5));
 }
 
 void
@@ -193,18 +195,47 @@ void
 print_gen_head(struct html *h)
 {
 	struct htmlpair	 tag[4];
+	struct tag	*t;
 
-	tag[0].key = ATTR_HTTPEQUIV;
-	tag[0].val = "Content-Type";
-	tag[1].key = ATTR_CONTENT;
-	tag[1].val = "text/html; charset=utf-8";
-	print_otag(h, TAG_META, 2, tag);
-
-	tag[0].key = ATTR_NAME;
-	tag[0].val = "resource-type";
-	tag[1].key = ATTR_CONTENT;
-	tag[1].val = "document";
-	print_otag(h, TAG_META, 2, tag);
+	if (HTML_HTML5 == h->type) {
+		tag[0].key = ATTR_CHARSET;
+		tag[0].val = "utf-8";
+		print_otag(h, TAG_META, 1, tag);
+	} else {
+		tag[0].key = ATTR_HTTPEQUIV;
+		tag[0].val = "Content-Type";
+		tag[1].key = ATTR_CONTENT;
+		tag[1].val = "text/html; charset=utf-8";
+		print_otag(h, TAG_META, 2, tag);
+
+		tag[0].key = ATTR_NAME;
+		tag[0].val = "resource-type";
+		tag[1].key = ATTR_CONTENT;
+		tag[1].val = "document";
+		print_otag(h, TAG_META, 2, tag);
+	}
+
+	if (HTML_HTML5 == h->type) {
+		/*
+		 * To preserve the general look of a manual, we begin
+		 * with some default CSS rules for the header and
+		 * footer.
+		 * Put these before the <link /> so that any CSS file
+		 * will override them.
+		 */
+		t = print_otag(h, TAG_STYLE, 0, NULL);
+		print_text(h, 
+			"table.head, table.foot { width: 100%; }\n"
+			"table.head td:last-child, "
+			 "table.foot td:last-child "
+			 "{ text-align: right; }\n"
+			"table.head td.head-ltitle, table.head td.head-rtitle "
+			 "{ width: 10%; }\n"
+			"table.head td.head-vol "
+			 "{ width: 80%; text-align: center; }\n"
+			"table.foot td { width: 50%; }\n");
+		print_tagq(h, t);
+	}
 
 	if (h->style) {
 		tag[0].key = ATTR_REL;
@@ -216,7 +247,7 @@ print_gen_head(struct html *h)
 		tag[3].key = ATTR_MEDIA;
 		tag[3].val = "all";
 		print_otag(h, TAG_LINK, 4, tag);
-	}
+	} 
 }
 
 static void
@@ -506,6 +537,7 @@ print_otag(struct html *h, enum htmltag 
 
 	if (HTML_AUTOCLOSE & htmltags[tag].flags)
 		switch (h->type) {
+		case HTML_HTML5:
 		case HTML_XHTML_1_0_STRICT:
 			putchar('/');
 			break;
@@ -542,6 +574,9 @@ print_gen_decls(struct html *h)
 	const char	*name;
 
 	switch (h->type) {
+	case HTML_HTML5:
+		puts("<!DOCTYPE html>");
+		return;
 	case HTML_HTML_4_01_STRICT:
 		name = "HTML";
 		doctype = "-//W3C//DTD HTML 4.01//EN";
Index: html.h
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.h,v
retrieving revision 1.52
diff -u -p -r1.52 html.h
--- html.h	13 Aug 2014 15:25:22 -0000	1.52
+++ html.h	13 Aug 2014 21:29:43 -0000
@@ -51,6 +51,7 @@ enum	htmltag {
 	TAG_CODE,
 	TAG_SMALL,
 	TAG_EM,
+	TAG_STYLE,
 	TAG_MAX
 };
 
@@ -69,6 +70,7 @@ enum	htmlattr {
 	ATTR_SUMMARY,
 	ATTR_ALIGN,
 	ATTR_COLSPAN,
+	ATTR_CHARSET,
 	ATTR_MAX
 };
 
@@ -108,7 +110,8 @@ struct	htmlpair {
 
 enum	htmltype {
 	HTML_HTML_4_01_STRICT,
-	HTML_XHTML_1_0_STRICT
+	HTML_XHTML_1_0_STRICT,
+	HTML_HTML5
 };
 
 struct	html {
Index: man_html.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/man_html.c,v
retrieving revision 1.97
diff -u -p -r1.97 man_html.c
--- man_html.c	10 Aug 2014 23:54:41 -0000	1.97
+++ man_html.c	13 Aug 2014 21:29:44 -0000
@@ -307,14 +307,19 @@ man_root_pre(MAN_ARGS)
 	assert(man->msec);
 	mandoc_asprintf(&title, "%s(%s)", man->title, man->msec);
 
-	PAIR_SUMMARY_INIT(&tag[0], "Document Header");
-	PAIR_CLASS_INIT(&tag[1], "head");
-	PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
-	t = print_otag(h, TAG_TABLE, 3, tag);
-	PAIR_INIT(&tag[0], ATTR_WIDTH, "30%");
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
+	if (h->type != HTML_HTML5) {
+		PAIR_SUMMARY_INIT(&tag[0], "Document Header");
+		PAIR_CLASS_INIT(&tag[1], "head");
+		PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
+		t = print_otag(h, TAG_TABLE, 3, tag);
+		PAIR_INIT(&tag[0], ATTR_WIDTH, "30%");
+		print_otag(h, TAG_COL, 1, tag);
+		print_otag(h, TAG_COL, 1, tag);
+		print_otag(h, TAG_COL, 1, tag);
+	} else {
+		PAIR_CLASS_INIT(&tag[0], "head");
+		t = print_otag(h, TAG_TABLE, 1, tag);
+	}
 
 	print_otag(h, TAG_TBODY, 0, NULL);
 
@@ -326,15 +331,21 @@ man_root_pre(MAN_ARGS)
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "head-vol");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "center");
-	print_otag(h, TAG_TD, 2, tag);
+	if (HTML_HTML5 != h->type) {
+		PAIR_INIT(&tag[1], ATTR_ALIGN, "center");
+		print_otag(h, TAG_TD, 2, tag);
+	} else 
+		print_otag(h, TAG_TD, 1, tag);
 	if (NULL != man->vol)
 		print_text(h, man->vol);
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "head-rtitle");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
-	print_otag(h, TAG_TD, 2, tag);
+	if (HTML_HTML5 != h->type) {
+		PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
+		print_otag(h, TAG_TD, 2, tag);
+	} else 
+		print_otag(h, TAG_TD, 1, tag);
 	print_text(h, title);
 	print_tagq(h, t);
 	free(title);
@@ -346,13 +357,18 @@ man_root_post(MAN_ARGS)
 	struct htmlpair	 tag[3];
 	struct tag	*t, *tt;
 
-	PAIR_SUMMARY_INIT(&tag[0], "Document Footer");
-	PAIR_CLASS_INIT(&tag[1], "foot");
-	PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
-	t = print_otag(h, TAG_TABLE, 3, tag);
-	PAIR_INIT(&tag[0], ATTR_WIDTH, "50%");
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
+	if (h->type != HTML_HTML5) {
+		PAIR_SUMMARY_INIT(&tag[0], "Document Footer");
+		PAIR_CLASS_INIT(&tag[1], "foot");
+		PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
+		t = print_otag(h, TAG_TABLE, 3, tag);
+		PAIR_INIT(&tag[0], ATTR_WIDTH, "50%");
+		print_otag(h, TAG_COL, 1, tag);
+		print_otag(h, TAG_COL, 1, tag);
+	} else {
+		PAIR_CLASS_INIT(&tag[0], "foot");
+		t = print_otag(h, TAG_TABLE, 1, tag);
+	}
 
 	tt = print_otag(h, TAG_TR, 0, NULL);
 
@@ -364,8 +380,11 @@ man_root_post(MAN_ARGS)
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "foot-os");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
-	print_otag(h, TAG_TD, 2, tag);
+	if (HTML_HTML5 != h->type) {
+		PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
+		print_otag(h, TAG_TD, 2, tag);
+	} else
+		print_otag(h, TAG_TD, 1, tag);
 
 	if (man->source)
 		print_text(h, man->source);
Index: mdoc_html.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mdoc_html.c,v
retrieving revision 1.197
diff -u -p -r1.197 mdoc_html.c
--- mdoc_html.c	13 Aug 2014 15:25:22 -0000	1.197
+++ mdoc_html.c	13 Aug 2014 21:29:44 -0000
@@ -487,13 +487,18 @@ mdoc_root_post(MDOC_ARGS)
 	struct htmlpair	 tag[3];
 	struct tag	*t, *tt;
 
-	PAIR_SUMMARY_INIT(&tag[0], "Document Footer");
-	PAIR_CLASS_INIT(&tag[1], "foot");
-	PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
-	t = print_otag(h, TAG_TABLE, 3, tag);
-	PAIR_INIT(&tag[0], ATTR_WIDTH, "50%");
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
+	if (HTML_HTML5 != h->type) {
+		PAIR_SUMMARY_INIT(&tag[0], "Document Footer");
+		PAIR_CLASS_INIT(&tag[1], "foot");
+		PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
+		t = print_otag(h, TAG_TABLE, 3, tag);
+		PAIR_INIT(&tag[0], ATTR_WIDTH, "50%");
+		print_otag(h, TAG_COL, 1, tag);
+		print_otag(h, TAG_COL, 1, tag);
+	} else {
+		PAIR_CLASS_INIT(&tag[0], "foot");
+		t = print_otag(h, TAG_TABLE, 1, tag);
+	}
 
 	print_otag(h, TAG_TBODY, 0, NULL);
 
@@ -505,8 +510,11 @@ mdoc_root_post(MDOC_ARGS)
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "foot-os");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
-	print_otag(h, TAG_TD, 2, tag);
+	if (HTML_HTML5 != h->type) {
+		PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
+		print_otag(h, TAG_TD, 2, tag);
+	} else
+		print_otag(h, TAG_TD, 1, tag);
 	print_text(h, meta->os);
 	print_tagq(h, t);
 }
@@ -530,14 +538,19 @@ mdoc_root_pre(MDOC_ARGS)
 		mandoc_asprintf(&title, "%s(%s)",
 		    meta->title, meta->msec);
 
-	PAIR_SUMMARY_INIT(&tag[0], "Document Header");
-	PAIR_CLASS_INIT(&tag[1], "head");
-	PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
-	t = print_otag(h, TAG_TABLE, 3, tag);
-	PAIR_INIT(&tag[0], ATTR_WIDTH, "30%");
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
-	print_otag(h, TAG_COL, 1, tag);
+	if (HTML_HTML5 != h->type) {
+		PAIR_SUMMARY_INIT(&tag[0], "Document Header");
+		PAIR_CLASS_INIT(&tag[1], "head");
+		PAIR_INIT(&tag[2], ATTR_WIDTH, "100%");
+		t = print_otag(h, TAG_TABLE, 3, tag);
+		PAIR_INIT(&tag[0], ATTR_WIDTH, "30%");
+		print_otag(h, TAG_COL, 1, tag);
+		print_otag(h, TAG_COL, 1, tag);
+		print_otag(h, TAG_COL, 1, tag);
+	} else {
+		PAIR_CLASS_INIT(&tag[0], "head");
+		t = print_otag(h, TAG_TABLE, 1, tag);
+	}
 
 	print_otag(h, TAG_TBODY, 0, NULL);
 
@@ -549,14 +562,20 @@ mdoc_root_pre(MDOC_ARGS)
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "head-vol");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "center");
-	print_otag(h, TAG_TD, 2, tag);
+	if (HTML_HTML5 != h->type) {
+		PAIR_INIT(&tag[1], ATTR_ALIGN, "center");
+		print_otag(h, TAG_TD, 2, tag);
+	} else
+		print_otag(h, TAG_TD, 1, tag);
 	print_text(h, volume);
 	print_stagq(h, tt);
 
 	PAIR_CLASS_INIT(&tag[0], "head-rtitle");
-	PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
-	print_otag(h, TAG_TD, 2, tag);
+	if (HTML_HTML5 != h->type) {
+		PAIR_INIT(&tag[1], ATTR_ALIGN, "right");
+		print_otag(h, TAG_TD, 2, tag);
+	} else
+		print_otag(h, TAG_TD, 1, tag);
 	print_text(h, title);
 	print_tagq(h, t);
 

[-- Attachment #3: html5_cgi.diff --]
[-- Type: text/plain, Size: 863 bytes --]

Index: cgi.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/cgi.c,v
retrieving revision 1.93
diff -u -p -r1.93 cgi.c
--- cgi.c	10 Aug 2014 23:54:41 -0000	1.93
+++ cgi.c	13 Aug 2014 22:20:29 -0000
@@ -375,13 +375,10 @@ resp_begin_html(int code, const char *ms
 
 	resp_begin_http(code, msg);
 
-	printf("<!DOCTYPE HTML PUBLIC "
-	       " \"-//W3C//DTD HTML 4.01//EN\""
-	       " \"http://www.w3.org/TR/html4/strict.dtd\">\n"
+	printf("<!DOCTYPE html>\n"
 	       "<HTML>\n"
 	       "<HEAD>\n"
-	       "<META HTTP-EQUIV=\"Content-Type\""
-	       " CONTENT=\"text/html; charset=utf-8\">\n"
+	       "<META CHARSET=\"utf-8\">\n"
 	       "<LINK REL=\"stylesheet\" HREF=\"%s/man-cgi.css\""
 	       " TYPE=\"text/css\" media=\"all\">\n"
 	       "<LINK REL=\"stylesheet\" HREF=\"%s/man.css\""

* Re: HTML5 redux
  2014-08-13 22:23 HTML5 redux Kristaps Dzonsons
@ 2014-08-14  1:25 ` Ingo Schwarze
  2014-08-14 20:26   ` Kristaps Dzonsons
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Schwarze @ 2014-08-14  1:25 UTC
  To: Kristaps Dzonsons; +Cc: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Thu, Aug 14, 2014 at 12:23:05AM +0200:

> Enclosed is another patch for HTML5.  This one is designed to *add*
> HTML5.  So it creates an extra shim.  Subsequent patches would then
> remove the existing functionality.  I think this is a good step:
> first we add it, then we can start to decrease the complexity.

I don't object to that route.

I'd probably not merge to the VERSION_1_12 branch until the release
is done, and i'd probably merge to OpenBSD after the cleanup is done.

> It consists of the following:
> 
>  - a single <meta charset="utf-8" />
>  - removed summary, align, and width attributes from header/footer
>  - remove <col /> from header/footer
>  - add a <style /> shim *before* the <link /> to provide a default
> layout of the header/footer (instead of the hard-coded crap)
> 
> (The <style /> shim would work with HTML4 as well, but eh.)
> 
> With this patch, both -Thtml and -Txhtml will produce HTML5.  This
> can be disabled in html.c by replacing the HTML_HTML5 allocation
> with the respective type.  The resulting HTML5 validates just fine,
> as do the existing modes (if replaced in html.c).

Not for me, there is one regression.

But if you remove the chunk cited below, the regression goes
away, so i'd say, go ahead - in particular if you plan to do the
cleanup right afterwards.

> Index: html.c
> ===================================================================
> RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.c,v
> retrieving revision 1.162
> diff -u -p -r1.162 html.c
> --- html.c	13 Aug 2014 20:34:29 -0000	1.162
> +++ html.c	13 Aug 2014 21:29:43 -0000
> @@ -68,13 +68,14 @@ static	const struct htmldata htmltags[TA
>  	{"dt",		HTML_CLRLINE}, /* TAG_DT */
>  	{"dd",		HTML_CLRLINE}, /* TAG_DD */
>  	{"blockquote",	HTML_CLRLINE}, /* TAG_BLOCKQUOTE */
> -	{"p",		HTML_CLRLINE | HTML_NOSTACK | HTML_AUTOCLOSE}, /* TAG_P */
> +	{"p",		HTML_CLRLINE}, /* TAG_P */

This change causes regressions for me.
If i remove that chunk, it works fine.

Sample output:

<div class="section">
<h1 id="x4445534352495054494f4e">DESCRIPTION</h1> normal line of text second normal line<br>
 line with a leading space <span class="unix">UNIX</span> normal line after a macro line <span class="unix">UNIX</span><br>
 leading space after a macro line<p>   ### look here
<pre style="margin-left: 0.00ex;" class="lit display">
normal line in a literal display 
 leading space in a literal display 
another normal line</pre>
</p>   ### and here
<p>
<div style="margin-left: 0.00ex;" class="display">
normal line in a filled display<br>
 leading space in a filled display another normal line</div>
</p>   ### and here
</div>

Sample input:

/usr/src/regress/usr.bin/mandoc/char/space/leading-mdoc.in

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

* Re: HTML5 redux
  2014-08-14  1:25 ` Ingo Schwarze
@ 2014-08-14 20:26   ` Kristaps Dzonsons
  2014-08-14 23:50     ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Kristaps Dzonsons @ 2014-08-14 20:26 UTC
  To: Ingo Schwarze; +Cc: tech

> I don't object to that route.
>
> I'd probably not merge to the VERSION_1_12 branch until the release
> is done, and i'd probably merge to OpenBSD after the cleanup is done.

Ingo,

Ok!  But as seen below, there's more to do before committing...

>> With this patch, both -Thtml and -Txhtml will produce HTML5.  This
>> can be disabled in html.c by replacing the HTML_HTML5 allocation
>> with the respective type.  The resulting HTML5 validates just fine,
>> as do the existing modes (if replaced in html.c).
>
> Not for me, there is one regression.
>
> But if you remove the chunk cited below, the regression goes
> away, so i'd say, go ahead - in particular if you plan to do the
> cleanup right afterwards.

This is a bit trickier than appears.

Basically, <p /> doesn't exist in HTML5 because <p> isn't a void 
element.  However, we can't just remove the auto-close parts of <p> 
because then we might have

  <p>
    Hello
    <div>world</div>
  </p>

Which is invalid.

http://www.w3.org/TR/html-markup/p.html#p

In short, mdoc(7) and man(7) can have all sorts of nested block scopes, 
and so can HTML, but <p> can't contain flow elements.

I think it's best, and cleanest, to just avoid <p> and use <div>, say as 
<div class="par">.

I understand we lose whatever semantic-ness may be acquired from <p>, 
but I don't think it's much of a loss given that, with a classed <div>, 
we'll have *real* semantics (the flow elements will be within the 
paragraph).

Naturally, this would work happily with HTML4 just as it would with HTML5.

I'll work on a patch and will submit it soon...

Best,

Kristaps

* Re: HTML5 redux
  2014-08-14 20:26   ` Kristaps Dzonsons
@ 2014-08-14 23:50     ` Ingo Schwarze
  0 siblings, 0 replies; 4+ messages in thread
From: Ingo Schwarze @ 2014-08-14 23:50 UTC
  To: Kristaps Dzonsons; +Cc: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Thu, Aug 14, 2014 at 10:26:45PM +0200:
> Ingo Schwarze wrote:
>> Kristaps Dzonsons wrote:

>>> With this patch, both -Thtml and -Txhtml will produce HTML5.  This
>>> can be disabled in html.c by replacing the HTML_HTML5 allocation
>>> with the respective type.  The resulting HTML5 validates just fine,
>>> as do the existing modes (if replaced in html.c).

>> Not for me, there is one regression.
>> But if you remove the chunk cited below, the regression goes
>> away, so i'd say, go ahead - in particular if you plan to do the
>> cleanup right afterwards.

> This is a bit trickier than appears.

I fear i don't follow where the problem is.

What i'm saying is that

 - <p> output is valid when you leave the HTML_NOSTACK in place
 - <p> output is (often, not always) invalid when you delete it

> Basically, <p /> doesn't exist in HTML5 because <p> isn't a void
> element.

Sure.  However, i don't see how that is relevant.  For HTML5,
html.c never generates <p/> no matter what we do with HTML_NOSTACK.

Two remarks regarding HTML_AUTOCLOSE:

 1) It is obsolete.  It is set if and only if HTML_NOSTACK is set,
    so one flag would suffice.

 2) After the full switch to HTML5, it is obsolete for a second
    reason: it is never used for anything in HTML5 mode.
    Already now, the only place where it is used for syntactically
    significant output, in print_otag(), is only executed when
    HTML_XHTML_1_0_STRICT is set.
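
To make the role of the no-stack flag concrete, here is a small C sketch of a tag stack.  It is an assumption about the general mechanism, not a copy of html.c: struct sk_out, sk_open(), sk_close(), sk_demo() and SK_NOSTACK are hypothetical names.  A tag opened with SK_NOSTACK is printed but never remembered, so no closing tag is ever generated for it; without the flag, the tag is closed only when the enclosing scope unwinds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_NOSTACK	0x01	/* hypothetical stand-in for HTML_NOSTACK */
#define SK_MAXDEPTH	16
#define SK_BUFSZ	256

struct sk_out {
	char		 buf[SK_BUFSZ];		/* generated markup */
	size_t		 len;
	const char	*stack[SK_MAXDEPTH];	/* tags awaiting a close */
	int		 depth;
};

/* Append s to the output; the sketch aborts instead of truncating. */
void
sk_puts(struct sk_out *o, const char *s)
{
	size_t n = strlen(s);

	assert(o->len + n < SK_BUFSZ);
	memcpy(o->buf + o->len, s, n + 1);
	o->len += n;
}

/* Print an opening tag; remember it unless SK_NOSTACK is set. */
void
sk_open(struct sk_out *o, const char *name, int flags)
{
	sk_puts(o, "<");
	sk_puts(o, name);
	sk_puts(o, ">");
	if (flags & SK_NOSTACK)
		return;
	assert(o->depth < SK_MAXDEPTH);
	o->stack[o->depth++] = name;
}

/* Print the closing tag for the innermost remembered tag. */
void
sk_close(struct sk_out *o)
{
	assert(o->depth > 0);
	sk_puts(o, "</");
	sk_puts(o, o->stack[--o->depth]);
	sk_puts(o, ">");
}

/* A paragraph break followed by a display, inside one section. */
void
sk_demo(struct sk_out *o, int pflags)
{
	o->len = 0;
	o->depth = 0;
	o->buf[0] = '\0';
	sk_open(o, "div", 0);
	sk_open(o, "p", pflags);
	sk_open(o, "pre", 0);
	sk_puts(o, "text");
	sk_close(o);			/* closes <pre> */
	while (o->depth > 0)		/* unwind the section */
		sk_close(o);
}
```

With pflags set to SK_NOSTACK the sketch produces <div><p><pre>text</pre></div>, leaving the <p> to end implicitly.  With pflags set to 0 it produces <div><p><pre>text</pre></p></div>, which has the same shape as the sample output reported earlier in this thread.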

> However, we can't just remove the auto-close parts of <p>
> because then we might have
> 
>  <p>
>    Hello
>    <div>world</div>
>  </p>
> 
> Which is invalid.

Exactly what i said.

But why don't we just leave the HTML_NOSTACK as it is now?

> http://www.w3.org/TR/html-markup/p.html#p
> 
> In short, mdoc(7) and man(7) can have all sorts of nested block
> scopes, and so can HTML, but <p> can't contain flow elements.

Sure.  It just ends implicitly when a flow element begins.

> I think it's best, and cleanest, to just avoid <p> and use <div>,
> say as <div class="par">.

I don't understand which problem you think that might solve.

But it would make producing well-formed HTML5 output harder,
not easier.  Because, if you make .Pp a <div>, how do you know
when to close it out?  I don't say you cannot come up with a
working algorithm, you certainly can.  But you just make it
harder than if you just open it and never ever close it again.
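
For illustration only, one conceivable bookkeeping scheme for a paragraph <div> looks like this in C.  Nothing here comes from the patch or from html.c: struct par_out, par_break(), par_end() and par_scope_end() are hypothetical names, and the policy (close the pending paragraph at the next break or when the enclosing scope ends) is just one assumption among several possible ones.  It shows the extra state that an explicitly closed element requires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAR_BUFSZ	256

struct par_out {
	char	 buf[PAR_BUFSZ];	/* generated markup */
	size_t	 len;
	int	 par_open;	/* nonzero while a paragraph <div> is open */
};

/* Append s to the output; the sketch aborts instead of truncating. */
void
par_puts(struct par_out *o, const char *s)
{
	size_t n = strlen(s);

	assert(o->len + n < PAR_BUFSZ);
	memcpy(o->buf + o->len, s, n + 1);
	o->len += n;
}

/* Close the pending paragraph, if there is one. */
void
par_end(struct par_out *o)
{
	if (o->par_open) {
		par_puts(o, "</div>");
		o->par_open = 0;
	}
}

/* Paragraph break: end the previous paragraph, start a new one. */
void
par_break(struct par_out *o)
{
	par_end(o);
	par_puts(o, "<div class=\"par\">");
	o->par_open = 1;
}

/* Every place that ends an enclosing scope has to call this. */
void
par_scope_end(struct par_out *o, const char *closing)
{
	par_end(o);
	par_puts(o, closing);
}
```

Every code path that ends a block has to go through something like par_scope_end(), which is the additional work compared to opening a <p> and never closing it.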

> I understand we lose whatever semantic-ness may be acquired from
> <p>, but I don't think it's much of a loss

Actually, it is no loss at all, since .Pp is *not* implemented
as a block but as an in-line element.  It doesn't mark up any
content in any way, it just says "put a paragraph break here",
exactly like .br says "put a line break here".

> given that, with a classed <div>, we'll have *real* semantics
> (the flow elements will be within the paragraph).

I have no idea what the advantage would be...

> Naturally, this would work happily with HTML4 just as it
> would with HTML5.

So does what we have now as far as i can see.

> I'll work on a patch and will submit it soon...

Actually, there is one problem with <p>, or rather, with <pre>.
The .Bd block is marked up as <pre>, which can only contain
phrasing, but .Bd can contain all sorts of blocks and .Pp.
But that's unrelated to the HTML5 switch.

Yours,
  Ingo