tech@mandoc.bsd.lv
* [PATCH mandoc] Add lang attribute to <html>
@ 2019-04-15  8:14 Stephen Gregoratto
  2019-04-23 14:20 ` Stephen Gregoratto
  2019-04-23 20:40 ` Ingo Schwarze
  0 siblings, 2 replies; 3+ messages in thread
From: Stephen Gregoratto @ 2019-04-15  8:14 UTC (permalink / raw)
  To: tech

This patch sets the lang attribute to "en" for all HTML output.
This is required for CSS hyphenation, which is supported by all modern
browsers[1].

Given your comments about non-English manpages[2], I decided that "en"
is a good default. The alternative would be for mandoc to determine the 
input language and map it to an ISO 639-1 language code.
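
As a rough illustration of only the mapping half of that alternative,
here is a minimal sketch.  It assumes a POSIX-style locale name such
as "de_DE.UTF-8" is somehow already available for the page, which is
an assumption and not something mandoc provides.  The helper name
locale_to_lang is hypothetical.  It only checks the shape of the
name, not whether the result is a registered ISO 639-1 code.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mandoc: derive a two-letter
 * language code from a POSIX-style locale name such as "de_DE.UTF-8".
 * Returns 0 and stores the lowercased code in out, or returns -1
 * if no two-letter language part is present (e.g. "C" or "POSIX").
 */
static int
locale_to_lang(const char *locale, char out[3])
{
	size_t	 len;

	if (locale == NULL)
		return -1;

	/* The language part ends at '_', '.', '@', or end of string. */
	len = strcspn(locale, "_.@");
	if (len != 2)
		return -1;
	if (!isalpha((unsigned char)locale[0]) ||
	    !isalpha((unsigned char)locale[1]))
		return -1;

	out[0] = (char)tolower((unsigned char)locale[0]);
	out[1] = (char)tolower((unsigned char)locale[1]);
	out[2] = '\0';
	return 0;
}
```

With this sketch, "de_DE.UTF-8" yields "de", while "C" and "POSIX"
yield no code at all, so a caller could fall back to omitting the
attribute.  Determining the language of the input itself remains the
hard, unsolved part.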

I've tested adding "hyphens: auto" to mandoc.css on my man.cgi(8) server 
and found good results with Firefox and Chromium.
I didn't add it to the stylesheet in this patch, but if you think this 
could be added in the future I recommend reading this comprehensive 
walkthrough[3] on CSS hyphenation and its fine-grained settings.

[1] https://caniuse.com/#feat=css-hyphens
[2] https://lists.gnu.org/archive/html/groff/2018-12/msg00181.html
[3] http://clagnut.com/blog/2395

Index: cgi.c
===================================================================
RCS file: /cvs/mandoc/cgi.c,v
retrieving revision 1.166
diff -u -p -r1.166 cgi.c
--- cgi.c	6 Mar 2019 12:32:41 -0000	1.166
+++ cgi.c	15 Apr 2019 07:36:49 -0000
@@ -368,7 +368,7 @@ resp_begin_html(int code, const char *ms
 	resp_begin_http(code, msg);
 
 	printf("<!DOCTYPE html>\n"
-	       "<html>\n"
+	       "<html lang=\"en\">\n"
 	       "<head>\n"
 	       "  <meta charset=\"UTF-8\"/>\n"
 	       "  <meta name=\"viewport\""
Index: html.c
===================================================================
RCS file: /cvs/mandoc/html.c,v
retrieving revision 1.254
diff -u -p -r1.254 html.c
--- html.c	3 Mar 2019 13:02:11 -0000	1.254
+++ html.c	15 Apr 2019 07:36:49 -0000
@@ -647,6 +647,9 @@ print_otag(struct html *h, enum htmltag 
 		case 'i':
 			attr = "id";
 			break;
+		case 'l':
+			attr = "lang";
+			break;
 		case '?':
 			attr = arg1;
 			arg1 = va_arg(ap, char *);
Index: mdoc_html.c
===================================================================
RCS file: /cvs/mandoc/mdoc_html.c,v
retrieving revision 1.328
diff -u -p -r1.328 mdoc_html.c
--- mdoc_html.c	1 Mar 2019 10:57:18 -0000	1.328
+++ mdoc_html.c	15 Apr 2019 07:36:49 -0000
@@ -293,7 +293,7 @@ html_mdoc(void *arg, const struct roff_m
 
 	if ((h->oflags & HTML_FRAGMENT) == 0) {
 		print_gen_decls(h);
-		print_otag(h, TAG_HTML, "");
+		print_otag(h, TAG_HTML, "l", "en");
 		if (n != NULL && n->type == ROFFT_COMMENT)
 			print_gen_comment(h, n);
 		t = print_otag(h, TAG_HEAD, "");
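
The html.c hunk above adds one more format letter to a switch that
maps letters to attribute names.  The following stand-alone sketch
shows that general technique under simplifying assumptions: the
function sketch_otag, its buffer-based interface, and its error
handling are hypothetical and are not mandoc's print_otag().

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical, simplified open-tag printer, not mandoc's code.
 * Each letter in fmt consumes one string argument and selects the
 * attribute name it is written under.  Values are assumed to be
 * trusted; no HTML escaping is done.
 * Returns the length written, or -1 on error or truncation.
 */
static int
sketch_otag(char *buf, size_t sz, const char *tag, const char *fmt, ...)
{
	va_list		 ap;
	const char	*attr, *val;
	int		 len, n;

	len = snprintf(buf, sz, "<%s", tag);
	if (len < 0 || (size_t)len >= sz)
		return -1;

	va_start(ap, fmt);
	for (; *fmt != '\0'; fmt++) {
		switch (*fmt) {
		case 'i':
			attr = "id";
			break;
		case 'l':
			attr = "lang";
			break;
		default:
			va_end(ap);
			return -1;	/* unknown format letter */
		}
		val = va_arg(ap, const char *);
		n = snprintf(buf + len, sz - len, " %s=\"%s\"", attr, val);
		if (n < 0 || (size_t)n >= sz - len) {
			va_end(ap);
			return -1;	/* output truncated */
		}
		len += n;
	}
	va_end(ap);

	n = snprintf(buf + len, sz - len, ">");
	if (n < 0 || (size_t)n >= sz - len)
		return -1;
	return len + n;
}
```

In this sketch, sketch_otag(buf, sizeof(buf), "html", "l", "en")
fills buf with <html lang="en">, and an empty format string gives a
bare <html>.
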
-- 
Stephen Gregoratto
PGP: 3FC6 3D0E 2801 C348 1C44 2D34 A80C 0F8E 8BAB EC8B
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


* Re: [PATCH mandoc] Add lang attribute to <html>
  2019-04-15  8:14 [PATCH mandoc] Add lang attribute to <html> Stephen Gregoratto
@ 2019-04-23 14:20 ` Stephen Gregoratto
  2019-04-23 20:40 ` Ingo Schwarze
  1 sibling, 0 replies; 3+ messages in thread
From: Stephen Gregoratto @ 2019-04-23 14:20 UTC (permalink / raw)
  To: tech

On 2019-04-15 18:14, Stephen Gregoratto wrote:
> This patch sets the lang attribute to "en" for all HTML output.
> This is required for CSS hyphenation, which is supported by all modern
> browsers[1].
> [...]

Not sure if this one got through so ping I guess.
-- 
Stephen Gregoratto
PGP: 3FC6 3D0E 2801 C348 1C44 2D34 A80C 0F8E 8BAB EC8B

* Re: [PATCH mandoc] Add lang attribute to <html>
  2019-04-15  8:14 [PATCH mandoc] Add lang attribute to <html> Stephen Gregoratto
  2019-04-23 14:20 ` Stephen Gregoratto
@ 2019-04-23 20:40 ` Ingo Schwarze
  1 sibling, 0 replies; 3+ messages in thread
From: Ingo Schwarze @ 2019-04-23 20:40 UTC (permalink / raw)
  To: Stephen Gregoratto; +Cc: tech

Hi Stephen,

Stephen Gregoratto wrote on Mon, Apr 15, 2019 at 06:14:14PM +1000:

> This patch sets the lang attribute to "en" for all HTML output.
> This is required for CSS hyphenation, which is supported by all
> modern browsers[1].

Actually, i consider that a downside rather than an advantage.
For technical documents, automatic hyphenation provides no benefit
but risks introducing technical ambiguities.  In a browser, the
very minor gain in beauty matters even less than on a terminal
because browser windows are almost always wider than terminals.

I know that HTML standards recommend specifying the language.
But neither man.cgi(8) nor mandoc -T html can know the language (at
least so far, i don't see any good way to find out), and sometimes
specifying the wrong language is clearly worse than never specifying
any language at all.
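
Purely as an illustration of "never specifying any language" when
none is known, a sketch could look as follows.  The function
html_open_tag and its interface are hypothetical; nothing like it
exists in mandoc, and it leaves open where a reliable language value
would come from.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of mandoc: write the opening <html>
 * tag to buf, adding a lang attribute only when a language is known.
 * lang is assumed to be a trusted tag such as "en"; no escaping is
 * done.  Returns the length written, or -1 on error or truncation.
 */
static int
html_open_tag(char *buf, size_t sz, const char *lang)
{
	int	 n;

	if (lang == NULL || *lang == '\0')
		n = snprintf(buf, sz, "<html>\n");
	else
		n = snprintf(buf, sz, "<html lang=\"%s\">\n", lang);
	return (n < 0 || (size_t)n >= sz) ? -1 : n;
}
```

Passing NULL or an empty string produces a plain <html> line, so an
unknown language results in no attribute rather than a wrong one.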

> Given your comments about non-English manpages[2],

I do think that maintaining manual pages in non-English languages
is currently riddled with many problems.  Consequently, when
significant gain for English manual pages can be achieved at small
expense for non-English pages, that would probably be the way to go
for now, to be revisited once the more significant problems are
better under control.

However, if very minor (or even irrelevant) gain for English manual
pages would cause substantial problems for non-English languages,
that's not good.  For some languages, translating manual pages might
make sense.  The tools shouldn't gratuitously obstruct reading of
non-English manual pages.

> I decided that "en" is a good default.

Your patch does not change a default.
It hardcodes "en" with no possibility to get anything else,
or even to leave it out.  That seems excessive to me.

On the other hand, i doubt the lang attribute is important
enough to make it optional or configurable.

So i tend to reject the patch, and i'm not asking for an improved
version.  I don't see how it could be improved - which doesn't
mean it cannot, i just don't see how.

> I didn't add it to the stylesheet in this patch, but if you think this 
> could be added in the future I recommend reading this comprehensive 
> walkthrough[3] on CSS hyphenation and its fine-grained settings.

I wrote an automatic hyphenation system for the German language as
a part of a text editor that i wrote at the time for use by myself
and by my father when i was 13 or 14 years old, and the hyphenation
system worked reasonably well even though it needed fewer than a
hundred lines of code - but i lost interest in the topic before
even coming of age...  :-)

And i would certainly be opposed to switching on hyphenation by
default, or to accepting any downsides in order to make hyphenation
configurable.

Yours,
  Ingo

> [1] https://caniuse.com/#feat=css-hyphens
> [2] https://lists.gnu.org/archive/html/groff/2018-12/msg00181.html
> [3] http://clagnut.com/blog/2395