From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-02.scc.kit.edu (scc-mailout-kit-02.scc.kit.edu [129.13.231.82])
	by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id d8abdbe3
	for ; Fri, 14 Jul 2017 07:38:10 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
	by scc-mailout-kit-02.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(envelope-from ) id 1dVzr4-0006G7-3i; Fri, 14 Jul 2017 14:38:09 +0200
Received: from donnerwolke.usta.de ([172.24.96.3])
	by hekate.usta.de with esmtp (Exim 4.77)
	(envelope-from ) id 1dVzr3-0006gf-DB; Fri, 14 Jul 2017 14:37:57 +0200
Received: from athene.usta.de ([172.24.96.10])
	by donnerwolke.usta.de with esmtp (Exim 4.84_2)
	(envelope-from ) id 1dVzr3-00086d-4N; Fri, 14 Jul 2017 14:37:57 +0200
Received: from localhost (athene.usta.de [local])
	by athene.usta.de (OpenSMTPD) with ESMTPA id f1dde157;
	Fri, 14 Jul 2017 14:37:57 +0200 (CEST)
Date: Fri, 14 Jul 2017 14:37:57 +0200
From: Ingo Schwarze
To: "Anthony J. Bentley"
Cc: tech@mandoc.bsd.lv
Subject: Re: -Thtml: Use hexadecimal character references
Message-ID: <20170714123757.GA32226@athene.usta.de>
References: <60588.1499994336@cathet.us>
X-Mailinglist: mandoc-tech
Reply-To: tech@mandoc.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <60588.1499994336@cathet.us>
User-Agent: Mutt/1.6.2 (2016-07-01)

Hi,

Anthony J. Bentley wrote on Thu, Jul 13, 2017 at 07:05:36PM -0600:

> Since Unicode codepoints are universally referred to in hexadecimal,
> this would make the HTML source easier to reason about while debugging.
> I have learned many hexadecimal codepoints over the years, but not a
> single one in decimal.

I agree.

But codepoints conventionally use capital hex digits and
at least four of them.  Feel free to commit in the following
form, if you agree, and i'll merge to bsd.lv and adjust the
test suite.

Thanks,
  Ingo


Index: html.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/html.c,v
retrieving revision 1.85
diff -u -p -r1.85 html.c
--- html.c	23 Jun 2017 02:31:39 -0000	1.85
+++ html.c	14 Jul 2017 12:33:16 -0000
@@ -451,7 +451,7 @@ print_encode(struct html *h, const char
 		    (c > 0x7E && c < 0xA0))
 			c = 0xFFFD;
 		if (c > 0x7E) {
-			(void)snprintf(numbuf, sizeof(numbuf), "&#%d;", c);
+			(void)snprintf(numbuf, sizeof(numbuf), "&#x%.4X;", c);
 			print_word(h, numbuf);
 		} else if (print_escape(h, c) == 0)
 			print_byte(h, c);
@@ -514,7 +514,7 @@ print_otag(struct html *h, enum htmltag
 		print_indent(h);
 	else if ((h->flags & HTML_NOSPACE) == 0) {
 		if (h->flags & HTML_KEEP)
-			print_word(h, "&#160;");
+			print_word(h, "&#x00A0;");
 		else {
 			if (h->flags & HTML_PREKEEP)
 				h->flags |= HTML_KEEP;
@@ -777,7 +777,7 @@ print_text(struct html *h, const char *w
 				h->flags |= HTML_KEEP;
 			print_endword(h);
 		} else
-			print_word(h, "&#160;");
+			print_word(h, "&#x00A0;");
 	}
 
 	assert(NULL == h->metaf);
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv
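
For illustration only, here is a minimal standalone sketch of what the
new format string in the first hunk produces.  The helper name
format_charref() is hypothetical and not part of mandoc; only the
"&#x%.4X;" format string is taken from the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of mandoc: write code point c into
 * buf as an HTML numeric character reference, using the same format
 * string as the patched print_encode().  Returns the snprintf(3)
 * result, i.e. the length of the formatted reference.
 */
int
format_charref(char *buf, size_t bufsz, int c)
{
	/*
	 * %X prints uppercase hexadecimal digits; the precision .4
	 * pads with leading zeros to at least four digits and never
	 * truncates larger values.
	 */
	return snprintf(buf, bufsz, "&#x%.4X;", (unsigned int)c);
}
```

With this format, code point 160 (formerly written as "&#160;")
becomes "&#x00A0;", which matches the usual U+00A0 notation, and a
code point beyond the Basic Multilingual Plane such as 0x1F600 keeps
all five digits: "&#x1F600;".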