From: "Anthony J. Bentley"
To: tech@mandoc.bsd.lv
Subject: -Thtml: Use hexadecimal character references
Date: Thu, 13 Jul 2017 19:05:36 -0600
Message-ID: <60588.1499994336@cathet.us>

Hi,

Since Unicode codepoints are universally referred to in hexadecimal,
this would make the HTML source easier to reason about while debugging.
I have learned many hexadecimal codepoints over the years, but not a
single one in decimal.

The longest character reference, "&#1114111;", is the same length as
"&#x10ffff;", so numbuf doesn't need to be increased in size.

Index: html.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/html.c,v
retrieving revision 1.85
diff -u -p -r1.85 html.c
--- html.c	23 Jun 2017 02:31:39 -0000	1.85
+++ html.c	13 Jul 2017 16:11:23 -0000
@@ -451,7 +451,7 @@ print_encode(struct html *h, const char
 		    (c > 0x7E && c < 0xA0))
 			c = 0xFFFD;
 		if (c > 0x7E) {
-			(void)snprintf(numbuf, sizeof(numbuf), "&#%d;", c);
+			(void)snprintf(numbuf, sizeof(numbuf), "&#x%x;", c);
 			print_word(h, numbuf);
 		} else if (print_escape(h, c) == 0)
 			print_byte(h, c);
@@ -514,7 +514,7 @@ print_otag(struct html *h, enum htmltag
 		print_indent(h);
 	else if ((h->flags & HTML_NOSPACE) == 0) {
 		if (h->flags & HTML_KEEP)
-			print_word(h, "&#160;");
+			print_word(h, "&#xa0;");
 		else {
 			if (h->flags & HTML_PREKEEP)
 				h->flags |= HTML_KEEP;
@@ -777,7 +777,7 @@ print_text(struct html *h, const char *w
 				h->flags |= HTML_KEEP;
 			print_endword(h);
 		} else
-			print_word(h, "&#160;");
+			print_word(h, "&#xa0;");
 	}
 
 	assert(NULL == h->metaf);
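
The length argument above can be checked with a small standalone sketch.
Everything named here is hypothetical and not part of mandoc: the helpers
ref_dec(), ref_hex() and same_length() exist only for illustration, and
NUMBUF_SIZE is an assumed buffer size rather than the one declared in
html.c. The cast to unsigned int is my addition to match what %x expects;
the patch itself passes c directly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NUMBUF_SIZE 16	/* assumed size, not taken from html.c */

/* Format codepoint c as a decimal character reference. */
static int
ref_dec(char *buf, size_t sz, int c)
{
	return snprintf(buf, sz, "&#%d;", c);
}

/* Format codepoint c as a hexadecimal character reference. */
static int
ref_hex(char *buf, size_t sz, int c)
{
	return snprintf(buf, sz, "&#x%x;", (unsigned int)c);
}

/*
 * Return 1 if the decimal and hexadecimal references for c have the
 * same length.  The assertions check that neither form was truncated
 * in a buffer of NUMBUF_SIZE bytes.
 */
static int
same_length(int c)
{
	char dec[NUMBUF_SIZE], hex[NUMBUF_SIZE];
	int dlen, hlen;

	dlen = ref_dec(dec, sizeof(dec), c);
	hlen = ref_hex(hex, sizeof(hex), c);
	assert(dlen > 0 && (size_t)dlen < sizeof(dec));
	assert(hlen > 0 && (size_t)hlen < sizeof(hex));
	return dlen == hlen;
}
```

For the largest codepoint, 0x10FFFF, both helpers produce ten characters
plus the terminating NUL, so same_length(0x10FFFF) returns 1. Smaller
codepoints can differ by a character in either direction, but the worst
case is what bounds the buffer.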