From: "Anthony J. Bentley"
To: tech@mdocml.bsd.lv
Subject: Use literal text for HTML section ids
Date: Thu, 24 Dec 2015 19:37:05 -0700
Message-ID: <1547.1451011025@CATHET.us>

Hi,

Currently mandoc(1) generates HTML ids by prefixing with 'x', then
printing the ASCII values in hexadecimal. Presumably this was done to
satisfy HTML 4's fairly strict requirements for id values,

    [A-Za-z][-A-Za-z0-9_:.]*

Since we've gone full HTML 5, though, the requirement is much simpler:
an id can contain anything except spaces. So we can print the contents
of the section out directly, and just replace spaces with underscores.

This makes section-linked URLs much more sensible:

    http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man1/ls.1#x546865204c6f6e6720466f726d6174

versus

    http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man1/ls.1#The_Long_Format

It even does the right thing for ids containing UTF-8, whether literal
or percent-encoded in the URL (tested in Firefox and Lynx--they will
correctly follow both types).

This is not a change to apply lightly, as it breaks any existing links
that include ids. But mandoc currently doesn't expose them (unless you
view source, find the id, and append it to the URL manually), and I
suspect the current unreadable format might make people reluctant to
use them anyway. I think this change is worth it.
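
To make the difference between the two schemes concrete, here is a
minimal standalone sketch, not the mandoc code itself. The names
id_hex() and id_literal() and the fixed-size buffer interface are
invented for illustration; mandoc appends to its own struct html
buffer through bufcat_fmt() instead. The cast to unsigned char is also
an addition of this sketch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Old scheme: an 'x' prefix followed by two hex digits per byte.
 * Hypothetical helper; returns 0 on success, -1 if dst is too small.
 */
int
id_hex(char *dst, size_t dstsz, const char *src)
{
	size_t	 len = 0;

	if (dstsz < 2)
		return -1;
	dst[len++] = 'x';
	for (; *src != '\0'; src++) {
		if (dstsz - len < 3)	/* two digits plus the NUL */
			return -1;
		/* Cast so that bytes >= 0x80 are not sign-extended. */
		snprintf(dst + len, 3, "%.2x", (unsigned char)*src);
		len += 2;
	}
	dst[len] = '\0';
	return 0;
}

/*
 * Proposed scheme: copy the text as is, replacing each space
 * with an underscore.
 * Hypothetical helper; returns 0 on success, -1 if dst is too small.
 */
int
id_literal(char *dst, size_t dstsz, const char *src)
{
	size_t	 len = 0;

	if (dstsz == 0)
		return -1;
	for (; *src != '\0'; src++) {
		if (dstsz - len < 2)	/* one byte plus the NUL */
			return -1;
		dst[len++] = *src == ' ' ? '_' : *src;
	}
	dst[len] = '\0';
	return 0;
}
```

Called on the section title "The Long Format" with a large enough
buffer, id_hex() yields the fragment from the first URL above and
id_literal() the fragment from the second.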
Index: mdoc_html.c
===================================================================
RCS file: /cvs/mdocml/mdoc_html.c,v
retrieving revision 1.238
diff -u -p -r1.238 mdoc_html.c
--- mdoc_html.c	12 Oct 2015 00:08:15 -0000	1.238
+++ mdoc_html.c	24 Dec 2015 21:52:02 -0000
@@ -542,7 +542,6 @@ mdoc_sh_pre(MDOC_ARGS)
 	}
 
 	bufinit(h);
-	bufcat(h, "x");
 
 	for (n = n->child; n != NULL && n->type == ROFFT_TEXT; ) {
 		bufcat_id(h, n->string);
@@ -572,7 +571,6 @@ mdoc_ss_pre(MDOC_ARGS)
 		return 1;
 
 	bufinit(h);
-	bufcat(h, "x");
 
 	for (n = n->child; n != NULL && n->type == ROFFT_TEXT; ) {
 		bufcat_id(h, n->string);
@@ -1063,7 +1061,7 @@ mdoc_sx_pre(MDOC_ARGS)
 	struct htmlpair	 tag[2];
 
 	bufinit(h);
-	bufcat(h, "#x");
+	bufcat(h, "#");
 
 	for (n = n->child; n; ) {
 		bufcat_id(h, n->string);
Index: html.c
===================================================================
RCS file: /cvs/mdocml/html.c,v
retrieving revision 1.191
diff -u -p -r1.191 html.c
--- html.c	13 Oct 2015 22:59:54 -0000	1.191
+++ html.c	24 Dec 2015 21:52:02 -0000
@@ -720,8 +720,8 @@ void
 bufcat_id(struct html *h, const char *src)
 {
 
-	/* Cf. . */
+	/* Cf. . */
 
-	while ('\0' != *src)
-		bufcat_fmt(h, "%.2x", *src++);
+	for (; '\0' != *src; src++)
+		bufcat_fmt(h, "%c", *src == ' ' ? '_' : *src);
 }
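
One way to sanity-check the ids that the patched bufcat_id() would
emit is to test them against both grammars. The sketch below is an
assumption-laden illustration, not part of the patch: the function
names id_ok_html4() and id_ok_html5() are invented here, the HTML 4
check follows the pattern quoted in the message, and the HTML 5 check
assumes the rule is "non-empty, and no space characters", where space
characters are space, tab, LF, FF and CR. It also assumes the default
"C" locale, so that isalpha() and isalnum() only accept ASCII.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* HTML 4 id grammar: [A-Za-z][-A-Za-z0-9_:.]* */
bool
id_ok_html4(const char *id)
{
	/* The first byte must be a letter; this also rejects "". */
	if (!isalpha((unsigned char)*id))
		return false;
	for (id++; *id != '\0'; id++)
		if (!isalnum((unsigned char)*id) &&
		    strchr("-_:.", *id) == NULL)
			return false;
	return true;
}

/* HTML 5 rule as assumed above: non-empty, no space characters. */
bool
id_ok_html5(const char *id)
{
	if (*id == '\0')
		return false;
	/* strcspn() stops at the first space character, if any. */
	return id[strcspn(id, " \t\n\f\r")] == '\0';
}
```

Under these assumptions, "The_Long_Format" passes both checks, while a
title that starts with a digit or contains non-ASCII bytes passes only
the HTML 5 one. Since the patch replaces only the space character, a
title containing a literal tab would still fail id_ok_html5().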