From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp-2.sys.kth.se (smtp-2.sys.kth.se [130.237.32.160])
	by krisdoz.my.domain (8.14.3/8.14.3) with ESMTP id oBR04hjb006096
	for ; Sun, 26 Dec 2010 19:04:43 -0500 (EST)
Received: from smtp-2.sys.kth.se (localhost [127.0.0.1])
	by smtp-2.sys.kth.se (Postfix) with ESMTP id EFB0914D79E;
	Mon, 27 Dec 2010 01:04:36 +0100 (CET)
X-Virus-Scanned: by amavisd-new at kth.se
Received: from smtp-2.sys.kth.se ([127.0.0.1])
	by smtp-2.sys.kth.se (smtp-2.sys.kth.se [127.0.0.1]) (amavisd-new, port 10024)
	with LMTP id gyR2QMC2Jqpf; Mon, 27 Dec 2010 01:04:13 +0100 (CET)
X-KTH-Auth: kristaps [46.109.54.191]
X-KTH-mail-from: kristaps@bsd.lv
Received: from macky.local (unknown [46.109.54.191])
	by smtp-2.sys.kth.se (Postfix) with ESMTP id 31B4414C133;
	Mon, 27 Dec 2010 01:04:11 +0100 (CET)
Message-ID: <4D17D7F9.2070904@bsd.lv>
Date: Mon, 27 Dec 2010 02:04:09 +0200
From: Kristaps Dzonsons
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6
X-Mailinglist: mdocml-discuss
Reply-To: discuss@mdocml.bsd.lv
MIME-Version: 1.0
To: discuss@mdocml.bsd.lv
CC: Ingo Schwarze, Nicolas Joly, Thomas Klausner
Subject: Re: Sx quoting and Header
References: <20101225152204.GY21954@danbala.tuwien.ac.at> <20101226233009.GM23914@iris.usta.de>
In-Reply-To: <20101226233009.GM23914@iris.usta.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 27/12/2010 01:30, Ingo Schwarze wrote:
> Hi Thomas and Nicolas,
>
> gah, you are making *me* fix -Thtml bugs... :)
>
> Thomas Klausner wrote on Sat, Dec 25, 2010 at 04:22:04PM +0100:
>
>> 1. When the argument for .Sx is not quoted and contains a space, it
>> breaks the link. Example attached. The .Sx links differ as follows:
>> the one without the quotes has "20" in a place where the other has
>> "x20x". Shouldn't the argument of .Sx be taken as one item independent
>> on if it has quotes or not?
>
> Well, whether it is handled as one AST node internally or not
> is not that important here (using -Ttree, you see that it is
> indeed not the same, which does make sense in other contexts),
> but the bug here is that the ID generator ought to handle
> two words in the same way as one word with a blank in the middle.
>
> OK to commit the following patch?
>
> It makes sure the "x" protecting the beginning of the ID
> only gets added before the first word, not before each word.
>
> Thanks for reporting!
>
> Yours,
> Ingo
>
>
>> .Dd December 25, 2010
>> .Dt TEST 1
>> .Os
>> .Sh NAME
>> .Nm test
>> .Nd test page
>> .Sh DESCRIPTION
>> .Ss Programming Guide
>> See
>> .Sx "Programming Guide"
>> or
>> .Sx Programming Guide
>
>
> Index: html.c
> ===================================================================
> RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.c,v
> retrieving revision 1.123
> diff -d -u -p -r1.123 html.c
> --- html.c 24 Dec 2010 14:14:00 -0000 1.123
> +++ html.c 26 Dec 2010 23:10:43 -0000
> @@ -771,20 +771,20 @@ html_idcat(char *dst, const char *src, i
>  {
>  	int ssz;
>
> -	assert(sz);
> +	assert(sz > 2);
>
>  	/* Cf.. */
>
> -	for ( ; *dst != '\0' && sz; dst++, sz--)
> -		/* Jump to end. */ ;
> -
> -	assert(sz > 2);
> -
>  	/* We can't start with a number (bah). */
>
> -	*dst++ = 'x';
> -	*dst = '\0';
> -	sz--;
> +	if ('\0' == *dst) {
> +		*dst++ = 'x';
> +		*dst = '\0';
> +		sz--;
> +	}
> +
> +	for ( ; *dst != '\0' && sz; dst++, sz--)
> +		/* Jump to end. */ ;
>
>  	for ( ; *src != '\0' && sz > 1; src++) {
>  		ssz = snprintf(dst, (size_t)sz, "%.2x", *src);

Ingo,

Great! To assuage the trauma of dealing with -T[x]html, I've a patch on
hand that stashes the mdoc_validate.c concatenated `Sh' string (see
post_sh()) into a mdoc_sh structure. This buffer is then pumped right
into html_idcat(), removing all those stupid loops over children.
The neat part about this is that, if I add a sorted list among these, I
can do the same in `Sx' and add a check that makes sure `Sx' links
actually go somewhere AND that sections/subsections/etc. aren't
duplicates.

It's only little bits of code, but since this isn't checked by nroff,
I'll post it here for a relevancy check before it goes in.

Note, if I haven't ever explained myself, that the hex encoding is just
to get around the fact that IDs are case-insensitive, while -mdoc is
case-sensitive (NAME != name)...

Kristaps
--
 To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv
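
For anyone following the thread, here is a minimal sketch in C of the ID
scheme discussed above: each byte of the section title is hex-encoded, so
NAME and name get different IDs, and the protecting "x" is added only
when the buffer is still empty. This is not the html.c code. The names
idcat() and idwords() are hypothetical, and the buffer handling is
simplified (a size_t capacity and strlen() instead of the sz/dst
bookkeeping in the patch).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch only: append the hex encoding of `src' to the ID already in
 * `dst', which has room for `sz' bytes including the terminating NUL.
 * Modelled on the patched html_idcat() quoted above, not copied from it.
 */
void
idcat(char *dst, const char *src, size_t sz)
{
	size_t len;

	assert(sz > 2);

	/* An ID can't start with a digit: prefix 'x', but only once. */
	if ('\0' == *dst) {
		dst[0] = 'x';
		dst[1] = '\0';
	}

	len = strlen(dst);

	/* Two hex digits per byte; stop when the next pair won't fit. */
	for ( ; *src != '\0' && len + 2 < sz; src++, len += 2)
		snprintf(dst + len, sz - len, "%.2x",
		    (unsigned int)(unsigned char)*src);
}

/*
 * Hypothetical helper: build the ID for an unquoted argument that
 * arrives as separate words, joining the words with a single blank.
 */
void
idwords(char *dst, size_t sz, const char *const *words, size_t nwords)
{
	size_t i;

	dst[0] = '\0';
	for (i = 0; i < nwords; i++) {
		if (i > 0)
			idcat(dst, " ", sz);
		idcat(dst, words[i], sz);
	}
}
```

With the "x" added only once, idwords() on the two words "Programming"
and "Guide" gives the same ID as idcat() on the single string
"Programming Guide", which is the behaviour Thomas asked for.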