Date: Mon, 19 Sep 2011 09:58:16 +0200
From: Ingo Schwarze
To: tech@mdocml.bsd.lv
Subject: Re: 1.11.7 minor issues
Message-ID: <20110919075816.GA1243@iris.usta.de>
References: <20110918233910.GK29692@iris.usta.de> <4E7684BF.7030702@bsd.lv>
In-Reply-To: <4E7684BF.7030702@bsd.lv>

Hi Kristaps,

Kristaps Dzonsons wrote on Mon, Sep 19, 2011 at 01:54:39AM +0200:
> On 19/09/2011 01:39, Ingo Schwarze wrote:

>> after fixing the two larger problems, systematic comparisons
>> revealed two smaller issues that are new in 1.11.7:

>>>@@ -52,8 +52,8 @@
>>> -center:off
>>> -center=off
>>> -center-
>>>- Lynx recognizes "1", "+", "on" and "true" for true values, and "0",
>>>- "-", "off" and "false" for false values. Other option-values are
>>>+ Lynx recognizes "1", "+", "on" and "true" for true values, and "0", "-
>>>+ ", "off" and "false" for false values. Other option-values are
>>> ignored.

>>>@@ -109,8 +109,8 @@
>>> Many folks attempt a simple-minded regular expression approach, like
>>> "s/<.*?>//g", but that fails in many cases because the tags may
>>> continue over line breaks, they may contain quoted angle-brackets, or
>>>- HTML comment may be present. Plus, folks forget to convert
>>>- entities--like "<" for example.
>>>+ HTML comment may be present. Plus, folks forget to convert entities--
>>>+ like "<" for example.
>>>
>>> Here's one "simple-minded" approach, that works for most files:

>> I will think about those two tomorrow.

> Around line 577 in roff.c is where mandoc_hyph ended up: the quotes
> need to be added.
>
> As for the second one, we should bring jmc@ in, no? I'd think that
> double or triple-dashes would be broken. Unicode, for one,
>
> http://www.cs.tut.fi/~jkorpela/dashes.html#linebreaks
>
> stipulates that en and em dashes break the line.
>
> Thoughts?

I think you are right that breaking at double dashes ought to be ok.
However, groff doesn't break there, i don't consider the point
of sufficient importance to deviate from groff, and not breaking
at double hyphens keeps the code simpler.

I have checked for all non-alpha ASCII characters that groff indeed
doesn't break the line if they precede or follow a dash.

So, here is what i have done for now - OK?

CVSROOT: /cvs
Module name: src
Changes by: schwarze@cvs.openbsd.org 2011/09/19 01:53:54

Modified files:
 usr.bin/mandoc : roff.c

Log message:
Breaking the line at a hyphen is only allowed if the hyphen
is both preceded and followed by an alphabetic character.
This fixes about a dozen places in base.

Index: roff.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/roff.c,v
retrieving revision 1.43
diff -u -p -r1.43 roff.c
--- roff.c 18 Sep 2011 23:26:18 -0000 1.43
+++ roff.c 19 Sep 2011 07:49:59 -0000
@@ -552,7 +552,6 @@ again:
 static enum rofferr
 roff_parsetext(char *p)
 {
-        char             l, r;
         size_t           sz;
         const char      *start;
         enum mandoc_esc  esc;
@@ -579,14 +578,8 @@ roff_parsetext(char *p)
                         continue;
                 }
 
-                l = *(p - 1);
-                r = *(p + 1);
-                if ('\\' != l &&
-                                '\t' != r && '\t' != l &&
-                                ' ' != r && ' ' != l &&
-                                '-' != r && '-' != l &&
-                                ! isdigit((unsigned char)l) &&
-                                ! isdigit((unsigned char)r))
+                if (isalpha((unsigned char)p[-1]) &&
+                    isalpha((unsigned char)p[1]))
                         *p = ASCII_HYPH;
                 p++;
         }
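
For anybody who wants to try the rule from the log message outside of
mandoc, here is a minimal standalone sketch. It is not the roff.c code:
the function name mark_breakable_hyphens() and the DEMO_HYPH value are
made up for illustration, and the escape-sequence handling that
roff_parsetext() does before it reaches a hyphen is left out entirely.
It only shows the "alphabetic character on both sides" test.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for mandoc's ASCII_HYPH marker. */
#define DEMO_HYPH '\x1e'

/*
 * Replace each '-' in buf by DEMO_HYPH if and only if it is both
 * preceded and followed by an alphabetic character.  All other
 * hyphens (doubled, quoted, next to digits or blanks, or at either
 * end of the string) are left alone, so no break is allowed there.
 */
static void
mark_breakable_hyphens(char *buf)
{
	size_t	 i, len;

	assert(buf != NULL);
	len = strlen(buf);

	/* Start at 1 and stop before the last byte: both neighbours exist. */
	for (i = 1; i + 1 < len; i++) {
		if ('-' != buf[i])
			continue;
		if (isalpha((unsigned char)buf[i - 1]) &&
		    isalpha((unsigned char)buf[i + 1]))
			buf[i] = DEMO_HYPH;
	}
}
```

With this sketch, the hyphen in "simple-minded" gets marked, while
the ones in "entities--like", in the quoted "-" and in "-center-"
stay plain hyphens, which matches the two cases from the diffs above.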