From: Ingo Schwarze <schwarze@usta.de>
To: tech@mdocml.bsd.lv
Subject: Re: 1.11.7 minor issues
Date: Mon, 19 Sep 2011 09:58:16 +0200
Message-ID: <20110919075816.GA1243@iris.usta.de>
In-Reply-To: <4E7684BF.7030702@bsd.lv>

Hi Kristaps,

Kristaps Dzonsons wrote on Mon, Sep 19, 2011 at 01:54:39AM +0200:
> On 19/09/2011 01:39, Ingo Schwarze wrote:

>> after fixing the two larger problems, systematic comparisons
>> revealed two smaller issues that are new in 1.11.7:

>>>@@ -52,8 +52,8 @@
>>>             -center:off
>>>             -center=off
>>>             -center-
>>>-    Lynx recognizes "1", "+", "on" and "true" for true values, and "0",
>>>-    "-", "off" and "false" for false values.  Other option-values are
>>>+    Lynx recognizes "1", "+", "on" and "true" for true values, and "0", "-
>>>+    ", "off" and "false" for false values.  Other option-values are
>>>      ignored.

>>>@@ -109,8 +109,8 @@
>>>     Many folks attempt a simple-minded regular expression approach, like
>>>     "s/<.*?>//g", but that fails in many cases because the tags may
>>>     continue over line breaks, they may contain quoted angle-brackets, or
>>>-    HTML comment may be present.  Plus, folks forget to convert
>>>-    entities--like "&lt;" for example.
>>>+    HTML comment may be present.  Plus, folks forget to convert entities--
>>>+    like "&lt;" for example.
>>>
>>>     Here's one "simple-minded" approach, that works for most files:

>> I will think about those two tomorrow.

> Around line 577 in roff.c is where mandoc_hyph ended up: the quotes
> need to be added.
> 
> As for the second one, we should bring jmc@ in, no?  I'd think that
> double or triple-dashes would be broken.  Unicode, for one,
> 
> http://www.cs.tut.fi/~jkorpela/dashes.html#linebreaks
> 
> stipulates that en and em dashes break the line.
> 
> Thoughts?

I think you are right that breaking at double dashes ought to be ok.
However, groff doesn't break there, I don't consider the point
important enough to deviate from groff, and not breaking at
double hyphens keeps the code simpler.  I have checked for all
non-alpha ASCII characters that groff indeed doesn't break the line
if they precede or follow a dash.

So, here is what I have done for now - OK?


CVSROOT:	/cvs
Module name:	src
Changes by:	schwarze@cvs.openbsd.org	2011/09/19 01:53:54

Modified files:
	usr.bin/mandoc : roff.c 

Log message:
Breaking the line at a hyphen is only allowed if the hyphen
is both preceded and followed by an alphabetic character.
This fixes about a dozen places in base.
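
For illustration only, the rule from the log message can be sketched
as a standalone function outside mandoc.  The name mark_hyphens and
the marker value SKETCH_HYPH are made up for this sketch; they are
not mandoc's real roff_parsetext() or ASCII_HYPH definition, and
escape sequence handling is left out entirely.

```c
#include <ctype.h>
#include <stddef.h>

/* Placeholder marker for a breakable hyphen (assumption, not mandoc's value). */
#define SKETCH_HYPH 30

/*
 * Replace each '-' that has a letter on both sides with the marker.
 * Returns the number of hyphens that were marked as breakable.
 */
static size_t
mark_hyphens(char *buf)
{
	size_t	 n = 0;
	char	*p;

	if ('\0' == buf[0])
		return 0;

	/* Start at the second byte so that p[-1] is always valid. */
	for (p = buf + 1; '\0' != *p; p++) {
		if ('-' == *p &&
		    isalpha((unsigned char)p[-1]) &&
		    isalpha((unsigned char)p[1])) {
			*p = SKETCH_HYPH;
			n++;
		}
	}
	return n;
}
```

With this sketch, "simple-minded" gets its hyphen marked, while the
hyphens in "entities--like" and "-center-" are left alone.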


Index: roff.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/roff.c,v
retrieving revision 1.43
diff -u -p -r1.43 roff.c
--- roff.c	18 Sep 2011 23:26:18 -0000	1.43
+++ roff.c	19 Sep 2011 07:49:59 -0000
@@ -552,7 +552,6 @@ again:
 static enum rofferr
 roff_parsetext(char *p)
 {
-	char		 l, r;
 	size_t		 sz;
 	const char	*start;
 	enum mandoc_esc	 esc;
@@ -579,14 +578,8 @@ roff_parsetext(char *p)
 			continue;
 		}
 
-		l = *(p - 1);
-		r = *(p + 1);
-		if ('\\' != l &&
-				'\t' != r && '\t' != l &&
-				' ' != r && ' ' != l &&
-				'-' != r && '-' != l &&
-				! isdigit((unsigned char)l) &&
-				! isdigit((unsigned char)r))
+		if (isalpha((unsigned char)p[-1]) &&
+		    isalpha((unsigned char)p[1]))
 			*p = ASCII_HYPH;
 		p++;
 	}
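
To compare the two conditions side by side, here is a rough
standalone sketch.  The function names old_rule and new_rule are
made up for this mail; the bodies are copied from the removed and
the added lines of the diff above, taking the left and the right
neighbour of the hyphen as arguments.

```c
#include <ctype.h>

/* Condition removed by the patch: a list of forbidden neighbours. */
static int
old_rule(char l, char r)
{
	return '\\' != l &&
	    '\t' != r && '\t' != l &&
	    ' ' != r && ' ' != l &&
	    '-' != r && '-' != l &&
	    ! isdigit((unsigned char)l) &&
	    ! isdigit((unsigned char)r);
}

/* Condition added by the patch: both neighbours must be letters. */
static int
new_rule(char l, char r)
{
	return isalpha((unsigned char)l) &&
	    isalpha((unsigned char)r);
}
```

Assuming the source has the hyphen directly between two double
quote characters, old_rule('"', '"') is true, so the line could be
broken inside the quotes, while new_rule('"', '"') is false.
Both agree that the hyphen in "simple-minded" is breakable.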

--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv