From: Katsumi Yamaoka
Newsgroups: gmane.emacs.gnus.general
Subject: Re: shr line breaking
Date: Tue, 07 Dec 2010 10:18:49 +0900
Organization: Emacsen advocacy group
To: ding@gnus.org
User-Agent: Gnus/5.110011 (No Gnus v0.11) Emacs/24.0.50 (gnu/linux)

Lars Magne Ingebrigtsen wrote:
> Like the following line:

names like www.example.com into the numeric IP addresses like 192.0.2.1

> (shr-find-fill-point) will put point before the "1", which is wrong in
> this instance.

It happens with Japanese text, too. ;-)

www.example.com のような名前は次のような数字の IP アドレスに＞192.0.2.1

I've fixed it so that it may not break a line after a kinsoku-bol
character (i.e., "." etc.) if a non-breakable character follows.

> Non-CJVK texts can only be broken where there's a space
> character, so perhaps we need additional logic to find out whether a
> (part of a) line is CJVK or not before trying to find the fill point?
> This may be difficult on mixed texts, perhaps...

Yes, I also think it's difficult to distinguish CJVK text and others,
especially in unicode Emacsen.  For instance, even a latin-1 character
is regarded as Japanese:

(string-match "\\cj" "Ø")
 -> 0

In Emacs 22.3 and earlier, it was nil.
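
For illustration only, the rule described above (no break after a
kinsoku-bol character when a non-breakable character follows) could be
sketched roughly as below.  This is not the code in shr.el.  The names
my-char-has-category-p and my-break-allowed-p are hypothetical, and the
sketch assumes that `>' is the kinsoku-bol category, that `|' is the
line-breakable category, and that loading kinsoku assigns `>' to ".".

```elisp
;; A minimal sketch, not the actual shr.el fix.  The my-* names are
;; made up for this example.
(require 'kinsoku)  ; assumed to assign the `>' (kinsoku-bol) category

(defun my-char-has-category-p (char category)
  "Return non-nil if CHAR belongs to CATEGORY.
CATEGORY is a category mnemonic character such as ?> or ?|."
  (aref (char-category-set char) category))

(defun my-break-allowed-p (prev next)
  "Return non-nil if a line may be broken between PREV and NEXT.
A break is refused only when PREV is a kinsoku-bol character and
NEXT is not a line-breakable character."
  (not (and (my-char-has-category-p prev ?>)
            (not (my-char-has-category-p next ?|)))))

;; Example calls: "." followed by a digit, and "." followed by kana.
(my-break-allowed-p ?. ?1)
(my-break-allowed-p ?. ?に)
```

Under those assumptions the first call should refuse the break (the
"192.0.2." / "1" case) while the second should permit it, since kana
are breakable.  The check looks only at the two characters around the
candidate position, so it does not need to decide whether the whole
line is CJVK or not.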