From: Lars Magne Ingebrigtsen
Newsgroups: gmane.emacs.gnus.general
Subject: shr line breaking (was: [gnus git] branch master updated: =1= shr.el (shr-find-fill-point): Work better for kinsoku chars and apostrophes.)
Date: Mon, 06 Dec 2010 11:47:58 +0100
To: ding@gnus.org

Katsumi Yamaoka writes:

> Because "'" is categorized as kinsoku-bol, that should not appear
> in the beginning of a line.  But I've modified the code so as to
> give it special treatment.  Thanks.

Thanks.

But I'm starting to wonder whether the line breaking algo should be
broken up into two bits -- one for Japanese (etc.) text and one for the
rest.  Like the following line:

  names like www.example.com into the numeric IP addresses like 192.0.2.1

(shr-find-fill-point) will put point before the "1", which is wrong in
this instance.  Non-CJKV texts can only be broken where there's a space
character, so perhaps we need additional logic to find out whether a
(part of a) line is CJKV or not before trying to find the fill point?
This may be difficult on mixed texts, perhaps...

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen
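
To make the two-way split suggested above a bit more concrete, here is a
rough sketch.  It is Python rather than Emacs Lisp and is not the shr.el
code: `is_cjk` and `find_fill_point` are hypothetical names, the East
Asian width test is only an assumed stand-in for real CJKV detection,
kinsoku rules are left out entirely, and positions are character indexes
rather than display columns.

```python
import unicodedata


def is_cjk(ch: str) -> bool:
    """Assumption: treat East Asian wide/fullwidth characters as CJKV."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def find_fill_point(line: str, width: int) -> int:
    """Return the index at which to break `line`, or -1 if there is none.

    Scans backward from `width`.  A break is allowed at a space, or
    between two characters when at least one of them is CJKV.  Runs of
    non-CJKV text are therefore only ever broken at spaces.
    """
    for i in range(min(width, len(line) - 1), 0, -1):
        if line[i] == " ":
            return i
        if is_cjk(line[i - 1]) or is_cjk(line[i]):
            return i
    return -1


line = "names like www.example.com into the numeric IP addresses like 192.0.2.1"
point = find_fill_point(line, 70)
print(line[:point])       # text that stays on the first line
print(line[point + 1:])   # text that moves to the next line
```

With a width of 70 the example line is broken at the space before
"192.0.2.1" instead of inside the address.  Because the decision is made
per character pair rather than per line, mixed texts are handled locally,
though whether that is good enough for real mixed CJKV/Latin lines is
exactly the open question.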