From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lars Magne Ingebrigtsen
Newsgroups: gmane.emacs.gnus.general
Subject: Re: shr.el: folding Japanese text
Date: Fri, 08 Oct 2010 19:05:54 +0200
Organization: Programmerer Ingebrigtsen
To: ding@gnus.org

Katsumi Yamaoka writes:

> Could you see articles in the gwene.jp.itmedia.news.bursts
> newsgroup?  You may see some aren't folded and some are folded
> uglily.  There's no concept of the word wrapping in Japanese.
> Normally there's no space between words.  A word may be folded
> in the middle of it.  Is it funny?  But it's our custom. ;-)
> I guess that Korean text and Chinese text are similar.
>
> The following is a quick hack that satisfies me so-so.  For the
> moment I don't know how we can switch it for Latin text and
> others, though.

Yeah, this should be fixed to work on all languages.  But I'm not sure
what would be the right approach here.  Should we add language-guessing
stuff to switch the folding routine?  Or are there other (simpler) ways
to get satisfactory results in this area?

Right now I'm not seeing any obvious easy solutions here, but surely
somebody has dealt with problems like this before.  :-)  Any ideas?

Just when typing this, it occurs to me that there may be help from the
Unicode standard, perhaps?  I mean, if shr-insert inserts a text with
characters from the Japanese, Chinese, Korean and *mumble* planes, then
we can break it on "character" boundaries, instead of where spaces
appear.

So if Emacs has a way to say (character-from-non-space-language-p char),
then our problems would be solved.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen
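
A minimal sketch of what such a predicate might look like, assuming the script symbols in `char-script-table` are a good enough signal. The names my-char-from-non-space-language-p and my-find-break-point are hypothetical, and the list of scripts is an illustrative guess, not something taken from shr.el:

```elisp
;; Hypothetical helper: the name and the script list are assumptions.
(defun my-char-from-non-space-language-p (char)
  "Return non-nil if CHAR is from a script usually written without spaces.
The scripts listed here are an illustrative guess, not an exhaustive set."
  (memq (aref char-script-table char)
        '(han kana hangul cjk-misc)))

;; Hypothetical helper: scans backwards from LIMIT for a place to fold.
(defun my-find-break-point (string limit)
  "Return the largest index <= LIMIT where STRING may be broken, or nil.
A break is allowed after a space, or next to a character for which
`my-char-from-non-space-language-p' returns non-nil."
  (let ((i (min limit (length string)))
        (found nil))
    (while (and (> i 0) (not found))
      (let ((before (aref string (1- i)))
            (after (and (< i (length string)) (aref string i))))
        (if (or (eq before ?\s)
                (my-char-from-non-space-language-p before)
                (and after (my-char-from-non-space-language-p after)))
            (setq found i)
          (setq i (1- i)))))
    found))
```

With something like this, Latin text would still only fold after spaces, while a run of Japanese characters could be folded between any two of them. The sketch ignores kinsoku-style rules about characters that must not start or end a line, so it is probably too permissive as it stands.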