From: Katsumi Yamaoka
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Double encoding in ~/Mail/active
Date: Wed, 04 Aug 2010 09:38:02 +0900
Organization: Emacsen advocacy group
To: ding@gnus.org
User-Agent: Gnus/5.110011 (No Gnus v0.11) Emacs/24.0.50 (gnu/linux)

Adam Sjøgren wrote:
> On Tue, 03 Aug 2010 08:21:17 +0900, Katsumi wrote:
[...]
>>> Is this a GNU Emacs or an XEmacs bug? I guess it should be reported?

>> No, it's my fault. ;-p

> But isn't it inconsistent "in a bad way" that GNU Emacs won't double
> encode a string and XEmacs will? Wouldn't it be better if both emacsen
> behaved the same?

> I.e. your example:

>   (decode-coding-string
>    (encode-coding-string
>     (encode-coding-string "hyskenstræde" 'utf-8)
>     'utf-8)
>    'utf-8)

> ought to give either "hyskenstræde" or "hyskenstrÃ¦de" on both emacsen?

First of all, I believe the real problem is source code that does
the double encoding (which I wrote ;-).  I don't know exactly how
Emacs decides whether to encode an already-encoded string, but I
guess the point is multibyteness or something similar.  In Emacs,
an encoded string is always a unibyte string, which is not worth
encoding again.
That said, double encoding can still happen in Emacs 21 and 22: the
following form returns the same result as in XEmacs:

(decode-coding-string
 (encode-coding-string
  (string-make-multibyte
   (encode-coding-string "hyskenstræde" 'utf-8))
  'utf-8)
 'utf-8)

But in Emacs 23 and later (i.e. the Unicode Emacsen), double encoding
doesn't happen even with that form.  Those Emacsen may examine what
the data actually are.

OTOH, XEmacs has no concept of multibyteness.  Making XEmacs behave
like recent Emacsen would probably mean making it follow the long
journey that Emacs trudged along, and I think that would be for
nothing.  Again, the fault was mine. ;-)
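
To check the unibyte/multibyte point above on a given Emacs, something like the following could be evaluated.  It is only a sketch, not the code under discussion: the function name `my-double-encode-report' and the plist keys are made up for illustration, and it assumes GNU Emacs 23 or later.  It reports whether the text and its encoded form are multibyte, and whether a second `encode-coding-string' changes the bytes.

```elisp
;; Hypothetical helper for illustration; assumes GNU Emacs 23 or later.
(defun my-double-encode-report (text)
  "Encode TEXT as UTF-8 twice and report what happened as a plist."
  (let* ((once  (encode-coding-string text 'utf-8))   ; bytes of TEXT
         (twice (encode-coding-string once 'utf-8)))  ; encode those bytes again
    (list :text-multibyte (multibyte-string-p text)
          :once-multibyte (multibyte-string-p once)
          ;; Non-nil when the second encoding left the bytes unchanged,
          ;; i.e. no double encoding took place.
          :same-bytes     (string= once twice)
          ;; Non-nil when decoding once gives back the original text.
          :round-trip     (string= text (decode-coding-string twice 'utf-8)))))

;; "\u00e6" is the letter in "hyskenstræde", written as an escape so the
;; result does not depend on how this file itself is encoded.
(my-double-encode-report "hyskenstr\u00e6de")
```

If `:once-multibyte' is nil and `:same-bytes' is non-nil, the encoded string is unibyte and the second encoding was a no-op, which is the behaviour described above for the Unicode Emacsen.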