From: Stefan Monnier
Newsgroups: gmane.emacs.gnus.general
Subject: Re: bug#1174: 23.0.60; Some UTF-8 mails displaying wrongly in Emacs 23
Date: Tue, 02 Dec 2008 02:36:31 -0500
To: Simon Josefsson
Cc: Frank Schmitt, ding@gnus.org, 1174@emacsbugs.donarmstrong.com

> In Emacs 21 (which Gnus still aim to be compatible with), we have
> string-as-multibyte, but not string-to-multibyte.  So your proposed
> code (i.e. mm-string-to-multibyte) runs
>   (string-as-multibyte (char-to-string string))
> whereas we used to run
>   (string-as-multibyte string)
> Does char-to-string matter here?

> (defalias 'mm-string-to-multibyte
>   (cond
>    ((featurep 'xemacs)
>     'identity)
>    ((fboundp 'string-to-multibyte)
>     'string-to-multibyte)
>    (t
>     (lambda (string)
>       "Return a multibyte string with the same individual chars as string."
>       (mapconcat
>        (lambda (ch) (mm-string-as-multibyte (char-to-string ch)))
>        string "")))))

Oh, that's clever: yes, the mapconcat/char-to-string dance does make it
implement the string-to-multibyte behavior because doing the
string-as-multibyte conversion one byte at a time avoids the problematic
case.  To quote myself from mm-util.el:

;; string-as-multibyte often doesn't really do what you think it does.
;; Example:
;;    (aref (string-as-multibyte "\201") 0) -> 129 (aka ?\201)
;;    (aref (string-as-multibyte "\300") 0) -> 192 (aka ?\300)
;;    (aref (string-as-multibyte "\300\201") 0) -> 192 (aka ?\300)
;;    (aref (string-as-multibyte "\300\201") 1) -> 129 (aka ?\201)
;; but
;;    (aref (string-as-multibyte "\201\300") 0) -> 2240
;;    (aref (string-as-multibyte "\201\300") 1) ->

Basically, when the string passed is made of a single byte,
string-as-multibyte is equivalent to string-to-multibyte, which is the
property used by the code you quoted above to build a poor man's
string-to-multibyte.


        Stefan
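
A minimal sketch of the byte-at-a-time trick discussed above, for anyone who wants to try it in isolation. The helper name my-bytewise-to-multibyte is hypothetical (it is not in mm-util.el), and it calls string-as-multibyte directly rather than through the mm- wrapper. What the comparison returns depends on the Emacs version, so no expected values are given.

```elisp
;; Hypothetical helper (the name is an assumption, not from mm-util.el):
;; convert STRING one character at a time, so that string-as-multibyte
;; never sees two adjacent bytes it could merge into one character.
(defun my-bytewise-to-multibyte (string)
  "Return a multibyte string with the same individual chars as STRING."
  (mapconcat
   (lambda (ch) (string-as-multibyte (char-to-string ch)))
   string ""))

;; Compare lengths for the problematic input from the mm-util.el comment.
;; Results depend on the Emacs version, so none are asserted here.
(let ((s "\201\300"))
  (list (length (string-as-multibyte s))
        (length (my-bytewise-to-multibyte s))))
```

If the two lengths differ, the whole-string conversion has merged the two bytes into one character, which is the case the mapconcat version is meant to avoid.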