To: info-gnus-english@gnu.org
From: Garjola Dindi
Subject: Incorrect rendering of accented characters in HTML e-mail
Date: Sun, 27 Sep 2020 18:43:52 +0200
Message-ID: <87k0wf8913.fsf@pc-117-162.ovh.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Hi,

I am having a problem with reading html e-mail in Gnus: accented
characters appear with an incorrect encoding. For instance, "é" (e with
acute accent) will appear as "i".

The funny part comes now. If I edit the article with
gnus-summary-edit-article and just press C-c C-c (that is, I don't do
any edits) the characters are displayed correctly.

So right now, I use this

,----[ emacs-lisp ]
| (defun my/correct-message-encoding-by-dummy-edit ()
|   (interactive)
|   (progn
|     (gnus-summary-select-article-buffer)
|     (gnus-summary-edit-article)
|     (gnus-article-edit-done)))
|
| (define-key gnus-summary-mode-map (kbd " ") 'my/correct-message-encoding-by-dummy-edit)
`----

to quickly «wash» the articles.

If I use describe-char to inspect the characters, I get this before
«washing»:

,----
| position: 470 of 867 (54%), column: 30
| character: i (displayed as i) (codepoint 105, #o151, #x69)
| charset: ascii (ASCII (ISO646 IRV))
| code point in charset: 0x69
| script: latin
| syntax: w which means: word
| category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
| to input: type "C-x 8 RET 69" or "C-x 8 RET LATIN SMALL LETTER I"
| buffer code: #x69
| file code: #x69 (encoded by coding system utf-8-unix)
| display: by this font (glyph code)
| ftcrhb:-GOOG-Noto Sans-normal-normal-normal-*-19-*-*-*-*-0-iso10646-1 (#x4C)
|
| Character code properties: customize what to show
| name: LATIN SMALL LETTER I
| general-category: Ll (Letter, Lowercase)
| decomposition: (105) ('i')
|
| There is an overlay here:
| From 440 to 520
| face hl-line
| priority -50
| window #
|
|
| There are text properties here:
| face variable-pitch
`----

And this after «washing»:

,----
| position: 472 of 871 (54%), column: 30
| character: é (displayed as é) (codepoint 233, #o351, #xe9)
| charset: unicode (Unicode (ISO10646))
| code point in charset: 0xE9
| script: latin
| syntax: w which means: word
| category: .:Base, L:Left-to-right (strong), c:Chinese, j:Japanese, l:Latin, v:Viet
| to input: type "C-x 8 RET e9" or "C-x 8 RET LATIN SMALL LETTER E WITH ACUTE"
| buffer code: #xC3 #xA9
| file code: #xC3 #xA9 (encoded by coding system utf-8-unix)
| display: by this font (glyph code)
| ftcrhb:-GOOG-Noto Sans-normal-normal-normal-*-19-*-*-*-*-0-iso10646-1 (#xAB)
|
| Character code properties: customize what to show
| name: LATIN SMALL LETTER E WITH ACUTE
| old-name: LATIN SMALL LETTER E ACUTE
| general-category: Ll (Letter, Lowercase)
| decomposition: (101 769) ('e' '́')
|
| There is an overlay here:
| From 442 to 523
| face hl-line
| priority -50
| window #
|
|
| There are text properties here:
| face variable-pitch
`----

The html part of the e-mails contains

,----
| --=-=-= Content-Type: multipart/alternative; boundary="==-=-=" --==-=-= Content-Type: text/plain
| --==-=-= Content-Type: text/plain; charset=utf-8; format=flowed Content-Disposition: inline
`----

so I guess that the html renderer should pick it up correctly.

I have tried different values for mm-text-html-renderer (shr, w3m,
gnus-w3m) and the result is always the same.

I would like to understand what is happening and correct the problem,
but I have no idea how to proceed.

Any help would be much appreciated.

Thanks.

Garjola
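
One arithmetic observation on the two describe-char reports, offered as
a hint rather than a diagnosis: the code point seen before washing
(#x69, "i") is exactly the code point seen after washing (#xE9, "é")
with its eighth bit cleared. That would be consistent with a Latin-1
byte passing through some 7-bit-only step, although nothing in this
message shows where such a step might be. A minimal sketch of the check
follows; my/strip-high-bit is a hypothetical helper name and not part
of Gnus.

,----[ emacs-lisp ]
| ;; Hypothetical helper (not a Gnus function): clear bit 7 of a
| ;; character code, as a channel that keeps only 7 bits would.
| (defun my/strip-high-bit (char)
|   "Return CHAR with its eighth bit cleared."
|   (logand char #x7f))
|
| ;; #xE9 is LATIN SMALL LETTER E WITH ACUTE,
| ;; #x69 is LATIN SMALL LETTER I.
| (my/strip-high-bit #xe9)                 ; => 105, i.e. #x69
| (char-to-string (my/strip-high-bit ?é))  ; => "i"
`----

If other accented letters follow the same pattern in the unwashed
article (for example "è", #xE8, showing up as "h", #x68), that would
strengthen the hint; if they do not, it is probably a coincidence.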