Announcements and discussions for Gnus, the GNU Emacs Usenet newsreader
* Incorrect rendering of accented characters in HTML e-mail
From: Garjola Dindi @ 2020-09-27 16:43 UTC
  To: info-gnus-english

Hi,

I am having a problem reading HTML e-mail in Gnus: accented
characters are rendered incorrectly. For instance, "é" (e with
acute accent) appears as "i".

Here is the odd part. If I open the article with
gnus-summary-edit-article and immediately press C-c C-c (that is,
without making any edits), the characters are displayed correctly.

So right now, I use this

,----[ emacs-lisp ]
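| (defun my/correct-message-encoding-by-dummy-edit ()
|   "Work around mis-decoded characters by doing an empty edit of the article."
|   (interactive)
|   ;; Switch to the article buffer, enter edit mode, and finish right
|   ;; away without changing anything (same as pressing C-c C-c).
|   (gnus-summary-select-article-buffer)
|   (gnus-summary-edit-article)
|   (gnus-article-edit-done))
| 
| (define-key gnus-summary-mode-map (kbd "<f8> <f8>")
|   #'my/correct-message-encoding-by-dummy-edit)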
`----

to quickly «wash» the articles. 

If I use describe-char to inspect the characters, I get this before «washing»:

,----
|              position: 470 of 867 (54%), column: 30
|             character: i (displayed as i) (codepoint 105, #o151, #x69)
|               charset: ascii (ASCII (ISO646 IRV))
| code point in charset: 0x69
|                script: latin
|                syntax: w 	which means: word
|              category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
|              to input: type "C-x 8 RET 69" or "C-x 8 RET LATIN SMALL LETTER I"
|           buffer code: #x69
|             file code: #x69 (encoded by coding system utf-8-unix)
|               display: by this font (glyph code)
|     ftcrhb:-GOOG-Noto Sans-normal-normal-normal-*-19-*-*-*-*-0-iso10646-1 (#x4C)
| 
| Character code properties: customize what to show
|   name: LATIN SMALL LETTER I
|   general-category: Ll (Letter, Lowercase)
|   decomposition: (105) ('i')
| 
| There is an overlay here:
|  From 440 to 520
|   face                 hl-line
|   priority             -50
|   window               #<window 141 on *Article nnmaildir+RSSFeeds:ABlog*>
| 
| 
| There are text properties here:
|   face                 variable-pitch
`----

And this after «washing»

,----
|              position: 472 of 871 (54%), column: 30
|             character: é (displayed as é) (codepoint 233, #o351, #xe9)
|               charset: unicode (Unicode (ISO10646))
| code point in charset: 0xE9
|                script: latin
|                syntax: w 	which means: word
|              category: .:Base, L:Left-to-right (strong), c:Chinese, j:Japanese, l:Latin, v:Viet
|              to input: type "C-x 8 RET e9" or "C-x 8 RET LATIN SMALL LETTER E WITH ACUTE"
|           buffer code: #xC3 #xA9
|             file code: #xC3 #xA9 (encoded by coding system utf-8-unix)
|               display: by this font (glyph code)
|     ftcrhb:-GOOG-Noto Sans-normal-normal-normal-*-19-*-*-*-*-0-iso10646-1 (#xAB)
| 
| Character code properties: customize what to show
|   name: LATIN SMALL LETTER E WITH ACUTE
|   old-name: LATIN SMALL LETTER E ACUTE
|   general-category: Ll (Letter, Lowercase)
|   decomposition: (101 769) ('e' '́')
| 
| There is an overlay here:
|  From 442 to 523
|   face                 hl-line
|   priority             -50
|   window               #<window 155 on *Article nnmaildir+RSSFeeds:ABlog*>
| 
| 
| There are text properties here:
|   face                 variable-pitch
`----
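
One thing I notice when comparing the two outputs: the code points 233
(#xE9) and 105 (#x69) differ only in the high bit, so the wrong
character is what "é" becomes if its 8th bit is cleared. This is only
an observation from the numbers above, not a diagnosis of where (or
whether) Gnus does this. The helper name below is hypothetical and
exists only to show the arithmetic:

```emacs-lisp
;; Hypothetical helper, not part of Gnus: clear the 8th bit of CHAR.
(defun my/strip-high-bit (char)
  "Return CHAR with its 8th bit cleared, as a 7-bit channel would do."
  (logand char #x7F))

(my/strip-high-bit #xE9)            ; => 105, i.e. #x69
(string (my/strip-high-bit #xE9))   ; => "i"
```

If this is what happens, the HTML part would be getting treated as
7-bit text somewhere before it is rendered.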

The HTML part of the e-mails contains

,----
| 
`----

so I guess that the HTML renderer should pick it up correctly. I have
tried different values for mm-text-html-renderer (shr, w3m, gnus-w3m)
and the result is always the same.
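
For reference, switching renderers amounts to something like the
following. This is a sketch: evaluating the form by hand and then
re-selecting the article is an assumed procedure, and the symbols are
the three values listed above.

```emacs-lisp
;; Assumed procedure: evaluate one of these, then re-select the article.
(setq mm-text-html-renderer 'shr)
;; (setq mm-text-html-renderer 'w3m)
;; (setq mm-text-html-renderer 'gnus-w3m)
```

Since all three give the same result, the problem presumably sits
before the renderer is called, though I cannot confirm that.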


I would like to understand what is happening and correct the problem,
but I have no idea how to proceed.

Any help would be much appreciated.

Thanks.

Garjola

