* Incorrect character encoding in received messages
@ 2019-12-25 18:13 Garjola Dindi
From: Garjola Dindi @ 2019-12-25 18:13 UTC
  To: info-gnus-english

Hi all,

I have recently been having trouble with Gnus decoding some e-mails as
ASCII when actually they should be decoded as unicode.

For instance, in French, the “à” char gets displayed as “\340”.

If I go to «edit mode» with 'gnus-summary-edit-article' and just do C-c
C-c (with no real edit), the message gets displayed correctly.

Another example with the "é" char which appears as 'i' in an HTML
message. Describe char gives me this:
               character: i (displayed as i) (codepoint 105, #o151, #x69)
                 charset: ascii (ASCII (ISO646 IRV))
   code point in charset: 0x69
                  script: latin
                  syntax: w 	which means: word
                category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
                to input: type "C-x 8 RET 69" or "C-x 8 RET LATIN SMALL LETTER I"
             buffer code: #x69
               file code: #x69 (encoded by coding system utf-8-unix)
                 display: by this font (glyph code)
       xfthb:-PfEd-DejaVu Sans-normal-normal-normal-*-16-*-*-*-*-0-iso10646-1 (#x4C)

   Character code properties: customize what to show
     name: LATIN SMALL LETTER I
     general-category: Ll (Letter, Lowercase)
     decomposition: (105) ('i')

And after 'gnus-summary-edit-article' followed by C-c C-c:

               character: é (displayed as é) (codepoint 233, #o351, #xe9)
                 charset: unicode (Unicode (ISO10646))
   code point in charset: 0xE9
                  script: latin
                  syntax: w 	which means: word
                category: .:Base, L:Left-to-right (strong), c:Chinese, j:Japanese, l:Latin, v:Viet
                to input: type "C-x 8 RET e9" or "C-x 8 RET LATIN SMALL LETTER E WITH ACUTE"
             buffer code: #xC3 #xA9
               file code: #xC3 #xA9 (encoded by coding system utf-8-unix)
                 display: by this font (glyph code)
       xfthb:-PfEd-DejaVu Sans-normal-normal-normal-*-16-*-*-*-*-0-iso10646-1 (#xAB)

   Character code properties: customize what to show
     name: LATIN SMALL LETTER E WITH ACUTE
     old-name: LATIN SMALL LETTER E ACUTE
     general-category: Ll (Letter, Lowercase)
     decomposition: (101 769) ('e' '́')

Any idea of what may be happening here?

I am on emacs master, but there is no difference with 26.3.

Thanks!

G.

-- 


_______________________________________________
info-gnus-english mailing list
info-gnus-english@gnu.org
https://lists.gnu.org/mailman/listinfo/info-gnus-english

* Re: Incorrect character encoding in received messages
  2019-12-25 18:13 Incorrect character encoding in received messages Garjola Dindi
@ 2019-12-27 18:07 ` Lars Ingebrigtsen
From: Lars Ingebrigtsen @ 2019-12-27 18:07 UTC
  To: Garjola Dindi; +Cc: info-gnus-english

Garjola Dindi <garjola@garjola.net> writes:

> I have recently been having trouble with Gnus decoding some e-mails as
> ASCII when actually they should be decoded as unicode.

Sounds like there's no Content-Type header in the message that says what
the charset is.
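
To make the point about the missing header concrete, here is a minimal
sketch using Python's standard `email` package rather than Gnus itself.
The helper name `declared_charset` and the two sample messages are
hypothetical; the sketch only shows what "no charset in Content-Type"
looks like to a mail parser, not how Gnus handles it internally.

```python
from email import message_from_string


def declared_charset(raw_message):
    """Return the charset named in Content-Type, or None if none is given."""
    msg = message_from_string(raw_message)
    # get_content_charset() returns None when the header or its
    # charset parameter is missing.
    return msg.get_content_charset()


# Hypothetical sample messages, for illustration only.
with_charset = "Content-Type: text/plain; charset=iso-8859-1\n\nbody\n"
without_charset = "Subject: test\n\nbody\n"

print(declared_charset(with_charset))
print(declared_charset(without_charset))
```

When the parser gets nothing back, a mail reader has to guess the
charset or fall back to a default, which is where wrong decoding can
creep in.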

> For instance, in French, the “à” char gets displayed as “\340”.

Then the message isn't encoded as utf-8, but is probably latin-1.
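
The reasoning can be checked with a short sketch, here in Python and
assuming the sender used ISO-8859-1 (latin-1): "\340" is an octal escape
for the single byte 0xE0, which is "à" in latin-1 but is not a complete
UTF-8 sequence on its own. The names `raw`, `as_latin1`, `utf8_bytes`
and `is_valid_utf8` are illustrative only.

```python
# "\340" is an octal escape: 0o340 == 0xE0, one raw byte.
raw = bytes([0o340])

# Assumption: the message body is ISO-8859-1 (latin-1).
as_latin1 = raw.decode("latin-1")

# In UTF-8 the same character (U+00E0) takes two bytes.
utf8_bytes = "\u00e0".encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(as_latin1, utf8_bytes.hex(), is_valid_utf8(raw))
```

The two-byte UTF-8 form matches the pattern in the describe-char output
earlier in the thread, where "é" has buffer code #xC3 #xA9.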

`C-u W M c' in the summary buffer should allow you to decode the message
in the proper charset.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

