Gnus development mailing list
* Resending email in Gnus, figuring out charset
@ 2018-10-29 19:19 Adam Sjøgren
  2018-10-29 21:15 ` Andreas Schwab
  0 siblings, 1 reply; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-29 19:19 UTC (permalink / raw)
  To: ding; +Cc: emacs-devel

  Hi,

When I resend (S D r) an email in Gnus with headers like this:

  Content-Type: text/plain; charset=utf-8
  Content-Transfer-Encoding: 8bit

and utf-8 chars in the content (such as →, æ, ø and å), Gnus says it
doesn't know what charset to use, and asks if I want to send it as
ASCII anyway.

I've edebugged my way into the function mm-find-mime-charset-region in
mm-util.el, which tries to figure out what charset the region is in.

Unfortunately (raw-text no-conversion) is not something the function
knows what to do with, and thus I get the prompt.

I guess the utf-8-ification has outrun this function?
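
A minimal sketch of the lookup involved, assuming a stock Emacs. The
helper name my/mime-charset-of is made up for illustration; it only
mirrors the two property names visible in the diff context further down,
and is not how Gnus itself is structured:

```elisp
(defun my/mime-charset-of (coding-system)
  "Return the MIME charset registered for CODING-SYSTEM, or nil.
Hypothetical helper: tries the keyword property first, then the
old-style symbol, like the code in the diff context."
  (or (coding-system-get coding-system :mime-charset)
      (coding-system-get coding-system 'mime-charset)))

;; Look up the coding systems mentioned in this message.
(mapcar #'my/mime-charset-of '(utf-8 raw-text no-conversion))
```

A coding system without a MIME charset yields nil here, which would
explain why the function has nothing to offer for raw-text.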

I've applied the following hack locally, but I guess it isn't the
correct solution?

diff --git a/lisp/gnus/mm-util.el b/lisp/gnus/mm-util.el
index 25b156803a..67682f8b7d 100644
--- a/lisp/gnus/mm-util.el
+++ b/lisp/gnus/mm-util.el
@@ -572,7 +572,8 @@ mm-find-mime-charset-region
 		 (while systems
 		   (let* ((head (pop systems))
 			  (cs (or (coding-system-get head :mime-charset)
-				  (coding-system-get head 'mime-charset))))
+				  (coding-system-get head 'mime-charset)
+                                  (if (eq head 'raw-text) 'utf-8 nil))))
 		     ;; The mime-charset (`x-ctext') of
 		     ;; `compound-text' is not in the IANA list.  We
 		     ;; shouldn't normally use anything here with a


  Best regards,

    Adam

-- 
 "Your editor, having been blissfully unaware of the          Adam Sjøgren
  scourge of unnecessary calculators just waiting for    asjo@koldfront.dk
  their opportunity to overwhelm his desktop, has not
  yet come to love the new way of doing things."




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Resending email in Gnus, figuring out charset
  2018-10-29 19:19 Resending email in Gnus, figuring out charset Adam Sjøgren
@ 2018-10-29 21:15 ` Andreas Schwab
  2018-10-29 21:21   ` Adam Sjøgren
  2018-10-29 21:32   ` Resending email in Gnus, figuring out charset Adam Sjøgren
  0 siblings, 2 replies; 14+ messages in thread
From: Andreas Schwab @ 2018-10-29 21:15 UTC (permalink / raw)
  To: Adam Sjøgren; +Cc: ding, emacs-devel

On Okt 29 2018, Adam Sjøgren <asjo@koldfront.dk> wrote:

> When I resend (S D r) an email in Gnus with headers like this:
>
>   Content-Type: text/plain; charset=utf-8
>   Content-Transfer-Encoding: 8bit
>
> and utf-8 chars in the content (such as →, æ, ø and å), Gnus says it
> doesn't know what charset to use, and asks if I want to send it as
> ASCII anyway.

I just resent this mail to myself, and had no problem whatsoever.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."




* Re: Resending email in Gnus, figuring out charset
  2018-10-29 21:15 ` Andreas Schwab
@ 2018-10-29 21:21   ` Adam Sjøgren
  2018-10-30  6:23     ` Eli Zaretskii
  2018-10-29 21:32   ` Resending email in Gnus, figuring out charset Adam Sjøgren
  1 sibling, 1 reply; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-29 21:21 UTC (permalink / raw)
  To: emacs-devel; +Cc: ding

Andreas writes:

> On Okt 29 2018, Adam Sjøgren <asjo@koldfront.dk> wrote:

> I just resent this mail to myself, and had no problem whatsoever.

Interesting.

I'm using:

  GNU Emacs 27.0.50 (build 17, x86_64-pc-linux-gnu, GTK+ Version 3.24.1) of 2018-10-15

and you?

If you step through mm-find-mime-charset-region when pressing S D r on
the email, what charset does it figure out?

As I wrote, I get (raw-text no-conversion), which the function doesn't
know what to do with, so it returns (nil). What happens in your setup?


  Best regards,

    Adam

-- 
 "He also no longer jokes about world domination; it          Adam Sjøgren
  was only funny when it was obviously meant in jest."   asjo@koldfront.dk





* Re: Resending email in Gnus, figuring out charset
  2018-10-29 21:15 ` Andreas Schwab
  2018-10-29 21:21   ` Adam Sjøgren
@ 2018-10-29 21:32   ` Adam Sjøgren
  2018-10-29 21:38     ` Adam Sjøgren
  1 sibling, 1 reply; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-29 21:32 UTC (permalink / raw)
  To: ding; +Cc: emacs-devel

Andreas writes:

> On Okt 29 2018, Adam Sjøgren <asjo@koldfront.dk> wrote:
>
>> When I resend (S D r) an email in Gnus with headers like this:
>>
>>   Content-Type: text/plain; charset=utf-8
>>   Content-Transfer-Encoding: 8bit
>>
>> and utf-8 chars in the content (such as →, æ, ø and å), Gnus says it
>> doesn't know what charset to use, and asks if I want to send it as
>> ASCII anyway.
>
> I just resent this mail to myself, and had no problem whatsoever.

Hm, no problem on _that_ message for me either. But I _do_ get the
problem on this message, in an nnml:-group:

== =
X-From-Line: imap Mon Oct 29 19:42:52 2018
Return-Path: <nobody@koldfront.dk>
X-Original-To: asjo
Delivered-To: asjo@virgil.koldfront.local
Received: by virgil.koldfront.dk (Postfix, from userid 65534)
	id D06F4801C1A5C; Mon, 29 Oct 2018 19:36:28 +0100 (CET)
DKIM-Filter: OpenDKIM Filter v2.11.0 virgil.koldfront.dk D06F4801C1A5C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=koldfront.dk;
	s=mail; t=1540838188;
	bh=WfvGOpyTtxwaOiJs88bePPtVIyRpvI5bQ0iBkXqesDk=;
	h=Subject:To:Date:From:From;
	b=HkDDxc7mpghKKTNijxksOQWYPo0E1JuQ9d83X4FalaMqWQheWC0THN8bpSf9K8a3k
	 tPjg2W0qUoAxSV0JtMzYSGbiLnhQTMx2Z+8beY1aC3jkbuAxKrXJKOcACO1tesioTG
	 SbmzB/Ohp6RD0Kz0F8j8BV4V25YoUH51qFkNufuQ9884RliLKy3ScjPyqXQvMqLSLO
	 G8G28pKa9en6dGlmyL2GsQ3n7xqDyiuKykYNaIOVExKRppGKwOF1BvGzRxwz03NjV0
	 jAIwUpicR4L1+xyWzkneZy11Og5x/G5rhfD5eUua1ADQmpRC0I8BcQYdf1A09L5SIj
	 quqBBXX9MWPmg==
Subject: Output from your job      210
To: asjo@koldfront.dk
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Gnus-Mail-Source: imap:mail.koldfront.dk:INBOX
Message-Id: <20181029183628.D06F4801C1A5C@virgil.koldfront.dk>
Date: Mon, 29 Oct 2018 19:36:28 +0100 (CET)
From: nobody@koldfront.dk (nobody)
Lines: 12
Xref: tullinup feedbase:211

2018-10-29 19:36:28 feedbase.blog.cfcs

 1/3 Center for Cybersikkerhed indvier i dag nyt Cybersituationscenter (Abonn©r p¥ seneste nyheder fra FE) new article

 2/3 Halvdelen af danskerne genbruger passwords (Abonn©r p¥ seneste nyheder fra FE) new article

 3/3 Forsvarets Efterretningstjeneste p¥ Kulturnat 2018 (Abonn©r p¥ seneste nyheder fra FE) new article

 † New: 3, updated: 0 (updates ignored: 0), crossposted: 0, seen: 0             
Total - new: 3, updated: 0 (updates ignored: 0), crossposted: 0, seen: 0


== =

I haven't figured out what the difference is. mail-group vs. newsgroup?



  /A

-- 
 "This German waltz is not as elegant as the ones from        Adam Sjøgren
  Vienna. But it is louder."                             asjo@koldfront.dk





* Re: Resending email in Gnus, figuring out charset
  2018-10-29 21:32   ` Resending email in Gnus, figuring out charset Adam Sjøgren
@ 2018-10-29 21:38     ` Adam Sjøgren
  0 siblings, 0 replies; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-29 21:38 UTC (permalink / raw)
  To: ding; +Cc: emacs-devel

Adam writes:

> Hm, no problem on _that_ message for me either. But I _do_ get the
> problem on this message, in an nnml:-group:

Ok, that one was broken in encoding. Let me try again:

== =
X-From-Line: imap Mon Oct 29 19:42:52 2018
Return-Path: <nobody@koldfront.dk>
X-Original-To: asjo
Delivered-To: asjo@virgil.koldfront.local
Received: by virgil.koldfront.dk (Postfix, from userid 65534)
	id D06F4801C1A5C; Mon, 29 Oct 2018 19:36:28 +0100 (CET)
DKIM-Filter: OpenDKIM Filter v2.11.0 virgil.koldfront.dk D06F4801C1A5C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=koldfront.dk;
	s=mail; t=1540838188;
	bh=WfvGOpyTtxwaOiJs88bePPtVIyRpvI5bQ0iBkXqesDk=;
	h=Subject:To:Date:From:From;
	b=HkDDxc7mpghKKTNijxksOQWYPo0E1JuQ9d83X4FalaMqWQheWC0THN8bpSf9K8a3k
	 tPjg2W0qUoAxSV0JtMzYSGbiLnhQTMx2Z+8beY1aC3jkbuAxKrXJKOcACO1tesioTG
	 SbmzB/Ohp6RD0Kz0F8j8BV4V25YoUH51qFkNufuQ9884RliLKy3ScjPyqXQvMqLSLO
	 G8G28pKa9en6dGlmyL2GsQ3n7xqDyiuKykYNaIOVExKRppGKwOF1BvGzRxwz03NjV0
	 jAIwUpicR4L1+xyWzkneZy11Og5x/G5rhfD5eUua1ADQmpRC0I8BcQYdf1A09L5SIj
	 quqBBXX9MWPmg==
Subject: Output from your job      210
To: asjo@koldfront.dk
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Gnus-Mail-Source: imap:mail.koldfront.dk:INBOX
Message-Id: <20181029183628.D06F4801C1A5C@virgil.koldfront.dk>
Date: Mon, 29 Oct 2018 19:36:28 +0100 (CET)
From: nobody@koldfront.dk (nobody)
Lines: 12
Xref: tullinup feedbase:211

2018-10-29 19:36:28 feedbase.blog.cfcs

 1/3 Center for Cybersikkerhed indvier i dag nyt Cybersituationscenter (Abonnér på seneste nyheder fra FE) new article

 2/3 Halvdelen af danskerne genbruger passwords (Abonnér på seneste nyheder fra FE) new article

 3/3 Forsvarets Efterretningstjeneste på Kulturnat 2018 (Abonnér på seneste nyheder fra FE) new article

 → New: 3, updated: 0 (updates ignored: 0), crossposted: 0, seen: 0             
Total - new: 3, updated: 0 (updates ignored: 0), crossposted: 0, seen: 0



== =

This one should be correct (inserted the file from disk).


  Best regards,

    Adam

-- 
 "You shouldn't let poets lie to you."                        Adam Sjøgren
                                                         asjo@koldfront.dk





* Re: Resending email in Gnus, figuring out charset
  2018-10-29 21:21   ` Adam Sjøgren
@ 2018-10-30  6:23     ` Eli Zaretskii
  2018-10-31 18:51       ` Adam Sjøgren
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2018-10-30  6:23 UTC (permalink / raw)
  To: Adam Sjøgren; +Cc: ding, emacs-devel

> From: Adam Sjøgren <asjo@koldfront.dk>
> Date: Mon, 29 Oct 2018 22:21:00 +0100
> Cc: ding@gnus.org
> 
> As I wrote, I get (raw-text no-conversion)

This means you have raw bytes in the mail body, so I think the problem
is elsewhere: where those bytes originated.




* Re: Resending email in Gnus, figuring out charset
  2018-10-30  6:23     ` Eli Zaretskii
@ 2018-10-31 18:51       ` Adam Sjøgren
  2018-10-31 18:59         ` Andreas Schwab
  2018-10-31 19:15         ` Eli Zaretskii
  0 siblings, 2 replies; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-31 18:51 UTC (permalink / raw)
  To: emacs-devel; +Cc: ding

Eli writes:

>> As I wrote, I get (raw-text no-conversion)

> This means you have raw bytes in the mail body, so I think the problem
> is elsewhere: where those bytes originated.

I'm not sure I understand that.

The Content-Transfer-Encoding: 8bit header means "raw bytes in the
body", and the Content-Type: text/plain; charset=utf-8 explains how
those bytes should be interpreted, right?

When I look at the feedbase-email in Gnus, it is displayed as expected,
but when I try to resend it, for some reason Gnus can't guess what the
encoding should be.

Interestingly Andreas Schwab tried resending my email explaining the
problem, which had the same headers and raw bytes, and Gnus _can_ figure
out the encoding when doing that.

So there must be some difference between the emails, confusing Gnus.


  Best regards,

    Adam

-- 
 "Det her er min Bob Dylan-sang nummer to                     Adam Sjøgren
  Den første var hæderlig, men ikke rigtigt go'"         asjo@koldfront.dk





* Re: Resending email in Gnus, figuring out charset
  2018-10-31 18:51       ` Adam Sjøgren
@ 2018-10-31 18:59         ` Andreas Schwab
  2018-10-31 19:42           ` Adam Sjøgren
  2018-10-31 19:15         ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Andreas Schwab @ 2018-10-31 18:59 UTC (permalink / raw)
  To: Adam Sjøgren; +Cc: emacs-devel, ding

On Okt 31 2018, Adam Sjøgren <asjo@koldfront.dk> wrote:

> When I look at the feedbase-email in Gnus, it is displayed as expected,
> but when I try to resend it, for some reason Gnus can't guess what the
> encoding should be.

What do you get if you run find-coding-systems-region on the article
buffer?
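
A sketch of running that check outside Gnus, assuming a multibyte
temporary buffer. The helper name my/candidate-coding-systems is
hypothetical, and the sample text is written with \u escapes so that it
does not depend on how the file itself is encoded:

```elisp
(defun my/candidate-coding-systems (text)
  "Return the coding systems that can encode every character of TEXT.
Hypothetical helper wrapping `find-coding-systems-region'."
  (with-temp-buffer
    (insert text)
    (find-coding-systems-region (point-min) (point-max))))

;; "\u00e9" is é and "\u2192" is →, the characters discussed in this thread.
(my/candidate-coding-systems "Abonn\u00e9r \u2192 New: 3")
```

For text that Emacs holds as ordinary characters, utf-8 should be among
the candidates; pure ASCII text gives (undecided).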

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."




* Re: Resending email in Gnus, figuring out charset
  2018-10-31 18:51       ` Adam Sjøgren
  2018-10-31 18:59         ` Andreas Schwab
@ 2018-10-31 19:15         ` Eli Zaretskii
  2018-10-31 19:43           ` Adam Sjøgren
  1 sibling, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2018-10-31 19:15 UTC (permalink / raw)
  To: Adam Sjøgren; +Cc: ding, emacs-devel

> From: Adam Sjøgren <asjo@koldfront.dk>
> Date: Wed, 31 Oct 2018 19:51:57 +0100
> Cc: ding@gnus.org
> 
> Eli writes:
> 
> >> As I wrote, I get (raw-text no-conversion)
> 
> > This means you have raw bytes in the mail body, so I think the problem
> > is elsewhere: where those bytes originated.
> 
> I'm not sure I understand that.
> 
> The Content-Transfer-Encoding: 8bit header means "raw bytes in the
> body", and the Content-Type: text/plain; charset=utf-8 explains how
> those bytes should be interpreted, right?

These headers tell the receiving end how to interpret the message.
But I meant something different: what you have in the Gnus buffer
_before_ the message is sent.

> When I look at the feedbase-email in Gnus, it is displayed as expected,
> but when I try to resend it, for some reason Gnus can't guess what the
> encoding should be.

That's a sign of raw bytes in the buffer.

If you go to one of the offending characters in the Gnus buffer and
type "C-u C-x =", what does Emacs show about those characters?
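
The same check can be scripted rather than done character by character.
A minimal sketch, assuming that raw bytes in a multibyte buffer are
reported by char-charset as belonging to the eight-bit charset;
my/raw-byte-positions is a hypothetical helper, not part of Gnus:

```elisp
(defun my/raw-byte-positions (&optional buffer)
  "Return the positions in BUFFER that hold raw 8-bit bytes.
BUFFER defaults to the current buffer.  Hypothetical helper."
  (with-current-buffer (or buffer (current-buffer))
    (save-excursion
      (goto-char (point-min))
      (let (positions)
        (while (not (eobp))
          ;; Raw bytes belong to the `eight-bit' charset.
          (when (eq (char-charset (char-after)) 'eight-bit)
            (push (point) positions))
          (forward-char 1))
        (nreverse positions)))))
```

Evaluating (my/raw-byte-positions) in the buffer about to be sent lists
every offending position at once; an empty list means no raw bytes.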




* Re: Resending email in Gnus, figuring out charset
  2018-10-31 18:59         ` Andreas Schwab
@ 2018-10-31 19:42           ` Adam Sjøgren
  0 siblings, 0 replies; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-31 19:42 UTC (permalink / raw)
  To: ding; +Cc: emacs-devel

Andreas writes:

> On Okt 31 2018, Adam Sjøgren <asjo@koldfront.dk> wrote:
>
>> When I look at the feedbase-email in Gnus, it is displayed as expected,
>> but when I try to resend it, for some reason Gnus can't guess what the
>> encoding should be.
>
> What do you get if you run find-coding-systems-region on the article
> buffer?

I did this change:

@@ -609,6 +610,7 @@ mm-find-mime-charset-region
     (if (and (memq 'iso-2022-jp-2 charsets)
 	     (memq 'iso-2022-jp-2 hack-charsets))
 	(setq charsets (delq 'iso-2022-jp charsets)))
+    (message "mm-util.el:mm-find-mime-charset-region b:%s e:%s f-c-s-r: %s return:%s" b e (find-coding-systems-region b e) charsets)
     charsets))
 
 (defmacro mm-with-unibyte-buffer (&rest forms)

and then did S D r on the feedbase-email¹. In *Messages* I got:

  Resending message to asjo@koldfront.dk...
  mm-util.el:mm-find-mime-charset-region b:1402 e:1929 f-c-s-r: (raw-text no-conversion) return:(nil)
  Message contains characters with unknown encoding.  Really send? (y or n) n
  mml-parse-1: Edit your message to remove those characters

Then I tried on my original article in this thread² (the one you tested
with), and I got:

  Resending message to asjo@koldfront.dk...
  mm-util.el:mm-find-mime-charset-region b:1 e:30 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:30 e:78 f-c-s-r: (utf-8 iso-latin-1 korean-iso-8bit euc-jis-2004 japanese-iso-8bit windows-1258 iso-2022-jp-2004 adobe-standard-encoding hp-roman8 next mac-roman cp865 cp861 cp858 cp857 cp850 cp775 windows-1257 windows-1254 windows-1252 iso-latin-9 iso-latin-8 iso-latin-7 iso-latin-6 iso-latin-5 iso-latin-4 chinese-gb18030 chinese-big5-hkscs utf-7 iso-2022-kr iso-2022-jp-2 utf-16 utf-16be-with-signature utf-16le-with-signature utf-16be utf-16le compound-text-with-extensions compound-text iso-2022-7bit utf-8-auto utf-8-with-signature emacs-mule raw-text iso-2022-8bit-ss2 iso-2022-7bit-lock eucjp-ms utf-8-hfs georgian-academy georgian-ps korean-cp949 japanese-shift-jis-2004 japanese-iso-7bit-1978-irv ibm1047 utf-7-imap utf-8-emacs prefer-utf-8 no-conversion ctext-no-compositions iso-2022-7bit-lock-ss2 iso-2022-7bit-ss2) return:(utf-8) [2 times]
  mm-util.el:mm-find-mime-charset-region b:43 e:55 f-c-s-r: (utf-8 iso-latin-1 korean-iso-8bit euc-jis-2004 japanese-iso-8bit windows-1258 iso-2022-jp-2004 adobe-standard-encoding hp-roman8 next mac-roman cp865 cp861 cp858 cp857 cp850 cp775 windows-1257 windows-1254 windows-1252 iso-latin-9 iso-latin-8 iso-latin-7 iso-latin-6 iso-latin-5 iso-latin-4 chinese-gb18030 chinese-big5-hkscs utf-7 iso-2022-kr iso-2022-jp-2 utf-16 utf-16be-with-signature utf-16le-with-signature utf-16be utf-16le compound-text-with-extensions compound-text iso-2022-7bit utf-8-auto utf-8-with-signature emacs-mule raw-text iso-2022-8bit-ss2 iso-2022-7bit-lock eucjp-ms utf-8-hfs georgian-academy georgian-ps korean-cp949 japanese-shift-jis-2004 japanese-iso-7bit-1978-irv ibm1047 utf-7-imap utf-8-emacs prefer-utf-8 no-conversion ctext-no-compositions iso-2022-7bit-lock-ss2 iso-2022-7bit-ss2) return:(utf-8)
  mm-util.el:mm-find-mime-charset-region b:93 e:138 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:138 e:196 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:196 e:238 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:238 e:294 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:294 e:349 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:349 e:404 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:404 e:442 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:442 e:511 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:511 e:521 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:521 e:546 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:546 e:597 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:597 e:633 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:633 e:651 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:651 e:691 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:691 e:723 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:723 e:809 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:809 e:850 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:850 e:907 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:907 e:968 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:968 e:992 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:992 e:1010 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:1010 e:1088 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:1088 e:1139 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:1139 e:1175 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:1175 e:1419 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:1419 e:1646 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:1646 e:1921 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:1921 e:2193 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:2193 e:2421 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:2421 e:2612 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:2612 e:2652 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:2652 e:2671 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:2671 e:2721 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:2721 e:2931 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:2931 e:3023 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:3023 e:3048 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:3048 e:3168 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:3168 e:3215 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:3215 e:3237 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:3237 e:4261 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:4261 e:4286 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:4286 e:4303 f-c-s-r: (undecided) return:nil [2 times]
  mm-util.el:mm-find-mime-charset-region b:4303 e:4376 f-c-s-r: (undecided) return:nil [2 times]
  Sending via mail...

So it looks like it is calling mm-find-mime-charset-region on each header
in the second case and not on the body, but in the failing case, it
calls it on the body only?!


  Still puzzled,

    Adam  

¹ <20181029183628.D06F4801C1A5C@virgil.koldfront.dk>
² <87in1ktvau.fsf@tullinup.koldfront.dk>

-- 
 "We are all in the gutter, but some of us are looking        Adam Sjøgren
  at the stars."                                         asjo@koldfront.dk





* Re: Resending email in Gnus, figuring out charset
  2018-10-31 19:15         ` Eli Zaretskii
@ 2018-10-31 19:43           ` Adam Sjøgren
  2018-10-31 20:10             ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-31 19:43 UTC (permalink / raw)
  To: ding; +Cc: emacs-devel

Eli writes:

>> The Content-Transfer-Encoding: 8bit header means "raw bytes in the
>> body", and the Content-Type: text/plain; charset=utf-8 explains how
>> those bytes should be interpreted, right?
>
> These headers tell the receiving end how to interpret the message.

Yes. So as I received this email, Gnus should be interpreting the bytes
as utf-8. And it seems to be, as they are displayed correctly.

> But I meant something different: what you have in the Gnus buffer
> _before_ the message is sent.

Before I resend the message, the buffer looks correct (i.e. I see the
arrow and the accented e rather than \nnn\nnn\nnn etc.)

>> When I look at the feedbase-email in Gnus, it is displayed as expected,
>> but when I try to resend it, for some reason Gnus can't guess what the
>> encoding should be.
>
> That's a sign of raw bytes in the buffer.
>
> If you go to one of the offending characters in the Gnus buffer and
> type "C-u C-x =", what does Emacs show about those characters?

Ok, if I open the feedbase-email in Gnus, before I press S D r to
resend, and move point to → and é in the *Article* buffer, I get:

               position: 530 of 684 (77%), column: 1
              character: → (displayed as →) (codepoint 8594, #o20622, #x2192)
      preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0x2192
                 script: symbol
                 syntax: . 	which means: punctuation
               category: .:Base, c:Chinese, h:Korean, j:Japanese
               to input: type "C-x 8 RET 2192" or "C-x 8 RET RIGHTWARDS ARROW"
            buffer code: #xE2 #x86 #x92
              file code: #xE2 #x86 #x92 (encoded by coding system utf-8-unix)
                display: by this font (glyph code)
      xft:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#x7AE)

  Character code properties: customize what to show
    name: RIGHTWARDS ARROW
    old-name: RIGHT ARROW
    general-category: Sm (Symbol, Math)
    decomposition: (8594) ('→')

and:

               position: 284 of 684 (41%), column: 6
              character: é (displayed as é) (codepoint 233, #o351, #xe9)
      preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0xE9
                 script: latin
                 syntax: w 	which means: word
               category: .:Base, L:Left-to-right (strong), c:Chinese, j:Japanese, l:Latin, v:Viet
               to input: type "C-x 8 RET e9" or "C-x 8 RET LATIN SMALL LETTER E WITH ACUTE"
            buffer code: #xC3 #xA9
              file code: #xC3 #xA9 (encoded by coding system utf-8-unix)
                display: by this font (glyph code)
      xft:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#xAB)

  Character code properties: customize what to show
    name: LATIN SMALL LETTER E WITH ACUTE
    old-name: LATIN SMALL LETTER E ACUTE
    general-category: Ll (Letter, Lowercase)
    decomposition: (101 769) ('e' '́')


  Best regards,

    Adam

-- 
 "God must've been punting angels left and right."            Adam Sjøgren
                                                         asjo@koldfront.dk





* Re: Resending email in Gnus, figuring out charset
  2018-10-31 19:43           ` Adam Sjøgren
@ 2018-10-31 20:10             ` Eli Zaretskii
  2018-10-31 20:22               ` Adam Sjøgren
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2018-10-31 20:10 UTC (permalink / raw)
  To: Adam Sjøgren; +Cc: ding, emacs-devel

> From: Adam Sjøgren <asjo@koldfront.dk>
> Date: Wed, 31 Oct 2018 20:43:44 +0100
> Cc: ding@gnus.org
> 
> Eli writes:
> 
> >> The Content-Transfer-Encoding: 8bit header means "raw bytes in the
> >> body", and the Content-Type: text/plain; charset=utf-8 explains how
> >> those bytes should be interpreted, right?
> >
> > These headers tell the receiving end how to interpret the message.
> 
> Yes. So as I received this email, Gnus should be interpreting the bytes
> as utf-8. And it seems to be, as they are displayed correctly.

The way they are displayed can deceive.

> > If you go to one of the offending characters in the Gnus buffer and
> > type "C-u C-x =", what does Emacs show about those characters?
> 
> Ok, if I open the feedbase-email in Gnus, before I press S D r to
> resend, and move point to → and é in the *Article* buffer, I get:

Did you try all of those characters?  Just one is enough to cause what
you describe.

If none of them shows up as belonging to eight-bit charset, the
problem could be in how you set up your email encoding (or encoding in
general).




* Re: Resending email in Gnus, figuring out charset
  2018-10-31 20:10             ` Eli Zaretskii
@ 2018-10-31 20:22               ` Adam Sjøgren
  2018-10-31 21:11                 ` Resending email in Gnus, figuring out charset [solved] Adam Sjøgren
  0 siblings, 1 reply; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-31 20:22 UTC (permalink / raw)
  To: emacs-devel; +Cc: ding

Eli writes:

>> Yes. So as I received this email, Gnus should be interpreting the bytes
>> as utf-8. And it seems to be, as they are displayed correctly.
>
> The way they are displayed can deceive.

Ok. If I do C-u g (display the raw article) I see them as \nnn.

>> Ok, if I open the feedbase-email in Gnus, before I press S D r to
>> resend, and move point to → and é in the *Article* buffer, I get:
>
> Did you try all of those characters?  Just one is enough to cause what
> you describe.

No, just the arrow and one of the accented e's. Why would the other
characters be different?

I have checked the two other accented e's now, they are reported
identical to the first when I use C-u C-x = on them.

> If none of them shows up as belonging to eight-bit charset, the
> problem could be in how you set up your email encoding (or encoding in
> general).

Could very well be. I haven't set anything up, as far as I know, besides
this:

  (setq mm-coding-system-priorities '(utf-8 iso-8859-1))


  Best regards,

    Adam

-- 
 "Du er som en tiger i fængsel                                Adam Sjøgren
  Jeg er som en kaktus i snor"                           asjo@koldfront.dk





* Re: Resending email in Gnus, figuring out charset [solved]
  2018-10-31 20:22               ` Adam Sjøgren
@ 2018-10-31 21:11                 ` Adam Sjøgren
  0 siblings, 0 replies; 14+ messages in thread
From: Adam Sjøgren @ 2018-10-31 21:11 UTC (permalink / raw)
  To: emacs-devel; +Cc: ding

I did some more testing and noticed that the email where resend (S D r)
didn't grok the encoding had Content-Transfer-Encoding and Content-Type
headers, but no MIME-Version header.

After adding a MIME-Version: 1.0 header, I don't get prompted about
characters with unknown encoding!
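
A rough sketch of checking for this up front, assuming the message is
available as a string with the headers separated from the body by an
empty line. The function name my/mime-version-header-p is hypothetical
and this is not something Gnus provides:

```elisp
(defun my/mime-version-header-p (message-text)
  "Return t if the header section of MESSAGE-TEXT has a MIME-Version field.
Hypothetical helper; only looks above the first empty line."
  (with-temp-buffer
    (insert message-text)
    (goto-char (point-min))
    (let ((case-fold-search t)          ; header names are case-insensitive
          (header-end (save-excursion
                        (if (re-search-forward "^$" nil t)
                            (point)
                          (point-max)))))
      ;; Search is bounded by the end of the headers, so a matching
      ;; line in the body does not count.
      (and (re-search-forward "^MIME-Version[ \t]*:" header-end t) t))))

(my/mime-version-header-p
 "Content-Type: text/plain; charset=utf-8\nContent-Transfer-Encoding: 8bit\n\nbody\n")
```

The sample call uses the two headers the problematic email did have, so
it reports that MIME-Version is missing.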

Thanks for the feedback spurring me on, Andreas and Eli.


  Best regards,

    Adam

-- 
 "Accept failure gracefully."                                 Adam Sjøgren
                                                         asjo@koldfront.dk



