Announcements and discussions for Gnus, the GNU Emacs Usenet newsreader
* mail back to UTF-8 before save to HD
@ 2016-02-04  3:37 Emanuel Berg
  2016-02-04  7:45 ` Peter Münster
  0 siblings, 1 reply; 17+ messages in thread
From: Emanuel Berg @ 2016-02-04  3:37 UTC (permalink / raw)
  To: info-gnus-english

(case 1) If I write a mail with the Swedish chars "å",
"ä", or "ö", as in

    äåö
    ÄÅÖ

the mail gets the headers:

    Content-Type: text/plain; charset=utf-8
    Content-Transfer-Encoding: quoted-printable

(case 2) But if I don't use any of those chars, there
is only:

    Content-Type: text/plain

However when the mail from (case 1) is archived in my
nnml group mail.sent, the UTF-8 isn't restored but the
"quoted-printable" style remains:

    =C3=A4=C3=A5=C3=B6
    =C3=84=C3=85=C3=96

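I wrote this zsh function to restore it:

    back-to-swedish () {
        # files to decode; the result goes to stdout
        local files=($@)
        # each =C3=XX pair is the UTF-8 byte sequence of one letter
        sed -e 's/=C3=A5/å/g; s/=C3=A4/ä/g; s/=C3=B6/ö/g' \
            -e 's/=C3=85/Å/g; s/=C3=84/Ä/g; s/=C3=96/Ö/g' \
            -e 's/=20/ /g' $files
    }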

(I don't know why the whitespace at the end of the
signature delimiter must be restored as well. It gets
sent and archived as =20.)

My question is, how can I automatize this so the UTF-8
is restored automatically if it was used in the
original message?
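
A side note on the sed table above: it only restores the six letters
it lists plus =20. A minimal sketch of a general decoder follows,
assuming Python 3 is available and the archived body is
quoted-printable in UTF-8. The name qp_to_text is made up for this
illustration; quopri is part of the Python standard library.

```python
import quopri


def qp_to_text(qp_bytes, charset="utf-8"):
    """Decode a quoted-printable body (bytes) into text.

    Hypothetical helper for illustration; it assumes the whole input
    is quoted-printable and that the charset is known in advance.
    """
    # quopri handles every =XX escape as well as soft line breaks
    # (a "=" at the very end of a line).
    return quopri.decodestring(qp_bytes).decode(charset)


body = b"=C3=A4=C3=A5=C3=B6\n=C3=84=C3=85=C3=96\n--=20\n"
print(qp_to_text(body))
```

The =20 after the signature delimiter is just another escape to this
decoder, so it needs no special case.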

-- 
underground experts united
http://user.it.uu.se/~embe8573


_______________________________________________
info-gnus-english mailing list
info-gnus-english@gnu.org
https://lists.gnu.org/mailman/listinfo/info-gnus-english


* Re: mail back to UTF-8 before save to HD
  2016-02-04  3:37 mail back to UTF-8 before save to HD Emanuel Berg
@ 2016-02-04  7:45 ` Peter Münster
  2016-02-04 21:32   ` Emanuel Berg
  2016-02-05 18:42   ` Emanuel Berg
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Münster @ 2016-02-04  7:45 UTC (permalink / raw)
  To: info-gnus-english

On Thu, Feb 04 2016, Emanuel Berg wrote:

> However when the mail from (case 1) is archived in my
> nnml group mail.sent, the UTF-8 isn't restored but the
> "quoted-printable" style remains:
>
>     =C3=A4=C3=A5=C3=B6
>     =C3=84=C3=85=C3=96

But in the Gnus Article buffer you see "äåö" and "ÄÅÖ", don't you?


> My question is, how can I automatize this so the UTF-8
> is restored automatically if it was used in the
> original message?

If you really want quoted-printable format for sending and archiving,
then I don't know the answer.
But if you don't need quoted-printable in the first place, then

(add-to-list 'mm-content-transfer-encoding-defaults '("text/plain" 8bit))

would probably solve your problem.

-- 
           Peter



* Re: mail back to UTF-8 before save to HD
  2016-02-04  7:45 ` Peter Münster
@ 2016-02-04 21:32   ` Emanuel Berg
  2016-02-05 18:42   ` Emanuel Berg
  1 sibling, 0 replies; 17+ messages in thread
From: Emanuel Berg @ 2016-02-04 21:32 UTC (permalink / raw)
  To: info-gnus-english

Peter Münster <pmlists@free.fr> writes:

> But in the Gnus Article buffer you see "äåö" and
> "ÄÅÖ", don't you?

Otherwise it would be intolerable...

> If you really want quoted-printable format for
> sending and archiving, then I don't know the answer.
> But if you don't need quoted-printable in the first
> place, then
>
> (add-to-list 'mm-content-transfer-encoding-defaults
> '("text/plain" 8bit))
>
> would probably solve your problem.

I have not fiddled with quoted-printable so I don't
know. But I'll test it, right away.

Actually I don't mind mails being sent in any
particular way.

But yes, I would like it *archived* with the chars
intact (or "put back") because I use the shell tools
to interact with the archive - then I don't want it
to be all gibberish-looking or "no hits" when there
are in fact plenty...

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: mail back to UTF-8 before save to HD
  2016-02-04  7:45 ` Peter Münster
  2016-02-04 21:32   ` Emanuel Berg
@ 2016-02-05 18:42   ` Emanuel Berg
  2016-02-05 18:59     ` Teemu Likonen
  1 sibling, 1 reply; 17+ messages in thread
From: Emanuel Berg @ 2016-02-05 18:42 UTC (permalink / raw)
  To: info-gnus-english

Peter Münster <pmlists@free.fr> writes:

> (add-to-list 'mm-content-transfer-encoding-defaults
> '("text/plain" 8bit))
>
> would probably solve your problem.

I tried altering
`mm-content-transfer-encoding-defaults' by putting

    (".*" base64)

at its head (and likewise with `8bit' and `qp-or-base64'),

but the headers are always the same:

    Content-Type: text/plain; charset=utf-8
    Content-Transfer-Encoding: quoted-printable

and file(1) also says the same of the archived files:

    message/rfc822; charset=us-ascii

and the translated chars remain:

    =C3=A5 =C3=A4 =C3=B6 =C3=85 =C3=84 =C3=9

Are you sure
`mm-content-transfer-encoding-defaults' applies?
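
For what it is worth, the us-ascii verdict from file(1) is what one
would expect for a quoted-printable body: every non-ASCII byte is
stored as an =XX escape, so the file on disk holds nothing but ASCII.
A small sketch of that, assuming Python 3.7 or later (for
bytes.isascii); the sample string is the one used in this thread and
quopri is from the standard library.

```python
import quopri

text = "åäö ÅÄÖ"
raw_8bit = text.encode("utf-8")          # bytes an 8bit body would contain
raw_qp = quopri.encodestring(raw_8bit)   # bytes a quoted-printable body contains

# The quoted-printable form is pure ASCII, the 8bit form is not.
print(raw_qp)
print(raw_8bit.isascii(), raw_qp.isascii())
```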

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: mail back to UTF-8 before save to HD
  2016-02-05 18:42   ` Emanuel Berg
@ 2016-02-05 18:59     ` Teemu Likonen
  2016-02-06  0:20       ` Emanuel Berg
  0 siblings, 1 reply; 17+ messages in thread
From: Teemu Likonen @ 2016-02-05 18:59 UTC (permalink / raw)
  To: info-gnus-english


Mail is archived in the same format that it is sent. I don't think you
can change that. But if you always want UTF-8 charset and 8bit content
transfer encoding I think these will do it:


(setq
 mm-body-charset-encoding-alist '((utf-8 . 8bit))
 mm-coding-system-priorities '(utf-8))

(add-to-list 'mm-content-transfer-encoding-defaults '("text/.*" 8bit))

-- 
/// Teemu Likonen   - .-..   <https://github.com/tlikonen> //
// PGP: 4E10 55DC 84E9 DFF6 13D7 8557 719D 69D3 2453 9450 ///


* Re: mail back to UTF-8 before save to HD
  2016-02-05 18:59     ` Teemu Likonen
@ 2016-02-06  0:20       ` Emanuel Berg
  2016-02-06  2:11         ` Adam Sjøgren
  2016-02-06  6:57         ` Teemu Likonen
  0 siblings, 2 replies; 17+ messages in thread
From: Emanuel Berg @ 2016-02-06  0:20 UTC (permalink / raw)
  To: info-gnus-english

Teemu Likonen <tlikonen@iki.fi> writes:

> But if you always want UTF-8 charset and 8bit
> content transfer encoding

Are there any drawbacks to that?

> (setq mm-body-charset-encoding-alist '((utf-8 . 8bit))
> mm-coding-system-priorities '(utf-8))
>
> (add-to-list 'mm-content-transfer-encoding-defaults
> '("text/.*" 8bit))

I already did try all combinations with
`mm-content-transfer-encoding-defaults' to no avail.

Setting `mm-coding-system-priorities' doesn't seem to
make a difference either, at least not together with

    (setq mm-body-charset-encoding-alist '((utf-8 . 8bit)))

- however that (almost) did it.

file -i says:

    message/rfc822; charset=utf-8

and the headers are now:

    Content-Type: text/plain; charset=utf-8
    Content-Transfer-Encoding: 8bit

Only the Subject is now =?utf-8?B?w6XDpMO2IMOFw4TDlg==?
for "åäö ÅÄÖ" - and there is no
"mm-SUBJECT-charset-encoding-alist"...

And, with `gnus-summary-show-article' and one C-u, it
looks like this:

    ¥¤¶ …„–
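
An observation on that output, offered as a sketch rather than an
explanation: the six glyphs are what the second byte of each two-byte
UTF-8 sequence looks like when read as Windows-1252. The snippet
assumes Python 3; why the 0xC3 lead bytes are not visible is not
something it answers.

```python
text = "åäö ÅÄÖ"
utf8 = text.encode("utf-8")

# Keep only the second byte of each two-byte sequence (drop every 0xC3).
second_bytes = bytes(b for b in utf8 if b != 0xC3)

# Read the remaining bytes as Windows-1252.
glyphs = second_bytes.decode("cp1252")

print(" ".join(f"{b:02X}" for b in utf8))
print(glyphs)
```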

-- 
underground experts united
http://user.it.uu.se/~embe8573






* Re: mail back to UTF-8 before save to HD
  2016-02-06  0:20       ` Emanuel Berg
@ 2016-02-06  2:11         ` Adam Sjøgren
  2016-02-06  2:15           ` Emanuel Berg
  2016-02-06  6:57         ` Teemu Likonen
  1 sibling, 1 reply; 17+ messages in thread
From: Adam Sjøgren @ 2016-02-06  2:11 UTC (permalink / raw)
  To: info-gnus-english

Emanuel writes:

> Only the Subject is now =?utf-8?B?w6XDpMO2IMOFw4TDlg==?

8-bit is not allowed in headers; rfc2047 specifies how to encode
non-ascii values in headers - the result is what you see.
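
To illustrate the point, here is a minimal sketch of decoding that
Subject outside Gnus, assuming Python 3 and its standard email.header
module. The encoded word is the one quoted above with the closing "?="
that RFC 2047 requires; the copy in this thread looks cut one
character short, so that last character is an assumption.

```python
from email.header import decode_header, make_header

# Encoded word from the thread, with the closing "?=" added (assumed,
# since RFC 2047 encoded words end in "?=").
raw_subject = "=?utf-8?B?w6XDpMO2IMOFw4TDlg==?="

# decode_header splits the value into (bytes, charset) parts;
# make_header joins them back into readable text.
subject = str(make_header(decode_header(raw_subject)))
print(subject)
```

The "B" between the question marks means base64; a "Q" there would
mean a quoted-printable variant, and the same two calls cover both.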


  Best regards,

-- 
 "Fish swim, birds fly,                                       Adam Sjøgren
  daddy's yell, mama's cry                               asjo@koldfront.dk
  old men sit and think
  I drink"



* Re: mail back to UTF-8 before save to HD
  2016-02-06  2:11         ` Adam Sjøgren
@ 2016-02-06  2:15           ` Emanuel Berg
  2016-02-06 14:23             ` Adam Sjøgren
  0 siblings, 1 reply; 17+ messages in thread
From: Emanuel Berg @ 2016-02-06  2:15 UTC (permalink / raw)
  To: info-gnus-english

asjo@koldfront.dk (Adam Sjøgren) writes:

>> Only the Subject is now =?utf-8?B?w6XDpMO2IMOFw4TDlg==?
>
> 8-bit is not allowed in headers; rfc2047 specifies
> how to encode non-ascii values in headers - the
> result is what you see.

Then the only issue left is why the UTF-8 don't show
up in the raw article mode.

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: mail back to UTF-8 before save to HD
  2016-02-06  0:20       ` Emanuel Berg
  2016-02-06  2:11         ` Adam Sjøgren
@ 2016-02-06  6:57         ` Teemu Likonen
  2016-02-06 21:36           ` Emanuel Berg
  1 sibling, 1 reply; 17+ messages in thread
From: Teemu Likonen @ 2016-02-06  6:57 UTC (permalink / raw)
  To: info-gnus-english


Emanuel Berg [2016-02-06 01:20:15+01] wrote:

> Only the Subject is now =?utf-8?B?w6XDpMO2IMOFw4TDlg==?

And that's how it should be. Headers must be "Q" or "B" coded (quoted
printable or base64, respectively).

It is actually possible to make headers unencoded with variable
gnus-group-posting-charset-alist BUT DON'T DO IT. Your messages will be
broken in other people's mail agents and you'll be instructed to fix
your configuration. Headers must be in =?charset?encoding?...?= format
if there is anything other than Ascii characters.

-- 
/// Teemu Likonen   - .-..   <https://github.com/tlikonen> //
// PGP: 4E10 55DC 84E9 DFF6 13D7 8557 719D 69D3 2453 9450 ///


* Re: mail back to UTF-8 before save to HD
  2016-02-06  2:15           ` Emanuel Berg
@ 2016-02-06 14:23             ` Adam Sjøgren
  2016-02-06 21:47               ` Emanuel Berg
  0 siblings, 1 reply; 17+ messages in thread
From: Adam Sjøgren @ 2016-02-06 14:23 UTC (permalink / raw)
  To: info-gnus-english

Emanuel writes:

> asjo@koldfront.dk (Adam Sjøgren) writes:

>>> Only the Subject is now =?utf-8?B?w6XDpMO2IMOFw4TDlg==?

>> 8-bit is not allowed in headers; rfc2047 specifies
>> how to encode non-ascii values in headers - the
>> result is what you see.

> Then the only issue left is why the UTF-8 don't show
> up in the raw article mode.

They won't, because the standards (RFCs) do not allow 8-bit characters
in headers, as I just wrote above. "Raw" headers must be encoded, in the
way specified in rfc2047.

It is then up to the program displaying (or indexing, or searching) them
to decode the representation.
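
As a sketch of what such a decoding program might do before searching
an archived message, assuming Python 3.6 or later with the standard
email package and a single-part text/plain message (the function name
searchable_text and the sample message are made up for this
illustration):

```python
from email import message_from_bytes, policy


def searchable_text(raw_message):
    """Return (subject, body) of a single-part text message as str.

    Hypothetical helper; multipart messages would need extra handling.
    """
    # policy.default decodes RFC 2047 headers on access, and
    # get_content() undoes the Content-Transfer-Encoding and charset.
    msg = message_from_bytes(raw_message, policy=policy.default)
    return str(msg["Subject"]), msg.get_content()


raw = (
    b"Subject: =?utf-8?B?w6XDpMO2IMOFw4TDlg==?=\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"=C3=A4=C3=A5=C3=B6\n"
)

subject, body = searchable_text(raw)
print(subject)
print(body)
```

With this approach the file on disk can stay in whatever encoding it
was sent in; only the search side needs to know about MIME.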


  Best regards,

    Adam

-- 
 "I am still twitching at the idea that you need to           Adam Sjøgren
  load code into the kernel in order to re-map a         asjo@koldfront.dk
  mouse button."



* Re: mail back to UTF-8 before save to HD
  2016-02-06  6:57         ` Teemu Likonen
@ 2016-02-06 21:36           ` Emanuel Berg
  0 siblings, 0 replies; 17+ messages in thread
From: Emanuel Berg @ 2016-02-06 21:36 UTC (permalink / raw)
  To: info-gnus-english

Teemu Likonen <tlikonen@iki.fi> writes:

>> Only the Subject is now
>> =?utf-8?B?w6XDpMO2IMOFw4TDlg==?
>
> And that's how it should be. Headers must be "Q" or
> "B" coded (quoted printable or base64,
> respectively).
>
> It is actually possible to make headers unencoded
> with variable gnus-group-posting-charset-alist BUT
> DON'T DO IT. Your messages will be broken in other
> people's mail agents and you'll be instructed to fix
> your configuration. Headers must be in
> =?charset?encoding?...?= format if there is anything
> other than Ascii characters.

Actually changing the transfer encoding at all wasn't
my original idea. Instead, the idea was to do a
conversion after sending, prior to the actual
archiving on disk.

But UTF-8 is pretty much standard as well, aye? So why
not.

This seems to be the best solution because now not
only are the messages archived in a form that is
searchable and readable, they are also still messages
that don't need to be "converted back" if used in
a mail setting again.

-- 
underground experts united
http://user.it.uu.se/~embe8573




* Re: mail back to UTF-8 before save to HD
  2016-02-06 14:23             ` Adam Sjøgren
@ 2016-02-06 21:47               ` Emanuel Berg
  2016-02-06 21:56                 ` Adam Sjøgren
  0 siblings, 1 reply; 17+ messages in thread
From: Emanuel Berg @ 2016-02-06 21:47 UTC (permalink / raw)
  To: info-gnus-english

asjo@koldfront.dk (Adam Sjøgren) writes:

>> Then the only issue left is why the UTF-8 don't
>> show up in the raw article mode.
>
> They won't, because the standards (RFCs) do not
> allow 8-bit characters in headers, as I just wrote
> above. "Raw" headers must be encoded, in the way
> specified in rfc2047.

In the *bodies* of the archived mails (which are files
in UTF-8) the Swedish chars are now shown, but after
`C-u g' in Gnus article mode it is the same old -
compare these screenshots:

    http://user.it.uu.se/~embe8573/pics/gnus-format/cat-archived-mail.png

    http://user.it.uu.se/~embe8573/pics/gnus-format/gnus-raw-article.png

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: mail back to UTF-8 before save to HD
  2016-02-06 21:47               ` Emanuel Berg
@ 2016-02-06 21:56                 ` Adam Sjøgren
  2016-02-06 22:17                   ` Emanuel Berg
  0 siblings, 1 reply; 17+ messages in thread
From: Adam Sjøgren @ 2016-02-06 21:56 UTC (permalink / raw)
  To: info-gnus-english

Emanuel writes:

> In the *bodies* of the archived mails (which are files in UTF-8) the
> Swedish chars are now shown, but after `C-u g' in Gnus article mode it
> is the same old -

So you expect raw mode not to show you raw bytes, but utf-8 decoded text?


  Best regards,

    Adam

-- 
 "The choice is small If there's choice at all                Adam Sjøgren
  But I'm holding on as if I'm going to fall"            asjo@koldfront.dk



* Re: mail back to UTF-8 before save to HD
  2016-02-06 21:56                 ` Adam Sjøgren
@ 2016-02-06 22:17                   ` Emanuel Berg
  2016-02-06 22:34                     ` Adam Sjøgren
  0 siblings, 1 reply; 17+ messages in thread
From: Emanuel Berg @ 2016-02-06 22:17 UTC (permalink / raw)
  To: info-gnus-english

asjo@koldfront.dk (Adam Sjøgren) writes:

> So you expect raw mode not to show you raw bytes,
> but utf-8 decoded text?

OK, let's rephrase...

As you know there is some washing performed to make
the mails/posts look good. Rarely, but still, washing
doesn't work and instead it messes up some detail, be
it a piece of code, a citation, a figure ("ASCII art"
- yuk, that word), or whatever actually.

How can I then view the mail/post with Gnus (i.e., the
shortcuts etc. intact) but see the post as if I would
use cat(1) in the shell?

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: mail back to UTF-8 before save to HD
  2016-02-06 22:17                   ` Emanuel Berg
@ 2016-02-06 22:34                     ` Adam Sjøgren
  2016-02-07  3:21                       ` Emanuel Berg
  0 siblings, 1 reply; 17+ messages in thread
From: Adam Sjøgren @ 2016-02-06 22:34 UTC (permalink / raw)
  To: info-gnus-english

Emanuel writes:

> How can I then view the mail/post with Gnus (i.e., the
> shortcuts etc. intact) but see the post as if I would
> use cat(1) in the shell?

How about typing "| cat RET"?


  Best regards,

    Adam

-- 
 "Ok, Jack, time for your lobotomy!! Hand me a big            Adam Sjøgren
  spoon will you Hobbes?" "Ugh! No anesthetic even."     asjo@koldfront.dk



* Re: mail back to UTF-8 before save to HD
  2016-02-06 22:34                     ` Adam Sjøgren
@ 2016-02-07  3:21                       ` Emanuel Berg
  2016-02-07  3:31                         ` Emanuel Berg
  0 siblings, 1 reply; 17+ messages in thread
From: Emanuel Berg @ 2016-02-07  3:21 UTC (permalink / raw)
  To: info-gnus-english

asjo@koldfront.dk (Adam Sjøgren) writes:

>> How can I then view the mail/post with Gnus (i.e.,
>> the shortcuts etc. intact) but see the post as if
>> I would use cat(1) in the shell?
>
> How about typing "| cat RET"?

Wonderful!

I wrote this -

(defun gnus-article-cat-message ()
  (interactive)
  (gnus-summary-pipe-message "cat") )

- and bound it to `p', for "pipe"!

To get back, there is already `g' for
`gnus-summary-show-article'.

-- 
underground experts united
http://user.it.uu.se/~embe8573



* Re: mail back to UTF-8 before save to HD
  2016-02-07  3:21                       ` Emanuel Berg
@ 2016-02-07  3:31                         ` Emanuel Berg
  0 siblings, 0 replies; 17+ messages in thread
From: Emanuel Berg @ 2016-02-07  3:31 UTC (permalink / raw)
  To: info-gnus-english

Emanuel Berg <embe8573@student.uu.se> writes:

>>> How can I then view the mail/post with Gnus (i.e.,
>>> the shortcuts etc. intact) but see the post as if
>>> I would use cat(1) in the shell?
>>
>> How about typing "| cat RET"?
>
> Wonderful!
>
> I wrote this -
>
> (defun gnus-article-cat-message () (interactive)
> (gnus-summary-pipe-message "cat") )
>
> - and bound it to `p', for "pipe"!
>
> To get back, there is already `g' for
> `gnus-summary-show-article'.

... I spoke too soon :(

It doesn't show what cat shows. I don't know what it
shows (?) - the headers (not all?) are shown and the
citations are expanded, but the [+] remains which
aren't in the files. Also the text isn't as in the
file or original message - it is still "washed".

Perhaps I should look over my "washing" and attack the
problem from there...

-- 
underground experts united
http://user.it.uu.se/~embe8573


