From: Katsumi Yamaoka
Newsgroups: gmane.emacs.gnus.general
Subject: Re: rfc2047 decoding
Date: Thu, 07 Jan 2016 09:45:35 +0900
Organization: Emacsen advocacy group
References: <8737uawenw.fsf@tullinup.koldfront.dk>
To: ding@gnus.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (i686-pc-cygwin)

On Wed, 06 Jan 2016 23:01:39 +0100, Adam Sjøgren wrote:
> What is the correct decoding of this header:

> Subject: =?UTF-8?Q?Hackerangreb=20mod=20it=2Dl?=
>  =?UTF-8?Q?everand=C3=B8r=20bag=20app=20ti?= =?UTF-8?Q?l=20DSB?='
>  =?UTF-8?Q?s=20gr=C3=A6nsekontrol?=

To begin with, the header is encoded incorrectly, in violation of
RFC 2047:

,----
| 5. Use of encoded-words in message headers
| [...]
| (1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822)
| [...]
| Ordinary ASCII text and 'encoded-word's may appear together in the
| same header field.  However, an 'encoded-word' that appears in a
| header field defined as '*text' MUST be separated from any adjacent
| 'encoded-word' or 'text' by 'linear-white-space'.
`----

That is, "=?UTF-8?Q?l=20DSB?=" and "'" are concatenated without SPC.
The original text seems to be "DSB's", not "DSB' s", and the correct
encoding would be to encode the whole word as "=?utf-8?Q?DSB's?=".
Gnus does not do so, though, since a word that contains no non-ASCII
characters does not need to be encoded.

(let ((mm-coding-system-priorities '(utf-8)))
  (rfc2047-encode-string "Subject:\
 Hackerangreb mod it-leverandør bag app til DSB's grænsekontrol"))
"Subject: Hackerangreb mod =?utf-8?Q?it-leverand=C3=B8r?= bag app til DSB's =?utf-8?Q?gr=C3=A6nsekontrol?="

That's excellent, isn't it? :)

> Is it:

>  "Hackerangreb mod it-leverandør bag app til DSB' s grænsekontrol" - or:
>  "Hackerangreb mod it-leverandør bag app til DSB's grænsekontrol" ?

> Gnus decodes it to the first line. So I'm inclined to think that is
> correct.

Probably there is no prescribed way to decode invalidly encoded data,
and Gnus's way might not necessarily be the best.  What Gnus does is
simple: it concatenates the decoded successive encoded words without
SPC[1] and leaves everything else as-is[2].

[1] In reality, rfc2047.el concatenates successive encoded words
    without SPC, and then decodes the result.
[2] This is why SPC appears between "'" and "s".
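
For readers who want to experiment outside Emacs, the decoding rule
described above can be sketched in a few lines of Python.  This is
only an illustration under stated assumptions, not what rfc2047.el
actually does: the names decode_header_value, MALFORMED and WELL_FORMED
are made up for this example, and the sketch decodes each encoded word
separately before joining, whereas footnote [1] says rfc2047.el joins
first and decodes afterwards.

```python
import base64
import re

# One RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")
# Whitespace that separates two adjacent encoded-words is dropped.
GAP_BETWEEN_WORDS = re.compile(r"(\?=)[ \t]+(?==\?)")
# "=XX" hex escape used by the Q encoding.
HEX_ESCAPE = re.compile(rb"=([0-9A-Fa-f]{2})")


def _decode_word(match):
    """Decode a single encoded-word to text (hypothetical helper)."""
    charset, encoding, text = match.groups()
    if encoding.upper() == "B":
        raw = base64.b64decode(text)
    else:
        # Q encoding: "_" stands for a space, "=XX" for one byte.
        data = text.replace("_", " ").encode("ascii")
        raw = HEX_ESCAPE.sub(
            lambda m: bytes.fromhex(m.group(1).decode("ascii")), data)
    return raw.decode(charset, errors="replace")


def decode_header_value(value):
    """Join adjacent encoded-words, decode them, leave other text as-is."""
    unfolded = re.sub(r"\r?\n(?=[ \t])", "", value)   # undo header folding
    joined = GAP_BETWEEN_WORDS.sub(r"\1", unfolded)   # drop SPC between words
    return ENCODED_WORD.sub(_decode_word, joined)


# The malformed header from this thread ("'" touches an encoded-word).
MALFORMED = ("=?UTF-8?Q?Hackerangreb=20mod=20it=2Dl?=\n"
             " =?UTF-8?Q?everand=C3=B8r=20bag=20app=20ti?= =?UTF-8?Q?l=20DSB?='\n"
             " =?UTF-8?Q?s=20gr=C3=A6nsekontrol?=")
# The header as Gnus encodes it in the example above.
WELL_FORMED = ("Hackerangreb mod =?utf-8?Q?it-leverand=C3=B8r?= bag app til"
               " DSB's =?utf-8?Q?gr=C3=A6nsekontrol?=")

if __name__ == "__main__":
    print(decode_header_value(MALFORMED))
    print(decode_header_value(WELL_FORMED))
```

With the malformed header, the SPC after "'" is kept because "'" is
ordinary text rather than an encoded word, so the sketch yields
"DSB' s", the first of the two readings quoted above.  The well-formed
header decodes to "DSB's".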