From: Florian Weimer
To: Eli Zaretskii
Cc: Lars Ingebrigtsen, Eric Abrahamsen, emacs-devel@gnu.org, ding@gnus.org
Subject: Re: master ef14acf: Make nnml handle invalid non-ASCII headers more consistently
References: <20210122180801.14756.84264@vcs0.savannah.gnu.org>
 <20210122180802.F0A1E20A10@vcs0.savannah.gnu.org>
 <874jtvq8c2.fsf@oldenburg.str.redhat.com>
 <83k02qiicb.fsf@gnu.org>
Date: Sat, 17 Dec 2022 15:57:18 +0100
Message-ID:
 <87bko26ptd.fsf@oldenburg.str.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

* Eli Zaretskii:

>> From: Florian Weimer
>> Cc: Lars Ingebrigtsen, ding@gnus.org
>> Date: Fri, 16 Dec 2022 23:42:21 +0100
>>
>> * Lars Ingebrigtsen:
>>
>> > branch: master
>> > commit ef14acfb68bb5b0ce42221e9681b93562f8085eb
>> > Author: Lars Ingebrigtsen
>> > Commit: Lars Ingebrigtsen
>> >
>> >     Make nnml handle invalid non-ASCII headers more consistently
>> >
>> >     * lisp/gnus/nnml.el (nnml--encode-headers): New function to
>> >     RFC2047-encode invalid Subject/From headers (bug#45925). This
>> >     will make them be displayed more consistently in the Summary
>> >     buffer (but still "wrong" sometimes, since there's not that much
>> >     we can guess at at this stage, charset wise).
>> >     (nnml-parse-head): Use it.
>> > ---
>> >  lisp/gnus/nnml.el | 16 ++++++++++++++++
>> >  1 file changed, 16 insertions(+)
>> >
>> > diff --git a/lisp/gnus/nnml.el b/lisp/gnus/nnml.el
>> > index ebececa..3cdfc74 100644
>> > --- a/lisp/gnus/nnml.el
>> > +++ b/lisp/gnus/nnml.el
>> > @@ -769,8 +769,24 @@ article number. This function is called narrowed to an article."
>> >        (let ((headers (nnheader-parse-head t)))
>> >          (setf (mail-header-chars headers) chars)
>> >          (setf (mail-header-number headers) number)
>> > +        ;; If there's non-ASCII raw characters in the data,
>> > +        ;; RFC2047-encode them to avoid having arbitrary data in the
>> > +        ;; .overview file.
>> > +        (nnml--encode-headers headers)
>> >          headers))))
>>
>> Unfortunately, this change in particular causes Gnus to stop storing
>> messages into nnmail after receiving a message with this header:
>>
>> From: =?utf-8?b?572X5YuH5YiaKFlvbmdnYW5nIEx1bykgdmlhIEVsZnV0aWxzLWRldmVs?=
>>
>> The logged error message is:
>>
>> Mail source (maildir :path …) failed: (error Invalid data for rfc2047 encoding: 罗勇刚(Yonggang Luo) via Elfutils-devel)
>>
>> On an older Emacs without this change, it seems that the original header
>> is written to the .overview file, which sidesteps the problem that not
>> all strings are encodable by the rfc2047 functions.
>
> Thanks. I guess this From header is invalid because there's no space
> between the "罗勇刚" and the "(Yonggang Luo)" parts?

Yes, that seems to be what's tripping the encoder. But I'm not sure
whether a properly encoded ( or ) (as =28 or =29 using the Q encoding,
or using the B encoding as in the raw text) is actually invalid.
RFC 2047 only talks about unencoded ( or ). In contrast, encoded ( and )
are valid syntax at the RFC 822 layer because encoding hides them.

> Does the naïve patch below solve the problem?
>
> diff --git a/lisp/gnus/nnml.el b/lisp/gnus/nnml.el
> index 40e4b9e..7aa445e 100644
> --- a/lisp/gnus/nnml.el
> +++ b/lisp/gnus/nnml.el
> @@ -776,17 +776,22 @@ nnml-parse-head
>          (nnml--encode-headers headers)
>          headers))))
>
> +;; RFC2047-encode Subject and From, but leave invalid headers unencoded.
>  (defun nnml--encode-headers (headers)
>    (let ((subject (mail-header-subject headers))
>          (rfc2047-encoding-type 'mime))
>      (unless (string-match "\\`[[:ascii:]]*\\'" subject)
> -      (setf (mail-header-subject headers)
> -            (mail-encode-encoded-word-string subject t))))
> +      (let ((encoded-subject
> +             (ignore-errors (mail-encode-encoded-word-string subject t))))
> +        (if encoded-subject
> +            (setf (mail-header-subject headers) encoded-subject)))))
>    (let ((from (mail-header-from headers))
>          (rfc2047-encoding-type 'address-mime))
>      (unless (string-match "\\`[[:ascii:]]*\\'" from)
> -      (setf (mail-header-from headers)
> -            (rfc2047-encode-string from t)))))
> +      (let ((encoded-from
> +             (ignore-errors (rfc2047-encode-string from t))))
> +        (if encoded-from
> +            (setf (mail-header-from headers) encoded-from))))))
>
>  (defun nnml-get-nov-buffer (group &optional incrementalp)
>    (let ((buffer (gnus-get-buffer-create

Thanks! I somehow can't reproduce the original issue. I expect more
problematic messages to arrive next week, though, and will report then
how it goes.

Florian
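
The remark about encoded parentheses can be illustrated with a small sketch. It is written in Python rather than Emacs Lisp and assumes only the standard-library `email.header.decode_header`; the helper name `decode_encoded_words` is made up for illustration and is not part of Gnus. It decodes the From header quoted in this thread and checks that the encoded word itself carries no raw ( or ), which only appear once the word is decoded.

```python
from email.header import decode_header

# From header value quoted in the thread, with the quoted-printable
# soft line break removed so the encoded word is in one piece.
raw_from = "=?utf-8?b?572X5YuH5YiaKFlvbmdnYW5nIEx1bykgdmlhIEVsZnV0aWxzLWRldmVs?="


def decode_encoded_words(value: str) -> str:
    """Decode any RFC 2047 encoded words in a header value (illustrative helper)."""
    parts = []
    for payload, charset in decode_header(value):
        if isinstance(payload, bytes):
            # Encoded words come back as bytes plus their declared charset.
            parts.append(payload.decode(charset or "ascii"))
        else:
            # Values without encoded words come back unchanged as str.
            parts.append(payload)
    return "".join(parts)


decoded_from = decode_encoded_words(raw_from)

# The parentheses are invisible at the RFC 822 layer while encoded ...
has_raw_parens = "(" in raw_from or ")" in raw_from
# ... and only show up in the decoded display name.
has_decoded_parens = "(" in decoded_from and ")" in decoded_from

print(decoded_from)
print(has_raw_parens, has_decoded_parens)
```

This only shows what a decoder sees. Whether the rfc2047 functions in Emacs should accept such a string when re-encoding is the open question discussed above.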