From: Eric Abrahamsen
Newsgroups: gmane.emacs.gnus.general
Subject: Re: nntp servers with multibyte group names?
Date: Tue, 27 Nov 2018 17:03:32 -0800
Message-ID: <877egy813v.fsf@ericabrahamsen.net>
References: <87tvk270sn.fsf@ericabrahamsen.net> <87in0i5l6h.fsf@tullinup.koldfront.dk> <878t1eni88.fsf@hope.eyrie.org>
To: ding@gnus.org

Russ Allbery writes:

> Adam Sjøgren writes:
>
>> According to RFC 3977 (Network News Transfer Protocol (NNTP)):
>
>> "o Although this specification allows UTF-8 for newsgroup names, they
>>    SHOULD be restricted to US-ASCII until a successor to RFC 1036
>>    [RFC1036] standardises another approach. 8-bit encodings SHOULD
>>    NOT be used because they are likely to cause interoperability
>>    problems."
>
>> - https://tools.ietf.org/html/rfc3977#section-10
>
> For a bit of background here, non-ASCII newsgroup names mostly work, and
> are even used in some areas, but we saw a few instances of strange
> behavior in some experiments. However, putting raw UTF-8 directly into
> the Newsgroups header breaks compatibility with RFC 5322 (email), which
> prohibits non-ASCII characters in headers.
>
> Email would say that you should MIME-encode those names, but that will
> definitely break all Usenet software, which assumes that Newsgroups are
> byte strings that don't require any further interpretation. (And some of
> the encoding characters are invalid in newsgroup names, I believe.)
>
> We weren't able to find a good reconciliation of that conflict before the
> IETF working group ran out of steam.
>
> So you can probably just use raw UTF-8 directly in newsgroup names with a
> local server, but expect some strangeness with some clients, and you are
> (for whatever it's worth) breaking compatibility with the email standards
> by doing so.

Thanks very much for this, Russ -- this is good background.

Since Gnus is only a client, we're not in a position to decide about the
encoding of NNTP group names (thankfully); we only need to decide how to
accept and handle such names as we receive them from a server.

As Stefan Monnier pointed out in answer to a separate question on
emacs.help, an NNTP connection will likely carry several different text
encodings, so Gnus should still be running the network connection in
binary mode. I'm going to leave the majority of the code as-is, and make
the smallest change to group-name decoding I can.

Thanks again,
Eric
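
For illustration only, here is a rough sketch of the idea above: keep the connection's data as raw bytes and decode just the group-name field at the point where it is parsed. This is not Gnus code (Gnus is Emacs Lisp); it is written in Python, and the function names, the choice of a LIST ACTIVE response line as input, and the Latin-1 fallback are all assumptions made for the example.

```python
from typing import Tuple


def decode_group_name(raw: bytes) -> str:
    """Decode a group name that arrived as raw bytes from an NNTP server."""
    try:
        # RFC 3977 allows UTF-8 in group names, so try that first.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback for this sketch: Latin-1 maps every byte to a
        # code point, so this decode cannot fail.
        return raw.decode("latin-1")


def parse_list_active_line(line: bytes) -> Tuple[str, int, int, str]:
    """Parse one LIST ACTIVE response line: b'group high low status'."""
    # bytes.split() splits on ASCII whitespace only, so multibyte UTF-8
    # sequences inside the group name are left intact.
    name, high, low, status = line.split()
    return (
        decode_group_name(name),
        int(high.decode("ascii")),
        int(low.decode("ascii")),
        status.decode("ascii"),
    )


if __name__ == "__main__":
    # The line stays bytes until the individual fields are decoded.
    print(parse_list_active_line("dk.snak.sjøv 120 1 y\r\n".encode("utf-8")))
```

The point of the sketch is only that the decoding decision is made per field, after the bytes have been read, rather than by setting one encoding on the whole connection.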