From: Katsumi Yamaoka
Newsgroups: gmane.emacs.gnus.general,gmane.emacs.devel
Subject: [Unicode-2] `read' always returns multibyte symbol
Date: Tue, 13 Nov 2007 18:41:08 +0900
Organization: Emacsen advocacy group
To: emacs-devel@gnu.org
Cc: ding@gnus.org
User-Agent: Gnus/5.110007 (No Gnus v0.7) Emacs/23.0.60 (gnu/linux)

Hi,

The following Lisp snippet emulates what Gnus does when reading
active data for the local.テスト newsgroup.  The buffer contains
data which have been retrieved from the nntp server.  Note that the
newsgroup name contains non-ASCII characters, which the server has
encoded in utf-8.

--8<---------------cut here---------------start------------->8---
(let ((string (encode-coding-string "local.テスト" 'utf-8)))
  (with-temp-buffer
    (set-buffer-multibyte t)
    (insert (string-to-multibyte string))
    (goto-char (point-min))
    (multibyte-string-p (symbol-name (read (current-buffer))))))
--8<---------------cut here---------------end--------------->8---

While Emacs trunk returns nil for this, Emacs Unicode-2 returns t.
If this is not intentional, I hope `read' can be made to behave just
as it does in Emacs trunk.  Otherwise, is there a way to make `read'
return a unibyte symbol (without slowing down)?
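For readers less familiar with the distinction, here is a minimal
sketch of the two string representations involved, independent of
`read'.  The function name `demo-multibyte-flags' is made up for
illustration only; it is not part of Gnus or Emacs.

```emacs-lisp
;; Hypothetical helper, for illustration only.
(defun demo-multibyte-flags ()
  "Return the multibyte flags of an encoded string and its widened copy."
  (let* ((encoded (encode-coding-string "local.テスト" 'utf-8))
         ;; Same bytes, but stored in the multibyte representation.
         (widened (string-to-multibyte encoded)))
    (list (multibyte-string-p encoded)     ; nil: unibyte, raw utf-8 bytes
          (multibyte-string-p widened))))  ; t: multibyte
```

The first value corresponds to what Gnus expects a group name to be,
the second to what is inserted into the buffer in the snippet above.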

Inside Gnus, non-ASCII group names are all treated as unibyte
strings, i.e. the byte sequences that the server has encoded with
some coding system.  Because of the present behavior of `read' in
Emacs Unicode-2, Gnus doesn't work properly with such newsgroups.
You can find the actual code in gnus-start.el as follows:

--8<---------------cut here---------------start------------->8---
;; Read an active file and place the results in `gnus-active-hashtb'.
(defun gnus-active-to-gnus-format (&optional method hashtb ignore-errors
                                             real-active)
[...]
            ;; group gets set to a symbol interned in the hash table
            ;; (what a hack!!) - jwz
            (setq group (let ((obarray hashtb)) (read cur)))
--8<---------------cut here---------------end--------------->8---

As you can see, it needs to work fast because there might be a lot
of newsgroups.  So, if possible, I would rather not change it to:

--8<---------------cut here---------------start------------->8---
(setq group (intern (mm-string-as-unibyte (symbol-name (read cur))) hashtb))
--8<---------------cut here---------------end--------------->8---

Regards,