From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/65432
Path: news.gmane.org!not-for-mail
From: Katsumi Yamaoka
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Unknown charset: gbk
Date: Mon, 22 Oct 2007 16:11:17 +0900
Organization: Emacsen advocacy group
Message-ID:
References: <87tzomu61s.fsf@jidanni.org>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-Trace: ger.gmane.org 1193037192 8144 80.91.229.12 (22 Oct 2007 07:13:12 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Mon, 22 Oct 2007 07:13:12 +0000 (UTC)
To: ding@gnus.org
Original-X-From: ding-owner+M13933@lists.math.uh.edu Mon Oct 22 09:13:13 2007
Return-path:
Envelope-to: ding-account@gmane.org
Original-Received: from util0.math.uh.edu ([129.7.128.18])
 by lo.gmane.org with esmtp (Exim 4.50) id 1IjrTH-0000xA-CY
 for ding-account@gmane.org; Mon, 22 Oct 2007 09:13:07 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.math.uh.edu)
 by util0.math.uh.edu with smtp (Exim 4.63) (envelope-from )
 id 1IjrRq-00082X-SA; Mon, 22 Oct 2007 02:11:38 -0500
Original-Received: from mx2.math.uh.edu ([129.7.128.33])
 by util0.math.uh.edu with esmtps (TLSv1:AES256-SHA:256) (Exim 4.63)
 (envelope-from ) id 1IjrRp-00082M-31
 for ding@lists.math.uh.edu; Mon, 22 Oct 2007 02:11:37 -0500
Original-Received: from quimby.gnus.org ([80.91.231.51])
 by mx2.math.uh.edu with esmtp (Exim 4.67) (envelope-from )
 id 1IjrRj-00049m-0a
 for ding@lists.math.uh.edu; Mon, 22 Oct 2007 02:11:36 -0500
Original-Received: from orlando.hostforweb.net ([216.246.45.90])
 by quimby.gnus.org with esmtp (Exim 3.35 #1 (Debian))
 id 1IjrRb-0000yY-00 for ; Mon, 22 Oct 2007 09:11:24 +0200
Original-Received: from [66.225.201.151] (port=40087 helo=mail.jpl.org)
 by orlando.hostforweb.net with esmtpa (Exim 4.68) (envelope-from )
 id 1IjrRZ-0001Rl-3J for ding@gnus.org; Mon, 22 Oct 2007 02:11:21 -0500
X-Hashcash:
1:20:071022:ding@gnus.org::xF2lIElC2zOwN/Mj:000021HA
X-Face: #kKnN,xUnmKia.'[pp`;Omh}odZK)?7wQSl"4o04=EixTF+V[""w~iNbM9ZL+.b*_CxUmFk
 B#Fu[*?MZZH@IkN:!"\w%I_zt>[$nm7nQosZ<3eu;B:$Q_:p!',P.c0-_Cy[dz4oIpw0ESA^D*1Lw=
 L&i*6&(
User-Agent: Gnus/5.110007 (No Gnus v0.7) Emacs/23.0.60 (gnu/linux)
Cancel-Lock: sha1:MhNg2eNRSrwt/EVT6JCp76XWOXI=
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - orlando.hostforweb.net
X-AntiAbuse: Original Domain - gnus.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - jpl.org
X-Source:
X-Source-Args:
X-Source-Dir:
X-Spam-Score: -2.4 (--)
List-ID:
Precedence: bulk
Xref: news.gmane.org gmane.emacs.gnus.general:65432
Archived-At:

--=-=-=

>>>>> jidanni@jidanni.org wrote:

> OK, here's another GBK message, wrapped super safely with shar(1).

At least for this message, making `gbk' an alias for `cp936' in
Gnus is a bad idea.  Reiner?

I could reproduce \NNN using both Emacs 22.1 and the current Emacs
trunk.  The cp936 coding system in those versions of Emacsen seems
to be incomplete for gbk text.  OTOH, Unicode 2 (i.e. Emacs 23.0.60)
and the iconv command (both of which support gbk) look good.
Therefore, I tried creating a gbk coding system for Mule version 5
(i.e. Emacs 21.1-23.0.50) that uses iconv.  I use:

$ iconv --version
iconv (GNU libc) 2.6

and Mule-UCS for Emacs 21.x.  If you try this module, you have to
load (or require) it before loading Gnus.

--=-=-=
Content-Type: application/emacs-lisp
Content-Disposition: attachment; filename=mule5-gbk.el

;;; mule5-gbk.el --- gbk coding system for Mule version 5

;; Author: Katsumi Yamaoka

;; Refuse to load on XEmacs and on any Emacs whose Mule version is not 5.
(eval-and-compile
  (if (or (featurep 'xemacs)
          (not (boundp 'mule-version))
          (not (string-match "\\`5\\." mule-version)))
      (error "mule5-gbk.el doesn't support this version of Emacs")))

(defun mule5-gbk-post-read-conversion (length)
  "Decode gbk."
  (save-excursion
    (save-restriction
      (narrow-to-region (point) (+ (point) length))
      (let ((data (buffer-string))
            (coding-system-for-read 'binary)
            (coding-system-for-write 'binary))
        (delete-region (point-min) (point-max))
        (set-buffer-multibyte t)
        ;; Pipe the raw bytes through iconv (gbk -> utf-8), then decode
        ;; the result as utf-8.
        (insert (with-temp-buffer
                  (set-buffer-multibyte nil)
                  (insert data)
                  (call-process-region (point-min) (point-max)
                                       "iconv" t t nil
                                       "-f" "gbk" "-t" "utf-8")
                  (decode-coding-string (buffer-string) 'utf-8)))
        ;; Return the length of the decoded text.
        (- (point-max) (point-min))))))

(defun mule5-gbk-pre-write-conversion (beg end)
  "Encode gbk."
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (let ((data (buffer-string))
            (coding-system-for-read 'binary)
            (coding-system-for-write 'binary))
        (delete-region beg end)
        (set-buffer-multibyte t)
        ;; Encode the text as utf-8, then pipe it through iconv
        ;; (utf-8 -> gbk).
        (insert (with-temp-buffer
                  (set-buffer-multibyte nil)
                  (insert (encode-coding-string data 'utf-8))
                  (call-process-region (point-min) (point-max)
                                       "iconv" t t nil
                                       "-f" "utf-8" "-t" "gbk")
                  (string-to-multibyte (buffer-string)))))))
  nil)

(make-coding-system
 'chinese-gbk 5 ?c
 "GBK encoding for Chinese (MIME:GBK), produced by mule-gbk.el."
 nil
 '((post-read-conversion . mule5-gbk-post-read-conversion)
   (pre-write-conversion . mule5-gbk-pre-write-conversion)
   (safe-charsets ascii
                  chinese-cns11643-5 chinese-cns11643-6
                  chinese-cns11643-7 chinese-gb2312)
   (mime-charset . gbk)))

(define-coding-system-alias 'gbk 'chinese-gbk)
(define-coding-system-alias 'cp936 'chinese-gbk)
(define-coding-system-alias 'windows-936 'chinese-gbk)

(provide 'mule5-gbk)

;;; mule5-gbk.el ends here

--=-=-=--
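
For reference, the load order mentioned above (load or require the module before loading Gnus) might look like the following in an init file. This is only a minimal sketch: the "~/elisp" directory is a hypothetical location for mule5-gbk.el, and the version guard simply mirrors the check the module itself performs so that other Emacsen skip it instead of signalling an error.

```elisp
;; Hypothetical init-file fragment (e.g. ~/.emacs).  The directory below
;; is an assumption; point it at wherever mule5-gbk.el is saved.
(add-to-list 'load-path "~/elisp")

;; Load the gbk coding system only on Mule version 5, mirroring the
;; guard inside mule5-gbk.el, which signals an error elsewhere.
(when (and (not (featurep 'xemacs))
           (boundp 'mule-version)
           (string-match "\\`5\\." mule-version))
  (require 'mule5-gbk))

;; Gnus is loaded afterwards, so the `gbk' coding system is already
;; defined when Gnus starts.
(require 'gnus)

;; Optional sanity check: returns non-nil if `gbk' is a known coding
;; system at this point.
(coding-system-p 'gbk)
```

Since both conversion functions call the external iconv program, it has to be found on the PATH of the Emacs process for this to work.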