From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.user/1585
Path: news.gmane.org!not-for-mail
From: chr@nybo.no (Christian Nybø)
Newsgroups: gmane.emacs.gnus.user
Subject: Re: check for a particular header before calling a washing function?
Date: 07 Dec 2002 23:59:18 +0100
Organization: (nil)
Message-ID: <87ptsdmo55.fsf@nybo.no>
References: <877kemocp2.fsf@nybo.no>
	<84n0nirlfi.fsf@lucy.cs.uni-dortmund.de>
	<87vg25mz2u.fsf@nybo.no>
	<847kelpkhj.fsf@lucy.cs.uni-dortmund.de>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: sea.gmane.org 1138668313 11918 80.91.229.2 (31 Jan 2006 00:45:13 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Tue, 31 Jan 2006 00:45:13 +0000 (UTC)
Original-X-From: nobody Tue Jan 17 17:29:23 2006
Original-Path: quimby.gnus.org!newsfeed1.e.nsc.no!nsc.no!nextra.com!news4.e.nsc.no.POSTED!53ab2750!not-for-mail
Original-Sender: chr@lapchr
Original-Newsgroups: gnu.emacs.gnus
User-Agent: Gnus/5.070099 (Pterodactyl Gnus v0.99) XEmacs/21.4 (Common Lisp)
Original-NNTP-Posting-Host: 80.212.74.97
Original-X-Complaints-To: news-abuse@telenor.net
Original-NNTP-Posting-Date: Sun, 08 Dec 2002 01:00:31 MET
Original-X-Trace: news4.ulv.nextra.no 1039305632 80.212.74.97
Original-Xref: bridgekeeper.physik.uni-ulm.de gnus-emacs-gnus:1725
Original-Lines: 113
X-Gnus-Article-Number: 1725 Tue Jan 17 17:29:23 2006
Xref: news.gmane.org gmane.emacs.gnus.user:1585
Archived-At:

kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:

> chr@nybo.no (Christian Nybø) writes:
>
> > kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
> >
> >> chr@nybo.no (Christian Nybø) writes:
> >>
> >> > If the header "Content-Type: text/plain; charset=utf-8" is present, I
> >> > want to call a washing function that translates from utf-8 to latin-1.
> >> > The function is ready, but where do I hook the test in?
> >>
> >> Why do you want to do that?
> >> Maybe there is another way to achieve
> >> what you want.
> >
> > Some pos(t)ers in the no.* hierarchy use UTF-8, even though ISO-8859-1
> > would do the job.  Untranslated UTF-8 is hard to read.  My setup does
> > not translate UTF-8.  I want to teach it to translate it to Latin-1.
>
> I guess you want this for articles you write?

No.  I write my articles in ISO-8859-1.  I want articles /encoded/
in UTF-8 to look nice, like for example
http://groups.google.com/groups?selm=lbzptse22mx.fsf%40aqualene.uio.no&oe=UTF-8&output=gplain

> Well, the UTF-8 article you forwarded was auto-converted by Gnus from
> UTF-8 to iso-8859-1, so it seems to work already :-)

I think you misunderstand me.

> Oh, now I get it.  You can't see UTF-8 in your Emacs?

Correct.  My Emacs is an XEmacs, btw.  I see two-character
combinations.  I've put together a few functions that convert from
UTF-8 to ISO-8859-1, but they're kinda broken as they assume that all
the characters will fit in ISO-8859-1, in other words, no character
codes above 255.  But they'll do, as so far I have only encountered
UTF-8 articles with characters in the range 0 to 255.

(defun utf-8-decode-region (start end)
  (interactive "r")
  (let ((work-buffer (generate-new-buffer " *utf-8-work*")))
    (unwind-protect
        (save-excursion
          (buffer-disable-undo work-buffer)
          (progn
            (goto-char start)
            (while (not (eobp))
              (cond
               ((zerop (logand (following-char) #x80)) ; high bit is not set
                (insert-char (following-char) 1 t work-buffer))
               ((= (logand #xE0 (following-char)) #xC0)
                (insert-char
                 (logior
                  (lsh (logand (following-char) #b00011111) 6)
                  (progn (forward-char)
                         (logand (following-char) #b00111111)))
                 1 t work-buffer)))
              (forward-char))
            (or (markerp end) (setq end (set-marker (make-marker) end)))
            (goto-char start)
            (insert-buffer-substring work-buffer)
            (delete-region (point) end)))
      (and work-buffer (kill-buffer work-buffer)))))

(defun gnus-article-de-utf-8 ()
  "Convert utf-8 to latin-1"
  (interactive)
  (save-excursion
    (set-buffer gnus-article-buffer)
    (let ((buffer-read-only nil))
      (widen)
      (goto-char (point-min))
      (search-forward "\n\n" nil t)
      (utf-8-decode-region (point) (point-max)))))

> But this should work.  Type C-h h to view the HELLO file.  Does it
> say something about UTF-8 near the bottom?  What do you see?
>

It starts like this:

You need many fonts to read all.
Please correct this incomplete list and add more!
---------------------------------------------------------
Amharic ($(3"c!(B
Arabic [2](38R(47d(3T!JSa(4W(3W[0](B
Croatian (Hrvatski) Bog (Bok), Dobar dan
Czech (.BNhesky) DobrN} den

and does not mention anything about UTF-8.  File is
"/usr/local/src/xemacs-21.2.36/etc/HELLO"

> Oh, in Emacs 21.2, the HELLO file does not contain UTF-8.  Hm.
>
> Search the web for the file UTF-8-demo.txt, download it, then use C-x
> RET c utf-8 RET C-x C-f to open the file in Emacs.  What do you see?

C-x RET is undefined.  I run XEmacs, and it's probably compiled
without support for setting a coding system.  Opening it as a plain
file:

UTF-8 encoded sample plain-text file
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾

Markus Kuhn [ˈmaʳkʊs kuːn] — 2002-07-25

-- 
chr