From: chr@nybo.no (Christian Nybø)
Subject: Re: check for a particular header before calling a washing function?
Date: 07 Dec 2002 23:59:18 +0100
Message-ID: <87ptsdmo55.fsf@nybo.no>
In-Reply-To: <847kelpkhj.fsf@lucy.cs.uni-dortmund.de>
kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
> chr@nybo.no (Christian Nybø) writes:
>
> > kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
> >
> >> chr@nybo.no (Christian Nybø) writes:
> >>
> >> > If the header "Content-Type: text/plain; charset=utf-8" is present, I
> >> > want to call a washing function that translates from utf-8 to latin-1.
> >> > The function is ready, but where do I hook the test in?
> >>
> >> Why do you want to do that? Maybe there is another way to achieve
> >> what you want.
> >
> > Some pos(t)ers in the no.* hierarchy use UTF-8, even though ISO-8859-1
> > would do the job. Untranslated UTF-8 is hard to read. My setup does
> > not translate UTF-8. I want to teach it to translate it to Latin-1.
>
> I guess you want this for articles you write?
No. I write my articles in ISO-8859-1.
I want articles /encoded/ in UTF-8 to look nice, like for example
http://groups.google.com/groups?selm=lbzptse22mx.fsf%40aqualene.uio.no&oe=UTF-8&output=gplain
> Well, the UTF-8 article you forwarded was auto-converted by Gnus from
> UTF-8 to iso-8859-1, so it seems to work already :-)
I think you misunderstand me.
> Oh, now I get it. You can't see UTF-8 in your Emacs?
Correct. My Emacs is an XEmacs, btw. I see two-character
combinations. I've put together a few functions that convert from
UTF-8 to ISO-8859-1, but they're kinda broken as they assume that all
the characters will fit in ISO-8859-1, in other words, no character
codes above 255. But they'll do, as I so far have only encountered
UTF-8-articles with characters in the range 0 to 255.
(defun utf-8-decode-region (start end)
  "Decode UTF-8 between START and END, assuming all characters fit in Latin-1."
  (interactive "r")
  (let ((work-buffer (generate-new-buffer " *utf-8-work*")))
    (unwind-protect
        (save-excursion
          (buffer-disable-undo work-buffer)
          (goto-char start)
          ;; Stop at END rather than at end of buffer, so that text after
          ;; the region is not decoded and then duplicated.
          (while (< (point) end)
            (cond
             ((zerop (logand (following-char) #x80)) ; high bit is not set
              (insert-char (following-char) 1 t work-buffer))
             ;; 110xxxxx 10xxxxxx: a two-byte sequence
             ((= (logand #xE0 (following-char)) #xC0)
              (insert-char (logior (lsh (logand (following-char) #b00011111) 6)
                                   (progn (forward-char)
                                          (logand (following-char) #b00111111)))
                           1 t work-buffer)))
            ;; Bytes of longer sequences match neither clause and are dropped.
            (forward-char))
          (or (markerp end) (setq end (set-marker (make-marker) end)))
          (goto-char start)
          (insert-buffer-substring work-buffer)
          (delete-region (point) end))
      (and work-buffer (kill-buffer work-buffer)))))
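The arithmetic in the two-byte branch above can be tried on its own. This is only a sketch: `my-utf-8-decode-pair' is a hypothetical helper name, not part of the code in this message or of any library, and it uses `ash' where the function above uses `lsh' (the two agree for non-negative values). It also shows why code points above 255 are a problem for a Latin-1 target.

```elisp
;; Sketch only: `my-utf-8-decode-pair' is a hypothetical helper.
(defun my-utf-8-decode-pair (lead trail)
  "Return the code point encoded by the two UTF-8 bytes LEAD and TRAIL.
LEAD is assumed to look like 110xxxxx and TRAIL like 10xxxxxx."
  (logior (ash (logand lead #b00011111) 6)   ; five payload bits from LEAD
          (logand trail #b00111111)))        ; six payload bits from TRAIL

;; The letter o-with-stroke (U+00F8) is sent as the bytes #xC3 #xB8.
(my-utf-8-decode-pair #xC3 #xB8)   ; => 248, i.e. #xF8, which fits in Latin-1

;; The next lead byte already leaves the Latin-1 range.
(my-utf-8-decode-pair #xC4 #x80)   ; => 256, which does not fit
```

So lead bytes #xC2 and #xC3 are the only ones whose results stay at or below 255; anything else would need a different target charset.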
(defun gnus-article-de-utf-8 ()
  "Convert utf-8 to latin-1."
  (interactive)
  (save-excursion
    (set-buffer gnus-article-buffer)
    (let ((buffer-read-only nil))
      (widen)
      (goto-char (point-min))
      (search-forward "\n\n" nil t)
      (utf-8-decode-region (point) (point-max)))))
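One possible way to make the call conditional on the Content-Type header is sketched below. The names `my-article-utf-8-p' and `my-maybe-de-utf-8' are hypothetical. The sketch assumes that the unwashed headers are still available in the buffer named by `gnus-original-article-buffer' and that `gnus-article-prepare-hook' runs once the article buffer is ready; both assumptions should be checked against the Gnus version in use.

```elisp
(require 'mail-utils)                   ; provides `mail-fetch-field'

;; Hypothetical helper: look only at the header part of the current buffer.
(defun my-article-utf-8-p ()
  "Return non-nil if the current buffer's headers declare charset=utf-8."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      ;; The headers end at the first blank line.
      (narrow-to-region (point-min)
                        (if (search-forward "\n\n" nil t)
                            (1- (point))
                          (point-max)))
      (let ((ctype (mail-fetch-field "Content-Type")))
        (and ctype
             (string-match "charset=\"?utf-8" (downcase ctype))
             t)))))

;; Hypothetical glue: assumes the raw article lives in
;; `gnus-original-article-buffer' when the hook runs.
(defun my-maybe-de-utf-8 ()
  "Call `gnus-article-de-utf-8' only for articles labelled as UTF-8."
  (when (with-current-buffer gnus-original-article-buffer
          (my-article-utf-8-p))
    (gnus-article-de-utf-8)))

(add-hook 'gnus-article-prepare-hook 'my-maybe-de-utf-8)
```

Testing the header in a separate predicate keeps the washing function itself usable by hand with M-x, whatever the headers say.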
> But this should work. Type C-h h to view the HELLO file. Does it
> say something about UTF-8 near the bottom? What do you see?
>
It starts like this:
You need many fonts to read all.
Please correct this incomplete list and add more!
---------------------------------------------------------
Amharic (^[$(3"c!<!N"^^[(B) ^[$(3!A!,!>^[(B
Arabic ^[[2]^[(38R^[(47d^[(3T!JSa^[(4W^[(3W^[[0]^[(B
Croatian (Hrvatski) Bog (Bok), Dobar dan
Czech (^[.B^[Nhesky) Dobr^[N} den
and does not mention anything about UTF-8. File is
"/usr/local/src/xemacs-21.2.36/etc/HELLO"
> Oh, in Emacs 21.2, the HELLO file does not contain UTF-8. Hm.
>
> Search the web for the file UTF-8-demo.txt, download it, then use C-x
> RET c utf-8 RET C-x C-f to open the file in Emacs. What do you see?
C-x RET is undefined. I run XEmacs, and it's probably
compiled without support for setting a coding system.
Opening it as a plain file:
UTF-8 encoded sample plain-text file
â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾â¾
Markus Kuhn [ËmaʳkÊs kuËn] <mkuhn@acm.org> â 2002-07-25
--
chr