* wrong charset in spite of proper format
@ 2005-03-09 13:47 Matthias Andree
2005-03-09 15:24 ` Reiner Steib
0 siblings, 1 reply; 6+ messages in thread
From: Matthias Andree @ 2005-03-09 13:47 UTC (permalink / raw)
Hi,
a Ukrainian subscriber recently posted a mail of the structure sketched
below to the fetchmail-friends mailing list. No matter what part of the
mail I look at (with C-d, article as ephemeral group), No Gnus (fresh
from CVS) uses my local character set, ISO-8859-whatever, rather than
Windows-1251 for display. mutt gets this right.
What's up here? Does Gnus lack Windows-1251? If so, why does it not
replace everything by dots, X or `?'? If it supports Windows-1251, why
doesn't it see it? I don't have the time to play with modified copies of
the mail and Gnus now to figure out what's up. (Emacs 21.3 with rm'd movemail.)
Full copy of the message available from
<http://home.pages.de/~mandree/tmp/58def670bf48a5363b4df09b717495b6@apple.mk.ua.txt>.
Outline:
(head)
Mime-Version: 1.0 (Apple Message framework v619.2)
Content-Type: multipart/alternative; boundary=Apple-Mail-1-1053523650
--Apple-Mail-1-1053523650
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
charset=WINDOWS-1251;
format=flowed
6 =E1=E5=F0 2005, =EE 0:59, Matthias Andree =ED=E0=EF=E8=F1=E0=E2(=EB=E0):=
...
--Apple-Mail-1-1053523650
Content-Transfer-Encoding: quoted-printable
Content-Type: text/enriched;
charset=WINDOWS-1251
...
--Apple-Mail-1-1053523650--
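For illustration (not part of the original mail): the first three encoded
bytes of the sample line, =E1=E5=F0, read very differently depending on
which coding system is applied. The sketch below assumes an Emacs that
already defines `windows-1251' (Emacs 22 or later) and uses ISO-8859-1 as
a stand-in for the local charset, which the mail does not specify. The
name `my-sample-bytes' is hypothetical.

```elisp
;; Raw bytes 0xE1 0xE5 0xF0, i.e. the quoted-printable run =E1=E5=F0,
;; written as a unibyte string with octal escapes.
(defconst my-sample-bytes "\341\345\360"
  "Hypothetical name; holds three bytes from the sample line.")

;; Interpreted as the declared charset: Cyrillic text.
(decode-coding-string my-sample-bytes 'windows-1251) ; => "бер"

;; Interpreted as Latin-1 (assumed local charset): accented Latin letters.
(decode-coding-string my-sample-bytes 'iso-8859-1)   ; => "áåð"
```

The second result is the kind of output one would expect when the declared
charset is ignored in favour of a local Latin charset.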
--
Matthias Andree
* Re: wrong charset in spite of proper format
2005-03-09 13:47 wrong charset in spite of proper format Matthias Andree
@ 2005-03-09 15:24 ` Reiner Steib
2005-03-10 0:35 ` Matthias Andree
0 siblings, 1 reply; 6+ messages in thread
From: Reiner Steib @ 2005-03-09 15:24 UTC (permalink / raw)
On Wed, Mar 09 2005, Matthias Andree wrote:
> a Ukrainian subscriber recently posted a mail of the structure sketched
> below to the fetchmail-friends mailing list. No matter what part of the
> mail I look at (with C-d, article as ephemeral group), No Gnus (fresh
> from CVS) uses my local character set, ISO-8859-whatever, rather than
> Windows-1251 for display. mutt gets this right.
>
> What's up here? Does Gnus lack Windows-1251?
Gnus doesn't provide any charsets, (X)Emacs[1] does.
> If it supports Windows-1251, why doesn't it see it?
In Emacs 21.[1-4] you need (codepage-setup 1251) in `~/.gnus.el' or
`~/.emacs'. Or better make that...
  (unless (coding-system-p 'windows-1251)
    (codepage-setup 1251))
The upcoming Emacs 22 already has preloaded windows-1251 and has
autoloads for other commonly used charsets (iso-8859-*, windows-125*;
see [2]).
> Full copy of the message available from
> <http://home.pages.de/~mandree/tmp/58def670bf48a5363b4df09b717495b6@apple.mk.ua.txt>.
[ Or news://news.gmane.org/gmane.mail.fetchmail.user/7122
<news:58def670bf48a5363b4df09b717495b6@apple.mk.ua> ]
Bye, Reiner.
[1] More on (X)Emacs, Gnus and charsets (in German):
http://theotp1.physik.uni-ulm.de/~ste/comp/emacs/gnus/draft/
[2] http://thread.gmane.org/v9k6orjx0i.fsf@marauder.physik.uni-ulm.de
,----[ emacs/lisp/ChangeLog ]
| 2005-03-04 Reiner Steib <Reiner.Steib@gmx.de>
|
| * international/code-pages.el (windows-1250, windows-125[2-8])
| (iso-8859-10, -13, -16, georgian-ps): Add autoload cookies.
`----
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/
* Re: wrong charset in spite of proper format
2005-03-09 15:24 ` Reiner Steib
@ 2005-03-10 0:35 ` Matthias Andree
2005-03-10 18:53 ` Reiner Steib
0 siblings, 1 reply; 6+ messages in thread
From: Matthias Andree @ 2005-03-10 0:35 UTC (permalink / raw)
Reiner Steib <reinersteib+gmane@imap.cc> writes:
>> What's up here? Does Gnus lack Windows-1251?
>
> Gnus doesn't provide any charsets, (X)Emacs[1] does.
OK. My complaint is that, as a result of Emacs 21.SOME_MINOR_RELEASE not
providing a particular character set, Gnus displays the text in some other
charset rather than falling back to ASCII (where appropriate), masking the
unprintables and adding a status line that reads something like
"windows-1251 not supported by your emacs, displaying ASCII parts".
>> If it supports Windows-1251, why doesn't it see it?
>
> In Emacs 21.[1-4] you need (codepage-setup 1251) in `~/.gnus.el' or
> `~/.emacs'. Or better make that...
>
>   (unless (coding-system-p 'windows-1251)
>     (codepage-setup 1251))
Insufficient. Calling (define-coding-system-alias 'windows-1251 'cp1251)
on top of that works however. This is along the lines Simon Josefsson
suggested one and a half years ago WRT Windows-1252.
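Combined, the two steps might look like the sketch below for `~/.gnus.el'
or `~/.emacs' on Emacs 21. The `fboundp' guard is an addition that is not
in the thread; it is there on the assumption that `codepage-setup' may be
missing in other Emacs versions.

```elisp
;; Sketch for Emacs 21: define cp1251 and make the MIME name an alias for it.
(unless (coding-system-p 'windows-1251)
  ;; Guard added for illustration; skip quietly if `codepage-setup'
  ;; is not defined in this Emacs.
  (when (fboundp 'codepage-setup)
    (codepage-setup 1251)
    ;; `codepage-setup' alone only provides cp1251, so add the alias.
    (define-coding-system-alias 'windows-1251 'cp1251)))
```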
It is a shame that such functionality still isn't enabled in the default
No Gnus after such a long time. :-(
> [1] More on (X)Emacs, Gnus and charsets (in German):
> http://theotp1.physik.uni-ulm.de/~ste/comp/emacs/gnus/draft/
Currently unavailable.
--
Matthias Andree
* Re: wrong charset in spite of proper format
2005-03-10 0:35 ` Matthias Andree
@ 2005-03-10 18:53 ` Reiner Steib
2005-03-10 22:37 ` Miles Bader
0 siblings, 1 reply; 6+ messages in thread
From: Reiner Steib @ 2005-03-10 18:53 UTC (permalink / raw)
[-- Attachment #1: Type: text/plain, Size: 936 bytes --]
On Thu, Mar 10 2005, Matthias Andree wrote:
> OK. My complaint is that, as a result of Emacs 21.SOME_MINOR_RELEASE not
> providing a particular character set, Gnus displays the text in some other
> charset rather than falling back to ASCII (where appropriate), masking the
> unprintables and adding a status line that reads something like
> "windows-1251 not supported by your emacs, displaying ASCII parts".
[...]
> This is along the lines Simon Josefsson suggested one and a half
> years ago WRT Windows-1252.
>
> It is a shame that such functionality still isn't enabled in the default
> No Gnus after such a long time. :-(
Could you try the following patch? [*] It should automatically do the
setup for windows-125[0137] (which are available in Emacs 21). If no
charset (or alias) is found, it will print a message. (Displaying as
ASCII and replacing unknown chars with `?' is not included. I'm
not sure how this could be achieved in Gnus.)
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: rs-mm-util-auto-charset.patch --]
[-- Type: text/x-patch, Size: 3365 bytes --]
--- mm-util.el 21 Feb 2005 12:42:41 +0100 7.26
+++ mm-util.el 10 Mar 2005 19:31:33 +0100
@@ -142,6 +142,34 @@
;; Is this branch ever actually useful?
(car (memq cs (mm-get-coding-system-list))))))
+(defun mm-codepage-setup (number)
+  "Create a coding system cpNUMBER and an alias for windows-NUMBER.
+The coding system is created using `codepage-setup'. The alias
+is added to `mm-charset-synonym-alist'."
+  (interactive
+   (let ((completion-ignore-case t)
+         (candidates (cp-supported-codepages)))
+     (list (completing-read "Setup DOS Codepage: (default 437) " candidates
+                            nil t nil nil "437"))))
+  (let* ((cp (intern (format "cp%s" number)))
+         (alias (intern (format "windows-%s" number))))
+    (unless (mm-coding-system-p cp)
+      (when (codepage-setup number)
+        (unless (mm-coding-system-p alias)
+          (add-to-list 'mm-charset-synonym-alist
+                       (cons alias cp)))))))
+
+(defvar mm-charset-eval-alist
+  '(;; (iso-8859-13 . (require 'code-pages))
+    ;; Emacs 21 offers: 1250 1251 1253 1257
+    (windows-1250 . (mm-codepage-setup 1250))
+    (windows-1251 . (mm-codepage-setup 1251))
+    (windows-1253 . (mm-codepage-setup 1253))
+    (windows-1257 . (mm-codepage-setup 1257)))
+  "An alist of \(charset . form\) pairs.
+If an article is encoded in an unknown CHARSET, FORM is evaluated.
+This allows to load additional libraries providing CHARSETS.")
+
(defvar mm-charset-synonym-alist
`(
;; Not in XEmacs, but it's not a proper MIME charset anyhow.
@@ -175,7 +203,7 @@
'((ks_c_5601-1987 . cp949))
'((ks_c_5601-1987 . euc-kr))))
)
- "A mapping from invalid charset names to the real charset names.")
+ "A mapping from unknown or invalid charset names to the real charset names.")
(defvar mm-binary-coding-system
(cond
@@ -400,6 +428,10 @@
(pop alist))
out)))
+;; FIXME: `gnus-message' must be replaced by `message'. This is just for
+;; testing.
+(autoload 'gnus-message "gnus-util")
+
(defun mm-charset-to-coding-system (charset &optional lbt)
"Return coding-system corresponding to CHARSET.
CHARSET is a symbol naming a MIME charset.
@@ -428,9 +460,26 @@
;;; (eq charset (coding-system-get charset 'mime-charset))
)
charset)
+   ;; Eval expressions from `mm-charset-eval-alist'
+   ((let* ((el (assq charset mm-charset-eval-alist))
+           (cs (car el))
+           (form (cdr el)))
+      (and cs
+           form
+           ;; Avoid errors...
+           (condition-case nil (eval form) (error nil))
+           ;; (message "Failed to eval `%s'" form))
+           (mm-coding-system-p cs)
+           (gnus-message 7 "Added charset `%s' via `mm-charset-eval-alist'" cs)
+           cs)))
    ;; Translate invalid charsets.
    ((let ((cs (cdr (assq charset mm-charset-synonym-alist))))
-      (and cs (mm-coding-system-p cs) cs)))
+      (and cs
+           (mm-coding-system-p cs)
+           (gnus-message 7
+                         "Using synonym `%s' from `mm-charset-synonym-alist' for `%s'"
+                         cs charset)
+           cs)))
;; Last resort: search the coding system list for entries which
;; have the right mime-charset in case the canonical name isn't
;; defined (though it should be).
@@ -442,6 +491,8 @@
(eq charset (or (coding-system-get c :mime-charset)
(coding-system-get c 'mime-charset))))
(setq cs c)))
+ (unless cs
+ (gnus-message 7 "Unknown charset: %s" charset))
cs))))
(eval-and-compile
[-- Attachment #3: Type: text/plain, Size: 1088 bytes --]
>> [1] More on (X)Emacs, Gnus and charsets (in German):
>> http://theotp1.physik.uni-ulm.de/~ste/comp/emacs/gnus/draft/
>
> Currently unavailable.
Up again. (But probably not up to date WRT Emacs 22.)
Bye, Reiner.
[*] I've posted a series of test postings for windows-125* to
<news:gmane.test>:
<news:2005-03-10-gmane-windows-1250@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1251@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1252@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1253@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1254@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1255@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1256@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1257@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1258@marauder.physik.uni-ulm.de>
<news:2005-03-10-gmane-windows-1259@marauder.physik.uni-ulm.de>
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/
* Re: wrong charset in spite of proper format
2005-03-10 18:53 ` Reiner Steib
@ 2005-03-10 22:37 ` Miles Bader
2005-03-11 10:05 ` Reiner Steib
0 siblings, 1 reply; 6+ messages in thread
From: Miles Bader @ 2005-03-10 22:37 UTC (permalink / raw)
Reiner Steib <reinersteib+gmane@imap.cc> writes:
> [*] I've posted a series of test postings for windows-125* to
I notice that my emacs only displays something meaningful for 1251
and 1252; should it be able to handle the others too? Or is it a
font issue or something (though I've got lots of fonts installed;
most of emacs HELLO displays properly)?
Thanks,
-Miles
--
[|nurgle|] ddt- demonic? so quake will have an evil kinda setting? one that
will make every christian in the world foamm at the mouth?
[iddt] nurg, that's the goal
* Re: wrong charset in spite of proper format
2005-03-10 22:37 ` Miles Bader
@ 2005-03-11 10:05 ` Reiner Steib
0 siblings, 0 replies; 6+ messages in thread
From: Reiner Steib @ 2005-03-11 10:05 UTC (permalink / raw)
On Thu, Mar 10 2005, Miles Bader wrote:
> Reiner Steib <reinersteib+gmane@imap.cc> writes:
>> [*] I've posted a series of test postings for windows-125* to
>
> I notice that my emacs only displays something meaningful for 1251
> and 1252; should it be able to handle the others too?
The article "windows-1252" was actually created as windows-1252. Then
I just replaced "windows-1252" by "windows-125x" (x \in {0...9}) in
the outgoing file using sed. Emacs should display _something_ (at
least for A0-FF), although not necessarily the character denoted in
the "Description (only correct for Latin-1)" column. The description
should be correct for windows-1252. E.g. at Hex A3 there is "POUND
SIGN" in windows-1252 but a Cyrillic-J (Ј) in windows-1251.
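The A3 example can be reproduced directly. A small sketch, assuming an
Emacs in which both coding systems are already defined (Emacs 22 or
later):

```elisp
;; Byte 0xA3 written as a unibyte string (octal 243), decoded two ways.
(decode-coding-string "\243" 'windows-1252) ; => "£" (POUND SIGN)
(decode-coding-string "\243" 'windows-1251) ; => "Ј" (CYRILLIC CAPITAL LETTER JE)
```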
As "windows-1259" doesn't exist (AFAIK), this article should display
the "Unknown charset" message (you must have `gnus-verbose' >= 7).
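To see these messages, something like the following in `~/.gnus.el'
should be enough. The direct call at the end is only a sketch for testing
by hand (e.g. from *scratch*), and it assumes the patch from earlier in
this thread is applied:

```elisp
;; Show Gnus messages up to level 7, the level used by the patch.
(setq gnus-verbose 7)

;; Try the lookup by hand.  With no matching coding system this should
;; return nil and, with the patch applied, log "Unknown charset".
(require 'mm-util)
(mm-charset-to-coding-system 'windows-1259)
```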
> Or is it a font issue or something (though I've got lots of fonts
> installed; most of emacs HELLO displays properly)?
If you see hollow squares, it is a font issue, I think. If you see
\200 or similar, the position might be unused in this charset.
Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/