* wrong charset in spite of proper format
From: Matthias Andree @ 2005-03-09 13:47 UTC

Hi,

a Ukraïnian subscriber recently posted a mail of the structure sketched
below to the fetchmail-friends mailing list. No matter what part of the
mail I look at (with C-d, article as ephemeral group), No Gnus (fresh
from CVS) uses my local character set, ISO-8859-whatever, rather than
Windows-1251 for display. mutt gets this right.

What's up here? Does Gnus lack Windows-1251? If so, why does it not
replace everything by dots, X or ?.

If it supports Windows-1251, why doesn't it see it?

I don't have the time to play with modified copies of the mail and Gnus
now to figure what's up. (Emacs 21.3 with rm'd movemail)

Full copy of the message available from
<http://home.pages.de/~mandree/tmp/58def670bf48a5363b4df09b717495b6@apple.mk.ua.txt>.

Outline:

(head)
Mime-Version: 1.0 (Apple Message framework v619.2)
Content-Type: multipart/alternative; boundary=Apple-Mail-1-1053523650

--Apple-Mail-1-1053523650
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=WINDOWS-1251; format=flowed

6 =E1=E5=F0 2005, =EE 0:59, Matthias Andree =ED=E0=EF=E8=F1=E0=E2(=EB=E0):=
...
--Apple-Mail-1-1053523650
Content-Transfer-Encoding: quoted-printable
Content-Type: text/enriched; charset=WINDOWS-1251

...
--Apple-Mail-1-1053523650--

-- 
Matthias Andree

* Re: wrong charset in spite of proper format
From: Reiner Steib @ 2005-03-09 15:24 UTC

On Wed, Mar 09 2005, Matthias Andree wrote:

> a Ukraïnian subscriber recently posted a mail of the structure sketched
> below to the fetchmail-friends mailing list. No matter what part of the
> mail I look at (with C-d, article as ephemeral group), No Gnus (fresh
> from CVS) uses my local character set, ISO-8859-whatever, rather than
> Windows-1251 for display. mutt gets this right.
>
> What's up here? Does Gnus lack Windows-1251?

Gnus doesn't provide any charsets, (X)Emacs[1] does.

> If it supports Windows-1251, why doesn't it see it?

In Emacs 21.[1-4] you need (codepage-setup 1251) in `~/.gnus.el' or
`~/.emacs'.  Or better make that...

(unless (coding-system-p 'windows-1251)
  (codepage-setup 1251))

The upcoming Emacs 22 already has preloaded windows-1251 and has
autoloads for other commonly used charsets (iso-8859-*, windows-125*;
see [2]).

> Full copy of the message available from
> <http://home.pages.de/~mandree/tmp/58def670bf48a5363b4df09b717495b6@apple.mk.ua.txt>.

[ Or news://news.gmane.org/gmane.mail.fetchmail.user/7122
  <news:58def670bf48a5363b4df09b717495b6@apple.mk.ua> ]

Bye, Reiner.

[1] More on (X)Emacs, Gnus and charsets (in German):
    http://theotp1.physik.uni-ulm.de/~ste/comp/emacs/gnus/draft/

[2] http://thread.gmane.org/v9k6orjx0i.fsf@marauder.physik.uni-ulm.de

,----[ emacs/lisp/ChangeLog ]
| 2005-03-04  Reiner Steib  <Reiner.Steib@gmx.de>
|
|     * international/code-pages.el (windows-1250, windows-125[2-8])
|     (iso-8859-10, -13, -16, georgian-ps): Add autoload cookies.
`----
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/

* Re: wrong charset in spite of proper format
From: Matthias Andree @ 2005-03-10 0:35 UTC

Reiner Steib <reinersteib+gmane@imap.cc> writes:

>> What's up here? Does Gnus lack Windows-1251?
>
> Gnus doesn't provide any charsets, (X)Emacs[1] does.

OK. My complaint is that as a result of Emacs 21.SOME_MINOR_RELEASE not
providing a particular character set, Gnus displays anything else rather
than falling back to ASCII (where appropriate), masking the unprintables
and stuffing a status line that reads something like "windows-1251 not
supported by your emacs, displaying ASCII parts"

>> If it supports Windows-1251, why doesn't it see it?
>
> In Emacs 21.[1-4] you need (codepage-setup 1251) in `~/.gnus.el' or
> `~/.emacs'.  Or better make that...
>
> (unless (coding-system-p 'windows-1251)
>   (codepage-setup 1251))

Insufficient. Calling

(define-coding-system-alias 'windows-1251 'cp1251)

on top of that works however.

This is along the lines Simon Josefsson suggested one and a half
years ago WRT Windows-1252.

It is a shame that such functionality still isn't enabled in the default
No Gnus after such a long time. :-(

> [1] More on (X)Emacs, Gnus and charsets (in German):
>     http://theotp1.physik.uni-ulm.de/~ste/comp/emacs/gnus/draft/

Currently unavailable.

-- 
Matthias Andree

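For readers who want both pieces in one place: the two snippets quoted in this exchange can be combined into a single init-file fragment, roughly as sketched here. This is an illustration for Emacs 21.x only, not something posted in the thread. It assumes, as the messages describe, that `codepage-setup' creates a coding system named `cp1251'. The `fboundp' guard is an extra precaution added here for illustration.

```elisp
;; Sketch for ~/.gnus.el or ~/.emacs on Emacs 21.x.
;; Emacs 22 and later already provide windows-1251, so the whole
;; form is skipped there.
(unless (coding-system-p 'windows-1251)
  ;; Step 1 (from Reiner's reply): create the cp1251 coding system.
  ;; The `fboundp' check is an added precaution, not from the thread.
  (when (fboundp 'codepage-setup)
    (codepage-setup 1251))
  ;; Step 2 (from Matthias's follow-up): make the MIME charset name
  ;; an alias for cp1251, but only if step 1 actually defined it.
  (when (coding-system-p 'cp1251)
    (define-coding-system-alias 'windows-1251 'cp1251)))
```

Guarding the alias on `(coding-system-p 'cp1251)' keeps the fragment from signalling an error in an Emacs where the codepage could not be set up.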
* Re: wrong charset in spite of proper format
From: Reiner Steib @ 2005-03-10 18:53 UTC

[-- Attachment #1: Type: text/plain, Size: 936 bytes --]

On Thu, Mar 10 2005, Matthias Andree wrote:

> OK. My complaint is that as a result of Emacs 21.SOME_MINOR_RELEASE not
> providing a particular character set, Gnus displays anything else rather
> than falling back to ASCII (where appropriate), masking the unprintables
> and stuffing a status line that reads something like "windows-1251 not
> supported by your emacs, displaying ASCII parts"
[...]
> This is along the lines Simon Josefsson suggested one and a half
> years ago WRT Windows-1252.
>
> It is a shame that such functionality still isn't enabled in the default
> No Gnus after such a long time. :-(

Could you try the following patch? [*]

It should automatically do the setup for windows-125[0137] (which are
available in Emacs 21).  If no charset (or alias) is found, it will
print a message.  (Displaying as ASCII and replacing unknown chars
with `?' is not included.  I'm not sure how this could be achieved in
Gnus.)

[-- Attachment #2: rs-mm-util-auto-charset.patch --]
[-- Type: text/x-patch, Size: 3365 bytes --]

--- mm-util.el	21 Feb 2005 12:42:41 +0100	7.26
+++ mm-util.el	10 Mar 2005 19:31:33 +0100
@@ -142,6 +142,34 @@
 	  ;; Is this branch ever actually useful?
 	  (car (memq cs (mm-get-coding-system-list))))))
 
+(defun mm-codepage-setup (number)
+  "Create a coding system cpNUMBER and an alias for windows-NUMBER.
+The coding system is created using `codepage-setup'.  The alias
+is added to `mm-charset-synonym-alist'."
+  (interactive
+   (let ((completion-ignore-case t)
+         (candidates (cp-supported-codepages)))
+     (list (completing-read "Setup DOS Codepage: (default 437) " candidates
+                            nil t nil nil "437"))))
+  (let* ((cp (intern (format "cp%s" number)))
+         (alias (intern (format "windows-%s" number))))
+    (unless (mm-coding-system-p cp)
+      (when (codepage-setup number)
+        (unless (mm-coding-system-p alias)
+          (add-to-list 'mm-charset-synonym-alist
+                       (cons alias cp)))))))
+
+(defvar mm-charset-eval-alist
+  '(;; (iso-8859-13 . (require 'code-pages))
+    ;; Emacs 21 offers: 1250 1251 1253 1257
+    (windows-1250 . (mm-codepage-setup 1250))
+    (windows-1251 . (mm-codepage-setup 1251))
+    (windows-1253 . (mm-codepage-setup 1253))
+    (windows-1257 . (mm-codepage-setup 1257)))
+  "An alist of \(charset . form\) pairs.
+If an article is encoded in an unknown CHARSET, FORM is evaluated.
+This allows to load additional libraries providing CHARSETS.")
+
 (defvar mm-charset-synonym-alist
   `(
     ;; Not in XEmacs, but it's not a proper MIME charset anyhow.
@@ -175,7 +203,7 @@
 	  '((ks_c_5601-1987 . cp949))
 	'((ks_c_5601-1987 . euc-kr))))
     )
-  "A mapping from invalid charset names to the real charset names.")
+  "A mapping from unknown or invalid charset names to the real charset names.")
 
 (defvar mm-binary-coding-system
   (cond
@@ -400,6 +428,10 @@
 	(pop alist))
       out)))
 
+;; FIXME: `gnus-message' must be replaced by `message'.  This is just for
+;; testing.
+(autoload 'gnus-message "gnus-util")
+
 (defun mm-charset-to-coding-system (charset &optional lbt)
   "Return coding-system corresponding to CHARSET.
 CHARSET is a symbol naming a MIME charset.
@@ -428,9 +460,26 @@
 ;;;	  (eq charset (coding-system-get charset 'mime-charset))
 	  )
       charset)
+     ;; Eval expressions from `mm-charset-eval-alist'
+     ((let* ((el (assq charset mm-charset-eval-alist))
+             (cs (car el))
+             (form (cdr el)))
+        (and cs
+             form
+             ;; Avoid errors...
+             (condition-case nil (eval form) (error nil))
+             ;; (message "Failed to eval `%s'" form))
+             (mm-coding-system-p cs)
+             (gnus-message 7 "Added charset `%s' via `mm-charset-eval-alist'" cs)
+             cs)))
      ;; Translate invalid charsets.
      ((let ((cs (cdr (assq charset mm-charset-synonym-alist))))
-	(and cs (mm-coding-system-p cs) cs)))
+        (and cs
+             (mm-coding-system-p cs)
+             (gnus-message 7
+                           "Using synonym `%s' from `mm-charset-synonym-alist' for `%s'"
+                           cs charset)
+             cs)))
      ;; Last resort: search the coding system list for entries which
      ;; have the right mime-charset in case the canonical name isn't
      ;; defined (though it should be).
@@ -442,6 +491,8 @@
 		       (eq charset (or (coding-system-get c :mime-charset)
 				       (coding-system-get c 'mime-charset))))
 	      (setq cs c)))
+        (unless cs
+          (gnus-message 7 "Unknown charset: %s" charset))
 	cs))))
 
 (eval-and-compile

[-- Attachment #3: Type: text/plain, Size: 1088 bytes --]

>> [1] More on (X)Emacs, Gnus and charsets (in German):
>>     http://theotp1.physik.uni-ulm.de/~ste/comp/emacs/gnus/draft/
>
> Currently unavailable.

Up again.  (But probably not up to date WRT Emacs 22.)

Bye, Reiner.

[*] I've posted a series of test postings for windows-125* to
    <news:gmane.test>:

    <news:2005-03-10-gmane-windows-1250@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1251@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1252@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1253@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1254@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1255@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1256@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1257@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1258@marauder.physik.uni-ulm.de>
    <news:2005-03-10-gmane-windows-1259@marauder.physik.uni-ulm.de>
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/

* Re: wrong charset in spite of proper format
From: Miles Bader @ 2005-03-10 22:37 UTC

Reiner Steib <reinersteib+gmane@imap.cc> writes:
> [*] I've posted a series of test postings for windows-125* to

I notice that my emacs only displays something meaningful for 1251
and 1252; should it be able to handle the others too?

Or is it a font issue or something (though I've got lots of fonts
installed; most of emacs HELLO displays properly)?

Thanks,

-Miles
-- 
[|nurgle|]  ddt- demonic? so quake will have an evil kinda setting? one that
            will make every christian in the world foamm at the mouth?
[iddt]      nurg, that's the goal

* Re: wrong charset in spite of proper format
From: Reiner Steib @ 2005-03-11 10:05 UTC

On Thu, Mar 10 2005, Miles Bader wrote:

> Reiner Steib <reinersteib+gmane@imap.cc> writes:
>> [*] I've posted a series of test postings for windows-125* to
>
> I notice that my emacs only displays something meaningful for 1251
> and 1252; should it be able to handle the others too?

The article "windows-1252" was actually created as windows-1252.  Then
I just replaced "windows-1252" by "windows-125x" (x \in {0...9}) in
the outgoing file using sed.

Emacs should display _something_ (at least for A0-FF), although not
necessarily the character denoted in the "Description (only correct
for Latin-1)" column.  The description should be correct for
windows-1252.  E.g. at Hex A3 there is "POUND SIGN" in windows-1252
but a Cyrillic-J (Ј) in windows-1251.

As "windows-1259" doesn't exist (AFAIK), this article should display
the "Unknown charset" message (you must have `gnus-verbose' >= 7).

> Or is it a font issue or something (though I've got lots of fonts
> installed; most of emacs HELLO displays properly)?

If you see hollow squares, it is a font issue, I think.  If you see
\200 or similar, the position might be unused in this charset.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/

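The point about Hex A3 can be checked from `M-x ielm' or the *scratch* buffer with something like the following. It is only a sketch, not part of the thread: it assumes an Emacs where both coding systems are already defined (Emacs 22 or later, or Emacs 21 after the corresponding codepage setup), and it reports which of the two are missing instead of signalling an error.

```elisp
;; Decode the single byte 0xA3 (octal 243) under two charsets and
;; show the result in the echo area / *Messages* buffer.
(dolist (cs '(windows-1251 windows-1252))
  (if (coding-system-p cs)
      ;; "\243" is a unibyte string holding just the byte 0xA3.
      (message "%s: 0xA3 decodes to %s"
               cs (decode-coding-string "\243" cs))
    (message "%s is not defined in this Emacs" cs)))
```

If the two messages show different characters, the charset label is being honoured; if a character shows up as a hollow square, that points to a missing font rather than a decoding problem.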
Thread overview: 6+ messages

2005-03-09 13:47 wrong charset in spite of proper format Matthias Andree
2005-03-09 15:24 ` Reiner Steib
2005-03-10  0:35   ` Matthias Andree
2005-03-10 18:53     ` Reiner Steib
2005-03-10 22:37       ` Miles Bader
2005-03-11 10:05         ` Reiner Steib