Gnus development mailing list
* multiple charsets handling in gnus
@ 2001-09-07 17:39 Alexander Kotelnikov
  2001-09-07 19:19 ` Kai Großjohann
  2001-09-07 21:47 ` Kai Großjohann
  0 siblings, 2 replies; 13+ messages in thread
From: Alexander Kotelnikov @ 2001-09-07 17:39 UTC (permalink / raw)


Hello.

I have a problem with Gnus: I am Russian, so I read mail in Cyrillic.
Cyrillic has at least three more or less widespread charsets:
1. koi8-r (Unix systems)
2. cp1251, or Windows-1251 (Windows Cyrillic)
3. iso-8859-5, not widely used, but an ISO standard.

Most of mail I get is in koi8-r and I read it without any
troubles. But when I get a mail with
Content-Type: text/plain; charset=windows-1251
or
Content-Type: text/plain; charset=cp1251
I see undecoded subject, like =?windows-1251?B?4vLu8O7l?= and body.

As the Gnus info manual says, I have
(put-charset-property 'cyrillic-iso8859-5
		      'preferred-coding-system 'koi8-r)
and 

(codepage-setup 1251)
(define-coding-system-alias 'windows-1251 'cp1251)

set in my ~/.gnus.el

One more problem: people now write mail in utf-8 encoding. Can it be
converted by Gnus to my default charset if Emacs can't display Unicode?

I use GNU Emacs 20.7.2 (i386-debian-linux-gnu, X toolkit) with more or
less up-to-date cvs gnus.

PS: Cc's are welcome.

-- 
Alexander Kotelnikov
Saint-Petersburg, Russia



* Re: multiple charsets handling in gnus
  2001-09-07 17:39 multiple charsets handling in gnus Alexander Kotelnikov
@ 2001-09-07 19:19 ` Kai Großjohann
  2001-09-07 21:00   ` Alexander Kotelnikov
  2001-09-07 21:47 ` Kai Großjohann
  1 sibling, 1 reply; 13+ messages in thread
From: Kai Großjohann @ 2001-09-07 19:19 UTC (permalink / raw)
  Cc: ding

Alexander Kotelnikov <sacha@softjoys.ru> writes:

> One more problem: people now write mail in utf-8 encoding. Can it be
> converted by Gnus to my default charset if Emacs can't display Unicode?

Maybe it's easiest to just install Mule-UCS which enables Emacs to
grok Unicode, to some degree.  (The degree is sufficient for viewing
the messages, I think.)

kai
-- 
Symbol's function definition is void: signature



* Re: multiple charsets handling in gnus
  2001-09-07 19:19 ` Kai Großjohann
@ 2001-09-07 21:00   ` Alexander Kotelnikov
  0 siblings, 0 replies; 13+ messages in thread
From: Alexander Kotelnikov @ 2001-09-07 21:00 UTC (permalink / raw)
  Cc: ding

>>>>> On Fri, 07 Sep 2001 21:19:56 +0200
>>>>> "Kai" == Kai Großjohann <Kai.Grossjohann@CS.Uni-Dortmund.DE> wrote:
Kai> 
Kai> Alexander Kotelnikov <sacha@softjoys.ru> writes:
>> One more problem: people now write mail in utf-8 encoding. Can it be
>> converted by Gnus to my default charset if Emacs can't display Unicode?
Kai> 
Kai> Maybe it's easiest to just install Mule-UCS which enables Emacs to
Kai> grok Unicode, to some degree.  (The degree is sufficient for viewing
Kai> the messages, I think.)

Oh, that really helps with utf-8, thanks.

Still wondering about cp1251 recoding...

-- 
Alexander Kotelnikov
Saint-Petersburg, Russia



* Re: multiple charsets handling in gnus
  2001-09-07 17:39 multiple charsets handling in gnus Alexander Kotelnikov
  2001-09-07 19:19 ` Kai Großjohann
@ 2001-09-07 21:47 ` Kai Großjohann
  2001-09-07 21:56   ` Alexander Kotelnikov
  1 sibling, 1 reply; 13+ messages in thread
From: Kai Großjohann @ 2001-09-07 21:47 UTC (permalink / raw)
  Cc: ding

Alexander Kotelnikov <sacha@softjoys.ru> writes:

> Most of mail I get is in koi8-r and I read it without any
> troubles. But when I get a mail with
> Content-Type: text/plain; charset=windows-1251
> or
> Content-Type: text/plain; charset=cp1251
> I see undecoded subject, like =?windows-1251?B?4vLu8O7l?= and body.

Does the variable mm-charset-synonym-alist help?

kai
-- 
Symbol's function definition is void: signature



* Re: multiple charsets handling in gnus
  2001-09-07 21:47 ` Kai Großjohann
@ 2001-09-07 21:56   ` Alexander Kotelnikov
  2001-09-07 23:17     ` Kai Großjohann
  0 siblings, 1 reply; 13+ messages in thread
From: Alexander Kotelnikov @ 2001-09-07 21:56 UTC (permalink / raw)


>>>>> On Fri, 07 Sep 2001 23:47:40 +0200
>>>>> "Kai" == Kai Großjohann <Kai.Grossjohann@CS.Uni-Dortmund.DE> wrote:
Kai> 
Kai> Alexander Kotelnikov <sacha@softjoys.ru> writes:
>> Most of mail I get is in koi8-r and I read it without any
>> troubles. But when I get a mail with
>> Content-Type: text/plain; charset=windows-1251
>> or
>> Content-Type: text/plain; charset=cp1251
>> I see undecoded subject, like =?windows-1251?B?4vLu8O7l?= and body.
Kai> 
Kai> Does the variable mm-charset-synonym-alist help?

How can it help, if recoding from cp1251 to koi8-r does not work even
when charset is not an alias (Windows-1251) but the original (cp1251)?

-- 
Alexander Kotelnikov
Saint-Petersburg, Russia



* Re: multiple charsets handling in gnus
  2001-09-07 21:56   ` Alexander Kotelnikov
@ 2001-09-07 23:17     ` Kai Großjohann
  2001-09-07 23:30       ` Alexander Kotelnikov
  0 siblings, 1 reply; 13+ messages in thread
From: Kai Großjohann @ 2001-09-07 23:17 UTC (permalink / raw)
  Cc: ding

Alexander Kotelnikov <sacha@softjoys.ru> writes:

> >>>>> On Fri, 07 Sep 2001 23:47:40 +0200
> >>>>> "Kai" == Kai Großjohann <Kai.Grossjohann@CS.Uni-Dortmund.DE> wrote:
> Kai> 
> Kai> Alexander Kotelnikov <sacha@softjoys.ru> writes:
> >> Most of mail I get is in koi8-r and I read it without any
> >> troubles. But when I get a mail with
> >> Content-Type: text/plain; charset=windows-1251
> >> or
> >> Content-Type: text/plain; charset=cp1251
> >> I see undecoded subject, like =?windows-1251?B?4vLu8O7l?= and body.
> Kai> 
> Kai> Does the variable mm-charset-synonym-alist help?
>
> How can it help, if recoding from cp1251 to koi8-r does not work even
> when charset is not an alias (Windows-1251) but the original (cp1251)?

Hm.  The content-type header has no bearing whatsoever on what Gnus
does with headers.

You could have charset=gb-2312 and the handling of
=?windows-1251?... in the header wouldn't change at all.
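
To make that concrete: an encoded word in a header names its own
charset and transfer encoding between the question marks, so it can be
decoded without looking at Content-Type at all. A minimal sketch in
Emacs Lisp, assuming base64-decode-string is available in the running
Emacs and that the cp1251 coding system has already been defined
(e.g. by codepage-setup); it only illustrates the principle and is not
how Gnus itself is implemented:

```elisp
;; Illustrative only: decode the payload of =?windows-1251?B?4vLu8O7l?= by hand.
;; "B" means base64; the charset is taken from the encoded word itself,
;; not from the Content-Type header of the message.
(let* ((payload "4vLu8O7l")
       (raw (base64-decode-string payload)))  ; six raw bytes: e2 f2 ee f0 ee e5
  ;; Interpret those bytes as cp1251 text.
  (decode-coding-string raw 'cp1251))
```

If cp1251 is known to Emacs, the six bytes should come out as the
Russian word "второе".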

Does Gnus display the body correctly?

kai
-- 
Symbol's function definition is void: signature



* Re: multiple charsets handling in gnus
  2001-09-07 23:17     ` Kai Großjohann
@ 2001-09-07 23:30       ` Alexander Kotelnikov
  2001-09-08 10:59         ` Kai Großjohann
  0 siblings, 1 reply; 13+ messages in thread
From: Alexander Kotelnikov @ 2001-09-07 23:30 UTC (permalink / raw)


>>>>> On Sat, 08 Sep 2001 01:17:17 +0200
>>>>> "Kai" == Kai Großjohann <Kai.Grossjohann@CS.Uni-Dortmund.DE> wrote:
Kai> 
Kai> Does the variable mm-charset-synonym-alist help?
>> 
>> How can it help, if recoding from cp1251 to koi8-r does not work even
>> when charset is not an alias (Windows-1251) but the original (cp1251)?
Kai> 
Kai> Hm.  The content-type header has no bearing whatsoever on what Gnus
Kai> does with headers.
Kai> 
Kai> You could have charset=gb-2312 and the handling of
Kai> =?windows-1251?... in the header wouldn't change at all.
Kai> 
Kai> Does Gnus display the body correctly?

Not at all.

BTW, I've put together 4 test messages in koi8-r, cp1251, Windows-1251
and utf-8 in file
http://people.debian.org/~sacha/russ.box

-- 
Alexander Kotelnikov
Saint-Petersburg, Russia



* Re: multiple charsets handling in gnus
  2001-09-07 23:30       ` Alexander Kotelnikov
@ 2001-09-08 10:59         ` Kai Großjohann
  2001-09-08 11:06           ` Kai Großjohann
  0 siblings, 1 reply; 13+ messages in thread
From: Kai Großjohann @ 2001-09-08 10:59 UTC (permalink / raw)
  Cc: ding

Alexander Kotelnikov <sacha@softjoys.ru> writes:

> BTW, I've put together 4 test messages in koi8-r, cp1251, Windows-1251
> and utf-8 in file
> http://people.debian.org/~sacha/russ.box

Okay.  I can now view them all.  You have to do the codepage-setup
stuff before Gnus loads.

And I used the following after loading Gnus:

(add-to-list 'mm-charset-synonym-alist '(windows-1251 . cp1251))

I also did the following, but I think that's not necessary.

(add-to-list 'mm-mime-mule-charset-alist '(windows-1251 cp1251))

Now I think I should try to insert the magic words in .emacs and .gnus
to see whether it works after restarting Emacs.

kai
-- 
Symbol's function definition is void: signature



* Re: multiple charsets handling in gnus
  2001-09-08 10:59         ` Kai Großjohann
@ 2001-09-08 11:06           ` Kai Großjohann
  2001-09-08 21:04             ` Alexander Kotelnikov
  0 siblings, 1 reply; 13+ messages in thread
From: Kai Großjohann @ 2001-09-08 11:06 UTC (permalink / raw)
  Cc: ding

Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:

> Okay.  I can now view them all.  You have to do the codepage-setup
> stuff before Gnus loads.

I can now testify that the following lets me view all four messages:

* Near the beginning of ~/.emacs, I inserted the following line:
  (codepage-setup 1251)

* Near the end of my ~/.gnus, I inserted the following lines:
  (require 'mm-util)
  (add-to-list 'mm-charset-synonym-alist '(windows-1251 . cp1251))
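
A quick way to check that both pieces took effect is to evaluate the
forms below, for instance in the *scratch* buffer. This is only a
sketch and assumes the two snippets above have already been loaded;
nothing in it is specific to Gnus beyond the variable name:

```elisp
;; Should return non-nil once (codepage-setup 1251) has run.
(coding-system-p 'cp1251)

;; Should return (windows-1251 . cp1251) after the add-to-list from ~/.gnus.
(require 'mm-util)
(assq 'windows-1251 mm-charset-synonym-alist)
```

If the first form returns nil, the coding system was not defined before
Gnus looked for it.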

Of course, since I use Emacs 21, I have a slight advantage regarding
the UTF-8 stuff.  But I successfully viewed UTF-8 messages before in
Emacs 20 with Mule-UCS.  Therefore, I'm confident that Mule-UCS will
help you with UTF-8, too.

Please report your findings.

kai
-- 
Symbol's function definition is void: signature



* Re: multiple charsets handling in gnus
  2001-09-08 11:06           ` Kai Großjohann
@ 2001-09-08 21:04             ` Alexander Kotelnikov
  0 siblings, 0 replies; 13+ messages in thread
From: Alexander Kotelnikov @ 2001-09-08 21:04 UTC (permalink / raw)


>>>>> On Sat, 08 Sep 2001 13:06:42 +0200
>>>>> "Kai" == Kai Großjohann <Kai.Grossjohann@CS.Uni-Dortmund.DE> wrote:
Kai> 
Kai> Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
>> Okay.  I can now view them all.  You have to do the codepage-setup
>> stuff before Gnus loads.
Kai> 
Kai> I can now testify that the following lets me view all four messages:
Kai> 
Kai> * Near the beginning of ~/.emacs, I inserted the following line:
Kai>   (codepage-setup 1251)

My tests say: put it in before .gnus is loaded, e.g. in .emacs.

Kai> 
Kai> * Near the end of my ~/.gnus, I inserted the following lines:
Kai>   (require 'mm-util)
Kai>   (add-to-list 'mm-charset-synonym-alist '(windows-1251 . cp1251))

I simply put (define-coding-system-alias 'windows-1251 'cp1251) right
after codepage-setup.
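
Spelled out, this variant defines the alias at the coding-system level
instead of in Gnus's synonym table. A sketch of the resulting ~/.emacs
fragment; the order matters, because the alias can only be defined once
cp1251 exists:

```elisp
;; In ~/.emacs, i.e. before Gnus is loaded.
(codepage-setup 1251)                               ; defines the cp1251 coding system
(define-coding-system-alias 'windows-1251 'cp1251)  ; windows-1251 now names the same one
```

The original post had the same two forms in ~/.gnus.el, where they did
not help.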

Kai> Of course, since I use Emacs 21, I have a slight advantage regarding
Kai> the UTF-8 stuff.  But I successfully viewed UTF-8 messages before in
Kai> Emacs 20 with Mule-UCS.  Therefore, I'm confident that Mule-UCS will
Kai> help you with UTF-8, too.

Me too. I now read utf-8 using Mule-UCS.

Kai> Please report your findings.

So I did.

Thank you very much, Kai, for your help.
-- 
Alexander Kotelnikov
Saint-Petersburg, Russia



* Re: multiple charsets handling in gnus
  2001-09-12 19:13 ` Pavel Janík
@ 2001-09-13  0:14   ` ShengHuo ZHU
  0 siblings, 0 replies; 13+ messages in thread
From: ShengHuo ZHU @ 2001-09-13  0:14 UTC (permalink / raw)


Pavel@Janik.cz (Pavel Janík) writes:


[...]

>
> 2001-09-12  Pavel Janík  <Pavel@Janik.cz>
>
> 	* mm-util.el (mm-charset-synonym-alist): add windows-1250 so we
> 	can read e-mails from Microsoft Outlook users not using ISO
> 	8859-2 character set.

Installed.

ShengHuo



* Re: multiple charsets handling in gnus
  2001-09-10  7:27 Pavel Janík
@ 2001-09-12 19:13 ` Pavel Janík
  2001-09-13  0:14   ` ShengHuo ZHU
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Janík @ 2001-09-12 19:13 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 1103 bytes --]

   From: Pavel@Janik.cz (Pavel Janík)
   Date: Mon, 10 Sep 2001 09:27:17 +0200

   >    > * Near the end of my ~/.gnus, I inserted the following lines:
   >    > (require 'mm-util)
   >    > (add-to-list 'mm-charset-synonym-alist '(windows-1251 . cp1251))
   > 
   > What about adding this to Gnus directly? There is already this code:
   > 
   >     ,(unless (mm-coding-system-p 'windows-1252)	; should be defined eventually
   >        '(windows-1252 . iso-8859-1))
   > 
   > so we can extend this to have also 1251 together with 1250. I too have this
   > in my ~/.gnus to be able to read e-mails written on Windows:
   > 
   > ;; Support for Windows-1250
   > (require 'mm-util)
   > (add-to-list 'mm-charset-synonym-alist '(windows-1250 . cp1250))
   > 
   > I hope that I'll be able to remove that from my .gnus soon :-)

Please apply the following patch. I think Kai will immediately send
his ;-)

2001-09-12  Pavel Janík  <Pavel@Janik.cz>

	* mm-util.el (mm-charset-synonym-alist): add windows-1250 so we
	can read e-mails from Microsoft Outlook users not using ISO
	8859-2 character set.


[-- Attachment #2: gnus-windows-1250.diff --]
[-- Type: text/x-patch, Size: 1221 bytes --]

diff -ur gnus.orig/lisp/ChangeLog gnus/lisp/ChangeLog
--- gnus.orig/lisp/ChangeLog	Wed Sep 12 20:57:00 2001
+++ gnus/lisp/ChangeLog	Wed Sep 12 21:11:36 2001
@@ -1,3 +1,9 @@
+2001-09-12  Pavel Janík  <Pavel@Janik.cz>
+
+	* mm-util.el (mm-charset-synonym-alist): add windows-1250 so we
+	can read e-mails from Microsoft Outlook users not using ISO
+	8859-2 character set.
+
 2001-09-12  Didier Verna  <didier@xemacs.org>
 
 	* nndiary.el: new version (0.2-b13).
diff -ur gnus.orig/lisp/mm-util.el gnus/lisp/mm-util.el
--- gnus.orig/lisp/mm-util.el	Wed Sep 12 20:57:00 2001
+++ gnus/lisp/mm-util.el	Wed Sep 12 21:09:59 2001
@@ -166,6 +166,11 @@
     ;; `gnus-article-dumbquotes-map'.
     ,(unless (mm-coding-system-p 'windows-1252)	; should be defined eventually
        '(windows-1252 . iso-8859-1))
+    ;; Windows-1250 is a variant of Latin-2 heavily used by Microsoft
+    ;; Outlook users in Czech republic. Use this to allow reading of their
+    ;; e-mails. cp1250 should be defined by M-x codepage-setup.
+    ,(unless (mm-coding-system-p 'windows-1250)	; should be defined eventually
+       '(windows-1250 . cp1250))
     (x-ctext . ctext))
   "A mapping from invalid charset names to the real charset names.")
 

[-- Attachment #3: Type: text/plain, Size: 170 bytes --]


-- 
Pavel Janík

When you're in a fight with an idiot, it's difficult for other people to tell
which one the idiot is.
                  -- Bruce Perens on debian-devel


* Re: multiple charsets handling in gnus
@ 2001-09-10  7:27 Pavel Janík
  2001-09-12 19:13 ` Pavel Janík
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Janík @ 2001-09-10  7:27 UTC (permalink / raw)


   > * Near the end of my ~/.gnus, I inserted the following lines:
   > (require 'mm-util)
   > (add-to-list 'mm-charset-synonym-alist '(windows-1251 . cp1251))

What about adding this to Gnus directly? There is already this code:

    ,(unless (mm-coding-system-p 'windows-1252)	; should be defined eventually
       '(windows-1252 . iso-8859-1))

so we can extend this to have also 1251 together with 1250. I too have this
in my ~/.gnus to be able to read e-mails written on Windows:

;; Support for Windows-1250
(require 'mm-util)
(add-to-list 'mm-charset-synonym-alist '(windows-1250 . cp1250))

I hope that I'll be able to remove that from my .gnus soon :-)
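
The backquote idiom in the existing code can be tried in isolation. In
the sketch below, my-synonym-alist is a hypothetical variable used only
for illustration, and coding-system-p stands in for Gnus's
mm-coding-system-p. The entry is added only while Emacs does not know
windows-1250 under that name; otherwise unless returns nil, and alist
lookups such as assq skip that non-cons element:

```elisp
;; Hypothetical stand-alone variable; the real table is mm-charset-synonym-alist.
(defvar my-synonym-alist
  `(,(unless (coding-system-p 'windows-1250)
       ;; Only needed while windows-1250 is not a coding system in its own right.
       '(windows-1250 . cp1250))
    (x-ctext . ctext)))

;; Returns (windows-1250 . cp1250) if the fallback entry was added, nil otherwise.
(assq 'windows-1250 my-synonym-alist)
```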
-- 
Pavel Janík

panic("bad_user_access_length executed (not cool, dude)");
                  -- 2.0.38 kernel/panic.c


