Gnus development mailing list
* IMAP charset confusion
@ 2011-08-21 21:17 Lars Magne Ingebrigtsen
  2011-08-21 21:31 ` Lars Magne Ingebrigtsen
  2011-08-21 23:23 ` Katsumi Yamaoka
  0 siblings, 2 replies; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-08-21 21:17 UTC (permalink / raw)
  To: ding; +Cc: Katsumi Yamaoka

I just tried copying a message to an nnimap group with non-ASCII
characters, and it wasn't a pretty sight.

Basically, I gave the name as "hïllo", and it was created as
"h llo" on the server.

The problem seems to have something to do with double-encoding names.
Before calling `-request-create-group', `gnus-read-move-group-name' does
this:

      (setq encoded (mm-encode-coding-string
		     to-newsgroup
		     (gnus-group-name-charset to-method to-newsgroup)))

Which does this:

(encode-coding-string "hïllo" 'iso-8859-1)
"h\357llo"

nnimap then encodes this in utf7:

(utf7-encode (encode-coding-string "hïllo" 'iso-8859-1) t)
"h&ACA-llo"

While we should have:

(utf7-encode "hïllo" t)
"h&AO8-llo"

(utf7-decode "h&AO8-llo" t)
"hïllo"
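
To make the difference concrete, the evaluations above can be collected into one sketch. It assumes `utf7.el` (as shipped with Gnus/Emacs) is loadable; `my/imap-name-demo` is a hypothetical helper made up for illustration, and the UTF-7 results mentioned in the comments are the ones reported in this message, not guaranteed for every Emacs version.

```elisp
(require 'utf7)

;; Hypothetical helper, for illustration only.
(defun my/imap-name-demo (name)
  "Return facts about NAME relevant to the double-encoding problem."
  (let* ((bytes (encode-coding-string name 'iso-8859-1)) ; unibyte bytes
         (imap-name (utf7-encode name t)))               ; IMAP-style UTF-7
    (list (multibyte-string-p name)    ; is the original a character string?
          (multibyte-string-p bytes)   ; is the pre-encoded one?
          imap-name
          (utf7-decode imap-name t))))

(my/imap-name-demo "hïllo")
;; Per the results quoted above, the last two elements should be
;; "h&AO8-llo" and "hïllo".
```

The point of the sketch is that the pre-encoded string is already a sequence of bytes, so handing it to `utf7-encode' encodes an encoding rather than the characters.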

Katsumi, you seemed to introduce the encoding stuff in 2007, for good
reasons, I'm sure.  :-)  But here we have a backend that really knows
what charset it wants to use (the standard defines utf-7), so I'm not
sure what the solution here is.

Hm...  would it make sense to have `gnus-group-name-charset' just return
nil for all nnimap groups?
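
A minimal sketch of that idea, assuming a select method is a list whose car names the backend, as in `(nnimap "SERVER")'. The function name and the FALLBACK argument are hypothetical; this is not the actual change to `gnus-group-name-charset'.

```elisp
;; Hypothetical helper, not the real Gnus code: choose the charset used
;; to pre-encode a group name, skipping nnimap entirely.
(defun my/group-name-charset (method fallback)
  "Return nil when METHOD is an nnimap method, else FALLBACK."
  (if (eq (car-safe method) 'nnimap)
      nil        ; let nnimap do its own utf7 encoding
    fallback))

(my/group-name-charset '(nnimap "SERVER") 'iso-8859-1)         ; => nil
(my/group-name-charset '(nntp "news.example.org") 'iso-8859-1) ; => iso-8859-1
```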

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




* Re: IMAP charset confusion
  2011-08-21 21:17 IMAP charset confusion Lars Magne Ingebrigtsen
@ 2011-08-21 21:31 ` Lars Magne Ingebrigtsen
  2011-08-21 21:37   ` Lars Magne Ingebrigtsen
  2011-08-21 23:23 ` Katsumi Yamaoka
  1 sibling, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-08-21 21:31 UTC (permalink / raw)
  To: ding; +Cc: Katsumi Yamaoka

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Hm...  would it make sense to have `gnus-group-name-charset' just return
> nil for all nnimap groups?

I did that (pushed it now), and that seems to make stuff work.  I can
move an article to a group, and I can read the group.

However, if I browse the group in the server buffer, the group name
appears as "raw utf8", since the `read' calls are wrapped in
`mm-string-as-unibyte'...  I'm not quite sure what the purpose of those
is...


* Re: IMAP charset confusion
  2011-08-21 21:31 ` Lars Magne Ingebrigtsen
@ 2011-08-21 21:37   ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-08-21 21:37 UTC (permalink / raw)
  To: ding; +Cc: Katsumi Yamaoka

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> I did that (pushed it now), and that seems to make stuff work.  I can
> move an article to a group, and I can read the group.

I spoke too soon.  I could select it, but the next time I hit `g', the
"Tést" group showed up as "T\303\251st".  So I'm guessing there's more
`mm-string-as-unibyte' things going on.  :-)

Should nnimap transform the non-ASCII things into some other form before
giving them to Gnus?  At present, it just puts

"Tést" 1 1 y

into the *nntpd* buffer when you `g' (and the like).
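
A small sketch of the symptom, using only standard coding functions: if the UTF-8 bytes of a name are kept as a unibyte string and never decoded, each non-ASCII byte is shown as an octal escape. `my/raw-utf8-demo' is a hypothetical name, and this is not the nnimap code path itself.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/raw-utf8-demo (name)
  "Compare NAME with its undecoded UTF-8 byte string."
  (let ((raw (encode-coding-string name 'utf-8))) ; unibyte, e.g. "T\303\251st"
    (list (length name)                           ; number of characters
          (length raw)                            ; number of bytes
          (string= (decode-coding-string raw 'utf-8) name))))

(my/raw-utf8-demo "Tést") ; => (4 5 t)
```

Decoding the bytes as UTF-8 recovers the name, which suggests the names are being treated as bytes somewhere between the *nntpd* buffer and the group buffer.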


* Re: IMAP charset confusion
  2011-08-21 21:17 IMAP charset confusion Lars Magne Ingebrigtsen
  2011-08-21 21:31 ` Lars Magne Ingebrigtsen
@ 2011-08-21 23:23 ` Katsumi Yamaoka
  2011-09-10 22:28   ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 9+ messages in thread
From: Katsumi Yamaoka @ 2011-08-21 23:23 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen wrote:
> I just tried copying a message to an nnimap group with non-ASCII
> characters, and it wasn't a pretty sight.

[...]

> Katsumi, you seemed to introduce the encoding stuff in 2007, for good
> reasons, I'm sure.  :-)  But here we have a backend that really knows
> what charset it wants to use (the standard defines utf-7), so I'm not
> sure what the solution here is.

I'm completely ignorant of IMAP, so I didn't touch nnimap for
non-ASCII group names.  Sorry.  It seems Martin Stjernholm did it:

http://news.gmane.org/group/gmane.emacs.gnus.general/thread=69448

> Hm...  would it make sense to have `gnus-group-name-charset' just return
> nil for all nnimap groups?

Doesn't this do the trick?

(setq gnus-group-name-charset-method-alist
      '(((nnimap "SERVER") . nil)))




* Re: IMAP charset confusion
  2011-08-21 23:23 ` Katsumi Yamaoka
@ 2011-09-10 22:28   ` Lars Magne Ingebrigtsen
  2011-09-10 22:46     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-10 22:28 UTC (permalink / raw)
  To: Katsumi Yamaoka; +Cc: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

>> Hm...  would it make sense to have `gnus-group-name-charset' just return
>> nil for all nnimap groups?
>
> Doesn't this do the trick?
>
> (setq gnus-group-name-charset-method-alist
>       '(((nnimap "SERVER") . nil)))

Yes, but this should work by default...

I wonder whether this is the same charset confusion that bug#9351 (about
nnrss) is about.

If I understand the code correctly, the current model is that Gnus knows
which charset each backend should use, so Gnus encodes strings to
that charset before handing the names over to the backend?

But in the case of nnimap, the backend charset should always be utf7.

But I'm not sure that's correct...


* Re: IMAP charset confusion
  2011-09-10 22:28   ` Lars Magne Ingebrigtsen
@ 2011-09-10 22:46     ` Lars Magne Ingebrigtsen
  2011-09-10 22:50       ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-10 22:46 UTC (permalink / raw)
  To: Katsumi Yamaoka; +Cc: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> But I'm not sure that's correct...

I think perhaps I should back out the changes I made, and make Gnus
encode to utf7 first?  Uhm.  Perhaps.  Because now when utf7 IMAP groups
appear, Gnus appears to interpret them as

L¡rs


* Re: IMAP charset confusion
  2011-09-10 22:46     ` Lars Magne Ingebrigtsen
@ 2011-09-10 22:50       ` Lars Magne Ingebrigtsen
  2011-09-11  8:08         ` Andreas Schwab
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-10 22:50 UTC (permalink / raw)
  To: Katsumi Yamaoka; +Cc: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> I think perhaps I should back out the changes I made, and make Gnus
> encode to utf7 first?  Uhm.  Perhaps.  Because now when utf7 IMAP groups
> appear, Gnus appears to interpret them as

But IMAP doesn't really use utf7 encoding.  It uses a variant of it that
Emacs itself doesn't really support natively.

This could get messy.

Anybody have any ideas?


* Re: IMAP charset confusion
  2011-09-10 22:50       ` Lars Magne Ingebrigtsen
@ 2011-09-11  8:08         ` Andreas Schwab
  2011-09-12  4:28           ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Schwab @ 2011-09-11  8:08 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: Katsumi Yamaoka, ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> But IMAP doesn't really use utf7 encoding.  It uses a variant of it that
> Emacs itself doesn't really support natively.
>
> This could get messy.
>
> Anybody have any ideas?

Add it to Emacs?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: IMAP charset confusion
  2011-09-11  8:08         ` Andreas Schwab
@ 2011-09-12  4:28           ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-09-12  4:28 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Katsumi Yamaoka, ding

Andreas Schwab <schwab@linux-m68k.org> writes:

>> But IMAP doesn't really use utf7 encoding.  It uses a variant of it that
>> Emacs itself doesn't really support natively.
>>
>> This could get messy.
>>
>> Anybody have any ideas?
>
> Add it to Emacs?

I don't think it's generally useful outside of IMAP -- it's basically
utf7 with some character substitutions.
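
For readers unfamiliar with the variant: IMAP mailbox names use the "modified UTF-7" described in RFC 3501, which among other changes shifts into base64 with "&" rather than "+" and uses "," in place of "/". A sketch with `utf7.el', whose optional second argument selects the IMAP form; the IMAP result noted in the comment is the one reported earlier in this thread.

```elisp
(require 'utf7)

(let ((plain (utf7-encode "hïllo"))     ; standard UTF-7, shifts with "+"
      (imap  (utf7-encode "hïllo" t)))  ; IMAP variant, reported "h&AO8-llo"
  (list plain
        imap
        (utf7-decode imap t)))          ; decoding must use the same variant
```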

But I found a different solution to allow IMAP to do its thing while
Gnus does its thing, too.


