* Decoding ISO8859-15 -- offending headers @ 2002-07-18 6:33 Jonas Steverud 2002-07-26 19:39 ` Simon Josefsson 0 siblings, 1 reply; 27+ messages in thread
From: Jonas Steverud @ 2002-07-18 6:33 UTC (permalink / raw)

I posted earlier about decoding iso8859-15 news articles and here is
an article where åäö is replaced by ? when I view it but they looked
nice when I did C-u g. The headers look alright to me. Anyone else who
has trouble with this article?

Path: news.chalmers.se!newsfeed.sunet.se!news01.sunet.se!logbridge.uoregon.edu!news.tele.dk!small.news.tele.dk!news-stob.telia.net!telia.net!194.22.194.4.MISMATCH!masternews.telia.net.!newsc.telia.net.POSTED!not-for-mail
From: Marcus <pictorNOSPAM@mac.com>
Subject: .Mac
Newsgroups: se.dator.sys.mac
Lines: 6
Organization: SUBLIMITETSAKADEMIEN
User-Agent: KNode/0.6.1
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: 8Bit
Message-ID: <3ulZ8.47852$n4.11770780@newsc.telia.net>
Date: Wed, 17 Jul 2002 21:39:11 GMT
NNTP-Posting-Host: 217.210.56.30
X-Complaints-To: abuse@telia.com
X-Trace: newsc.telia.net 1026941951 217.210.56.30 (Wed, 17 Jul 2002 23:39:11 CEST)
NNTP-Posting-Date: Wed, 17 Jul 2002 23:39:11 CEST
Xref: news.chalmers.se se.dator.sys.mac:25740

--
( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei )
( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do )

^ permalink raw reply [flat|nested] 27+ messages in thread
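For readers who want to check what the headers above ask a client to do, here is a minimal sketch in Python (not Gnus code). It assumes a single-part article whose body arrives as raw 8-bit bytes. Only the MIME headers are copied from the post; the body text and the names `RAW_ARTICLE` and `decode_body` are invented for illustration.

```python
from email import message_from_bytes

# Hypothetical stand-in for the article: only the MIME headers come from the
# post above; the body text is invented for illustration.
RAW_ARTICLE = (
    b"Subject: .Mac\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=iso-8859-15\n"
    b"Content-Transfer-Encoding: 8Bit\n"
    b"\n"
    + "Hej, åäö!\n".encode("iso-8859-15")
)


def decode_body(raw: bytes) -> str:
    """Decode a single-part article body using the charset it declares."""
    msg = message_from_bytes(raw)
    # Fall back to US-ASCII when no charset parameter is present.
    charset = msg.get_content_charset() or "us-ascii"
    # For an 8bit body, decode=True returns the raw body bytes unchanged.
    payload = msg.get_payload(decode=True)
    return payload.decode(charset, errors="replace")


body = decode_body(RAW_ARTICLE)
print(body)
```

If the declared charset is honoured, åäö survive the round trip. A ? in their place suggests a problem on the decoding or display side rather than in the headers themselves.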
* Re: Decoding ISO8859-15 -- offending headers 2002-07-18 6:33 Decoding ISO8859-15 -- offending headers Jonas Steverud @ 2002-07-26 19:39 ` Simon Josefsson 2002-07-27 9:09 ` Jonas Steverud 0 siblings, 1 reply; 27+ messages in thread From: Simon Josefsson @ 2002-07-26 19:39 UTC (permalink / raw) Jonas Steverud <d4jonas@dtek.chalmers.se> writes: > I posted earlier about decoding iso8859-15 news articles and here is > an article where åäö is replaced by ? when I view it This looks like a question mark. Is this how it looks for you? Or was it an empty box? The messages looked fine here, but I think you must have the proper ISO-8859-15 fonts to display it. > but they looked nice when I did C-u g. That's weird. ÅÄÖ have the same code point in 15 as in 1, so if somehow your emacs defaults to 1 this might happen, but it doesn't for me. When I C-u g an article containing non-ASCII I always get octal sequences for the characters, which is what you should expect. If you C-u g this message, do you see ÅÄÖ as ÅÄÖ or \345\344\346? ^ permalink raw reply [flat|nested] 27+ messages in thread
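The claim that ÅÄÖ share code points between ISO-8859-1 and ISO-8859-15 can be checked with a short Python sketch (an illustration only, not part of Gnus; the helper names `octal_escapes` and `differing_bytes` are made up here). It prints the octal form of each letter and lists the byte values where the two charsets actually differ. Going by these tables, ö is \366 in both charsets, while \346 is æ.

```python
def octal_escapes(text: str, charset: str) -> str:
    r"""Render each encoded byte as a backslash-octal escape such as \304."""
    return "".join("\\%03o" % byte for byte in text.encode(charset))


def differing_bytes() -> list:
    """Byte values that decode differently in ISO-8859-1 and ISO-8859-15."""
    return [
        b for b in range(256)
        if bytes([b]).decode("iso-8859-1") != bytes([b]).decode("iso-8859-15")
    ]


for ch in "åäöÅÄÖ":
    # Each of these letters encodes to the same single byte in both charsets.
    assert ch.encode("iso-8859-1") == ch.encode("iso-8859-15")
    print(ch, octal_escapes(ch, "iso-8859-15"))

# Only a handful of positions (for example 0xA4, the euro sign in Latin-9)
# differ between the two charsets.
print([hex(b) for b in differing_bytes()])
```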
* Re: Decoding ISO8859-15 -- offending headers 2002-07-26 19:39 ` Simon Josefsson @ 2002-07-27 9:09 ` Jonas Steverud 2002-07-27 14:03 ` Simon Josefsson 0 siblings, 1 reply; 27+ messages in thread From: Jonas Steverud @ 2002-07-27 9:09 UTC (permalink / raw) Simon Josefsson <jas@extundo.com> writes: > Jonas Steverud <d4jonas@dtek.chalmers.se> writes: > >> I posted earlier about decoding iso8859-15 news articles and here is >> an article where åäö is replaced by ? when I view it > > This looks like a question mark. Is this how it looks for you? Or > was it an empty box? It's a questionmark. Sorry for not pointing that out. Should've known. > The messages looked fine here, but I think you must have the proper > ISO-8859-15 fonts to display it. It might be it. Dunno how to check though. >> but they looked nice when I did C-u g. > > That's weird. ÅÄÖ have the same code point in 15 as in 1, so if > somehow your emacs defaults to 1 this might happen, but it doesn't for > me. When I C-u g an article containing non-ASCII I always get octal > sequences for the characters, which is what you should expect. If you > C-u g this message, do you see ÅÄÖ as ÅÄÖ or \345\344\346? As ÅÄÖ. -- ( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei ) ( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do ) ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-07-27 9:09 ` Jonas Steverud @ 2002-07-27 14:03 ` Simon Josefsson 2002-07-27 15:17 ` Jonas Steverud 0 siblings, 1 reply; 27+ messages in thread From: Simon Josefsson @ 2002-07-27 14:03 UTC (permalink / raw) Jonas Steverud <d4jonas@dtek.chalmers.se> writes: >> That's weird. ÅÄÖ have the same code point in 15 as in 1, so if >> somehow your emacs defaults to 1 this might happen, but it doesn't for >> me. When I C-u g an article containing non-ASCII I always get octal >> sequences for the characters, which is what you should expect. If you >> C-u g this message, do you see ÅÄÖ as ÅÄÖ or \345\344\346? > > As ÅÄÖ. I'm not sure that's correct. What kind of MULE configuration do you have in .emacs? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-07-27 14:03 ` Simon Josefsson @ 2002-07-27 15:17 ` Jonas Steverud 2002-07-27 20:45 ` Simon Josefsson 0 siblings, 1 reply; 27+ messages in thread
From: Jonas Steverud @ 2002-07-27 15:17 UTC (permalink / raw)

Simon Josefsson <jas@extundo.com> writes:

> Jonas Steverud <d4jonas@dtek.chalmers.se> writes:
[...]
>> As ÅÄÖ.
>
> I'm not sure that's correct. What kind of MULE configuration do you
> have in .emacs?

Huh?

Is it
(set-language-environment "Latin-1")
you talk of?

But set-language-environment is maybe not needed any longer? I have
never entirely understood how Emacs handles different input methods or
how to make it understand me, so set-language-environment might be
outdated.

GNU Emacs 21.1.2 (sparc-sun-solaris2.8, X toolkit, Xaw3d scroll bars)
of 2001-10-21 on licia.dtek.chalmers.se

--
( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei )
( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do )
* Re: Decoding ISO8859-15 -- offending headers 2002-07-27 15:17 ` Jonas Steverud @ 2002-07-27 20:45 ` Simon Josefsson 2002-07-27 21:42 ` Simon Josefsson ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Simon Josefsson @ 2002-07-27 20:45 UTC (permalink / raw) Jonas Steverud <d4jonas@dtek.chalmers.se> writes: > Simon Josefsson <jas@extundo.com> writes: > >> Jonas Steverud <d4jonas@dtek.chalmers.se> writes: > [...] >>> As ÅÄÖ. >> >> I'm not sure that's correct. What kind of MULE configuration do you >> have in .emacs? > > Huh? > > Is it > (set-language-environment "Latin-1") > you talk of? Yes, and other similar stuff. Evaluating that expression doesn't change the behaviour for me though, I still get octal sequences in C-u g buffers. Pinpointing the code that changes this for you could be a lead in solving the original problem. What do other people see? Perhaps it is my configuration.. > But set-language-environment is maybe not needed anylonger? The C locale is used to guess the language environment, I think, so usually it is not needed. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-07-27 20:45 ` Simon Josefsson @ 2002-07-27 21:42 ` Simon Josefsson 2002-07-28 8:46 ` Jonas Steverud 2002-07-28 14:04 ` Reiner Steib 2 siblings, 0 replies; 27+ messages in thread From: Simon Josefsson @ 2002-07-27 21:42 UTC (permalink / raw) Simon Josefsson <jas@extundo.com> writes: >>>> As ÅÄÖ. >>> >>> I'm not sure that's correct. What kind of MULE configuration do you >>> have in .emacs? >> >> Huh? >> >> Is it >> (set-language-environment "Latin-1") >> you talk of? (standard-display-european 1) has the effect you mention. If you have that in .emacs, consider removing it. I'll see if I can fix it so that s-d-e doesn't break Gnus though. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-07-27 20:45 ` Simon Josefsson 2002-07-27 21:42 ` Simon Josefsson @ 2002-07-28 8:46 ` Jonas Steverud 2002-07-28 14:04 ` Reiner Steib 2 siblings, 0 replies; 27+ messages in thread
From: Jonas Steverud @ 2002-07-28 8:46 UTC (permalink / raw)

Simon Josefsson <jas@extundo.com> writes:

> Jonas Steverud <d4jonas@dtek.chalmers.se> writes:
[...]
>> Is it
>> (set-language-environment "Latin-1")
>> you talk of?
>
> Yes, and other similar stuff.

As far as I know and can find, s-l-e is the only such function I call,
apart from a (require 'latin-1).

> Perhaps it is my configuration..

My problem is that my .emacs loads a great many files, so it is hard
to debug, but I don't think I have anything more than the above.

My locale is set to:

LANG=
LC_CTYPE=iso_8859_1
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

I ssh from Mac OS X, running OroborOSX 0.8preview2 as window manager,
to a Solaris system. But that's hardly the problem.

Very few messages look like this, so it is not a big problem, but it
would be nice to fix it anyway.

--
( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei )
( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do )
* Re: Decoding ISO8859-15 -- offending headers 2002-07-27 20:45 ` Simon Josefsson 2002-07-27 21:42 ` Simon Josefsson 2002-07-28 8:46 ` Jonas Steverud @ 2002-07-28 14:04 ` Reiner Steib 2002-07-29 11:44 ` Simon Josefsson 2 siblings, 1 reply; 27+ messages in thread From: Reiner Steib @ 2002-07-28 14:04 UTC (permalink / raw) On Sat, Jul 27 2002, Simon Josefsson wrote: > I still get octal sequences in C-u g buffers. Pinpointing the code > that changes this for you could be a lead in solving the original > problem. What do other people see? I don't see any octal sequences. I see the normal Latin-1 characters (`M-x describe-char-after RET' gives `eight-bit-graphic' in the raw article, not `latin-iso8859-1'). Oort Gnus v0.07, Emacs/21.1, (set-language-environment "Latin-1") Bye, Reiner. -- ,,, (o o) ---ooO-(_)-Ooo--- PGP key available via WWW http://rsteib.home.pages.de/ ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-07-28 14:04 ` Reiner Steib @ 2002-07-29 11:44 ` Simon Josefsson 2002-07-29 14:13 ` Reiner Steib 0 siblings, 1 reply; 27+ messages in thread
From: Simon Josefsson @ 2002-07-29 11:44 UTC (permalink / raw)

Reiner Steib <4uce.02.r.steib@gmx.net> writes:

> On Sat, Jul 27 2002, Simon Josefsson wrote:
>
>> I still get octal sequences in C-u g buffers. Pinpointing the code
>> that changes this for you could be a lead in solving the original
>> problem. What do other people see?
>
> I don't see any octal sequences. I see the normal Latin-1 characters
> (`M-x describe-char-after RET' gives `eight-bit-graphic' in the raw
> article, not `latin-iso8859-1').
>
> Oort Gnus v0.07, Emacs/21.1, (set-language-environment "Latin-1")

Interesting, I get the output below. However, the Ä is an octal
sequence in the *Help* buffer, but a proper glyph when I cut'n'paste it
into this buffer.

character: Ä (0304, 196, 0xc4)
charset: eight-bit-graphic (8-bit graphic char (0xA0..0xFF))
code point: 196
syntax: which means: whitespace
category:
buffer code: 0xC4
file code: not encodable by coding system nil
font: -Adobe-Courier-Medium-R-Normal--17-120-100-100-M-100-ISO8859-1
* Re: Decoding ISO8859-15 -- offending headers 2002-07-29 11:44 ` Simon Josefsson @ 2002-07-29 14:13 ` Reiner Steib 2002-07-31 15:35 ` Simon Josefsson 0 siblings, 1 reply; 27+ messages in thread
From: Reiner Steib @ 2002-07-29 14:13 UTC (permalink / raw)

On Mon, Jul 29 2002, Simon Josefsson wrote:

> Reiner Steib <4uce.02.r.steib@gmx.net> writes:
[...]
>> I don't see any octal sequences. I see the normal Latin-1 characters
>> (`M-x describe-char-after RET' gives `eight-bit-graphic' in the raw
>> article, not `latin-iso8859-1').
>>
>> Oort Gnus v0.07, Emacs/21.1, (set-language-environment "Latin-1")

I found that my locale is responsible for this (namely
LC_CTYPE=en_US.ISO_8859-1, all other are C or POSIX). With LC_CTYPE=C,
I see \304 too.

Emacs 21.1 (--no-site-file, minimal ~/.{emacs,gnus}), Oort:
- LC_CTYPE=C ==> displayed as \304
- LC_CTYPE=en_US.ISO_8859-1 ==> displayed as Ä
(eight-bit-graphic in both cases)

> Interesting, I get the below. However, the Ä is a octal sequence in
> the *Help* buffer, but a proper glyph when I cut'n'paste it into this
> buffer.

Same here, and in the message buffer, I have »charset: latin-iso8859-1«.

Bye, Reiner.
--
       ,,,
      (o o)
---ooO-(_)-Ooo--- PGP key available via WWW http://rsteib.home.pages.de/
* Re: Decoding ISO8859-15 -- offending headers 2002-07-29 14:13 ` Reiner Steib @ 2002-07-31 15:35 ` Simon Josefsson 2002-08-01 6:37 ` Jonas Steverud 2002-08-18 8:42 ` Jonas Steverud 0 siblings, 2 replies; 27+ messages in thread From: Simon Josefsson @ 2002-07-31 15:35 UTC (permalink / raw) Reiner Steib <4uce.02.r.steib@gmx.net> writes: > On Mon, Jul 29 2002, Simon Josefsson wrote: > >> Reiner Steib <4uce.02.r.steib@gmx.net> writes: > [...] >>> I don't see any octal sequences. I see the normal Latin-1 characters >>> (`M-x describe-char-after RET' gives `eight-bit-graphic' in the raw >>> article, not `latin-iso8859-1'). >>> >>> Oort Gnus v0.07, Emacs/21.1, (set-language-environment "Latin-1") > > I found that my locale is responsible for this (namely > LC_CTYPE=en_US.ISO_8859-1, all other are C or POSIX). With LC_CTYPE=C, > I see \304 too. > > Emacs 21.1 (--no-site-file, minimal ~/.{emacs,gnus}), Oort: > - LC_CTYPE=C ==> displayed as \304 > - LC_CTYPE=en_US.ISO_8859-1 ==> displayed as Ä > (eight-bit-graphic in both cases) Ok. Jonas, does using a non-ISO8859-1 locale modify the behaviour for your original problem? What locale settings do you use? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-07-31 15:35 ` Simon Josefsson @ 2002-08-01 6:37 ` Jonas Steverud 2002-08-18 8:42 ` Jonas Steverud 1 sibling, 0 replies; 27+ messages in thread
From: Jonas Steverud @ 2002-08-01 6:37 UTC (permalink / raw)

Simon Josefsson <jas@extundo.com> writes:

[...]
> Ok. Jonas, does using a non-ISO8859-1 locale modify the behaviour for
> your original problem? What locale settings do you use?

LANG=
LC_CTYPE=iso_8859_1
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

I will not be able to access my email from Saturday and for about a
week after that, and I do not think I will have time to check before
then. I'll put it on my to-do list.

--
( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei )
( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do )
* Re: Decoding ISO8859-15 -- offending headers 2002-07-31 15:35 ` Simon Josefsson 2002-08-01 6:37 ` Jonas Steverud @ 2002-08-18 8:42 ` Jonas Steverud 2002-08-18 10:27 ` Simon Josefsson 1 sibling, 1 reply; 27+ messages in thread
From: Jonas Steverud @ 2002-08-18 8:42 UTC (permalink / raw)

Simon Josefsson <jas@extundo.com> writes:

[...]
> Ok. Jonas, does using a non-ISO8859-1 locale modify the behaviour for
> your original problem? What locale settings do you use?

I don't remember what I have answered, and the mailing list archive's
Glimpse index "was not found" when I searched at gnus.org, so I can't
check which article (the message-id) it was that offended Gnus.

My locales are as follows.

On the Solaris machine where Emacs lives:

> locale
LANG=
LC_CTYPE=iso_8859_1
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

I ssh from a Mac OS X machine running XFree86 with OroborOSX 0.8b2 as
WM, and there I have:

> env | egrep '^L[A,C]'
LANG=sv_SE
LC_CTYPE=iso_8859_1

Since I don't know the offending article I cannot (yet) check how the
locale affects Gnus. I do plan to drop the issue since it is not a
great matter - *very* few articles are a problem.

What is a greater problem (as a side note) is the \201 that is printed
in the Groups buffer for 0.06 and 0.07. (See other mails regarding
this.) They prevent me from upgrading... :-/ (Since they are so ugly.)

--
( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei )
( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do )
* Re: Decoding ISO8859-15 -- offending headers 2002-08-18 8:42 ` Jonas Steverud @ 2002-08-18 10:27 ` Simon Josefsson 2002-08-18 10:46 ` Jonas Steverud 0 siblings, 1 reply; 27+ messages in thread From: Simon Josefsson @ 2002-08-18 10:27 UTC (permalink / raw) Jonas Steverud <d4jonas@dtek.chalmers.se> writes: > Since I don't know the offending article I cannot (yet) check how the > locale affects Gnus. I do plan to drop the issue since it is not a > great matter - *very* few articles are a problem. I haven't been able to reproduce it yet, but I'll try with your locale settings. > What is a greater problem (as a side note) is the \201 that is printed > in the Groups buffer for 0.06 and 0.07. (See other mails regarding > this.) They prevents me from upgrading... :-/ (Since they are so ugly.) I see them too, but I don't know where to start looking. It would be a big help if you could do a binary search between working and non-working CVS until you find the buggy patch. ("cvs -D ...") ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-08-18 10:27 ` Simon Josefsson @ 2002-08-18 10:46 ` Jonas Steverud 2002-08-18 11:08 ` Simon Josefsson 0 siblings, 1 reply; 27+ messages in thread
From: Jonas Steverud @ 2002-08-18 10:46 UTC (permalink / raw)

Simon Josefsson <jas@extundo.com> writes:

> Jonas Steverud <d4jonas@dtek.chalmers.se> writes:
[...]
>> What is a greater problem (as a side note) is the \201 that is printed
>> in the Groups buffer for 0.06 and 0.07. (See other mails regarding
>> this.) They prevents me from upgrading... :-/ (Since they are so ugly.)
>
> I see them too, but I don't know where to start looking. It would be
> a big help if you could do a binary search between working and
> non-working CVS until you find the buggy patch. ("cvs -D ...")

:-D

I've been thinking of doing a diff between the 0.05 and 0.06 dirs, but
I expect the diffs to be very large. I do not have cvs set up, and I
think it would take a little too much learning on my part for that
approach to be effective.

Hmm... Maybe I should do that directory diff after all...

What is the function that prints the contents to the Group buffer
named? It would help me somewhat to debug this.

--
( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei )
( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do )
* Re: Decoding ISO8859-15 -- offending headers 2002-08-18 10:46 ` Jonas Steverud @ 2002-08-18 11:08 ` Simon Josefsson 2002-08-18 11:38 ` Jonas Steverud 2002-08-19 3:14 ` Daiki Ueno 0 siblings, 2 replies; 27+ messages in thread
From: Simon Josefsson @ 2002-08-18 11:08 UTC (permalink / raw)

Jonas Steverud <d4jonas@dtek.chalmers.se> writes:

> What is the function that prints the contents to the Group buffer
> named? It would help me somewhat to debug this.

I think I found the patch. I'll boot XEmacs to see what is breaking.
Masatoshi, do you remember the details?

        * gnus-group.el (gnus-group-name-decode): Don't test
        multibyte-string, because it breaks XEmacs.

Index: gnus-group.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/gnus-group.el,v
retrieving revision 6.68
retrieving revision 6.69
diff -u -p -r6.68 -r6.69
--- gnus-group.el 2002/02/20 00:15:31 6.68
+++ gnus-group.el 2002/02/20 16:15:00 6.69
@@ -1030,8 +1030,7 @@ The following commands are available:
            result)))
 
 (defun gnus-group-name-decode (string charset)
-  (if (and string charset (featurep 'mule)
-           (not (mm-multibyte-string-p string)))
+  (if (and string charset (featurep 'mule))
       (mm-decode-coding-string string charset)
     string))
* Re: Decoding ISO8859-15 -- offending headers 2002-08-18 11:08 ` Simon Josefsson @ 2002-08-18 11:38 ` Jonas Steverud 2002-08-19 3:14 ` Daiki Ueno 1 sibling, 0 replies; 27+ messages in thread From: Jonas Steverud @ 2002-08-18 11:38 UTC (permalink / raw) Simon Josefsson <jas@extundo.com> writes: > Jonas Steverud <d4jonas@dtek.chalmers.se> writes: > >> What is the function that prints the contents to the Group buffer >> named? It would help me somewhat to debug this. > > I think I found the patch. So do I. [...] > (defun gnus-group-name-decode (string charset) > - (if (and string charset (featurep 'mule) > - (not (mm-multibyte-string-p string))) I reverted that change in 0.07 and now it works! Great! I consider the bug to be found. -- ( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei ) ( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do ) ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-08-18 11:08 ` Simon Josefsson 2002-08-18 11:38 ` Jonas Steverud @ 2002-08-19 3:14 ` Daiki Ueno 2002-08-19 6:53 ` Jonas Steverud 2002-08-19 9:12 ` Simon Josefsson 1 sibling, 2 replies; 27+ messages in thread
From: Daiki Ueno @ 2002-08-19 3:14 UTC (permalink / raw)
Cc: TSUCHIYA Masatoshi

>>>>> In <iluptwg2zlg.fsf@latte.josefsson.org>
>>>>> Simon Josefsson <jas@extundo.com> wrote:

> I think I found the patch. I'll boot XEmacs to see what is breaking.
> Masatoshi, do you remember the details?

> * gnus-group.el (gnus-group-name-decode): Don't test
> multibyte-string, because it breaks XEmacs.

I should comment on this because I suggested that he remove the
multibyte check. My real motivation is that, without the patch, I can't
read any Japanese newsgroups whose names are encoded in UTF-8[1].

I'll show you why the old group name encoder/decoder went wrong for
both GNU Emacs and XEmacs-MULE.

XEmacs-MULE: Without the patch, all the group names will be decoded
twice because the current implementation of `mm-multibyte-string-p'
for XEmacs is just an alias to `ignore'. Unfortunately, we can't
implement a working multibyte-string-p for XEmacs-MULE because
XEmacs-MULE uses a Latin-1 character to represent a byte in the range
from 160 to 255. See the last paragraph of
"(internals)Character Sets" for details.

GNU Emacs: nntp-server-buffer is marked as multibyte. A string picked
from the buffer is always regarded as multibyte unless it contains
only ASCII characters.

Footnotes:
[1] Section 5.5. in USEFOR draft:
<http://www.ietf.org/internet-drafts/draft-ietf-usefor-article-07.txt>

--
Daiki Ueno
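The double-decoding hazard described above can be sketched outside Emacs. The Python below is only an analogy under stated assumptions. Python's `bytes`/`str` split stands in for unibyte/multibyte strings, raw bytes held as Latin-1 characters stand in for the XEmacs-MULE representation, and the helper names (`decode_group_name`, `decode_unguarded`) and the group name are hypothetical. It does not reproduce how either Emacs behaves internally.

```python
def decode_group_name(name, charset):
    """Guarded decode: only raw bytes are decoded, decoded text passes through."""
    if isinstance(name, bytes):
        return name.decode(charset)
    return name


def decode_unguarded(name_chars: str, charset: str) -> str:
    """Unguarded decode for a representation that stores bytes as Latin-1 chars.

    Nothing in the value says whether it was already decoded, so every call
    decodes again.
    """
    return name_chars.encode("latin-1").decode(charset, errors="replace")


raw = "se.test.åäö".encode("utf-8")  # hypothetical UTF-8 group name

# With a type-level guard, decoding twice is harmless.
once = decode_group_name(raw, "utf-8")
twice = decode_group_name(once, "utf-8")
assert once == twice

# Without a guard, the second pass corrupts the name.
as_chars = raw.decode("latin-1")             # bytes held as Latin-1 characters
first = decode_unguarded(as_chars, "utf-8")  # correct after one pass
second = decode_unguarded(first, "utf-8")    # decoded again: now damaged
print(first, second)
```

The guard works only because the two states are distinguishable. Where raw bytes and decoded characters look alike, a check of this kind has nothing to test, which is the situation described for XEmacs-MULE above.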
* Re: Decoding ISO8859-15 -- offending headers 2002-08-19 3:14 ` Daiki Ueno @ 2002-08-19 6:53 ` Jonas Steverud 2002-08-19 8:54 ` Kai Großjohann 2002-08-19 9:12 ` Simon Josefsson 1 sibling, 1 reply; 27+ messages in thread
From: Jonas Steverud @ 2002-08-19 6:53 UTC (permalink / raw)

Daiki Ueno <ueno@unixuser.org> writes:

[...]
> I'll show you why the old group name encoder/decoder went wrong for
> both GNU Emacs and XEmacs-MULE.

I.e. it seems like the code has to check which version of Emacs is
running. Not a very beautiful solution, IMNSHO.

The reason I brought up the issue is that nnfolder groups are attacked
by \201 in the name, so would it be a better solution to do the
multibyte processing in the nnfolder code before the group names are
sent to the higher levels? Gnus should expect to find non-A-Z
characters in the group names of the mail folders and process them
accordingly.

From a data arbritation (sp?) point of view, I would say that the best
place to process the various characters is in the lower levels, and
then have all backends return multibyte (or whatever) and send it to
the buffer, where the buffer display code converts it to whatever the
user can handle.

Please consider that I'm no expert on the internal functionality of
Gnus.

--
( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei )
( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do )
* Re: Decoding ISO8859-15 -- offending headers 2002-08-19 6:53 ` Jonas Steverud @ 2002-08-19 8:54 ` Kai Großjohann 2002-08-19 9:13 ` Simon Josefsson 0 siblings, 1 reply; 27+ messages in thread From: Kai Großjohann @ 2002-08-19 8:54 UTC (permalink / raw) Jonas Steverud <d4jonas@dtek.chalmers.se> writes: > Daiki Ueno <ueno@unixuser.org> writes: > > [...] >> I'll show you why the old group name encoder/decoder went wrong for >> both GNU Emacs and XEmacs-MULE. > > I.e. it seems like the code has to check for which version of Emacs > that is running. I wonder if Dave's Gnus changes in the Emacs CVS branch emacs-unicode address this problem. He said that he tried to make Gnus less dependent on the internal encoding used. kai -- A large number of young women don't trust men with beards. (BFBS Radio) ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-08-19 8:54 ` Kai Großjohann @ 2002-08-19 9:13 ` Simon Josefsson 0 siblings, 0 replies; 27+ messages in thread From: Simon Josefsson @ 2002-08-19 9:13 UTC (permalink / raw) Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes: > I wonder if Dave's Gnus changes in the Emacs CVS branch emacs-unicode > address this problem. He said that he tried to make Gnus less > dependent on the internal encoding used. The functions involved look the same in the unicode branch, at least when I looked quickly. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-08-19 3:14 ` Daiki Ueno 2002-08-19 6:53 ` Jonas Steverud @ 2002-08-19 9:12 ` Simon Josefsson 2002-08-19 11:05 ` Kai Großjohann 1 sibling, 1 reply; 27+ messages in thread From: Simon Josefsson @ 2002-08-19 9:12 UTC (permalink / raw) Cc: ding, TSUCHIYA Masatoshi Daiki Ueno <ueno@unixuser.org> writes: >>>>>> In <iluptwg2zlg.fsf@latte.josefsson.org> >>>>>> Simon Josefsson <jas@extundo.com> wrote: > >> I think I found the patch. I'll boot XEmacs to see what is breaking. >> Masatoshi, do you remember the details? > >> * gnus-group.el (gnus-group-name-decode): Don't test >> multibyte-string, because it breaks XEmacs. > > I should comment on this because I suggested him to remove the multibyte > check. And, for my real intention, I can't read any Japanese newsgroups > whose names are encoded in UTF-8[1] without the patch. Right, it seem to break my UTF-8 nntp groups too. It also makes the .newsrc.eld contain different codings of the group name depending on if I use Emacs or XEmacs. Not good. > I'll show you why the old group name encoder/decoder went wrong for > both GNU Emacs and XEmacs-MULE. > > XEmacs-MULE: Without the patch, all the group names will be decoded > twice because the current implementation of `mm-multibyte-string-p' > for XEmacs is just an alias to `ignore'. Unfortunately, we can't > implement working multibyte-string-p for XEmacs-MULE because > XEmacs-MULE uses a Latin-1 character so as to represent a byte in the > range from 160 to 255. See the last paragraph of > "(internals)Character Sets" for details. > > GNU Emacs: nntp-server-buffer is marked as multibyte. A string picked > from the buffer is always regarded as multibyte, unless it doesn't > contain a non-ASCII character. Ok. What would the solution be? Make all the backends return multibyte group strings? Right now it seems nnfolder (and nnimap) uses unibyte, and nntp uses multibyte, which won't work. 
* Re: Decoding ISO8859-15 -- offending headers 2002-08-19 9:12 ` Simon Josefsson @ 2002-08-19 11:05 ` Kai Großjohann 2002-08-19 14:28 ` Simon Josefsson 0 siblings, 1 reply; 27+ messages in thread From: Kai Großjohann @ 2002-08-19 11:05 UTC (permalink / raw) Cc: ding, TSUCHIYA Masatoshi Simon Josefsson <jas@extundo.com> writes: > It also makes the .newsrc.eld contain different codings of the group > name depending on if I use Emacs or XEmacs. Not good. I think it would be good if Gnus always used emacs-mule as the encoding for .newsrc.eld. What do you think? kai -- A large number of young women don't trust men with beards. (BFBS Radio) ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-08-19 11:05 ` Kai Großjohann @ 2002-08-19 14:28 ` Simon Josefsson 2002-08-19 17:02 ` Kai Großjohann 0 siblings, 1 reply; 27+ messages in thread From: Simon Josefsson @ 2002-08-19 14:28 UTC (permalink / raw) Cc: ding, TSUCHIYA Masatoshi Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes: > Simon Josefsson <jas@extundo.com> writes: > >> It also makes the .newsrc.eld contain different codings of the group >> name depending on if I use Emacs or XEmacs. Not good. > > I think it would be good if Gnus always used emacs-mule as the > encoding for .newsrc.eld. It has a emacs-lisp cookie, so I think Emacs does this (or doesn't emacs store elisp in emacs-mule encoding?). > What do you think? Won't work when I use XEmacs without MULE. Saving and reading .newsrc.eld in an interoperable encoding, preferably even standardized, is the only acceptable solution, I think, but the set to chose from is empty and will be for many years too. I'd love to be mistaken here though. Btw, how about making Gnus put all of the recipient addresses in To: instead of messing up the headers like it does in this mail? It looks as if I'm addressing Daiki here (or is that Ueno?). ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-08-19 14:28 ` Simon Josefsson @ 2002-08-19 17:02 ` Kai Großjohann 2002-08-19 17:13 ` Simon Josefsson 0 siblings, 1 reply; 27+ messages in thread From: Kai Großjohann @ 2002-08-19 17:02 UTC (permalink / raw) Cc: ding, TSUCHIYA Masatoshi Simon Josefsson <jas@extundo.com> writes: > Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes: > >> Simon Josefsson <jas@extundo.com> writes: >> >>> It also makes the .newsrc.eld contain different codings of the group >>> name depending on if I use Emacs or XEmacs. Not good. >> >> I think it would be good if Gnus always used emacs-mule as the >> encoding for .newsrc.eld. > > It has a emacs-lisp cookie, so I think Emacs does this (or doesn't > emacs store elisp in emacs-mule encoding?). Hm? Why should the mode imply any encoding? I used to frob the encoding of ~/.gnus between latin-1, utf-8, and latin-9, for example. kai -- A large number of young women don't trust men with beards. (BFBS Radio) ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Decoding ISO8859-15 -- offending headers 2002-08-19 17:02 ` Kai Großjohann @ 2002-08-19 17:13 ` Simon Josefsson 0 siblings, 0 replies; 27+ messages in thread From: Simon Josefsson @ 2002-08-19 17:13 UTC (permalink / raw) Cc: ding, TSUCHIYA Masatoshi Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes: > Simon Josefsson <jas@extundo.com> writes: > >> Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes: >> >>> Simon Josefsson <jas@extundo.com> writes: >>> >>>> It also makes the .newsrc.eld contain different codings of the group >>>> name depending on if I use Emacs or XEmacs. Not good. >>> >>> I think it would be good if Gnus always used emacs-mule as the >>> encoding for .newsrc.eld. >> >> It has a emacs-lisp cookie, so I think Emacs does this (or doesn't >> emacs store elisp in emacs-mule encoding?). > > Hm? Why should the mode imply any encoding? I used to frob the > encoding of ~/.gnus between latin-1, utf-8, and latin-9, for example. You are right, I was thinking of *.elc (which is loaded as emacs-mule) and confused things, sorry. ^ permalink raw reply [flat|nested] 27+ messages in thread