Gnus development mailing list
* Huh...? euc-jp...?
@ 2006-02-23 20:57 Steinar Bang
  2006-02-23 21:34 ` Reiner Steib
From: Steinar Bang @ 2006-02-23 20:57 UTC


Platform: Intel Pentium M, ubuntu breezy
	  GNU Emacs 21.4.1 (i386-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of 2005-05-03 on rothera, modified by Debian
	  No Gnus v0.4 (last updated February 2 2006)

When I pasted a paragraph from
 http://www.theglobeandmail.com/servlet/story/RTGAM.20060223.wxapple0223/BNStory/Front/home
the message ended up with the charset euc-jp, which I hadn't heard
about until now.

This is the paragraph, that might do the same to this posting...?
  Still, the switch to Intel is a necessary one from an engineering
  standpoint, he said, because Apple needed a way to improve
  performance per watt. Mr. Wozniak would have liked Apple to continue
  using Motorola processors, but “Intel just did a very good logic
  design.”

I guess the quotes are the culprits, but I still can't see what made
it pick euc-jp?  The HTTP headers have the content-type text/html with
no charset.  The transferred HTML document has a <meta> element that
looks like this
 <meta http-equiv="content-type" content="text/html; charset=iso-8859-1"> 

The quotes themselves are character entities, called &ldquo; and
&rdquo;. 
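
For reference, a minimal Python sketch (standard library only, not tied to Emacs or Gnus) of what those two entities resolve to. Both land at code points above U+00FF, so once decoded they cannot be represented in the iso-8859-1 charset declared by the <meta> element.

```python
import html

# Resolve the two named entities from the page source to characters.
left = html.unescape("&ldquo;")
right = html.unescape("&rdquo;")

for ch in (left, right):
    # iso-8859-1 covers U+0000..U+00FF only.
    print(f"U+{ord(ch):04X}", "fits in iso-8859-1:", ord(ch) <= 0xFF)
```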





* Re: Huh...? euc-jp...?
  2006-02-23 20:57 Huh...? euc-jp...? Steinar Bang
@ 2006-02-23 21:34 ` Reiner Steib
  2006-02-24  8:30   ` Steinar Bang
From: Reiner Steib @ 2006-02-23 21:34 UTC


On Thu, Feb 23 2006, Steinar Bang wrote:

> Platform: Intel Pentium M, ubuntu breezy
> 	  GNU Emacs 21.4.1 (i386-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of 2005-05-03 on rothera, modified by Debian
> 	  No Gnus v0.4 (last updated February 2 2006)
[...]
> This is the paragraph, that might do the same to this posting...?

Yes...

| Content-Type: text/plain; charset=euc-jp

>   Still, the switch to Intel is a necessary one from an engineering
>   standpoint, he said, because Apple needed a way to improve
>   performance per watt. Mr. Wozniak would have liked Apple to continue
>   using Motorola processors, but “Intel just did a very good logic
>   design.”
>
> I guess the quotes are the culprits, but I still can't see what made
> it pick euc-jp?  

As Emacs 21 cannot do CJK->UTF-8 unification, Emacs is unable to
convert the chars to UTF-8.  There's nothing Gnus can do about this.
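
A rough Python sketch of why euc-jp is a candidate at all. It only checks which charsets contain the curly quotes and does not reproduce Emacs 21's internal handling; the `can_encode` helper is made up for the illustration. The quotes are absent from iso-8859-1 but present in the JIS X 0208 set that euc-jp carries, so when the characters cannot be unified to UTF-8, euc-jp is one charset that still fits.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The tail of the pasted paragraph, with U+201C and U+201D around the quote.
sample = "but \u201cIntel just did a very good logic design.\u201d"

for charset in ("us-ascii", "iso-8859-1", "euc-jp", "utf-8"):
    print(charset, can_encode(sample, charset))
```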

In Emacs 22 it works correctly (see `utf-translate-cjk-mode' in the
NEWS file):

| also known as “The Wizard of Woz,” or even “the other Steve” — made 

(In Emacs 21 I get literal "\u2014" and "\u201c" here (6 chars each)
when pasting from Firefox.)

We already discussed similar problems in the German Gnus[1] and
general newsreader[2] groups when pasting from man pages.  It might
occur only with a certain combination of locales and program versions
(X server?).  Everyone with this problem was using Debian.  I
couldn't reproduce the problem on SuSE.

This posting will be in utf-8 if `utf-translate-cjk-mode' works
correctly.

Bye, Reiner.

[1] July 2004 in de.comm.software.gnus:
    <news:v93c3z1ro6.fsf@marauder.physik.uni-ulm.de>
    http://www.google.de/groups?as_umsgid=v93c3z1ro6.fsf%40marauder.physik.uni-ulm.de&hl=en

[2] January 2006 in de.comm.software.newsreader:
    <v9u0bvvtnw.fsf@marauder.physik.uni-ulm.de>
    http://www.google.de/groups?as_umsgid=v9u0bvvtnw.fsf@marauder.physik.uni-ulm.de&hl=en
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/





* Re: Huh...? euc-jp...?
  2006-02-23 21:34 ` Reiner Steib
@ 2006-02-24  8:30   ` Steinar Bang
  2006-02-24 11:37     ` Katsumi Yamaoka
From: Steinar Bang @ 2006-02-24  8:30 UTC


>>>>> Reiner Steib <reinersteib+gmane@imap.cc>:

> We already discussed similar problems in the German Gnus[1] and
> general newsreader[2] groups when pasting from man pages.  It might
> occur only with a certain combination of locales and program
> versions (X server?).  Everyone with this problem was using
> Debian.  I couldn't reproduce the problem on SuSE.

Hm... Ubuntu, which I'm using, is a variant of Debian.







* Re: Huh...? euc-jp...?
  2006-02-24  8:30   ` Steinar Bang
@ 2006-02-24 11:37     ` Katsumi Yamaoka
From: Katsumi Yamaoka @ 2006-02-24 11:37 UTC


>>>>> In <v964n5sqsr.fsf@marauder.physik.uni-ulm.de> Reiner Steib wrote:

> As Emacs 21 cannot do CJK->UTF-8 unification, Emacs is unable to
> convert the chars to UTF-8.  There's nothing Gnus can do about this.

I could reproduce the euc-jp problem using Emacs 21 in the
German language environment, and I could probably solve it by
using Mule-UCS[1] and setting the `mm-coding-system-priorities'
variable as follows:

(setq mm-coding-system-priorities '(iso-8859-1 utf-8)) ;; [2]
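
Roughly, the idea behind such a priority list can be sketched in Python as "take the first charset in the list that can represent the text". This is an illustration only, not how Gnus implements `mm-coding-system-priorities'; `pick_charset` is a hypothetical helper.

```python
from typing import Optional, Sequence


def pick_charset(text: str, priorities: Sequence[str]) -> Optional[str]:
    """Return the first charset in priorities that can encode text, else None."""
    for charset in priorities:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset lacks some character, try the next one
        return charset
    return None


quoted = "\u201cIntel just did a very good logic design.\u201d"

print(pick_charset("plain text", ["iso-8859-1", "utf-8"]))
print(pick_charset(quoted, ["iso-8859-1", "utf-8"]))
```

With iso-8859-1 listed first, plain Latin-1 text stays in iso-8859-1, while text containing the curly quotes falls through to utf-8 rather than euc-jp.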

> In Emacs 22 it works correctly (see `utf-translate-cjk-mode' in the
> NEWS file):

Yes, I think it is better to use Emacs 22, and I don't recommend
Mule-UCS strongly (note that Mule-UCS is generally useless and
somewhat harmful with Emacs 22).

[1] The official release is:
  ftp://ftp.m17n.org/pub/mule/Mule-UCS/Mule-UCS-0.84.tar.gz

The following one might not be the latest but is pretty new:
  http://www.jpl.org/ftp/pub/tmp/Mule-UCS-0.85-20040906.tar.gz
  or ftp://ftp.jpl.org/pub/tmp/Mule-UCS-0.85-20040906.tar.gz

[2] Perhaps it is necessary to use a better value than that.

>>>>> In <87oe0x5fb5.fsf@dod.no> Steinar Bang wrote:

> Hm... Ubuntu, which I'm using, is a variant of Debian.

The Emacsen I'm using are all ones I built myself on a
Fedora Core 4 system.



