From: Jesper Harder
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Trouble displaying 8bit postings properly
Date: Sun, 16 Mar 2003 22:20:31 +0100
Organization: http://purl.org/harder/
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

writes:

> Can you still access this posting?
>
> Message-ID: <3E7260DE.77CC579B@gmx.de>
>
> Emacs started in a UTF-8 locale (LANG=de_DE.UTF-8 emacs), failed to
> display the umlauts properly; the umlauts are displayed correctly when
> I type 'C-u g'
>
> The same is to be observed for this article (where '1 g ISO-8859-1 RET'
> fails):

You can do `C-u W M c latin-1' instead.

> Message-ID: <3e73794b.14305425@news.cis.dfn.de>

The problem in both cases is that the articles contain octets that are
invalid in Latin-1.

1. The first seems to be slightly corrupted for some reason -- notice
   the line ending with "Preissystem begr\201át".

2. The second is actually encoded in the evil Windows-1252 charset --
   the hyphen in "Hamburg-Rostock" is an en dash.

The real solution to 2. is to have Emacs know about Windows-1252, which
it doesn't.  I think it's supported in the Unicode branch, though.

The first case is more difficult.  It's a tradeoff -- it would have
displayed correctly with the old code, which doesn't try to guess the
right encoding.  On the other hand, all latin-1 articles without a MIME
charset would be displayed incorrectly with the old code.

I think valid latin-1 articles without a charset declaration are more
common than articles with a strange corruption -- so it's probably a
reasonable tradeoff [1].  What do you think?

[1] But note that these problems only apply in a UTF-8 locale -- both
    articles display correctly in a latin-1 locale (except for the
    invalid octets, of course).
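
To make the two cases concrete, here is a minimal Python sketch.  It is
not Gnus code and says nothing about how Gnus guesses charsets.  The byte
strings are made-up samples: the first assumes that "\201" above is an
octal escape for octet 0x81 and that "á" is the Latin-1 octet 0xE1; the
second assumes the en dash is stored as the Windows-1252 octet 0x96.
The helper name `c1_octets' is hypothetical.

```python
def c1_octets(data):
    """Return the offsets of octets in the range 0x80-0x9F.

    ISO-8859-1 assigns this range to control characters, so such octets
    in ordinary text suggest corruption or a different charset.
    """
    return [i for i, b in enumerate(data) if 0x80 <= b <= 0x9F]


# Assumed samples, reconstructed from the description above.
corrupted = b"begr\x81\xe1t"        # case 1: stray octet 0x81
windows = b"Hamburg\x96Rostock"     # case 2: 0x96 is an en dash in Windows-1252

print(c1_octets(corrupted))         # offsets of the suspicious octets
print(c1_octets(windows))

# Decoded as Latin-1, 0x96 becomes the control character U+0096.
print(repr(windows.decode("latin-1")))

# Decoded as Windows-1252, the same octet becomes U+2013 EN DASH.
print(windows.decode("cp1252"))

# Python's cp1252 codec leaves 0x81 undefined, so case 1 is not rescued
# by simply switching charsets.
try:
    corrupted.decode("cp1252")
except UnicodeDecodeError as err:
    print("cp1252 cannot decode case 1:", err.reason)
```

Checking for octets in that range is only a heuristic: it flags an
article as suspicious, but it cannot tell on its own whether the cause
is Windows-1252 text or plain corruption.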