From: Jesper Harder
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Gnus: UTF-8 and compatibility with other MUAs
Date: Sun, 17 Aug 2003 02:57:25 +0200

[...] writes:

> Jesper Harder writes:
>
>> But the purpose of email is to _communicate_. Why lower your chance
>> of communicating if there is no compelling technical reason to do
>> so?
>
> Now and then I notice on German Usenet that a few people who post
> replies to my articles cannot deal with UTF-8, because when they
> quote the text I wrote, I see funny characters instead of umlauts.
> This is not a big impediment to communication.

It is a big impediment, believe me. A long time ago I used to read
Usenet by TELNETting from a Norsk Data terminal to an overloaded
Ultrix box. Needless to say, this setup could not display any 8-bit
characters (the eighth bit was stripped). Reading Danish was so
annoying that I didn't use dk.* for many years.

Also remember that not everyone can say "Okay, I'll just upgrade to
something Unicode-capable". If you're using a shared system you
probably don't have the power to decide that.

> If we don't move to Unicode in the future then coding system
> problems will go on forever and ever.

It would be foolish not to use Unicode for any _new_ protocols or
formats. But for legacy systems like email and Usenet, backward
compatibility is really, really important.

If you look at how e.g. MIME or format=flowed was designed, you'll
see that a lot of effort and thought was spent on minimizing negative
effects for existing clients.

You need an especially good excuse to break existing stuff. The fact
that Unicode is a technically more pleasing solution just isn't a
good enough reason to break things unnecessarily, IMHO.

But if you're doing something that wasn't possible before, say, using
German and Thai in the same message, that's a valid reason to use
Unicode.
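
The policy argued for above (stay with a legacy charset whenever it
suffices, and fall back to Unicode only when the text demands it) can
be sketched in a few lines of Python. This is only an illustration:
the function name `pick_charset` and the preference list are made up
for this example, and it is not how Gnus or any other MUA actually
implements charset selection.

```python
# Hypothetical preference order: narrowest legacy charset first,
# UTF-8 only as the last resort.
PREFERRED_CHARSETS = ("us-ascii", "iso-8859-1", "iso-8859-15", "utf-8")


def pick_charset(text, candidates=PREFERRED_CHARSETS):
    """Return the first charset in `candidates` that can encode `text`."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # charset lacks some character; try the next one
        return charset
    raise ValueError("no candidate charset can encode the text")


if __name__ == "__main__":
    print(pick_charset("plain text"))          # us-ascii
    print(pick_charset("Rødgrød med fløde"))   # iso-8859-1
    print(pick_charset("Grüße und สวัสดี"))      # utf-8
```

With such an ordering, a purely Danish or German message still goes
out in a charset that older clients can display, and only a message
mixing, say, German and Thai is sent as UTF-8.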

> My guess -- by the way -- is that Unicode will become increasingly
> important in Europe, especially for the members of the EU. We'd need
> at least Latin-1/Latin-9, Latin-2 and Greek (ISO 8859-7). And I am not
> sure if that already covers Latvian, Romanian and others. There will
> be a growing need for an encoding that covers all of these languages.

I think most Western European users don't care about, and don't know
how to access, any glyph that isn't printed on the keyboard.

>> 2. Unicode support itself doesn't really buy me a lot if most people
>>    don't have fairly complete Unicode fonts (which they don't).
>
> So the worst thing that could happen is that they see a hollow box now
> and then.

An empty box can be bad enough. If you're writing an equation it can
be really important what that empty box happens to be ☺

I experienced that problem recently when I used ℏ in a message.

> And yet some characters are more frequent than others. You can
> probably rely on the fact that western Europeans have fonts that
> contain the Latin-1 repertoire. Box drawing characters or symbols
> may not be that frequent, but there is a good chance of getting the
> additional punctuation characters.

In practice the only thing you can reasonably expect is the roughly
650 glyphs in WGL4.¹

¹ http://partners.adobe.com/asn/tech/type/opentype/appendices/wgl4.jsp
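
To see in advance which characters in a message might turn into empty
boxes for a reader limited to a given repertoire, a small check along
these lines could help. It is only a sketch: `outside_repertoire` is a
hypothetical helper, and Latin-1 is used as the default repertoire
simply because Python ships a codec for it; checking against WGL4
would need a glyph list that is not included here.

```python
import unicodedata


def outside_repertoire(text, charset="iso-8859-1"):
    """Return the distinct characters of `text` that `charset` cannot
    encode, in order of first appearance."""
    missing = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    # Print the code point and Unicode name of each problem character.
    for ch in outside_repertoire("E = ℏω, naïve ☺"):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}")
```

An empty result means every character is in the chosen repertoire;
anything listed is a candidate for an empty box on the reader's side.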