Gnus development mailing list
From: Oliver Scholz <alkibiades@gmx.de>
Subject: Re: Gnus: UTF-8 and compatibility with other MUAs
Date: Sat, 16 Aug 2003 17:36:45 +0200
Message-ID: <uoeyp62cy.fsf@ID-87814.user.dfncis.de>
In-Reply-To: <m3ada9vjrl.fsf@defun.localdomain>

Jesper Harder <harder@myrealbox.com> writes:

> Oliver Scholz <alkibiades@gmx.de> writes:
>
>> If you are satisfied with a _fair_ chance to be flawlessly readable
>> at the other end, you may use UTF-8.
>
> But the purpose of email is to _communicate_.  Why lower your chance of
> communicating if there is no compelling technical reason to do so?

First of all: I am not talking about UTF-16 or UTF-7, and I am not
talking about Greek, Hebrew or Arabic. I am talking about UTF-8 for
Latin-based scripts. Even if there is no UTF-8 support at all at the
other end, communication won't fail. As things stand I would not yet
recommend UTF-8 to a Greek user, for example. Now and then I notice
on German Usenet that a few people who reply to my articles cannot
deal with UTF-8: when they quote the text I wrote, I see funny
characters instead of umlauts. This is not a big impediment to
communication. I doubt that anybody would put me into his or her
killfile because I use UTF-8.
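
A minimal Python sketch of what those "funny characters" look like,
assuming the client at the other end treats the UTF-8 bytes as
Latin-1. The sample sentence and the function name are made up for
illustration:

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


# Hypothetical sample text; any German sentence with umlauts would do.
sample = "Müller hört schöne Lieder"

# ASCII letters come through unchanged; each umlaut becomes two characters.
print(misread_as_latin1(sample))
```

The ASCII part of the text is untouched, which is why a reader without
UTF-8 support can still follow a message written in a Latin-based
script.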

And, yes, there is a technical reason why Unicode should become the
default text encoding in the future. The fact that we have a myriad of
different encodings to choose from causes a lot of trouble; just
consider how many questions there are in the various Emacs newsgroups
about coding system issues, and this is just the tip of the
iceberg. Sure, Unicode sometimes causes trouble, too. But at least one
could say that these are problems of transition. If we don't move to
Unicode in the future, then coding system problems will go on forever
and ever.

If we stick to 256-character encodings forever, then Latin-9 won't be
the last invention that we will have seen. There may be a need for a
new character in three, five, seven years. Who knows? Latin-10 is
already in its final state. What should save us from Latin-11,
Latin-12 ... Latin-N, if not a single unified encoding that is
designed to meet any need now and in the future?

My guess -- by the way -- is that Unicode will become increasingly
important in Europe, especially for the members of the EU. With 8-bit
encodings we'd need at least Latin-1/Latin-9, Latin-2 and Greek (ISO
8859-7), and I am not sure whether that even covers Latvian, Romanian
and others. There will be a growing need for a single encoding that
covers all of these languages.
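
The coverage problem can be checked with a rough Python sketch. The
helper name and the sample strings are illustrative assumptions; the
codec names are the ones Python's standard library uses for Latin-1,
Latin-2 and ISO 8859-7:

```python
def covering_encodings(text, candidates):
    """Return the candidate codecs that can encode every character of text."""
    usable = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this codec
        usable.append(name)
    return usable


legacy = ["latin-1", "iso8859-2", "iso8859-7"]

# A German name fits into Latin-1 and Latin-2, but not into the Greek set.
print(covering_encodings("Müller", legacy))

# Mix German and Greek in one text and no single 8-bit set covers it.
print(covering_encodings("Müller αβγ", legacy))

# UTF-8 covers the mixed text.
print(covering_encodings("Müller αβγ", legacy + ["utf-8"]))
```

As soon as one message mixes scripts from two of these sets, none of
the 8-bit candidates is left, and only the Unicode encoding remains.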

Then again, if you want to be absolutely sure that everything works as
expected, your only option is ASCII. Maybe Latin-1 is also
o.k. for a Western European. But every encoding that contains a Euro
sign is a big no-no.
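
A short Python sketch of why the Euro sign is fragile. Windows-1252
(`cp1252`) is brought in here only as a second example of an 8-bit
encoding with a Euro sign; it is not mentioned above:

```python
euro = "\u20ac"  # EURO SIGN

latin9_bytes = euro.encode("iso8859-15")  # Latin-9 stores it as byte 0xA4
cp1252_bytes = euro.encode("cp1252")      # Windows-1252 stores it as byte 0x80

print(latin9_bytes, cp1252_bytes)

# A reader who assumes Latin-1 sees the generic currency sign instead.
print(latin9_bytes.decode("latin-1"))

# Latin-1 itself cannot represent the Euro sign at all.
try:
    euro.encode("latin-1")
except UnicodeEncodeError:
    print("Latin-1 has no Euro sign")
```

The same character is a different byte in each encoding, so whether it
arrives intact depends entirely on the charset label being honoured at
the other end.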

I really hope for a future (however remote it may be) where I can be
sure that every text file I find on a computer is either ASCII, UTF-8
or UTF-16. When we look back then, we will regard this whole ISO
8859 soup as something as strange and weird as EBCDIC.

>> How long it will take for Unicode to become as widespread in western
>> Europe as Latin-1 is now -- I don't know. But so far it has spread
>> very rapidly.
>
> 1. Application support isn't that great.  Emacs, (La)TeX and Texinfo
>    don't support Unicode fully (those are some of the most important
>    applications as far as I'm concerned).

The Unicode support for Emacs is quite good; there may be issues with
CJK in the current released version of Emacs, but the rest works
fine. But yes, LaTeX and Texinfo (especially Texinfo) need
fixing. Even I, Unicode-Jacobite that I am, use Latin-1 for my LaTeX
stuff. But AFAIK there is some work going on, fortunately. The babel
encoding (sic!) for classical Greek (to take an example that is
important for me) is a nuisance. It is about time for LaTeX to support
Unicode.

> 2. Unicode support itself doesn't really buy me a lot if most people
>    don't have fairly complete Unicode fonts (which they don't).
[...]

So the worst thing that could happen is that they see a hollow box now
and then. And some characters are more widely available than others.
You can probably rely on western Europeans having fonts that contain
the Latin-1 repertoire. Box-drawing characters or symbols may be less
widely available, but there is a good chance that the additional
punctuation characters are there.

In the future, when UTF-8 is the default in mail and news, this
shouldn't be a problem anymore. People who read mailing lists about
classical Greek will make sure that they have a font containing
“Greek Extended”; the regulars of alt.fan.tolkien (or whatever) will
make sure that they can display Tengwar; Star Trek fans will use fonts
including Klingon; and so on.

    Oliver
-- 
29 Thermidor an 211 de la Révolution
Liberté, Egalité, Fraternité!