Gnus development mailing list
From: Oliver Scholz <alkibiades@gmx.de>
Subject: Re: Gnus: UTF-8 and compatibility with other MUAs
Date: Fri, 15 Aug 2003 20:10:56 +0200
Message-ID: <uhe4in64v.fsf@ID-87814.user.dfncis.de>
In-Reply-To: <m3isoy50kt.fsf@defun.localdomain>

Jesper Harder <harder@myrealbox.com> writes:

> Oliver Scholz <alkibiades@gmx.de> writes:
>
>> The lowest common denominator for most German text is ISO
>> 646-DE. For most Danish text (I presume) ISO 646-DK. Virtually
>> nobody uses those coding systems anymore, and IMNSHO nobody should
>> use them.
>
> The RFC does say that ISO-8859 is preferred over ISO 646:
>
>    Note that the ISO 646 character sets have deliberately been omitted
>    in favor of their 8859 replacements, which are the designated
>    character sets for Internet mail.
>

Hmm. I guess it's time for me to finally read RFC 2046 ...

>> Taken literally nobody should use ISO 8859-15 then, unless the
>> message really contains an € (or one of the other 7
>> characters). 
>
> I agree with that.  I don't see _any_ reason to use latin-9 if you
> don't need it.  Some MUAs don't support latin-9 (including older
> versions of Gnus) -- why break those clients for no good reason?

Well, I think, if you want to maximize the chance that your message
is flawlessly readable at the other end, this makes sense as a
pragmatic rule.
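
To make the pragmatic rule concrete, here is a minimal Python sketch
(not Gnus code; the names DIFFERING_BYTES, LATIN9_ONLY and
needs_latin9 are made up for illustration). It lists the positions
where Latin-9 departs from Latin-1 and checks whether a message
actually contains one of those characters:

```python
# Byte values at which ISO-8859-1 and ISO-8859-15 decode to
# different characters (only the upper half can differ).
DIFFERING_BYTES = [
    b for b in range(0xA0, 0x100)
    if bytes([b]).decode("iso-8859-1") != bytes([b]).decode("iso-8859-15")
]

# The characters that Latin-9 puts at those positions.
LATIN9_ONLY = {bytes([b]).decode("iso-8859-15") for b in DIFFERING_BYTES}


def needs_latin9(text):
    """Return True if text contains a character that Latin-9 has but Latin-1 lacks.

    This says nothing about characters outside both sets; those would
    need a larger charset anyway.
    """
    return any(ch in LATIN9_ONLY for ch in text)


if __name__ == "__main__":
    print(len(DIFFERING_BYTES), sorted(LATIN9_ONLY))
    print(needs_latin9("Preis: 5 €"), needs_latin9("Grüße aus Köln"))
```

Under this rule a message would only be labelled ISO-8859-15 when
needs_latin9 returns true.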

As a technical rule, however, one that decides whether a message is
fully RFC-compliant or not, it does not make sense.

BTW, if the rule were that we should use the smallest, most widely
used coded character set which covers all the necessary characters in
a message, then western European users should use neither Latin-1 nor
Latin-9, but windows-1252.
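
A rough way to check the coverage claim, as a sketch only: the helper
names graphic_chars and covers are invented here, and I assume
Python's "cp1252" codec as a stand-in for windows-1252. It tests
whether the graphic characters of Latin-1 and Latin-9 (octets 0xA0 to
0xFF) can all be encoded in windows-1252:

```python
def graphic_chars(codec):
    """Characters that codec assigns to the octets 0xA0..0xFF."""
    return {bytes([b]).decode(codec) for b in range(0xA0, 0x100)}


def covers(chars, codec):
    """Return True if every character in chars can be encoded with codec."""
    try:
        "".join(sorted(chars)).encode(codec)
    except UnicodeEncodeError:
        return False
    return True


LATIN1 = graphic_chars("iso-8859-1")
LATIN9 = graphic_chars("iso-8859-15")

if __name__ == "__main__":
    print("cp1252 covers both:", covers(LATIN1 | LATIN9, "cp1252"))
    print("Latin-1 covers Latin-9:", covers(LATIN9, "iso-8859-1"))
    print("Latin-9 covers Latin-1:", covers(LATIN1, "iso-8859-15"))
```

Whether windows-1252 is also the "most widely used" of the three is a
separate question that this sketch does not answer.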

However, from the section you quoted alone it is not entirely clear
whether it refers to abstract characters, code points in a coded
character set, or octets in a character encoding scheme. The term
“character set” may seem to indicate that they are talking about coded
character sets, but RFC 2046 refers to RFC 2045 for the definition of
the term “character set”. There it reads:

   NOTE: The term "character set" was originally to describe such
   straightforward schemes as US-ASCII and ISO-8859-1 which have a
   simple one-to-one mapping from single octets to single characters.
   Multi-octet coded character sets and switching techniques make the
   situation more complex. For example, some communities use the term
   "character encoding" for what MIME calls a "character set", while
   using the phrase "coded character set" to denote an abstract mapping
   from integers (not octets) to characters.

So I'd say “character set” refers to the character encoding
scheme. And in this sense the rule makes sense: if a message contains
only characters from the ASCII repertoire it should be declared as
US-ASCII, not as UTF-8, because the octets are the same either way.
But that does not extend to ISO 8859-[[:digit:]]+, since UTF-8 and
Latin-1 use different octets for every non-ASCII character.
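
The octet-level difference can be seen in a few lines of Python. This
is only an illustration; same_octets and decodes_as_utf8 are
hypothetical helper names:

```python
def same_octets(text, *codecs):
    """Return True if text encodes to identical octets in all given codecs."""
    return len({text.encode(codec) for codec in codecs}) == 1


def decodes_as_utf8(octets):
    """Return True if octets form a valid UTF-8 sequence."""
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # Pure ASCII text: the label could be any of the three.
    print(same_octets("Hello", "us-ascii", "utf-8", "iso-8859-1"))
    # One non-ASCII character: the encodings diverge.
    print("ü".encode("iso-8859-1"), "ü".encode("utf-8"))
    # Latin-1 octets are in general not even valid UTF-8.
    print(decodes_as_utf8("ü".encode("iso-8859-1")))
```

So downgrading a UTF-8 label to US-ASCII never changes the octets,
while switching between UTF-8 and Latin-1 does.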


    Oliver
-- 
28 Thermidor an 211 de la Révolution
Liberté, Egalité, Fraternité!




