Gnus development mailing list
* Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
@ 2001-07-27  5:10 Karl Eichwalder
  2001-07-27  5:20 ` Karl Eichwalder
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Karl Eichwalder @ 2001-07-27  5:10 UTC (permalink / raw)
  Cc: ding

This bug report will be sent to the Free Software Foundation,
not to your local site managers!
Please write in English, because the Emacs maintainers do not have
translators to read other languages for them.

Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list.

In GNU Emacs 21.0.104.1 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2001-06-22 on tux
configured using `configure  --prefix /gnu'
Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: C
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: de_DE.ISO-8859-1
  locale-coding-system: iso-latin-1
  default-enable-multibyte-characters: t

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

First, yes, I didn't set all the LC_ variables mentioned above.
Nevertheless assuming "nil" is wrong; they are considered to inherit
their values from LANG if not set separately.  Please, try 'locale' on
GNU/Linux.

If I start Emacs 21pre under the locale

    LANG=de_DE.ISO-8859-15

and reply to an iso-8859-1 encoded message (containing umlaut letters),
my reply message is arranged as a multipart message even if there's no
ambiguity involved.

My proposal: by default send out such a message UTF-8 encoded (maybe,
ognus does this already -- Gnus coming with Emacs 21 should do the same,
please).  Please ask, if it isn't clear enough what I intend to say.

Are there variables to control this behavior?  Just say "yes" and I'll
read again the manual ;)

-- 
ke@suse.de (work) / keichwa@gmx.net (home):              |
http://www.suse.de/~ke/                                  |      ,__o
Free Translation Project:                                |    _-\_<,
http://www.iro.umontreal.ca/contrib/po/HTML/             |   (*)/'(*)



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-07-27  5:10 Gnus coming with Emacs 21pre-release: iso-8859-{1,15} Karl Eichwalder
@ 2001-07-27  5:20 ` Karl Eichwalder
  2001-07-27  8:45 ` Eli Zaretskii
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Karl Eichwalder @ 2001-07-27  5:20 UTC (permalink / raw)
  Cc: ding

Just found the old mail by Dave again; here's some more background info:

From: Dave Love <d.love@dl.ac.uk>
Subject: Re: Unicode/Mule (Re: null-device)
To: Karl Eichwalder <keichwa@gmx.net>
Cc: rms@gnu.org, eliz@is.elta.co.il, haible@ilog.fr,
	pinard@iro.umontreal.ca, emacs-devel@gnu.org, gerd@gnu.org
Date: 22 Jul 2001 19:17:38 +0100

[...]

 KE> The consequence is, Gnus often 

[I'd dispute ‘often’.]

 KE> thinks it has to create a multipart message...  

[Is that necessarily wrong?]

I'll eval into my message buffer

(string (make-char 'latin-iso8859-1 ?\xe9)
        (make-char 'latin-iso8859-14 ?\xe9)
        (make-char 'latin-iso8859-15 ?\xe9))
  => "ééé"

You may choose not to believe me that it results in a string with
three different Emacs characters and that Gnus will post this silently
in utf-8, but it's so.  I unify on encoding to utf-8 in what might as
well be a stock Emacs 21⁴.  For just the three ‘e’s, Latin-1 could
have been chosen.

What you normally see is not a consequence of Emacs forcing anything.
It can be customized.

 KE> Yes, it will only do so if you'll enter three 'y' (yes) in a row
 KE> -- this isn't "user-friendly" (Eli).

I made Gnus fixes in this general area (_not_ on the basis of bug
reports), at least some of which aren't installed.

Footnotes: 

[...]

⁴ I know Eli disagrees.

-- 
DOMINUS ILLUMINATIO MEA


-- 
ke@suse.de (work) / keichwa@gmx.net (home):              |
http://www.suse.de/~ke/                                  |      ,__o
Free Translation Project:                                |    _-\_<,
http://www.iro.umontreal.ca/contrib/po/HTML/             |   (*)/'(*)



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-07-27  5:10 Gnus coming with Emacs 21pre-release: iso-8859-{1,15} Karl Eichwalder
  2001-07-27  5:20 ` Karl Eichwalder
@ 2001-07-27  8:45 ` Eli Zaretskii
  2001-07-27 18:22   ` Karl Eichwalder
  2001-09-01 16:30   ` Dave Love
  2001-08-04 15:46 ` Florian Weimer
  2001-09-01 16:26 ` Dave Love
  3 siblings, 2 replies; 23+ messages in thread
From: Eli Zaretskii @ 2001-07-27  8:45 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

> From: Karl Eichwalder <keichwa@gmx.net>
> Date: 27 Jul 2001 07:10:57 +0200
>
>   value of $LC_ALL: nil
>   value of $LC_COLLATE: C
>   value of $LC_CTYPE: nil
>   value of $LC_MESSAGES: nil
>   value of $LC_MONETARY: nil
>   value of $LC_NUMERIC: nil
>   value of $LC_TIME: nil
>   value of $LANG: de_DE.ISO-8859-1
>   locale-coding-system: iso-latin-1
>   default-enable-multibyte-characters: t
>
> First, yes, I didn't set all the LC_ variable mentioned above.
> Nevertheless assuming "nil" is wrong; they are considered to inherit
> their values from LANG if not set separately.

This information is for our consumption; it doesn't imply that Emacs
behaves contrary to what you expect.  LANG's value is printed, and
whoever will need this information for tracking down a bug is supposed
to know about the inheritance rules.
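
For readers who do not have the inheritance rules in their head, here is a minimal sketch of the usual POSIX precedence (LC_ALL first, then the specific category, then LANG). The helper name `demo-effective-locale' and the alist standing in for the environment are illustrative assumptions, as is the final fallback to "C"; none of this is code from Emacs itself.

```elisp
;; Sketch only: `demo-effective-locale' is a hypothetical helper,
;; not part of Emacs.  ENV stands in for the process environment.
(defun demo-effective-locale (category env)
  "Return the locale in effect for CATEGORY (e.g. \"LC_CTYPE\").
ENV is an alist of (NAME . VALUE) strings.  The lookup order is
LC_ALL, then CATEGORY, then LANG; unset or empty variables are
skipped, and \"C\" is assumed as the last-resort fallback."
  (let ((candidates (list "LC_ALL" category "LANG"))
        result)
    (while (and candidates (not result))
      (let ((value (cdr (assoc (pop candidates) env))))
        (when (and value (not (string= value "")))
          (setq result value))))
    (or result "C")))

;; The environment from the bug report: only LC_COLLATE and LANG are set.
(defvar demo-report-env
  '(("LC_COLLATE" . "C") ("LANG" . "de_DE.ISO-8859-1")))

(demo-effective-locale "LC_CTYPE" demo-report-env)   ; => "de_DE.ISO-8859-1"
(demo-effective-locale "LC_COLLATE" demo-report-env) ; => "C"
```

So a "nil" next to LC_CTYPE in the report only says the variable itself is unset; the effective value still comes from LANG.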

> If I start Emacs 21pre under the locale
> 
>     LANG=de_DE.ISO-8859-15
> 
> and reply to a iso-8859-1 encoded message (containing umlaut letters),
> my reply message is arranged as a multipart message even if there's no
> ambiguity involved.
> 
> My proposal: by default send out such a message UTF-8 encoded

This should IMHO be optional at this time, since Unicode support in
the stock Emacs 21 distribution (without add-ons such as Mule-UCS) is
limited and incomplete.  For starters, AFAIK, Emacs cannot encode
8859-15 characters as UTF-8 (see the commentary in utf-8.el) unless
those characters came from a UTF-8 encoded source to begin with, and
thus are stored in the buffer as mule-unicode-NNNN characters.
(Perhaps Gnus can do such conversions with its own code; but I'm
talking about core Emacs functionality here.)  This is not the kind of
support that we could IMHO offer users as the default.



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-07-27  8:45 ` Eli Zaretskii
@ 2001-07-27 18:22   ` Karl Eichwalder
  2001-07-27 19:18     ` Eli Zaretskii
  2001-09-01 16:27     ` Dave Love
  2001-09-01 16:30   ` Dave Love
  1 sibling, 2 replies; 23+ messages in thread
From: Karl Eichwalder @ 2001-07-27 18:22 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

"Eli Zaretskii" <eliz@is.elta.co.il> writes:

> This should IMHO be optional at this time, since Unicode support in
> the stock Emacs 21 distribution (without add-ons such as Mule-UCS) is
> limited and incomplete.  For starters, AFAIK, Emacs cannot encode
> 8859-15 characters as UTF-8 (see the commentary in utf-8.el) unless
> those characters came from a UTF-8 encoded source to begin with, and
> thus are stored in the buffer as mule-unicode-NNNN characters.

Okay, then we have to make sure to add a user option to store 8859-1 and
8859-15 (and 8859-2 and 8859-16) reply messages in the buffer as
mule-unicode-NNNN characters, please.  I'm sure I have sent out UTF-8
messages already -- all this happened behind my back and I was very
happy with it!

[Thanks for the utf-8.el pointer; I'll try to read the code.]

> (Perhaps Gnus can do such conversions with its own code; but I'm
> talking about core Emacs functionality here.)

Yes.  I have already received (polite) complaints about my strange messages
consisting of attachments only -- Netscape cannot handle multipart text
messages that well.

> This is not the kind of support that we could IMHO offer users as the
> default.

I don't mind setting a variable and telling users to do so :)

-- 
ke@suse.de (work) / keichwa@gmx.net (home):              |
http://www.suse.de/~ke/                                  |      ,__o
Free Translation Project:                                |    _-\_<,
http://www.iro.umontreal.ca/contrib/po/HTML/             |   (*)/'(*)



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-07-27 18:22   ` Karl Eichwalder
@ 2001-07-27 19:18     ` Eli Zaretskii
  2001-09-01 16:27     ` Dave Love
  1 sibling, 0 replies; 23+ messages in thread
From: Eli Zaretskii @ 2001-07-27 19:18 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

> From: Karl Eichwalder <keichwa@gmx.net>
> Date: 27 Jul 2001 20:22:27 +0200
> 
> "Eli Zaretskii" <eliz@is.elta.co.il> writes:
> 
> > This should IMHO be optional at this time, since Unicode support in
> > the stock Emacs 21 distribution (without add-ons such as Mule-UCS) is
> > limited and incomplete.  For starters, AFAIK, Emacs cannot encode
> > 8859-15 characters as UTF-8 (see the commentary in utf-8.el) unless
> > those characters came from a UTF-8 encoded source to begin with, and
> > thus are stored in the buffer as mule-unicode-NNNN characters.
> 
> Okay, than we've to make sure to add an user option to store 8859-1 and
> 8859-15 (and 8859-2 and 8859-16) reply messages in the buffer as
> mule-unicode-NNNN characters, please.

This is exactly the functionality that Emacs lacks: it cannot convert
between 8859-2 and mule-unicode-NNNN because it thinks these are
different characters; and utf-8.el doesn't support anything beyond
8859-1.  (Also 8859-16 is not supported by Emacs at all, IIRC.)



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-07-27  5:10 Gnus coming with Emacs 21pre-release: iso-8859-{1,15} Karl Eichwalder
  2001-07-27  5:20 ` Karl Eichwalder
  2001-07-27  8:45 ` Eli Zaretskii
@ 2001-08-04 15:46 ` Florian Weimer
  2001-08-04 16:54   ` Kai Großjohann
                     ` (2 more replies)
  2001-09-01 16:26 ` Dave Love
  3 siblings, 3 replies; 23+ messages in thread
From: Florian Weimer @ 2001-08-04 15:46 UTC (permalink / raw)


Karl Eichwalder <keichwa@gmx.net> writes:

> My proposal: by default send out such a message UTF-8 encoded (maybe,
> ognus does this already -- Gnus coming with Emacs 21 should do the same,
> please).

It should work even with late pgnus versions if Emacs supports a
UTF-8 coding system. I don't know why it was removed from Emacs 21.
Perhaps Emacs 21 doesn't have a proper UTF-8 coding system?




* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 15:46 ` Florian Weimer
@ 2001-08-04 16:54   ` Kai Großjohann
  2001-08-04 17:15     ` Florian Weimer
  2001-08-04 18:07   ` Eli Zaretskii
  2001-09-01 16:30   ` Dave Love
  2 siblings, 1 reply; 23+ messages in thread
From: Kai Großjohann @ 2001-08-04 16:54 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

Florian Weimer <fw@deneb.enyo.de> writes:

> Karl Eichwalder <keichwa@gmx.net> writes:
> 
>> My proposal: by default send out such a message UTF-8 encoded (maybe,
>> ognus does this already -- Gnus coming with Emacs 21 should do the same,
>> please).
> 
> It should work even with late pgnus versions if Emacs supports an
> UTF-8 coding system. I don't know why it was removed from Emacs 21.
> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system?

The Emacs 21 mule-unicode coding system groks iso-8859-1 characters,
but not iso-8859-15.

kai
-- 
~/.signature: No such file or directory



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 16:54   ` Kai Großjohann
@ 2001-08-04 17:15     ` Florian Weimer
  2001-08-04 17:57       ` Kai Großjohann
  2001-08-04 18:02       ` Eli Zaretskii
  0 siblings, 2 replies; 23+ messages in thread
From: Florian Weimer @ 2001-08-04 17:15 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:

>> It should work even with late pgnus versions if Emacs supports an
>> UTF-8 coding system. I don't know why it was removed from Emacs 21.
>> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system?
> 
> The Emacs 21 mule-unicode coding system groks iso-8859-1 characters,
> but not iso-8859-15.

Is anybody needed for fixing this?



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 17:15     ` Florian Weimer
@ 2001-08-04 17:57       ` Kai Großjohann
  2001-08-04 18:02       ` Eli Zaretskii
  1 sibling, 0 replies; 23+ messages in thread
From: Kai Großjohann @ 2001-08-04 17:57 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

Florian Weimer <fw@deneb.enyo.de> writes:

> Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
> 
>>> It should work even with late pgnus versions if Emacs supports an
>>> UTF-8 coding system. I don't know why it was removed from Emacs 21.
>>> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system?
>> 
>> The Emacs 21 mule-unicode coding system groks iso-8859-1 characters,
>> but not iso-8859-15.
> 
> Is anybody needed for fixing this?

I'm not sure what should be done.  I think that somebody (Dave Love?)
is working on proper transition to Unicode, so whatever is done now is
only a temporary measure, right?  I don't know whether a change like
this can still go in 21.1.

But I'm sure the Emacs maintainers know the full story.

kai
-- 
~/.signature: No such file or directory



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 17:15     ` Florian Weimer
  2001-08-04 17:57       ` Kai Großjohann
@ 2001-08-04 18:02       ` Eli Zaretskii
  2001-08-04 18:44         ` Florian Weimer
  1 sibling, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2001-08-04 18:02 UTC (permalink / raw)
  Cc: Kai.Grossjohann, emacs-pretest-bug, ding

> From: Florian Weimer <fw@deneb.enyo.de>
> Date: Sat, 04 Aug 2001 19:15:43 +0200
> 
> Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
> 
> >> It should work even with late pgnus versions if Emacs supports an
> >> UTF-8 coding system. I don't know why it was removed from Emacs 21.
> >> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system?
> > 
> > The Emacs 21 mule-unicode coding system groks iso-8859-1 characters,
> > but not iso-8859-15.
> 
> Is anybody needed for fixing this?

Yes, you need to either (1) install an add-on package such as Mule-UCS;
or (2) add support for using Unicode tables for encoding and decoding
Mule charsets into and from UTF-8; or (3) replace the internal
representation of characters used by Emacs to be based on Unicode.



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 15:46 ` Florian Weimer
  2001-08-04 16:54   ` Kai Großjohann
@ 2001-08-04 18:07   ` Eli Zaretskii
  2001-08-04 19:11     ` Florian Weimer
  2001-09-01 16:30   ` Dave Love
  2 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2001-08-04 18:07 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

> From: Florian Weimer <fw@deneb.enyo.de>
> Date: Sat, 04 Aug 2001 17:46:55 +0200
> 
> Karl Eichwalder <keichwa@gmx.net> writes:
> 
> > My proposal: by default send out such a message UTF-8 encoded (maybe,
> > ognus does this already -- Gnus coming with Emacs 21 should do the same,
> > please).
> 
> It should work even with late pgnus versions if Emacs supports an
> UTF-8 coding system. I don't know why it was removed from Emacs 21.

Nothing was removed from Emacs 21.  Emacs never supported UTF-8 before
Emacs 21; in Emacs 21.1 there's limited support for Latin-1 and for
the mule-unicode-* character sets (which are used if the original text
was encoded in UTF-8).



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 18:02       ` Eli Zaretskii
@ 2001-08-04 18:44         ` Florian Weimer
  2001-08-05  7:15           ` Eli Zaretskii
  2001-09-01 16:29           ` Dave Love
  0 siblings, 2 replies; 23+ messages in thread
From: Florian Weimer @ 2001-08-04 18:44 UTC (permalink / raw)
  Cc: Kai.Grossjohann, emacs-pretest-bug, ding

"Eli Zaretskii" <eliz@is.elta.co.il> writes:

>> > The Emacs 21 mule-unicode coding system groks iso-8859-1 characters,
>> > but not iso-8859-15.
>> 
>> Is anybody needed for fixing this?
> 
> Yes, you need either (1) install an add-on package such as Mule-UCS;
> or (2) add support for using Unicode tables for encoding and decoding
> Mule charsets into and from UTF-8; or (3) replace the internal
> representation of characters used by Emacs to be based on Unicode.

Is somebody working on this?  Which option has been chosen by the
Emacs maintainers?

I think I've got some unusual ideas on how Emacs might approach some
aspects of Unicode (and which aspects cannot be implemented without a
major paradigm shift), and I'd like to share them (and eventually,
some code).



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 18:07   ` Eli Zaretskii
@ 2001-08-04 19:11     ` Florian Weimer
  2001-08-05  7:15       ` Eli Zaretskii
  0 siblings, 1 reply; 23+ messages in thread
From: Florian Weimer @ 2001-08-04 19:11 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

"Eli Zaretskii" <eliz@is.elta.co.il> writes:

>> > My proposal: by default send out such a message UTF-8 encoded (maybe,
>> > ognus does this already -- Gnus coming with Emacs 21 should do the same,
>> > please).
>> 
>> It should work even with late pgnus versions if Emacs supports an
>> UTF-8 coding system. I don't know why it was removed from Emacs 21.
> 
> Nothing was removed from Emacs 21.

Ah, I see.  I've read some claims before that Emacs 21 will support
Unicode, but this doesn't seem to be quite right.



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 18:44         ` Florian Weimer
@ 2001-08-05  7:15           ` Eli Zaretskii
  2001-09-01 16:29           ` Dave Love
  1 sibling, 0 replies; 23+ messages in thread
From: Eli Zaretskii @ 2001-08-05  7:15 UTC (permalink / raw)
  Cc: Kai.Grossjohann, emacs-pretest-bug, ding


On Sat, 4 Aug 2001, Florian Weimer wrote:

> >> > The Emacs 21 mule-unicode coding system groks iso-8859-1 characters,
> >> > but not iso-8859-15.
> >> 
> >> Is anybody needed for fixing this?
> > 
> > Yes, you need either (1) install an add-on package such as Mule-UCS;
> > or (2) add support for using Unicode tables for encoding and decoding
> > Mule charsets into and from UTF-8; or (3) replace the internal
> > representation of characters used by Emacs to be based on Unicode.
> 
> Is somebody working on this?

I hope so.

> Which option has been chosen by the Emacs maintainers?

The 3rd one, AFAIU.  Since users want unification, it sounds like the
best approach, although it also means lots of work.
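
To make "unification" concrete, here is a minimal sketch of what a Unicode-based internal representation gives you.  It assumes an Emacs that already has such a representation (Emacs 23 or later), not the Emacs 21 pretest discussed in this thread, and the `demo-' helper names are made up for illustration.

```elisp
;; Sketch only; assumes a Unicode-based Emacs (23 or later).
;; The `demo-' names are hypothetical helpers, not Emacs functions.
(defun demo-decode-byte (byte coding-system)
  "Decode the single BYTE as text in CODING-SYSTEM and return the string."
  (decode-coding-string (unibyte-string byte) coding-system))

(defun demo-same-char-p (byte coding-a coding-b)
  "Return non-nil if BYTE decodes to the same character in both coding systems."
  (string= (demo-decode-byte byte coding-a)
           (demo-decode-byte byte coding-b)))

;; 0xE9 is e-acute in both Latin-1 and Latin-9, so with unified
;; characters the two decoded strings compare equal.
(demo-same-char-p #xe9 'iso-8859-1 'iso-8859-15)

;; 0xA4 really is a different character in the two sets
;; (currency sign in Latin-1, euro sign in Latin-9).
(demo-same-char-p #xa4 'iso-8859-1 'iso-8859-15)
```

Under the Mule representation of Emacs 21, the first comparison would involve two distinct characters from two charsets, which is the situation that leads to the multipart replies reported at the top of this thread.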

> I think I've got some unusal ideas on how Emacs might approach some
> aspects of Unicode (and which aspects cannot be implemented without a
> major paradigm shift), and I'd like to share them (and eventually,
> some code).

Please post those ideas to emacs-devel@gnu.org.  Thanks.



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 19:11     ` Florian Weimer
@ 2001-08-05  7:15       ` Eli Zaretskii
  2001-09-01 16:28         ` Dave Love
  0 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2001-08-05  7:15 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding


On Sat, 4 Aug 2001, Florian Weimer wrote:

> > Nothing was removed from Emacs 21.
> 
> Ah, I see.  I've read some claims before that Emacs 21 will support
> Unicode, but this doesn't seem to be quite right.

Emacs 21 does support Unicode, but this support is limited unless you
augment it with local changes or add-on packages.  The main limitation
is that the Unicode charsets are disjoint from the other charsets
supported by Emacs, and that, with the exception of UTF-8 and Latin-1,
all the coding systems supported by Emacs cannot produce Unicode
characters.  The practical implication of this is that if you want to
work with Unicode characters, you are limited to reading and writing
UTF-8 and Latin-1 text.



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-07-27  5:10 Gnus coming with Emacs 21pre-release: iso-8859-{1,15} Karl Eichwalder
                   ` (2 preceding siblings ...)
  2001-08-04 15:46 ` Florian Weimer
@ 2001-09-01 16:26 ` Dave Love
  3 siblings, 0 replies; 23+ messages in thread
From: Dave Love @ 2001-09-01 16:26 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

>>>>> "KE" == Karl Eichwalder <keichwa@gmx.net> writes:

 KE> If I start Emacs 21pre under the locale

 KE>     LANG=de_DE.ISO-8859-15

 KE> and reply to a iso-8859-1 encoded message (containing umlaut
 KE> letters), my reply message is arranged as a multipart message
 KE> even if there's no ambiguity involved.

 KE> My proposal: by default send out such a message UTF-8 encoded
 KE> (maybe, ognus does this already -- Gnus coming with Emacs 21
 KE> should do the same, please).  Please ask, if it isn't clear
 KE> enough what I intend to say.

My point about this obviously didn't sink in.  You have to unify the
relevant characters, and if you're dealing with Latin-N in the first
place, it makes sense to unify to Latin-N, not Unicode.  I explained
already why I personally don't unify 8859-x to 8859-N, so in my case I
do get utf-8.

The chosen coding system (MIME charset) should just be the highest
priority one with which Emacs can encode the message -- that's all.
Assuming `umlaut letters' means German, in this case that should be
iso-8859-15 if you unify 8859 by one of the possible means.
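
The selection rule described above can be sketched roughly as follows.  This is only an illustration of "first coding system in priority order that can encode the text", written for a current Emacs; the `demo-' names and the priority list are assumptions, and it is not the code Gnus uses.

```elisp
;; Sketch only: hypothetical helpers illustrating the selection rule,
;; not the Gnus implementation.
(defun demo-encodable-p (string coding-system)
  "Return non-nil if every character of STRING can be encoded in CODING-SYSTEM."
  (not (unencodable-char-position 0 (length string) coding-system nil string)))

(defun demo-pick-coding-system (string preferences)
  "Return the first coding system in PREFERENCES that can encode STRING.
Return nil if none of them can."
  (let (choice)
    (while (and preferences (not choice))
      (let ((cs (pop preferences)))
        (when (demo-encodable-p string cs)
          (setq choice cs))))
    choice))

;; An assumed priority list; a real setup would take it from the
;; user's coding-system priorities.
(defvar demo-preferences '(iso-8859-1 iso-8859-15 utf-8))

(demo-pick-coding-system "Gr\u00fc\u00dfe" demo-preferences) ; umlaut and sharp s
(demo-pick-coding-system "5 \u20ac" demo-preferences)        ; euro sign
(demo-pick-coding-system "\u03bb" demo-preferences)          ; Greek lambda
```

With this list, text that fits Latin-1 never needs a second part or a wider charset; reordering the list so that iso-8859-15 comes first would give the preference suggested above for German text.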

 KE> Are there variables to control this behavior?  

Not exactly, but you need quite trivial additions to Emacs, or
Mule-UCS, plus at least one change to Gnus to get this sort of thing
right in general.  [Actually, I don't know for sure that Mule-UCS does
this particular job as it stands, but it could be taught.]  This
should not be specific to utf-8 or other charsets either.  It should
just work, as it does for me after customization.

 KE> Just say "yes" and I'll read again the manual ;)

Sorry, you probably have to read various Mule code.  I didn't write
the relevant documentation in the end.

Apart from the base coding system support, you have to get Gnus to
choose the right MIME charset/coding system.  Here is a re-written
function for Gnus, which DTRT generally and may or may not still be
relevant to one of the code bases.  [I think after writing this I
found some similar code of handa's that sendmail.el uses.]

(defun mm-find-mime-charset-region (b e)
  "Return the MIME charsets needed to encode the region between B and E.
Nil means ASCII, a single-element list represents an appropriate MIME
charset, and a longer list means no appropriate charset."
  ;; The return possibilities of this function are a mess...
  (or (and
       (mm-multibyte-p)
       ;; How are you supposed to do this in XEmacs?
       (fboundp 'find-coding-systems-region)
       ;; Find the mime-charset of the most preferred coding
       ;; system that has one.
       (let ((systems (find-coding-systems-region b e))
	     result)
	 ;; Fixme: The `mime-charset' (`x-ctext') of `compound-text'
	 ;; is not in the IANA list.
	 (setq systems (delq 'compound-text systems))
	 (unless (equal systems '(undecided))
	   (while systems
	     (let ((cs (coding-system-get (pop systems) 'mime-charset)))
	       (if cs
		   (setq systems nil
			 result (list cs))))))
	 result))
      ;; Otherwise we're not multibyte or a single coding system won't
      ;; cover it.
      (mm-delete-duplicates
       (mapcar 'mm-mime-charset
	       (delq 'iso-2022-jp	; ??
		     (delq 'ascii
			   (mm-find-charset-region b e)))))))



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-07-27 18:22   ` Karl Eichwalder
  2001-07-27 19:18     ` Eli Zaretskii
@ 2001-09-01 16:27     ` Dave Love
  1 sibling, 0 replies; 23+ messages in thread
From: Dave Love @ 2001-09-01 16:27 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

>>>>> "KE" == Karl Eichwalder <keichwa@gmx.net> writes:

 KE> Okay, than we've to make sure to add an user option to store
 KE> 8859-1 and 8859-15 (and 8859-2 and 8859-16) reply messages in the
 KE> buffer as mule-unicode-NNNN characters, please.

I've said that the general option exists, and I have made the
necessary tables:

  ;; Unify 8859 on decoding.  (Non-CCL coding systems only.)
  (set-char-table-parent standard-translation-table-for-decode
                         ucs-mule-8859-to-mule-unicode)

Of course, my 8859-16 coding system uses mule-unicode, so the
translation to mule-unicode is irrelevant and it would be rejected for
Emacs.

 KE> [Thanks for the utf-8.el pointer; I'll try to read the code.]

You would probably have to do more than just read that code, but why
bother, since I've implemented this?



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-05  7:15       ` Eli Zaretskii
@ 2001-09-01 16:28         ` Dave Love
  0 siblings, 0 replies; 23+ messages in thread
From: Dave Love @ 2001-09-01 16:28 UTC (permalink / raw)
  Cc: Florian Weimer, emacs-pretest-bug, ding

>>>>> "EZ" == Eli Zaretskii <eliz@is.elta.co.il> writes:

 EZ> The main limitation is that the Unicode charsets are disjoint
 EZ> from the other charsets supported by Emacs,

Of course, by definition.  It's misleading to imply that there's
anything special about them per se.  You might as well say that
Japanese support is limited for that reason.  After all, it includes
most of the Latin-N characters.

[The primary limitation of the mule-unicode support is that there
weren't enough free slots for private charsets to cover the BMP after
jisx213 (?) was added.]

 EZ> and that, with the exception of UTF-8 and Latin-1, all the coding
 EZ> systems supported by Emacs cannot produce Unicode characters.

Even assuming that means `no other bundled coding system encodes
mule-unicode-... chars', it's not true.  Anyhow, handa said that the
way mac-roman is implemented is the right thing.  If there's some
problem with that, Mac users are stuffed, but such a problem has
eluded me in extensive use.



* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 18:44         ` Florian Weimer
  2001-08-05  7:15           ` Eli Zaretskii
@ 2001-09-01 16:29           ` Dave Love
  2001-09-02 11:01             ` Eli Zaretskii
  1 sibling, 1 reply; 23+ messages in thread
From: Dave Love @ 2001-09-01 16:29 UTC (permalink / raw)
  Cc: Eli Zaretskii, Kai.Grossjohann, emacs-pretest-bug, ding

[-- Attachment #1: Type: text/plain, Size: 1080 bytes --]

>>>>> "FW" == Florian Weimer <fw@deneb.enyo.de> writes:

 FW> Is somebody working on this?  

Do you not believe what I said I've done, which Karl quoted?  (Please
don't take it on trust, and I suggest not related work until you
understand how to do it.)

 FW> I think I've got some unusal ideas on how Emacs might approach
 FW> some aspects of Unicode (and which aspects cannot be implemented
 FW> without a major paradigm shift), and I'd like to share them (and
 FW> eventually, some code).

I posted the following recently about what is already implemented.
What else did you want?  [I'm sure I could do the same with Mule-UCS
if I understood it and hacked it up to avoid data corruption with
untranslatable characters.]  If people want such facilities, I can
only suggest they press the Emacs maintainers to include this sort of
thing, even if they won't take my implementation.

In addition to what I have, it's not clear to me what fundamentally
prevents even Level 2 support now, but I don't need it.  The main
thing you definitely can't do with the current Mule is bidi.


[-- Attachment #2: Type: message/rfc822, Size: 4648 bytes --]

From: Dave Love <d.love@dl.ac.uk>
To: Eli Zaretskii <eliz@is.elta.co.il>
Cc: keichwa@gmx.net,  haible@ilog.fr,  pinard@iro.umontreal.ca, emacs-devel@gnu.org
Subject: Re: Unicode support (was: null-device)
Date: 22 Jul 2001 18:31:25 +0100
Message-ID: <rzq8zhg6fbm.fsf@djlvig.dl.ac.uk>

>>>>> "EZ" == Eli Zaretskii <eliz@is.elta.co.il> writes:

 EZ> and if you try to save a buffer with Latin-3 text using
 EZ> ISO-8859-1 encoding, Emacs will say it's unable to do so, even if
 EZ> all the non-ASCII characters are from the subset of Latin-3 that
 EZ> is in the intersection of Latin-1 and Latin-3.

The unification solution to this involves a few lines of code (which
I've shown elsewhere) plus easily-generated tables.  If you unify on
decoding, as ISO 2022 appears to suggest, the issue basically doesn't
arise anyway and even Emacs 20 has that facility.  [I know a
programmer _can_ break this, because it's Emacs.]  Otherwise, you
could actually expurgate the Latin-3 charset in favour of a trivial
CCL coding system.

 EZ> You cannot support Unicode with this representation, because
 EZ> Unicode unifies characters by its very design principle.

I don't accept this definition of ‘support Unicode’.  Although I've
been assured it doesn't or can't, I maintain my Emacs (without
Mule-UCS) supports Unicode because at least:

 • It groks utf-8 (auto-detected in a utf-8 locale or from cues like
   ‘charset=’ in the file);

 • It can edit normally in the part of the BMP I need – Western
   technical text, including maths – better than, say, Yudit.  It
   works under X and tty with or without a Unicode font;

 • In the rest of the BMP it can edit infelicitously (this could be
   improved) and display the CJK space covered by whichever three
   charsets I chose in a quick go;

 • It has several Unicode-based input methods;

 • As above, it can unify 8859 and others through Unicode during
   coding conversion.  (I don't normally turn all that on, because it
   would mung some of the implementation files I edit.);

 • It has (using Unicode tables) coding systems for all the charsets
   not in base Emacs which haible told me are relevant for GNU
   locales.  Their characters are unified by construction;

 • The MIME code DTRT, as (basically) does W3, for instance;

 • [It might DTRT with Unicode menu items under a suitable version of
   X, if that didn't get broken a while back].

If I can find the enthusiasm, I'll package what I've done if and when
Emacs 21 is released.


 >> To attract hackers working on UTF-8 for Emacs Mule has to go away
 >> first.

This is false by counter-examples, even for values of ‘utf-8’ equal to
‘Unicode’.  The issue in my experience is making progress after
they're attracted.

The propaganda that gives rise to this false claim comes from people
who either don't understand Mule and/or deliberately mislead about it
and the people who work on it.  I admit to being misled initially.

 EZ> What do you mean by ``first''?  We need to replace the current
 EZ> representation by another, based on Unicode.

It's not clear to me that I need this as a Unicode user, even if I was
serious about wider or deeper coverage.  I don't doubt handa has a
good rationale for the re-implementation, though.  Someone might like
to justify it with arguments beyond coping with 8859.

If necessary, I could build a non-standard Emacs now with a different
set of private charsets to cover the whole BMP properly.  That's
undesirable if I ever have to deal with code or data using the
replaced charsets, but presumably it could be declared official.
Anyway, that level of compatibility has to break sometime.

Otherwise, handa proposed extending the code space (apparently doable
quickly) to accomplish the same sort of result with minimal grief.

-- 
Bragging about Unicode support: ‘2d sinθ = nλ’ is plain text.   ☺
<URL:http://www.unicode.org/>

* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-07-27  8:45 ` Eli Zaretskii
  2001-07-27 18:22   ` Karl Eichwalder
@ 2001-09-01 16:30   ` Dave Love
  1 sibling, 0 replies; 23+ messages in thread
From: Dave Love @ 2001-09-01 16:30 UTC (permalink / raw)
  Cc: keichwa, emacs-pretest-bug, ding

>>>>> "EZ" == Eli Zaretskii <eliz@is.elta.co.il> writes:

 EZ> This should IMHO be optional at this time, 

This should _just work_.  In general.  [As far as I remember, utf-8
support in MUAs is mandated by IETF.]

 EZ> since Unicode support in the stock Emacs 21 distribution (without
 EZ> add-ons such as Mule-UCS) is limited and incomplete.

It can't even be an option until the additions and changes are
available for users to try.  If it's so bad, they can either avoid
using the support or fix it.  It works for my purposes, and I'd like
it to be available for others.

There is no consistent rationale for refusing to base things on the
current Unicode support.  KOI support is incomplete (like at least
most of the codepage.el coding systems); why is that offered?
mac-roman depends on the base Unicode support (in the same way that
Latin-8 and -9 probably should have done if they didn't precede
mule-unicode); what about that?

Anyhow, what editor and mailer _should_ people use with unlimited and
complete Unicode support?

 EZ> For starters, AFAIK, Emacs cannot encode 8859-15 characters as
 EZ> UTF-8 (see the commentary in utf-8.el) unless those characters
 EZ> came from a UTF-8 encoded source to begin with, and thus are
 EZ> stored in the buffer as mule-unicode-NNNN characters.

This is at best confused.  8859-15 is mostly the same as 8859-1, and
the characters at issue will be decoded into the Mule charset
`latin-iso8859-1'.  Anyhow, it's pretty trivial to change the
mule-utf-8 coding system to encode arbitrary Emacs characters with the
aid of a translation table.  It's even more trivial to unify on
decoding, as I've said before.  The 8859-15 coding system could use
mule-unicode.
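
To make the "mostly the same" point concrete, here is a minimal sketch in
Python rather than Emacs Lisp.  It is not the mule-utf-8 code or the
translation-table approach itself; it only assumes Python's standard
`iso8859-1' and `iso8859-15' codecs, and the helper names are invented
for illustration.  It lists the byte values on which the two charsets
disagree and re-encodes 8859-15 input as UTF-8 by going through Unicode.

```python
# Illustrative sketch only: helper names are hypothetical, and this is
# Python's codec machinery, not Emacs's coding systems.

def differing_bytes(a="iso8859-1", b="iso8859-15"):
    """Return the byte values that decode to different characters."""
    return [i for i in range(256)
            if bytes([i]).decode(a) != bytes([i]).decode(b)]


def reencode_as_utf8(raw, source_charset):
    """Unify through Unicode: decode the legacy bytes, encode as UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


if __name__ == "__main__":
    diff = differing_bytes()
    print(len(diff), "differing positions:", [hex(i) for i in diff])

    # Umlauts sit at the same positions in both charsets.
    umlauts = "äöü".encode("iso8859-15")
    print(umlauts == "äöü".encode("iso8859-1"))
    print(reencode_as_utf8(umlauts, "iso8859-15"))
```

Only a handful of positions differ (the euro sign and a few letters), so
text containing just umlauts is byte-identical in both charsets.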

 EZ> (Perhaps Gnus can do such conversions with its own code; but I'm
 EZ> talking about core Emacs functionality here.)

Gnus could bundle my code to do 8859/unicode unification and handle
the complete set of GNUish charsets, but that wouldn't make sense in
the absence of Emacs 21.1 and the facility should be available
generally.  Also, I don't want to waste effort supporting this in the
face of a maintainer campaign against the basic features it needs, and
it sounds as though it would be chucked out when Gnus was next
reintegrated.

 EZ> This is not the kind of support that we could IMHO offer users as
 EZ> the default.

Unifying 8859 on encoding to utf-8 is exactly the kind of support that
should be default, as the users want.


* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-08-04 15:46 ` Florian Weimer
  2001-08-04 16:54   ` Kai Großjohann
  2001-08-04 18:07   ` Eli Zaretskii
@ 2001-09-01 16:30   ` Dave Love
  2 siblings, 0 replies; 23+ messages in thread
From: Dave Love @ 2001-09-01 16:30 UTC (permalink / raw)
  Cc: emacs-pretest-bug, ding

>>>>> "FW" == Florian Weimer <fw@deneb.enyo.de> writes:

 FW> It should work even with late pgnus versions if Emacs supports a
 FW> UTF-8 coding system. 

I don't think so.  I had to fix Gnus 5.9 to make mule-utf-8 or current
Mule-UCS's utf-8 work at all.

 FW> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system?

It does, but choosing a charset should not depend on how the relevant
coding systems are defined (as the Gnus code did whenever I last
looked).  See the charset-determining code I posted.
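
The charset-determining code referred to above is not reproduced in
this message.  As a rough illustration of the general idea only, the
following Python sketch picks a MIME charset by looking at the
characters in the text and walking a preference list.  The preference
order and the name `pick_charset' are assumptions made for the example;
this is neither the posted code nor what Gnus does.

```python
# Hypothetical sketch: choose the first charset in a preference list
# whose repertoire covers every character in the text.

PREFERENCE = ("us-ascii", "iso-8859-1", "iso-8859-15", "utf-8")


def pick_charset(text, preference=PREFERENCE):
    """Return the first charset name able to encode all of `text`."""
    for name in preference:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # some character is outside this charset
        return name
    raise ValueError("no charset in the preference list covers this text")


if __name__ == "__main__":
    for sample in ("hello", "Grüße", "5 € für alles", "2d sinθ = nλ"):
        print(repr(sample), "->", pick_charset(sample))
```

With a list like this, a reply containing only umlauts gets a single
iso-8859-1 part, and anything outside the 8859 repertoires falls
through to utf-8 rather than being split into several parts.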


* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-09-01 16:29           ` Dave Love
@ 2001-09-02 11:01             ` Eli Zaretskii
  2001-09-02 11:39               ` Florian Weimer
  0 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2001-09-02 11:01 UTC (permalink / raw)
  Cc: fw, Kai.Grossjohann, emacs-pretest-bug, ding

> From: Dave Love <d.love@dl.ac.uk>
> Date: 01 Sep 2001 17:29:34 +0100
> 
> The main thing you definitely can't do with the current Mule is
> bidi.

I'm working on that (albeit very slowly, due to insufficient
resources).  Volunteers who are willing to work on Emacs internals are
welcome to join the effort.


* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
  2001-09-02 11:01             ` Eli Zaretskii
@ 2001-09-02 11:39               ` Florian Weimer
  0 siblings, 0 replies; 23+ messages in thread
From: Florian Weimer @ 2001-09-02 11:39 UTC (permalink / raw)
  Cc: d.love, Kai.Grossjohann, emacs-pretest-bug, ding

Eli Zaretskii <eliz@is.elta.co.il> writes:

>> The main thing you definitely can't do with the current Mule is
>> bidi.
>
> I'm working on that (albeit very slowly, due to insufficient
> resources).  Volunteers who are willing to work on Emacs internals are
> welcome to join the effort.

The Unicode bidi algorithm is not compatible with environments which
strongly favor hard line breaks over a more paragraph-centered
approach.
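
One way such a tension can show up, sketched here under loud
assumptions: the bidi algorithm takes a paragraph's base direction from
its first strong character, so treating each hard-broken line as its
own paragraph can give a different result from treating the text as
one paragraph.  The Python below is a simplified reading of that
first-strong rule (it ignores isolates and embeddings), the function
name is invented, and it is not necessarily the exact problem meant
above.

```python
import unicodedata

# Simplified first-strong-character rule; `base_direction' is a
# hypothetical helper, not a full bidi implementation.


def base_direction(text):
    """Return 'rtl' or 'ltr' based on the first strong character."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):
            return "rtl"
    return "ltr"  # default when no strong character is present


# Two hard-broken lines: the first starts with Hebrew, the second
# with Latin text.
text = "\u05e9\u05dc\u05d5\u05dd wrote:\nhello \u05e2\u05d5\u05dc\u05dd"

if __name__ == "__main__":
    print("as one paragraph:", base_direction(text))
    print("line by line:", [base_direction(l) for l in text.split("\n")])
```

Taken as one paragraph the text is right-to-left throughout; taken
line by line, the second line flips to left-to-right.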

(BTW, where is the right forum to discuss such things?  My
subscription requests for the mailing lists I considered relevant were
not honored.)


Thread overview: 23+ messages
2001-07-27  5:10 Gnus coming with Emacs 21pre-release: iso-8859-{1,15} Karl Eichwalder
2001-07-27  5:20 ` Karl Eichwalder
2001-07-27  8:45 ` Eli Zaretskii
2001-07-27 18:22   ` Karl Eichwalder
2001-07-27 19:18     ` Eli Zaretskii
2001-09-01 16:27     ` Dave Love
2001-09-01 16:30   ` Dave Love
2001-08-04 15:46 ` Florian Weimer
2001-08-04 16:54   ` Kai Großjohann
2001-08-04 17:15     ` Florian Weimer
2001-08-04 17:57       ` Kai Großjohann
2001-08-04 18:02       ` Eli Zaretskii
2001-08-04 18:44         ` Florian Weimer
2001-08-05  7:15           ` Eli Zaretskii
2001-09-01 16:29           ` Dave Love
2001-09-02 11:01             ` Eli Zaretskii
2001-09-02 11:39               ` Florian Weimer
2001-08-04 18:07   ` Eli Zaretskii
2001-08-04 19:11     ` Florian Weimer
2001-08-05  7:15       ` Eli Zaretskii
2001-09-01 16:28         ` Dave Love
2001-09-01 16:30   ` Dave Love
2001-09-01 16:26 ` Dave Love
