* Gnus coming with Emacs 21pre-release: iso-8859-{1,15} @ 2001-07-27 5:10 Karl Eichwalder 2001-07-27 5:20 ` Karl Eichwalder ` (3 more replies) 0 siblings, 4 replies; 23+ messages in thread From: Karl Eichwalder @ 2001-07-27 5:10 UTC (permalink / raw) Cc: ding This bug report will be sent to the Free Software Foundation, not to your local site managers! Please write in English, because the Emacs maintainers do not have translators to read other languages for them. Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list. In GNU Emacs 21.0.104.1 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of 2001-06-22 on tux configured using `configure --prefix /gnu' Important settings: value of $LC_ALL: nil value of $LC_COLLATE: C value of $LC_CTYPE: nil value of $LC_MESSAGES: nil value of $LC_MONETARY: nil value of $LC_NUMERIC: nil value of $LC_TIME: nil value of $LANG: de_DE.ISO-8859-1 locale-coding-system: iso-latin-1 default-enable-multibyte-characters: t Please describe exactly what actions triggered the bug and the precise symptoms of the bug: First, yes, I didn't set all the LC_ variable mentioned above. Nevertheless assuming "nil" is wrong; they are considered to inherit their values from LANG if not set separately. Please, try 'locale' on GNU/Linux. If I start Emacs 21pre under the locale LANG=de_DE.ISO-8859-15 and reply to a iso-8859-1 encoded message (containing umlaut letters), my reply message is arranged as a multipart message even if there's no ambiguity involved. My proposal: by default send out such a message UTF-8 encoded (maybe, ognus does this already -- Gnus coming with Emacs 21 should do the same, please). Please ask, if it isn't clear enough what I intend to say. Are there variables to control this behavior? 
Just say "yes" and I'll read again the manual ;) -- ke@suse.de (work) / keichwa@gmx.net (home): | http://www.suse.de/~ke/ | ,__o Free Translation Project: | _-\_<, http://www.iro.umontreal.ca/contrib/po/HTML/ | (*)/'(*) ^ permalink raw reply [flat|nested] 23+ messages in thread
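The inheritance rule mentioned in the report (unset LC_ variables take their value from LANG) can be sketched as follows. This is a minimal Python illustration, not Emacs code; the function name `effective_locale` and the final fallback to "C" are assumptions made for the example.

```python
def effective_locale(category, env):
    """Return the locale that applies to one category, e.g. "LC_CTYPE"."""
    # Usual POSIX precedence: LC_ALL first, then the category's own
    # variable, then LANG. Empty strings are treated as unset.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    # Assumption for this sketch: fall back to "C" when nothing is set.
    return "C"


# The environment from the report: only LC_COLLATE and LANG are set.
reported = {"LC_COLLATE": "C", "LANG": "de_DE.ISO-8859-1"}

for category in ("LC_CTYPE", "LC_COLLATE", "LC_MESSAGES"):
    print(category, "=", effective_locale(category, reported))
```

Under this rule the "nil" entries in the report simply mean "not set in the environment"; the effective value for those categories comes from LANG.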
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-07-27 5:10 Gnus coming with Emacs 21pre-release: iso-8859-{1,15} Karl Eichwalder @ 2001-07-27 5:20 ` Karl Eichwalder 2001-07-27 8:45 ` Eli Zaretskii ` (2 subsequent siblings) 3 siblings, 0 replies; 23+ messages in thread From: Karl Eichwalder @ 2001-07-27 5:20 UTC (permalink / raw) Cc: ding Just found the old mail by Dave again; here's some more background info: From: Dave Love <d.love@dl.ac.uk> Subject: Re: Unicode/Mule (Re: null-device) To: Karl Eichwalder <keichwa@gmx.net> Cc: rms@gnu.org, eliz@is.elta.co.il, haible@ilog.fr, pinard@iro.umontreal.ca, emacs-devel@gnu.org, gerd@gnu.org Date: 22 Jul 2001 19:17:38 +0100 [...] KE> The consequence is, Gnus often [I'd dispute ‘often’.] KE> thinks it has to create a multipart message... [Is that necessarily wrong?] I'll eval into my message buffer (string (make-char 'latin-iso8859-1 ?\xe9) (make-char 'latin-iso8859-14 ?\xe9) (make-char 'latin-iso8859-15 ?\xe9)) => "ééé" You may choose not to believe me that it results in a string with three different Emacs characters and that Gnus will post this silently in utf-8, but it's so. I unify on encoding to utf-8 in what might as well be a stock Emacs 21⁴. For just the three ‘e’s, Latin-1 could have been chosen. What you normally see is not a consequence of Emacs forcing anything. It can be customized. KE> Yes, it will only do so if you'll enter three 'y' (yes) in a row KE> -- this isn't "user-friendly" (Eli). I made Gnus fixes in this general area (_not_ on the basis of bug reports), at least some of which aren't installed. Footnotes: [...] ⁴ I know Eli disagrees. -- DOMINUS ILLUMINATIO MEA -- ke@suse.de (work) / keichwa@gmx.net (home): | http://www.suse.de/~ke/ | ,__o Free Translation Project: | _-\_<, http://www.iro.umontreal.ca/contrib/po/HTML/ | (*)/'(*) ^ permalink raw reply [flat|nested] 23+ messages in thread
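What "unifying on encoding to utf-8" amounts to for the three-character example above can be sketched in Python. Python's codecs decode straight to Unicode, so this only shows the unified outcome; it says nothing about how Mule stores the three characters internally.

```python
# The byte 0xE9 as it would appear in text encoded in each charset.
E_ACUTE_BYTE = b"\xe9"
charsets = ("iso-8859-1", "iso-8859-14", "iso-8859-15")

# Decode the same byte under each charset.
decoded = [E_ACUTE_BYTE.decode(cs) for cs in charsets]

# After unification all three are the same character, so the
# string can be sent in a single utf-8 part.
unified = "".join(decoded)
utf8_bytes = unified.encode("utf-8")

print(unified)
print(len(set(decoded)), "distinct character(s)")
print(utf8_bytes)
```

Since the three decoded characters are identical here, one charset suffices for the whole string, which is why Latin-1 could equally have been chosen for it.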
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-07-27 5:10 Gnus coming with Emacs 21pre-release: iso-8859-{1,15} Karl Eichwalder 2001-07-27 5:20 ` Karl Eichwalder @ 2001-07-27 8:45 ` Eli Zaretskii 2001-07-27 18:22 ` Karl Eichwalder 2001-09-01 16:30 ` Dave Love 2001-08-04 15:46 ` Florian Weimer 2001-09-01 16:26 ` Dave Love 3 siblings, 2 replies; 23+ messages in thread From: Eli Zaretskii @ 2001-07-27 8:45 UTC (permalink / raw) Cc: emacs-pretest-bug, ding > From: Karl Eichwalder <keichwa@gmx.net> > Date: 27 Jul 2001 07:10:57 +0200 > > value of $LC_ALL: nil > value of $LC_COLLATE: C > value of $LC_CTYPE: nil > value of $LC_MESSAGES: nil > value of $LC_MONETARY: nil > value of $LC_NUMERIC: nil > value of $LC_TIME: nil > value of $LANG: de_DE.ISO-8859-1 > locale-coding-system: iso-latin-1 > default-enable-multibyte-characters: t > > First, yes, I didn't set all the LC_ variable mentioned above. > Nevertheless assuming "nil" is wrong; they are considered to inherit > their values from LANG if not set separately. This information is for our consumption; it doesn't imply that Emacs behaves contrary to what you expect. LANG's value is printed, and whoever will need this information for tracking down a bug is supposed to know about the inheritance rules. > If I start Emacs 21pre under the locale > > LANG=de_DE.ISO-8859-15 > > and reply to a iso-8859-1 encoded message (containing umlaut letters), > my reply message is arranged as a multipart message even if there's no > ambiguity involved. > > My proposal: by default send out such a message UTF-8 encoded This should IMHO be optional at this time, since Unicode support in the stock Emacs 21 distribution (without add-ons such as Mule-UCS) is limited and incomplete. For starters, AFAIK, Emacs cannot encode 8859-15 characters as UTF-8 (see the commentary in utf-8.el) unless those characters came from a UTF-8 encoded source to begin with, and thus are stored in the buffer as mule-unicode-NNNN characters. 
(Perhaps Gnus can do such conversions with its own code; but I'm talking about core Emacs functionality here.) This is not the kind of support that we could IMHO offer users as the default. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-07-27 8:45 ` Eli Zaretskii @ 2001-07-27 18:22 ` Karl Eichwalder 2001-07-27 19:18 ` Eli Zaretskii 2001-09-01 16:27 ` Dave Love 2001-09-01 16:30 ` Dave Love 1 sibling, 2 replies; 23+ messages in thread From: Karl Eichwalder @ 2001-07-27 18:22 UTC (permalink / raw) Cc: emacs-pretest-bug, ding "Eli Zaretskii" <eliz@is.elta.co.il> writes: > This should IMHO be optional at this time, since Unicode support in > the stock Emacs 21 distribution (without add-ons such as Mule-UCS) is > limited and incomplete. For starters, AFAIK, Emacs cannot encode > 8859-15 characters as UTF-8 (see the commentary in utf-8.el) unless > those characters came from a UTF-8 encoded source to begin with, and > thus are stored in the buffer as mule-unicode-NNNN characters. Okay, than we've to make sure to add an user option to store 8859-1 and 8859-15 (and 8859-2 and 8859-16) reply messages in the buffer as mule-unicode-NNNN characters, please. I'm sure I did send out UTF-8 messages already -- all this happened behind my back and I was very happy with it! [Thanks for the utf-8.el pointer; I'll try to read the code.] > (Perhaps Gnus can do such conversions with its own code; but I'm > talking about core Emacs functionality here.) Yes. I already received (polite) complains about my strange messages consisting of attachments only -- Netscape cannot handle multipart text messages that good. > This is not the kind of support that we could IMHO offer users as the > default. I don't mind to set a variable and to tell users to do so :) -- ke@suse.de (work) / keichwa@gmx.net (home): | http://www.suse.de/~ke/ | ,__o Free Translation Project: | _-\_<, http://www.iro.umontreal.ca/contrib/po/HTML/ | (*)/'(*) ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-07-27 18:22 ` Karl Eichwalder @ 2001-07-27 19:18 ` Eli Zaretskii 2001-09-01 16:27 ` Dave Love 1 sibling, 0 replies; 23+ messages in thread From: Eli Zaretskii @ 2001-07-27 19:18 UTC (permalink / raw) Cc: emacs-pretest-bug, ding > From: Karl Eichwalder <keichwa@gmx.net> > Date: 27 Jul 2001 20:22:27 +0200 > > "Eli Zaretskii" <eliz@is.elta.co.il> writes: > > > This should IMHO be optional at this time, since Unicode support in > > the stock Emacs 21 distribution (without add-ons such as Mule-UCS) is > > limited and incomplete. For starters, AFAIK, Emacs cannot encode > > 8859-15 characters as UTF-8 (see the commentary in utf-8.el) unless > > those characters came from a UTF-8 encoded source to begin with, and > > thus are stored in the buffer as mule-unicode-NNNN characters. > > Okay, than we've to make sure to add an user option to store 8859-1 and > 8859-15 (and 8859-2 and 8859-16) reply messages in the buffer as > mule-unicode-NNNN characters, please. This is exactly the functionality that Emacs lacks: it cannot convert between 8859-2 and mule-unicode-NNNN because it thinks these are different characters; and utf-8.el doesn't support anything beyond 8859-1. (Also 8859-16 is not supported by Emacs at all, IIRC.) ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-07-27 18:22 ` Karl Eichwalder 2001-07-27 19:18 ` Eli Zaretskii @ 2001-09-01 16:27 ` Dave Love 1 sibling, 0 replies; 23+ messages in thread From: Dave Love @ 2001-09-01 16:27 UTC (permalink / raw) Cc: emacs-pretest-bug, ding >>>>> "KE" == Karl Eichwalder <keichwa@gmx.net> writes: KE> Okay, than we've to make sure to add an user option to store KE> 8859-1 and 8859-15 (and 8859-2 and 8859-16) reply messages in the KE> buffer as mule-unicode-NNNN characters, please. I've said that the general option exists, and I have made the necessary tables: ;; Unify 8859 on decoding. (Non-CCL coding systems only.) (set-char-table-parent standard-translation-table-for-decode ucs-mule-8859-to-mule-unicode) Of course, my 8859-16 coding system uses mule-unicode, so the translation to mule-unicode is irrelevant and it would be rejected for Emacs. KE> [Thanks for the utf-8.el pointer; I'll try to read the code.] You would probably have to do more than just read that code, but why bother, since I've implemented this? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-07-27 8:45 ` Eli Zaretskii 2001-07-27 18:22 ` Karl Eichwalder @ 2001-09-01 16:30 ` Dave Love 1 sibling, 0 replies; 23+ messages in thread From: Dave Love @ 2001-09-01 16:30 UTC (permalink / raw) Cc: keichwa, emacs-pretest-bug, ding >>>>> "EZ" == Eli Zaretskii <eliz@is.elta.co.il> writes: EZ> This should IMHO be optional at this time, This should _just work_. In general. [As far as I remember, utf-8 support in MUAs is mandated by IETF.] EZ> since Unicode support in the stock Emacs 21 distribution (without EZ> add-ons such as Mule-UCS) is limited and incomplete. It can't even be an option until the additions and changes are available for users to try. If it's so bad, they can either avoid using the support or fix it. It works for my purposes, and I'd like it to be available for others. There is no consistent rationale for refusing to base things on the current Unicode support. KOI support is incomplete (like at least most of the codepage.el coding systems); why is that offered? mac-roman depends on the base Unicode support (in the same way that Latin-8 and -9 probably should have done if they didn't precede mule-unicode); what about that? Anyhow, what editor and mailer _should_ people use with unlimited and complete Unicode support? EZ> For starters, AFAIK, Emacs cannot encode 8859-15 characters as EZ> UTF-8 (see the commentary in utf-8.el) unless those characters EZ> came from a UTF-8 encoded source to begin with, and thus are EZ> stored in the buffer as mule-unicode-NNNN characters. This is at best confused. 8859-15 is mostly the same as 8859-1, and the characters at issue will be decoded into the Mule charset `latin-iso8859-1'. Anyhow, It's pretty trivial to change the mule-utf-8 coding system to encode arbitrary Emacs characters with the aid of a translation table. It's even more trivial to unify on decoding, as I've said before. The 8859-15 coding system could use mule-unicode. 
EZ> (Perhaps Gnus can do such conversions with its own code; but I'm EZ> talking about core Emacs functionality here.) Gnus could bundle my code to do 8859/unicode unification and handle the complete set of GNUish charsets, but that wouldn't make sense in the absence of Emacs 21.1 and the facility should be available generally. Also, I don't want to waste effort supporting this in the face of a maintainer campaign against the basic features it needs, and it sounds as though it would be chucked out when Gnus was next reintegrated. EZ> This is not the kind of support that we could IMHO offer users as EZ> the default. Unifying 8859 on encoding to utf-8 is exactly the kind of support that should be default, as the users want. ^ permalink raw reply [flat|nested] 23+ messages in thread
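The remark that 8859-15 is mostly the same as 8859-1 can be checked by comparing the two code tables. The following Python sketch uses the standard codecs; the helper name `differing_bytes` is made up for the example.

```python
def differing_bytes(charset_a, charset_b):
    """Map each byte value that decodes differently to its two characters."""
    diffs = {}
    for value in range(0x100):
        raw = bytes([value])
        a = raw.decode(charset_a)
        b = raw.decode(charset_b)
        if a != b:
            diffs[value] = (a, b)
    return diffs


diffs = differing_bytes("iso-8859-1", "iso-8859-15")
for value, (latin1, latin9) in sorted(diffs.items()):
    print(f"0x{value:02X}: U+{ord(latin1):04X} vs U+{ord(latin9):04X}")
```

Only a handful of positions differ, and none of them is a German umlaut or the sharp s, so ordinary German text is byte-identical in both encodings.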
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-07-27 5:10 Gnus coming with Emacs 21pre-release: iso-8859-{1,15} Karl Eichwalder 2001-07-27 5:20 ` Karl Eichwalder 2001-07-27 8:45 ` Eli Zaretskii @ 2001-08-04 15:46 ` Florian Weimer 2001-08-04 16:54 ` Kai Großjohann ` (2 more replies) 2001-09-01 16:26 ` Dave Love 3 siblings, 3 replies; 23+ messages in thread From: Florian Weimer @ 2001-08-04 15:46 UTC (permalink / raw) Karl Eichwalder <keichwa@gmx.net> writes: > My proposal: by default send out such a message UTF-8 encoded (maybe, > ognus does this already -- Gnus coming with Emacs 21 should do the same, > please). It should work even with late pgnus versions if Emacs supports an UTF-8 coding system. I don't know why it was removed from Emacs 21. Perhaps Emacs 21 doesn't have a proper UTF-8 coding system? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 15:46 ` Florian Weimer @ 2001-08-04 16:54 ` Kai Großjohann 2001-08-04 17:15 ` Florian Weimer 2001-08-04 18:07 ` Eli Zaretskii 2001-09-01 16:30 ` Dave Love 2 siblings, 1 reply; 23+ messages in thread From: Kai Großjohann @ 2001-08-04 16:54 UTC (permalink / raw) Cc: emacs-pretest-bug, ding Florian Weimer <fw@deneb.enyo.de> writes: > Karl Eichwalder <keichwa@gmx.net> writes: > >> My proposal: by default send out such a message UTF-8 encoded (maybe, >> ognus does this already -- Gnus coming with Emacs 21 should do the same, >> please). > > It should work even with late pgnus versions if Emacs supports an > UTF-8 coding system. I don't know why it was removed from Emacs 21. > Perhaps Emacs 21 doesn't have a proper UTF-8 coding system? The Emacs 21 mule-unicode coding system groks iso-8859-1 characters, but not iso-8859-15. kai -- ~/.signature: No such file or directory ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 16:54 ` Kai Großjohann @ 2001-08-04 17:15 ` Florian Weimer 2001-08-04 17:57 ` Kai Großjohann 2001-08-04 18:02 ` Eli Zaretskii 0 siblings, 2 replies; 23+ messages in thread From: Florian Weimer @ 2001-08-04 17:15 UTC (permalink / raw) Cc: emacs-pretest-bug, ding Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes: >> It should work even with late pgnus versions if Emacs supports an >> UTF-8 coding system. I don't know why it was removed from Emacs 21. >> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system? > > The Emacs 21 mule-unicode coding system groks iso-8859-1 characters, > but not iso-8859-15. Is anybody needed for fixing this? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 17:15 ` Florian Weimer @ 2001-08-04 17:57 ` Kai Großjohann 2001-08-04 18:02 ` Eli Zaretskii 1 sibling, 0 replies; 23+ messages in thread From: Kai Großjohann @ 2001-08-04 17:57 UTC (permalink / raw) Cc: emacs-pretest-bug, ding Florian Weimer <fw@deneb.enyo.de> writes: > Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes: > >>> It should work even with late pgnus versions if Emacs supports an >>> UTF-8 coding system. I don't know why it was removed from Emacs 21. >>> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system? >> >> The Emacs 21 mule-unicode coding system groks iso-8859-1 characters, >> but not iso-8859-15. > > Is anybody needed for fixing this? I'm not sure what should be done. I think that somebody (Dave Love?) is working on proper transition to Unicode, so whatever is done now is only a temporary measure, right? I don't know whether a change like this can still go in 21.1. But I'm sure the Emacs maintainers know the full story. kai -- ~/.signature: No such file or directory ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 17:15 ` Florian Weimer 2001-08-04 17:57 ` Kai Großjohann @ 2001-08-04 18:02 ` Eli Zaretskii 2001-08-04 18:44 ` Florian Weimer 1 sibling, 1 reply; 23+ messages in thread From: Eli Zaretskii @ 2001-08-04 18:02 UTC (permalink / raw) Cc: Kai.Grossjohann, emacs-pretest-bug, ding > From: Florian Weimer <fw@deneb.enyo.de> > Date: Sat, 04 Aug 2001 19:15:43 +0200 > > Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes: > > >> It should work even with late pgnus versions if Emacs supports an > >> UTF-8 coding system. I don't know why it was removed from Emacs 21. > >> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system? > > > > The Emacs 21 mule-unicode coding system groks iso-8859-1 characters, > > but not iso-8859-15. > > Is anybody needed for fixing this? Yes, you need either (1) install an add-on package such as Mule-UCS; or (2) add support for using Unicode tables for encoding and decoding Mule charsets into and from UTF-8; or (3) replace the internal representation of characters used by Emacs to be based on Unicode. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 18:02 ` Eli Zaretskii @ 2001-08-04 18:44 ` Florian Weimer 2001-08-05 7:15 ` Eli Zaretskii 2001-09-01 16:29 ` Dave Love 0 siblings, 2 replies; 23+ messages in thread From: Florian Weimer @ 2001-08-04 18:44 UTC (permalink / raw) Cc: Kai.Grossjohann, emacs-pretest-bug, ding "Eli Zaretskii" <eliz@is.elta.co.il> writes: >> > The Emacs 21 mule-unicode coding system groks iso-8859-1 characters, >> > but not iso-8859-15. >> >> Is anybody needed for fixing this? > > Yes, you need either (1) install an add-on package such as Mule-UCS; > or (2) add support for using Unicode tables for encoding and decoding > Mule charsets into and from UTF-8; or (3) replace the internal > representation of characters used by Emacs to be based on Unicode. Is somebody working on this? Which option has been chosen by the Emacs maintainers? I think I've got some unusal ideas on how Emacs might approach some aspects of Unicode (and which aspects cannot be implemented without a major paradigm shift), and I'd like to share them (and eventually, some code). ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 18:44 ` Florian Weimer @ 2001-08-05 7:15 ` Eli Zaretskii 2001-09-01 16:29 ` Dave Love 1 sibling, 0 replies; 23+ messages in thread From: Eli Zaretskii @ 2001-08-05 7:15 UTC (permalink / raw) Cc: Kai.Grossjohann, emacs-pretest-bug, ding On Sat, 4 Aug 2001, Florian Weimer wrote: > >> > The Emacs 21 mule-unicode coding system groks iso-8859-1 characters, > >> > but not iso-8859-15. > >> > >> Is anybody needed for fixing this? > > > > Yes, you need either (1) install an add-on package such as Mule-UCS; > > or (2) add support for using Unicode tables for encoding and decoding > > Mule charsets into and from UTF-8; or (3) replace the internal > > representation of characters used by Emacs to be based on Unicode. > > Is somebody working on this? I hope so. > Which option has been chosen by the Emacs maintainers? The 3rd one, AFAIU. Since users want unification, it sounds like the best approach, although it also means lots of work. > I think I've got some unusal ideas on how Emacs might approach some > aspects of Unicode (and which aspects cannot be implemented without a > major paradigm shift), and I'd like to share them (and eventually, > some code). Please post those ideas to emacs-devel@gnu.org. Thanks. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 18:44 ` Florian Weimer 2001-08-05 7:15 ` Eli Zaretskii @ 2001-09-01 16:29 ` Dave Love 2001-09-02 11:01 ` Eli Zaretskii 1 sibling, 1 reply; 23+ messages in thread From: Dave Love @ 2001-09-01 16:29 UTC (permalink / raw) Cc: Eli Zaretskii, Kai.Grossjohann, emacs-pretest-bug, ding [-- Attachment #1: Type: text/plain, Size: 1080 bytes --] >>>>> "FW" == Florian Weimer <fw@deneb.enyo.de> writes: FW> Is somebody working on this? Do you not believe what I said I've done, which Karl quoted? (Please don't take it on trust, and I suggest not related work until you understand how to do it.) FW> I think I've got some unusal ideas on how Emacs might approach FW> some aspects of Unicode (and which aspects cannot be implemented FW> without a major paradigm shift), and I'd like to share them (and FW> eventually, some code). I posted the following recently about what is already implemented. What else did you want? [I'm sure I could do the same with Mule-UCS if I understood it and hacked it up to avoid data corruption with untranslatable characters.] If people want such facilities, I can only suggest they press the Emacs maintainers to include this sort of thing, even if they won't take my implementation. In addition to what I have, it's not clear to me what fundamentally prevents even Level 2 support now, but I don't need it. The main thing you definitely can't do with the current Mule is bidi. 
[-- Attachment #2: Type: message/rfc822, Size: 4648 bytes --] From: Dave Love <d.love@dl.ac.uk> To: Eli Zaretskii <eliz@is.elta.co.il> Cc: keichwa@gmx.net, haible@ilog.fr, pinard@iro.umontreal.ca, emacs-devel@gnu.org Subject: Re: Unicode support (was: null-device) Date: 22 Jul 2001 18:31:25 +0100 Message-ID: <rzq8zhg6fbm.fsf@djlvig.dl.ac.uk> >>>>> "EZ" == Eli Zaretskii <eliz@is.elta.co.il> writes: EZ> and if you try to save a buffer with Latin-3 text using EZ> ISO-8859-1 encoding, Emacs will say it's unable to do so, even if EZ> all the non-ASCII characters are from the subset of Latin-3 that EZ> is in the intersection of Latin-1 and Latin-3. The unification solution to this involves a few lines of code (which I've shown elsewhere) plus easily-generated tables. If you unify on decoding, as ISO 2022 appears to suggest, the issue basically doesn't arise anyway and even Emacs 20 has that facility. [I know a programmer _can_ break this, because it's Emacs.] Otherwise, you could actually expurgate the Latin-3 charset in favour of a trivial CCL coding system. EZ> You cannot support Unicode with this representation, because EZ> Unicode unifies characters by its very design principle. I don't accept this definition of ‘support Unicode’. Although I've been assured it doesn't or can't, I maintain my Emacs (without Mule-UCS) supports Unicode because at least: • It groks utf-8 (auto-detected in a utf-8 locale or from cues like ‘charset=’ in the file); • It can edit normally in the part of the BMP I need – Western technical text, including maths – better than, say, Yudit. It works under X and tty with or without a Unicode font; • In the rest of the BMP it can edit infelicitously (this could be improved) and display the CJK space covered by whichever three charsets I chose in a quick go; • It has several Unicode-based input methods; • As above, it can unify 8859 and others through Unicode during coding conversion. 
(I don't normally turn all that on, because it would mung some of the implementation files I edit.); • It has (using Unicode tables) coding systems for all the charsets not in base Emacs which haible told me are relevant for GNU locales. Their characters are unified by construction; • The MIME code DTRT, as (basically) does W3, for instance; • [It might DTRT with Unicode menu items under a suitable version of X, if that didn't get broken a while back]. If I can find the enthusiasm, I'll package what I've done if and when Emacs 21 is released. >> To attract hackers working on UTF-8 for Emacs Mule has to go away >> first. This is false by counter-examples, even for values of ‘utf-8’ equal to ‘Unicode’. The issue in my experience is making progress after they're attracted. The propaganda that gives rise to this false claim comes from people who either don't understand Mule and/or deliberately mislead about it and the people who work on it. I admit to being misled initially. EZ> What do you mean by ``first''? We need to replace the current EZ> representation by another, based on Unicode. It's not clear to me that I need this as a Unicode user, even if I was serious about wider or deeper coverage. I don't doubt handa has a good rationale for the re-implementation, though. Someone might like to justify it with arguments beyond coping with 8859. If necessary, I could build a non-standard Emacs now with a different set of private charsets to cover the whole BMP properly. That's undesirable if I ever have to deal with code or data using the replaced charsets, but presumably it could be declared official. Anyway, that level of compatibility has to break sometime. Otherwise, handa proposed extending the code space (apparently doable quickly) to accomplish the same sort of result with minimal grief. -- Bragging about Unicode support: ‘2d sinθ = nλ’ is plain text. ☺ <URL:http://www.unicode.org/> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-09-01 16:29 ` Dave Love @ 2001-09-02 11:01 ` Eli Zaretskii 2001-09-02 11:39 ` Florian Weimer 0 siblings, 1 reply; 23+ messages in thread From: Eli Zaretskii @ 2001-09-02 11:01 UTC (permalink / raw) Cc: fw, Kai.Grossjohann, emacs-pretest-bug, ding > From: Dave Love <d.love@dl.ac.uk> > Date: 01 Sep 2001 17:29:34 +0100 > > The main thing you definitely can't do with the current Mule is > bidi. I'm working on that (albeit very slowly, due to insufficient resources). Volunteers who are willing to work on Emacs internals are welcome to join the effort. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-09-02 11:01 ` Eli Zaretskii @ 2001-09-02 11:39 ` Florian Weimer 0 siblings, 0 replies; 23+ messages in thread From: Florian Weimer @ 2001-09-02 11:39 UTC (permalink / raw) Cc: d.love, Kai.Grossjohann, emacs-pretest-bug, ding Eli Zaretskii <eliz@is.elta.co.il> writes: >> The main thing you definitely can't do with the current Mule is >> bidi. > > I'm working on that (albeit very slowly, due to insufficient > resources). Volunteers who are willing to work on Emacs internals are > welcome to join the effort. The Unicode bidi algorithm is not compatible with environments which strongly favor hard line breaks over a more paragraph-centered approach. (BTW, where is the right forum to discuss such things? My subscription requests for the mailing lists I considered relevant were not honored.) ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 15:46 ` Florian Weimer 2001-08-04 16:54 ` Kai Großjohann @ 2001-08-04 18:07 ` Eli Zaretskii 2001-08-04 19:11 ` Florian Weimer 2001-09-01 16:30 ` Dave Love 2 siblings, 1 reply; 23+ messages in thread From: Eli Zaretskii @ 2001-08-04 18:07 UTC (permalink / raw) Cc: emacs-pretest-bug, ding > From: Florian Weimer <fw@deneb.enyo.de> > Date: Sat, 04 Aug 2001 17:46:55 +0200 > > Karl Eichwalder <keichwa@gmx.net> writes: > > > My proposal: by default send out such a message UTF-8 encoded (maybe, > > ognus does this already -- Gnus coming with Emacs 21 should do the same, > > please). > > It should work even with late pgnus versions if Emacs supports an > UTF-8 coding system. I don't know why it was removed from Emacs 21. Nothing was removed from Emacs 21. Emacs never supported UTF-8 before Emacs 21; in Emacs 21.1 there's a limited support for Latin-1 and for mule-unicode-* characters sets (which are used if the original text was encoded in UTF-8). ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 18:07 ` Eli Zaretskii @ 2001-08-04 19:11 ` Florian Weimer 2001-08-05 7:15 ` Eli Zaretskii 0 siblings, 1 reply; 23+ messages in thread From: Florian Weimer @ 2001-08-04 19:11 UTC (permalink / raw) Cc: emacs-pretest-bug, ding "Eli Zaretskii" <eliz@is.elta.co.il> writes: >> > My proposal: by default send out such a message UTF-8 encoded (maybe, >> > ognus does this already -- Gnus coming with Emacs 21 should do the same, >> > please). >> >> It should work even with late pgnus versions if Emacs supports an >> UTF-8 coding system. I don't know why it was removed from Emacs 21. > > Nothing was removed from Emacs 21. Ah, I see. I've read some claims before that Emacs 21 will support Unicode, but this doesn't seem to be quite right. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-04 19:11 ` Florian Weimer @ 2001-08-05 7:15 ` Eli Zaretskii 2001-09-01 16:28 ` Dave Love 0 siblings, 1 reply; 23+ messages in thread From: Eli Zaretskii @ 2001-08-05 7:15 UTC (permalink / raw) Cc: emacs-pretest-bug, ding On Sat, 4 Aug 2001, Florian Weimer wrote: > > Nothing was removed from Emacs 21. > > Ah, I see. I've read some claims before that Emacs 21 will support > Unicode, but this doesn't seem to be quite right. Emacs 21 does support Unicode, but this support is limited unless you augment it with local changes or add-on packages. The main limitation is that the Unicode charsets are disjoint from the other charsets supported by Emacs, and that, with the exception of UTF-8 and Latin-1, all the coding systems supported by Emacs cannot produce Unicode characters. The practical implication of this is that if you want to work with Unicode characters, you are limited to reading and writing UTF-8 and Latin-1 text. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15} 2001-08-05 7:15 ` Eli Zaretskii @ 2001-09-01 16:28 ` Dave Love 0 siblings, 0 replies; 23+ messages in thread From: Dave Love @ 2001-09-01 16:28 UTC (permalink / raw) Cc: Florian Weimer, emacs-pretest-bug, ding >>>>> "EZ" == Eli Zaretskii <eliz@is.elta.co.il> writes: EZ> The main limitation is that the Unicode charsets are disjoint EZ> from the other charsets supported by Emacs, Of course, by definition. It's misleading to imply that there's anything special about them per se. You might as well say that Japanese support is limited for that reason. After all, it includes most of the Latin-N characters. [The primary limitation of the mule-unicode support is that there weren't enough free slots for private charsets to cover the BMP after jisx213 (?) was added.] EZ> and that, with the exception of UTF-8 and Latin-1, all the coding EZ> systems supported by Emacs cannot produce Unicode characters. Even assuming that means `no other bundled coding system encodes mule-unicode-... chars', it's not true. Anyhow, handa said that the way mac-roman is implemented is the right thing. If there's some problem with that, Mac users are stuffed, but such a problem has eluded me in extensive use. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
From: Dave Love @ 2001-09-01 16:30 UTC
Cc: emacs-pretest-bug, ding

>>>>> "FW" == Florian Weimer <fw@deneb.enyo.de> writes:

 FW> It should work even with late pgnus versions if Emacs supports an
 FW> UTF-8 coding system.

I don't think so.  I had to fix Gnus 5.9 to make mule-utf-8 or current
Mule-UCS's utf-8 work at all.

 FW> Perhaps Emacs 21 doesn't have a proper UTF-8 coding system?

It does, but choosing a charset should not depend on how the relevant
coding systems are defined (as the Gnus code did whenever I last
looked).  See the charset-determining code I posted.
* Re: Gnus coming with Emacs 21pre-release: iso-8859-{1,15}
From: Dave Love @ 2001-09-01 16:26 UTC
Cc: emacs-pretest-bug, ding

>>>>> "KE" == Karl Eichwalder <keichwa@gmx.net> writes:

 KE> If I start Emacs 21pre under the locale
 KE> LANG=de_DE.ISO-8859-15
 KE> and reply to a iso-8859-1 encoded message (containing umlaut
 KE> letters), my reply message is arranged as a multipart message
 KE> even if there's no ambiguity involved.

 KE> My proposal: by default send out such a message UTF-8 encoded
 KE> (maybe, ognus does this already -- Gnus coming with Emacs 21
 KE> should do the same, please).  Please ask, if it isn't clear
 KE> enough what I intend to say.

My point about this obviously didn't sink in.  You have to unify the
relevant characters, and if you're dealing with Latin-N in the first
place, it makes sense to unify to Latin-N, not Unicode.  I explained
already why I personally don't unify 8859-x to 8859-N, so in my case I
do get utf-8.

The chosen coding system (MIME charset) should just be the highest
priority one with which Emacs can encode the message -- that's all.
Assuming `umlaut letters' means German, in this case that should be
iso-8859-15 if you unify 8859 by one of the possible means.

 KE> Are there variables to control this behavior?

Not exactly, but you need quite trivial additions to Emacs, or
Mule-UCS, plus at least one change to Gnus to get this sort of thing
right in general.  [Actually, I don't know for sure that Mule-UCS does
this particular job as it stands, but it could be taught.]

This should not be specific to utf-8 or other charsets either.  It
should just work, as it does for me after customization.

 KE> Just say "yes" and I'll read again the manual ;)

Sorry, you probably have to read various Mule code.  I didn't write
the relevant documentation in the end.

Apart from the base coding system support, you have to get Gnus to
choose the right MIME charset/coding system.  Here is a re-written
function for Gnus, which DTRT generally and may or may not still be
relevant to one of the code bases.  [I think after writing this I found
some similar code of handa's that sendmail.el uses.]

(defun mm-find-mime-charset-region (b e)
  "Return the MIME charsets needed to encode the region between B and E.
Nil means ASCII, a single-element list represents an appropriate MIME
charset, and a longer list means no appropriate charset."
  ;; The return possibilities of this function are a mess...
  (or (and
       (mm-multibyte-p)
       ;; How are you supposed to do this in XEmacs?
       (fboundp 'find-coding-systems-region)
       ;; Find the mime-charset of the most preferred coding
       ;; system that has one.
       (let ((systems (find-coding-systems-region b e))
             result)
         ;; Fixme: The `mime-charset' (`x-ctext') of `compound-text'
         ;; is not in the IANA list.
         (setq systems (delq 'compound-text systems))
         (unless (equal systems '(undecided))
           (while systems
             (let ((cs (coding-system-get (pop systems) 'mime-charset)))
               (if cs
                   (setq systems nil
                         result (list cs))))))
         result))
      ;; Otherwise we're not multibyte or a single coding system won't
      ;; cover it.
      (mm-delete-duplicates
       (mapcar 'mm-mime-charset
               (delq 'iso-2022-jp       ; ??
                     (delq 'ascii
                           (mm-find-charset-region b e)))))))
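The selection rule described in the message above (take the most preferred coding system that can encode the text and use its MIME charset) can be tried on a string in isolation. What follows is only a minimal sketch, assuming an Emacs that provides `find-coding-systems-region' and `coding-system-get'; the helper name `my-preferred-mime-charset' is hypothetical and is not part of Gnus or Emacs.

```elisp
;; Sketch only: `my-preferred-mime-charset' is a hypothetical helper
;; name used for illustration, not a Gnus or Emacs function.
(defun my-preferred-mime-charset (string)
  "Return the MIME charset of the preferred coding system for STRING.
Return nil if STRING is ASCII-only or no candidate has a MIME charset."
  (with-temp-buffer
    (insert string)
    ;; Candidates come back ordered by coding-system priority.
    (let ((systems (find-coding-systems-region (point-min) (point-max)))
          charset)
      ;; `(undecided)' is what Emacs returns for ASCII-only text.
      (unless (equal systems '(undecided))
        ;; Stop at the first candidate that has a MIME charset.
        (while (and systems (not charset))
          (setq charset
                (coding-system-get (pop systems) 'mime-charset))))
      charset)))

;; Example calls; the second result depends on the user's
;; coding-system priorities, so no particular value is assumed here.
(my-preferred-mime-charset "plain ASCII")
(my-preferred-mime-charset "Gr\u00fc\u00dfe")
```

Which charset comes back for non-ASCII text depends on the coding-system priority in effect (for instance as set with `prefer-coding-system'), which is the point made in the message: the choice follows the user's priorities rather than being tied to utf-8 or any other particular charset.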