* More charset things @ 1999-02-03 18:09 Lars Magne Ingebrigtsen 1999-02-04 14:56 ` Hrvoje Niksic 0 siblings, 1 reply; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-03 18:09 UTC (permalink / raw) I've gone through the HELLO files under XEmacs and Emacs, and I'm now able to post everything there (except Lao). I had to rewrite some bits to be able to deal with things that use different MULE charsets, but the same MIME charset. The solution was just to do away with most MULE charset thingies, and just do MIME charset thingies instead. Anyway -- body encodings. The reason I'm not able to post Lao is that some of the octets in the Lao stream seem to make Emacs and/or the nntp server choke. I haven't really done any body encoding things -- if it's text, Gnus posts using 8bit or 7bit. But there should be a way to say what MIME charsets should be encoded what way -- 7bit, 8bit, base64 and qp. There is a `rfc2047-charset-encoding-alist', but that says how to encode things in the headers. Should I just add an `mm-charset-encoding-alist' for the bodies? Yes. Fix in Pterodactyl Gnus v0.76. And with that, I hereby declare the charset bits of MIME to be implemented by Gnus. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
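The per-charset body encoding table asked about in this message can be pictured with a small sketch. It is written in Python rather than Emacs Lisp, and it does not claim to show how Gnus implements `mm-charset-encoding-alist'; the table contents and the names `CHARSET_ENCODING` and `encode_body` are assumptions made purely for illustration.

```python
import base64
import quopri

# Hypothetical table: MIME charset -> Content-Transfer-Encoding for bodies.
# The entries are illustrative choices, not Gnus defaults.
CHARSET_ENCODING = {
    "us-ascii": "7bit",
    "iso-8859-1": "quoted-printable",
    "iso-8859-5": "base64",
}


def encode_body(text, charset):
    """Return (transfer_encoding, payload_bytes) for a text body."""
    raw = text.encode(charset)
    # Charsets missing from the table fall back to 8bit in this sketch.
    cte = CHARSET_ENCODING.get(charset, "8bit")
    if cte == "7bit" and any(octet > 0x7F for octet in raw):
        raise ValueError("a 7bit body may not contain octets above 0x7F")
    if cte == "base64":
        return cte, base64.encodebytes(raw)
    if cte == "quoted-printable":
        return cte, quopri.encodestring(raw)
    return cte, raw  # 7bit and 8bit bodies are passed through unchanged
```

With a table like this, the charset alone decides how the body octets travel, which is the same split the header-side `rfc2047-charset-encoding-alist' is described as making for headers.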
* Re: More charset things 1999-02-03 18:09 More charset things Lars Magne Ingebrigtsen @ 1999-02-04 14:56 ` Hrvoje Niksic 1999-02-04 17:08 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 43+ messages in thread From: Hrvoje Niksic @ 1999-02-04 14:56 UTC (permalink / raw) Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > And with that, I hereby declare the charset bits of MIME to be > implemented by Gnus. Uh-oh. How can we possibly be compliant when there is no support for UTF-8? Also, Gnus still happily sends out 8bit stuff in email headers, losing all charset information, even when it receives it. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-04 14:56 ` Hrvoje Niksic @ 1999-02-04 17:08 ` Lars Magne Ingebrigtsen 1999-02-04 17:21 ` Hrvoje Niksic 1999-02-07 19:35 ` François Pinard 0 siblings, 2 replies; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-04 17:08 UTC (permalink / raw) Hrvoje Niksic <hniksic@srce.hr> writes: > Uh-oh. How can we possibly be compliant when there is no support for > UTF-8? That's not my table. :-) When MULE supports utf-8, Gnus will support utf-8. > Also, Gnus still happily sends out 8bit stuff in email headers, losing > all charset information, even when it receives it. Aarh, yes, I had forgotten that I was going to go over the charset things in non-MULE XEmacsen. (By the way -- is it "MULE" or "Mule"? I'm waffling all over the place when I write that word. Perhaps I should start writing it "mUlE"?) -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-04 17:08 ` Lars Magne Ingebrigtsen @ 1999-02-04 17:21 ` Hrvoje Niksic 1999-02-04 17:49 ` Lars Magne Ingebrigtsen 1999-02-07 19:37 ` François Pinard 1999-02-07 19:35 ` François Pinard 1 sibling, 2 replies; 43+ messages in thread From: Hrvoje Niksic @ 1999-02-04 17:21 UTC (permalink / raw) Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Hrvoje Niksic <hniksic@srce.hr> writes: > > > Uh-oh. How can we possibly be compliant when there is no support for > > UTF-8? > > That's not my table. :-) When MULE supports utf-8, Gnus will > support utf-8. That is not a nice way of thinking. MULE is little else than a Japanese version of Emacs, and it appears that the Japanese are not interested in Unicode. So it wasn't implemented. I'm not sure about FSF, but for XEmacs, I know of no plans to implement it in the near future. > (By the way -- is it "MULE" or "Mule"? The original thing was called MULE. The XEmacs developers prefer to call the XEmacs variant `Mule', or `XEmacs/Mule'. FSF Emacs maintainers seem to prefer Emacs/MULE. Unlike the XEmacs/Xemacs issue, no one seems to mind the different spellings. So I guess mULE would be just fine. :-) ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-04 17:21 ` Hrvoje Niksic @ 1999-02-04 17:49 ` Lars Magne Ingebrigtsen 1999-02-05 0:47 ` Stephen J. Turnbull 1999-02-07 20:43 ` François Pinard 1999-02-07 19:37 ` François Pinard 1 sibling, 2 replies; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-04 17:49 UTC (permalink / raw) Cc: xemacs-mule Hrvoje Niksic <hniksic@srce.hr> writes: > > That's not my table. :-) When MULE supports utf-8, Gnus will > > support utf-8. > > That is not a nice way of thinking. I don't see any other way of thinking. Grokking utf-8 is way outside the scope of Gnus -- it has to be an Emacs thing. > MULE is little else than a Japanese version of Emacs, and it appears > that the Japanese are not interested in Unicode. So it wasn't > implemented. I'm not sure about FSF, but for XEmacs, I know of no > plans to implement it in the near future. A partial implementation of utf-mumble was posted recently somewhere by someone. (Could I possibly get any more vague?) So I'm Cc'ing this to the xemacs-mule list. Anyway, I find that I'm strangely fascinated by the idea of an editor that allows intermingling of text that uses a variety of character sets. I have an urge to jump into the matter, but I'm such a charset novice that I don't really feel qualified. (Well, I don't have the time, either, but that's a minor detail.) I asked before for a likely book that would introduce me to the basic concepts, and someone (Stephen Turnbull?) told me, but then I forgot. (At least, I can't find any books on charset issues in my list of books to buy.) Could that someone (or someone else) re-recommend the book(s) that I should buy to get both an introduction and more in-depth knowledge about charset issues? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-04 17:49 ` Lars Magne Ingebrigtsen @ 1999-02-05 0:47 ` Stephen J. Turnbull 1999-02-05 2:43 ` Hrvoje Niksic ` (5 more replies) 1999-02-07 20:43 ` François Pinard 1 sibling, 6 replies; 43+ messages in thread From: Stephen J. Turnbull @ 1999-02-05 0:47 UTC (permalink / raw) >>>>> "Lars" == Lars Magne Ingebrigtsen <larsi@gnus.org> writes: Lars> Hrvoje Niksic <hniksic@srce.hr> writes: >> MULE is little else than a Japanese version of Emacs, and it >> appears that the Japanese are not interested in Unicode. So it The MULE development group is nearly entirely Japanese; including the people implementing Devanagari (for sure) and Arabic and Ethiopic (IIRC). Not surprisingly, the tuning (and tuning is absolutely necessary; the linguists don't know enough about language for charset guessing and the like to be more than heuristic) is best for Japanese, and bugs for non-Japanese languages don't get found and fixed quickly. But MULE is the only truly multilingual platform there is at the moment, to the best of my knowledge; Unicode doesn't satisfy the needs of lots of people, and is not easily extensible without changing the standard. MULE is. MULE is more than a Japanese version of Emacs. The Japanese are divided on Unicode; some are vehemently opposed, others are interested. There don't seem to be any strong advocates, though. >> wasn't implemented. I'm not sure about FSF, but for XEmacs, I >> know of no plans to implement it in the near future. Lars> A partial implementation of utf-mumble was posted recently Lars> somewhere by someone. (Could I possible get any more Lars> vague?) So I'm Cc'ing this to the xemacs-mule list. Morioka-san ported (IIRC) a Lisp-level implementation of UTF-8. The attachments were broken on the ML (so Steve never was able to look at it), I'll restore from archive the working (I hope) copy I got from Morioka. Martin Buchholz believes that since the tables are in Lisp, the performance impact will be huge. 
Lars> I asked before for a likely book that would introduce me to Lars> the basic concepts, and someone (Stephen Turnbull?) told me, Lars> but then I forgot.

Prices are vague recollections, in decreasing order of importance for basic understanding:

Ken Lunde. Chinese, Japanese, Korean and Vietnamese Information Processing. O'Reilly & Associates. Probably the most useful single volume, although it doesn't cover single-octet encodings.

ISO. ISO-2022: Extension Techniques for Coded Character Sets. US$75.

Unicode Consortium. The Unicode Standard, v2.x. About US$70 from Amazon.

ISO. ISO-10646: Universal Multi-octet Character Set Encoding Standard. About US$125. Don't bother unless you've got extra money; the Unicode Standard is much more complete and readable. All ISO-10646 has extra is 4-octet encoding, which is presently useless, and it is very likely that any UTF-8 .

I don't know of any textbooks on character set stuff, but there must be some somewhere. Lunde's book will have a very extensive bibliography.

-- University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Institute of Policy and Planning Sciences Tel/fax: +81 (298) 53-5091 __________________________________________________________________________ __________________________________________________________________________ What are those two straight lines for? "Free software rules." ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-05 0:47 ` Stephen J. Turnbull @ 1999-02-05 2:43 ` Hrvoje Niksic [not found] ` <m3hft163aa.fsf@peorth.gweep.net> ` (4 subsequent siblings) 5 siblings, 0 replies; 43+ messages in thread From: Hrvoje Niksic @ 1999-02-05 2:43 UTC (permalink / raw) "Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp> writes: > The MULE development group is nearly entirely Japanese; including > the people implementing Devanagari (for sure) and Arabic and > Ethiopic (IIRC). Not surprisingly, the tuning (and tuning is > absolutely necessary; the linguists don't know enough about language > for charset guessing and the like to be more than heuristic) is best > for Japanese, and bugs for non-Japanese languages don't get found > and fixed quickly. They don't get fixed at all, Stephen. I don't like to bitch all that much about the subject, since I could come out with the patches as well as anybody else (but my disgust at the code is another matter), only I *have* to correct you when you say that bugs don't get fixed "quickly". I have reported a number of latin2-related bugs in XEmacs/Mule, and I haven't seen a fix for any of them. I, a latin2 user, am supposed to be a target audience for Mule, and yet I cannot bring myself to use it for longer than ten minutes. If Mule is usable for anyone except the latin1 people and the Japanese (== majority), I'm happy for them. But it's not my cup of coffee. Not yet. > But MULE is the only truly multilingual platform there is at the > moment, to the best of my knowledge; Unicode doesn't satisfy the > needs of lots of people, and is not easily extensible without > changing the standard. MULE is. MULE is more than a Japanese > version of Emacs. :-( > Lars> A partial implementation of utf-mumble was posted recently > Lars> somewhere by someone. (Could I possible get any more > Lars> vague?) So I'm Cc'ing this to the xemacs-mule list. > > Morioka-san ported (IIRC) a Lisp-level implementation of UTF-8. 
Can such a thing even work under XEmacs/Mule? The design differences sound as if they make such a thing impossible. ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <m3hft163aa.fsf@peorth.gweep.net>]
* Re: More charset things [not found] ` <m3hft163aa.fsf@peorth.gweep.net> @ 1999-02-05 19:06 ` Vladimir Volovich [not found] ` <m3sockqqjx.fsf@peorth.gweep.net> 0 siblings, 1 reply; 43+ messages in thread From: Vladimir Volovich @ 1999-02-05 19:06 UTC (permalink / raw) "Rat" == Stainless Steel Rat writes: Rat> MULE does not work at all well in Europe or other parts of the Rat> world that use ISO-8859-X 8-bit character sets. Well, in Emacs 20.3, Mule works quite satisfactorily for Cyrillic encodings (including but not limited to iso-8859-5). Best regards, -- Vladimir. ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <m3sockqqjx.fsf@peorth.gweep.net>]
* Re: More charset things [not found] ` <m3sockqqjx.fsf@peorth.gweep.net> @ 1999-02-06 15:55 ` Lars Magne Ingebrigtsen [not found] ` <m3lnia5922.fsf@peorth.gweep.net> 1999-02-08 16:04 ` Bill White 0 siblings, 2 replies; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-06 15:55 UTC (permalink / raw) [-- Attachment #1: Type: text/plain, Size: 180 bytes --] Stainless Steel Rat <ratinox@peorth.gweep.net> writes: > Try mixing ISO-8859-5 with ISO-8859-[1-4] sometime and you will see just > how badly broken MULE really is. Lét's see... [-- Attachment #2: Type: text/plain, Size: 29 bytes --] Здравствуйте! And some more [-- Attachment #3: Type: text/plain, Size: 132 bytes --] Latïn-1. Looks OK to me... -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <m3lnia5922.fsf@peorth.gweep.net>]
* Re: More charset things [not found] ` <m3lnia5922.fsf@peorth.gweep.net> @ 1999-02-07 21:02 ` Hrvoje Niksic 1999-02-09 15:56 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 43+ messages in thread From: Hrvoje Niksic @ 1999-02-07 21:02 UTC (permalink / raw) [-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 886 bytes --] Stainless Steel Rat <ratinox@peorth.gweep.net> writes: > "Lars" == Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > > Lars> Lét's see... Здравствуйте! And some more Latïn-1. Looks OK to me... > > Then you lucked out for some reason. Many others (here, notably > Hrvoje), have had numerous problems with it. Mixing charset works for me in a Mule buffer, but there are environmental brain-damages that appear to be incurable for Mule. For instance, it insists that the default 128-255 chars are iso-8859-1, which is a hard-coded arbitrary value with no hope of ever changing it to iso-8859-2. Non-Mule XEmacs can be set up to work with latin2 just fine -- you simply point it to latin2 fonts, and it works out of the box. This strategy works on TTY's too (which is another problem with XEmacs/Mule). Thus for me, Mule is useless. So much for the "internationalization". ^ permalink raw reply [flat|nested] 43+ messages in thread
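Archived copies of this message show the quoted Russian greeting as `·ÔàÐÒáâÒãÙâÕ!`. That string is what appears when ISO-8859-5 octets are decoded as ISO-8859-1, which fits the hard-coded Latin-1 default complained about here. The short Python sketch below reproduces the effect; the helper name `misread` is an assumption introduced for illustration.

```python
def misread(text, actual_charset, assumed_charset):
    """Encode text in one charset, then decode the same octets as another."""
    return text.encode(actual_charset).decode(assumed_charset)


greeting = "Здравствуйте"

# Cyrillic octets interpreted as Latin-1 give the mangled form.
mangled = misread(greeting, "iso-8859-5", "iso-8859-1")

# The damage is reversible as long as the octets themselves were preserved.
restored = misread(mangled, "iso-8859-1", "iso-8859-5")
```

Because both charsets assign a character to every octet in the upper half, nothing fails loudly; the text is simply shown with the wrong glyphs.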
* Re: More charset things 1999-02-07 21:02 ` Hrvoje Niksic @ 1999-02-09 15:56 ` Lars Magne Ingebrigtsen 1999-02-09 17:21 ` Hrvoje Niksic 0 siblings, 1 reply; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-09 15:56 UTC (permalink / raw) Hrvoje Niksic <hniksic@srce.hr> writes: > Mixing charset works for me in a Mule buffer, but there are > environmental brain-damages that appear to be incurable for Mule. For > instance, it insists that the default 128-255 chars are iso-8859-1, > which is a hard-coded arbitrary value with no hope of ever changing it > to iso-8859-2. Hm. In this message, for instance, isn't "Dzień dobry" rendered correctly for you if you use a Mule XEmacs? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-09 15:56 ` Lars Magne Ingebrigtsen @ 1999-02-09 17:21 ` Hrvoje Niksic 1999-02-09 17:31 ` Alan Shutko 1999-02-09 17:37 ` Lars Magne Ingebrigtsen 0 siblings, 2 replies; 43+ messages in thread From: Hrvoje Niksic @ 1999-02-09 17:21 UTC (permalink / raw) Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Hrvoje Niksic <hniksic@srce.hr> writes: > > > Mixing charset works for me in a Mule buffer, but there are > > environmental brain-damages that appear to be incurable for Mule. For > > instance, it insists that the default 128-255 chars are iso-8859-1, > > which is a hard-coded arbitrary value with no hope of ever changing it > > to iso-8859-2. > > Hm. In this message, for instance, isn't "Dzień dobry" rendered > correctly for you if you use a Mule XEmacs? I don't do Mule, but I suspect it renders correctly as long as the charset parameter is right (and Gnus gets things right). But that wasn't the point. The point is that I cannot explain to XEmacs/Mule that all the 8bit files I will want to load and save in the near future are latin2, and that if it encounters chars in the appropriate subset of [128,256) range, it should treat them as latin2, not latin1. Currently I have to do things like `C-u C-x C-f FILENAME RET iso-8859-2 RET'. Also, I don't want to see the iso2022 (or whatever) coding on my saved files, *ever*. If files have to be saved in a multicharset format, it should be implemented as Unicode, so that at least other (non-Japanese) software has a chance of getting it right. ^ permalink raw reply [flat|nested] 43+ messages in thread
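The behaviour being asked for, a user-settable default for unlabelled 8-bit files, can be sketched in a few lines of Python. This is not how XEmacs/Mule reads files; the function name `decode_unlabelled` and its `default_charset` parameter are assumptions used only to make the request concrete.

```python
def decode_unlabelled(raw, default_charset="iso-8859-1"):
    """Decode octets that carry no charset label.

    Pure ASCII needs no guess; anything with octets in the 128-255 range
    is read using the configured default instead of a fixed Latin-1.
    """
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        return raw.decode(default_charset)


# "šećer" (Croatian for sugar) as it would sit in a latin2 file.
octets = "šećer".encode("iso-8859-2")

as_latin1 = decode_unlabelled(octets)                # hard-coded style default
as_latin2 = decode_unlabelled(octets, "iso-8859-2")  # user-chosen default
```

The same octets yield two different strings, so the only thing that can make the latin2 reading win is a default the user is allowed to change.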
* Re: More charset things 1999-02-09 17:21 ` Hrvoje Niksic @ 1999-02-09 17:31 ` Alan Shutko 1999-02-09 17:37 ` Lars Magne Ingebrigtsen 1 sibling, 0 replies; 43+ messages in thread From: Alan Shutko @ 1999-02-09 17:31 UTC (permalink / raw) Cc: ding >>>>> "H" == Hrvoje Niksic <hniksic@srce.hr> writes: H> The point is that I cannot explain XEmacs/Mule that all the 8bit H> files I will want to load and save in the near future are latin2, H> and that if it encounters chars in the appropriate subset of H> [128,256) range, it should treat them as latin2, not latin1. In Emacs, there's a variable "default-buffer-file-coding-system", which may do what you want. Is that variable in XEmacs? -- Alan Shutko <ats@acm.org> - By consent of the corrupted A woman's place is in the house... and in the Senate. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-09 17:21 ` Hrvoje Niksic 1999-02-09 17:31 ` Alan Shutko @ 1999-02-09 17:37 ` Lars Magne Ingebrigtsen 1999-02-09 18:06 ` Hrvoje Niksic 1 sibling, 1 reply; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-09 17:37 UTC (permalink / raw) Hrvoje Niksic <hniksic@srce.hr> writes: > The point is that I cannot explain XEmacs/Mule that all the 8bit files > I will want to load and save in the near future are latin2, and that > if it encounters chars in the appropriate subset of [128,256) range, > it should treat them as latin2, not latin1. Currently I have to do > things like `C-u C-x C-f FILENAME RET iso-8859-2 RET'. Huh. How, er, useless. I thought that this was what Mule was all about -- letting you do this automatically? Isn't (set-language-environment "Latin-2") (or something) what one is supposed to do? > Also, I don't want to see the iso2022 (or whatever) coding on my saved > files, *ever*. If files have to be saved in a multicharset format, it > should be implemented as Unicode, so that at least other > (non-Japanese) software has a chance of getting it right. Yup. Someone really needs to implement Unicode for the Emacsen. :-) -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-09 17:37 ` Lars Magne Ingebrigtsen @ 1999-02-09 18:06 ` Hrvoje Niksic 0 siblings, 0 replies; 43+ messages in thread From: Hrvoje Niksic @ 1999-02-09 18:06 UTC (permalink / raw) Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Hrvoje Niksic <hniksic@srce.hr> writes: > > > The point is that I cannot explain XEmacs/Mule that all the 8bit files > > I will want to load and save in the near future are latin2, and that > > if it encounters chars in the appropriate subset of [128,256) range, > > it should treat them as latin2, not latin1. Currently I have to do > > things like `C-u C-x C-f FILENAME RET iso-8859-2 RET'. > > Huh. How, er, useless. I thought that this was what Mule was all > about -- letting you do this automatically? Isn't > (set-language-environment "Latin-2") (or something) what one is > supposed to do? It didn't work for me in XEmacs/Mule when I tried it. I asked about it on the mailing list, and no one was able to instruct me how to do it right. So I concluded that it can't be done. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-06 15:55 ` Lars Magne Ingebrigtsen [not found] ` <m3lnia5922.fsf@peorth.gweep.net> @ 1999-02-08 16:04 ` Bill White 1999-02-09 16:04 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 43+ messages in thread From: Bill White @ 1999-02-08 16:04 UTC (permalink / raw) [-- Attachment #1: Type: text/plain, Size: 609 bytes --] Lars - your Russian text shows up only when I switch fonts to "standard: 16-dot medium" via the Mule:Set Font/Fontset menu. Otherwise it's empty boxes. I use (set-default-font "-b&h-lucidatypewriter-medium-*-*-*-12-120-*-*-*-*-*-*") in my .emacs. What font are you using that lets the Russian characters show up? bw In message <m3r9s3edbp.fsf@quimbies.gnus.org>, Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: > Stainless Steel Rat <ratinox@peorth.gweep.net> writes: > > > Try mixing ISO-8859-5 with ISO-8859-[1-4] sometime and you will see just > > how badly broken MULE really is. > > Lét's see... [-- Attachment #2: Type: text/plain, Size: 29 bytes --] Здравствуйте! And some more [-- Attachment #3: Type: text/plain, Size: 208 bytes --] Latïn-1. Looks OK to me... > > -- > (domestic pets only, the antidote for overdose, milk.) > larsi@gnus.org * Lars Magne Ingebrigtsen -- Bill White . billw@wolfram.com . http://www.wolfram.com/~billw ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-08 16:04 ` Bill White @ 1999-02-09 16:04 ` Lars Magne Ingebrigtsen 0 siblings, 0 replies; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-09 16:04 UTC (permalink / raw) Bill White <billw@wolfram.com> writes: > in my .emacs. What font are you using that lets the Russian characters > show up? I'm using the intlfonts package, which contains oodles of fonts. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-05 0:47 ` Stephen J. Turnbull 1999-02-05 2:43 ` Hrvoje Niksic [not found] ` <m3hft163aa.fsf@peorth.gweep.net> @ 1999-02-06 8:17 ` Lars Magne Ingebrigtsen 1999-02-09 10:27 ` Displayed [ 0: Stephen J. Turnbull ] but it had lots of lines Alf-Ivar Holm ` (2 subsequent siblings) 5 siblings, 0 replies; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-06 8:17 UTC (permalink / raw) "Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp> writes: > Prices are vague recollections, in decreasing order of importance for > basic understanding: Thanks; I've ordered ISO-2022, and I'm ordering the Lunde and the Unicode Standard on Monday. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
* Displayed [ 0: Stephen J. Turnbull ] but it had lots of lines 1999-02-05 0:47 ` Stephen J. Turnbull ` (2 preceding siblings ...) 1999-02-06 8:17 ` Lars Magne Ingebrigtsen @ 1999-02-09 10:27 ` Alf-Ivar Holm 1999-02-09 16:14 ` Lars Magne Ingebrigtsen 1999-02-09 22:07 ` More charset things Jan Vroonhof [not found] ` <m3hft163aa.fsf@p <byu2wv6xkb.fsf@bolzano.math.ethz.ch> 5 siblings, 1 reply; 43+ messages in thread From: Alf-Ivar Holm @ 1999-02-09 10:27 UTC (permalink / raw) I got this in my summary buffer: R [ 0: Stephen J. Turnbull ] but it did have lots of text. (It's the last message in the References header, do ^.) Affi ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Displayed [ 0: Stephen J. Turnbull ] but it had lots of lines 1999-02-09 10:27 ` Displayed [ 0: Stephen J. Turnbull ] but it had lots of lines Alf-Ivar Holm @ 1999-02-09 16:14 ` Lars Magne Ingebrigtsen 0 siblings, 0 replies; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-09 16:14 UTC (permalink / raw) Alf-Ivar Holm <affi@osc.no> writes: > I got this in my summary buffer: > > R [ 0: Stephen J. Turnbull ] > > but it did have lots of text. (Its the last message in the References > header, do ^.) It shows up as RA [ 63: Stephen J. Turnbull ] Re: More charset things here, using nnml. What do you use? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-05 0:47 ` Stephen J. Turnbull ` (3 preceding siblings ...) 1999-02-09 10:27 ` Displayed [ 0: Stephen J. Turnbull ] but it had lots of lines Alf-Ivar Holm @ 1999-02-09 22:07 ` Jan Vroonhof [not found] ` <m3hft163aa.fsf@p <byu2wv6xkb.fsf@bolzano.math.ethz.ch> 5 siblings, 0 replies; 43+ messages in thread From: Jan Vroonhof @ 1999-02-09 22:07 UTC (permalink / raw) Cc: xemacs-mule Hrvoje Niksic <hniksic@srce.hr> writes: > It didn't work for me in XEmacs/Mule when I tried it. I asked about > it on the mailing list, and noone was able to instruct me how to do it > right. So I concluded that it can't be done. I think it can be done, but I think most of the language environments are just plain wrong. For instance even in a "Croatian" language environment the 'ctext coding system is preferred over iso-8859-2. Ctext is a good choice for latin-1 based systems as it is "backwards compatible" with latin-1. I did some experimenting and I think you should try (set-language-environment "Croatian") (set-coding-category-system 'iso-8-designate 'iso-8859-2) This makes iso-8859-2 the preferred non Japanese coding system. I think you might even prefer setting the coding system priorities to avoid all the Japanese ones. For some reason all the code to do this is commented out. Note that the FSF versions of the language environments do change the coding priorities, but they are now handled centrally. Somehow I have the feeling somebody tried to sync the XEmacs files with the FSF versions but stopped midway. Jan ^ permalink raw reply [flat|nested] 43+ messages in thread
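The point about coding system priorities can be illustrated with a deliberately naive detector. It is a Python sketch, not Mule's coding-category machinery, and the name `detect_charset` is an assumption. It shows why the order of the list matters: every ISO-8859 charset accepts any octet, so the first 8-bit entry in the list always wins.

```python
def detect_charset(raw, priorities):
    """Return the first charset in `priorities` that decodes `raw` cleanly.

    Returns None when no candidate accepts the octets.
    """
    for charset in priorities:
        try:
            raw.decode(charset)
        except UnicodeDecodeError:
            continue  # this candidate rejects the octets, try the next one
        return charset
    return None


octets = "šećer".encode("iso-8859-2")

latin1_first = detect_charset(octets, ["ascii", "iso-8859-1", "iso-8859-2"])
latin2_first = detect_charset(octets, ["ascii", "iso-8859-2", "iso-8859-1"])
```

Here `latin1_first` comes out as `iso-8859-1` and `latin2_first` as `iso-8859-2` for identical input, which is why a language environment that leaves the priority list untouched cannot give a latin2 user the result they expect.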
[parent not found: <m3hft163aa.fsf@p <byu2wv6xkb.fsf@bolzano.math.ethz.ch>]
* Re: More charset things [not found] ` <m3hft163aa.fsf@p <byu2wv6xkb.fsf@bolzano.math.ethz.ch> @ 1999-02-09 22:13 ` Hrvoje Niksic 0 siblings, 0 replies; 43+ messages in thread From: Hrvoje Niksic @ 1999-02-09 22:13 UTC (permalink / raw) Jan Vroonhof <vroonhof@math.ethz.ch> writes: > (set-language-environment "Croatian") > (set-coding-category-system 'iso-8-designate 'iso-8859-2) One more thing that baffles me about Mule is that all these things are totally undocumented. It is near impossible to just *use* Mule if you come from a latin2 background. The above may seem strange coming from a developer, but the fact is, when I was starting with XEmacs, reading the documentation was a pleasure. Mule is a black hole in the Emacs tradition. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-04 17:49 ` Lars Magne Ingebrigtsen 1999-02-05 0:47 ` Stephen J. Turnbull @ 1999-02-07 20:43 ` François Pinard 1999-02-08 2:09 ` Martin Buchholz ` (3 more replies) 1 sibling, 4 replies; 43+ messages in thread From: François Pinard @ 1999-02-07 20:43 UTC (permalink / raw) Cc: xemacs-mule Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > > > That's not my table. :-) When MULE supports utf-8, Gnus will > > > support utf-8. > > That is not a nice way of thinking. > I don't see any other way of thinking. Grokking utf-8 is way outside > the scope of Gnus -- it has to be an Emacs thing. In a way, UTF-8 or Base64 are coding schemes. I see no strong reason for Gnus to be favourable to one without being to the other, except maybe that Base64 is usable in CTE, while UTF-8 is probably not going to be. UTF-8 is really simple, by comparison with other things in the field of charsets, and much simpler than what Gnus already does about the whole thing. Lars, I can send you documentation and C code, if you feel like it. > Could that someone (or someone else) re-recommend the book(s) that I > should buy to get both an introduction and more in-depth knowledge about > charset issues? *The* reference, which I have never seen (my librarian says the editor is out of stock), is supposed to be the Ken Lunde book, in the ORA series. From: Brendan_Murray/DUB/Lotus@lotus.com Subject: Re: unicode <-> hex converter (fwd) To: pinard@IRO.UMontreal.CA Date: 1997-04-11 09:38:40 +01:00 For information on Asian character sets, try picking up a copy of Ken Lunde's text for his next book. It should be on ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf. His first book, "Understanding Japanese Information Processing" is so good that it has been translated to Japanese, and is used over there by many developers (one of the guys in our Tokyo office thought Ken Lunde was Japanese - that's how good it is!) 
- if you're doing anything with the Japanese encoding systems, I heartily recommend this. By the way, you'll find code snippets sprinkled around that part of the FTP site, with different encoding transformations. Brendan -- François Pinard mailto:pinard@iro.umontreal.ca Join the free Translation Project! http://www.iro.umontreal.ca/~pinard ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-07 20:43 ` François Pinard @ 1999-02-08 2:09 ` Martin Buchholz 1999-02-22 15:52 ` François Pinard [not found] ` <m37lttydo2.fsf@peorth.gweep.net> ` (2 subsequent siblings) 3 siblings, 1 reply; 43+ messages in thread From: Martin Buchholz @ 1999-02-08 2:09 UTC (permalink / raw) Cc: ding, xemacs-mule >>>>> "F" == ISO-8859-1 <ISO-8859-1> writes: F> Lars Magne Ingebrigtsen <larsi@gnus.org> writes: F> *The* reference, which I never seen (my librarian says the editor is out F> of stock), is supposed to be the Ken Lunde book, in the ORA series. F> From: Brendan_Murray/DUB/Lotus@lotus.com F> Subject: Re: unicode <-> hex converter (fwd) F> To: pinard@IRO.UMontreal.CA F> Date: 1997-04-11 09:38:40 +01:00 F> For information on Asian character sets, try picking up a copy of Ken F> Lunde's text for his next book. It should be on F> ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf. His first book, Where've you been? The second edition is finally out. CJKV! F> "Understanding Japanese Information Processing" is so good that it has been F> translated to Japanese, and is used over there by many developers (one of F> the guys in our Tokyo office thought Ken Lunde was Japanese - that's how F> good it is!) - if you're doing anything with the Japanese encoding systems, F> I heartily recommend this. F> By the way, you'll find code snippets sprinkled around that part of the FTP F> site, with different encoding transformations. Martin ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-08 2:09 ` Martin Buchholz @ 1999-02-22 15:52 ` François Pinard 0 siblings, 0 replies; 43+ messages in thread From: François Pinard @ 1999-02-22 15:52 UTC (permalink / raw) Cc: ding, xemacs-mule [-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 1113 bytes --] Martin Buchholz <martin@xemacs.org> writes: > F> *The* reference, which I never seen (my librarian says the editor is out > F> of stock), is supposed to be the Ken Lunde book, in the ORA series. > Where've you been? The second edition is finally out. CJKV! Yes, yeah! I finally got a copy, after having waited for more than a year. I surely have no time to read it right away, yet at first glance, it looks like a wonderful book, and I'll surely find many, many answers in there. :-) Oh, there is a mere mention about Mule, but no documentation. Do not buy this book if you are only looking for Mule specificities. However, Mule is an integrator for many charsets described in the book. So, the book might be useful for anybody interested in Asian charsets details, whether pro-Mule or con-Mule. P.S. - Of course, if this message was going to any FSF list, which I think it does not, I would have refrained from commenting on a non-free book. :-) -- François Pinard mailto:pinard@iro.umontreal.ca Join the free Translation Project! http://www.iro.umontreal.ca/~pinard ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <m37lttydo2.fsf@peorth.gweep.net>]
* Re: More charset things [not found] ` <m37lttydo2.fsf@peorth.gweep.net> @ 1999-02-08 9:55 ` Kai.Grossjohann 1999-02-08 15:52 ` François Pinard ` (2 subsequent siblings) 3 siblings, 0 replies; 43+ messages in thread From: Kai.Grossjohann @ 1999-02-08 9:55 UTC (permalink / raw) Stainless Steel Rat <ratinox@peorth.gweep.net> writes: > And wow! I just noticed how badly Supercite failed to deal with > your mailbox. Probably because you have 8-bit data in a field > that specifically calls for ASCII and only ASCII. This one isn't > a MULE bug, because there is no MULE in my XEmacs. None of the characters in this header are non-ASCII: ,----- | =?ISO-8859-1?Q?Fran=E7ois_Pinard?= <pinard@iro.umontreal.ca> `----- Maybe SC should be updated to grok this encoding? kai -- I like _both_ kinds of music. ^ permalink raw reply [flat|nested] 43+ messages in thread
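The header value Kai quotes is an RFC 2047 encoded-word: every octet on the wire is ASCII, and the ISO-8859-1 octet for "ç" travels as the three characters `=E7`. A minimal sketch of decoding it follows, using Python's standard `email.header` module. This is only an illustration of the encoding, not what Supercite or Gnus do internally; the variable names are made up for the example.

```python
from email.header import decode_header

# The display name from the From header quoted above.
encoded_name = "=?ISO-8859-1?Q?Fran=E7ois_Pinard?="

# Every character is ASCII, as the header field requires.
is_pure_ascii = all(ord(ch) < 128 for ch in encoded_name)

# decode_header() returns a list of (bytes, charset) pairs;
# a single encoded-word yields exactly one pair.
[(raw_bytes, charset)] = decode_header(encoded_name)

# Q-encoding maps "_" to a space and "=E7" to the octet 0xE7;
# the charset label then says how to read those octets.
name = raw_bytes.decode(charset)

print(is_pure_ascii, charset, name)
```

A quoting package that wants the author's real name has to perform this decoding step before it applies its name-matching regexps.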
* Re: More charset things [not found] ` <m37lttydo2.fsf@peorth.gweep.net> 1999-02-08 9:55 ` Kai.Grossjohann @ 1999-02-08 15:52 ` François Pinard [not found] ` <m3n22ou09w.fsf@peorth.gweep.net> ` (2 more replies) 1999-02-08 17:29 ` Karl Eichwalder 1999-02-08 22:03 ` James H. Cloos Jr. 3 siblings, 3 replies; 43+ messages in thread From: François Pinard @ 1999-02-08 15:52 UTC (permalink / raw) Stainless Steel Rat <ratinox@peorth.gweep.net> writes: > base64 is an encoding scheme (comparable to uuencode). UTF-8 is a > character set (comparable to ISO-8859-1). They have nothing in common, > at least not the way you are thinking of it. UTF-8 is an encoding scheme, comparable to uuencode. But it is currently used to encode one and only one character set, the UCS (described in Unicode manuals and within ISO 10646). But theoretically, it could well be used to encode other things. Because the UTF-8 encoding scheme is used for only one charset, it is common to consider that it is a charset itself, but this is a conceptual abuse. I have nothing against relying on this abuse, which is quite handy, as long as we do not lose sight of the real thing. UTF-8 is not a charset, in the deep nature of things. :-) That is why Lars could well decide, one of these days, to support UTF-8 as an encoding (which it really is) on the same level as Base64, and moreover, rather fun to implement. It might be convenient that Gnus do so as a contribution to the Unicode effort, without really waiting for Emacs to do it. The sad aspect of things is that, for orthogonality reasons, Gnus should then support UTF-7 as well, and this one, being noticeably uglier internally, is not as much fun. -- François Pinard mailto:pinard@iro.umontreal.ca Join the free Translation Project! http://www.iro.umontreal.ca/~pinard ^ permalink raw reply [flat|nested] 43+ messages in thread
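To make the "encoding scheme" reading concrete, here is a rough sketch of a UTF-8 encoder written by hand: it takes a character number (a UCS code point) and produces octets, and it never needs to know what the character is. The function name `utf8_encode` is invented for this illustration, and the sketch assumes the range U+0000..U+10FFFF with one to four octets, which is narrower than the longer sequences that RFC 2279 permits.

```python
def utf8_encode(code_point):
    """Serialize one UCS code point as 1 to 4 UTF-8 octets."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not encoded")
    if code_point < 0x80:
        # 0xxxxxxx: ASCII passes through unchanged.
        return bytes([code_point])
    if code_point < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# U+00E7 is the cedilla in the poster's name.
print(utf8_encode(0xE7).hex())
```

Nothing in the bit shuffling refers to Unicode as such, which is the sense in which the scheme could in principle carry other numbered repertoires.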
[parent not found: <m3n22ou09w.fsf@peorth.gweep.net>]
* Re: More charset things [not found] ` <m3n22ou09w.fsf@peorth.gweep.net> @ 1999-02-08 23:19 ` François Pinard 0 siblings, 0 replies; 43+ messages in thread From: François Pinard @ 1999-02-08 23:19 UTC (permalink / raw) Stainless Steel Rat <ratinox@peorth.gweep.net> writes: > > UTF-8 is an encoding scheme, comparable to uuencode. > It is? Then I'm confused... for some reason I was thinking that UTF-8 > *was* Unicode. Nowadays, the UCS may be represented as UCS-2 or UCS-4 internally, yet UCS-2 is often seen externally. The latest Unicode, if I understand things correctly, highly promotes what was once called UTF-16, which is a way of using one or two UCS-2 super-bytes for representing one million characters. There is also UTF-8, which is popular (and nice), and UTF-7, which is getting popular (and ugly). Niceness and ugliness are well hidden in decoders/encoders, so it does not really matter in practice. UTF-7 is a MIME-related invention; it does not come from Unicode or ISO. There are also other encodings, but they are obsolescent enough not to be worth mentioning. -- François Pinard mailto:pinard@iro.umontreal.ca Join the free Translation Project! http://www.iro.umontreal.ca/~pinard ^ permalink raw reply [flat|nested] 43+ messages in thread
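The "one or two UCS-2 super-bytes for one million characters" remark can be checked with a little arithmetic. The sketch below assumes the usual UTF-16 surrogate ranges (high units D800..DBFF, low units DC00..DFFF); the name `utf16_units` is made up for the example, and the function does not validate its input.

```python
def utf16_units(code_point):
    """Return the 16-bit units that represent one code point in UTF-16."""
    if code_point < 0x10000:
        # Characters below U+10000 fit in a single 16-bit unit.
        return [code_point]
    offset = code_point - 0x10000           # a 20-bit number
    high = 0xD800 | (offset >> 10)          # top 10 bits
    low = 0xDC00 | (offset & 0x3FF)         # bottom 10 bits
    return [high, low]


# 10 bits + 10 bits address 2**20 extra characters: "one million".
SUPPLEMENTARY_COUNT = 0x10FFFF - 0x10000 + 1

print([hex(unit) for unit in utf16_units(0x1D11E)], SUPPLEMENTARY_COUNT)
```

Two units of ten payload bits each give 2^20 = 1,048,576 characters beyond the first 65,536.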
* Re: More charset things 1999-02-08 15:52 ` François Pinard [not found] ` <m3n22ou09w.fsf@peorth.gweep.net> @ 1999-02-09 8:05 ` Steinar Bang 1999-02-14 18:10 ` UTF-8 (Was: More charset things) Steinar Bang 1999-02-09 16:03 ` More charset things Lars Magne Ingebrigtsen 2 siblings, 1 reply; 43+ messages in thread From: Steinar Bang @ 1999-02-09 8:05 UTC (permalink / raw) >>>>> François Pinard <pinard@iro.umontreal.ca>: > That is why Lars could well decide, one of these days, to support > UTF-8 as an encoding (which it really is) on the same level as > Base64, and moreover, rather fun to implement. It might be > convenient that Gnus do so as a contribution to the Unicode effort, > without really waiting for Emacs to do it. But isn't UTF-8 support something that really should be done at the C level (like base64 is done in newer emacsen)? Or am I thinking of UTF-7 here...? (does anyone have some handy online references?) ^ permalink raw reply [flat|nested] 43+ messages in thread
* UTF-8 (Was: More charset things) 1999-02-09 8:05 ` Steinar Bang @ 1999-02-14 18:10 ` Steinar Bang 0 siblings, 0 replies; 43+ messages in thread From: Steinar Bang @ 1999-02-14 18:10 UTC (permalink / raw) >>>>> Steinar Bang <sb@metis.no>: >>>>> François Pinard <pinard@iro.umontreal.ca>: >> That is why Lars could well decide, one of these days, to support >> UTF-8 as an encoding (which it really is) on the same level as >> Base64, and moreover, rather fun to implement. It might be >> convenient that Gnus do so as a contribution to the Unicode effort, >> without really waiting for Emacs to do it. One reason to support UTF-8 decoding and encoding is that son-of-son-of-1036 (or watchamacallit) http://www.ietf.org/internet-drafts/draft-ietf-usefor-article-01.txt seems to recommend UTF-8 for both the headers and bodies of news messages. Hm... the way this would work is probably to have a UTF-8 decoder that always attempts to decode a news message and then falls back to a locale- or newsgroup-specific setting if the UTF-8 decoding breaks down (for instance, the iso-8859-1 charset in the case of the no.* hierarchy). UTF-8 encoding should probably not be made the default for a while yet. At the least, it should be made dependent on the newsgroup hierarchy. > But isn't UTF-8 support something that really should be done at the C > level (like base64 is done in newer emacsen)? Or am I thinking of > UTF-7 here...? (does anyone have some handy online references?) UTF-8 is defined in RFC2279 ftp://ftp.ntnu.no/pub/rfc/rfc2279.txt UTF-7 is defined in RFC2152 ftp://ftp.ntnu.no/pub/rfc/rfc2152.txt Both would probably be best off with decoding done in C. ^ permalink raw reply [flat|nested] 43+ messages in thread
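The try-UTF-8-then-fall-back idea from the message above might look roughly like the sketch below. The table `FALLBACK_BY_HIERARCHY` and both function names are hypothetical; only the `no` entry reflects anything said in the thread, and nothing here claims to be how Gnus behaves.

```python
# Hypothetical per-hierarchy defaults; only "no" comes from the message.
FALLBACK_BY_HIERARCHY = {"no": "iso-8859-1"}


def fallback_charset(newsgroup, default="iso-8859-1"):
    """Pick a fallback charset from the top-level hierarchy name."""
    hierarchy = newsgroup.split(".", 1)[0]
    return FALLBACK_BY_HIERARCHY.get(hierarchy, default)


def decode_article(raw, newsgroup):
    """Try strict UTF-8 first; fall back if the octets are not valid UTF-8."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        charset = fallback_charset(newsgroup)
        return raw.decode(charset), charset


text = "blåbærsyltetøy"
print(decode_article(text.encode("utf-8"), "no.test"))
print(decode_article(text.encode("iso-8859-1"), "no.test"))
```

The heuristic is not infallible: a few ISO-8859-1 octet sequences are also well-formed UTF-8 (the two octets C3 A9, for example), so a short article can be misread. That is one more reason to treat explicit charset labels as authoritative when they are present.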
* Re: More charset things 1999-02-08 15:52 ` François Pinard [not found] ` <m3n22ou09w.fsf@peorth.gweep.net> 1999-02-09 8:05 ` Steinar Bang @ 1999-02-09 16:03 ` Lars Magne Ingebrigtsen 2 siblings, 0 replies; 43+ messages in thread From: Lars Magne Ingebrigtsen @ 1999-02-09 16:03 UTC (permalink / raw) François Pinard <pinard@iro.umontreal.ca> writes: > UTF-8 is an encoding scheme, comparable to uuencode. > > But it is currently used to encode one and only character set, the UCS > (described in Unicode manuals and within ISO 10646). But theoretically, > it could well be used to encode other things. Off the top of my head -- we have two things, "encoded character set" (which one usually just calls "character set" unless there's a possibility for confusion), and we have "character encoding scheme". ECS and CES. Unicode is an ECS and utf-8 is a CES that is only used for the ECS Unicode. However -- in a MIME context, we don't care about this. What we deal with is "charsets", which are neither an ECS nor a CES, but a combination of the two. Therefore, "charset=utf-8" is correct. In a MIME context, utf-8 is not an encoding, it is purely, and always, a charset, and nothing else. :-) -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 43+ messages in thread
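The layers being argued over can be laid side by side for a single character. The sketch below is an illustration only, using Python's built-in codecs: the character number belongs to the character set, the octets to the encoding scheme, and Base64 to the separate content-transfer-encoding layer. A MIME `charset` label such as `utf-8` or `iso-8859-1` names the first two steps together.

```python
import base64

char = "ç"

# Layer 1: the character set assigns a number to the character.
code_point = ord(char)

# Layer 2: an encoding scheme serializes that number as octets.
as_utf8 = char.encode("utf-8")
as_utf16 = char.encode("utf-16-be")

# ISO-8859-1 bundles layers 1 and 2: the octet is the number itself.
as_latin1 = char.encode("iso-8859-1")

# Layer 3: a transfer encoding makes arbitrary octets 7bit-safe.
# It applies to any octet stream, whatever the charset was.
transfer = base64.b64encode(as_utf8).decode("ascii")

print(hex(code_point), as_utf8, as_utf16, as_latin1, transfer)
```

On this view both posters have a point: UTF-8 is an encoding of Unicode, yet in a MIME header it is labelled a charset, and Base64 sits one layer further out.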
* Re: More charset things [not found] ` <m37lttydo2.fsf@peorth.gweep.net> 1999-02-08 9:55 ` Kai.Grossjohann 1999-02-08 15:52 ` François Pinard @ 1999-02-08 17:29 ` Karl Eichwalder 1999-02-08 22:03 ` James H. Cloos Jr. 3 siblings, 0 replies; 43+ messages in thread From: Karl Eichwalder @ 1999-02-08 17:29 UTC (permalink / raw) Stainless Steel Rat <ratinox@peorth.gweep.net> writes: | "oP" == ois Pinard <Fran> writes: | Probably because you have 8-bit data in a field that specifically | calls for ASCII and only ASCII. Try to view the raw From line -- it looks good to me (and message knows to handle it). -- Karl Eichwalder ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things [not found] ` <m37lttydo2.fsf@peorth.gweep.net> ` (2 preceding siblings ...) 1999-02-08 17:29 ` Karl Eichwalder @ 1999-02-08 22:03 ` James H. Cloos Jr. 1999-02-09 5:29 ` Russ Allbery 3 siblings, 1 reply; 43+ messages in thread From: James H. Cloos Jr. @ 1999-02-08 22:03 UTC (permalink / raw) -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >>>>> "SSR" == Stainless Steel Rat <ratinox@peorth.gweep.net> writes: SSR> And wow! I just noticed how badly Supercite failed to deal with SSR> your mailbox. Probably because you have 8-bit data in a field SSR> that specifically calls for ASCII and only ASCII. This one isn't SSR> a MULE bug, because there is no MULE in my XEmacs. Odd. Works for me in GNU Emacs 20.3.1, with supercite.el revision: 3.54. (Which says it was last modified 1993/09/22 18:58:46, FWIW.) - -JimC - -- James H. Cloos, Jr. <http://www.jhcloos.com/cloos/public_key> 1024D/ED7DAEA6 <cloos@jhcloos.com> E9E9 F828 61A4 6EA9 0F2B 63E7 997A 9F17 ED7D AEA6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v0.9.2 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE2v19CmXqfF+19rqYRAgyUAJ9vaBAXMvUmSQojOY2Mag8dZ+e+FwCfYUB2 3rVAdGIxCIE3FGOkaakoJt4= =vHSW -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-08 22:03 ` James H. Cloos Jr. @ 1999-02-09 5:29 ` Russ Allbery 1999-02-09 7:33 ` James H. Cloos Jr. 0 siblings, 1 reply; 43+ messages in thread From: Russ Allbery @ 1999-02-09 5:29 UTC (permalink / raw)

James H Cloos <cloos@jhcloos.com> writes:
>>>>>> "SSR" == Stainless Steel Rat <ratinox@peorth.gweep.net> writes:
> SSR> And wow! I just noticed how badly Supercite failed to deal with
> SSR> your mailbox. Probably because you have 8-bit data in a field
> SSR> that specifically calls for ASCII and only ASCII. This one isn't
> SSR> a MULE bug, because there is no MULE in my XEmacs.
> Odd. Works for me in GNU Emacs 20.3.1, with supercite.el revision:
> 3.54. (Which says it was last modified 1993/09/22 18:58:46, FWIW.)

supercite.el has a lot of major annoyances in what it's willing to recognize as valid characters for names and for e-mail addresses. I use the following, which fixes it a little, at least for me:

;; Override sc-get-address with something that's less picky about what it's
;; willing to consider an address (supercite's default truncates the address
;; at the first odd-looking character).
(defun sc-get-address (from author)
  "Get the full email address path from FROM.
AUTHOR is the author's name (which is removed from the address)."
  (let ((eos (length from)))
    (if (string-match (concat "\\(^\\|^\"\\)"
                              (regexp-quote author)
                              "\\(\\s +\\|\"\\s +\\)") from 0)
        (let ((address (substring from (match-end 0) eos)))
          (if (and (= (aref address 0) ?<)
                   (= (aref address (1- (length address))) ?>))
              (substring address 1 (1- (length address)))
            address))
      (if (string-match "[ ]*<?\\([^ (>]+@[^ (>]+\\)" from 0)
          (sc-submatch 1 from)
        ""))))

;; Override sc-attribs-extract-namestring so that it will correctly cope
;; with From headers that contain no address (which is becoming more common
;; with munging, even if it's technically illegal).
(defun sc-attribs-extract-namestring (from)
  "Extract the name string from FROM.
This should be the author's full name minus an optional title."
  (let ((namestring
         (or
          ;; If there is a <...> in the name,
          ;; treat everything before that as the full name.
          ;; Even if it contains parens, use the whole thing.
          ;; On the other hand, we do look for quotes in the usual way.
          (and (string-match " *<.*>" from 0)
               (let ((before-angles
                      (sc-name-substring from 0 (match-beginning 0) 0)))
                 (if (string-match "\".*\"" before-angles 0)
                     (sc-name-substring
                      before-angles (match-beginning 0) (match-end 0) 1)
                   before-angles)))
          (sc-name-substring
           from (string-match "(.*)" from 0) (match-end 0) 1)
          (sc-name-substring
           from (string-match "\".*\"" from 0) (match-end 0) 1)
          (sc-name-substring
           from (string-match "\\([-.a-zA-Z0-9_]+\\s *\\)+" from 0)
           (match-end 0) 0)
          (sc-attribs-emailname from))))
    ;; strip off any leading or trailing whitespace
    (if namestring
        (let ((bos 0)
              (eos (1- (length namestring))))
          (while (and (<= bos eos)
                      (memq (aref namestring bos) '(32 ?\t)))
            (setq bos (1+ bos)))
          (while (and (> eos bos)
                      (memq (aref namestring eos) '(32 ?\t)))
            (setq eos (1- eos)))
          (substring namestring bos (1+ eos))))))

-- Russ Allbery (rra@stanford.edu) <URL:http://www.eyrie.org/~eagle/> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-09 5:29 ` Russ Allbery @ 1999-02-09 7:33 ` James H. Cloos Jr. 1999-02-10 2:13 ` Stephen Zander 0 siblings, 1 reply; 43+ messages in thread From: James H. Cloos Jr. @ 1999-02-09 7:33 UTC (permalink / raw) -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >>>>>"JHC" == James H Cloos <cloos@jhcloos.com> writes: >>>>> "SSR" == Stainless Steel Rat <ratinox@peorth.gweep.net> writes: JHC> Odd. Works for me in GNU Emacs 20.3.1, with supercite.el JHC> revision: 3.54. (Which says it was last modified 1993/09/22 JHC> 18:58:46, FWIW.) SSR> Hmmm... quite strange. I'm using 3.55, the version bundled SSR> with XEmacs 20.4. You know what it is? Since I'm running GNU Emacs 20.3.1, I'm running MULE. As such, the non-ASCII characters match supercite's regexes. Since you are not running MULE, you'll need different regexes for matching handles and addresses, such as the ones Russ posted. Or at least that seems like the (most) logical explanation.... - -JimC - -- James H. Cloos, Jr. <http://www.jhcloos.com/cloos/public_key> 1024D/ED7DAEA6 <cloos@jhcloos.com> E9E9 F828 61A4 6EA9 0F2B 63E7 997A 9F17 ED7D AEA6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v0.9.2 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE2v+SemXqfF+19rqYRAqN2AJ9kZmgcQdFF2NZ67tQ916F3NtOjTQCfezvH 20iAxvQsBiKUQjjsecaZo+k= =TTK5 -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-09 7:33 ` James H. Cloos Jr. @ 1999-02-10 2:13 ` Stephen Zander 0 siblings, 0 replies; 43+ messages in thread From: Stephen Zander @ 1999-02-10 2:13 UTC (permalink / raw) Cc: (ding) >>>>> "James" == James H Cloos <cloos@jhcloos.com> writes: James> Since I'm running GNU Emacs 20.3.1, I'm running MULE. As James> such, the non-ASCII characters match supercite's regexes. James> Since you are not running MULE, you'll need different James> regexes for matching handles and addresses, such as the James> ones Russ posted. Ixnay, that can't be all the story. I am running Xemacs/MULE & supercite has exactly the same failure mode for me as that experienced by Ratinox. -- Stephen --- It should be illegal to yell "Y2K" in a crowded economy. :-) -- Larry Wall ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-07 20:43 ` François Pinard 1999-02-08 2:09 ` Martin Buchholz [not found] ` <m37lttydo2.fsf@peorth.gweep.net> @ 1999-02-08 14:49 ` Robert Bihlmeyer 1999-02-11 10:09 ` Jan Vroonhof 3 siblings, 0 replies; 43+ messages in thread From: Robert Bihlmeyer @ 1999-02-08 14:49 UTC (permalink / raw) Hi, >>>>> On 07 Feb 1999 15:43:18 -0500 >>>>> François Pinard <pinard@iro.umontreal.ca> said: ^^^^^^^^ works here FP> In a way, UTF-8 or Base64 are coding schemes. I see no strong FP> reason for Gnus to be favourable to one without being to the FP> other, except maybe that Base64 is usable in CTE, while UTF-8 is FP> probably not going to be. UTF-7 is used in CTE today. Robbe -- Robert Bihlmeyer reads: Deutsch, English, MIME, Latin-1, NO SPAM! <robbe@orcus.priv.at> <http://stud2.tuwien.ac.at/~e9426626/sig.html> ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-07 20:43 ` François Pinard ` (2 preceding siblings ...) 1999-02-08 14:49 ` Robert Bihlmeyer @ 1999-02-11 10:09 ` Jan Vroonhof 3 siblings, 0 replies; 43+ messages in thread From: Jan Vroonhof @ 1999-02-11 10:09 UTC (permalink / raw) Stephen Zander <gibreel@pobox.com> writes: > Ixnay, that can't be all the story. I am running Xemacs/MULE & > supercite has exactly the same failure mode for me as that experienced > by Ratinox. Same here. Maybe only FSF Mule has this hack/trick/whatever that a-z matches more than just a-z.[1] Jan Footnotes: [1] Of course there is something to be said for this. Non-ASCII is no longer a nicely ordered set anyway, so you might as well order all the lower-case international characters somewhere in the a-z range. However, you then get all kinds of strange questions: Does a-o match ö, for instance? Jan ^ permalink raw reply [flat|nested] 43+ messages in thread
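The footnote's question has a definite answer wherever ranges are defined purely by character number. The sketch below uses Python's `re` module on Unicode strings as a stand-in. It says nothing about how either Emacs flavour orders characters, and the helper `matches_base_letter` is invented here to show one possible alternative policy.

```python
import re
import unicodedata

a_to_o = re.compile(r"[a-o]")

# Ranges compare character numbers: "o" is 0x6F, "ö" is 0xF6.
plain_o = a_to_o.fullmatch("o") is not None
o_umlaut = a_to_o.fullmatch("ö") is not None


def matches_base_letter(pattern, ch):
    """Match on the base letter left after canonical decomposition."""
    base = unicodedata.normalize("NFD", ch)[0]
    return pattern.fullmatch(base) is not None


# "ö" decomposes to "o" plus a combining diaeresis, so it now matches;
# "ø" has no decomposition and still falls outside the range.
print(plain_o, o_umlaut,
      matches_base_letter(a_to_o, "ö"),
      matches_base_letter(a_to_o, "ø"))
```

Either policy is defensible, which is why a quoting package that relies on `a-z` to recognize names behaves differently from one environment to the next.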
* Re: More charset things 1999-02-04 17:21 ` Hrvoje Niksic 1999-02-04 17:49 ` Lars Magne Ingebrigtsen @ 1999-02-07 19:37 ` François Pinard 1999-02-08 0:06 ` Kenichi Handa 1 sibling, 1 reply; 43+ messages in thread From: François Pinard @ 1999-02-07 19:37 UTC (permalink / raw) Cc: ding, handa Hrvoje Niksic <hniksic@srce.hr> writes: > That is not a nice way of thinking. MULE is little else than a Japanese > version of Emacs, and it appears that the Japanese are not interested > in Unicode. So it wasn't implemented. I'm not sure about FSF, but for > XEmacs, I know of no plans to implement it in the near future. Handa-san is planning to implement Unicode support in Mule, and I presume UTF-8 will come along with it. -- François Pinard mailto:pinard@iro.umontreal.ca Join the free Translation Project! http://www.iro.umontreal.ca/~pinard ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-07 19:37 ` François Pinard @ 1999-02-08 0:06 ` Kenichi Handa 0 siblings, 0 replies; 43+ messages in thread From: Kenichi Handa @ 1999-02-08 0:06 UTC (permalink / raw) Cc: hniksic, ding =?ISO-8859-1?Q?Fran=E7ois_Pinard?= <pinard@iro.umontreal.ca> writes: > Hrvoje Niksic <hniksic@srce.hr> writes: >> That is not a nice way of thinking. MULE is little else than a Japanese >> version of Emacs, and it appears that the Japanese are not interested >> in Unicode. So it wasn't implemented. I'm not sure about FSF, but for >> XEmacs, I know of no plans to implement it in the near future. > Handa-san is planning to implement Unicode support in Mule, and I presume > UTF-8 will come along with it. I myself have not yet started to work on Unicode support. But, I heard that mleisher@crl.nmsu.edu had started the work. --- Ken'ichi HANDA handa@etl.go.jp ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-04 17:08 ` Lars Magne Ingebrigtsen 1999-02-04 17:21 ` Hrvoje Niksic @ 1999-02-07 19:35 ` François Pinard 1999-02-08 13:37 ` Simon Josefsson 1 sibling, 1 reply; 43+ messages in thread From: François Pinard @ 1999-02-07 19:35 UTC (permalink / raw) Cc: handa Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > (By the way -- is it "MULE" or "Mule"? I'm waffling all over the place > when I write that word. Perhaps I should start writing it "mUlE"?) I documented this somewhere. Let me see... OK: The spelling @code{Mule} originally stands for @cite{@emph{mul}tilingual @emph{e}nhancement to GNU Emacs}; it is the result of a collective effort orchestrated by Handa Ken'ichi since 1993. When @code{Mule} got rewritten in the main development stream of GNU Emacs 20, the FSF renamed it @code{MULE}, meaning @cite{@emph{mul}tilingual @emph{e}nvironment in GNU Emacs}. I guess that the FSF wanted to establish more clearly who is the boss, by renaming the thing and changing the capitalization. In reaction, maybe, I try to consistently write "Mule", as a tribute to the original effort. -- François Pinard mailto:pinard@iro.umontreal.ca Join the free Translation Project! http://www.iro.umontreal.ca/~pinard ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-07 19:35 ` François Pinard @ 1999-02-08 13:37 ` Simon Josefsson 1999-02-08 23:43 ` Kenichi Handa 0 siblings, 1 reply; 43+ messages in thread From: Simon Josefsson @ 1999-02-08 13:37 UTC (permalink / raw) Cc: ding, handa François Pinard <pinard@iro.umontreal.ca> writes: > > (By the way -- is it "MULE" or "Mule"? I'm waffling all over the place > > when I write that word. Perhaps I should start writing it "mUlE"?) > > I documented this somewhere. Let me see... OK: > > The spelling @code{Mule} originally stands for @cite{@emph{mul}tilingual > @emph{e}nhancement to GNU Emacs}, it is the result of a collective > effort orchestrated by Handa Ken'ichi since 1993. When @code{Mule} got > rewritten in the main development stream of GNU Emacs 20, the FSF renamed > it @code{MULE}, meaning @cite{@emph{mul}tilingual @emph{e}nvironment > in GNU Emacs}. Emacs seems a little bit confused about this itself: the menu bar option is called "Mule" and in it there is "Show all of MULE status". :-) ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: More charset things 1999-02-08 13:37 ` Simon Josefsson @ 1999-02-08 23:43 ` Kenichi Handa 0 siblings, 0 replies; 43+ messages in thread From: Kenichi Handa @ 1999-02-08 23:43 UTC (permalink / raw) Cc: pinard, ding Simon Josefsson <jas@pdc.kth.se> writes: > Emacs seem a little bit confused about this itself, the menu bar > option is called "Mule" and in it there is "Show all of MULE status". The other titles in the menu bar are all capitalized. So, I thought we had better capitalize "MULE" too. --- Ken'ichi HANDA handa@etl.go.jp ^ permalink raw reply [flat|nested] 43+ messages in thread