* Lots of sets.
@ 1998-12-02 23:35 Lars Magne Ingebrigtsen
1998-12-02 23:48 ` Hrvoje Niksic
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 1998-12-02 23:35 UTC
[-- Attachment #1: Type: text/plain, Size: 119 bytes --]
I wonder whether I can mail the HELLO file. I guess we'll find out
now. :-)
Amharic (
[-- Attachment #2: Type: text/plain, Size: 151 bytes --]
Czech (česky) Dobrý den
Danish (Dansk) Hej, Goddag
English Hello
Esperanto Saluton
Estonian Tere, Tervist
FORTRAN PROGRAM
Finnish (Suomi) Hei
[-- Attachment #3: Type: text/plain, Size: 96 bytes --]
French (Français) Bonjour, Salut
German (Deutsch Nord) Guten Tag
German (Deutsch Süd) Grüß Gott
[-- Attachment #4: Type: text/plain, Size: 26 bytes --]
Greek (Ελληνικά) Γειά σας
[-- Attachment #5: Type: text/plain, Size: 42 bytes --]
Hebrew שלום
Italiano Ciao, Buon giorno
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: Type: text/plain; charset=lao, Size: 24 bytes --]
Lao(¾ÒÊÒÅÒÇ)
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #7: Type: text/plain; charset=lao, Size: 13 bytes --]
ÊкÒ^[0´Õ^[1,
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #8: Type: text/plain; charset=lao, Size: 7 bytes --]
^[0¢í^[1ã
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #9: Type: text/plain; charset=lao, Size: 6 bytes --]
^[0Ëé^[1
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #10: Type: text/plain; charset=lao, Size: 3 bytes --]
⪡
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #11: Type: text/plain; charset=lao, Size: 83 bytes --]
^[0´Õ^[1
Maltese Ciao
Nederlands, Vlaams Hallo, Dag
Norwegian (Norsk) Hei, God dag
[-- Attachment #12: Type: text/plain, Size: 26 bytes --]
Polish Dzień dobry, Hej
[-- Attachment #13: Type: text/plain, Size: 32 bytes --]
Russian (Русский) Здравствуйте!
[-- Attachment #14: Type: text/plain, Size: 19 bytes --]
Slovak Dobrý deň
[-- Attachment #15: Type: text/plain, Size: 55 bytes --]
Spanish (Español) ¡Hola!
Swedish (Svenska) Hej, Goddag
[-- Attachment #16: Type: text/plain, Size: 16 bytes --]
Thai (ภาษาไทย)
[-- Attachment #17: Type: text/plain, Size: 3 bytes --]
สวั
[-- Attachment #18: Type: text/plain, Size: 1 bytes --]
ส
[-- Attachment #19: Type: text/plain, Size: 2 bytes --]
ดี
[-- Attachment #20: Type: text/plain, Size: 1 bytes --]
ค
[-- Attachment #21: Type: text/plain, Size: 2 bytes --]
รั
[-- Attachment #22: Type: text/plain, Size: 3 bytes --]
บ,
[-- Attachment #23: Type: text/plain, Size: 3 bytes --]
สวั
[-- Attachment #24: Type: text/plain, Size: 1 bytes --]
ส
[-- Attachment #25: Type: text/plain, Size: 4 bytes --]
ดีค่
[-- Attachment #26: Type: text/plain, Size: 3 bytes --]
ะ
[-- Attachment #27: Type: text/plain, Size: 43 bytes --]
Tigrigna (
[-- Attachment #28: Type: text/plain, Size: 25 bytes --]
Turkish (Türkçe) Merhaba
[-- Attachment #29: Type: text/plain, Size: 34 bytes --]
Vietnamese (Tiếng Việt) Chào bạn
[-- Attachment #30: Type: text/plain, Size: 43 bytes --]
Japanese (日本語) こんにちは,
[-- Attachment #31: Type: text/plain, Size: 6 bytes --]
コンニチハ
[-- Attachment #32: Type: text/plain, Size: 32 bytes --]
Chinese (中文,普通话,汉语) 你好
[-- Attachment #33: Type: text/plain, Size: 36 bytes --]
Cantonese (粵語,廣東話) 早晨, 你好
[-- Attachment #34: Type: text/plain, Size: 119 bytes --]
Korean (한글) 안녕하세요, 안녕하십니까
Difference among chinese characters in GB, JIS, KSC, BIG5:
[-- Attachment #35: Type: text/plain, Size: 20 bytes --]
GB -- 元气 开发
[-- Attachment #36: Type: text/plain, Size: 32 bytes --]
JIS -- 元気 開発
[-- Attachment #37: Type: text/plain, Size: 32 bytes --]
KSC -- 元氣 開發
[-- Attachment #38: Type: text/plain, Size: 21 bytes --]
BIG5 -- 元氣 開發
[-- Attachment #39: Type: text/plain, Size: 37 bytes --]
Just for a test of JISX0212: 騏
[-- Attachment #40: Type: text/plain, Size: 154 bytes --]
驎 (the second character is of JISX0212)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@ifi.uio.no * Lars Magne Ingebrigtsen
* Re: Lots of sets.
1998-12-02 23:35 Lots of sets Lars Magne Ingebrigtsen
@ 1998-12-02 23:48 ` Hrvoje Niksic
1998-12-03 8:03 ` Yair Friedman
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Hrvoje Niksic @ 1998-12-02 23:48 UTC
Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
[...]
Something is not right here.
> --==-=-=
> Content-Type: text/plain; charset=iso-2022-jp
>
> I wonder whether I can mail the HELLO file. I guess we'll find out
> now. :-)
>
> Amharic (^[$(3"c!<!N"^^[(B) ^[$(3!A!,!>^[(B
This is not good, because you are marking perfectly fine us-ascii text
as iso-2022-jp (?!) instead of only the Amharic (?) file. Some
mailers may choose not to show the ASCII text because of that.
With iso-8859-* things are slightly better, but still...
> --==-=-=
> Content-Type: text/plain; charset=lao
> Content-Transfer-Encoding: 8bit
>
> Lao(žŇĘŇĹŇÇ)
> --==-=-=
> Content-Type: text/plain; charset=lao
> Content-Transfer-Encoding: 8bit
>
> ĘĐşŇ^[0´Ő^[1,
[... several more lao parts ...]
This is obviously wrong. When autosplitting, Gnus should merge
adjacent regions of the same charset into a single part. The same goes
for thai-tis620.
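To illustrate the merging step suggested above, here is a minimal Python
sketch. It is not Gnus code: the function name `merge_adjacent_parts` and
the representation of a part as a (charset, text) pair are assumptions made
only for this example.

```python
from itertools import groupby


def merge_adjacent_parts(parts):
    """Merge neighbouring (charset, text) parts that share a charset.

    `parts` is a list of (charset, text) pairs in message order.
    Only adjacent parts are merged, so the order of the text is preserved.
    """
    merged = []
    # groupby clusters consecutive items with the same key (the charset).
    for charset, group in groupby(parts, key=lambda part: part[0]):
        merged.append((charset, "".join(text for _, text in group)))
    return merged


if __name__ == "__main__":
    # Hypothetical input: placeholder strings stand in for the real text.
    parts = [
        ("lao", "one "),
        ("lao", "two "),
        ("lao", "three\n"),
        ("us-ascii", "Maltese Ciao\n"),
        ("thai-tis620", "four "),
        ("thai-tis620", "five\n"),
    ]
    for charset, text in merge_adjacent_parts(parts):
        print(charset, repr(text))
```

With this approach the six parts in the example collapse to three, one
per run of identical charsets.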
Otherwise, the feature is quite cool, if a little pathological. :-)
--
Hrvoje Niksic <hniksic@srce.hr> | Student at FER Zagreb, Croatia
--------------------------------+--------------------------------
The Lord protects children and fools... But don't push it.
* Re: Lots of sets.
1998-12-02 23:35 Lots of sets Lars Magne Ingebrigtsen
1998-12-02 23:48 ` Hrvoje Niksic
@ 1998-12-03 8:03 ` Yair Friedman
1998-12-03 12:11 ` Lars Magne Ingebrigtsen
1998-12-03 14:49 ` Michael Welsh Duggan
1998-12-03 21:01 ` Jack Vinson
3 siblings, 1 reply; 9+ messages in thread
From: Yair Friedman @ 1998-12-03 8:03 UTC
Still using Gnus v5.6.43 with tm, hence the crude quoting, but...
Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
> [5 <text/plain; iso-8859-8 (8bit)>]
> Hebrew שלום
> Italiano Ciao, Buon giorno
I'm not sure it's wise to display the Italian greetings in the Hebrew
charset; most readers probably don't have it installed, so they
won't see it.
iso-8859-8 is not a "valid" character set. The message seems to be
using iso-8859-8-i (according to RFC 1556), but the whole point of
sending r2l language messages in (X)Emacs is moot.
--
Yair.
* Re: Lots of sets.
1998-12-03 8:03 ` Yair Friedman
@ 1998-12-03 12:11 ` Lars Magne Ingebrigtsen
1998-12-03 13:38 ` Yair Friedman
0 siblings, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 1998-12-03 12:11 UTC
[-- Attachment #1: Type: text/plain, Size: 846 bytes --]
Yair Friedman <yfriedma@JohnBryce.Co.Il> writes:
> > [5 <text/plain; iso-8859-8 (8bit)>]
> > Hebrew שלום
> > Italiano Ciao, Buon giorno
>
> I'm not sure it's wise to display the Italian greetings in the Hebrew
> charset; most readers probably don't have it installed, so they
> won't see it.
But the Italian greeting was in all ASCII, so...
When encoding a message (or a message part), one chooses a
charset that can encode that message (or part). If one were to use
the charset only for the non-ASCII words, one would have to send
gazillion-part messages, which is not what one wants. One wants
as few parts as possible.
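As a rough illustration of "choose a charset that can encode the whole
part", here is a small Python sketch. It is not how Gnus does it: the
function name `pick_charset` and the candidate list are assumptions for
this example, and it relies on Python's built-in codecs rather than MULE.

```python
# Candidate charsets, tried in order of preference (an assumed ordering).
CANDIDATE_CHARSETS = ["us-ascii", "iso-8859-1", "iso-8859-8"]


def pick_charset(text, candidates=CANDIDATE_CHARSETS):
    """Return the first candidate charset that can encode all of `text`.

    Returns None when no single candidate fits, in which case the caller
    would have to split the text into several parts.
    """
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue
        return charset
    return None


if __name__ == "__main__":
    print(pick_charset("Italiano Ciao, Buon giorno"))
    print(pick_charset("German (Deutsch Süd) Grüß Gott"))
    print(pick_charset("Hebrew שלום\nItaliano Ciao, Buon giorno"))
```

The last call shows the point being made here: the ASCII-only Italian line
rides along in the Hebrew part because iso-8859-8 can encode both lines,
so no extra part is needed.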
> iso-8859-8 is not a "valid" character set. The message seems to be
> using iso-8859-8-i (according to RFC 1556), but the whole point of
> sending r2l language messages in (X)Emacs is moot.
[-- Attachment #2: Type: text/plain, Size: 438 bytes --]
Well -- isn't it nice to be able to say שלום? :-)
(Although I don't know whether that is readable. What does one do --
read the English text l2r and then the Hebrew word r2l?)
But since there is no proper r2l support in the Emacsen, the Arab
languages and Hebrew (for instance) thingies are more toys than really
usable, I guess.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: Lots of sets.
1998-12-03 12:11 ` Lars Magne Ingebrigtsen
@ 1998-12-03 13:38 ` Yair Friedman
0 siblings, 0 replies; 9+ messages in thread
From: Yair Friedman @ 1998-12-03 13:38 UTC
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> [2 <text/plain; iso-8859-8 (8bit)>]
> Well -- isn't it nice to be able to say שלום? :-)
>
Well no, it appears reversed, so it's not readable, and because a final
letter appears first it's not even gibberish. :-(
> (Although I don't know whether that is readable. What does one do --
> read the English text l2r and then the Hebrew word r2l?)
Actually yes, we do; we always read letters r2l and numbers l2r.
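The same distinction exists in Unicode's bidirectional character classes.
A short Python sketch using the standard `unicodedata` module (the helper
name `bidi_classes` is made up for this example) shows Hebrew letters
classed as right-to-left while digits are classed as European numbers,
which keep their left-to-right order:

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidirectional class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


if __name__ == "__main__":
    # "R" = right-to-left letter, "L" = left-to-right letter,
    # "EN" = European number (digits stay in left-to-right order).
    for ch, cls in bidi_classes("שלום"):
        print(repr(ch), cls)
    for ch, cls in bidi_classes("a1"):
        print(repr(ch), cls)
```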
>
> But since there is no proper r2l support in the Emacsen, the Arab
> languages and Hebrew (for instance) thingies are more toys than really
> usable, I guess.
Emacs 20.5 will hopefully have support for r2l... I don't know about
XEmacs.
--
Yair.
* Re: Lots of sets.
1998-12-02 23:35 Lots of sets Lars Magne Ingebrigtsen
1998-12-02 23:48 ` Hrvoje Niksic
1998-12-03 8:03 ` Yair Friedman
@ 1998-12-03 14:49 ` Michael Welsh Duggan
1998-12-04 0:57 ` Lars Magne Ingebrigtsen
1998-12-03 21:01 ` Jack Vinson
3 siblings, 1 reply; 9+ messages in thread
From: Michael Welsh Duggan @ 1998-12-03 14:49 UTC
Just for reference, (and I'm not going to bother quoting):
All came through intact on my system as found in my HELLO file, with
the exception of the following:
Amharic: broken. I could see the escape codes rather than the characters.
Lao: Partially broken. Some visible escape sequences.
Tigrigna: Same as Amharic.
--
Michael Duggan
(md5i@cs.cmu.edu)
.
* Re: Lots of sets.
1998-12-02 23:35 Lots of sets Lars Magne Ingebrigtsen
` (2 preceding siblings ...)
1998-12-03 14:49 ` Michael Welsh Duggan
@ 1998-12-03 21:01 ` Jack Vinson
3 siblings, 0 replies; 9+ messages in thread
From: Jack Vinson @ 1998-12-03 21:01 UTC
[-- Attachment #1: Type: text/plain, Size: 965 bytes --]
>>>>> "LMI" == Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
LMI> I wonder whether I can mail the HELLO file. I guess we'll find out
LMI> now. :-)
I don't think I received the entire HELLO file when I looked at the
document in the "default" view in pgnus-0.59. The supercitation of the
message is below. Will this get retranslated, or will it stay in ASCII?
Entries which seem to be missing: Arabic, Hindi, Tibetan. Each of these
follows some unusual language translations. Also, the message was 261 lines
long while the buffer was only 105.
When I do 'K b' to look at the 40 parts (as indicated in the mode line!),
it looks much worse. The "Lao..." line is split into parts 6-11! The Thai
section goes from 16-26.
LMI> Amharic (^[$(3"c!<!N"^^[(B) ^[$(3!A!,!>^[(B
LMI> Czech (hesky) Dobr} den
LMI> Danish (Dansk) Hej, Goddag
LMI> English Hello
LMI> Esperanto Saluton
LMI> Estonian Tere, Tervist
LMI> FORTRAN PROGRAM
LMI> Finnish (Suomi) Hei
[-- Attachment #2: Type: text/plain, Size: 111 bytes --]
LMI> French (Frangais) Bonjour, Salut
LMI> German (Deutsch Nord) Guten Tag
LMI> German (Deutsch S|d) Gr|_ Gott
[-- Attachment #3: Type: text/plain, Size: 31 bytes --]
LMI> Greek (Ekkgmij\) Cei\ sar
[-- Attachment #4: Type: text/plain, Size: 52 bytes --]
LMI> Hebrew ylem
LMI> Italiano Ciao, Buon giorno
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: Type: text/plain; charset=lao, Size: 45 bytes --]
LMI> Lao(>RJRERG)
LMI> JP:R-^[1,
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: Type: text/plain; charset=lao, Size: 11 bytes --]
LMI> ^[0"m^[1
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #7: Type: text/plain; charset=lao, Size: 118 bytes --]
c
LMI> ^[1
LMI> b*!
LMI> ^[1
LMI> Maltese Ciao
LMI> Nederlands, Vlaams Hallo, Dag
LMI> Norwegian (Norsk) Hei, God dag
[-- Attachment #8: Type: text/plain, Size: 31 bytes --]
LMI> Polish Dzieq dobry, Hej
[-- Attachment #9: Type: text/plain, Size: 37 bytes --]
LMI> Russian (@caaZXY) 7T`PRabRcYbU!
[-- Attachment #10: Type: text/plain, Size: 24 bytes --]
LMI> Slovak Dobr} der
[-- Attachment #11: Type: text/plain, Size: 65 bytes --]
LMI> Spanish (Espaqol) !Hola!
LMI> Swedish (Svenska) Hej, Goddag
[-- Attachment #12: Type: text/plain, Size: 22 bytes --]
LMI> Thai (@RIRd7B)
[-- Attachment #13: Type: text/plain, Size: 9 bytes --]
LMI> JGQ
[-- Attachment #14: Type: text/plain, Size: 7 bytes --]
LMI> J
[-- Attachment #15: Type: text/plain, Size: 8 bytes --]
LMI> 4U
[-- Attachment #16: Type: text/plain, Size: 7 bytes --]
LMI> $
[-- Attachment #17: Type: text/plain, Size: 8 bytes --]
LMI> CQ
[-- Attachment #18: Type: text/plain, Size: 9 bytes --]
LMI> :,
[-- Attachment #19: Type: text/plain, Size: 9 bytes --]
LMI> JGQ
[-- Attachment #20: Type: text/plain, Size: 7 bytes --]
LMI> J
[-- Attachment #21: Type: text/plain, Size: 10 bytes --]
LMI> 4U$h
[-- Attachment #22: Type: text/plain, Size: 56 bytes --]
LMI> P
LMI> Tigrigna (^[$(3"8#r!N"^^[(B) ^[$(3!Q!,!<"8^[(B
[-- Attachment #23: Type: text/plain, Size: 30 bytes --]
LMI> Turkish (T|rkge) Merhaba
[-- Attachment #24: Type: text/plain, Size: 39 bytes --]
LMI> Vietnamese (Ti*ng Vi.t) Ch`o bUn
[-- Attachment #25: Type: text/plain, Size: 49 bytes --]
LMI> Japanese (日本語) こんにちは,
[-- Attachment #26: Type: text/plain, Size: 11 bytes --]
LMI> :]FAJ
[-- Attachment #27: Type: text/plain, Size: 37 bytes --]
LMI> Chinese (VPND,FUM(;0,::So) Dc:C
[-- Attachment #28: Type: text/plain, Size: 41 bytes --]
LMI> Cantonese (8f;y,<s*F8\) &-1a, 'A&n
[-- Attachment #29: Type: text/plain, Size: 129 bytes --]
LMI> Korean (한글) 안녕하세요, 안녕하십니까
LMI> Difference among chinese characters in GB, JIS, KSC, BIG5:
[-- Attachment #30: Type: text/plain, Size: 25 bytes --]
LMI> GB -- T*Fx ?*7"
[-- Attachment #31: Type: text/plain, Size: 37 bytes --]
LMI> JIS -- 元気 開発
[-- Attachment #32: Type: text/plain, Size: 37 bytes --]
LMI> KSC -- 元氣 開發
[-- Attachment #33: Type: text/plain, Size: 26 bytes --]
LMI> BIG5 -- $8.p 6}5o
[-- Attachment #34: Type: text/plain, Size: 43 bytes --]
LMI> Just for a test of JISX0212: 騏
[-- Attachment #35: Type: text/plain, Size: 413 bytes --]
LMI> 驎 (the second character is of JISX0212)
LMI> --
LMI> (domestic pets only, the antidote for overdose, milk.)
LMI> larsi@ifi.uio.no * Lars Magne Ingebrigtsen
--
Jack Vinson <jvinson@chevax.ecs.umass.edu> http://www.cis.upenn.edu/~vinson/
Zippy: Leona, I want to CONFESS things to you..
I want to WRAP you in a SCARLET ROBE trimmed with POLYVINYL CHLORIDE..
I want to EMPTY your ASHTRAYS...
* Re: Lots of sets.
1998-12-03 14:49 ` Michael Welsh Duggan
@ 1998-12-04 0:57 ` Lars Magne Ingebrigtsen
1998-12-04 5:12 ` Michael Welsh Duggan
0 siblings, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 1998-12-04 0:57 UTC
Michael Welsh Duggan <md5i@cs.cmu.edu> writes:
> Amharic: broken. I could see the escape codes rather than the characters.
> Lao: Partially broken. Some visible escape sequences.
> Tigrigna: Same as Amharic.
These are all, er, weird coding systems. Or, at least, they use weird
MULE charsets -- `composition'. What *is* that?
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: Lots of sets.
1998-12-04 0:57 ` Lars Magne Ingebrigtsen
@ 1998-12-04 5:12 ` Michael Welsh Duggan
0 siblings, 0 replies; 9+ messages in thread
From: Michael Welsh Duggan @ 1998-12-04 5:12 UTC
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Michael Welsh Duggan <md5i@cs.cmu.edu> writes:
>
> > Amharic: broken. I could see the escape codes rather than the characters.
> > Lao: Partially broken. Some visible escape sequences.
> > Tigrigna: Same as Amharic.
>
> These are all, er, weird coding systems. Or, at least, they use weird
> MULE charsets -- `composition'. What *is* that?
This is where you use base characters and modifiers to build up a
single character or glyph. (I'm using very informal language here.)
An example of composition for roman characters: `A' composed with `¨'
becomes `Ä'. Unicode has a lot of these composing characters, and
some character sets lend themselves to this form of representation.
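As a concrete illustration of the Unicode case (not of MULE's
`composition' charset, which works differently), here is a small Python
sketch using the standard `unicodedata` module. Note that Unicode composes
with the combining diaeresis U+0308 rather than the spacing `¨' character;
the variable names are just for this example.

```python
import unicodedata

base = "A"
combining_diaeresis = "\u0308"  # COMBINING DIAERESIS

# Two code points: the base letter followed by the combining mark.
decomposed = base + combining_diaeresis

# NFC normalization composes them into a single precomposed code point.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(unicodedata.name(composed))

# NFD goes the other way and splits the precomposed letter again.
assert unicodedata.normalize("NFD", composed) == decomposed
```

Both forms render as the same glyph; they differ only in how many code
points are used to represent it.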
--
Michael Duggan
(md5i@cs.cmu.edu)
.