Lots of sets.

Gnus development mailing list

* Lots of sets.
@ 1998-12-02 23:35 Lars Magne Ingebrigtsen
  1998-12-02 23:48 ` Hrvoje Niksic
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 1998-12-02 23:35 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 119 bytes --]

I wonder whether I can mail the HELLO file.  I guess we'll find out
now.  :-)

Amharic	(

[-- Attachment #2: Type: text/plain, Size: 151 bytes --]

Czech (česky)		Dobrý den
Danish (Dansk)		Hej, Goddag
English			Hello
Esperanto		Saluton
Estonian		Tere, Tervist
FORTRAN			PROGRAM
Finnish (Suomi)		Hei

[-- Attachment #3: Type: text/plain, Size: 96 bytes --]

French (Français)	Bonjour, Salut
German (Deutsch Nord)	Guten Tag
German (Deutsch Süd)	Grüß Gott

[-- Attachment #4: Type: text/plain, Size: 26 bytes --]

Greek (Ελληνικά)	Γειά σας

[-- Attachment #5: Type: text/plain, Size: 42 bytes --]

Hebrew			שלום
Italiano		Ciao, Buon giorno

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: Type: text/plain; charset=lao, Size: 24 bytes --]

Lao(¾ÒÊÒÅÒÇ)            

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #7: Type: text/plain; charset=lao, Size: 13 bytes --]

ÊÐºÒ^[0´Õ^[1, 

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #8: Type: text/plain; charset=lao, Size: 7 bytes --]

^[0¢í^[1ã

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #9: Type: text/plain; charset=lao, Size: 6 bytes --]

^[0Ëé^[1

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #10: Type: text/plain; charset=lao, Size: 3 bytes --]

⪡

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #11: Type: text/plain; charset=lao, Size: 83 bytes --]

^[0´Õ^[1
Maltese			Ciao
Nederlands, Vlaams	Hallo, Dag
Norwegian (Norsk)	Hei, God dag

[-- Attachment #12: Type: text/plain, Size: 26 bytes --]

Polish			Dzień dobry, Hej

[-- Attachment #13: Type: text/plain, Size: 32 bytes --]

Russian (Русский)	Здравствуйте!

[-- Attachment #14: Type: text/plain, Size: 19 bytes --]

Slovak			Dobrý deň

[-- Attachment #15: Type: text/plain, Size: 55 bytes --]

Spanish (Español)	¡Hola!
Swedish (Svenska)	Hej, Goddag

[-- Attachment #16: Type: text/plain, Size: 16 bytes --]

Thai (ภาษาไทย)		

[-- Attachment #17: Type: text/plain, Size: 3 bytes --]

สวั

[-- Attachment #18: Type: text/plain, Size: 1 bytes --]

ส

[-- Attachment #19: Type: text/plain, Size: 2 bytes --]

ดี

[-- Attachment #20: Type: text/plain, Size: 1 bytes --]

ค

[-- Attachment #21: Type: text/plain, Size: 2 bytes --]

รั

[-- Attachment #22: Type: text/plain, Size: 3 bytes --]

บ, 

[-- Attachment #23: Type: text/plain, Size: 3 bytes --]

สวั

[-- Attachment #24: Type: text/plain, Size: 1 bytes --]

ส

[-- Attachment #25: Type: text/plain, Size: 4 bytes --]

ดีค่

[-- Attachment #26: Type: text/plain, Size: 3 bytes --]

ะ


[-- Attachment #27: Type: text/plain, Size: 43 bytes --]

Tigrigna (

[-- Attachment #28: Type: text/plain, Size: 25 bytes --]

Turkish (Türkçe)	Merhaba

[-- Attachment #29: Type: text/plain, Size: 34 bytes --]

Vietnamese (Tiếng Việt)	Chào bạn


[-- Attachment #30: Type: text/plain, Size: 43 bytes --]

Japanese (日本語)		こんにちは, 

[-- Attachment #31: Type: text/plain, Size: 6 bytes --]

ｺﾝﾆﾁﾊ

[-- Attachment #32: Type: text/plain, Size: 32 bytes --]

Chinese (中文,普通话,汉语)	你好

[-- Attachment #33: Type: text/plain, Size: 36 bytes --]

Cantonese (粵語,廣東話)		早晨, 你好

[-- Attachment #34: Type: text/plain, Size: 119 bytes --]

Korean (한글)			안녕하세요, 안녕하십니까

Difference among chinese characters in GB, JIS, KSC, BIG5:

[-- Attachment #35: Type: text/plain, Size: 20 bytes --]

	GB   -- 元气  开发

[-- Attachment #36: Type: text/plain, Size: 32 bytes --]

	JIS  -- 元気  開発

[-- Attachment #37: Type: text/plain, Size: 32 bytes --]

	KSC  -- 元氣  開發

[-- Attachment #38: Type: text/plain, Size: 21 bytes --]

	BIG5 -- 元氣  開發


[-- Attachment #39: Type: text/plain, Size: 37 bytes --]

Just for a test of JISX0212: 騏

[-- Attachment #40: Type: text/plain, Size: 154 bytes --]

驎 (the second character is of JISX0212)


-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Magne Ingebrigtsen


* Re: Lots of sets.
  1998-12-02 23:35 Lots of sets Lars Magne Ingebrigtsen
@ 1998-12-02 23:48 ` Hrvoje Niksic
  1998-12-03  8:03 ` Yair Friedman
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Hrvoje Niksic @ 1998-12-02 23:48 UTC (permalink / raw)

Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
[...]

Something is not right here.

> --==-=-=
> Content-Type: text/plain; charset=iso-2022-jp
> 
> I wonder whether I can mail the HELLO file.  I guess we'll find out
> now.  :-)
> 
> Amharic	(^[$(3"c!<!N"^^[(B)	^[$(3!A!,!>^[(B

This is not good, because you are marking perfectly fine us-ascii text 
as iso-2022-jp (?!) instead of only the Amharic (?) file.  Some
mailers may choose not to show the ASCII text because of that.

With iso-8859-* things are slightly better, but still...

> --==-=-=
> Content-Type: text/plain; charset=lao
> Content-Transfer-Encoding: 8bit
> 
> Lao(žŇĘŇĹŇÇ)            
> --==-=-=
> Content-Type: text/plain; charset=lao
> Content-Transfer-Encoding: 8bit
> 
> ĘĐşŇ^[0´Ő^[1,
[... several more lao parts ...]

This is obviously wrong.  When autosplitting, Gnus should compress the
adjacent regions of equal charset to a single part.  The same with
thai-tis620.
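
A minimal sketch of that merging step, in Python rather than Gnus's Emacs Lisp. The (charset, text) pair representation and the name merge_adjacent_parts are assumptions made for illustration; they are not Gnus internals.

```python
from itertools import groupby


def merge_adjacent_parts(parts):
    """Collapse consecutive (charset, text) pairs that share a charset.

    `parts` is a hypothetical list of (charset, text) tuples in message order.
    Only neighbouring parts are merged, so the order of the text is preserved.
    """
    merged = []
    for charset, group in groupby(parts, key=lambda part: part[0]):
        merged.append((charset, "".join(text for _, text in group)))
    return merged


if __name__ == "__main__":
    # Placeholder strings stand in for the real Lao fragments.
    split = [
        ("lao", "fragment-1 "),
        ("lao", "fragment-2 "),
        ("lao", "fragment-3\n"),
        ("us-ascii", "Maltese\t\t\tCiao\n"),
    ]
    for charset, text in merge_adjacent_parts(split):
        print(charset, repr(text))
```

With this approach the six Lao parts would collapse into one, while a part in a different charset between two Lao parts would still keep them separate.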

Otherwise, the feature is quite cool, if a little pathological.  :-)

-- 
Hrvoje Niksic <hniksic@srce.hr> | Student at FER Zagreb, Croatia
--------------------------------+--------------------------------
The Lord protects children and fools...  But don't push it.


* Re: Lots of sets.
  1998-12-02 23:35 Lots of sets Lars Magne Ingebrigtsen
  1998-12-02 23:48 ` Hrvoje Niksic
@ 1998-12-03  8:03 ` Yair Friedman
  1998-12-03 12:11   ` Lars Magne Ingebrigtsen
  1998-12-03 14:49 ` Michael Welsh Duggan
  1998-12-03 21:01 ` Jack Vinson
  3 siblings, 1 reply; 9+ messages in thread
From: Yair Friedman @ 1998-12-03  8:03 UTC (permalink / raw)



Still using Gnus v5.6.43 with tm, hence the crude quoting, but...

Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:

> [5  <text/plain; iso-8859-8 (8bit)>]
> Hebrew			ùìåí
> Italiano		Ciao, Buon giorno

I'm not sure it's wise to display the Italian greetings in the Hebrew
charset; most readers probably don't have it installed, so they
won't see it.

iso-8859-8 is not a "valid" character set.  The message seems to be
using iso-8859-8-i (according to RFC 1556), but the whole point of
sending r2l language messages in (X)Emacs is moot.
--
Yair.




* Re: Lots of sets.
  1998-12-03  8:03 ` Yair Friedman
@ 1998-12-03 12:11   ` Lars Magne Ingebrigtsen
  1998-12-03 13:38     ` Yair Friedman
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 1998-12-03 12:11 UTC (permalink / raw)

[-- Attachment #1: Type: text/plain, Size: 846 bytes --]

Yair Friedman <yfriedma@JohnBryce.Co.Il> writes:

> > [5  <text/plain; iso-8859-8 (8bit)>]
> > Hebrew			ùìåí
> > Italiano		Ciao, Buon giorno
> 
> I'm not sure it's wise to display the Italian greetings in the Hebrew
> charset; most readers probably don't have it installed, so they
> won't see it.

But the Italian greeting was in all ASCII, so...

When encoding a message (or a message part), then one chooses a
charset that can encode that message (or part).  If one were to only
use the charset for the non-ASCII words, then one would have to send
gazillion-part messages, which is not what one would want.  One wants
as few parts as possible.
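
A rough sketch of that selection rule in Python: try a list of charsets in order of preference and take the first one that can encode the whole part. The function name pick_charset, the candidate list, and the utf-8 fallback are assumptions for illustration only; this is not how Gnus itself picks a charset.

```python
def pick_charset(text, candidates=("us-ascii", "iso-8859-1", "iso-8859-8", "utf-8")):
    """Return the first candidate charset that can encode all of `text`.

    The candidate order is a hypothetical preference list; ASCII-only text
    always ends up as us-ascii because that candidate is tried first.
    """
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset cannot represent some character
        return charset
    raise ValueError("no candidate charset can encode this text")


if __name__ == "__main__":
    print(pick_charset("Italiano\t\tCiao, Buon giorno"))
    print(pick_charset("Hebrew\t\t\t\u05e9\u05dc\u05d5\u05dd\nItaliano\t\tCiao, Buon giorno"))
```

The second call returns iso-8859-8 for the whole part, Italian line included, because that charset covers both the ASCII text and the Hebrew word. That is the trade-off described above: one part in a wider charset instead of one part per non-ASCII word.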

> iso-8859-8 is not a "valid" character set.  The message seems to be
> using iso-8859-8-i (according to RFC 1556), but the whole point of
> sending r2l language messages in (X)Emacs is moot.

[-- Attachment #2: Type: text/plain, Size: 438 bytes --]

Well -- isn't it nice to be able to say שלום?  :-)

(Although I don't know whether that is readable.  What does one do --
read the English text l2r and then the Hebrew word r2l?)

But since there is no proper r2l support in the Emacsen, the Arab
languages and Hebrew (for instance) thingies are more toys than really
usable, I guess.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen


* Re: Lots of sets.
  1998-12-03 12:11   ` Lars Magne Ingebrigtsen
@ 1998-12-03 13:38     ` Yair Friedman
  0 siblings, 0 replies; 9+ messages in thread
From: Yair Friedman @ 1998-12-03 13:38 UTC (permalink / raw)


Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> [2  <text/plain; iso-8859-8 (8bit)>]
> Well -- isn't it nice to be able to say ùìåí?  :-)
> 

Well, no: it appears reversed, so it's not readable, and because a final
letter appears first it's not even gibberish. :-(

> (Although I don't know whether that is readable.  What does one do --
> read the English text l2r and then the Hebrew word r2l?)

Actually yes, we always do that: read letters r2l and numbers l2r.
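
As a small illustration of the letters-versus-numbers distinction, the Unicode character database assigns Hebrew letters the bidirectional class "R" and European digits the class "EN". This sketch uses Python's standard unicodedata module; the helper name bidi_classes is made up here, and none of this describes what Emacs did at the time.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidirectional class) pairs, skipping whitespace."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text if not ch.isspace()]


if __name__ == "__main__":
    # The Hebrew greeting followed by a number, stored in logical order.
    sample = "\u05e9\u05dc\u05d5\u05dd 1998"
    for ch, cls in bidi_classes(sample):
        print(repr(ch), cls)
```

A display engine with bidirectional support would use these classes to lay the letters out right to left while keeping the digits left to right.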

> 
> But since there is no proper r2l support in the Emacsen, the Arab
> languages and Hebrew (for instance) thingies are more toys than really
> usable, I guess.

Emacs 20.5 will hopefully have support for r2l... I don't know about
XEmacs.
--
Yair.




* Re: Lots of sets.
  1998-12-02 23:35 Lots of sets Lars Magne Ingebrigtsen
  1998-12-02 23:48 ` Hrvoje Niksic
  1998-12-03  8:03 ` Yair Friedman
@ 1998-12-03 14:49 ` Michael Welsh Duggan
  1998-12-04  0:57   ` Lars Magne Ingebrigtsen
  1998-12-03 21:01 ` Jack Vinson
  3 siblings, 1 reply; 9+ messages in thread
From: Michael Welsh Duggan @ 1998-12-03 14:49 UTC (permalink / raw)



Just for reference, (and I'm not going to bother quoting):

All came through intact on my system as found in my HELLO file, with
the exception of the following:

Amharic: broken.  I could see the escape codes rather than the characters.
Lao: Partially broken.  Some visible escape sequences.
Tigrigna: Same as Amharic.

-- 
Michael Duggan
(md5i@cs.cmu.edu)
.




* Re: Lots of sets.
  1998-12-02 23:35 Lots of sets Lars Magne Ingebrigtsen
                   ` (2 preceding siblings ...)
  1998-12-03 14:49 ` Michael Welsh Duggan
@ 1998-12-03 21:01 ` Jack Vinson
  3 siblings, 0 replies; 9+ messages in thread
From: Jack Vinson @ 1998-12-03 21:01 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 965 bytes --]

>>>>> "LMI" == Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:

LMI> I wonder whether I can mail the HELLO file.  I guess we'll find out
LMI> now.  :-)

I don't think I received the entire HELLO file when I looked at the
document in the "default" view in pgnus-0.59.  The supercitation of the
message is below.  Will this get retranslated, or will it stay in ASCII?

Entries which seem to be missing: Arabic, Hindi, Tibetan.  Each of these
follows some unusual language translations.  Also, the message was 261 lines
long while the buffer was only 105.

When I do 'K b' to look at the 40 parts (as indicated in the mode line!),
it looks much worse.  The "Lao..." line is split into parts 6-11!  The Thai
section goes from 16-26.

LMI> Amharic	(^[$(3"c!<!N"^^[(B)	^[$(3!A!,!>^[(B
LMI> Czech (hesky)		Dobr} den
LMI> Danish (Dansk)		Hej, Goddag
LMI> English			Hello
LMI> Esperanto		Saluton
LMI> Estonian		Tere, Tervist
LMI> FORTRAN			PROGRAM
LMI> Finnish (Suomi)		Hei

[-- Attachment #2: Type: text/plain, Size: 111 bytes --]

LMI> French (Frangais)	Bonjour, Salut
LMI> German (Deutsch Nord)	Guten Tag
LMI> German (Deutsch S|d)	Gr|_ Gott

[-- Attachment #3: Type: text/plain, Size: 31 bytes --]

LMI> Greek (Ekkgmij\)	Cei\ sar

[-- Attachment #4: Type: text/plain, Size: 52 bytes --]

LMI> Hebrew			ylem
LMI> Italiano		Ciao, Buon giorno

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: Type: text/plain; charset=lao, Size: 45 bytes --]

LMI> Lao(>RJRERG)            
LMI> JP:R-^[1, 

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: Type: text/plain; charset=lao, Size: 11 bytes --]

LMI> ^[0"m^[1

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #7: Type: text/plain; charset=lao, Size: 118 bytes --]

c
LMI> ^[1
LMI> b*!
LMI> ^[1
LMI> Maltese			Ciao
LMI> Nederlands, Vlaams	Hallo, Dag
LMI> Norwegian (Norsk)	Hei, God dag

[-- Attachment #8: Type: text/plain, Size: 31 bytes --]

LMI> Polish			Dzieq dobry, Hej

[-- Attachment #9: Type: text/plain, Size: 37 bytes --]

LMI> Russian (@caaZXY)	7T`PRabRcYbU!

[-- Attachment #10: Type: text/plain, Size: 24 bytes --]

LMI> Slovak			Dobr} der

[-- Attachment #11: Type: text/plain, Size: 65 bytes --]

LMI> Spanish (Espaqol)	!Hola!
LMI> Swedish (Svenska)	Hej, Goddag

[-- Attachment #12: Type: text/plain, Size: 22 bytes --]

LMI> Thai (@RIRd7B)		

[-- Attachment #13: Type: text/plain, Size: 9 bytes --]

LMI> JGQ

[-- Attachment #14: Type: text/plain, Size: 7 bytes --]

LMI> J

[-- Attachment #15: Type: text/plain, Size: 8 bytes --]

LMI> 4U

[-- Attachment #16: Type: text/plain, Size: 7 bytes --]

LMI> $

[-- Attachment #17: Type: text/plain, Size: 8 bytes --]

LMI> CQ

[-- Attachment #18: Type: text/plain, Size: 9 bytes --]

LMI> :, 

[-- Attachment #19: Type: text/plain, Size: 9 bytes --]

LMI> JGQ

[-- Attachment #20: Type: text/plain, Size: 7 bytes --]

LMI> J

[-- Attachment #21: Type: text/plain, Size: 10 bytes --]

LMI> 4U$h

[-- Attachment #22: Type: text/plain, Size: 56 bytes --]

LMI> P

LMI> Tigrigna (^[$(3"8#r!N"^^[(B)	^[$(3!Q!,!<"8^[(B

[-- Attachment #23: Type: text/plain, Size: 30 bytes --]

LMI> Turkish (T|rkge)	Merhaba

[-- Attachment #24: Type: text/plain, Size: 39 bytes --]

LMI> Vietnamese (Ti*ng Vi.t)	Ch`o bUn


[-- Attachment #25: Type: text/plain, Size: 49 bytes --]

LMI> Japanese (日本語)		こんにちは, 

[-- Attachment #26: Type: text/plain, Size: 11 bytes --]

LMI> :]FAJ

[-- Attachment #27: Type: text/plain, Size: 37 bytes --]

LMI> Chinese (VPND,FUM(;0,::So)	Dc:C

[-- Attachment #28: Type: text/plain, Size: 41 bytes --]

LMI> Cantonese (8f;y,<s*F8\)		&-1a, 'A&n

[-- Attachment #29: Type: text/plain, Size: 129 bytes --]

LMI> Korean (한글)			안녕하세요, 안녕하십니까

LMI> Difference among chinese characters in GB, JIS, KSC, BIG5:

[-- Attachment #30: Type: text/plain, Size: 25 bytes --]

LMI> 	GB   -- T*Fx  ?*7"

[-- Attachment #31: Type: text/plain, Size: 37 bytes --]

LMI> 	JIS  -- 元気  開発

[-- Attachment #32: Type: text/plain, Size: 37 bytes --]

LMI> 	KSC  -- 元氣  開發

[-- Attachment #33: Type: text/plain, Size: 26 bytes --]

LMI> 	BIG5 -- $8.p  6}5o


[-- Attachment #34: Type: text/plain, Size: 43 bytes --]

LMI> Just for a test of JISX0212: 騏

[-- Attachment #35: Type: text/plain, Size: 413 bytes --]

LMI> 驎 (the second character is of JISX0212)


LMI> -- 
LMI> (domestic pets only, the antidote for overdose, milk.)
LMI>   larsi@ifi.uio.no * Lars Magne Ingebrigtsen

-- 
Jack Vinson <jvinson@chevax.ecs.umass.edu>    http://www.cis.upenn.edu/~vinson/
Zippy: Leona, I want to CONFESS things to you..
 I want to WRAP you in a SCARLET ROBE trimmed with POLYVINYL CHLORIDE..
 I want to EMPTY your ASHTRAYS...


* Re: Lots of sets.
  1998-12-03 14:49 ` Michael Welsh Duggan
@ 1998-12-04  0:57   ` Lars Magne Ingebrigtsen
  1998-12-04  5:12     ` Michael Welsh Duggan
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 1998-12-04  0:57 UTC (permalink / raw)


Michael Welsh Duggan <md5i@cs.cmu.edu> writes:

> Amharic: broken.  I could see the escape codes rather than the characters.
> Lao: Partially broken.  Some visible escape sequences.
> Tigrigna: Same as Amharic.

These are all, er, weird coding systems.  Or, at least, they use weird 
MULE charsets -- `composition'.  What *is* that?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen



* Re: Lots of sets.
  1998-12-04  0:57   ` Lars Magne Ingebrigtsen
@ 1998-12-04  5:12     ` Michael Welsh Duggan
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Welsh Duggan @ 1998-12-04  5:12 UTC (permalink / raw)

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Michael Welsh Duggan <md5i@cs.cmu.edu> writes:
> 
> > Amharic: broken.  I could see the escape codes rather than the characters.
> > Lao: Partially broken.  Some visible escape sequences.
> > Tigrigna: Same as Amharic.
> 
> These are all, er, weird coding systems.  Or, at least, they use weird 
> MULE charsets -- `composition'.  What *is* that?

This is where you use base characters and modifiers to build up a
single character or glyph.  (I'm using very informal language here.)
An example of composition for roman characters: `A' composed with `¨'
becomes `Ä'.  Unicode has a lot of these composing characters, and
some character sets lend themselves to this form of representation.
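
The `A' plus diaeresis example can be reproduced with Unicode combining characters, which is a related but not identical mechanism to MULE's `composition' charset. A minimal sketch using Python's standard unicodedata module, assuming the combining form U+0308 rather than the spacing `¨' (U+00A8) written above:

```python
import unicodedata

base = "A"
combining_diaeresis = "\u0308"  # COMBINING DIAERESIS, not the spacing U+00A8

# Two code points: base character followed by a modifier.
decomposed = base + combining_diaeresis

# NFC normalization replaces the pair with the precomposed character.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(unicodedata.name(composed))
```

Normalizing with "NFD" goes the other way and splits the precomposed character back into base plus modifier.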

-- 
Michael Duggan
(md5i@cs.cmu.edu)
.

