From: Ken Raeburn
Newsgroups: gmane.emacs.gnus.general
Subject: naive charset question
Date: Sat, 20 Jul 2002 23:02:37 -0400
Original-To: ding@gnus.org

I'm trying to experiment a little with sending non-ASCII characters in email. As far as I'm aware, I've done nothing special to set up the charset handling for my Emacs or Gnus configurations.

I rather naively assume that if Gnus and Emacs are doing their job well, I should be able to take a buffer with non-ASCII characters displayed, stick it in an email message, and at most be prompted for which of the possible charsets should be used for certain non-ASCII characters, based on the characters actually used in the message.

No, let me change that statement: I would submit that a user-friendly multilingual MUA should behave that way by default. If I want to mention Kai by his full name and discuss money, I should probably be thinking "ess-tset" and "euro", not "latin-, uh, wait, let me look it up again, are they both in the same charset?" If I want to reply to and quote a message written in Japanese, the charset selection should Just Work.

To make it a bit challenging (perhaps too much so?), I tried inserting the HELLO buffer (C-h h) into a mail message (in Message mode); the buffer showed ASCII, Cyrillic, Hebrew and other characters, which no single 8-bit character set would encompass. I tried to send the message; it asked me for a charset, and when I hit "?" I got a bunch of options, including many that I'm sure would not support some of the characters in the buffer. Assuming (again naively) that the default should do something reasonable, I hit return, and the message was sent. The message that arrived in my mailbox says charset=us-ascii.

I tried the same test a second time, and picked a charset from the list, "arabic-1-column"; the message was immediately sent with charset=arabic-1-column.

Both messages look wrong in the *Article* buffer, of course.

I didn't find much in the documentation to indicate why my naive assumption might actually be wrong.
In ognus-0.06, message.texi says very little about charsets and encoding except "here's how you specify your choice", and gnus.texi appears to only talk about them in the context of viewing messages; it seems to be assumed that the reader already knows how charsets are used. If message.texi isn't going to go into it, a pointer to some introductory material elsewhere might be of use; after all, even some of us ignorant, self-centered Americans need to know how to communicate with the outside world.

So, how should I, naive in such issues, transmit such a HELLO message (not an attachment with a byte for byte copy of the HELLO file) so that a recipient (and preferably one not necessarily using Emacs) might view it as intended? Would this have actually worked if I were sending something that could be expressed using a single charset? Is there something about the HELLO file encoding that makes it a bad test case?

Ken

P.S. I'm using Emacs 21.3.50 built from the CVS repository within the last day or so, and, as I said, ognus-0.06.
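
A minimal sketch of the "pick a charset from the characters actually used" behavior described above, written in Python rather than Emacs Lisp. This is not how Gnus or Message mode chooses a charset; the candidate list, its order, and the names pick_charset and build_message are all assumptions made for illustration. The idea is simply to try each candidate in turn and fall back to UTF-8 when no single 8-bit charset covers the text, as with a HELLO-style mix of Cyrillic and Hebrew.

```python
from email.message import EmailMessage

# Hypothetical preference order (an assumption, not Gnus's actual list):
# narrow charsets first, UTF-8 as the catch-all.
CANDIDATE_CHARSETS = [
    "us-ascii",
    "iso-8859-1",
    "iso-8859-15",
    "koi8-r",
    "iso-8859-8",
    "utf-8",
]


def pick_charset(text, candidates=CANDIDATE_CHARSETS):
    """Return the first candidate charset that can encode every character."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this charset
        return charset
    raise ValueError("no candidate charset covers this text")


def build_message(text):
    """Build a text/plain message labeled with the charset chosen above."""
    msg = EmailMessage()
    msg.set_content(text, charset=pick_charset(text))
    return msg


if __name__ == "__main__":
    for sample in ["hello", "ß", "ß and €", "Hello, Здравствуйте, שלום"]:
        print(repr(sample), "->", pick_charset(sample))
```

The "ß and €" case is the one from the message: the sharp s alone fits in iso-8859-1, but adding the euro sign forces iso-8859-15 under this candidate order, which is exactly the lookup a user should not have to do by hand.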