From: Lars Magne Ingebrigtsen
Newsgroups: gmane.emacs.gnus.general
Subject: Re: "Coding system"? Eh?
Date: 05 Sep 1998 22:07:43 +0200
To: ding@gnus.org
In-Reply-To: Michael Welsh Duggan's message of "Sat, 05 Sep 1998 16:31:58 GMT"
X-Mailer: Pterodactyl Gnus v0.17/Emacs 20.3

Michael Welsh Duggan writes:

> No, not really.  A character set is merely a set of characters.
> latin-1, etc, are often called character sets because they use the
> same number of characters as extended ASCII, etc.  A coding-system is
> just that: a coding-system.  The characters could be encoded any which
> way (including encrypted!).  For example, old-jis uses escapes around
> sequences of 7-bit characters.  This is an encoding, which you can
> display using a character set, but not a character set in and of
> itself.

All text consists of characters (from some character set) encoded
(using some coding system).  iso-8859-1, for instance, represents the
character LATIN-LETTER-A-WITH-UMLAUT ("ä") with one byte that contains
the number 0xe4.  The same letter encoded in a different charset (say,
Unicode) would occupy two bytes.  Other character sets, like
iso-2022-jp, use multiple bytes to represent characters.

When one talks about character sets (in, say, MIME) one talks about
encoded character sets.  Abstract character sets aren't all that
interesting when fiddling with data.

iso-8859-1, which MULE calls a coding system, is something everyone
else calls a character set.  The same goes for old-jis and iso-2022-jp.

Or something.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen
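
The byte counts mentioned above can be checked with a few lines of
Python.  This is only a rough sketch: Python and its codec names
("iso-8859-1", "utf-8", "utf-16-be", "iso2022_jp") are stand-ins chosen
for illustration and are not part of the original discussion, and the
hiragana sample character is likewise an assumption.

```python
# Illustrative sketch: Python codec names stand in for the charsets
# discussed in the message; none of this comes from Gnus or MULE.
ch = "\u00e4"  # the letter "a with umlaut"

latin1 = ch.encode("iso-8859-1")  # one byte
utf8 = ch.encode("utf-8")         # two bytes
utf16 = ch.encode("utf-16-be")    # two bytes, no byte-order mark

print(latin1.hex())  # e4
print(utf8.hex())    # c3a4
print(utf16.hex())   # 00e4

# iso-2022-jp uses escape sequences around runs of 7-bit bytes to
# switch character sets (sample character: HIRAGANA LETTER A).
jis = "\u3042".encode("iso2022_jp")
print(jis)
```

The same abstract character comes out as different byte sequences
depending on the encoding, which is why the MIME "charset" label has to
name the encoded form.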