From: Lars Magne Ingebrigtsen
Newsgroups: gmane.emacs.gnus.general
Subject: Re: "Coding system"? Eh?
Date: Sun, 20 Oct 2002 23:13:16 +0000 (UTC)
In-Reply-To: davidk@lysator.liu.se's message of "07 Sep 1998 17:12:40 +0200"
X-Mailer: Pterodactyl Gnus v0.18/Emacs 20.3

> ISO 8859-1 is both a character set, and an encoding (one-to-one from
> character to byte), I believe. But I'm not sure how it is defined.

A character set is an encoding in normal usage.  Quoth RFC2045:

  2.2.  Character Set

  The term "character set" is used in MIME to refer to a method of
  converting a sequence of octets into a sequence of characters.  Note
  that unconditional and unambiguous conversion in the other direction
  is not required, in that not all characters may be representable by a
  given character set and a character set may provide more than one
  sequence of octets to represent a particular sequence of characters.

  This definition is intended to allow various kinds of character
  encodings, from simple single-table mappings such as US-ASCII to
  complex table switching methods such as those that use ISO 2022's
  techniques, to be used as character sets.  However, the definition
  associated with a MIME character set name must fully specify the
  mapping to be performed.  In particular, use of external profiling
  information to determine the exact mapping is not permitted.

  NOTE: The term "character set" was originally to describe such
  straightforward schemes as US-ASCII and ISO-8859-1 which have a
  simple one-to-one mapping from single octets to single characters.
  Multi-octet coded character sets and switching techniques make the
  situation more complex.  For example, some communities use the term
  "character encoding" for what MIME calls a "character set", while
  using the phrase "coded character set" to denote an abstract mapping
  from integers (not octets) to characters.

--
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen
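
The three properties in the RFC text quoted above (a one-to-one single-table mapping, characters that cannot be represented, and more than one octet sequence for the same character) can be tried out with a short sketch. This is only an illustration using Python's standard codecs, not anything from Gnus or Emacs; the choice of UTF-7 as the example with multiple octet sequences and all of the variable names are my own assumptions.

```python
# Sketch of the RFC 2045 notion of a "character set": a method of
# converting octets into characters, not necessarily reversible.

# 1. ISO-8859-1 is a single-table, one-to-one mapping: every octet
#    0..255 decodes to the character with the same code point.
all_octets = bytes(range(256))
latin1_chars = all_octets.decode("iso-8859-1")
one_to_one = [ord(c) for c in latin1_chars] == list(range(256))

# 2. Conversion in the other direction is not guaranteed: the euro
#    sign (U+20AC) has no octet in ISO-8859-1.
try:
    "\u20ac".encode("iso-8859-1")
    euro_representable = True
except UnicodeEncodeError:
    euro_representable = False

# 3. A character set may offer several octet sequences for the same
#    characters. UTF-7 (chosen here as an assumption, for illustration)
#    accepts "A" both directly and as a base64 shift sequence.
direct = b"A".decode("utf-7")
shifted = b"+AEE-".decode("utf-7")
same_character = direct == shifted == "A"

print(one_to_one, euro_representable, same_character)
```

In MIME terms all three codecs above are "character sets", even though only the first is the simple octet-to-character table the term originally described.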