From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/32570
Path: main.gmane.org!not-for-mail
From: Dave Love
Newsgroups: gmane.emacs.gnus.general
Subject: Re: \201 irritation! :-)
Date: Mon, 25 Sep 2000 12:55:16 +0100
Sender: owner-ding@hpc.uh.edu
Message-ID: <200009251155.MAA15586@djlvig.dl.ac.uk>
References: <00Aug28.151432edt.115218@gateway.intersys.com> <00Aug28.173634edt.115213@gateway.intersys.com> <200009051429.PAA09826@djlvig.dl.ac.uk> <200009082240.XAA16800@djlvig.dl.ac.uk> <87n1h86w6a.fsf@deneb.enyo.de> <200009181407.PAA02748@djlvig.dl.ac.uk> <87wvg7roi1.fsf@deneb.enyo.de> <200009211933.UAA08307@djlvig.dl.ac.uk> <87og1f8imp.fsf@deneb.enyo.de>
NNTP-Posting-Host: coloc-standby.netfonds.no
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: main.gmane.org 1035168834 20579 80.91.224.250 (21 Oct 2002 02:53:54 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Mon, 21 Oct 2002 02:53:54 +0000 (UTC)
Return-Path:
Original-Received: from fisher.math.uh.edu (fisher.math.uh.edu [129.7.128.35]) by mailhost.sclp.com (Postfix) with ESMTP id BF93DD051E for ; Mon, 25 Sep 2000 07:59:13 -0400 (EDT)
Original-Received: from sina.hpc.uh.edu (lists@Sina.HPC.UH.EDU [129.7.3.5]) by fisher.math.uh.edu (8.9.1/8.9.1) with ESMTP id GAC02074; Mon, 25 Sep 2000 06:55:44 -0500 (CDT)
Original-Received: by sina.hpc.uh.edu (TLB v0.09a (1.20 tibbs 1996/10/09 22:03:07)); Mon, 25 Sep 2000 06:55:08 -0500 (CDT)
Original-Received: from mailhost.sclp.com (postfix@66-209.196.61.interliant.com [209.196.61.66] (may be forged)) by sina.hpc.uh.edu (8.9.3/8.9.3) with ESMTP id GAA16065 for ; Mon, 25 Sep 2000 06:54:57 -0500 (CDT)
Original-Received: from djlvig.dl.ac.uk (djlvig.dl.ac.uk [148.79.112.146]) by mailhost.sclp.com (Postfix) with ESMTP id A585BD051E for ; Mon, 25 Sep 2000 07:55:17 -0400 (EDT)
Original-Received: (from fx@localhost) by djlvig.dl.ac.uk (8.8.7/8.8.5) id MAA15586; Mon, 25 Sep 2000 12:55:16 +0100
X-Authentication-Warning: djlvig.dl.ac.uk: fx set sender to d.love@dl.ac.uk using -f
X-Face: "_!nmR@11ZNuumt0oqG"Y3Hfy|;FGz)`"ul[G?ah6k-oNyDW?3/Nq3Qab$kUnUQ_d4};kPl R=}-Vqfo|S5mThi-kaBR=>%g5a3-OvnEhdHu{^APIaP:b}0m!$bDC>SX zz'r)e?`at?tpD*+~b+pf
Original-To: ding@gnus.org
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.0.90
Original-Lines: 50
Precedence: list
X-Majordomo: 1.94.jlt7
Xref: main.gmane.org gmane.emacs.gnus.general:32570
X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:32570

>>>>> "FW" == Florian Weimer writes:

 FW> If there is a byte-combination, "(char-after)" returns values
 FW> outside the usual 0 .. 255 range, and
 FW> "quoted-printable-encode-region" doesn't handle this.

Then it's either intrinsically broken or not being used appropriately
since it doesn't make sense for multibyte characters.  I suppose I'll
have to check it if no-one else can.

 FW> The best thing probably is to switch the buffer to
 FW> uni-byte mode during quoted-printable encoding.

Whatever you do, you need a buffer with encoded contents.  Making it
unibyte isn't right otherwise or you're dealing with the emacs-mule
charset.

 >> I don't know what that means.  Just spurious combination of leading
 >> bytes stuffed raw into a multibyte buffer or something else?

 FW> Take the "Chinese" line from the HELLO file, copy it to a
 FW> multi-byte buffer, do "encode-coding-region" on it and specify
 FW> "utf-8" as encoding.  Copy the result to a unibyte buffer, and
 FW> paste it back into the first (multi-byte) buffer.

This seems to be just confusing several issues.  I don't see how it's
related to spurious byte combination, unless it refers to some problem
in Mule-UCS itself.

Can you give a code snippet demonstrating the problems you're
concerned about without using utf-8 as an example?  It shouldn't be
relevant.  Be careful to include the relevant context (C-h C, at
least) since you're concerned with code conversion.

 FW> Now the \201s are there, you can see them if you switch the first
 FW> buffer to uni-byte mode.

Of course you see the internal encoding if you switch it to unibyte
mode anyway.  You may also have pasted in raw bytes.

 FW> The trouble with UTF-8 is that it tends to generate more
 FW> byte-combinations than other encodings, as it seems.

 >> Do you mean _de_code-coding-region?  `encode-coding-region' would
 >> produce raw bytes.

 FW> And these raw bytes are not properly dealt with in multi-byte buffers.

So don't do that.  That's the point.

I'm not getting any indication that there is an issue here that needs
to be addressed in Emacs in future.
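
For illustration only, here is a minimal sketch of the ordering argued
for in this message: do the code conversion first, so that you are
holding raw bytes, and only then apply quoted-printable.  It is not
taken from Gnus itself.  The helper names `my-qp-encode-text' and
`my-qp-decode-text' are hypothetical, the coding system is passed in
rather than assumed, and the sketch assumes an Emacs in which qp.el
provides `quoted-printable-encode-string' and
`quoted-printable-decode-string'.

```elisp
;; Sketch only: assumes qp.el (shipped with Gnus) is on the load path.
(require 'qp)

(defun my-qp-encode-text (text coding-system)
  "Hypothetical helper: encode TEXT with CODING-SYSTEM, then apply QP.
TEXT is a string of characters; the result contains only ASCII."
  ;; Step 1: code conversion.  The result is a unibyte string of raw
  ;; bytes, which is what quoted-printable is defined over.
  (let ((bytes (encode-coding-string text coding-system)))
    ;; Step 2: quoted-printable sees bytes, never multibyte characters.
    (quoted-printable-encode-string bytes)))

(defun my-qp-decode-text (qp coding-system)
  "Hypothetical inverse: undo QP on the string QP, then decode the bytes.
Decoding with CODING-SYSTEM turns the raw bytes back into characters."
  (decode-coding-string (quoted-printable-decode-string qp)
                        coding-system))

;; Example with a Latin-1 e-acute (character code #xe9), avoiding utf-8:
;; (my-qp-encode-text (string ?c ?a ?f #xe9) 'iso-8859-1)
;; (my-qp-decode-text "caf=E9" 'iso-8859-1)
```

The raw bytes exist only between the two steps and are never inserted
into a multibyte buffer, which is the situation the \201 reports
describe.  Checking `(multibyte-string-p (encode-coding-string ...))'
is a quick way to confirm that the intermediate value really is
unibyte.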