From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/32570
Path: main.gmane.org!not-for-mail
From: Dave Love <d.love@dl.ac.uk>
Newsgroups: gmane.emacs.gnus.general
Subject: Re: \201 irritation! :-)
Date: Mon, 25 Sep 2000 12:55:16 +0100
Sender: owner-ding@hpc.uh.edu
Message-ID: <200009251155.MAA15586@djlvig.dl.ac.uk>
References: <oqaedytybi.fsf@titan.progiciels-bpi.ca>
	<00Aug28.151432edt.115218@gateway.intersys.com>
	<oqog2dqhza.fsf@titan.progiciels-bpi.ca>
	<00Aug28.173634edt.115213@gateway.intersys.com>
	<oqitslatr0.fsf@titan.progiciels-bpi.ca>
	<200009051429.PAA09826@djlvig.dl.ac.uk>
	<oqk8cpr2kk.fsf@titan.progiciels-bpi.ca>
	<200009082240.XAA16800@djlvig.dl.ac.uk> <87n1h86w6a.fsf@deneb.enyo.de>
	<200009181407.PAA02748@djlvig.dl.ac.uk> <87wvg7roi1.fsf@deneb.enyo.de>
	<200009211933.UAA08307@djlvig.dl.ac.uk> <87og1f8imp.fsf@deneb.enyo.de>
NNTP-Posting-Host: coloc-standby.netfonds.no
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: main.gmane.org 1035168834 20579 80.91.224.250 (21 Oct 2002 02:53:54 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Mon, 21 Oct 2002 02:53:54 +0000 (UTC)
Return-Path: <owner-ding@hpc.uh.edu>
Original-Received: from fisher.math.uh.edu (fisher.math.uh.edu [129.7.128.35])
	by mailhost.sclp.com (Postfix) with ESMTP id BF93DD051E
	for <jason@mailhost.sclp.com>; Mon, 25 Sep 2000 07:59:13 -0400 (EDT)
Original-Received: from sina.hpc.uh.edu (lists@Sina.HPC.UH.EDU [129.7.3.5])
	by fisher.math.uh.edu (8.9.1/8.9.1) with ESMTP id GAC02074;
	Mon, 25 Sep 2000 06:55:44 -0500 (CDT)
Original-Received: by sina.hpc.uh.edu (TLB v0.09a (1.20 tibbs 1996/10/09 22:03:07)); Mon, 25 Sep 2000 06:55:08 -0500 (CDT)
Original-Received: from mailhost.sclp.com (postfix@66-209.196.61.interliant.com [209.196.61.66] (may be forged))
	by sina.hpc.uh.edu (8.9.3/8.9.3) with ESMTP id GAA16065
	for <ding@hpc.uh.edu>; Mon, 25 Sep 2000 06:54:57 -0500 (CDT)
Original-Received: from djlvig.dl.ac.uk (djlvig.dl.ac.uk [148.79.112.146])
	by mailhost.sclp.com (Postfix) with ESMTP id A585BD051E
	for <ding@gnus.org>; Mon, 25 Sep 2000 07:55:17 -0400 (EDT)
Original-Received: (from fx@localhost)
	by djlvig.dl.ac.uk (8.8.7/8.8.5) id MAA15586;
	Mon, 25 Sep 2000 12:55:16 +0100
X-Authentication-Warning: djlvig.dl.ac.uk: fx set sender to d.love@dl.ac.uk using -f
X-Face: "_!nmR@11ZNuumt0oqG"Y3Hfy|;FGz)`"ul[G?ah6k-oNyDW?3/Nq3Qab$kUnUQ_d4};kPl
 R=}-Vqfo|S5mThi-k<sb8dDL76.\(='H$ee]<E5E=}pX[X&;]RITcPw-6GDeIlbJlshHd~p1Ip&xZE
 [dYv$`;'AXF-OV$@6dcl/*rvO{8a~`_Qq2FJZx>aBR=>%g5a3-OvnEhdHu{^APIaP:b}0m!$bDC>SX
 zz'r)e?`at?tpD*+~b+pf
Original-To: ding@gnus.org
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.0.90
Original-Lines: 50
Precedence: list
X-Majordomo: 1.94.jlt7
Xref: main.gmane.org gmane.emacs.gnus.general:32570
X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:32570

>>>>> "FW" == Florian Weimer <fw@deneb.enyo.de> writes:

 FW> If there is a byte-combination, "(char-after)" returns values
 FW> outside the usual 0 .. 255 range, and
 FW> "quoted-printable-encode-region" doesn't handle this.

Then `quoted-printable-encode-region' is either intrinsically broken or
not being used appropriately, since quoted-printable encoding doesn't
make sense for multibyte characters.  I suppose I'll have to check it
if no-one else can.

 FW> The best thing probably is to switch the buffer to
 FW> uni-byte mode during quoted-printable encoding.

Whatever you do, you need a buffer with encoded contents.  Making it
unibyte isn't right otherwise: you would just be dealing with the
bytes of the internal emacs-mule encoding.
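
To illustrate the point about needing encoded contents, here is a
minimal sketch in Python rather than Emacs Lisp.  Python's standard
quopri module stands in for the Gnus function, and the helper name and
sample string are made up for the example.  Quoted-printable works on
octets, so the text is encoded to a charset first and only the
resulting bytes are QP-encoded:

```python
import quopri


def qp_encode_text(text, charset="utf-8"):
    """Hypothetical helper: encode text to bytes first, then apply QP."""
    raw = text.encode(charset)  # step 1: characters -> octets
    return quopri.encodestring(raw)  # step 2: octets -> quoted-printable


sample = "caf\u00e9"  # illustrative sample, not taken from the thread
encoded = qp_encode_text(sample)
print(encoded)

# Reversing both steps, in the opposite order, recovers the original text.
print(quopri.decodestring(encoded).decode("utf-8") == sample)
```

The order matters: the QP step never sees characters, only the bytes
produced by the charset encoding.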

 >> I don't know what that means.  Just spurious combination of leading
 >> bytes stuffed raw into a multibyte buffer or something else?

 FW> Take the "Chinese" line from the HELLO file, copy it to a
 FW> multi-byte buffer, do "encode-coding-region" on it and specify
 FW> "utf-8" as encoding.  Copy the result to a unibyte buffer, and
 FW> paste it back into the first (multi-byte) buffer.  

This seems to be confusing several issues.  I don't see how it's
related to spurious byte combination, unless it refers to some problem
in Mule-UCS itself.  Can you give a code snippet demonstrating the
problems you're concerned about without using utf-8 as an example?
The choice of utf-8 shouldn't be relevant.  Be careful to include the
relevant context (the output of C-h C, at least) since you're
concerned with code conversion.

 FW> Now the \201s are there, you can see them if you switch the first
 FW> buffer to uni-byte mode.

Of course you see the internal encoding if you switch it to unibyte
mode anyway.  You may also have pasted in raw bytes.

 FW> The trouble with UTF-8 is that it tends to generate more
 FW> byte-combinations than other encodings, as it seems.

 >> Do you mean _de_code-coding-region?  `encode-coding-region' would
 >> produce raw bytes.

 FW> And these raw bytes are not properly dealt with in multi-byte buffers.

So don't do that.  That's the point.
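
To make that concrete outside Emacs, here is a minimal Python sketch.
It is an analogy only, not a description of what Emacs does
internally: the sample text is made up, and Latin-1 is just a stand-in
for "treating each raw byte as a character".  The output of an encode
step is raw bytes, and it only becomes text again through the matching
decode step:

```python
text = "\u4f60\u597d"  # illustrative two-character Chinese sample

raw = text.encode("utf-8")  # encoding yields raw bytes, not characters
print(len(text), len(raw))

# Treating each raw byte as a character gives unrelated characters.
misread = raw.decode("latin-1")
print(misread == text)

# Decoding with the coding system used for encoding restores the text.
print(raw.decode("utf-8") == text)
```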

I'm not getting any indication that there is an issue here that needs
to be addressed in Emacs in future.