mm-with-unibyte-current-buffer is bad for Emacs 23

Gnus development mailing list

* mm-with-unibyte-current-buffer is bad for Emacs 23
@ 2006-02-28  9:48 Katsumi Yamaoka
  2006-02-28 13:40 ` Reiner Steib
  0 siblings, 1 reply; 4+ messages in thread
From: Katsumi Yamaoka @ 2006-02-28  9:48 UTC (permalink / raw)


Hi,

The macro `mm-with-unibyte-current-buffer' is used here and
there in Gnus.  In Emacs 23, I realized there is a possibility
that it breaks non-ASCII text.  For instance:

(with-temp-buffer
  (set-buffer-multibyte t)
  (insert "Àµ")
  (encode-coding-region (point-min) (point-max) 'iso-8859-1)
  (mm-with-unibyte-current-buffer (ignore))
  (decode-coding-string (buffer-string) 'iso-8859-1))
 => "µ"

It also happens with text other than Latin-1.  I haven't actually
encountered this problem while using Gnus, though.

Regards,




* Re: mm-with-unibyte-current-buffer is bad for Emacs 23
  2006-02-28  9:48 mm-with-unibyte-current-buffer is bad for Emacs 23 Katsumi Yamaoka
@ 2006-02-28 13:40 ` Reiner Steib
  2006-02-28 23:47   ` Katsumi Yamaoka
  0 siblings, 1 reply; 4+ messages in thread
From: Reiner Steib @ 2006-02-28 13:40 UTC (permalink / raw)


On Tue, Feb 28 2006, Katsumi Yamaoka wrote:

> The macro `mm-with-unibyte-current-buffer' is used here and
> there in Gnus.  In Emacs 23, I realized there is a possibility
> that it breaks non-ASCII text.

Unless you can fix it, please put a note into the code like...

;; FIXME: ... Emacs 23 (unicode)

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/





* Re: mm-with-unibyte-current-buffer is bad for Emacs 23
  2006-02-28 13:40 ` Reiner Steib
@ 2006-02-28 23:47   ` Katsumi Yamaoka
  2006-03-03  4:29     ` Katsumi Yamaoka
  0 siblings, 1 reply; 4+ messages in thread
From: Katsumi Yamaoka @ 2006-02-28 23:47 UTC (permalink / raw)

>>>>> In <v9r75nhaa5.fsf@marauder.physik.uni-ulm.de> Reiner Steib wrote:

> On Tue, Feb 28 2006, Katsumi Yamaoka wrote:

>> The macro `mm-with-unibyte-current-buffer' is used here and
>> there in Gnus.  In Emacs 23, I realized there is a possibility
>> that it breaks non-ASCII text.

> Unless you can fix it, please put a note into the code like...

> ;; FIXME: ... Emacs 23 (unicode)

If anything, we should avoid calling `(set-buffer-multibyte t)' in
non-empty unibyte buffers unless there is a special reason to do so.
For instance, the following code, which decodes encoded text for
display, may work only under limited conditions.

  (set-buffer-multibyte t)
  (decode-coding-region (point-min) (point-max) 'CODING-SYSTEM)

But we need to change the order as follows for Emacs 23.

  (decode-coding-region (point-min) (point-max) 'CODING-SYSTEM)
  (set-buffer-multibyte t)

However, even this may stop working if the Emacs 23 specification
changes in the future.  Kenichi Handa gave me a suggestion
yesterday: the best way to do such a thing is to manipulate the
data outside the buffer.  For instance:

  (insert
   (prog1
       (decode-coding-string (buffer-string) 'CODING-SYSTEM)
     (erase-buffer)
     (set-buffer-multibyte t)))
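
The same idea can be wrapped in a small helper.  This is only a
minimal sketch: the name `my-decode-buffer-outside' is hypothetical
and is not part of Gnus, and it assumes the current buffer is unibyte
and holds the raw encoded bytes.

```elisp
;; Hypothetical helper, for illustration only (not a Gnus function).
(defun my-decode-buffer-outside (coding-system)
  "Decode the current unibyte buffer with CODING-SYSTEM via a string.
The bytes are decoded outside the buffer, so the buffer is already
empty when it is switched to multibyte."
  ;; Decode a copy of the raw bytes as a string first.
  (let ((decoded (decode-coding-string (buffer-string) coding-system)))
    ;; Nothing is left in the buffer for `set-buffer-multibyte' to alter.
    (erase-buffer)
    (set-buffer-multibyte t)
    (insert decoded)))

(with-temp-buffer
  (set-buffer-multibyte nil)
  (insert "\300\265")              ; the iso-8859-1 bytes of "Àµ"
  (my-decode-buffer-outside 'iso-8859-1)
  (buffer-string))
;; => "Àµ"
```

Because the buffer is erased before its multibyteness changes, the
result should not depend on how `set-buffer-multibyte' treats
existing 8-bit data.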

I don't know in much detail how the macro is used in the Gnus
modules, and I don't have the capacity to examine all of them
either.  So, I've added the following note to the docstring.

#v+
NOTE: Use this macro with caution in multibyte buffers (it is not
worth using this macro in unibyte buffers of course).  Use of
`(set-buffer-multibyte t)', which is run finally, is generally
harmful since it is likely to modify existing data in the buffer.
For instance, it converts "\300\255" into "\255" in Emacs 23.
#v-


* Re: mm-with-unibyte-current-buffer is bad for Emacs 23
  2006-02-28 23:47   ` Katsumi Yamaoka
@ 2006-03-03  4:29     ` Katsumi Yamaoka
  0 siblings, 0 replies; 4+ messages in thread
From: Katsumi Yamaoka @ 2006-03-03  4:29 UTC (permalink / raw)

>> On Tue, Feb 28 2006, Katsumi Yamaoka wrote:

>>> The macro `mm-with-unibyte-current-buffer' is used here and
>>> there in Gnus.  In Emacs 23, I realized there is a possibility
>>> that it breaks non-ASCII text.

I have now encountered actual problems.  In Emacs 23, text parts
of Japanese multipart messages that use the shift_jis charset with
the 8bit encoding are displayed with some broken characters.  In
that case, `mm-with-unibyte-current-buffer' is called at least
twice on the raw message: once to extract the whole body, and again
to extract each part.  The raw message is already broken the first
time the macro is called.  So, I've modified the `mm-get-part'
function so that it no longer uses the
`mm-with-unibyte-current-buffer' macro.

In addition, I've fixed the display table used in the summary
buffer.  It nixed out the values 127 through 255, but in Emacs 23
some Latin characters have the values 160 through 255.  For
example:

(make-char 'latin-iso8859-1 160)
;; Emacs 23
 => 160
;; Emacs 22
 => 2208

Because of this, "Sébastien" was displayed as "S?bastien" in the
summary buffer.
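
The kind of check involved can be sketched as follows.  This is a
rough illustration rather than the actual Gnus code: the function
name `my-make-summary-display-table' is hypothetical, and the sketch
assumes that testing the value of `make-char' is an acceptable way to
detect the Emacs 23 character representation.

```elisp
;; Hypothetical sketch, not the code committed to Gnus.
(defun my-make-summary-display-table ()
  "Return a display table that shows unwanted 8-bit values as `?'.
Values 127 through 255 are nixed out, except that 160 through 255
are left alone when they are Latin characters, as in Emacs 23."
  (let ((table (make-display-table))
        ;; Stop below 160 if Latin-1 characters live at 160..255.
        (i (if (eq (make-char 'latin-iso8859-1 160) 160)
               160
             256)))
    (while (>= (setq i (1- i)) 127)
      (aset table i [??]))
    table))
```

With such a table, a name like "Sébastien" is no longer affected when
its accented character has a value in the 160 through 255 range.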

These changes are in both the trunk and the v5-10 branch.
