* rfc2047.el bug?
From: Katsumi Yamaoka @ 2003-09-10 4:29 UTC
Hi,
It may not cause a problem in practice, but something is wrong
in rfc2047.el. Here is an extract from rfc2047.txt:
An 'encoded-word' may not be more than 75 characters long, including
'charset', 'encoding', 'encoded-text', and delimiters. If it is
desirable to encode more text than will fit in an 'encoded-word' of
75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may
be used.
While there is no limit to the length of a multiple-line header
field, each line of a header field that contains one or more
'encoded-word's is limited to 76 characters.
The length restrictions are included both to ease interoperability
through internetwork mail gateways, and to impose a limit on the
amount of lookahead a header parser must employ (while looking for a
final ?= delimiter) before it can decide whether a token is an
"encoded-word" or something else.
Gnus sometimes encodes mixed Japanese-English text into lines
longer than 76 characters. For example:
Subject: 寿ju 限ge 無mu 寿ju 限ge 無mu 五go 劫kou
Subject: =?iso-2022-jp?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCPHcbKEJqdQ==?=
 =?iso-2022-jp?b?IBskQjhCGyhCZ2UgGyRCTDUbKEJtdSAbJEI4XhsoQmdvIBskQjllGyhCaw==?=
 =?iso-2022-jp?b?b3U=?=
For comparison, this is what FLIM's eword-encode.el produces:
Subject: =?ISO-2022-JP?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUg?=
 =?ISO-2022-JP?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCOF4bKEJn?=
 =?ISO-2022-JP?b?byAbJEI5ZRsoQmtvdQ==?=
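
The two limits quoted from RFC 2047 are easy to check mechanically.
The following is a minimal Python sketch, not code from Gnus or FLIM;
the name check_header and the regular expression are assumptions made
for illustration, and continuation lines are assumed to be folded with
a single leading space.

```python
import re

# Limits quoted from RFC 2047.
MAX_ENCODED_WORD = 75
MAX_LINE = 76

# Rough pattern for one encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def check_header(lines):
    """Return (line_number, problem) tuples for the lines of a folded header."""
    problems = []
    for number, line in enumerate(lines, start=1):
        words = ENCODED_WORD.findall(line)
        # The line limit only applies to lines that contain an encoded-word.
        if words and len(line) > MAX_LINE:
            problems.append((number, f"line is {len(line)} characters"))
        for word in words:
            if len(word) > MAX_ENCODED_WORD:
                problems.append((number, f"encoded-word is {len(word)} characters"))
    return problems


gnus = [
    "Subject: =?iso-2022-jp?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCPHcbKEJqdQ==?=",
    " =?iso-2022-jp?b?IBskQjhCGyhCZ2UgGyRCTDUbKEJtdSAbJEI4XhsoQmdvIBskQjllGyhCaw==?=",
    " =?iso-2022-jp?b?b3U=?=",
]
flim = [
    "Subject: =?ISO-2022-JP?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUg?=",
    " =?ISO-2022-JP?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCOF4bKEJn?=",
    " =?ISO-2022-JP?b?byAbJEI5ZRsoQmtvdQ==?=",
]

for name, header in (("gnus", gnus), ("flim", flim)):
    print(name, check_header(header))
```

Run on the two headers above, it flags the first two lines of the Gnus
output (each holds a 78-character encoded-word) and reports nothing for
the FLIM output.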
--
Katsumi Yamaoka <yamaoka@jpl.org>
* Re: rfc2047.el bug?
From: Jesper Harder @ 2003-09-11 0:40 UTC
Katsumi Yamaoka <yamaoka@jpl.org> writes:
> Gnus sometimes encodes mixed Japanese-English text into lines
> longer than 76 characters. For example:
>
> Subject: 寿ju 限ge 無mu 寿ju 限ge 無mu 五go 劫kou
>
> Subject: =?iso-2022-jp?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCPHcbKEJqdQ==?=
> =?iso-2022-jp?b?IBskQjhCGyhCZ2UgGyRCTDUbKEJtdSAbJEI4XhsoQmdvIBskQjllGyhCaw==?=
> =?iso-2022-jp?b?b3U=?=
Hey, I can make an even worse example :-)
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/plain; charset=iso-2022-cn-ext, Size: 853 bytes --]
Subject: ^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<
Subject: =?iso-2022-cn-ext?b?GyQrSRtPITwbJCtNG08hfhskK0kbTyE8GyQrTRtPIX4bJCtJG08hPBskK00bTyF+GyQrSRtPITwbJCtNG08hfhskK0kbTyE8GyQrTRtPIX4bJCtJG08hPBskK00bTyF+GyQrSRtPITwbJCtNG08hfhskK0kbTyE8?=
No fewer than 191 characters! Oh boy, this means that in some cases
we'll have to split the header every 4 characters.
I've checked in a fix. It:
* Assumes a worst-case expansion factor of 8 for every charset that
uses B encoding in `rfc2047-charset-encoding-alist' where I didn't
know better. That's correct for iso-2022-cn-ext; if someone knows
better values for other charsets, please fill them in.
* Only fixes the 75-character limit on encoded-words. It's still
possible to exceed the 76-character line length limit, although
it's less likely.
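
The arithmetic behind the "every 4 characters" remark can be sketched
as follows. This is an illustrative Python calculation, not the code
that was checked in to rfc2047.el; the function names are hypothetical,
and the figure of 8 bytes per character is read off the iso-2022-cn-ext
sample above, where each character carries its own escape sequences.

```python
MAX_ENCODED_WORD = 75  # RFC 2047 limit on one encoded-word


def base64_length(n_bytes):
    """Length of the padded base64 encoding of n_bytes bytes."""
    return ((n_bytes + 2) // 3) * 4


def max_chars_per_word(charset, worst_bytes_per_char):
    """Characters guaranteed to fit in one B-encoded word (worst case)."""
    overhead = len(f"=?{charset}?b?") + len("?=")
    room = MAX_ENCODED_WORD - overhead      # characters left for base64 text
    payload_bytes = (room // 4) * 3         # whole base64 groups only
    return payload_bytes // worst_bytes_per_char


# The sample subject: 15 characters at 8 bytes each, in a single encoded-word.
sample_line = (len("Subject: ") + len("=?iso-2022-cn-ext?b?")
               + base64_length(15 * 8) + len("?="))

print(sample_line)                                # 191
print(max_chars_per_word("iso-2022-cn-ext", 8))   # 4
```

With a 20-character prefix and a 2-character suffix, only 53 characters
remain for the base64 text, which is 39 payload bytes, so no more than
4 characters are safe at 8 bytes each.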
* Re: rfc2047.el bug?
From: Katsumi Yamaoka @ 2003-09-11 1:34 UTC
>>>>> In <m3znhcxili.fsf@defun.localdomain>
>>>>> Jesper Harder <harder@myrealbox.com> wrote:
> Katsumi Yamaoka <yamaoka@jpl.org> writes:
>> Gnus sometimes encodes mixed Japanese-English text into lines
>> longer than 76 characters. For example:
[...]
> I've checked in a fix.
Thanks! It fixes the problem, at least when encoding Japanese text.
However, FLIM's encoder still produces far better output.
We can use it with Gnus as follows:
(eval-after-load "mail-parse"
  '(progn
     (defalias 'mail-encode-encoded-word-buffer
       (lambda nil
         (require 'eword-encode)
         (mime-encode-header-in-buffer t)))
     ;; (defadvice eword-encode-text (after downcase-charset activate)
     ;;   "Convert charset and encoding strings to lower case."
     ;;   (require 'eword-decode)
     ;;   (if (and ad-return-value
     ;;            (string-match eword-encoded-word-regexp
     ;;                          ad-return-value))
     ;;       (setq ad-return-value
     ;;             (concat
     ;;              (downcase (substring ad-return-value
     ;;                                   0 (match-beginning 4)))
     ;;              (substring ad-return-value
     ;;                         (match-beginning 4))))))
     ))
P.S.
Please don't use the smtpmail.el included in the FLIM package. It
is slightly incompatible with the Emacs version of smtpmail.el.
I've proposed removing it from the package, but there has been
no response from the authors.
--
Katsumi Yamaoka <yamaoka@jpl.org>
;; I'm now renewing the "multiple message frames" suit. :)