Gnus development mailing list
* rfc2047.el bug?
@ 2003-09-10  4:29 Katsumi Yamaoka
  2003-09-11  0:40 ` Jesper Harder
From: Katsumi Yamaoka @ 2003-09-10  4:29 UTC


Hi,

It may not cause a problem in practice, but something is wrong
in rfc2047.el.  This is an extract from RFC 2047 (rfc2047.txt):

   An 'encoded-word' may not be more than 75 characters long, including
   'charset', 'encoding', 'encoded-text', and delimiters.  If it is
   desirable to encode more text than will fit in an 'encoded-word' of
   75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may
   be used.

   While there is no limit to the length of a multiple-line header
   field, each line of a header field that contains one or more
   'encoded-word's is limited to 76 characters.

   The length restrictions are included both to ease interoperability
   through internetwork mail gateways, and to impose a limit on the
   amount of lookahead a header parser must employ (while looking for a
   final ?= delimiter) before it can decide whether a token is an
   "encoded-word" or something else.

Gnus sometimes encodes mixed Japanese-English text into lines
longer than 76 characters.  For example:

Subject: 寿ju 限ge 無mu 寿ju 限ge 無mu 五go 劫kou

Subject: =?iso-2022-jp?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCPHcbKEJqdQ==?=
 =?iso-2022-jp?b?IBskQjhCGyhCZ2UgGyRCTDUbKEJtdSAbJEI4XhsoQmdvIBskQjllGyhCaw==?=
 =?iso-2022-jp?b?b3U=?=

For comparison, this is what FLIM's eword-encode.el produces:

Subject: =?ISO-2022-JP?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUg?=
 =?ISO-2022-JP?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCOF4bKEJn?=
 =?ISO-2022-JP?b?byAbJEI5ZRsoQmtvdQ==?=
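
The two limits quoted from RFC 2047 above can be checked mechanically.
The following is only a minimal Python sketch for illustration; it is
not part of Gnus or FLIM.  The function name check_header and the loose
regular expression are assumptions of this sketch, and the header lines
are copied from the two examples above.

```python
import re

MAX_ENCODED_WORD = 75  # RFC 2047 limit on a single encoded-word
MAX_LINE = 76          # RFC 2047 limit on a line that contains encoded-words

# Loose pattern for =?charset?encoding?encoded-text?= (illustrative only).
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def check_header(lines):
    """Return (line_number, problem) pairs for the lines of a folded header."""
    problems = []
    for number, line in enumerate(lines, start=1):
        words = ENCODED_WORD.findall(line)
        # The line limit only applies to lines that contain encoded-words.
        if words and len(line) > MAX_LINE:
            problems.append((number, f"line is {len(line)} characters"))
        for word in words:
            if len(word) > MAX_ENCODED_WORD:
                problems.append((number, f"encoded-word is {len(word)} characters"))
    return problems


GNUS = [
    "Subject: =?iso-2022-jp?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCPHcbKEJqdQ==?=",
    " =?iso-2022-jp?b?IBskQjhCGyhCZ2UgGyRCTDUbKEJtdSAbJEI4XhsoQmdvIBskQjllGyhCaw==?=",
    " =?iso-2022-jp?b?b3U=?=",
]

FLIM = [
    "Subject: =?ISO-2022-JP?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUg?=",
    " =?ISO-2022-JP?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCOF4bKEJn?=",
    " =?ISO-2022-JP?b?byAbJEI5ZRsoQmtvdQ==?=",
]

for name, header in (("Gnus", GNUS), ("FLIM", FLIM)):
    print(name, check_header(header))
```

Run on these headers, the sketch reports problems on the first two
lines of the Gnus output and none for the FLIM output.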
-- 
Katsumi Yamaoka <yamaoka@jpl.org>




* Re: rfc2047.el bug?
  2003-09-10  4:29 rfc2047.el bug? Katsumi Yamaoka
@ 2003-09-11  0:40 ` Jesper Harder
  2003-09-11  1:34   ` Katsumi Yamaoka
From: Jesper Harder @ 2003-09-11  0:40 UTC


[-- Attachment #1: Type: text/plain, Size: 505 bytes --]

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> Gnus sometimes encodes mixed Japanese-English text into lines
> longer than 76 characters.  For example:
>
> Subject: 寿ju 限ge 無mu 寿ju 限ge 無mu 五go 劫kou
>
> Subject: =?iso-2022-jp?b?GyRCPHcbKEJqdSAbJEI4QhsoQmdlIBskQkw1GyhCbXUgGyRCPHcbKEJqdQ==?=
>  =?iso-2022-jp?b?IBskQjhCGyhCZ2UgGyRCTDUbKEJtdSAbJEI4XhsoQmdvIBskQjllGyhCaw==?=
>  =?iso-2022-jp?b?b3U=?=

Hey, I can make an even worse example :-)

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/plain; charset=iso-2022-cn-ext, Size: 853 bytes --]


Subject: ^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<^[$+M^[O!~^[$+I^[O!<

Subject: =?iso-2022-cn-ext?b?GyQrSRtPITwbJCtNG08hfhskK0kbTyE8GyQrTRtPIX4bJCtJG08hPBskK00bTyF+GyQrSRtPITwbJCtNG08hfhskK0kbTyE8GyQrTRtPIX4bJCtJG08hPBskK00bTyF+GyQrSRtPITwbJCtNG08hfhskK0kbTyE8?=

No less than 191 characters!  Oh boy, this means that we'll have to
split the header every 4 characters in some cases.

I've checked in a fix.  It:

* Assumes a worst-case expansion factor of 8 for every charset with B
  encoding in `rfc2047-charset-encoding-alist' where I had no better
  figure.  That factor is correct for iso-2022-cn-ext -- if someone
  knows better values for other charsets, please fill them in.

* Only fixes the 75 char limit on encoded-words.  It's still possible
  to exceed the 76 char line length limit, although less likely.
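
The arithmetic behind "every 4 characters" and the 191-character line
can be reproduced with a short calculation.  This is a Python sketch
for illustration only, not the code that was checked in.  It assumes
that "factor 8" means 8 bytes per character before base64 encoding and
that the example subject above carries 15 such characters; the function
names are hypothetical.

```python
ENCODED_WORD_LIMIT = 75  # from RFC 2047


def overhead(charset):
    """Characters taken by the =?charset?b? ... ?= wrapper."""
    return len("=?") + len(charset) + len("?b?") + len("?=")


def encoded_word_length(charset, nbytes):
    """Length of one B-encoded word that carries nbytes of encoded text."""
    base64_chars = 4 * ((nbytes + 2) // 3)  # base64 pads to a multiple of 4
    return overhead(charset) + base64_chars


def max_chars_per_word(charset, worst_bytes_per_char):
    """Characters that always fit in one encoded-word at the worst-case size."""
    quads = (ENCODED_WORD_LIMIT - overhead(charset)) // 4  # whole base64 groups
    return (quads * 3) // worst_bytes_per_char


# Assumed: 15 characters at 8 bytes each, as in the example above.
word = encoded_word_length("iso-2022-cn-ext", 15 * 8)
print(word, len("Subject: ") + word)              # 182 191
print(max_chars_per_word("iso-2022-cn-ext", 8))   # 4
```

Under these assumptions a single unsplit encoded-word is 182 characters
long (191 with the "Subject: " prefix), and only 4 worst-case characters
fit within the 75-character limit.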


* Re: rfc2047.el bug?
  2003-09-11  0:40 ` Jesper Harder
@ 2003-09-11  1:34   ` Katsumi Yamaoka
From: Katsumi Yamaoka @ 2003-09-11  1:34 UTC


>>>>> In <m3znhcxili.fsf@defun.localdomain>
>>>>>	Jesper Harder <harder@myrealbox.com> wrote:

> Katsumi Yamaoka <yamaoka@jpl.org> writes:

>> Gnus sometimes encodes mixed Japanese-English text into lines
>> longer than 76 characters.  For example:

[...]

> I've checked in a fix.

Thanks!  It fixes the problem, at least when encoding Japanese text.
However, FLIM's encoder still produces far better output.
We can use it in Gnus as follows:

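(eval-after-load "mail-parse"
  '(progn
     ;; Make Gnus encode headers with FLIM's encoder instead of rfc2047.el.
     (defalias 'mail-encode-encoded-word-buffer
       (lambda ()
         (require 'eword-encode)
         (mime-encode-header-in-buffer t)))
     ;; Optional advice, left commented out: lower-case the charset and
     ;; encoding names in FLIM's output.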
;;     (defadvice eword-encode-text (after downcase-charset activate)
;;       "Convert charset and encoding strings to lower case."
;;       (require 'eword-decode)
;;       (if (and ad-return-value
;;		(string-match eword-encoded-word-regexp
;;			      ad-return-value))
;;	   (setq ad-return-value
;;		 (concat
;;		  (downcase (substring ad-return-value
;;				       0 (match-beginning 4)))
;;		  (substring ad-return-value
;;			     (match-beginning 4))))))
     ))

P.S.
Please don't use the smtpmail.el included in the FLIM package.  It
is slightly incompatible with the Emacs version of smtpmail.el.
I proposed removing it from the package, but the authors have not
responded.
-- 
Katsumi Yamaoka <yamaoka@jpl.org>
;; I'm now renewing the "multiple message frames" suit. :)



