From mboxrd@z Thu Jan 1 00:00:00 1970
From: Martin Stjernholm
Newsgroups: gmane.emacs.gnus.general
Subject: Override charset in <meta> tag for K H
Date: Mon, 29 Mar 2010 02:42:50 +0200
Message-ID: <7mfx3k556d.fsf@kolon.stjernholm.org>
To: ding@gnus.org
User-Agent: Gnus/5.110011 (No Gnus v0.11) Emacs/23.1 (gnu/linux)
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Precedence: bulk
Xref: news.gmane.org gmane.emacs.gnus.general:69483

--=-=-=

When an html mail is viewed externally with K H
(gnus-article-browse-html-article), gnus-article-browse-html-parts
sometimes forcibly re-encodes it to utf-8 in the temporary file.
However, if the html already contains a <meta> tag with a different
charset, that tag is left unchanged (as is the documented behavior for
mm-add-meta-html-tag). The result is that the browser renders the
utf-8 encoded article using the original charset.

I haven't dug deep enough into gnus-article-browse-html-parts to
understand why it sometimes changes the charset, but I assume it is
with good reason. Anyway, if the html is recoded to a different
charset then clearly the charset in the <meta> tag should be updated
too. The attached patch fixes this in the code paths where the
encoding is forced to utf-8.

Note that there are at least two more paths where the article goes
through mm-encode-coding-string. Since I haven't grasped those, this
patch doesn't touch them. The same problem might exist there too.

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline; filename=meta-charset-override.patch

mm-decode.el (mm-add-meta-html-tag): Added option to override the
charset.

gnus-art.el (gnus-article-browse-html-parts): Force the correct
charset into the <meta> tag when the article is encoded to utf-8.

diff --git a/lisp/gnus-art.el b/lisp/gnus-art.el
index 1a66404..3dcc30e 100644
--- a/lisp/gnus-art.el
+++ b/lisp/gnus-art.el
@@ -2862,7 +2862,7 @@ message header will be added to the bodies of the \"text/html\" parts."
           ;; Add a meta html tag to specify charset and a header.
           (cond
            (header
-            (let (title eheader body hcharset coding)
+            (let (title eheader body hcharset coding force-charset)
               (with-temp-buffer
                 (mm-enable-multibyte)
                 (setq case-fold-search t)
@@ -2886,7 +2886,8 @@ message header will be added to the bodies of the \"text/html\" parts."
                       title (when title
                               (mm-encode-coding-string title charset))
                       body (mm-encode-coding-string (mm-get-part handle)
-                                                    charset))
+                                                    charset)
+                      force-charset t)
                 (setq hcharset (mm-find-mime-charset-region (point-min)
                                                             (point-max)))
                 (cond ((= (length hcharset) 1)
@@ -2917,7 +2918,8 @@ message header will be added to the bodies of the \"text/html\" parts."
                              body (mm-encode-coding-string
                                    (mm-decode-coding-string
                                     (mm-get-part handle) body)
-                                   charset))))
+                                   charset)
+                             force-charset t)))
                 (setq charset hcharset
                       eheader (mm-encode-coding-string
                                (buffer-string) coding)
@@ -2931,7 +2933,7 @@ message header will be added to the bodies of the \"text/html\" parts."
               (mm-disable-multibyte)
               (insert body)
               (when charset
-                (mm-add-meta-html-tag handle charset))
+                (mm-add-meta-html-tag handle charset force-charset))
               (when title
                 (goto-char (point-min))
                 (unless (search-forward "<title>" nil t)
diff --git a/lisp/mm-decode.el b/lisp/mm-decode.el
index a511253..0edc631 100644
--- a/lisp/mm-decode.el
+++ b/lisp/mm-decode.el
@@ -1250,11 +1250,11 @@ PROMPT overrides the default one used to ask user for a file name."
         (mm-save-part-to-file handle file)
         file))))
 
-(defun mm-add-meta-html-tag (handle &optional charset)
+(defun mm-add-meta-html-tag (handle &optional charset force-charset)
   "Add meta html tag to specify CHARSET of HANDLE in the current buffer.
 CHARSET defaults to the one HANDLE specifies.  Existing meta tag that
-specifies charset will not be modified.  Return t if meta tag is added
-or replaced."
+specifies charset will not be modified unless FORCE-CHARSET is non-nil.
+Return t if meta tag is added or replaced."
   (when (equal (mm-handle-media-type handle) "text/html")
     (when (or charset
               (setq charset (mail-content-type-get (mm-handle-type handle)
@@ -1266,7 +1266,8 @@ or replaced."
         (if (re-search-forward "\
 <meta\\s-+http-equiv=[\"']?content-type[\"']?\\s-+content=[\"']\
 text/\\(\\sw+\\)\\(?:\;\\s-*charset=\\(.+?\\)\\)?[\"'][^>]*>" nil t)
-            (if (and (match-beginning 2)
+            (if (and (not force-charset)
+                     (match-beginning 2)
                      (string-match "\\`html\\'" (match-string 1)))
                 ;; Don't modify existing meta tag.
                 nil
--=-=-=--
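
The mismatch described in the message can be reproduced outside Gnus. The
following is only a minimal sketch in plain Emacs Lisp, not the patched
code: `demo-force-meta-charset' and the two `demo-' variables are
hypothetical names made up for illustration, and the regexp is much
simpler than the one mm-add-meta-html-tag uses. It shows what a browser
effectively does when utf-8 bytes are read with a stale iso-8859-1
declaration, and what overwriting the declared charset looks like.

```elisp
;; Illustration only.  `demo-force-meta-charset' is a hypothetical helper,
;; not part of Gnus and not part of the patch.

(defun demo-force-meta-charset (html charset)
  "Return HTML with the charset of its first charset-bearing <meta> tag set to CHARSET.
HTML is returned unchanged when no <meta ... charset=...> is found."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    (let ((case-fold-search t))
      (when (re-search-forward
             "<meta[^>]*charset=\\([-A-Za-z0-9_]+\\)" nil t)
        ;; Replace only the charset name (subexpression 1), literally.
        (replace-match charset t t nil 1)))
    (buffer-string)))

;; The symptom: the text is encoded as utf-8, but decoded with the
;; charset the old <meta> tag still announces.
(defvar demo-mojibake
  (decode-coding-string
   (encode-coding-string (string #xe5) 'utf-8) ; U+00E5, a-with-ring
   'iso-8859-1))
;; demo-mojibake is the two characters U+00C3 U+00A5 instead of U+00E5.

;; The remedy: make the declaration match the actual encoding.
(defvar demo-fixed
  (demo-force-meta-charset
   "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=iso-8859-1\">"
   "utf-8"))
```

Evaluating the forms and inspecting demo-fixed should show the same tag
with charset=utf-8. The sketch always overwrites the charset; in the
patch that behavior is opt-in through the FORCE-CHARSET argument, so
existing callers of mm-add-meta-html-tag keep the old semantics.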