Gnus development mailing list
* Re: `gnus-article-browse-html-article' and character encoding
       [not found] <873auj0xo8.fsf@ID-24456.user.uni-berlin.de>
@ 2007-12-05  2:57 ` Katsumi Yamaoka
  2007-12-05  5:43   ` Christoph Conrad
  2007-12-05  6:06   ` Christoph Conrad
  0 siblings, 2 replies; 8+ messages in thread
From: Katsumi Yamaoka @ 2007-12-05  2:57 UTC
  To: Christoph Conrad; +Cc: ding, bugs

[-- Attachment #1: Type: text/plain, Size: 627 bytes --]

>>>>> Christoph Conrad wrote:
> No Gnus v0.7
> GNU Emacs 23.0.50.1 (i686-pc-linux-gnu, GTK+ Version 2.12.1)
>  of 2007-11-30 on brabbelbox
> 200 news.online.de InterNetNews NNRP server INN 2.3.5 ready (posting ok).

> `gnus-article-browse-html-article' should also write the character
> encoding of the HTML part to the file, e.g. with

> Content-Type: text/html;
> 	charset="UTF-8"

> German umlauts are not displayed correctly if the default character
> encoding is iso-8859-1.

> Best regards,
> Christoph

Is the attached patch what you need?  It adds an HTML meta tag that
specifies the charset to the HTML source.


[-- Attachment #2: Type: text/x-patch, Size: 1428 bytes --]

--- gnus-art.el~	2007-12-05 01:51:07 +0000
+++ gnus-art.el	2007-12-05 02:55:23 +0000
@@ -2803,7 +2803,34 @@
 		    (string-match "text/html" (car (mm-handle-type handle))))
 	       (let ((tmp-file (mm-make-temp-file
 				;; Do we need to care for 8.3 filenames?
-				"mm-" nil ".html")))
+				"mm-" nil ".html"))
+		     (charset (mail-content-type-get (mm-handle-type handle)
+						     'charset)))
+		 (when charset
+		   (with-current-buffer (mm-handle-buffer handle)
+		     (when (eq charset 'gnus-decoded)
+		       (insert (prog2
+				   (setq charset 'utf-8)
+				   (encode-coding-string (buffer-string)
+							 charset)
+				 (erase-buffer)
+				 (mm-disable-multibyte))))
+		     (setq charset (format "\
+<meta http-equiv=\"Content-Type\" content=\"text/html; charset=%s\">"
+					   charset))
+		     (goto-char (point-min))
+		     (let ((case-fold-search t))
+		       (cond ((re-search-forward "\
+<meta[\t\n\r ]+http-equiv=\"content-type\"[^>]+>"
+						 nil t)
+			      (replace-match charset))
+			     ((re-search-forward "<head>[\t\n\r ]*" nil t)
+			      (insert charset "\n"))
+			     (t
+			      (re-search-forward "\
+<html\\(?:[\t\n\r ]+[^>]+\\|[\t\n\r ]*\\)>[\t\n\r ]*"
+						 nil t)
+			      (insert "<head>\n" charset "\n</head>\n"))))))
 		 (mm-save-part-to-file handle tmp-file)
 		 (add-to-list 'gnus-article-browse-html-temp-list tmp-file)
 		 (add-hook 'gnus-summary-prepare-exit-hook


* Re: `gnus-article-browse-html-article' and character encoding
  2007-12-05  2:57 ` `gnus-article-browse-html-article' and character encoding Katsumi Yamaoka
@ 2007-12-05  5:43   ` Christoph Conrad
  2007-12-05  6:06   ` Christoph Conrad
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Conrad @ 2007-12-05  5:43 UTC
  To: Katsumi Yamaoka; +Cc: ding, bugs

Hi Katsumi,

> Is the attached patch what you need?

Yes, that seems to work. The meta tag is added and Firefox switches the
encoding. Please check it in. Thank you!

Best regards,
Christoph




* Re: `gnus-article-browse-html-article' and character encoding
  2007-12-05  2:57 ` `gnus-article-browse-html-article' and character encoding Katsumi Yamaoka
  2007-12-05  5:43   ` Christoph Conrad
@ 2007-12-05  6:06   ` Christoph Conrad
  2007-12-05  6:25     ` Katsumi Yamaoka
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Conrad @ 2007-12-05  6:06 UTC
  To: Katsumi Yamaoka; +Cc: ding, bugs

Hi Katsumi,

I am wondering what happens when the body of a part already contains a
meta tag: the "charset=" attribute is then changed to that of the part.
That is not correct, I assume.

For example:

,----
| Content-Type: text/html; charset=utf-8
| 
| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
| "http://www.w3.org/TR/html4/loose.dtd">
| <html>
| <head>
| <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
| [...]
`----

Best regards,
Christoph




* Re: `gnus-article-browse-html-article' and character encoding
  2007-12-05  6:06   ` Christoph Conrad
@ 2007-12-05  6:25     ` Katsumi Yamaoka
  2007-12-05  6:39       ` Katsumi Yamaoka
  0 siblings, 1 reply; 8+ messages in thread
From: Katsumi Yamaoka @ 2007-12-05  6:25 UTC
  To: Christoph Conrad; +Cc: ding, bugs

>>>>> Christoph Conrad wrote:

> I am wondering what happens when the body of a part already contains a
> meta tag: the "charset=" attribute is then changed to that of the part.
> That is not correct, I assume.

I see.  I'll install a version that keeps an existing meta tag
later.
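
As an illustration of the behaviour discussed here, the following minimal
sketch leaves an existing Content-Type meta tag alone and adds one only
when none is found. It is not the code that was installed in Gnus: the
function name `my-html-ensure-charset-meta' is hypothetical, it works on a
string rather than on a MIME handle, and the regular expressions are
borrowed from the patch earlier in this thread.

```elisp
;; Hypothetical helper for illustration; not the code installed in Gnus.
(defun my-html-ensure-charset-meta (html charset)
  "Return HTML with a charset meta tag for CHARSET added if none exists.
An existing Content-Type meta tag is left untouched."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    (let ((case-fold-search t)
          (meta (format "\
<meta http-equiv=\"Content-Type\" content=\"text/html; charset=%s\">"
                        charset)))
      (cond
       ;; A Content-Type meta tag is already present: keep it as is.
       ((re-search-forward
         "<meta[\t\n\r ]+http-equiv=\"content-type\"[^>]+>" nil t))
       ;; Otherwise add the tag right after an existing <head>.
       ((re-search-forward "<head>[\t\n\r ]*" nil t)
        (insert meta "\n"))
       ;; No <head>: create one after <html ...>, or at the very start
       ;; of the text when there is no <html> tag either.
       (t
        (re-search-forward
         "<html\\(?:[\t\n\r ]+[^>]+\\|[\t\n\r ]*\\)>[\t\n\r ]*" nil t)
        (insert "<head>\n" meta "\n</head>\n"))))
    (buffer-string)))
```

With this approach, whenever the body and the MIME header disagree (as in
the example quoted above), the charset declared in the body is the one the
browser sees.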




* Re: `gnus-article-browse-html-article' and character encoding
  2007-12-05  6:25     ` Katsumi Yamaoka
@ 2007-12-05  6:39       ` Katsumi Yamaoka
  2007-12-10  2:17         ` Katsumi Yamaoka
  0 siblings, 1 reply; 8+ messages in thread
From: Katsumi Yamaoka @ 2007-12-05  6:39 UTC
  To: Christoph Conrad; +Cc: ding, bugs

>>>>> Katsumi Yamaoka wrote:
>>>>>> Christoph Conrad wrote:

>> I am wondering what happens when the body of a part already contains a
>> meta tag: the "charset=" attribute is then changed to that of the part.
>> That is not correct, I assume.

> I see.  I'll install a version that keeps an existing meta tag
> later.

Done.




* Re: `gnus-article-browse-html-article' and character encoding
  2007-12-05  6:39       ` Katsumi Yamaoka
@ 2007-12-10  2:17         ` Katsumi Yamaoka
  2007-12-10  6:26           ` Christoph Conrad
  2007-12-10  6:27           ` Christoph Conrad
  0 siblings, 2 replies; 8+ messages in thread
From: Katsumi Yamaoka @ 2007-12-10  2:17 UTC
  To: Christoph Conrad; +Cc: ding, bugs

>>>>> Christoph Conrad wrote:

> It would be nice if gnus-article-browse-html-article could handle an
> article like the one attached below. I do not know whether this would
> be a big mess to implement.

> content-type: text/html; charset=utf-8
> content-transfer-encoding: base64

Oops.  I forgot to decode the Content-Transfer-Encoding (CTE).  Fixed
in the Gnus trunk.
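
The two decoding layers involved here can be illustrated with a small
sketch: the Content-Transfer-Encoding (base64 in the quoted example) has
to be undone first, and only then do the bytes correspond to the declared
charset. This is not the fix that went into the Gnus trunk; the helper
name `my-decode-base64-html' is hypothetical, and it only relies on the
built-in functions `base64-decode-string' and `decode-coding-string'.

```elisp
;; Hypothetical helper for illustration; not the fix made in Gnus.
(defun my-decode-base64-html (body charset)
  "Undo the base64 transfer encoding of BODY, then decode it as CHARSET.
BODY is the base64 text of a MIME part; CHARSET is a coding system
symbol such as `utf-8'."
  (let ((bytes (base64-decode-string body))) ; raw bytes, still encoded
    (decode-coding-string bytes charset)))   ; text in Emacs characters

;; Example call on the base64 form of a short UTF-8 HTML fragment.
(my-decode-base64-html "PHA+w6Q8L3A+" 'utf-8)
```

Writing the base64 text to the temporary file unchanged would hand the
browser an unreadable page, regardless of which charset the meta tag
declares.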




* Re: `gnus-article-browse-html-article' and character encoding
  2007-12-10  2:17         ` Katsumi Yamaoka
@ 2007-12-10  6:26           ` Christoph Conrad
  2007-12-10  6:27           ` Christoph Conrad
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Conrad @ 2007-12-10  6:26 UTC
  To: ding

Hi Katsumi,

* Katsumi Yamaoka <yamaoka@jpl.org> wrote:

> Oops. I forgot to decode the CTE. Fixed in the Gnus trunk.

Thank you for implementing this!

Best regards,
Christoph





* Re: `gnus-article-browse-html-article' and character encoding
  2007-12-10  2:17         ` Katsumi Yamaoka
  2007-12-10  6:26           ` Christoph Conrad
@ 2007-12-10  6:27           ` Christoph Conrad
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Conrad @ 2007-12-10  6:27 UTC
  To: Katsumi Yamaoka; +Cc: ding, bugs

Hi Katsumi,

* Katsumi Yamaoka <yamaoka@jpl.org> wrote:

> Oops. I forgot to decode the CTE. Fixed in the Gnus trunk.

Thank you for implementing this!

Best regards,
Christoph



