Gnus development mailing list
* nnrss parses &lt; twice.
@ 2007-10-31 15:28 Michaël Cadilhac
  2007-11-01  7:43 ` Katsumi Yamaoka
From: Michaël Cadilhac @ 2007-10-31 15:28 UTC
  To: ding


Hi guys!

Small problem here with the nnrss backend.  If you have an RSS feed that
uses some < or > in the *text* (as opposed to the HTML code), those will
be parsed, and w3m (if you use w3m) will screw up badly.

Such an RSS feed will have an entry like:

   When &lt;strong&gt;Mickey&lt;/strong&gt;'s colleague was tasked with
   changing &amp;lt;br&amp;gt;s

which should result in this HTML:

   When <strong>Mickey</strong>'s colleague was tasked with changing
   &lt;br&gt;s

and, once rendered for reading, in

   When *Mickey*'s colleague was tasked with changing <br>s

The problem is that this field is HTML-unescaped twice:

- On line ~456,

	  (setq xmlform (xml-parse-region (point-min) (point-max)))

because xml-parse-region uses xml-parse-string which uses
xml-substitute-special.

- On line ~772,

          (and extra (nnrss-decode-entities-string extra))

which uses mm-url-decode-entities-nbsp.


This leads to the <br> being interpreted as a newline.
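
The double decoding can be reproduced outside Emacs.  The following is
only a sketch in Python, assuming the standard library's html.unescape
as a stand-in for one decoding pass.  It is not xml-substitute-special
or mm-url-decode-entities-nbsp, and the variable names are made up for
the illustration:

```python
import html

# Entry text as it appears in the RSS source (taken from the example above).
raw = "changing &amp;lt;br&amp;gt;s"

# One decoding pass, as the XML parser is expected to do.  The result is
# HTML source in which the angle brackets are still escaped.
once = html.unescape(raw)

# A second pass, standing in for the extra decode in nnrss.  The escaped
# text becomes a real <br> tag, which an HTML renderer treats as markup.
twice = html.unescape(once)

print(once)
print(twice)
```

After one pass the text still carries &lt;br&gt; and renders as a
literal <br>; after two passes it carries an actual tag.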


I don't know if `w3-parse-buffer' does the same, but as xml-parse-region
works for all my feeds, I use the following change:

Index: nnrss.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/nnrss.el,v
retrieving revision 7.52
diff -b -u -w -r7.52 nnrss.el
--- nnrss.el	25 Oct 2007 08:17:54 -0000	7.52
+++ nnrss.el	31 Oct 2007 15:24:27 -0000
@@ -769,7 +769,7 @@
 	  (and subject (nnrss-mime-encode-string subject))
 	  (and author (nnrss-mime-encode-string author))
 	  date
-	  (and extra (nnrss-decode-entities-string extra))
+	  extra
 	  enclosure
 	  comments
 	  hash-index)


Let me know if I should install it.

-- 
 |   Michaël `Micha' Cadilhac       |  Personne n'est la au mauvais moment   |
 |   http://michael.cadilhac.name   |           et au mauvais endroit        |
 |   JID/MSN:                       |     par hasard.                        |
 `----  michael.cadilhac@gmail.com  |          -- ElBarto               -  --'



* Re: nnrss parses &lt; twice.
  2007-10-31 15:28 nnrss parses &lt; twice Michaël Cadilhac
@ 2007-11-01  7:43 ` Katsumi Yamaoka
From: Katsumi Yamaoka @ 2007-11-01  7:43 UTC
  To: ding

>>>>> Michaël Cadilhac wrote:

> Small problem here with the nnrss backend.  If you have an RSS feed that
> uses some < or > in the *text* (as opposed to the HTML code), those will
> be parsed, and w3m (if you use w3m) will screw up badly.

[...]

> Let me know if I should install it.

I believe it's a bug and should be fixed.  Here's an RSS feed for
testing this. :)

http://www.jpl.org/test-nnrss.xml



