Gnus development mailing list
From: michael@cadilhac.name (Michaël Cadilhac)
To: ding@gnus.org
Subject: nnrss parses < twice.
Date: Wed, 31 Oct 2007 16:28:41 +0100
Message-ID: <87bqaf9uc6.fsf@cadilhac.name>

Hi guys!

Small problem here with the nnrss backend.  If you have an RSS feed that
uses some < or > in the *text* (as opposed to the HTML markup), those will
be parsed as markup, and w3m (if you use w3m) will screw up badly.

Such a feed will have an entry like:

   When &lt;strong&gt;Mickey&lt;/strong&gt;'s colleague was tasked with
   changing &amp;lt;br&amp;gt;s

which should result in the HTML


   When <strong>Mickey</strong>'s colleague was tasked with changing
   &lt;br&gt;s

and, in the final rendering, in

   When *Mickey*'s colleague was tasked with changing <br>s

The problem is that this field is « HTML-unescaped » twice:

- On line ~456,

	  (setq xmlform (xml-parse-region (point-min) (point-max)))

because xml-parse-region uses xml-parse-string which uses
xml-substitute-special.

- On line ~772,

          (and extra (nnrss-decode-entities-string extra))

which uses mm-url-decode-entities-nbsp.


As a result, the second pass turns the escaped &lt;br&gt; into a real <br>
tag, which is then rendered as a line break.
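
To make the double decoding concrete, here is a minimal sketch in Python
(not Emacs Lisp).  It assumes that the standard library's html.unescape is
a fair stand-in for each of the two decoding passes described above
(xml-substitute-special and mm-url-decode-entities-nbsp); the variable
names are purely illustrative and do not come from nnrss.el.

```python
from html import unescape

# Feed text as it appears in the RSS XML (the example entry from this message).
raw = ("When &lt;strong&gt;Mickey&lt;/strong&gt;'s colleague was tasked with "
       "changing &amp;lt;br&amp;gt;s")

# First pass: stands in for the decoding done while parsing the XML.
once = unescape(raw)

# Second pass: stands in for the extra decoding applied to the same field.
twice = unescape(once)

print(once)   # the intended HTML, with the literal "<br>" still escaped
print(twice)  # the escaped "<br>" has now become a real tag
```

After one pass the text still contains &lt;br&gt;, which a renderer shows
as the literal characters; after two passes it contains an actual <br>
element, which is the symptom reported here.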


I don't know if `w3-parse-buffer' does the same, but as xml-parse-region
works for all my feeds, I use the following change:

Index: nnrss.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/nnrss.el,v
retrieving revision 7.52
diff -b -u -w -r7.52 nnrss.el
--- nnrss.el	25 Oct 2007 08:17:54 -0000	7.52
+++ nnrss.el	31 Oct 2007 15:24:27 -0000
@@ -769,7 +769,7 @@
 	  (and subject (nnrss-mime-encode-string subject))
 	  (and author (nnrss-mime-encode-string author))
 	  date
-	  (and extra (nnrss-decode-entities-string extra))
+	  extra
 	  enclosure
 	  comments
 	  hash-index)


Let me know if I should install it.

-- 
 |   Michaël `Micha' Cadilhac       |  Personne n'est la au mauvais moment   |
 |   http://michael.cadilhac.name   |           et au mauvais endroit        |
 |   JID/MSN:                       |     par hasard.                        |
 `----  michael.cadilhac@gmail.com  |          -- ElBarto               -  --'

Thread overview: 2+ messages
2007-10-31 15:28 Michaël Cadilhac [this message]
2007-11-01  7:43 ` Katsumi Yamaoka
