Gnus development mailing list
From: michael@cadilhac.name (Michaël Cadilhac)
To: ding@gnus.org
Subject: RSS and annoying always-changing article fields.
Date: Mon, 16 Apr 2007 19:58:06 +0200
Message-ID: <87lkgs5hht.fsf@lrde.org>


[-- Attachment #1.1: Type: text/plain, Size: 331 bytes --]

Hi!

I've subscribed to some RSS feeds that continuously update the fields
of their articles with useless information, such as the number of
comments on the article.  Each such update makes the article show up
as new again.

I've seen bug reports on the list that look like this one, so it is
probably the very same bug.

Here's a proposal for that: adding a variable for ignored fields.
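
As a rough sketch of how this is meant to work, assuming the patch
below is applied: the first form is the setting suggested in the
docstring, and the rest mimics the hashing idea without Gnus loaded.
The names `my-ignored-fields' and `my-item-hash' are hypothetical
stand-ins for illustration only, not part of the patch.

```elisp
(require 'cl-lib)

;; Intended user setting once the patch is applied (value taken from
;; the docstring in the patch).
(setq nnrss-ignore-article-fields '(slash:comments))

;; Hypothetical stand-in for the variable above, so that the rest of
;; this sketch does not depend on nnrss.el.
(defvar my-ignored-fields '(slash:comments))

(defun my-item-hash (item)
  "Hash ITEM after dropping sub-lists whose car is in `my-ignored-fields'."
  (md5 (prin1-to-string
        (cl-remove-if (lambda (field)
                        (and (listp field)
                             (memq (car field) my-ignored-fields)))
                      item))))

;; Two parses of the same item that differ only in the comment count.
;; With the ignored field dropped, both hashes should be equal, so
;; this form should evaluate to t.
(let ((old '(item nil (title nil "Hello") (slash:comments nil "3")))
      (new '(item nil (title nil "Hello") (slash:comments nil "7"))))
  (string= (my-item-hash old) (my-item-hash new)))
```

With `my-ignored-fields' set to nil, the two hashes would differ and
the updated item would be treated as a new article, which is the
behavior described above.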


[-- Attachment #1.2: nnrss.patch --]
[-- Type: text/x-patch, Size: 3118 bytes --]

Index: lisp/nnrss.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/nnrss.el,v
retrieving revision 7.47
diff -c -r7.47 nnrss.el
*** lisp/nnrss.el	24 Jan 2007 07:15:37 -0000	7.47
--- lisp/nnrss.el	16 Apr 2007 17:50:48 -0000
***************
*** 50,55 ****
--- 50,66 ----
  (defvoo nnrss-directory (nnheader-concat gnus-directory "rss/")
    "Where nnrss will save its files.")
  
+ (defvoo nnrss-ignore-article-fields nil
+   "*List of fields that should be ignored when comparing RSS articles.
+ Some RSS feeds update article fields over time, such as the number of
+ comments or the number of times the articles have been seen.  However,
+ if the local article differs from the remote one in any field, it is
+ treated as a new article.  To avoid this, set this variable to the list
+ of fields to be ignored.
+ 
+ For example, http://worsethanfailure.com requires this variable to be
+ set to '(slash:comments).")
+ 
  ;; (group max rss-url)
  (defvoo nnrss-server-data nil)
  
***************
*** 658,663 ****
--- 669,682 ----
  
  ;;; Snarf functions
  
+ (defun nnrss-make-hash-index (item)
+   (setq item (remove-if
+ 	      (lambda (field)
+ 		(when (listp field)
+ 		  (memq (car field) nnrss-ignore-article-fields)))
+ 	      item))
+   (md5 (gnus-prin1-to-string item)))
+ 
  (defun nnrss-check-group (group server)
    (let (file xml subject url extra changed author date feed-subject
  	     enclosure comments rss-ns rdf-ns content-ns dc-ns
***************
*** 693,699 ****
      (dolist (item (nreverse (nnrss-find-el (intern (concat rss-ns "item")) xml)))
        (when (and (listp item)
  		 (string= (concat rss-ns "item") (car item))
! 		 (progn (setq hash-index (md5 (gnus-prin1-to-string item)))
  			(not (gethash hash-index nnrss-group-hashtb))))
  	(setq subject (nnrss-node-text rss-ns 'title item))
  	(setq url (nnrss-decode-entities-string
--- 712,718 ----
      (dolist (item (nreverse (nnrss-find-el (intern (concat rss-ns "item")) xml)))
        (when (and (listp item)
  		 (string= (concat rss-ns "item") (car item))
! 		 (progn (setq hash-index (nnrss-make-hash-index item))
  			(not (gethash hash-index nnrss-group-hashtb))))
  	(setq subject (nnrss-node-text rss-ns 'title item))
  	(setq url (nnrss-decode-entities-string
Index: lisp/ChangeLog
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/ChangeLog,v
retrieving revision 7.1527
diff -C0 -r7.1527 ChangeLog
*** lisp/ChangeLog	16 Apr 2007 12:25:04 -0000	7.1527
--- lisp/ChangeLog	16 Apr 2007 17:50:53 -0000
***************
*** 0 ****
--- 1,9 ----
+ 2007-04-16  Michaël Cadilhac  <michael@cadilhac.name>
+ 
+ 	* nnrss.el (nnrss-ignore-article-fields): New variable.  List of fields
+ 	that should be ignored when comparing distant RSS articles with local
+ 	ones.
+ 	(nnrss-make-hash-index): New function.  Create a hash index according
+ 	to the ignored fields.
+ 	(nnrss-check-group): Use it.
+ 

[-- Attachment #1.3: Type: text/plain, Size: 335 bytes --]


TIA!

-- 
 |   Michaël `Micha' Cadilhac       |  Isn't vi that text editor with        |
 |   http://michael.cadilhac.name   |   two modes... One that beeps and      |
 |   JID/MSN:                       |     one that corrupts your file?       |
 `----  michael.cadilhac@gmail.com  |           -- Dan Jacobson         -  --'

[-- Attachment #2: Type: application/pgp-signature, Size: 188 bytes --]
