Gnus development mailing list
* RSS and annoying always-changing article fields.
From: Michaël Cadilhac @ 2007-04-16 17:58 UTC
  To: ding


Hi!

I've subscribed to some RSS feeds that continuously update article
fields with useless information, such as the number of comments on
the article.

I've read some bug reports on the list that look like this one, so it
is probably the very same bug.

Here's a proposal for that: adding a variable for ignored fields.
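
To make the idea concrete, here is a minimal, self-contained sketch of
how ignoring a field keeps an item's fingerprint stable.  It does not
depend on Gnus: `my-rss-ignored-fields', `my-rss-item-hash' and the two
sample items are hypothetical stand-ins, and plain `prin1-to-string' and
`cl-remove-if' replace the Gnus helpers used in the patch.

```elisp
(require 'cl-lib)

;; Hypothetical stand-in for the proposed `nnrss-ignore-article-fields'.
(defvar my-rss-ignored-fields '(slash:comments)
  "Field names (symbols) left out when fingerprinting an item.")

(defun my-rss-item-hash (item)
  "Return an MD5 fingerprint of ITEM, skipping ignored fields.
ITEM is a parsed XML node of the form (TAG ATTRIBUTES CHILD...)."
  (md5 (prin1-to-string
        (cl-remove-if
         (lambda (field)
           ;; Child nodes are conses; the tag symbol and bare strings
           ;; are never removed.
           (and (consp field)
                (memq (car field) my-rss-ignored-fields)))
         item))))

;; Two snapshots of the same (made-up) article; only the comment
;; count differs.
(defvar my-rss-item-v1
  '(item nil
         (title nil "Some article")
         (link nil "http://example.org/1")
         (slash:comments nil "3")))

(defvar my-rss-item-v2
  '(item nil
         (title nil "Some article")
         (link nil "http://example.org/1")
         (slash:comments nil "42")))

;; Non-nil when both snapshots get the same fingerprint.
(string= (my-rss-item-hash my-rss-item-v1)
         (my-rss-item-hash my-rss-item-v2))
```

With `my-rss-ignored-fields' bound to nil the two fingerprints differ,
which is the situation described above.  With the patch applied, the
equivalent setting would presumably be
(setq nnrss-ignore-article-fields '(slash:comments)), as its docstring
suggests.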


[-- Attachment #1.2: nnrss.patch --]
[-- Type: text/x-patch, Size: 3118 bytes --]

Index: lisp/nnrss.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/nnrss.el,v
retrieving revision 7.47
diff -c -r7.47 nnrss.el
*** lisp/nnrss.el	24 Jan 2007 07:15:37 -0000	7.47
--- lisp/nnrss.el	16 Apr 2007 17:50:48 -0000
***************
*** 50,55 ****
--- 50,66 ----
  (defvoo nnrss-directory (nnheader-concat gnus-directory "rss/")
    "Where nnrss will save its files.")
  
+ (defvoo nnrss-ignore-article-fields nil
+   "*List of fields that should be ignored when comparing RSS articles.
+ Some RSS feeds update article fields during their lives, such as the
+ number of comments or the number of times the articles have been seen.
+ However, if the local article differs from the distant one in any field,
+ the distant one is treated as a new article.  To avoid this, set this
+ variable to the list of fields to be ignored.
+ 
+ For example, http://worsethanfailure.com requires this variable to be
+ set to '(slash:comments).")
+ 
  ;; (group max rss-url)
  (defvoo nnrss-server-data nil)
  
***************
*** 658,663 ****
--- 669,682 ----
  
  ;;; Snarf functions
  
+ (defun nnrss-make-hash-index (item)
+   (setq item (remove-if
+ 	      (lambda (field)
+ 		(when (listp field)
+ 		  (memq (car field) nnrss-ignore-article-fields)))
+ 	      item))
+   (md5 (gnus-prin1-to-string item)))
+ 
  (defun nnrss-check-group (group server)
    (let (file xml subject url extra changed author date feed-subject
  	     enclosure comments rss-ns rdf-ns content-ns dc-ns
***************
*** 693,699 ****
      (dolist (item (nreverse (nnrss-find-el (intern (concat rss-ns "item")) xml)))
        (when (and (listp item)
  		 (string= (concat rss-ns "item") (car item))
! 		 (progn (setq hash-index (md5 (gnus-prin1-to-string item)))
  			(not (gethash hash-index nnrss-group-hashtb))))
  	(setq subject (nnrss-node-text rss-ns 'title item))
  	(setq url (nnrss-decode-entities-string
--- 712,718 ----
      (dolist (item (nreverse (nnrss-find-el (intern (concat rss-ns "item")) xml)))
        (when (and (listp item)
  		 (string= (concat rss-ns "item") (car item))
! 		 (progn (setq hash-index (nnrss-make-hash-index item))
  			(not (gethash hash-index nnrss-group-hashtb))))
  	(setq subject (nnrss-node-text rss-ns 'title item))
  	(setq url (nnrss-decode-entities-string
Index: lisp/ChangeLog
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/ChangeLog,v
retrieving revision 7.1527
diff -C0 -r7.1527 ChangeLog
*** lisp/ChangeLog	16 Apr 2007 12:25:04 -0000	7.1527
--- lisp/ChangeLog	16 Apr 2007 17:50:53 -0000
***************
*** 0 ****
--- 1,9 ----
+ 2007-04-16  Michaël Cadilhac  <michael@cadilhac.name>
+ 
+ 	* nnrss.el (nnrss-ignore-article-fields): New variable.  List of fields
+ 	that should be ignored when comparing distant RSS articles with local
+ 	ones.
+ 	(nnrss-make-hash-index): New function.  Create a hash index according
+ 	to the ignored fields.
+ 	(nnrss-check-group): Use it.
+ 

TIA!

-- 
 |   Michaël `Micha' Cadilhac       |  Isn't vi that text editor with        |
 |   http://michael.cadilhac.name   |   two modes... One that beeps and      |
 |   JID/MSN:                       |     one that corrupts your file?       |
 `----  michael.cadilhac@gmail.com  |           -- Dan Jacobson         -  --'


* Re: RSS and annoying always-changing article fields.
From: Michaël Cadilhac @ 2007-04-16 21:04 UTC
  To: ding

michael@cadilhac.name (Michaël Cadilhac) writes:

> Hi!
>
> I've subscribed to some RSS feeds that continuously update article
> fields with useless information, such as the number of comments on
> the article.
>
> I've read some bug reports on the list that look like this one, so it
> is probably the very same bug.
>

> + (defun nnrss-make-hash-index (item)
> +   (setq item (remove-if
> + 	      (lambda (field)
> + 		(when (listp field)
> + 		  (memq (car field) nnrss-ignore-article-fields)))
> + 	      item))
> +   (md5 (gnus-prin1-to-string item)))
> + 

Oh, and one more rationale for adding a whole function for this:
it lets me advise it, so that I can apply stronger ignoring
procedures, such as removing all \n and squeezing spaces.
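
As a rough sketch of what such advice could look like: the code below
collapses every run of whitespace in an item's strings before the hash
is computed.  It is written against a hypothetical stand-in,
`my-rss-hash', so that it runs without Gnus; with the patch applied the
advice would target `nnrss-make-hash-index' instead.  It also assumes
`advice-add' (Emacs 24.4 or later); at the time of this thread the same
thing would have been written with `defadvice'.

```elisp
(defun my-rss-hash (item)
  "Hypothetical stand-in for the patch's `nnrss-make-hash-index'."
  (md5 (prin1-to-string item)))

(defun my-rss-squeeze (node)
  "Return a copy of NODE with whitespace runs in strings collapsed.
Every run of spaces, tabs and newlines becomes a single space.
Recurses on both car and cdr, so dotted attribute pairs are handled;
fine for feed items, which are small."
  (cond ((stringp node)
         (replace-regexp-in-string "[ \t\n\r]+" " " node))
        ((consp node)
         (cons (my-rss-squeeze (car node))
               (my-rss-squeeze (cdr node))))
        (t node)))

(defun my-rss-squeeze-args (args)
  "Normalize whitespace in each element of the argument list ARGS."
  (mapcar #'my-rss-squeeze args))

;; :filter-args advice receives the argument list and returns the
;; list actually passed on to the advised function.
(advice-add 'my-rss-hash :filter-args #'my-rss-squeeze-args)

;; Both spellings of the title now yield the same fingerprint.
(string= (my-rss-hash '(item nil (title nil "Some\n   article")))
         (my-rss-hash '(item nil (title nil "Some article"))))
```

Keeping the normalization in advice rather than in the function itself
leaves the default behaviour strict, and
(advice-remove 'my-rss-hash #'my-rss-squeeze-args) undoes it.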

-- 
 |   Michaël `Micha' Cadilhac       |  Ajoutez du whisky                     |
 |   http://michael.cadilhac.name   |           à n'importe quel texte,      |
 |   JID/MSN:                       |    ça vous fera un beau pangramme.     |
 `----  michael.cadilhac@gmail.com  |          -- Michel Clavel         -  --'


* Re: RSS and annoying always-changing article fields.
From: Michaël Cadilhac @ 2007-04-26  7:31 UTC
  To: ding

michael@cadilhac.name (Michaël Cadilhac) writes:

> michael@cadilhac.name (Michaël Cadilhac) writes:
>
>> Hi!
>>
>> I've subscribed to some RSS feeds that continuously update article
>> fields with useless information, such as the number of comments on
>> the article.
>>
>> I've read some bug reports on the list that look like this one, so it
>> is probably the very same bug.
>>
>
>> + (defun nnrss-make-hash-index (item)
>> +   (setq item (remove-if
>> + 	      (lambda (field)
>> + 		(when (listp field)
>> + 		  (memq (car field) nnrss-ignore-article-fields)))
>> + 	      item))
>> +   (md5 (gnus-prin1-to-string item)))
>> + 
>
> Oh, and one more rationale for adding a whole function for this:
> it lets me advise it, so that I can apply stronger ignoring
> procedures, such as removing all \n and squeezing spaces.

SYN/ACK?

-- 
 |   Michaël `Micha' Cadilhac       |    The second-degree,                  |
 |   http://michael.cadilhac.name   |       is kind of                       |
 |   JID/MSN:                       |   the semantic back slang.             |
 `----  michael.cadilhac@gmail.com  |                                   -  --'
