Announcements and discussions for Gnus, the GNU Emacs Usenet newsreader
* [nnrss.el patch] Only Use Subject, Author, URL and Date to build the Hash Index
@ 2007-02-04  4:10 David Hansen
  2007-02-24 17:21 ` Mark Plaksin
  0 siblings, 1 reply; 3+ messages in thread
From: David Hansen @ 2007-02-04  4:10 UTC
  To: info-gnus-english

[-- Attachment #1: Type: text/plain, Size: 147 bytes --]

Hello,

So far this works well with the feeds I read; no nasty duplicates
anymore.  It should be easy to extend to other fields if necessary.

David


[-- Attachment #2: nnrss.diff --]
[-- Type: text/x-patch, Size: 6036 bytes --]

*** nnrss.el	30 Jan 2007 21:50:55 +0100	7.47
--- nnrss.el	03 Feb 2007 11:42:52 +0100	
***************
*** 691,750 ****
  	  rss-ns (nnrss-get-namespace-prefix xml "http://purl.org/rss/1.0/")
  	  content-ns (nnrss-get-namespace-prefix xml "http://purl.org/rss/1.0/modules/content/"))
      (dolist (item (nreverse (nnrss-find-el (intern (concat rss-ns "item")) xml)))
!       (when (and (listp item)
! 		 (string= (concat rss-ns "item") (car item))
! 		 (progn (setq hash-index (md5 (gnus-prin1-to-string item)))
! 			(not (gethash hash-index nnrss-group-hashtb))))
  	(setq subject (nnrss-node-text rss-ns 'title item))
! 	(setq url (nnrss-decode-entities-string
! 		   (nnrss-node-text rss-ns 'link (cddr item))))
! 	(setq extra (or (nnrss-node-text content-ns 'encoded item)
! 			(nnrss-node-text rss-ns 'description item)))
! 	(if (setq feed-subject (nnrss-node-text dc-ns 'subject item))
! 	    (setq extra (concat feed-subject "<br /><br />" extra)))
! 	(setq author (or (nnrss-node-text rss-ns 'author item)
  			 (nnrss-node-text dc-ns 'creator item)
  			 (nnrss-node-text dc-ns 'contributor item)))
! 	(setq date (nnrss-normalize-date
! 		    (or (nnrss-node-text dc-ns 'date item)
! 			(nnrss-node-text rss-ns 'pubDate item))))
! 	(setq comments (nnrss-node-text rss-ns 'comments item))
! 	(when (setq enclosure (cadr (assq (intern (concat rss-ns "enclosure")) item)))
! 	  (let ((url (cdr (assq 'url enclosure)))
! 		(len (cdr (assq 'length enclosure)))
! 		(type (cdr (assq 'type enclosure)))
! 		(name))
! 	    (setq len
! 		  (if (and len (integerp (setq len (string-to-number len))))
! 		      ;; actually already in `ls-lisp-format-file-size' but
! 		      ;; probably not worth to require it for one function
! 		      (do ((size (/ len 1.0) (/ size 1024.0))
! 			   (post-fixes (list "" "k" "M" "G" "T" "P" "E")
! 				       (cdr post-fixes)))
! 			  ((< size 1024)
! 			   (format "%.1f%s" size (car post-fixes))))
! 		    "0"))
! 	    (setq url (or url ""))
! 	    (setq name (if (string-match "/\\([^/]*\\)$" url)
! 			   (match-string 1 url)
! 			 "file"))
! 	    (setq type (or type ""))
! 	    (setq enclosure (list url name len type))))
! 	(push
! 	 (list
! 	  (incf nnrss-group-max)
! 	  (current-time)
! 	  url
! 	  (and subject (nnrss-mime-encode-string subject))
! 	  (and author (nnrss-mime-encode-string author))
! 	  date
! 	  (and extra (nnrss-decode-entities-string extra))
! 	  enclosure
! 	  comments
! 	  hash-index)
! 	 nnrss-group-data)
! 	(puthash hash-index t nnrss-group-hashtb)
! 	(setq changed t))
        (setq extra nil))
      (when changed
        (nnrss-save-group-data group server)
--- 691,754 ----
  	  rss-ns (nnrss-get-namespace-prefix xml "http://purl.org/rss/1.0/")
  	  content-ns (nnrss-get-namespace-prefix xml "http://purl.org/rss/1.0/modules/content/"))
      (dolist (item (nreverse (nnrss-find-el (intern (concat rss-ns "item")) xml)))
!       (when (and (listp item) (string= (concat rss-ns "item") (car item)))
!         ;; for hashing use subject, author, url and date
  	(setq subject (nnrss-node-text rss-ns 'title item))
!         (setq author (or (nnrss-node-text rss-ns 'author item)
  			 (nnrss-node-text dc-ns 'creator item)
  			 (nnrss-node-text dc-ns 'contributor item)))
!         (setq url (nnrss-decode-entities-string
! 		   (nnrss-node-text rss-ns 'link (cddr item))))
!         (setq date (or (nnrss-node-text dc-ns 'date item)
!                        (nnrss-node-text rss-ns 'pubDate item)))
!         (when (progn
!                 (setq hash-index (md5 (concat (or subject "")
!                                               (or author "")
!                                               (or url "")
!                                               (or date ""))))
!                 (not (gethash hash-index nnrss-group-hashtb)))
!           (setq date (nnrss-normalize-date date))
!           (setq extra (or (nnrss-node-text content-ns 'encoded item)
!                           (nnrss-node-text rss-ns 'description item)))
!           (if (setq feed-subject (nnrss-node-text dc-ns 'subject item))
!               (setq extra (concat feed-subject "<br /><br />" extra)))
!           (setq comments (nnrss-node-text rss-ns 'comments item))
!           (when (setq enclosure (cadr (assq (intern (concat rss-ns "enclosure")) item)))
!             (let ((url (cdr (assq 'url enclosure)))
!                   (len (cdr (assq 'length enclosure)))
!                   (type (cdr (assq 'type enclosure)))
!                   (name))
!               (setq len
!                     (if (and len (integerp (setq len (string-to-number len))))
!                         ;; actually already in `ls-lisp-format-file-size' but
!                         ;; probably not worth to require it for one function
!                         (do ((size (/ len 1.0) (/ size 1024.0))
!                              (post-fixes (list "" "k" "M" "G" "T" "P" "E")
!                                          (cdr post-fixes)))
!                             ((< size 1024)
!                              (format "%.1f%s" size (car post-fixes))))
!                       "0"))
!               (setq url (or url ""))
!               (setq name (if (string-match "/\\([^/]*\\)$" url)
!                              (match-string 1 url)
!                            "file"))
!               (setq type (or type ""))
!               (setq enclosure (list url name len type))))
!           (push
!            (list
!             (incf nnrss-group-max)
!             (current-time)
!             url
!             (and subject (nnrss-mime-encode-string subject))
!             (and author (nnrss-mime-encode-string author))
!             date
!             (and extra (nnrss-decode-entities-string extra))
!             enclosure
!             comments
!             hash-index)
!            nnrss-group-data)
!           (puthash hash-index t nnrss-group-hashtb)
!           (setq changed t)))
        (setq extra nil))
      (when changed
        (nnrss-save-group-data group server)
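
The idea of the patch can be tried outside of nnrss.el.  What follows is
a minimal sketch, not the code from the diff: the my-rss-* names and the
sample items are made up for illustration, and only `md5' and the
hash-table functions are standard Emacs.  It builds the key from
subject, author, URL and date only, so an item whose other fields change
between fetches still maps to the same key.

```elisp
;; Hypothetical helper names (my-rss-*); none of this is part of nnrss.el.

(defun my-rss-item-key (subject author url date)
  "Return the MD5 hex digest of SUBJECT, AUTHOR, URL and DATE concatenated.
A nil argument is treated as the empty string, as in the patch."
  (md5 (concat (or subject "") (or author "") (or url "") (or date ""))))

(defun my-rss-count-new (items)
  "Count the entries of ITEMS whose key has not been seen before.
Each element of ITEMS is a list (SUBJECT AUTHOR URL DATE)."
  (let ((seen (make-hash-table :test 'equal))
        (new 0))
    (dolist (item items new)
      (let ((key (apply #'my-rss-item-key item)))
        (unless (gethash key seen)
          (puthash key t seen)
          (setq new (1+ new)))))))

;; The first two entries stand for the same article fetched twice; the
;; body text is not part of the key, so a changed body would not matter.
(my-rss-count-new
 '(("Title" "Ann" "http://example.org/1" "2007-02-03")
   ("Title" "Ann" "http://example.org/1" "2007-02-03")
   ("Other" nil "http://example.org/2" nil)))
```

The last form evaluates to 2.  One thing to keep in mind: the fields are
concatenated without a separator, so ("ab" "c") and ("a" "bc") as
subject and author give the same key.  Whether that matters for real
feeds is a judgment call.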


* Re: [nnrss.el patch] Only Use Subject, Author, URL and Date to build the Hash Index
  2007-02-04  4:10 [nnrss.el patch] Only Use Subject, Author, URL and Date to build the Hash Index David Hansen
@ 2007-02-24 17:21 ` Mark Plaksin
  2007-02-25 18:24   ` David Hansen
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Plaksin @ 2007-02-24 17:21 UTC
  To: info-gnus-english

David Hansen <david.hansen@gmx.net> writes:

> Hello,
>
> So far this works well with the feeds I read; no nasty duplicates
> anymore.  It should be easy to extend to other fields if necessary.

This is perfect.  Thank you!  It ought to be applied to Gnus.


* Re: [nnrss.el patch] Only Use Subject, Author, URL and Date to build the Hash Index
  2007-02-24 17:21 ` Mark Plaksin
@ 2007-02-25 18:24   ` David Hansen
  0 siblings, 0 replies; 3+ messages in thread
From: David Hansen @ 2007-02-25 18:24 UTC
  To: info-gnus-english

On Sat, 24 Feb 2007 12:21:16 -0500 Mark Plaksin wrote:

> David Hansen <david.hansen@gmx.net> writes:
>
>> Hello,
>>
>> So far this works well with the feeds I read; no nasty duplicates
>> anymore.  It should be easy to extend to other fields if necessary.
>
> This is perfect.  Thank you!  It ought to be applied to Gnus.

Oops, it seems I've sent it to the wrong list...  Shame on me.  Thanks
for the reply; otherwise I wouldn't have noticed.

I'll send it to the devel list (I will have a look at it first, though;
I think calculating a hash shouldn't be necessary anymore).

David

