* Why am I getting duplicate messages on RSS groups?
From: Nasser Alkmim @ 2024-04-17 8:52 UTC
To: ding

Hi,

Not sure how to debug this situation, but some RSS feeds that I have in
groups end up with duplicate messages.

I use this "five filters full-text RSS" to extract the full text from
some RSS feeds, and it has a limit of 3 items per feed and 12-hours
refresh rate. Maybe after this 12-hours, the messages are obtained again.

The duplicate messages have different "Message-ID", but same
subject/date and everything else.

Any ideas?

--
Nasser Alkmim
* Re: Why am I getting duplicate messages on RSS groups?
From: Emanuel Berg @ 2024-04-17 9:28 UTC
To: ding

Nasser Alkmim wrote:

> The duplicate messages have different "Message-ID", but same
> subject/date and everything else.
>
> Any ideas?

See if this works -

  (setq gnus-suppress-duplicates t)

--
underground experts united
https://dataswamp.org/~incal
* Re: Why am I getting duplicate messages on RSS groups?
From: Nasser Alkmim @ 2024-04-17 10:47 UTC
To: ding

Emanuel Berg <incal@dataswamp.org> writes:

> See if this works -
>
>   (setq gnus-suppress-duplicates t)

I will give it a try. Thanks!

--
Nasser Alkmim
* Re: Why am I getting duplicate messages on RSS groups?
From: Nasser Alkmim @ 2024-04-26 5:42 UTC
To: ding

Nasser Alkmim <nasser.alkmim@gmail.com> writes:

> Emanuel Berg <incal@dataswamp.org> writes:
>
>> See if this works -
>>
>>   (setq gnus-suppress-duplicates t)
>
> I will give it a try. Thanks!

Unfortunately, this setting did not solve the problem. I'm still getting
duplicates in the RSS groups.

--
Nasser Alkmim
+43 677 6408 9171
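A likely reason the setting makes no difference here: according to the Gnus manual, duplicate suppression works by remembering the Message-IDs of articles that have already been read, and the duplicates described in this thread carry different Message-IDs. The fragment below only restates that in config form; pairing it with gnus-save-duplicate-list is an illustrative assumption, not something suggested in the thread.

```elisp
;; Duplicate suppression is keyed on Message-ID, so two nnrss articles
;; with the same subject and date but different Message-IDs are not
;; recognised as duplicates of each other.
(setq gnus-suppress-duplicates t     ; mark already-seen Message-IDs as read
      gnus-save-duplicate-list t)    ; keep the list of seen IDs across sessions
```

So this is still a reasonable setting for cross-posted mail and news, but it is not expected to help with RSS entries that are re-fetched under a new Message-ID.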
* Re: Why am I getting duplicate messages on RSS groups?
From: Tim Landscheidt @ 2024-04-17 15:08 UTC
To: Nasser Alkmim
Cc: ding

Nasser Alkmim <nasser.alkmim@gmail.com> wrote:

> Not sure how to debug this situation, but some RSS feeds that I have in
> groups end up with duplicate messages.

> I use this "five filters full-text RSS" to extract the full text from
> some RSS feeds, and it has a limit of 3 items per feed and 12-hours
> refresh rate.
> Maybe after this 12-hours, the messages are obtained again.

> The duplicate messages have different "Message-ID", but same
> subject/date and everything else.

> Any ideas?

I'm not sure the /internal/ dates are actually the same: If
I write the data for such duplicate entries to disk (*1):

| (dolist (i '(58302 58461 58609 58757 58905 59053))
|   (with-temp-file (format "/tmp/%d.el" i)
|     (pp (cddr (assoc i nnrss-group-data)) (current-buffer))))

and diff them, some entries change from file to file
(pubDate, author, URL, etc.). For example, pubDate is:

| $ grep -i date /tmp/*.el
| /tmp/58302.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 GMT")
| /tmp/58461.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
| /tmp/58609.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
| /tmp/58757.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
| /tmp/58905.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 +0000")
| /tmp/59053.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 +0100")
| $

But for all six messages, Gnus says:

| Date: Thu, 10 Dec 2020 22:01:00 +0000 (3 years, 18 weeks ago)

Now if I understand nnrss.el correctly, it considers two entries
the same if they only differ in fields listed in
nnrss-ignore-article-fields (which is 'slash:comments by default),
so any changes to an RSS feed entry will create a
new Gnus nnrss message. What appears to be missing is
treating guid as an indicator that an entry has not changed.

Tim

(*1) There is probably also a way to do this in Emacs
     (Lisp) itself.
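On footnote (*1): a possible way to do the comparison inside Emacs, as a rough sketch. The helper name my/list-diff is made up for illustration, and the commented call assumes, like the snippet in the message above, that nnrss-group-data currently holds the group in question.

```elisp
(require 'cl-lib)

(defun my/list-diff (a b)
  "Return (ONLY-IN-A ONLY-IN-B).
Each sublist holds the elements of one list that have no `equal'
counterpart in the other list."
  (list (cl-set-difference a b :test #'equal)
        (cl-set-difference b a :test #'equal)))

;; With an nnrss group selected, so that `nnrss-group-data' is populated
;; (same assumption as in the message above):
;; (my/list-diff (cddr (assoc 58302 nnrss-group-data))
;;               (cddr (assoc 58461 nnrss-group-data)))

;; Stand-alone illustration, using two of the pubDate values quoted above
;; and a hypothetical title field:
(my/list-diff
 '((title nil "A") (pubDate nil "Thu, 10 Dec 2020 22:01:00 GMT"))
 '((title nil "A") (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")))
;; => (((pubDate nil "Thu, 10 Dec 2020 22:01:00 GMT"))
;;     ((pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")))
```

This only compares top-level elements, which is enough to see which fields differ between two stored copies of the same entry.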
* Re: Why am I getting duplicate messages on RSS groups?
From: James Thomas @ 2024-04-26 15:16 UTC
To: ding

Tim Landscheidt wrote:

> Nasser Alkmim <nasser.alkmim@gmail.com> wrote:
>
>> Not sure how to debug this situation, but some RSS feeds that I have
>> in groups end up with duplicate messages.
>
>> I use this "five filters full-text RSS" to extract the full text
>> from some RSS feeds, and it has a limit of 3 items per feed and
>> 12-hours refresh rate.
>> Maybe after this 12-hours, the messages are obtained again.
>
>> The duplicate messages have different "Message-ID", but same
>> subject/date and everything else.
>
>> Any ideas?
>
> I'm not sure the /internal/ dates are actually the same: If
> I write the data for such duplicate entries to disk (*1):
>
> | (dolist (i '(58302 58461 58609 58757 58905 59053))
> |   (with-temp-file (format "/tmp/%d.el" i)
> |     (pp (cddr (assoc i nnrss-group-data)) (current-buffer))))
>
> and diff them, some entries change from file to file
> (pubDate, author, URL, etc.). For example, pubDate is:
>
> | $ grep -i date /tmp/*.el
> | /tmp/58302.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 GMT")
> | /tmp/58461.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
> | /tmp/58609.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
> | /tmp/58757.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
> | /tmp/58905.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 +0000")
> | /tmp/59053.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 +0100")
> | $
>
> But for all six messages, Gnus says:
>
> | Date: Thu, 10 Dec 2020 22:01:00 +0000 (3 years, 18 weeks ago)
>
> Now if I understand nnrss.el correctly, it considers two entries
> the same if they only differ in fields listed in
> nnrss-ignore-article-fields (which is 'slash:comments by default),
> so any changes to an RSS feed entry will create a
> new Gnus nnrss message. What appears to be missing is
> treating guid as an indicator that an entry has not changed.

Nasser,

Maybe you haven't tried adding 'pubDate (or better: everything other
than 'guid) to nnrss-ignore-article-fields.
* Re: Why am I getting duplicate messages on RSS groups?
From: Eric S Fraga @ 2024-04-26 15:32 UTC
To: ding

James,

On Friday, 26 Apr 2024 at 20:46, James Thomas wrote:
> Maybe you haven't tried adding 'pubDate (or better: everything other
> than 'guid) to nnrss-ignore-article-fields.

Is there a list of fields to be added to nnrss-ignore-article-fields
that one can find somewhere or should I guess them from the header of an
rss entry?

Thank you,
eric

--
Eric S Fraga via gnus (Emacs 30.0.50 2024-04-17) on Debian bookworm/sid
* Re: Why am I getting duplicate messages on RSS groups?
From: James Thomas @ 2024-04-26 21:36 UTC
To: ding

Eric S Fraga wrote:

> Is there a list of fields to be added to nnrss-ignore-article-fields
> that one can find somewhere or should I guess them from the header of an
> rss entry?

The latter, I suppose, since it could vary by site. I had the following:

  '(slash:comments slash:hit_parade num_comments ups)

...for slashdot.org (before I switched to the new nnatom backend).
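Since the field names vary by feed, one way to avoid guessing is to list the names that actually occur in a stored entry. The sketch below assumes that an entry's stored data contains parsed XML nodes of the form (TAG ATTRS . CHILDREN), which is what the pubDate lines quoted earlier in the thread suggest; the function name my/xml-field-names and the sample data are hypothetical.

```elisp
(defun my/xml-field-names (nodes)
  "Return the distinct tag symbols found in NODES.
NODES is a list whose elements may be parsed XML nodes of the form
\(TAG ATTRS . CHILDREN); strings and other non-node elements are skipped."
  (delete-dups
   (delq nil
         (mapcar (lambda (node)
                   (and (consp node) (symbolp (car node)) (car node)))
                 nodes))))

;; Stand-alone illustration with made-up data:
(my/xml-field-names
 '((title nil "Some post")
   (pubDate nil "Thu, 10 Dec 2020 22:01:00 GMT")
   "\n"
   (guid ((isPermaLink . "false")) "abc123")))
;; => (title pubDate guid)

;; Possible use on real data, under the assumption stated above:
;; (my/xml-field-names (cddr (assoc 58302 nnrss-group-data)))
```

Everything in the result other than guid would then be a candidate for nnrss-ignore-article-fields, in line with the suggestion made earlier in the thread.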
* Re: Why am I getting duplicate messages on RSS groups?
From: Eric S Fraga @ 2024-04-28 10:02 UTC
To: ding

On Saturday, 27 Apr 2024 at 03:06, James Thomas wrote:
> The latter, I suppose, since it could vary by site. I had the following:
>
>   '(slash:comments slash:hit_parade num_comments ups)
>
> ...for slashdot.org (before I switched to the new nnatom backend).

Thank you. None of these appear in the headers for my rss feeds, most
of which are mastodon tags or individuals. Mastodon rss feeds are quite
minimal in the information they propagate unfortunately.

--
Eric S Fraga via gnus (Emacs 30.0.50 2024-04-18) on Debian 12.5
* Re: Why am I getting duplicate messages on RSS groups?
From: Nasser Alkmim @ 2024-04-27 6:54 UTC
To: James Thomas
Cc: ding

James Thomas <jimjoe@gmx.net> writes:

> Nasser,
>
> Maybe you haven't tried adding 'pubDate (or better: everything other
> than 'guid) to nnrss-ignore-article-fields.

Hi James,

I tried (add-to-list 'nnrss-ignore-article-fields 'pubDate), and after
scanning an RSS group with gnus-group-get-new-news-this-group, it still
fetches a repeated article.

What I don't understand is that, after scanning the RSS group, I check
the variable nnrss-ignore-article-fields again and it is reset to its
default value (slash:comments).

I'm able to reproduce the behavior by

1. deleting the duplicated messages (two in this case)
2. closing and reopening Gnus
3. scanning the group again

Then the duplicated messages reappear.

--
Nasser Alkmim
+43 677 6408 9171
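To find out what resets the variable during a scan, one option is a variable watcher (available in Emacs 26.1 and later). This is a debugging sketch only; the function name my/log-nnrss-ignore-fields-change is made up, and nothing here is specific to nnrss beyond the variable name.

```elisp
(defun my/log-nnrss-ignore-fields-change (symbol newval operation where)
  "Log a change to SYMBOL in the *Messages* buffer.
NEWVAL, OPERATION and WHERE follow the calling convention of
`add-variable-watcher': OPERATION is e.g. `set', `let' or `unlet', and
WHERE is a buffer for buffer-local changes, otherwise nil."
  (message "%s: %s -> %S%s"
           symbol operation newval
           (if where (format " (in %S)" where) "")))

(add-variable-watcher 'nnrss-ignore-article-fields
                      #'my/log-nnrss-ignore-fields-change)

;; To stop logging again:
;; (remove-variable-watcher 'nnrss-ignore-article-fields
;;                          #'my/log-nnrss-ignore-fields-change)
```

After a scan, the *Messages* buffer should show whether the value was changed by a plain set or only temporarily rebound (let/unlet), which narrows down where to look.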
* Re: Why am I getting duplicate messages on RSS groups?
From: James Thomas @ 2024-04-27 9:33 UTC
To: ding

Nasser Alkmim wrote:

> I tried (add-to-list 'nnrss-ignore-article-fields 'pubDate), and after

Are you sure it was done after Gnus and nnrss were loaded? If not, try
putting this in ~/.gnus.el:

  (setq nnrss-ignore-article-fields '(slash:comments pubDate))
* Re: Why am I getting duplicate messages on RSS groups?
From: Nasser Alkmim @ 2024-04-27 13:07 UTC
To: James Thomas
Cc: ding

James Thomas <jimjoe@gmx.net> writes:

> Nasser Alkmim wrote:
>
>> I tried (add-to-list 'nnrss-ignore-article-fields 'pubDate), and after
>
> Are you sure it was done after Gnus and nnrss were loaded? If not, try
> putting this in ~/.gnus.el:
>
>   (setq nnrss-ignore-article-fields '(slash:comments pubDate))

I have it in a use-package declaration that expands to:

  (eval-after-load 'nrss
    '(progn
       (add-to-list 'nnrss-ignore-article-fields 'pubDate)
       t))

I also tried it in ~/.gnus.el, but the same behavior persists.

--
Nasser Alkmim
+43 677 6408 9171
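One detail worth checking: the expansion quoted above is keyed on the feature nrss, while the library is nnrss.el and provides the feature nnrss. If that spelling is what the configuration really contains (rather than a slip made when writing the mail), the body would never run. A minimal sketch of the same setting with the feature name spelled nnrss, plus two checks to evaluate after a scan:

```elisp
;; nnrss.el provides the feature `nnrss', so the hook must use that name.
(with-eval-after-load 'nnrss
  (add-to-list 'nnrss-ignore-article-fields 'pubDate))

;; After scanning a group, evaluate these (e.g. with M-:) to check:
;; (featurep 'nnrss)              ; non-nil once nnrss.el has been loaded
;; nnrss-ignore-article-fields    ; should contain pubDate if the form ran
```

This would not explain why the ~/.gnus.el variant also failed, so it is at most part of the picture.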
end of thread [~2024-04-28 10:03 UTC]

Thread overview: 12+ messages
2024-04-17  8:52 Why am I getting duplicate messages on RSS groups? Nasser Alkmim
2024-04-17  9:28 ` Emanuel Berg
2024-04-17 10:47   ` Nasser Alkmim
2024-04-26  5:42     ` Nasser Alkmim
2024-04-17 15:08 ` Tim Landscheidt
2024-04-26 15:16   ` James Thomas
2024-04-26 15:32     ` Eric S Fraga
2024-04-26 21:36       ` James Thomas
2024-04-28 10:02         ` Eric S Fraga
2024-04-27  6:54     ` Nasser Alkmim
2024-04-27  9:33       ` James Thomas
2024-04-27 13:07         ` Nasser Alkmim