From: James Thomas
To: ding@gnus.org
Subject: Re: Why am I getting duplicate messages on RSS groups?
In-Reply-To: <87ttk0do1x.fsf@vagabond.tim-landscheidt.de> (Tim Landscheidt's message of "Wed, 17 Apr 2024 15:08:42 +0000")
References: <86le5cv0b0.fsf@gmail.com> <87ttk0do1x.fsf@vagabond.tim-landscheidt.de>
Date: Fri, 26 Apr 2024 20:46:35 +0530
Message-ID: <87jzkk6tnw.fsf@gmx.net>

Tim Landscheidt wrote:

> Nasser Alkmim wrote:
>
>> Not sure how to debug this situation, but some RSS feeds that I have
>> in groups end up with duplicate messages.
>
>> I use this "five filters full-text RSS" to extract the full text
>> from some RSS feeds, and it has a limit of 3 items per feed and
>> 12-hours refresh rate.
>> Maybe after this 12-hours, the messages are obtained again.
>
>> The duplicate messages have different "Message-ID", but same
>> subject/date and everything else.
>
>> Any ideas?
>
> I'm not sure the /internal/ dates are actually the same: If
> I write the data for such duplicate entries to disk (*1):
>
> | (dolist (i '(58302 58461 58609 58757 58905 59053))
> |   (with-temp-file (format "/tmp/%d.el" i)
> |     (pp (cddr (assoc i nnrss-group-data)) (current-buffer))))
>
> and diff them, some entries change from file to file
> (pubDate, author, URL, etc.).  For example, pubDate is:
>
> | $ grep -i date /tmp/*.el
> | /tmp/58302.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 GMT")
> | /tmp/58461.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
> | /tmp/58609.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
> | /tmp/58757.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 -0400")
> | /tmp/58905.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 +0000")
> | /tmp/59053.el: (pubDate nil "Thu, 10 Dec 2020 22:01:00 +0100")
> | $
>
> But for all six messages, Gnus says:
>
> | Date: Thu, 10 Dec 2020 22:01:00 +0000 (3 years, 18 weeks ago)
>
> Now if I understand nnrss.el correctly, it considers two
> entries the same if they only differ in fields listed in
> nnrss-ignore-article-fields (which is 'slash:comments by
> default), so any changes to an RSS feed entry will create a
> new Gnus nnrss message.  What appears to be missing is
> treating guid as an indicator that an entry has not changed.

Nasser,

Maybe you haven't tried adding 'pubDate (or better: everything other
than 'guid) to nnrss-ignore-article-fields.
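
In case it helps, here is a minimal sketch of that setting for an init
file.  It assumes pubDate is the only field your feed rewrites between
fetches; the list itself is a guess based on Tim's diff, so extend it
with whatever other fields you see changing.

```elisp
;; Sketch only, not a tested fix.
;; `slash:comments' is the default value mentioned above; `pubDate' is
;; added on the assumption that the feed rewrites it between fetches,
;; as in the grep output quoted earlier.
(setq nnrss-ignore-article-fields '(slash:comments pubDate))
```

If author or the URL also change in your feed, as they did for Tim,
their field symbols would need to be added the same way; check the
stored entries for the exact names before adding them.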