From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.user/16531
From: Alex Schroeder
Newsgroups: gmane.emacs.gnus.user
Subject: Deleting duplicates from nnml:mail.misc
Date: Tue, 08 Oct 2013 01:00:10 +0200
To: info-gnus-english@gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (darwin)

I've started using Gnus again after many years. I had a ton of mbox
files that I had created by running fetchmail against my Gmail account.
I didn't really trust it, though, so I ran it a lot, and I thought that
maybe it didn't get my sent messages. I therefore moved a lot of
messages from the Gmail IMAP to my nnml:mail.misc using B m ... and now
I'm suddenly having second thoughts. Is my nnml:mail.misc of about
60000 messages full of duplicates?

I've been wondering how to find them, if any. I've tried looking at the
.overview file...

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my %count = ();   # Message-ID => number of occurrences
    my %file = ();    # Message-ID => article numbers carrying it
    my $overview = "/Volumes/Extern/Archives/Mail/mail/misc/.overview";
    open(F, $overview) or die "Cannot open $overview: $!";
    while (my $line = <F>) {
      chomp $line;
      my @field = split(/\t/, $line);
      # field 0 is the article number, field 4 the Message-ID
      $count{$field[4]}++;
      push(@{$file{$field[4]}}, $field[0]);
    }
    close(F);
    # sort numerically, most frequent Message-ID first
    my @keys = sort { $count{$b} <=> $count{$a} } keys %count;
    # print the four most frequent Message-IDs with counts and articles
    print join("\n", map { $_ . "\t" . $count{$_} . "\t"
                           . join(", ", @{$file{$_}}) } @keys[0 .. 3]) . "\n";

How would I best use this script to delete the duplicate messages? Can I
"regenerate" the overview file without losing anything, perhaps by
regenerating something in the Server buffer? Or is there an elisp
version of the above that does the right thing?

Cheers
Alex