Gnus development mailing list
 help / color / mirror / Atom feed
From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: ding@gnus.org
Subject: Re: [PATCH] Two issues with the gnus-registry
Date: Thu, 13 Nov 2014 20:05:38 +0800	[thread overview]
Message-ID: <8761ej8fvx.fsf@ericabrahamsen.net> (raw)
In-Reply-To: <87bnoff9fq.fsf@lifelogs.com>

[-- Attachment #1: Type: text/plain, Size: 2399 bytes --]

Ted Zlatanov <tzz@lifelogs.com> writes:

> On Sat, 08 Nov 2014 16:39:48 +0800 Eric Abrahamsen <eric@ericabrahamsen.net> wrote: 
>
> EA> So unless I'm really missing something, my proposal is:
>
> EA> 1. Only provide one limit: max-size.
> EA> 2. Allow customization of prune-factor.
>
> EA> That seems like all the customization you'd need. Cap the size, and
> EA> provide a reasonable control of how often the registry gets pruned.
>
> OK. I am afraid I made the whole thing too complicated and it took
> another set of eyes to recognize it.  Thanks for that, and the fixes.
>
> Please go ahead and apply any changes you have in mind. I am sure no one
> is tuning soft/hard pruning. Simpler is definitely better.
>
> EA> That would require a change in the object signature, which would mean
> EA> some extra handling code for "upgrading". But once we're doing that, we
> EA> could also take the opportunity to add :prune-function and
> EA> :sort-function slots on the base registry object, which would be
> EA> nice. We could even change the default store filename from its "eioio"
> EA> extension to "eieio". :)
>
> Yeah, I remember that bug report about the extension of the database
> file, if we're upgrading anyway... It's a good chance to batch these
> changes.

Okay, here is Stab the First.

Most of what's happening should be evident from commit messages and code
comments. There are three patches:

1. The big one changes the database structure, combining :max-hard and
:max-soft into :max-size, and reworking the pruning routine to only make
a single pass. This also includes a new initialize-instance method for
upgrading from older versions, and the extraction of the database
version into a new defvar: see the comments on top of
`registry-db-version'.

2. The second patch changes the default filename to a eieio extension,
which necessitated more code than I thought -- `gnus-registry-read' is
split into two functions. I hope this isn't too crufty.

3. Number three introduces sorting into the pruning process. I
considered adding a sorting function as a slot on the registry-db
objects, but couldn't immediately think of why that was really
necessary. Anyway, there's a new option
`gnus-registry-default-sorting-function', which defaults to a function
that sorts by 'creation-time.


Tests and documentation all around.

Presumably there are bugs! I hope people will test.

Eric


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0003-Sort-registry-entries-when-pruning.patch --]
[-- Type: text/x-diff, Size: 8104 bytes --]

From 9e4e8993aaef6a861b2237e0509eeee02089916f Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen <eric@ericabrahamsen.net>
Date: Wed, 12 Nov 2014 13:49:36 +0800
Subject: [PATCH 3/4] Sort registry entries when pruning

* lisp/registry.el (registry-prune): Accept and use a sort-function
  argument.

* lisp/gnus-registry.el (gnus-registry-default-sort-function): New
  customization option pointing at a sort function to use when
  pruning.
  (gnus-registry-sort-by-creation-time): Provide a default function
  for the above option.
  (gnus-registry-save, gnus-registry-insert): Use sort function.

* lisp/tests/gnustest-registry.el (gnustest-registry-sort-function):
  Default sort function for testing.
  (gnustest-registry-pruning-sort-test): New test for sorting.

* text/gnus.texi: Document sorting options.
---
 lisp/gnus-registry.el           | 25 +++++++++++++++++++++++--
 lisp/registry.el                | 21 +++++++++++++++------
 lisp/tests/gnustest-registry.el | 26 ++++++++++++++++++++++++++
 texi/gnus.texi                  |  9 +++++++++
 4 files changed, 73 insertions(+), 8 deletions(-)

diff --git a/lisp/gnus-registry.el b/lisp/gnus-registry.el
index 0f3ea23..92f8f04 100644
--- a/lisp/gnus-registry.el
+++ b/lisp/gnus-registry.el
@@ -257,6 +257,25 @@ entries.  The pruning process is constrained by the presence of
   :group 'gnus-registry
   :type 'float)
 
+(defcustom gnus-registry-default-sort-function
+  #'gnus-registry-sort-by-creation-time
+  "Sort function to use when pruning the registry.
+
+Entries which sort to the front of the list will be pruned
+first.
+
+This can slow pruning down.  Set to nil to perform no sorting."
+  :version "24.4"
+  :group 'gnus-registry
+  :type 'symbol)
+
+(defun gnus-registry-sort-by-creation-time (l r)
+  "Sort older entries to front of list."
+  ;; Pruning starts from the front of the list.
+  (time-less-p
+   (cadr (assq 'creation-time r))
+   (cadr (assq 'creation-time l))))
+
 (defun gnus-registry-fixup-registry (db)
   (when db
     (let ((old (oref db :tracked)))
@@ -349,7 +368,8 @@ This is not required after changing `gnus-registry-cache-file'."
         (db (or db gnus-registry-db)))
     (gnus-message 5 "Saving Gnus registry (%d entries) to %s..."
                   (registry-size db) file)
-    (registry-prune db)
+    (registry-prune
+     db gnus-registry-default-sort-function)
     ;; TODO: call (gnus-string-remove-all-properties v) on all elements?
     (eieio-persistent-save db file)
     (gnus-message 5 "Saving Gnus registry (size %d) to %s...done"
@@ -1056,7 +1076,8 @@ only the last one's marks are returned."
   "Just like `registry-insert' but tries to prune on error."
   (when (registry-full db)
     (message "Trying to prune the registry because it's full")
-    (registry-prune db))
+    (registry-prune
+     db gnus-registry-default-sort-function))
   (registry-insert db id entry)
   entry)
 
diff --git a/lisp/registry.el b/lisp/registry.el
index 86e1ee3..d949e7a 100644
--- a/lisp/registry.el
+++ b/lisp/registry.el
@@ -334,7 +334,7 @@ Errors out if the key exists already."
 	       (registry-lookup-secondary-value db tr val value-keys))))
 	 (oref db :data))))))
 
-(defmethod registry-prune ((db registry-db))
+(defmethod registry-prune ((db registry-db) &optional sortfunc)
   "Prunes the registry-db object DB.
 
 Attempts to prune the number of entries down to \(*
@@ -343,6 +343,9 @@ pruning doesn't need to happen on every save. Removes only
 entries without the :precious keys, so it may not be possible to
 reach the target limit.
 
+Entries to be pruned are first sorted using SORTFUNC.  Entries
+from the front of the list are deleted first.
+
 Returns the number of deleted entries."
   (let ((size (registry-size db))
 	(target-size (- (oref db :max-size)
@@ -352,15 +355,17 @@ Returns the number of deleted entries."
     (if (> size target-size)
 	(progn
 	  (setq candidates
-		(registry-collect-prune-candidates db (- size target-size)))
+		(registry-collect-prune-candidates
+		 db (- size target-size) sortfunc))
 	  (length (registry-delete db candidates nil)))
       0)))
 
-(defmethod registry-collect-prune-candidates ((db registry-db) limit)
+(defmethod registry-collect-prune-candidates ((db registry-db) limit sortfunc)
   "Collects pruning candidates from the registry-db object DB.
 
 Proposes only entries without the :precious keys, and attempts to
-return LIMIT such candidates."
+return LIMIT such candidates.  If SORTFUNC is provided, sort
+entries first and return candidates from beginning of list."
   (let* ((precious (oref db :precious))
 	 (precious-p (lambda (entry-key)
 		       (cdr (memq (car entry-key) precious))))
@@ -368,8 +373,12 @@ return LIMIT such candidates."
 	 (candidates (cl-loop for k being the hash-keys of data
 			      using (hash-values v)
 			      when (notany precious-p v)
-			      collect k)))
-    (delq nil (cl-subseq candidates 0 limit))))
+			      collect (cons k v))))
+    ;; We want the full entries for sorting, but should only return a
+    ;; list of entry keys.
+    (when sortfunc
+      (setq candidates (sort candidates sortfunc)))
+    (delq nil (cl-subseq (mapcar #'car candidates) 0 limit))))
 
 (provide 'registry)
 ;;; registry.el ends here
diff --git a/lisp/tests/gnustest-registry.el b/lisp/tests/gnustest-registry.el
index 5ca0cc0..e6ac8db 100644
--- a/lisp/tests/gnustest-registry.el
+++ b/lisp/tests/gnustest-registry.el
@@ -52,6 +52,11 @@
     (should-not (registry--match :member entry '((hello)))))
   (message "Done with matching testing."))
 
+(defun gnustest-registry-sort-function (l r)
+  "Sort lower values of sort-field earlier."
+  (< (cadr (assq 'sort-field l))
+     (cadr (assq 'sort-field r))))
+
 (defun gnustest-registry-make-testable-db (n &optional prune-factor name file)
   (let* ((db (registry-db
               (or name "Testing")
@@ -66,6 +71,7 @@
                               (more-extra) ; Empty data key should be pruned.
                               ;; First 5 entries will NOT have this extra data.
                               ,@(when (< 4 i) (list (list 'extra "more data")))
+			      (sort-field ,(- n i))
                               (groups ,(number-to-string i)))))
     db))
 
@@ -126,6 +132,26 @@
 	     :prefix "Error: ")
 	  (should (= expected-prune-count actual-prune-count)))))))
 
+(ert-deftest gnustest-registry-pruning-sort-test ()
+  "Check that entries are sorted properly before pruning."
+  (let ((db (gnustest-registry-make-testable-db 10 0.4))
+	;; These entries have the highest 'sort-field values.  Pruning
+	;; sorts by lowest values first, then prunes from the front of
+	;; the list, so these entries survive
+	(expected-survivors '(5 6 7 8 9 0))
+	actual-survivors disjunct)
+    (registry-prune
+     db #'gnustest-registry-sort-function)
+    (maphash (lambda (k v) (push k actual-survivors))
+	     (oref db :data))
+    (setq disjunct (cl-set-exclusive-or
+		    expected-survivors
+		    actual-survivors))
+    (ert-info
+	((format "Incorrect pruning: %s" disjunct)
+	 :prefix "Error: ")
+      (should (null disjunct)))))
+
 (ert-deftest gnustest-registry-persistence-test ()
   (let* ((n 100)
          (tempfile (make-temp-file "registry-persistence-"))
diff --git a/texi/gnus.texi b/texi/gnus.texi
index ca7ddfe..9255ed9 100644
--- a/texi/gnus.texi
+++ b/texi/gnus.texi
@@ -25958,6 +25958,15 @@ cut back to 45000 entries.  Entries with keys marked as precious will
 not be pruned.
 @end defvar
 
+@defvar gnus-registry-default-sort-function
+This option specifies how registry entries are sorted during pruning.
+If a function is given, it should sort least valuable entries first,
+as pruning starts from the beginning of the list.  The default value
+is @code{gnus-registry-sort-by-creation-time}, which proposes the
+oldest entries for pruning.  Set to nil to perform no sorting, which
+will speed up the pruning process.
+@end defvar
+
 @defvar gnus-registry-cache-file
 The file where the registry will be stored between Gnus sessions.  By
 default the file name is @code{.gnus.registry.eieio} in the same
-- 
2.1.3


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Change-default-registry-filename-extension.patch --]
[-- Type: text/x-diff, Size: 5022 bytes --]

From 2ef0a99726eaeaab7831fa2cd7af0e3dd12c5e72 Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen <eric@ericabrahamsen.net>
Date: Wed, 12 Nov 2014 13:04:24 +0800
Subject: [PATCH 2/4] Change default registry filename extension

* lisp/gnus-registry.el (gnus-registry-cache-file): Change default
  filename extension to "eieio".
  (gnus-registry-read): New function, split out from
  `gnus-registry-load', that does the actual object reading.
  (gnus-registry-load): Refactor to call new function
  `gnus-registry-read', also add condition case handler to check for
  old filename extension
---
 lisp/gnus-registry.el | 52 ++++++++++++++++++++++++++++++++++-----------------
 texi/gnus.texi        |  2 +-
 2 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/lisp/gnus-registry.el b/lisp/gnus-registry.el
index d3f6546..0f3ea23 100644
--- a/lisp/gnus-registry.el
+++ b/lisp/gnus-registry.el
@@ -232,7 +232,7 @@ the Bit Bucket."
 (defcustom gnus-registry-cache-file
   (nnheader-concat
    (or gnus-dribble-directory gnus-home-directory "~/")
-   ".gnus.registry.eioio")
+   ".gnus.registry.eieio")
   "File where the Gnus registry will be stored."
   :group 'gnus-registry
   :type 'file)
@@ -301,22 +301,27 @@ This is not required after changing `gnus-registry-cache-file'."
     (gnus-message 4 "Remaking the Gnus registry")
     (setq gnus-registry-db (gnus-registry-make-db))))
 
-(defun gnus-registry-read ()
-  "Read the registry cache file."
+(defun gnus-registry-load ()
+  "Load the registry from the cache file."
   (interactive)
   (let ((file gnus-registry-cache-file))
     (condition-case nil
-        (progn
-          (gnus-message 5 "Reading Gnus registry from %s..." file)
-          (setq gnus-registry-db
-		(gnus-registry-fixup-registry
-		 (condition-case nil
-		     (with-no-warnings
-		       (eieio-persistent-read file 'registry-db))
-		   ;; Older EIEIO versions do not check the class name.
-		   ('wrong-number-of-arguments
-		    (eieio-persistent-read file)))))
-          (gnus-message 5 "Reading Gnus registry from %s...done" file))
+        (gnus-registry-read file)
+      (file-error
+       ;; Fix previous mis-naming of the registry file.
+       (let ((old-file-name
+	      (concat (file-name-sans-extension
+		      gnus-registry-cache-file)
+		     ".eioio")))
+	 (if (and (file-exists-p old-file-name)
+		  (yes-or-no-p
+		   (format "Rename registry file from %s to %s? "
+			   old-file-name file)))
+	     (progn
+	       (gnus-registry-read old-file-name)
+	       (oset gnus-registry-db :file file)
+	       (gnus-message 1 "Registry filename changed to %s" file))
+	   (gnus-registry-remake-db t))))
       (error
        (gnus-message
         1
@@ -324,6 +329,19 @@ This is not required after changing `gnus-registry-cache-file'."
         file)
        (gnus-registry-remake-db t)))))
 
+(defun gnus-registry-read (file)
+  "Do the actual reading of the registry persistence file."
+  (gnus-message 5 "Reading Gnus registry from %s..." file)
+  (setq gnus-registry-db
+	(gnus-registry-fixup-registry
+	 (condition-case nil
+	     (with-no-warnings
+	       (eieio-persistent-read file 'registry-db))
+	   ;; Older EIEIO versions do not check the class name.
+	   ('wrong-number-of-arguments
+	    (eieio-persistent-read file)))))
+  (gnus-message 5 "Reading Gnus registry from %s...done" file))
+
 (defun gnus-registry-save (&optional file db)
   "Save the registry cache file."
   (interactive)
@@ -1096,7 +1114,7 @@ only the last one's marks are returned."
   (gnus-message 5 "Initializing the registry")
   (gnus-registry-install-hooks)
   (gnus-registry-install-shortcuts)
-  (gnus-registry-read))
+  (gnus-registry-load))
 
 ;; FIXME: Why autoload this function?
 ;;;###autoload
@@ -1110,7 +1128,7 @@ only the last one's marks are returned."
   (add-hook 'nnmail-spool-hook 'gnus-registry-spool-action)
 
   (add-hook 'gnus-save-newsrc-hook 'gnus-registry-save)
-  (add-hook 'gnus-read-newsrc-el-hook 'gnus-registry-read)
+  (add-hook 'gnus-read-newsrc-el-hook 'gnus-registry-load)
 
   (add-hook 'gnus-summary-prepare-hook 'gnus-registry-register-message-ids))
 
@@ -1123,7 +1141,7 @@ only the last one's marks are returned."
   (remove-hook 'nnmail-spool-hook 'gnus-registry-spool-action)
 
   (remove-hook 'gnus-save-newsrc-hook 'gnus-registry-save)
-  (remove-hook 'gnus-read-newsrc-el-hook 'gnus-registry-read)
+  (remove-hook 'gnus-read-newsrc-el-hook 'gnus-registry-load)
 
   (remove-hook 'gnus-summary-prepare-hook 'gnus-registry-register-message-ids)
   (setq gnus-registry-enabled nil))
diff --git a/texi/gnus.texi b/texi/gnus.texi
index 082adad..ca7ddfe 100644
--- a/texi/gnus.texi
+++ b/texi/gnus.texi
@@ -25960,7 +25960,7 @@ not be pruned.
 
 @defvar gnus-registry-cache-file
 The file where the registry will be stored between Gnus sessions.  By
-default the file name is @code{.gnus.registry.eioio} in the same
+default the file name is @code{.gnus.registry.eieio} in the same
 directory as your @code{.newsrc.eld}.
 @end defvar
 
-- 
2.1.3


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0001-Alter-registry-database-to-fix-pruning-issues.patch --]
[-- Type: text/x-diff, Size: 17827 bytes --]

From 7da7dc2222137b03ff8f54e61ac0c519b7316f0c Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen <eric@ericabrahamsen.net>
Date: Wed, 12 Nov 2014 12:59:45 +0800
Subject: [PATCH 1/4] Alter registry database to fix pruning issues

* lisp/registry.el (registry-db): Consolidate the :max-hard and
  :max-soft slots into a :max-size slot.
  (registry-db-version): New defvar, holding database version number.
  (registry-prune): Use :max-size slot
  (registry-collect-prune-candidates): New function for finding
  non-precious pruning candidates
  (registry-prune-hard-candidates, registry-prune-soft-candidates):
  Remove functions
  (initialize-instance): Perform registry version upgrade

* lisp/gnus-registry.el (gnus-registry-prune-factor): New custom
  option
  (gnus-registry-max-pruned-entries): Remove option

* lisp/tests/gnustest-registry.el (gnustest-registry-make-testable-db):
  Alter for new object signature
  (gnustest-registry-pruning-test): New pruning test

* texi/gnus.texi (Gnus Registry Setup): Document changes
---
 lisp/gnus-registry.el           |  30 ++++----
 lisp/registry.el                | 150 +++++++++++++++++++++-------------------
 lisp/tests/gnustest-registry.el |  50 +++++++++-----
 texi/gnus.texi                  |  16 +++--
 4 files changed, 140 insertions(+), 106 deletions(-)

diff --git a/lisp/gnus-registry.el b/lisp/gnus-registry.el
index f3b81f7..d3f6546 100644
--- a/lisp/gnus-registry.el
+++ b/lisp/gnus-registry.el
@@ -176,6 +176,7 @@ nnmairix groups are specifically excluded because they are ephemeral."
 (make-obsolete-variable 'gnus-registry-max-track-groups nil "23.4")
 (make-obsolete-variable 'gnus-registry-entry-caching nil "23.4")
 (make-obsolete-variable 'gnus-registry-trim-articles-without-groups nil "23.4")
+(make-obsolete-variable 'gnus-registry-max-pruned-entries nil "24.4")
 
 (defcustom gnus-registry-track-extra '(subject sender recipient)
   "Whether the registry should track extra data about a message.
@@ -242,12 +243,19 @@ the Bit Bucket."
   :type '(radio (const :format "Unlimited " nil)
                 (integer :format "Maximum number: %v")))
 
-(defcustom gnus-registry-max-pruned-entries nil
-  "Maximum number of pruned entries in the registry, nil for unlimited."
-  :version "24.1"
+(defcustom gnus-registry-prune-factor 0.1
+  "When pruning, try to prune back to this factor less than the maximum size.
+
+In order to prevent constant pruning, we prune back to a number
+somewhat less than the maximum size.  This option controls
+exactly how much less.  For example, given a maximum size of
+50000 and a prune factor of 0.1, the pruning process will try to
+cut the registry back to \(- 50000 \(* 50000 0.1\)\) -> 45000
+entries.  The pruning process is constrained by the presence of
+\"precious\" entries."
+  :version "24.4"
   :group 'gnus-registry
-  :type '(radio (const :format "Unlimited " nil)
-                (integer :format "Maximum number: %v")))
+  :type 'float)
 
 (defun gnus-registry-fixup-registry (db)
   (when db
@@ -255,14 +263,12 @@ the Bit Bucket."
       (oset db :precious
             (append gnus-registry-extra-entries-precious
                     '()))
-      (oset db :max-hard
+      (oset db :max-size
             (or gnus-registry-max-entries
                 most-positive-fixnum))
       (oset db :prune-factor
-            0.1)
-      (oset db :max-soft
-            (or gnus-registry-max-pruned-entries
-                most-positive-fixnum))
+            (or gnus-registry-prune-factor
+		0.1))
       (oset db :tracked
             (append gnus-registry-track-extra
                     '(mark group keyword)))
@@ -278,8 +284,8 @@ the Bit Bucket."
     "Gnus Registry"
     :file (or file gnus-registry-cache-file)
     ;; these parameters are set in `gnus-registry-fixup-registry'
-    :max-hard most-positive-fixnum
-    :max-soft most-positive-fixnum
+    :max-size most-positive-fixnum
+    :version registry-db-version
     :precious nil
     :tracked nil)))
 
diff --git a/lisp/registry.el b/lisp/registry.el
index dbc7b51..86e1ee3 100644
--- a/lisp/registry.el
+++ b/lisp/registry.el
@@ -25,11 +25,11 @@
 ;; This library provides a general-purpose EIEIO-based registry
 ;; database with persistence, initialized with these fields:
 
-;; version: a float, 0.1 currently (don't change it)
+;; version: a float
 
-;; max-hard: an integer, default 5000000
+;; max-size: an integer, default 50000
 
-;; max-soft: an integer, default 50000
+;; prune-factor: a float between 0 and 1, default 0.1
 
 ;; precious: a list of symbols
 
@@ -57,14 +57,15 @@
 ;; Note that whether a field has one or many pieces of data, the data
 ;; is always a list of values.
 
-;; The user decides which fields are "precious", F2 for example.  At
-;; PRUNE TIME (when the :prune-function is called), the registry will
-;; trim any entries without the F2 field until the size is :max-soft
-;; or less.  No entries with the F2 field will be removed at PRUNE
-;; TIME.
+;; The user decides which fields are "precious", F2 for example.  When
+;; the registry is pruned, any entries without the F2 field will be
+;; removed until the size is :max-size * :prune-factor _less_ than the
+;; maximum database size. No entries with the F2 field will be removed
+;; at PRUNE TIME, which means it may not be possible to prune back all
+;; the way to the target size.
 
-;; When an entry is inserted, the registry will reject new entries
-;; if they bring it over the max-hard limit, even if they have the F2
+;; When an entry is inserted, the registry will reject new entries if
+;; they bring it over the :max-size limit, even if they have the F2
 ;; field.
 
 ;; The user decides which fields are "tracked", F1 for example.  Any
@@ -94,28 +95,32 @@
       (error
        "eieio not found in `load-path' or gnus-fallback-lib/ directory.")))
 
+;; The version number needs to be kept outside of the class definition
+;; itself.  The persistent-save process does *not* write to file any
+;; slot values that are equal to the default :initform value.  If a
+;; database object is at the most recent version, therefore, its
+;; version number will not be written to file.  That makes it
+;; difficult to know when a database needs to be upgraded.
+(defvar registry-db-version 0.2
+  "The current version of the registry format.")
+
 (defclass registry-db (eieio-persistent)
   ((version :initarg :version
-            :initform 0.1
-            :type float
-            :custom float
+            :initform nil
+            :type (or null float)
             :documentation "The registry version.")
-   (max-hard :initarg :max-hard
-             :initform 5000000
-             :type integer
-             :custom integer
-             :documentation "Never accept more than this many elements.")
-   (max-soft :initarg :max-soft
-             :initform 50000
+   (max-size :initarg :max-size
+             :initform most-positive-fixnum
              :type integer
              :custom integer
-             :documentation "Prune as much as possible to get to this size.")
+             :documentation "The maximum number of registry entries.")
    (prune-factor
     :initarg :prune-factor
     :initform 0.1
     :type float
     :custom float
-    :documentation "At the max-hard limit, prune size * this entries.")
+    :documentation "Prune to \(:max-size * :prune-factor\) less
+    than the :max-size limit.  Should be a float between 0 and 1.")
    (tracked :initarg :tracked
             :initform nil
             :type t
@@ -131,6 +136,23 @@
          :type hash-table
          :documentation "The data hashtable.")))
 
+(defmethod initialize-instance :BEFORE ((this registry-db) slots)
+  "Check whether a registry object needs to be upgraded."
+  ;; Hardcoded upgrade routines.  Version 0.1 to 0.2 requires the
+  ;; :max-soft slot to disappear, and the :max-hard slot to be renamed
+  ;; :max-size.
+  (let ((current-version
+	 (and (plist-member slots :version)
+	      (plist-get slots :version))))
+    (when (or (null current-version)
+	      (eql current-version 0.1))
+      (setq slots
+	    (plist-put slots :max-size (plist-get slots :max-hard)))
+      (setq slots
+	    (plist-put slots :version registry-db-version))
+      (cl-remf slots :max-hard)
+      (cl-remf slots :max-soft))))
+
 (defmethod initialize-instance :AFTER ((this registry-db) slots)
   "Set value of data slot of THIS after initialization."
   (with-slots (data tracker) this
@@ -267,7 +289,7 @@ This is the key count of the :data slot."
 (defmethod registry-full ((db registry-db))
   "Checks if registry-db THIS is full."
   (>= (registry-size db)
-      (oref db :max-hard)))
+      (oref db :max-size)))
 
 (defmethod registry-insert ((db registry-db) key entry)
   "Insert ENTRY under KEY into the registry-db THIS.
@@ -279,7 +301,7 @@ Errors out if the key exists already."
 
   (assert (not (registry-full db))
 	  nil
-	  "registry max-hard size limit reached")
+	  "registry max-size limit reached")
 
   ;; store the entry
   (puthash key entry (oref db :data))
@@ -312,58 +334,42 @@ Errors out if the key exists already."
 	       (registry-lookup-secondary-value db tr val value-keys))))
 	 (oref db :data))))))
 
-(defmethod registry-prune ((db registry-db) &optional sortfun)
-  "Prunes the registry-db object THIS.
-Removes only entries without the :precious keys if it can,
-then removes oldest entries first.
-Returns the number of deleted entries.
-If SORTFUN is given, tries to keep entries that sort *higher*.
-SORTFUN is passed only the two keys so it must look them up directly."
-  (dolist (collector '(registry-prune-soft-candidates
-		       registry-prune-hard-candidates))
-    (let* ((size (registry-size db))
-	   (collected (funcall collector db))
-	   (limit (nth 0 collected))
-	   (candidates (nth 1 collected))
-	   ;; sort the candidates if SORTFUN was given
-	   (candidates (if sortfun (sort candidates sortfun) candidates))
-	   (candidates-count (length candidates))
-	   ;; are we over max-soft?
-	   (prune-needed (> size limit)))
-
-      ;; while we have more candidates than we need to remove...
-      (while (and (> candidates-count (- size limit)) candidates)
-	(decf candidates-count)
-	(setq candidates (cdr candidates)))
-
-      (registry-delete db candidates nil)
-      (length candidates))))
-
-(defmethod registry-prune-soft-candidates ((db registry-db))
-  "Collects pruning candidates from the registry-db object THIS.
-Proposes only entries without the :precious keys."
+(defmethod registry-prune ((db registry-db))
+  "Prunes the registry-db object DB.
+
+Attempts to prune the number of entries down to \(*
+:max-size :prune-factor\) less than the max-size limit, so
+pruning doesn't need to happen on every save. Removes only
+entries without the :precious keys, so it may not be possible to
+reach the target limit.
+
+Returns the number of deleted entries."
+  (let ((size (registry-size db))
+	(target-size (- (oref db :max-size)
+			(* (oref db :max-size)
+			   (oref db :prune-factor))))
+	candidates)
+    (if (> size target-size)
+	(progn
+	  (setq candidates
+		(registry-collect-prune-candidates db (- size target-size)))
+	  (length (registry-delete db candidates nil)))
+      0)))
+
+(defmethod registry-collect-prune-candidates ((db registry-db) limit)
+  "Collects pruning candidates from the registry-db object DB.
+
+Proposes only entries without the :precious keys, and attempts to
+return LIMIT such candidates."
   (let* ((precious (oref db :precious))
 	 (precious-p (lambda (entry-key)
 		       (cdr (memq (car entry-key) precious))))
 	 (data (oref db :data))
-	 (limit (oref db :max-soft))
-	 (candidates (loop for k being the hash-keys of data
-			   using (hash-values v)
-			   when (notany precious-p v)
-			   collect k)))
-    (list limit candidates)))
-
-(defmethod registry-prune-hard-candidates ((db registry-db))
-  "Collects pruning candidates from the registry-db object THIS.
-Proposes any entries over the max-hard limit minus size * prune-factor."
-  (let* ((data (oref db :data))
-	 ;; prune to (size * prune-factor) below the max-hard limit so
-	 ;; we're not pruning all the time
-	 (limit (max 0 (- (oref db :max-hard)
-			  (* (registry-size db) (oref db :prune-factor)))))
-	 (candidates (loop for k being the hash-keys of data
-			   collect k)))
-    (list limit candidates)))
+	 (candidates (cl-loop for k being the hash-keys of data
+			      using (hash-values v)
+			      when (notany precious-p v)
+			      collect k)))
+    (delq nil (cl-subseq candidates 0 limit))))
 
 (provide 'registry)
 ;;; registry.el ends here
diff --git a/lisp/tests/gnustest-registry.el b/lisp/tests/gnustest-registry.el
index 174a0cb..5ca0cc0 100644
--- a/lisp/tests/gnustest-registry.el
+++ b/lisp/tests/gnustest-registry.el
@@ -52,20 +52,20 @@
     (should-not (registry--match :member entry '((hello)))))
   (message "Done with matching testing."))
 
-(defun gnustest-registry-make-testable-db (n &optional name file)
+(defun gnustest-registry-make-testable-db (n &optional prune-factor name file)
   (let* ((db (registry-db
               (or name "Testing")
               :file (or file "unused")
-              :max-hard n
-              :max-soft 0               ; keep nothing not precious
+              :max-size n
+	      :prune-factor (or prune-factor 0.1)
               :precious '(extra more-extra)
               :tracked '(sender subject groups))))
     (dotimes (i n)
       (registry-insert db i `((sender "me")
                               (subject "about you")
-                              (more-extra) ; empty data key should be pruned
-                              ;; first 5 entries will NOT have this extra data
-                              ,@(when (< 5 i) (list (list 'extra "more data")))
+                              (more-extra) ; Empty data key should be pruned.
+                              ;; First 5 entries will NOT have this extra data.
+                              ,@(when (< 4 i) (list (list 'extra "more data")))
                               (groups ,(number-to-string i)))))
     db))
 
@@ -101,22 +101,36 @@
     (should (= n (length (registry-search db :all t))))
     (message "Secondary search after delete")
     (should (= n (length (registry-lookup-secondary-value db 'sender "me"))))
-    ;; (message "Pruning")
-    ;; (let* ((tokeep (registry-search db :member '((extra "more data"))))
-    ;;        (count (- n (length tokeep)))
-    ;;        (pruned (registry-prune db))
-    ;;        (prune-count (length pruned)))
-    ;;   (message "Expecting to prune %d entries and pruned %d"
-    ;;            count prune-count)
-    ;;   (should (and (= count 5)
-    ;;                (= count prune-count))))
     (message "Done with usage testing.")))
 
+(ert-deftest gnustest-registry-pruning-test ()
+  "Check that precious entries are never pruned."
+  (let ((dbs (list
+	      ;; Can prune fully without touching precious entries.
+	      (gnustest-registry-make-testable-db 10 0.1)
+	      ;; Pruning limited by precious entries.
+	      (gnustest-registry-make-testable-db 10 0.6))))
+    (dolist (db dbs)
+      (message "Pruning")
+      (let* ((size (registry-size db))
+	     (limit (- (oref db :max-size)
+		       (* (oref db :max-size)
+			  (oref db :prune-factor))))
+	     (keepers (registry-search db :member '((extra "more data"))))
+	     (expected-prune-count (min (- size (length keepers))
+					(- size limit)))
+	     (actual-prune-count (registry-prune db)))
+	(ert-info
+	    ((format "Expected to prune %d entries but pruned %d"
+		     expected-prune-count actual-prune-count)
+	     :prefix "Error: ")
+	  (should (= expected-prune-count actual-prune-count)))))))
+
 (ert-deftest gnustest-registry-persistence-test ()
   (let* ((n 100)
          (tempfile (make-temp-file "registry-persistence-"))
          (name "persistence tester")
-         (db (gnustest-registry-make-testable-db n name tempfile))
+         (db (gnustest-registry-make-testable-db n nil name tempfile))
          size back)
     (message "Saving to %s" tempfile)
     (eieio-persistent-save db)
@@ -205,8 +219,8 @@
     (should (= (registry-size back) n))
     (should (= (registry-size back) (registry-size db)))
     (delete-file tempfile)
-    (message "Pruning Gnus registry to 0 by setting :max-soft")
-    (oset db :max-soft 0)
+    (message "Pruning Gnus registry to 0 by setting :max-size")
+    (oset db :max-size 0)
     (registry-prune db)
     (should (= (registry-size db) 0)))
   (message "Done with Gnus registry usage testing."))
diff --git a/texi/gnus.texi b/texi/gnus.texi
index 6f47786..082adad 100644
--- a/texi/gnus.texi
+++ b/texi/gnus.texi
@@ -25942,12 +25942,20 @@ the word ``archive'' is not followed.
 
 @defvar gnus-registry-max-entries
 The number (an integer or @code{nil} for unlimited) of entries the
-registry will keep.
+registry will keep.  If the registry has reached or exceeded this
+size, it will reject insertion of new entries.
 @end defvar
 
-@defvar gnus-registry-max-pruned-entries
-The maximum number (an integer or @code{nil} for unlimited) of entries
-the registry will keep after pruning.
+@defvar gnus-registry-prune-factor
+This option (a float between 0 and 1) controls how much the registry
+is cut back during pruning.  In order to prevent constant pruning, the
+registry will be pruned back to less than
+@code{gnus-registry-max-entries}.  This option controls exactly how
+much less: the target is calculated as the maximum number of entries
+minus the maximum number times this factor.  The default is 0.1:
+i.e. if your registry is limited to 50000 entries, pruning will try to
+cut back to 45000 entries.  Entries with keys marked as precious will
+not be pruned.
 @end defvar
 
 @defvar gnus-registry-cache-file
-- 
2.1.3


  parent reply	other threads:[~2014-11-13 12:05 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-24 19:04 Eric Abrahamsen
2014-10-24 20:56 ` Eric Abrahamsen
2014-10-25 19:59   ` Eric Abrahamsen
2014-10-27 15:03 ` Ted Zlatanov
2014-10-27 19:15   ` Eric Abrahamsen
2014-10-28 18:04     ` Eric Abrahamsen
2014-11-07 23:56       ` Eric Abrahamsen
2014-11-08  0:01         ` Eric Abrahamsen
2014-11-08  8:39           ` Eric Abrahamsen
2014-11-10 13:54             ` Ted Zlatanov
2014-11-11  2:55               ` Eric Abrahamsen
2014-11-13 12:05               ` Eric Abrahamsen [this message]
2014-11-16  1:04                 ` Dan Christensen
2014-11-16  3:24                   ` Eric Abrahamsen
2014-12-18 10:07                 ` Ted Zlatanov
2014-12-18 15:00                   ` Eric Abrahamsen
2014-12-18 15:09                     ` Eric Abrahamsen
2014-12-19  0:44                       ` Katsumi Yamaoka
2014-12-19  2:08                         ` Eric Abrahamsen
2014-12-20  3:09                         ` Ted Zlatanov
2014-12-20 11:22                           ` Katsumi Yamaoka
2014-12-20 13:53                             ` Older Emacsen (was: [PATCH] Two issues with the gnus-registry) Ted Zlatanov
2014-12-19  1:30                       ` [PATCH] Two issues with the gnus-registry Ted Zlatanov
2014-10-28 20:10     ` Ted Zlatanov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8761ej8fvx.fsf@ericabrahamsen.net \
    --to=eric@ericabrahamsen.net \
    --cc=ding@gnus.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).