Gnus development mailing list
 help / color / mirror / Atom feed
From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: ding@gnus.org
Subject: Re: [PATCH] Two issues with the gnus-registry
Date: Sat, 08 Nov 2014 07:56:58 +0800	[thread overview]
Message-ID: <87a942lg39.fsf@ericabrahamsen.net> (raw)
In-Reply-To: <87h9yogjer.fsf@ericabrahamsen.net>

[-- Attachment #1: Type: text/plain, Size: 1572 bytes --]

Eric Abrahamsen <eric@ericabrahamsen.net> writes:

> Eric Abrahamsen <eric@ericabrahamsen.net> writes:
>
>> Ted Zlatanov <tzz@lifelogs.com> writes:
>
> [...]
>
>>> The registry has ERT tests, which I thought covered this case.  Can you
>>> look at `tests/gnustest-registry.el'?  As a first step, can you try
>>> making tests to demonstrate the problems?
>>
>> Will do.
>
> Okay, turns out there were tests, but the bit that tests pruning was
> commented out :)
>
> This is the first in a (gradual) series of patches. It does very little
> except:
>
> 1. Adjust `registry-prune' to do what the docstring says: return the
> total number of entries pruned
>
> 2. Uncomment the pruning test and adjust it so it correctly catches the
> number of pruned entries.
>
> Test should fail. Over the next couple of days I'll add another test to
> check that entries are sorted correctly before pruning, and then take a
> stab at fixing the pruning itself.

Okay I realized it didn't make too much sense to patch gnus just to make
it fail, so here's a complete patch, with passing tests, that preserves
precious entries when pruning. I haven't done the sorting yet, that will
be a separate patch.

My main realization when working on this is that the distinction between
max-soft and max-hard is confusing, and possibly unnecessary. Do we
really need both? If, as I suspect, most users set one but not the
other, then there's no point in having both. Is anyone really setting
two different values for `gnus-registry-max-entries' and
`gnus-registry-max-pruned-entries'? And why?

Eric


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Preserve-precious-entries-when-pruning-registry.patch --]
[-- Type: text/x-diff, Size: 11201 bytes --]

From d19b46f0e3afdf6f2c8a30ab083b93d7d072718e Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen <eric@ericabrahamsen.net>
Date: Fri, 7 Nov 2014 19:36:57 +0800
Subject: [PATCH] Preserve precious entries when pruning registry

* lisp/registry.el (registry-prune): Rewrite function to only make a
  single pruning pass over the registry.
  (registry-collect-prune-candidates): New function formerly
  known as `registry-prune-soft-candidates'.
  (registry-prune-hard-candidates): Delete function.
* lisp/tests/gnustest-registry.el (gnustest-registry-pruning-test):
  Create independent ert test for pruning.
  (gnustest-registry-make-testable-db): Allow specifying the :max-soft
  slot.
* texi/gnus.texi (Gnus Registry Setup): Documentation clarification
* lisp/gnus-registry.el (gnus-registry-max-pruned-entries): Reword
  docstring.

The pruning process was formerly not preserving precious entries.
Sorting of entries to be pruned is still not implemented.
---
 lisp/gnus-registry.el           |  3 +-
 lisp/registry.el                | 98 +++++++++++++++++------------------------
 lisp/tests/gnustest-registry.el | 41 +++++++++++------
 texi/gnus.texi                  | 13 ++++--
 4 files changed, 80 insertions(+), 75 deletions(-)

diff --git a/lisp/gnus-registry.el b/lisp/gnus-registry.el
index f3b81f7..847ceb3 100644
--- a/lisp/gnus-registry.el
+++ b/lisp/gnus-registry.el
@@ -243,7 +243,8 @@ the Bit Bucket."
                 (integer :format "Maximum number: %v")))
 
 (defcustom gnus-registry-max-pruned-entries nil
-  "Maximum number of pruned entries in the registry, nil for unlimited."
+  "Number of entries that will be retained in the registry after
+pruning, nil for unlimited."
   :version "24.1"
   :group 'gnus-registry
   :type '(radio (const :format "Unlimited " nil)
diff --git a/lisp/registry.el b/lisp/registry.el
index dbc7b51..d9fb0f5 100644
--- a/lisp/registry.el
+++ b/lisp/registry.el
@@ -57,14 +57,13 @@
 ;; Note that whether a field has one or many pieces of data, the data
 ;; is always a list of values.
 
-;; The user decides which fields are "precious", F2 for example.  At
-;; PRUNE TIME (when the :prune-function is called), the registry will
-;; trim any entries without the F2 field until the size is :max-soft
-;; or less.  No entries with the F2 field will be removed at PRUNE
-;; TIME.
-
-;; When an entry is inserted, the registry will reject new entries
-;; if they bring it over the max-hard limit, even if they have the F2
+;; The user decides which fields are "precious", F2 for example. At
+;; PRUNE TIME, the registry will attempt to trim entries until the
+;; size is (- :max-soft (* registry-size :prune-factor). No entries
+;; with the F2 field will be removed at PRUNE TIME.
+
+;; When an entry is inserted, the registry will reject new entries if
+;; they bring it over the :max-hard limit, even if they have the F2
 ;; field.
 
 ;; The user decides which fields are "tracked", F1 for example.  Any
@@ -115,7 +114,8 @@
     :initform 0.1
     :type float
     :custom float
-    :documentation "At the max-hard limit, prune size * this entries.")
+    :documentation "Prune to \(size * :prune-factor\) less than
+    the :max-soft limit.")
    (tracked :initarg :tracked
             :initform nil
             :type t
@@ -312,58 +312,42 @@ Errors out if the key exists already."
 	       (registry-lookup-secondary-value db tr val value-keys))))
 	 (oref db :data))))))
 
-(defmethod registry-prune ((db registry-db) &optional sortfun)
-  "Prunes the registry-db object THIS.
-Removes only entries without the :precious keys if it can,
-then removes oldest entries first.
-Returns the number of deleted entries.
-If SORTFUN is given, tries to keep entries that sort *higher*.
-SORTFUN is passed only the two keys so it must look them up directly."
-  (dolist (collector '(registry-prune-soft-candidates
-		       registry-prune-hard-candidates))
-    (let* ((size (registry-size db))
-	   (collected (funcall collector db))
-	   (limit (nth 0 collected))
-	   (candidates (nth 1 collected))
-	   ;; sort the candidates if SORTFUN was given
-	   (candidates (if sortfun (sort candidates sortfun) candidates))
-	   (candidates-count (length candidates))
-	   ;; are we over max-soft?
-	   (prune-needed (> size limit)))
-
-      ;; while we have more candidates than we need to remove...
-      (while (and (> candidates-count (- size limit)) candidates)
-	(decf candidates-count)
-	(setq candidates (cdr candidates)))
-
-      (registry-delete db candidates nil)
-      (length candidates))))
-
-(defmethod registry-prune-soft-candidates ((db registry-db))
-  "Collects pruning candidates from the registry-db object THIS.
-Proposes only entries without the :precious keys."
+(defmethod registry-prune ((db registry-db))
+  "Prunes the registry-db object DB.
+
+Attempts to prune the number of entries down to \(*
+registry-size :prune-factor\) less than the max-soft limit, so
+pruning doesn't need to happen on every save. Removes only
+entries without the :precious keys, so it may not be possible to
+reach the target limit.
+
+Returns the number of deleted entries."
+  (let* ((size (registry-size db))
+	 (limit (max 0 (- (min (oref db :max-hard)
+			       (oref db :max-soft))
+			  (* size (oref db :prune-factor)))))
+	 candidates)
+    (if (> limit 0)
+	(progn
+	  (setq candidates
+		(registry-collect-prune-candidates db (- size limit)))
+	  (length (registry-delete db candidates nil)))
+      0)))
+
+(defmethod registry-collect-prune-candidates ((db registry-db) limit)
+  "Collects pruning candidates from the registry-db object DB.
+
+Proposes only entries without the :precious keys, and attempts to
+return LIMIT such candidates."
   (let* ((precious (oref db :precious))
 	 (precious-p (lambda (entry-key)
 		       (cdr (memq (car entry-key) precious))))
 	 (data (oref db :data))
-	 (limit (oref db :max-soft))
-	 (candidates (loop for k being the hash-keys of data
-			   using (hash-values v)
-			   when (notany precious-p v)
-			   collect k)))
-    (list limit candidates)))
-
-(defmethod registry-prune-hard-candidates ((db registry-db))
-  "Collects pruning candidates from the registry-db object THIS.
-Proposes any entries over the max-hard limit minus size * prune-factor."
-  (let* ((data (oref db :data))
-	 ;; prune to (size * prune-factor) below the max-hard limit so
-	 ;; we're not pruning all the time
-	 (limit (max 0 (- (oref db :max-hard)
-			  (* (registry-size db) (oref db :prune-factor)))))
-	 (candidates (loop for k being the hash-keys of data
-			   collect k)))
-    (list limit candidates)))
+	 (candidates (cl-loop for k being the hash-keys of data
+			      using (hash-values v)
+			      when (notany precious-p v)
+			      collect k)))
+    (delq nil (cl-subseq candidates 0 limit))))
 
 (provide 'registry)
 ;;; registry.el ends here
diff --git a/lisp/tests/gnustest-registry.el b/lisp/tests/gnustest-registry.el
index 174a0cb..d7fac59 100644
--- a/lisp/tests/gnustest-registry.el
+++ b/lisp/tests/gnustest-registry.el
@@ -52,20 +52,20 @@
     (should-not (registry--match :member entry '((hello)))))
   (message "Done with matching testing."))
 
-(defun gnustest-registry-make-testable-db (n &optional name file)
+(defun gnustest-registry-make-testable-db (n &optional max-soft name file)
   (let* ((db (registry-db
               (or name "Testing")
               :file (or file "unused")
               :max-hard n
-              :max-soft 0               ; keep nothing not precious
+              :max-soft (or max-soft 0)
               :precious '(extra more-extra)
               :tracked '(sender subject groups))))
     (dotimes (i n)
       (registry-insert db i `((sender "me")
                               (subject "about you")
-                              (more-extra) ; empty data key should be pruned
-                              ;; first 5 entries will NOT have this extra data
-                              ,@(when (< 5 i) (list (list 'extra "more data")))
+                              (more-extra) ; Empty data key should be pruned.
+                              ;; First 5 entries will NOT have this extra data.
+                              ,@(when (< 4 i) (list (list 'extra "more data")))
                               (groups ,(number-to-string i)))))
     db))
 
@@ -101,17 +101,30 @@
     (should (= n (length (registry-search db :all t))))
     (message "Secondary search after delete")
     (should (= n (length (registry-lookup-secondary-value db 'sender "me"))))
-    ;; (message "Pruning")
-    ;; (let* ((tokeep (registry-search db :member '((extra "more data"))))
-    ;;        (count (- n (length tokeep)))
-    ;;        (pruned (registry-prune db))
-    ;;        (prune-count (length pruned)))
-    ;;   (message "Expecting to prune %d entries and pruned %d"
-    ;;            count prune-count)
-    ;;   (should (and (= count 5)
-    ;;                (= count prune-count))))
     (message "Done with usage testing.")))
 
+(ert-deftest gnustest-registry-pruning-test ()
+  "Check that precious entries are never pruned."
+  (let ((dbs (list
+	      ;; Can prune fully without touching precious entries.
+	      (gnustest-registry-make-testable-db 10 8)
+	      ;; Pruning limited by precious entries.
+	      (gnustest-registry-make-testable-db 10 4))))
+    (dolist (db dbs)
+      (message "Pruning")
+      (let* ((size (registry-size db))
+	     (max-soft (oref db :max-soft))
+	     (keepers (registry-search db :member '((extra "more data"))))
+	     (desired-prune-count (- size (- max-soft (* size 0.1))))
+	     (expected-prune-count (min (- size (length keepers))
+					desired-prune-count))
+	     (actual-prune-count (registry-prune db)))
+	(ert-info
+	    ((format "Expected to prune %d entries but pruned %d"
+		     expected-prune-count actual-prune-count)
+	     :prefix "Error: ")
+	  (should (= expected-prune-count actual-prune-count)))))))
+
 (ert-deftest gnustest-registry-persistence-test ()
   (let* ((n 100)
          (tempfile (make-temp-file "registry-persistence-"))
diff --git a/texi/gnus.texi b/texi/gnus.texi
index 6f47786..6b687f7 100644
--- a/texi/gnus.texi
+++ b/texi/gnus.texi
@@ -25942,12 +25942,19 @@ the word ``archive'' is not followed.
 
 @defvar gnus-registry-max-entries
 The number (an integer or @code{nil} for unlimited) of entries the
-registry will keep.
+registry will keep.  If the registry has reached or exceeded this
+size, it will reject insertion of new entries.
 @end defvar
 
 @defvar gnus-registry-max-pruned-entries
-The maximum number (an integer or @code{nil} for unlimited) of entries
-the registry will keep after pruning.
+The number (an integer or @code{nil} for unlimited) of entries the
+registry will be pruned to when saving.  In order to prevent constant
+pruning, the registry will be pruned back to somewhat less than this
+number.  Entries with keys marked as precious will not be pruned. It
+may make more sense to control the size of the registry with this
+variable rather than @code{gnus-registry-max-entries}, as it won't
+intefere with the creation or preservation of entries with precious
+keys.
 @end defvar
 
 @defvar gnus-registry-cache-file
-- 
2.1.3


  reply	other threads:[~2014-11-07 23:56 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-24 19:04 Eric Abrahamsen
2014-10-24 20:56 ` Eric Abrahamsen
2014-10-25 19:59   ` Eric Abrahamsen
2014-10-27 15:03 ` Ted Zlatanov
2014-10-27 19:15   ` Eric Abrahamsen
2014-10-28 18:04     ` Eric Abrahamsen
2014-11-07 23:56       ` Eric Abrahamsen [this message]
2014-11-08  0:01         ` Eric Abrahamsen
2014-11-08  8:39           ` Eric Abrahamsen
2014-11-10 13:54             ` Ted Zlatanov
2014-11-11  2:55               ` Eric Abrahamsen
2014-11-13 12:05               ` Eric Abrahamsen
2014-11-16  1:04                 ` Dan Christensen
2014-11-16  3:24                   ` Eric Abrahamsen
2014-12-18 10:07                 ` Ted Zlatanov
2014-12-18 15:00                   ` Eric Abrahamsen
2014-12-18 15:09                     ` Eric Abrahamsen
2014-12-19  0:44                       ` Katsumi Yamaoka
2014-12-19  2:08                         ` Eric Abrahamsen
2014-12-20  3:09                         ` Ted Zlatanov
2014-12-20 11:22                           ` Katsumi Yamaoka
2014-12-20 13:53                             ` Older Emacsen (was: [PATCH] Two issues with the gnus-registry) Ted Zlatanov
2014-12-19  1:30                       ` [PATCH] Two issues with the gnus-registry Ted Zlatanov
2014-10-28 20:10     ` Ted Zlatanov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a942lg39.fsf@ericabrahamsen.net \
    --to=eric@ericabrahamsen.net \
    --cc=ding@gnus.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).