Gnus development mailing list
From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: ding@gnus.org
Subject: Re: [PATCH] Two issues with the gnus-registry
Date: Mon, 27 Oct 2014 12:15:24 -0700
Message-ID: <87zjchxr1v.fsf@ericabrahamsen.net>
In-Reply-To: <87wq7lsggo.fsf@lifelogs.com>

Ted Zlatanov <tzz@lifelogs.com> writes:

> On Fri, 24 Oct 2014 12:04:25 -0700 Eric Abrahamsen <eric@ericabrahamsen.net> wrote: 
>
> EA> I'm using the gnus-registry in Gnorb to keep track of correspondences
> EA> between Gnus messages and Org headings, using a key on the registry
> EA> entries called 'gnorb-ids. This is meant to be a "precious" key, ie
> EA> entries with this key are not pruned. So far so good:
>
> EA> (oref gnus-registry-db :precious) => (gnorb-ids mark)
>
> EA> But the entries still do get pruned! I've just had tracked entries start
> EA> disappearing on me, and realized that they're getting pruned from the
> EA> registry when I save it. There appear to be two problems here:
>
> EA> 1. The pruning doesn't start with the oldest entries first. The
> EA> docstring of `registry-prune' says it will, but looking over the code in
> EA> both registry.el and gnus-registry.el, I don't see how that would
> EA> happen. The entries aren't sorted as they're entered, nor does
> EA> `gnus-registry-save' pass a sortfun in to `registry-save'. In fact, the
> EA> entries I'm losing are the most recently-created ones.
>
> Huh.  I could've sworn this worked.  It must be my mistake.
>
> EA> 2. The bigger problem is that "precious" entries still seem to get
> EA> pruned. This is really difficult to edebug, because of the enormous
> EA> loops involved, and because any time the cursor passes the "db" variable
> EA> and tries to pretty-print it, Emacs runs out of memory (my registry has
> EA> about 18,000 entries in it). But something is still wrong here!
>
> Yeah, keeping the whole database in a big data structure is pretty bad.
> Plus I wrote that code in my "loop-happy" period.  I'd like to use
> something better to store these records.

Better than a hash table? Dunno what that would be. But I suppose
precious entries could go in one table, and non-precious in another... I
don't think the "loop" macro itself is such a problem, but stepping
through the main prune or save functions does get cumbersome with large
data sets.

Eieio is supposed to make edebug use the object-print method when
displaying objects, but the line that would do that (eieio.el:895) is
currently commented out; I don't know why. If that were working,
registries could have their own object-print method, which would make
edebug much more usable.
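
As a rough illustration of the per-class printing idea, here is a
minimal sketch. It uses `cl-print-object', the generic function that
newer Emacs versions provide for this purpose, rather than the
object-print method mentioned above. The `demo-registry' class and its
`data' slot are hypothetical stand-ins, not the real registry class.

```elisp
(require 'cl-lib)
(require 'cl-print)
(require 'eieio)

;; Hypothetical stand-in for a registry object: one slot holding
;; a (potentially huge) hash table of entries.
(defclass demo-registry ()
  ((data :initarg :data
         :documentation "Hash table mapping ids to entries."))
  "Toy registry used only to illustrate a custom print method.")

;; Print a one-line summary instead of the whole table.
(cl-defmethod cl-print-object ((db demo-registry) stream)
  (princ (format "#<demo-registry %d entries>"
                 (hash-table-count (oref db data)))
         stream))
```

With this in place, `(cl-prin1-to-string db)' on an instance returns
the short summary string rather than dumping every entry.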

> EA> I added a sortfun to the `gnus-registry-save' call, but obviously it
> EA> slows the save process *way* the heck down. I'm not sure what else to
> EA> do, though, since (as far as I know) hash tables aren't guaranteed to
> EA> keep their sort order. Is that correct?
>
> Right.  But you don't need to sort the whole database, only the
> non-precious entries.  So it should be: (prune (sort (select-non-precious)))
> which in theory should be a fairly constant number (as we keep pruning it).

Right, makes sense. I started on a patch for the pruning process, but
realized I was a bit confused by the interaction between max-entries/:max-hard
and max-pruned-entries/:max-soft. Just to make sure I get it:

The :max-hard limit should *only* come into play when adding new
entries: they'll be rejected if the registry is already full.

The :max-soft limit should *only* come into play when pruning: we do our
best to prune down to :max-soft by deleting non-precious entries. If all
entries are precious, we accept a too-large registry.

Right now, the :prune-factor is used in conjunction with :max-hard. If
the above is correct, that's backwards: it should be used when pruning
with :max-soft.

Is all that correct?
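
To make that reading concrete, here is a minimal sketch of the
semantics as I understand them, combined with the "sort only the
non-precious entries" idea. Everything in it is an assumption for
illustration: the `demo-' functions are hypothetical, entries are plain
alists, and creation-time is a bare number. It is not the code in
registry.el, and it leaves :prune-factor out entirely.

```elisp
(require 'cl-lib)

(defun demo-precious-p (entry precious)
  "Return non-nil if ENTRY (an alist) has any key listed in PRECIOUS."
  (cl-some (lambda (key) (assq key entry)) precious))

(defun demo-insert (table id entry max-hard)
  "Store ENTRY under ID unless TABLE already holds MAX-HARD entries.
Return non-nil if the entry was stored."
  (when (< (hash-table-count table) max-hard)
    (puthash id entry table)
    t))

(defun demo-prune (table precious max-soft)
  "Delete the oldest non-precious entries until TABLE has MAX-SOFT or fewer.
Precious entries are never deleted, so TABLE may stay above MAX-SOFT.
Return the list of deleted ids, oldest first."
  (let ((excess (- (hash-table-count table) max-soft))
        candidates)
    (when (> excess 0)
      ;; Select only the non-precious entries as (ID . TIME) pairs.
      (maphash (lambda (id entry)
                 (unless (demo-precious-p entry precious)
                   (push (cons id (cdr (assq 'creation-time entry)))
                         candidates)))
               table)
      ;; Sort just those candidates, oldest first.
      (setq candidates
            (sort candidates (lambda (a b) (< (cdr a) (cdr b)))))
      ;; Delete at most EXCESS of them.
      (let ((doomed (mapcar #'car
                            (cl-subseq candidates 0
                                       (min excess (length candidates))))))
        (dolist (id doomed)
          (remhash id table))
        doomed))))
```

The hard limit is checked only in `demo-insert' and the soft limit only
in `demo-prune', so a registry full of precious entries simply stays
larger than the soft limit.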

> EA> And I really don't know why the precious entries are getting pruned.
> EA> In a day or so, when I have more time, I'll test with a dummy registry,
> EA> with max-entries set to 10 and 'creation-time added to the precious
> EA> entries -- should make debugging easier.
>
> The registry has ERT tests, which I thought covered this case.  Can you
> look at `tests/gnustest-registry.el'?  As a first step, can you try
> making tests to demonstrate the problems?

Will do.

> EA> Given the commentary in registry.el, I wonder if the whole pruning
> EA> arrangement actually never quite got finished:
>
> EA> ;; The user decides which fields are "precious", F2 for example.  At
> EA> ;; PRUNE TIME (when the :prune-function is called), the registry will
> EA> ;; trim any entries without the F2 field until the size is :max-soft
> EA> ;; or less.  No entries with the F2 field will be removed at PRUNE
> EA> ;; TIME.
>
> EA> It looks like there's supposed to be a :prune-function slot on registry
> EA> objects, and presumably the Gnus registry would have a prune function
> EA> that worked more like what the comments above outline.
>
> EA> I'd be happy to help work on this, with a little direction...
>
> Assume it's broken.  And feel free to propose or make bigger changes as
> you see fit; there's little in Emacs to support this kind of database so
> I wrote a lot of it from scratch.  Sorry for the trouble.

Not at all! It's worked great so far, and is impressive for something
done from scratch. My guess is that not many people have needed the
precious functionality, so this hasn't come up before. It's ideal
for use in Gnorb (tracking correspondences between messages and Org
headings), but I *really* need precious entries to hang around.

I also looked into subclassing the basic registry, but so much necessary
functionality (tracking message movement and summary-buffer scanning)
lives only in gnus-registry that it wasn't very realistic. At some point
(later) it might be worth considering making the gnus-registry an actual
subclass of the basic registry, and turning the various action
functions, etc, into methods.

That's getting ahead of things -- tests first.

Thanks,
Eric



