Gnus development mailing list
From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: "Jose A. Ortega Ruiz" <jao@gnu.org>
Cc: Eric S Fraga <e.fraga@ucl.ac.uk>,  ding@gnus.org
Subject: Re: New "gnus-search" syntax and interface
Date: Wed, 18 Nov 2020 12:40:05 -0800
Message-ID: <87a6vez9ei.fsf@ericabrahamsen.net>
In-Reply-To: <87lfezwkxt.fsf@gnus.jao.io> (Jose A. Ortega Ruiz's message of "Wed, 18 Nov 2020 00:46:54 +0000")

"Jose A. Ortega Ruiz" <jao@gnu.org> writes:

> On Tue, Nov 17 2020, Eric Abrahamsen wrote:
>
>> Eric S Fraga <e.fraga@ucl.ac.uk> writes:
>>
>>> On Monday, 16 Nov 2020 at 10:47, Eric Abrahamsen wrote:
>>>> Well that's pretty weird. I didn't think I changed any of the actual
>>>> parsing behavior, only reorganized code. 
>>>
>>> With these complex systems, the whole concept of emergent behaviour
>>> raises its head! ;-)
>>>
>>> Happy to test out anything you change etc., of course.
>>
>> Thanks very much. This is still baffling me. If you (and/or Jose, or
>> anyone else) have a moment, would you please send me the command-line
>> output of a successful notmuch search (ie absolute filenames) which
>> fails in Emacs master right now? Feel free to anonymize however
>> necessary, though obviously leaving the group-name part of the filepath
>> alone. Plus whatever notmuch-related config you've got...
>
> If I perform a gnus-search for "baffled" in this group, which Gnus
> calls "nntp:gmane.emacs.gnus.general", the buffer where
> gnus-search-indexed-parse-output runs has the following output from
> notmuch (which correctly inserts real path names):
>
>    /home/jao/var/mail/gmane/emacs/gnus/general/65
>    /home/jao/var/mail/feeds/news/cur/1605280382.M861830P724182.osgiliath,S=3292,W=3331:2,Sa
>    /home/jao/var/mail/gwene/org/arxiv/computer/science/1102
>
> where /home/jao/var/mail is my remove-prefix.  
>
> That method (gnus-search.el:1364) is receiving as its 'groups' argument
> the value '("gmane.emacs.gnus.general").  The method constructs a regexp
> that it calls group-regexp using that value, and that regexp is wrong.
>
> That is because it uses as its value:
>
>     (when groups
>       (regexp-opt (mapcar (lambda (x) (gnus-group-real-name x)) groups)))
>
> which, for groups being '("gmane.emacs.gnus.general"), evaluates to:
>
>     "\\(?:gmane\\.emacs\\.gnus\\.general\\)"
>
> That is wrong because the method then tries to match in the results
> buffer, using that regexp, things like:
>
>    /home/jao/var/mail/gmane.emacs.gnus.general/65
>
> instead of correct pathnames such as:
>
>    /home/jao/var/mail/gmane/emacs/gnus/general/65
>
> The end result is that, despite notmuch finding correct paths and
> inserting them correctly in the temporary results buffer,
> gnus-search-indexed-parse-output fails to recognise them when it is
> filtering those belonging to the searched for groups.  And this is
> because it's constructing a wrong regular expression in
> gnus-search.el:1367, and using it in gnus-search.el:1381.
>
> The workaround my advice implements is changing the value of the input
> parameter groups from '("foo.bar") to '("foo/bar").
>
> I'm most probably belabouring here many things obvious to you, sorry
> about that :)

No, that's perfect, I'm trying to make completely sure I understand
what's going on here. The original nnir-run-* functions all did pretty
much the same thing but with slightly different code, and in combining
all those functions I didn't get everything exactly in place.

The original notmuch code used the GROUPS parameter to build a "--path"
option for the notmuch call, but even if I reinstate that, we'll still
want this filter here for the other engines.

Would you try this version of the function? It should just permit any
path separator when checking for group name matches.
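
A minimal sketch of the separator-tolerant match, runnable on its own in
plain Emacs. The helper name `my/group-path-regexp` is hypothetical and
takes real group names directly, so it assumes `gnus-group-real-name` has
already been applied; it is an illustration, not the code in
gnus-search.el.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/group-path-regexp (group-names)
  "Return a regexp matching any of GROUP-NAMES.
Each dot, backslash or slash in a name is allowed to match any of
the three, so a dotted group name also matches a slash-separated path."
  (mapconcat
   (lambda (name)
     ;; The replacement is not literal, so "\\\\" yields one backslash
     ;; and each separator becomes the character class [.\/].
     (replace-regexp-in-string "[.\\/]" "[.\\\\/]" name))
   group-names "\\|"))

(let ((path "/home/jao/var/mail/gmane/emacs/gnus/general/65")
      (groups '("gmane.emacs.gnus.general")))
  (list
   ;; regexp-opt quotes the dots, so the slash-separated path fails.
   (string-match-p (regexp-opt groups) path)
   ;; The separator-tolerant regexp finds the group inside the path.
   (string-match-p (my/group-path-regexp groups) path)))
```

The first element of the result is nil and the second is non-nil. Note
that in this sketch the names are not passed through `regexp-quote`, so
any other regexp-special character in a group name (a `+`, say) would be
read as regexp syntax.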

Thanks,
Eric


(cl-defmethod gnus-search-indexed-parse-output ((engine gnus-search-indexed)
						server query &optional groups)
  (let ((prefix (slot-value engine 'remove-prefix))
	(group-regexp (when groups
			(mapconcat
			 (lambda (x)
			   (replace-regexp-in-string
			    ;; Accept any of [.\/] as path separators.
			    "[.\\/]" "[.\\\\/]"
			    (gnus-group-real-name x)))
			 groups "\\|")))
	artlist vectors article group)
    (goto-char (point-min))
    (while (not (eobp))
      (pcase-let ((`(,f-name ,score) (gnus-search-indexed-extract engine)))
	(when (and (file-readable-p f-name)
		   (null (file-directory-p f-name))
		   (or (null groups)
		       (and (gnus-search-single-p query)
			    (alist-get 'thread query))
		       (string-match-p group-regexp f-name)))
	  (push (list f-name score) artlist))))
    ;; Are we running an additional grep query?
    (when-let ((grep-reg (alist-get 'grep query)))
      (setq artlist (gnus-search-grep-search engine artlist grep-reg)))
    ;; Turn (file-name score) into [group article score].
    (pcase-dolist (`(,f-name ,score) artlist)
      (setq article (file-name-nondirectory f-name))
      ;; Remove prefix.
      (when (and prefix
		 (file-name-absolute-p prefix)
		 (string-match (concat "^"
				       (file-name-as-directory prefix))
			       f-name))
	(setq group (replace-match "" t t (file-name-directory f-name))))
      ;; Break the directory name down until it's something that
      ;; (probably) can be used as a group name.
      (setq group
	    (replace-regexp-in-string
	     "[/\\]" "."
	     (replace-regexp-in-string
	      "/?\\(cur\\|new\\|tmp\\)?/\\'" ""
	      (replace-regexp-in-string
	       "^[./\\]" ""
	       group nil t)
	      nil t)
	     nil t))

      (push (vector (gnus-group-full-name group server)
		    (if (string-match-p "\\`[[:digit:]]+\\'" article)
			(string-to-number article)
		      (nnmaildir-base-name-to-article-number
		       (substring article 0 (string-match ":" article))
		       group nil))
		    (if (numberp score)
			score
		      (string-to-number score)))
	    vectors))
    vectors))
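
The second half of the function, turning a file name back into a group
name, can be tried in isolation with a sketch like the following. The
helper name `my/path-to-group-name` is hypothetical; it reuses the three
regexps from the function above but strips the prefix with
`string-prefix-p` rather than `string-match`, and it assumes Unix-style
absolute paths.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/path-to-group-name (f-name prefix)
  "Return a dotted group name for F-NAME, a file below directory PREFIX."
  (let ((dir (file-name-directory f-name))
        (prefix-dir (file-name-as-directory prefix)))
    ;; Remove the prefix, if the file really lives below it.
    (when (string-prefix-p prefix-dir dir)
      (setq dir (substring dir (length prefix-dir))))
    ;; Innermost first: drop a leading separator, then a trailing
    ;; slash (with a maildir cur/new/tmp component if present), then
    ;; turn the remaining separators into dots.
    (replace-regexp-in-string
     "[/\\]" "."
     (replace-regexp-in-string
      "/?\\(cur\\|new\\|tmp\\)?/\\'" ""
      (replace-regexp-in-string "^[./\\]" "" dir nil t)
      nil t)
     nil t)))

(my/path-to-group-name
 "/home/jao/var/mail/gmane/emacs/gnus/general/65"
 "/home/jao/var/mail")
;; => "gmane.emacs.gnus.general"
```

With the maildir path from the quoted notmuch output, the same call
drops the trailing cur/ component and gives "feeds.news".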


