Gnus development mailing list
From: Harry Putnam <reader@newsguy.com>
Subject: Re: searching multiple groups from the Group buffer
Date: 19 Feb 2001 17:11:24 -0800
Message-ID: <m3bss027dt.fsf@reader.newsguy.com>
In-Reply-To: <vafk86pnfi7.fsf@lucy.cs.uni-dortmund.de> (Kai.Grossjohann@CS.Uni-Dortmund.DE's message of "17 Feb 2001 22:53:04 +0100")


Colin W. asks:
>>	Is there a way I can search for a regexp in multiple groups?  For
>>	example, I archive my mail in nnfolder archives by month, like
>>	mail-2001-02, mail-2001-01, etc.


Kai Answers:
> For searching multiple groups, you can create an nnvirtual group that
> contains them all, then search normally.  Or you create an nnkiboze
> group which is similar to nnvirtual, but has the searching built-in.
 
> And then, there's always nnir.el which requires a search engine.
> Several people have volunteered to make it work with find/grep, but so
> far no go.  However, Glimpse is sufficiently simple to set up so it
> might do the trick for you.

Not sure about nnkiboze but the nnvirtual technique is the only one
mentioned here that really uses regexp.  From what I know, nnkiboze
isn't really working well either.

None of the search engines currently supported by nnir.el supports
real regexps, as far as I know.  (Is that correct, Kai?)  Glimpse comes
closest, but doesn't allow full regexps either.

FreeWAIS is really a pain to set up and get working on Linux.  I
finally got it working on my FreeBSD box but it was such a pain (as
Kai will remember, since I pestered him to death with it) I wouldn't
wish it on anybody.  It is very fast though.  It is based on regexps in
the *.fmt file but the command lines are not.  It does not seem to be
nearly as robust as the glimpse software.

Colin, 

I've written a small shell/awk script that I use a lot, but I kind of
stopped improving it once I got it working.  It is *NOT* integrated
into gnus or nnir as yet since I'm working on a more elaborate version
(sporadically).

To view

http://www.ptw.com/~reader/exp/search2.html

To download

http://www.ptw.com/~reader/exp/download/search2.tar.gz

It requires 3 regexps but has a default mode where you supply only one,
to find in the body of messages.  Two regexps are searched for in the
headers, one in the body.  In the default mode `From: ' and `Subject: '
are supplied by the program, and the user supplies the body regexp on
the command line with the `-b' flag, along with the directories to
search with the `-d' flag.

NOTE: WARNING The script includes what documentation there is in the
form of `here documents' and comments.  The script may be somewhat
unconventional since I'm not really very experienced as a shell/awk
programmer.

The script knows when it's in the headers and when it's not, and returns
the hits accordingly.  There must be 3 hits (both header regexps and the
body regexp) for anything to be returned.
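
To make the header/body idea concrete, here is a minimal sketch of how
such matching could be done in awk.  It is not taken from search2.sh:
the function name `search_msg', its argument order and the exact output
layout are made up for illustration, and it handles one file per call.
It treats everything up to the first blank line as headers and prints
nothing unless all three regexps hit.

```sh
# Hypothetical helper, NOT the real search2.sh.
# usage: search_msg HEADER_RE1 HEADER_RE2 BODY_RE FILE
search_msg() {
    # Note: awk -v interprets backslash escapes in the assigned values.
    awk -v h1="$1" -v h2="$2" -v body="$3" '
        FNR == 1            { inhdr = 1 }          # a message starts in its headers
        inhdr && /^$/       { inhdr = 0; next }    # first blank line ends the headers
        inhdr && $0 ~ h1    { hdr1 = $0 }          # remember header hit 1
        inhdr && $0 ~ h2    { hdr2 = $0 }          # remember header hit 2
        !inhdr && $0 ~ body { hits = hits FNR "|" $0 "\n" }   # line number + text
        END {
            # Report only when both header regexps and the body regexp matched.
            if (hdr1 != "" && hdr2 != "" && hits != "") {
                print FILENAME
                print hdr1
                print hdr2
                printf "%s", hits
            }
        }
    ' "$4"
}
```

Something like `search_msg '^Subject: ' '^From: ' 'marking.*M-&' FILE'
would then print the file name, the two header lines and each matching
body line prefixed with its line number.  Folded (continued) header
lines are not handled in this sketch.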

A command line would look like:

 $ search2.sh -b "marking.*M-&" -d ~/Mail/prinb

  ( This will find your message in my  incoming pre-view group and
    return the regexp line plus `From' and `Subject' headers)

 (output would look like)

	/home/reader/Mail/prinb/30271
	30271|Subject: searching multiple groups from the Group buffer
	30271|From: Colin Walters <walters@cis.ohio-state.edu>
	30271|50|I tried marking those groups, and doing 'M-& RET', which did visit
	-- 

The first section above shows the name of the file where the hit
occurred and the line numbers of the actual hits, for quick viewing with
command line tools like less or vi.  Had there been more instances of
the regexp in one file they would be listed too.

If more files contain the hits then they are listed in the same way
with a separating `-- '.
	
	Regular expressions used:
	Header1 = ^Subject: 
	Header2 = ^From: 
	Body    = marking.*M-&
	Searched:    60 files 4803 lines
	under directory /home/reader/Mail/prinb
	  Finish = Feb 17 16:16:47 2001
	  Start  = Feb 17 16:16:47 2001


A final `stats' output (above) is displayed that may be of interest or
helpful in adjusting the regexps.

I sometimes use a filter which writes copies of the files with hits to
a spool-style file that gnus can access with nndoc (or mutt with the
`-f' flag), if the search seems to merit it.

   http://www.ptw.com/~reader/exp/download/nnml-nntp2mbox.sh

A very crude way to bring the gnus tools into the fray.

I use the filter (nnml-nntp2mbox.sh) on the command line like:

   nnml-nntp2mbox.sh `search2.sh -b  "marking.*M-&" -d ~/Mail/prinb\
   |egrep '^/[^/]+/'` >FILE

Adjust the egrep regexp as needed.  This should generate a spool
(mbox) file that gnus can open with nndoc or mutt with `mutt -f FILE'.
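
For anyone wondering what such a filter has to do, here is a rough
sketch of the general idea.  It is not the contents of
nnml-nntp2mbox.sh: the function name `to_mbox' and the dummy separator
line are assumptions for illustration.  Each message file gets an mbox
`From ' separator line in front of it, lines in the message that start
with `From ' are escaped, and a blank line is written after it.

```sh
# Hypothetical sketch, NOT the real nnml-nntp2mbox.sh.
# usage: to_mbox FILE... >MBOXFILE
to_mbox() {
    for f in "$@"; do
        # Dummy separator line; a real filter would more likely build it
        # from the sender and date of the message.
        printf 'From MAILER-DAEMON Thu Jan  1 00:00:00 1970\n'
        # Escape lines that would be mistaken for a new message.
        sed 's/^From />From /' "$f"
        # Blank line between messages.
        printf '\n'
    done
}
```

This assumes the message files do not already begin with a `From '
line of their own; if they did, that line would get escaped too.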

NOTE -- about shabby filter 
No guarantee that the filter will handle all nntp messages.  Some that
are produced by list-to-news gateways have abnormal first lines.

I would like to get some feedback from an experienced person if you
are willing to try this homemade script out.  It should work OK on a
Linux box with `gawk' installed, but I'm not sure about other OSs with
different awks (nawk, mawk, tawk, etc.).


NOTE:  A slick awk technique (manipulating ARGV) supplied by Arnold
       Robbins (of awk fame) allows just about any number of
       files to be searched without resorting to xargs.
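
A minimal sketch of the general ARGV idea, as I understand it (this is
not lifted from search2.sh, and the function name `count_lines' is made
up for illustration): the BEGIN block reads file names from standard
input and appends each one to ARGV, so awk then opens them itself as
ordinary input files and nothing has to fit on a command line.  It
assumes /dev/stdin is available and that file names contain no
newlines.

```sh
# Hypothetical illustration of the ARGV technique.
# usage: count_lines DIRECTORY
count_lines() {
    find "$1" -type f | awk '
        BEGIN {
            # Queue every file name read from stdin as an awk input file.
            while ((getline name < "/dev/stdin") > 0)
                ARGV[ARGC++] = name
        }
        { lines++ }     # runs for each line of each queued file
        END { printf "%d files %d lines\n", ARGC - 1, lines }
    '
}
```

One caveat: awk treats an argument of the form `var=value' as a
variable assignment rather than a file, so a relative file name that
looks like that would not be read.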

That does not apply to the `filter' which passes the files on the
command line via "$@"  which can run into the `too many args' problem
but only if there are really a lot of hits.

Please try this out and give feedback if you have time.  At some
point after I get a newer, better search.sh written I want to integrate
this into nnir as one of the engines.






