Announcements and discussions for Gnus, the GNU Emacs Usenet newsreader
* nnml article filenames
@ 2005-10-28 12:05 Pranav K. Tiwari
  2005-10-28 13:30 ` Steve Youngs
  0 siblings, 1 reply; 6+ messages in thread
From: Pranav K. Tiwari @ 2005-10-28 12:05 UTC (permalink / raw)




To allow desktop search programs to go through nnml articles, I would
like to give the article files an extension like .xyz, and tell these
programs to treat such files as email.

To do that, I am trying to see whether nnml can store articles as 1.xyz,
2.xyz etc. rather than 1, 2, ..

I have been able to get it to read these articles appropriately, by
modifying 'nnml-article-to-file (article)' so that it looks for a
file 1.xyz for article 1. But this function is not called when messages
are saved into the group.

How can I change the names of files in which the articles are stored?

thx,
-- 
Pranav Tiwari.



* Re: nnml article filenames
  2005-10-28 12:05 nnml article filenames Pranav K. Tiwari
@ 2005-10-28 13:30 ` Steve Youngs
  2005-10-31  5:29   ` Pranav K. Tiwari
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Youngs @ 2005-10-28 13:30 UTC (permalink / raw)


* Pranav K Tiwari <jpranav@cisco.com> writes:

  > To allow desktop search programs to go through nnml articles, I would
  > like to give an extension like .xyz, and tell these programs to
  > treat these files like email.

I think this is the wrong approach.  Instead of modifying the
filenames to suit the search program, find a way to make the search
program work properly.

It's really not that difficult, see...

   $ find <nnmldir> -type f -regex '^.*[0-9]+$'

-- 
|---<Steve Youngs>---------------<GnuPG KeyID: A94B3003>---|
|                   Te audire no possum.                   |
|             Musa sapientum fixa est in aure.             |
|----------------------------------<steve@youngs.au.com>---|



* Re: nnml article filenames
  2005-10-28 13:30 ` Steve Youngs
@ 2005-10-31  5:29   ` Pranav K. Tiwari
  2005-10-31  7:30     ` Steve Youngs
  0 siblings, 1 reply; 6+ messages in thread
From: Pranav K. Tiwari @ 2005-10-31  5:29 UTC (permalink / raw)


Steve Youngs <steve@youngs.au.com> writes:

> * Pranav K Tiwari <jpranav@cisco.com> writes:
>
>   > To allow desktop search programs to go through nnml articles, I would
>   > like to give an extension like .xyz, and tell these programs to
>   > treat these files like email.
>
> I think this is the wrong approach.  Instead of modifying the
> filenames to suit the search program, find a way to make the search
> program work properly.
>
> It's really not that difficult, see...
>
>    $ find <nnmldir> -type f -regex '^.*[0-9]+$'
>

The question is not about 'finding' these files, but about associating a
'type' with the file. Most indexing programs (google/yahoo/microsoft
desktop search engines, X1) rely on file extensions to determine the
filetype, and then index the contents of the file accordingly. It'll be
good if they could deal with files with no extensions, but they don't
(afaik).

So - with that in mind, the easiest way would be to change the way gnus
nnml stores files, or write another backend that allows changing
filenames.

-p



* Re: nnml article filenames
  2005-10-31  5:29   ` Pranav K. Tiwari
@ 2005-10-31  7:30     ` Steve Youngs
  2005-10-31  9:13       ` Pranav K. Tiwari
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Youngs @ 2005-10-31  7:30 UTC (permalink / raw)


* Pranav K Tiwari <jpranav@cisco.com> writes:

  > Steve Youngs <steve@youngs.au.com> writes:
  >> * Pranav K Tiwari <jpranav@cisco.com> writes:
  >> 
  >> > To allow desktop search programs to go through nnml articles, I would
  >> > like to give an extension like .xyz, and tell these programs to
  >> > treat these files like email.
  >> 
  >> I think this is the wrong approach.  Instead of modifying the
  >> filenames to suit the search program, find a way to make the search
  >> program work properly.
  >> 
  >> It's really not that difficult, see...
  >> 
  >> $ find <nnmldir> -type f -regex '^.*[0-9]+$'
  >> 

  > The question is not about 'finding' these files, but about
  > associating a 'type' with the file.

But if you can find them, there's really no point in associating a
"type" to them.

  $ find <nnmldir> -type f -regex '^.*[0-9]+$' | \
     xargs some_app_needing_mail_files_as_input
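
A slightly more defensive variant of the pipeline above, as a sketch only: it keeps just the files whose base name is entirely digits, and passes NUL-separated names so that group directories containing spaces survive the trip through xargs. The function name `list_nnml_articles` is made up for illustration, the pattern assumes GNU find (whose `-regex` matches the whole path), and `some_app_needing_mail_files_as_input` is still a placeholder.

```shell
#!/bin/sh
# Print every nnml article file under the directory given as $1,
# one NUL-terminated path per file. Assumes GNU find (-regex, -print0).
list_nnml_articles() {
    # -regex is matched against the whole path, so '.*/[0-9]+' keeps
    # only files whose base name consists solely of digits (1, 2, 37).
    find "$1" -type f -regex '.*/[0-9]+' -print0
}

# Usage (placeholder application name):
#   list_nnml_articles ~/Mail | xargs -0 some_app_needing_mail_files_as_input
```

Files such as .overview, or a stray 3.bak, are skipped because their base names are not purely numeric.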

  > Most indexing programs (google/yahoo/microsoft desktop search
  > engines, X1) rely on file extensions to determine the filetype,
  > and then index the contents of the file accordingly. It'll be good
  > if they could deal with files with no extensions, but they don't
  > (afaik).

Yes they do.  For example:

  <http://homepage.mac.com/pauljlucas/software/swish/>

  > So - with that in mind, the easiest way would be to change the way gnus
  > nnml stores files, or write another backend that allows changing
  > filenames.

Maybe you should say what it is exactly that you want to do with your
nnml files.

-- 
|---<Steve Youngs>---------------<GnuPG KeyID: A94B3003>---|
|                   Te audire no possum.                   |
|             Musa sapientum fixa est in aure.             |
|----------------------------------<steve@youngs.au.com>---|



* Re: nnml article filenames
  2005-10-31  7:30     ` Steve Youngs
@ 2005-10-31  9:13       ` Pranav K. Tiwari
  2006-04-20  8:59         ` Pranav K. Tiwari
  0 siblings, 1 reply; 6+ messages in thread
From: Pranav K. Tiwari @ 2005-10-31  9:13 UTC (permalink / raw)


Steve Youngs <steve@youngs.au.com> writes:

> * Pranav K Tiwari <jpranav@cisco.com> writes:
>
>   > Steve Youngs <steve@youngs.au.com> writes:
>   >> * Pranav K Tiwari <jpranav@cisco.com> writes:
>   >> 
>   >> > To allow desktop search programs to go through nnml articles, I would
>   >> > like to give an extension like .xyz, and tell these programs to
>   >> > treat these files like email.
>   >> 
>   >> I think this is the wrong approach.  Instead of modifying the
>   >> filenames to suit the search program, find a way to make the search
>   >> program work properly.
>   >> 
>   >> It's really not that difficult, see...
>   >> 
>   >> $ find <nnmldir> -type f -regex '^.*[0-9]+$'
>   >> 
>
>   > The question is not about 'finding' these files, but about
>   > associating a 'type' with the file.
>
> But if you can find them, there's really no point in associating a
> "type" to them.
>
>   $ find <nnmldir> -type f -regex '^.*[0-9]+$' | \
>      xargs some_app_needing_mail_files_as_input
>
>   > Most indexing programs (google/yahoo/microsoft desktop search
>   > engines, X1) rely on file extensions to determine the filetype,
>   > and then index the contents of the file accordingly. It'll be good
>   > if they could deal with files with no extensions, but they don't
>   > (afaik).
>
> Yes they do.  For example:
>
>   <http://homepage.mac.com/pauljlucas/software/swish/>
>
>   > So - with that in mind, the easiest way would be to change the way gnus
>   > nnml stores files, or write another backend that allows changing
>   > filenames.
>
> Maybe you should say what it is exactly that you want to do with your
> nnml files.
>

swish is fine - that's what I've used till now. I've been unable to use
it to index all of my email periodically. I would like to say, here's
the top directory under which all my nnml mail is, and this should be
indexed periodically. But swish runs out of memory (even with -e option,
on my 512Meg Win2k machine) in trying to index my mails (some 35-40
nnml folders, each with 2000-5000 emails). So, the way I use swish is to
have one index file per nnml folder, and I have modified the swish
search function to search a list of index files.
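
The one-index-per-folder arrangement described above could be driven by a small loop along these lines. This is a sketch under assumptions: the indexer is hidden behind a hypothetical `index_one_folder` function because the exact swish command line is not given in this thread, and the layout (one flat subdirectory per nnml group, one index file per group) is assumed rather than taken from the actual setup.

```shell
#!/bin/sh
# Hypothetical stand-in for the real indexer invocation; replace the
# body with whatever command builds an index for one directory.
# $1 = nnml group directory, $2 = index file to write.
index_one_folder() {
    # Placeholder behaviour: just record how many article files exist.
    find "$1" -maxdepth 1 -type f -name '[0-9]*' | wc -l > "$2"
}

# Build one index file per group directory under $1, writing into $2.
index_all_folders() {
    mail_root=$1
    index_dir=$2
    mkdir -p "$index_dir"
    for group in "$mail_root"/*/; do
        # Skip the unexpanded glob when there are no subdirectories.
        [ -d "$group" ] || continue
        name=$(basename "$group")
        index_one_folder "$group" "$index_dir/$name.index"
    done
}

# Usage: index_all_folders ~/Mail ~/Mail-indexes
```

Keeping each run confined to a single group bounds the memory needed per invocation, at the cost of having to search a list of index files afterwards.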

It works, but as you can see, it's not optimal. Maybe my usage of swish
is not correct - and if so, I'll be glad to be corrected.

The desktop search programs that I mentioned all support a 'crawl' type
of indexing where they can keep track of what has changed, and update
their indices appropriately. And I have never had any trouble with memory
with them. That's why I'd like to use one of those to index my mail,
instead of the swish that I'm using at present.

-p



* Re: nnml article filenames
  2005-10-31  9:13       ` Pranav K. Tiwari
@ 2006-04-20  8:59         ` Pranav K. Tiwari
  0 siblings, 0 replies; 6+ messages in thread
From: Pranav K. Tiwari @ 2006-04-20  8:59 UTC (permalink / raw)


jpranav@cisco.com (Pranav K. Tiwari) writes:

> Steve Youngs <steve@youngs.au.com> writes:
>
>> * Pranav K Tiwari <jpranav@cisco.com> writes:
>>
>>   > Steve Youngs <steve@youngs.au.com> writes:
>>   >> * Pranav K Tiwari <jpranav@cisco.com> writes:
>>   >> 
>>   >> > To allow desktop search programs to go through nnml articles, I would
>>   >> > like to give an extension like .xyz, and tell these programs to
>>   >> > treat these files like email.
>>   >> 
>>   >> I think this is the wrong approach.  Instead of modifying the
>>   >> filenames to suit the search program, find a way to make the search
>>   >> program work properly.
>>   >> 
>>   >> It's really not that difficult, see...
>>   >> 
>>   >> $ find <nnmldir> -type f -regex '^.*[0-9]+$'
>>   >> 
>>
>>   > The question is not about 'finding' these files, but about
>>   > associating a 'type' with the file.
>>
>> But if you can find them, there's really no point in associating a
>> "type" to them.
>>
>>   $ find <nnmldir> -type f -regex '^.*[0-9]+$' | \
>>      xargs some_app_needing_mail_files_as_input
>>
>>   > Most indexing programs (google/yahoo/microsoft desktop search
>>   > engines, X1) rely on file extensions to determine the filetype,
>>   > and then index the contents of the file accordingly. It'll be good
>>   > if they could deal with files with no extensions, but they don't
>>   > (afaik).
>>
>> Yes they do.  For example:
>>
>>   <http://homepage.mac.com/pauljlucas/software/swish/>
>>
>>   > So - with that in mind, the easiest way would be to change the way gnus
>>   > nnml stores files, or write another backend that allows changing
>>   > filenames.
>>
>> Maybe you should say what it is exactly that you want to do with your
>> nnml files.
>>
>
> swish is fine - that's what I've used till now. I've been unable to use
> it to index all of my email periodically. I would like to say, here's
> the top directory under which all my nnml mail is, and this should be
> indexed periodically. But swish runs out of memory (even with -e option,
> on my 512Meg Win2k machine) in trying to index my mails (some 35-40
> nnml folders, each with 2000-5000 emails). So, the way I use swish is to
> have one index file per nnml folder, and I have modified the swish
> search function to search a list of index files.
>
> It works, but as you can see, it's not optimal. Maybe my usage of swish
> is not correct - and if so, I'll be glad to be corrected.
>
> The desktop search programs that I mentioned all support a 'crawl' type
> of indexing where they can keep track of what has changed, and update
> their indices appropriately. And I have never had any trouble with memory
> with them. That's why I'd like to use one of those to index my mail,
> instead of the swish that I'm using at present.
>
> -p

I've had some success with this by modifying nnml.el to store articles
with an extension. So, instead of storing articles as group/N, I store
them as group/N.nnml, and then configure the search engine to treat
.nnml files as text files. Works well - much better than swish_e for the
50k emails that I have. Diffs attached, in case anyone else cares.

regards,
-p

---------------------------------------------------------------------------

Index: lisp/nnml.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/nnml.el,v
retrieving revision 7.8
diff -r7.8 nnml.el
512a513,517
> (defvar pkt:nnml-txt-ext ".nnml"
>   "*extension for nnml files")
> (defvar pkt:nnml-use-txt-extension t
>   "should text extension be used?")
>
513a519,526
>   (let (file)
>     (setq file (nnml-article-to-file-original article))
>     (if (file-exists-p file)
>         file
>       (if pkt:nnml-use-txt-extension
>           (concat file pkt:nnml-txt-ext)))))
>
> (defun nnml-article-to-file-original (article)
621a635,637
>     (setq text-ext
>           (if pkt:nnml-use-txt-extension
>               pkt:nnml-txt-ext))
640a657
>                             text-ext
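
For readers who do not want to trace the Lisp, the lookup order that the patched nnml-article-to-file appears to follow can be mirrored in a few lines of shell. This is an illustrative sketch, not part of the patch: the function name `article_to_file` and the `NNML_EXT` variable are invented here, and it assumes the extension option (pkt:nnml-use-txt-extension) is switched on.

```shell
#!/bin/sh
# Extension appended to article files; stands in for pkt:nnml-txt-ext.
NNML_EXT=${NNML_EXT:-.nnml}

# Mirror of the lookup order in the patched function: prefer the plain
# numeric file if it exists, otherwise fall back to the name with the
# extension appended. $1 = group directory, $2 = article number.
article_to_file() {
    plain="$1/$2"
    if [ -e "$plain" ]; then
        printf '%s\n' "$plain"
    else
        printf '%s\n' "$plain$NNML_EXT"
    fi
}

# Usage: article_to_file ~/Mail/some.group 42
```

Checking for the plain name first means articles stored before the change stay readable, while newly saved ones pick up the extension.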
