* nnml article filenames
@ 2005-10-28 12:05 Pranav K. Tiwari
  2005-10-28 13:30 ` Steve Youngs
  0 siblings, 1 reply; 6+ messages in thread
From: Pranav K. Tiwari @ 2005-10-28 12:05 UTC

To allow desktop search programs to go through nnml articles, I would
like to give the article files an extension like .xyz, and tell these
programs to treat these files like email. To do that, I am trying to
see if nnml can store articles as 1.xyz, 2.xyz etc. rather than 1, 2, ...

I have been able to get it to read these articles appropriately by
modifying 'nnml-article-to-file (article)' so that it looks for a
file 1.xyz for article 1. But this function is not called when
messages are saved into the group.

How can I change the names of the files in which the articles are stored?

thx,
-- 
Pranav Tiwari.

* Re: nnml article filenames
  2005-10-28 12:05 nnml article filenames Pranav K. Tiwari
@ 2005-10-28 13:30 ` Steve Youngs
  2005-10-31  5:29   ` Pranav K. Tiwari
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Youngs @ 2005-10-28 13:30 UTC

* Pranav K Tiwari <jpranav@cisco.com> writes:

  > To allow desktop search programs to go through nnml articles, I would
  > like to give an extension like .xyz, and tell these programs to
  > treat these files like email.

I think this is the wrong approach.  Instead of modifying the
filenames to suit the search program, find a way to make the search
program work properly.

It's really not that difficult, see...

  $ find <nnmldir> -type f -regex '^.*[0-9]+$'

-- 
|---<Steve Youngs>---------------<GnuPG KeyID: A94B3003>---|
|              Te audire no possum.                         |
|        Musa sapientum fixa est in aure.                   |
|----------------------------------<steve@youngs.au.com>---|

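The one-liner above relies on find's -regex, which is matched against the
whole path, so it would also pick up any other file whose name merely ends
in a digit. A more conservative sketch follows, assuming a POSIX find and
assuming that nnml articles are exactly the files whose names consist only
of digits. The function name list_nnml_articles is illustrative and not
part of Gnus.

```shell
#!/bin/sh
# list_nnml_articles DIR
# Print every regular file under DIR whose basename is all digits.
# Hypothetical helper for illustration; not part of Gnus or find.
list_nnml_articles() {
    # -name '[0-9]*'      : basename starts with a digit
    # ! -name '*[!0-9]*'  : basename contains no non-digit character
    find "$1" -type f -name '[0-9]*' ! -name '*[!0-9]*'
}
```

Called as `list_nnml_articles <nnmldir>`, this should skip files such as
.overview or a stray 4.bak while still listing 1, 2, 3 and so on.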
* Re: nnml article filenames
  2005-10-28 13:30 ` Steve Youngs
@ 2005-10-31  5:29   ` Pranav K. Tiwari
  2005-10-31  7:30     ` Steve Youngs
  0 siblings, 1 reply; 6+ messages in thread
From: Pranav K. Tiwari @ 2005-10-31  5:29 UTC

Steve Youngs <steve@youngs.au.com> writes:

> * Pranav K Tiwari <jpranav@cisco.com> writes:
>
>   > To allow desktop search programs to go through nnml articles, I would
>   > like to give an extension like .xyz, and tell these programs to
>   > treat these files like email.
>
> I think this is the wrong approach.  Instead of modifying the
> filenames to suit the search program, find a way to make the search
> program work properly.
>
> It's really not that difficult, see...
>
>   $ find <nnmldir> -type f -regex '^.*[0-9]+$'
>

The question is not about 'finding' these files, but about
associating a 'type' with each file.

Most indexing programs (the Google/Yahoo/Microsoft desktop search
engines, X1) rely on file extensions to determine the file type,
and then index the contents of the file accordingly. It would be good
if they could deal with files with no extensions, but they don't
(afaik).

So, with that in mind, the easiest way would be to change the way Gnus
nnml stores files, or to write another backend that allows changing
filenames.

-p

* Re: nnml article filenames
  2005-10-31  5:29   ` Pranav K. Tiwari
@ 2005-10-31  7:30     ` Steve Youngs
  2005-10-31  9:13       ` Pranav K. Tiwari
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Youngs @ 2005-10-31  7:30 UTC

* Pranav K Tiwari <jpranav@cisco.com> writes:

  > Steve Youngs <steve@youngs.au.com> writes:
  >> * Pranav K Tiwari <jpranav@cisco.com> writes:
  >>
  >> > To allow desktop search programs to go through nnml articles, I would
  >> > like to give an extension like .xyz, and tell these programs to
  >> > treat these files like email.
  >>
  >> I think this is the wrong approach.  Instead of modifying the
  >> filenames to suit the search program, find a way to make the search
  >> program work properly.
  >>
  >> It's really not that difficult, see...
  >>
  >> $ find <nnmldir> -type f -regex '^.*[0-9]+$'
  >>

  > The question is not about 'finding' these files, but about
  > associating a 'type' with each file.

But if you can find them, there's really no point in associating a
"type" to them.

  $ find <nnmldir> -type f -regex '^.*[0-9]+$' | \
        xargs some_app_needing_mail_files_as_input

  > Most indexing programs (the Google/Yahoo/Microsoft desktop search
  > engines, X1) rely on file extensions to determine the file type,
  > and then index the contents of the file accordingly. It would be good
  > if they could deal with files with no extensions, but they don't
  > (afaik).

Yes they do.  For example:

  <http://homepage.mac.com/pauljlucas/software/swish/>

  > So, with that in mind, the easiest way would be to change the way Gnus
  > nnml stores files, or to write another backend that allows changing
  > filenames.

Maybe you should say what it is exactly that you want to do with your
nnml files.

-- 
|---<Steve Youngs>---------------<GnuPG KeyID: A94B3003>---|
|              Te audire no possum.                         |
|        Musa sapientum fixa est in aure.                   |
|----------------------------------<steve@youngs.au.com>---|

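A bare find | xargs pipeline splits its input on whitespace, so it can
misbehave when a group directory has a space in its name. The same idea can
be sketched with find's own `-exec ... {} +` batching, which passes path
names through intact. The function name feed_nnml_articles is illustrative,
and the command handed to it stands in for whatever program needs mail
files as input.

```shell
#!/bin/sh
# feed_nnml_articles DIR COMMAND [ARGS...]
# Run COMMAND with the all-digit files under DIR appended as arguments.
# find batches the file names itself, so unusual path names survive.
# Hypothetical helper for illustration; not part of Gnus.
feed_nnml_articles() {
    dir=$1
    shift
    # "$@" is now COMMAND and its own arguments; {} + appends the files.
    find "$dir" -type f -name '[0-9]*' ! -name '*[!0-9]*' -exec "$@" {} +
}
```

For example, `feed_nnml_articles <nnmldir> wc -l` would run wc over the
article files in as few invocations as the argument-length limit allows.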
* Re: nnml article filenames
  2005-10-31  7:30     ` Steve Youngs
@ 2005-10-31  9:13       ` Pranav K. Tiwari
  2006-04-20  8:59         ` Pranav K. Tiwari
  0 siblings, 1 reply; 6+ messages in thread
From: Pranav K. Tiwari @ 2005-10-31  9:13 UTC

Steve Youngs <steve@youngs.au.com> writes:

> * Pranav K Tiwari <jpranav@cisco.com> writes:
>
>   > Most indexing programs (the Google/Yahoo/Microsoft desktop search
>   > engines, X1) rely on file extensions to determine the file type,
>   > and then index the contents of the file accordingly. It would be good
>   > if they could deal with files with no extensions, but they don't
>   > (afaik).
>
> Yes they do.  For example:
>
>   <http://homepage.mac.com/pauljlucas/software/swish/>
>
> Maybe you should say what it is exactly that you want to do with your
> nnml files.
>

swish is fine - that's what I've used till now. But I've been unable to
use it to index all of my email periodically. I would like to say, here's
the top directory under which all my nnml mail is, and this should be
indexed periodically. But swish runs out of memory (even with the -e
option, on my 512 MB Win2k machine) when trying to index my mail (some
35-40 nnml folders, each with 2000-5000 emails). So, the way I use swish
is to have one index file per nnml folder, and I have modified the swish
search function to search a list of index files.

It works, but as you can see, it's not optimal. Maybe my usage of swish
is not correct - and if so, I'll be glad to be corrected.

The desktop search programs that I mentioned all support a 'crawl' type
of indexing where they keep track of what has changed and update their
indices appropriately. And I have never had any trouble with memory with
them. That's why I'd like to use one of those to index my mail, instead
of the swish that I'm using at present.

-p

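The one-index-per-folder workaround described above could be driven by a
small loop like the sketch below. It is only an outline: the indexer
command and its two-argument calling convention (index file, then group
directory) are placeholders, not the real swish command line, and it
assumes each nnml folder is an immediate subdirectory of the mail
directory.

```shell
#!/bin/sh
# index_each_group MAILDIR INDEXDIR INDEXER
# Call INDEXER once per immediate subdirectory of MAILDIR, as
#   INDEXER <index-file> <group-directory>
# INDEXER and its two-argument interface are placeholders; substitute
# the real invocation of whatever indexing tool is in use.
index_each_group() {
    maildir=$1
    indexdir=$2
    indexer=$3
    mkdir -p "$indexdir" || return 1
    for group in "$maildir"/*/; do
        [ -d "$group" ] || continue   # glob matched nothing
        name=$(basename "$group")
        # One index file per folder keeps each indexing run small.
        "$indexer" "$indexdir/$name.index" "${group%/}" || return 1
    done
}
```

Keeping each run confined to one folder is what bounds the memory use; the
cost is that searches then have to consult every per-folder index file.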
* Re: nnml article filenames
  2005-10-31  9:13       ` Pranav K. Tiwari
@ 2006-04-20  8:59         ` Pranav K. Tiwari
  0 siblings, 0 replies; 6+ messages in thread
From: Pranav K. Tiwari @ 2006-04-20  8:59 UTC

jpranav@cisco.com (Pranav K. Tiwari) writes:

> swish is fine - that's what I've used till now. But I've been unable to
> use it to index all of my email periodically. I would like to say, here's
> the top directory under which all my nnml mail is, and this should be
> indexed periodically. But swish runs out of memory (even with the -e
> option, on my 512 MB Win2k machine) when trying to index my mail (some
> 35-40 nnml folders, each with 2000-5000 emails). So, the way I use swish
> is to have one index file per nnml folder, and I have modified the swish
> search function to search a list of index files.
>
> It works, but as you can see, it's not optimal. Maybe my usage of swish
> is not correct - and if so, I'll be glad to be corrected.
>
> The desktop search programs that I mentioned all support a 'crawl' type
> of indexing where they keep track of what has changed and update their
> indices appropriately. And I have never had any trouble with memory with
> them. That's why I'd like to use one of those to index my mail, instead
> of the swish that I'm using at present.
>
> -p

I've had some success with a desktop search engine by modifying nnml.el
to store articles with an extension. So, instead of storing articles as
group/N, I store them as group/N.nnml, and then configure the search
engine to treat .nnml files as text files.

Works well - much better than swish_e for the 50k emails that I have.

Diffs attached, in case anyone else cares.

regards,
-p

---------------------------------------------------------------------------
Index: lisp/nnml.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/nnml.el,v
retrieving revision 7.8
diff -r7.8 nnml.el
512a513,517
> (defvar pkt:nnml-txt-ext ".nnml"
>   "*extension for nnml files")
> (defvar pkt:nnml-use-txt-extension t
>   "should text extension be used?")
> 
513a519,526
>   (let (file)
>     (setq file (nnml-article-to-file-original article))
>     (if (file-exists-p file)
>         file
>       (if pkt:nnml-use-txt-extension
>           (concat file pkt:nnml-txt-ext)))))
> 
> (defun nnml-article-to-file-original (article)
621a635,637
>         (setq text-ext
>               (if pkt:nnml-use-txt-extension
>                   pkt:nnml-txt-ext))
640a657
>                       text-ext

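The lookup half of the patch above (use the bare article number if that
file exists, otherwise fall back to the number plus the extension) can be
illustrated outside Emacs with a short shell sketch. It only mirrors the
logic for readers who want to script against such a spool; the function
name article_to_file and its arguments are illustrative and are not part
of Gnus or of the patch.

```shell
#!/bin/sh
# article_to_file GROUPDIR ARTICLE [EXT]
# Print the path of article number ARTICLE in GROUPDIR: the bare
# number if that file exists, otherwise the number plus EXT.
# EXT defaults to ".nnml"; pass an empty string to disable the fallback,
# in which case a missing article prints nothing and returns 1.
# Hypothetical helper for illustration; not part of Gnus.
article_to_file() {
    file=$1/$2
    ext=${3-.nnml}
    if [ -e "$file" ]; then
        printf '%s\n' "$file"
    elif [ -n "$ext" ]; then
        # Like the patch, this does not check that the fallback exists.
        printf '%s\n' "$file$ext"
    else
        return 1
    fi
}
```

Checking the bare name first is what lets a spool that already contains
extension-less articles keep working after the extension is switched on.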
Thread overview: 6+ messages
2005-10-28 12:05 nnml article filenames Pranav K. Tiwari
2005-10-28 13:30 ` Steve Youngs
2005-10-31  5:29   ` Pranav K. Tiwari
2005-10-31  7:30     ` Steve Youngs
2005-10-31  9:13       ` Pranav K. Tiwari
2006-04-20  8:59         ` Pranav K. Tiwari