From mboxrd@z Thu Jan 1 00:00:00 1970
From: Harry Putnam
Newsgroups: gmane.emacs.gnus.general
Subject: Re: nnir/freeWAIS-sf
Date: 20 Jul 2000 11:13:05 -0700
References: <86g0pb4646.fsf@beta.fciencias.unam.mx> <861z0sx3uh.fsf@beta.fciencias.unam.mx>
Original-To: ding@gnus.org
In-Reply-To:
 Kai.Grossjohann@CS.Uni-Dortmund.DE's message of "Thu, 20 Jul 2000 16:34:39 +0200"
User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.5

Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:

> On 18 Jul 2000, Harry Putnam wrote:
>
> > waissearch seems to only display a hit on file if it is in the subject
> > line even though the spec does not specify that.
> >
> > I suspect it has something to do with the *.fmt files liberal use of
> > the `BOTH' specifier... So experimenting with that.
>
> Hm.  Right.  My database also fails to contain `fmt' in the dictionary
> for the global field.  Hm.
>
> I wish there was a debugging switch to waisindex where it told me what
> it thought about the document.  This way, debugging would be much
> easier.
>
> I still think it's a problem with the indexing process.  Ie, once we
> get the format file right, Bob will be our uncle.

That sounds right too, but current results are not very encouraging
here.  Have you noticed that there doesn't seem to be a way to do a
strictly body search?  You explained it as simply not putting data
from any field sources into GLOBAL, but if you set all field specs like:

  SOUNDEX LOCAL TEXT LOCAL

and leave only the /^$/ spec as GLOBAL, wais caves in and drops core.

> > As a side note, I don't really understand why the `body' regexp, the
> > one beginning with:
> >
> >   region: /^$/
> > Needs a non-matching regexp.
>
> Well, I wanted to make sure that all the remaining lines in each file
> will be indexed in the body field, and choosing a non-matching regexp
> is a sure way to have waisindex go through till the end of the file.

I understood the reasoning but thought there might be some regexp that
fit the end of a file.  Guess not.
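
To make the "non-matching regexp" reasoning concrete, here is a small Python sketch of a line-oriented region scanner.  It is only a toy model of the idea discussed above, not freeWAIS-sf's actual code or format-file syntax; the function name `extract_region`, the `NEVER` pattern and the sample message are all made up for illustration.  The region opens at the first line matching the start regexp and closes at the first later line matching the end regexp, or at end of input if the end regexp never matches.

```python
import re

def extract_region(lines, start_re, end_re):
    """Return the lines after the first start_re match, up to (not
    including) the first later line matching end_re, or up to the end
    of input if end_re never matches.  Hypothetical helper."""
    start = re.compile(start_re)
    end = re.compile(end_re)
    in_region = False
    collected = []
    for line in lines:
        if not in_region:
            if start.search(line):
                in_region = True  # the separator line itself is not kept
            continue
        if end.search(line):
            break
        collected.append(line)
    return collected

# Empty negative lookahead: fails at every position, so it never matches.
NEVER = r"(?!)"

# Made-up sample message, already split into lines.
message = [
    "From: someone@example.org",
    "Subject: test",
    "",
    "first body line",
    "",
    "second body line",
]

# Body region: starts at the blank line, end regexp never matches,
# so the scan runs through to the last line.
body = extract_region(message, r"^$", NEVER)

# Reusing /^$/ as the end regexp stops at the first blank line in the body.
truncated = extract_region(message, r"^$", r"^$")

print(body)
print(truncated)
```

In this toy model each line is tested on its own, so no end regexp can tell the last line apart from any other; a pattern that never matches is the simplest way to let the region run to the end.  Whether waisindex behaves the same way internally is an assumption here.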