From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/31884
From: Harry Putnam
Newsgroups: gmane.emacs.gnus.general
Subject: Re: nnir/freeWAIS-sf
Date: 21 Jul 2000 15:35:57 -0700
References: <86g0pb4646.fsf@beta.fciencias.unam.mx> <861z0sx3uh.fsf@beta.fciencias.unam.mx>
In-Reply-To: Kai.Grossjohann@CS.Uni-Dortmund.DE's message of "Fri, 21 Jul 2000 19:31:26 +0200"
Original-To: ding@gnus.org
User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.5

Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:

> On 20 Jul 2000, Harry Putnam wrote:
>
> > That sounds right too but current results are not very encouraging
> > here.  Have you noticed that there doesn't seem to be a way to do a
> > strictly body search?  You explained it as simply not putting data
> > from any field sources into GLOBAL, but if you set all field specs
> > like: SOUNDEX LOCAL TEXT LOCAL and leave only the /^$/ spec as
> > GLOBAL wais caves in and drops core.
>
> Arf.
>
> Workaround: define a `body' field which gets /^$/ as the start regexp
> and the non-matching regexp as the end regexp, and otherwise looks
> like the Subject field.
>
> Like this, maybe:
>
> region: /^$/
> body "Message body" stemming TEXT BOTH
> end: /^@this regex should never match@$/
>
> Then, you can say `body=foo' to search in the body.

It's not helping here.  I still can't get results in a body search.

I wasn't sure if you meant to use the above instead of the old GLOBAL
entry or to use them both, so I tried both ways.

Either way I get:

  Total word count for dictionary of field body is: 0

in the indexing output.

bsd > waissearch -d mail body=\(resounding and silence\)

 Search Response:
  NumberOfRecordsReturned: 2
Code: S1,  field unexists: body
   1: Score:    0, lines:13517 'Search produced no result. Here's the Catalog for database: mail'

From, To and subject are all that work for me.

Index command:

  waisindex -r -d mail -stem -t fields ~/Mail

mail.fmt:

record-sep: /^@this regex should never match@$/

# Searchable fields specification.

region: /^[sS]ubject:/ /^[sS]ubject: */
subject "Subject header" stemming TEXT BOTH
end: /^[^ \t]/

region: /^([tT][oO]|[cC][cC]):/ /^([tT][oO]|[cC][cC]): */
to "To and Cc headers" SOUNDEX LOCAL TEXT BOTH
end: /^[^ \t]/

region: /^[fF][rR][oO][mM]:/ /^[fF][rR][oO][mM]: */
from "From header" SOUNDEX LOCAL TEXT BOTH
end: /^[^ \t]/

region: /^$/
body "Message body" stemming TEXT BOTH
end: /^@this regex should never match@$/

region: /^$/
stemming TEXT GLOBAL
end: /^@this regex should never match@$/

I guess you are already sure that these infinite regexps are palatable
to wais.
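
For what it's worth, the region/end regexps themselves can be
sanity-checked outside of wais.  The sketch below is only a rough model
I am assuming for illustration: a region opens on a line matching its
start regexp and closes on a later line matching its end regexp.  It is
not how freeWAIS-sf is implemented, and the names split_fields, FIELDS
and SAMPLE are made up here.  It says nothing about why waisindex
reports a word count of 0, only whether the patterns select the lines
one would expect.

```python
import re

# Hypothetical model of the region/end pairs from mail.fmt.
# This is NOT freeWAIS-sf code, just a way to exercise the regexps.
NEVER = r"^@this regex should never match@$"

FIELDS = [
    ("subject", r"^[sS]ubject:", r"^[^ \t]"),
    ("to", r"^([tT][oO]|[cC][cC]):", r"^[^ \t]"),
    ("from", r"^[fF][rR][oO][mM]:", r"^[^ \t]"),
    ("body", r"^$", NEVER),
]


def split_fields(message):
    """Collect, per field, the lines that fall inside its open region."""
    collected = {name: [] for name, _, _ in FIELDS}
    active = set()
    for line in message.splitlines():
        # Close any open region whose end regexp matches this line.
        for name, _, end in FIELDS:
            if name in active and re.search(end, line):
                active.discard(name)
        # Open any closed region whose start regexp matches this line.
        for name, start, _ in FIELDS:
            if name not in active and re.search(start, line):
                active.add(name)
        for name in active:
            collected[name].append(line)
    return collected


SAMPLE = """From: reader@newsguy.com
To: ding@gnus.org
Subject: nnir/freeWAIS-sf
 continued

resounding silence
From here on this is body text"""

result = split_fields(SAMPLE)

if __name__ == "__main__":
    for name, lines in result.items():
        print(name, lines)
```

Under this model the body region opens at the first empty line and never
closes, so everything after the headers ends up in `body'.  If wais
still indexes zero words for that field, the regexps are probably not
the culprit.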