From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout.scc.kit.edu (scc-mailout.scc.kit.edu [129.13.185.202])
	by krisdoz.my.domain (8.14.3/8.14.3) with ESMTP id pAG1oqAJ007161
	for ; Tue, 15 Nov 2011 20:50:54 -0500 (EST)
Received: from hekate.usta.de (asta-nat.asta.uni-karlsruhe.de [172.22.63.82])
	by scc-mailout-02.scc.kit.edu with esmtp (Exim 4.72 #1)
	id 1RQUeJ-0001lw-Ds; Wed, 16 Nov 2011 02:50:51 +0100
Received: from donnerwolke.usta.de ([172.24.96.3])
	by hekate.usta.de with esmtp (Exim 4.72) (envelope-from )
	id 1RQUeJ-000153-CL for tech@mdocml.bsd.lv; Wed, 16 Nov 2011 02:50:51 +0100
Received: from iris.usta.de ([172.24.96.5] helo=usta.de)
	by donnerwolke.usta.de with esmtp (Exim 4.72) (envelope-from )
	id 1RQUeJ-000317-BH for tech@mdocml.bsd.lv; Wed, 16 Nov 2011 02:50:51 +0100
Received: from schwarze by usta.de with local (Exim 4.72) (envelope-from )
	id 1RQUeJ-0000CQ-AU for tech@mdocml.bsd.lv; Wed, 16 Nov 2011 02:50:51 +0100
Date: Wed, 16 Nov 2011 02:50:51 +0100
From: Ingo Schwarze
To: tech@mdocml.bsd.lv
Subject: Re: mandocdb: full set of search types
Message-ID: <20111116015051.GE30189@iris.usta.de>
References: <20111116003919.GD30189@iris.usta.de> <4EC3093E.3030504@bsd.lv>
X-Mailinglist: mdocml-tech
Reply-To: tech@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4EC3093E.3030504@bsd.lv>
User-Agent: Mutt/1.5.21 (2010-09-15)

Hi Kristaps,

Kristaps Dzonsons wrote on Wed, Nov 16, 2011 at 01:52:14AM +0100:

> Just a word or two before I sleep.  This approach is sound and I've
> no issues with a quick look over the patch, but will wait til
> tomorrow to do so in earnest.

About the same for your logical operations:
The approach looks nice, and i will merge it tomorrow
and read it in detail.

> But first, before mandocdb gets production, the database should be
> checked for endian-neutrality.

Hm, i never thought about that.
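
To make the endianness concern concrete, here is a minimal sketch in C. It assumes the database stores 32-bit integer fields; the helper names put_be32() and get_be32() are hypothetical and are not taken from the mandocdb sources. The idea is only that integers are written in one fixed byte order, so a file built on one machine reads the same on any other.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store v in buf in big-endian order,
 * whatever the byte order of the host CPU is. */
static void
put_be32(unsigned char buf[4], uint32_t v)
{
	buf[0] = (unsigned char)(v >> 24);
	buf[1] = (unsigned char)(v >> 16);
	buf[2] = (unsigned char)(v >> 8);
	buf[3] = (unsigned char)v;
}

/* Hypothetical helper: read a big-endian 32-bit value back. */
static uint32_t
get_be32(const unsigned char buf[4])
{
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* Round trip: the stored bytes are fixed by the code, not by the CPU. */
static void
be32_selfcheck(void)
{
	unsigned char buf[4];

	put_be32(buf, 0x01020304u);
	assert(buf[0] == 0x01 && buf[3] == 0x04);
	assert(get_be32(buf) == 0x01020304u);
}
```

Writing a native uint32_t with a plain memcpy() would instead produce different bytes on little-endian and big-endian hosts, which is the kind of problem such a check would look for.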

> Second, we'd long ago mentioned splitting SYNOPSIS-invoked macros
> (Fl, Fn, etc.) for querying on their SYNOPSIS or non-SYNOPSIS usage.
> I think an elegant method is to encode the section within the
> keyword database, which allows for
>
>   apropos Fn~mdoc -a -s SYNOPSIS

Not -s, since -s is the other section (grrr).

The database field would have to be a bitmask,
or we would multiply the size of the database.

The syntax is not completely logical, as the -s SYNOPSIS
is a qualifier for the mdoc query string, not a stand-alone
query phrase.

  apropos SYNOPSIS:Fn=mdoc

would be more logical.  If you really want -o, you have to say:

  apropos SYNOPSIS:any=mdoc -o Fn=mdoc

Your proposal causes ambiguities:

  apropos Fn=mdoc -a -s SYNOPSIS -a Nm=man

Is that:

  apropos SYNOPSIS:Fn=mdoc -a Nm=man
  apropos Fn=mdoc -a SYNOPSIS:Nm=man
  apropos SYNOPSIS:Fn=mdoc -a SYNOPSIS:Nm=man

And even worse, what the heck is:

  apropos Fn=mdoc -o -s SYNOPSIS

My proposal also has a quirk.  Consider:

  .Sh SYNOPSIS
  .Nm foo
  .Ar mdoc
  .Sh DESCRIPTION
  .Nm mdoc

That would match SYNOPSIS:Nm=mdoc, because the keyword mdoc has both
the Nm bit and the SYNOPSIS bit set, even though it never occurs as
Nm inside SYNOPSIS.
But that's unfixable, unless we drop the whole bitfield approach,
which will make the database size explode.  Or we could use a bitfield
of the size 20 (sections) times 40 (macros) = 800 bits = 100 bytes,
which is also very big.

> or whatever `-s' replacement operator.  How does that sound?  This
> sounds a lot more reasonable than encoding separate Fn, Nm, etc.
> macros for SYNOPSIS and non-SYNOPSIS invocation.

Yes, in particular since you will be looking for other macros
in other sections: SEE ALSO:Xr FILES:Pa AUTHORS:An STANDARDS:St
HISTORY:Bx DIAGNOSTICS:Er.  And atypical queries may occasionally
make sense, like STANDARDS:Fl.  Hand-picking combinations seems
like unreasonable implementation effort and not at all user-friendly.

Maybe we don't need section restrictions at all.
The only use for section restrictions would be controlling noise
in searches - or do you see other uses?
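
The quirk with SYNOPSIS:Nm=mdoc and the 100-byte alternative can be sketched in C. This is only an illustration under stated assumptions: the struct layouts, the bit assignments and all names (rec, rec_pairs, MAC_NM, SEC_SYNOPSIS and so on) are hypothetical and do not describe the actual mandocdb format. The compact record keeps one macro mask and one section mask per keyword; the exact record keeps one bit per (section, macro) pair.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical indices; the real database may number things differently. */
enum { SEC_IDX_SYNOPSIS, SEC_IDX_DESCRIPTION };
enum { MAC_IDX_NM, MAC_IDX_AR };

#define SEC_SYNOPSIS    (1u << SEC_IDX_SYNOPSIS)
#define SEC_DESCRIPTION (1u << SEC_IDX_DESCRIPTION)
#define MAC_NM          (1u << MAC_IDX_NM)
#define MAC_AR          (1u << MAC_IDX_AR)

/* Compact record: two independent masks per keyword. */
struct rec {
	const char *key;    /* keyword text */
	uint32_t    macros; /* OR of every macro the keyword occurred in */
	uint32_t    secs;   /* OR of every section the keyword occurred in */
};

/* Record one occurrence of the keyword. */
static void
rec_add(struct rec *r, uint32_t macro, uint32_t sec)
{
	assert(r != NULL);
	r->macros |= macro;
	r->secs |= sec;
}

/* Evaluate SECTION:Macro=key against the compact record. */
static int
rec_match(const struct rec *r, const char *key, uint32_t macro, uint32_t sec)
{
	return strcmp(r->key, key) == 0 &&
	    (r->macros & macro) != 0 && (r->secs & sec) != 0;
}

/* Exact alternative: one bit per (section, macro) pair. */
#define NSEC 20
#define NMAC 40

struct rec_pairs {
	uint8_t bits[(NSEC * NMAC + 7) / 8]; /* 800 bits = 100 bytes */
};

static void
pairs_add(struct rec_pairs *p, unsigned sec, unsigned mac)
{
	unsigned i;

	assert(sec < NSEC && mac < NMAC);
	i = sec * NMAC + mac;
	p->bits[i / 8] |= (uint8_t)(1u << (i % 8));
}

static int
pairs_match(const struct rec_pairs *p, unsigned sec, unsigned mac)
{
	unsigned i;

	assert(sec < NSEC && mac < NMAC);
	i = sec * NMAC + mac;
	return (p->bits[i / 8] >> (i % 8)) & 1;
}
```

Feeding the example page into the compact record (Ar in SYNOPSIS, then Nm in DESCRIPTION) makes rec_match() accept SYNOPSIS:Nm=mdoc, since both bits are set independently. The pair bitfield rejects that query, at the cost of 100 bytes per keyword.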

But seriously, how much noise do you expect from Nm outside SYNOPSIS,
Xr outside SEE ALSO, Pa outside FILES, and so on?  I expect little,
because macros are rare outside their typical sections.  On top of
that, some of these atypical occurrences will contribute to the
signal, so i'd recommend that people not use section restrictions
by default, but only switch them on when drowning in noise -
and then, honestly, it's not even likely to help much.

So i'd probably suggest not implementing section restrictions
right now, but reconsidering this in a year or two, when we have
a better feeling for how the new apropos will actually be used.
The database format is not set in stone for eternity, i just
don't want to announce public availability and then gratuitously
break the format the very next week.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv