From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout.scc.kit.edu (scc-mailout-webmail.scc.kit.edu [129.13.185.232])
	by krisdoz.my.domain (8.14.3/8.14.3) with ESMTP id pAD0dvWx013482
	for ; Sat, 12 Nov 2011 19:39:58 -0500 (EST)
Received: from hekate.usta.de (asta-nat.asta.uni-karlsruhe.de [172.22.63.82])
	by scc-mailout-02.scc.kit.edu with esmtp (Exim 4.72 #1)
	id 1RPO72-0007sL-VV; Sun, 13 Nov 2011 01:39:56 +0100
Received: from donnerwolke.usta.de ([172.24.96.3])
	by hekate.usta.de with esmtp (Exim 4.72) (envelope-from )
	id 1RPO73-0004Qg-2t for tech@mdocml.bsd.lv; Sun, 13 Nov 2011 01:39:57 +0100
Received: from iris.usta.de ([172.24.96.5] helo=usta.de)
	by donnerwolke.usta.de with esmtp (Exim 4.72) (envelope-from )
	id 1RPO73-0001b3-1r for tech@mdocml.bsd.lv; Sun, 13 Nov 2011 01:39:57 +0100
Received: from schwarze by usta.de with local (Exim 4.72) (envelope-from )
	id 1RPO72-0000R7-Nt for tech@mdocml.bsd.lv; Sun, 13 Nov 2011 01:39:56 +0100
Date: Sun, 13 Nov 2011 01:39:56 +0100
From: Ingo Schwarze
To: tech@mdocml.bsd.lv
Subject: Re: overhaul apropos(1) interface
Message-ID: <20111113003956.GE16229@iris.usta.de>
References: <20111109013044.GA25679@iris.usta.de>
	<20111112235410.GC16229@iris.usta.de>
	<4EBF0A38.3090004@bsd.lv>
X-Mailinglist: mdocml-tech
Reply-To: tech@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4EBF0A38.3090004@bsd.lv>
User-Agent: Mutt/1.5.21 (2010-09-15)

Hi Kristaps,

Kristaps Dzonsons wrote on Sun, Nov 13, 2011 at 01:07:20AM +0100:

> I'm working on the final parts of this check-in, so please hold off
> on this file!

I cannot; i need to get this working in the OpenBSD tree ASAP,
i.e. tomorrow.  The logical and and or are not critical for me
right now, i can do without those for some time, but i need
a working mandocdb-apropos toolchain in-tree to build upon.
Ports hackathons are short, and i want to get on as fast as possible
with the following critical path:

 1) get mandocdb working on single directories to produce
    databases in the macro-format, i.e. with types like
    TYPE_An, TYPE_Cd
 2) get apropos to work with that, such that the databases
    can be used
 3) get rid of the most glaring bugs and complete backward
    compatibility
 4) integrate the man.conf parser into mandocdb such that
    mandocdb can walk the MANPATH just like makewhatis(8) does
 5) integrate the man.conf parser into apropos such that
    mandoc-apropos gets useable as a real apropos replacement
 6) rudimentary formatted page parsing in mandocdb
 7) integrate mandocdb into pkg_add such that it gets useable
    as a real makewhatis replacement

I realized in Ljubljana that this is more work than i thought
before, and i can't hold off, especially not this week, or i will
surely miss the 5.1 release.

Maybe you can hold off and bring in the and/or stuff afterwards,
i.e. after the patches i have posted so far?  You need not wait
long, i would *gladly* push all my stuff tomorrow in the morning,
and then you have clean earth to till and to adjust your work to it.

> It's by no means finished; as mentioned in the source checkin,
> I'll post to tech@ when the implementation is feature complete.

Sure, no doubt, but blocking system integration at a critical
time to implement optional, fancy features is a bad idea!

> Consider, to gauge the complexity:
>
>   apropos Ar == foo -a Ar =~ baz
>
> (I don't care at all about the connecting syntax, ==, etc., so long
> as it's regular.  Your notation is not extensible to
> case-insensitive matching: can one extend to include these?)

It is.

   ischwarze@isnote $ cd /usr/share/man
   ischwarze@isnote $ apropos.m Nm~^b.e$
   BCE(4) - Broadcom BCM4401 10/100 Ethernet device
   BGE(4) - Broadcom BCM57xx/BCM590x 10/100/Gigabit Ethernet device
   ischwarze@isnote $ apropos.m Nm~^B.e$
   ischwarze@isnote $ apropos.m Nm,i~^B.e$
   BCE(4) - Broadcom BCM4401 10/100 Ethernet device
   BGE(4) - Broadcom BCM57xx/BCM590x 10/100/Gigabit Ethernet device

That's not a mockup, that's what i'm running right now.

However, i propose that substring match always be case-insensitive.
Substring match is a simplification for daily wear and tear
and doesn't need such complexity.  If case-insensitive substring
match drowns you in noise, just use regex matching - done.

> Anyway, the "AND" is tricky: each file's evaluation state must be
> retained for all keyword entries (which have no guarantee on
> ordering) then post-operated.  Thus, I'm maintaining evaluation
> trees during the parse.  The goal is to make the trivial case -- "Ar
> foo", say -- as fast as in the simple implementation before.  (I
> guarantee the trivial case by the partial evaluation.)
>
> Anyway, I anticipate a few days til I get the final checkins.  The
> code is not tricky in implementation (not much, anyway) and
> guarantees arbitrary expressions with well-defined compute time.
>
> I'm also not at all married to the filenames, but let's hold off for
> a bit more as I get these chunks into place.  apropos_db.c is fine
> by me, for the record.

Sounds good, except that a hackathon is a bad time to make me wait.

Thanks for your understanding (and your work, above all!),
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv
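
For readers who want to play with the query forms from the session above (Nm~^b.e$ and Nm,i~^B.e$), here is a minimal sketch in Python. It is not the mandoc code, which is written in C. The function names, the ENTRIES list and its dictionary layout are all hypothetical, and the lowercase stored names are an assumption inferred from the fact that Nm~^B.e$ printed nothing. The last function illustrates the proposal that substring match always be case-insensitive; the message does not show a substring operator, so none is parsed here.

```python
import re

# Hypothetical stand-in for database records; the real database layout
# is not described in this message.  Names are assumed to be lowercase.
ENTRIES = [
    {"Nm": "bce", "title": "BCE(4)"},
    {"Nm": "bge", "title": "BGE(4)"},
    {"Nm": "cat", "title": "CAT(1)"},
]


def parse_term(term):
    """Split a term of the form KEY[,FLAGS]~REGEX into its parts."""
    head, sep, pattern = term.partition("~")
    if not sep:
        raise ValueError("expected KEY[,FLAGS]~REGEX, got %r" % term)
    key, _, flags = head.partition(",")
    # Only the "i" flag from the session is handled in this sketch.
    re_flags = re.IGNORECASE if "i" in flags else 0
    return key, re.compile(pattern, re_flags)


def search(term, entries):
    """Return the entries whose KEY field matches the regex in term."""
    key, rx = parse_term(term)
    return [e for e in entries if key in e and rx.search(e[key])]


def substring_match(needle, text):
    """Substring match that ignores case unconditionally."""
    return needle.lower() in text.lower()


if __name__ == "__main__":
    for term in ("Nm~^b.e$", "Nm~^B.e$", "Nm,i~^B.e$"):
        print(term, [e["title"] for e in search(term, ENTRIES)])
```

Run as a script, it evaluates the three terms from the session against the toy entries; the flag parsing is kept deliberately small so that further flags could be added after the comma without changing the term syntax.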