Message-ID: <4EBF0A38.3090004@bsd.lv>
Date: Sun, 13 Nov 2011 01:07:20 +0100
From: Kristaps Dzonsons
To: tech@mdocml.bsd.lv
CC: Ingo Schwarze
Subject: Re: overhaul apropos(1) interface
References: <20111109013044.GA25679@iris.usta.de> <20111112235410.GC16229@iris.usta.de>
In-Reply-To: <20111112235410.GC16229@iris.usta.de>

On 13/11/2011 00:54, Ingo Schwarze wrote:
> Hi,
>
> here is a new version of my overhaul after Kristaps' changes.
> The patch given below is not likely to apply cleanly to anybody's
> repo, it requires the apropos_db.* rename first.  It is meant
> for quick review; if you agree with the direction, i'll figure
> out how to get this in cleanly.
>
> So, here is what the overhaul does:
>
> * Remove -I from the main program.
>   Logically, that's not a global option,
>   but a per-expression thingy.
> * Sort the arguments of exprcomp.
>   All the world uses (argc, argv),
>   why should we suddenly go for (argv, argc)?
> * I agree with dropping MATCH_EXACT:
>   That's rarely needed and can easily be constructed
>   as MATCH_REGEX with ^...$.
> * For the same reason, let's drop case-sensitive MATCH_STR:
>   It's rarely needed and can easily be constructed with MATCH_REGEX.
> * Keeping MATCH_STR seems OK: It is needed as the default
>   when the type is unspecified.
> * The MATCH_REGEXCASE enum item is unused already now,
>   since the iflag is compiled into the regex object.
>   So drop that one as well.
> * Only two enum items remain; that's better and more easily
>   expressed by a single boolean integer (expr.regex).
> * The exprexec() function requires a mask argument,
>   or all search keys will act as "any" (important bug fix
>   along the way!).
>
> The most massive changes are in exprcomp().
> I strongly dislike the proposed interface.
> It is cumbersome and requires too much typing.
> The -eq and -re arguments are exceedingly ugly, and the
> implementation is hard to get right - if i remember
> correctly, not all cases work properly right now.
>
> Thus, i have simplified my interface proposal to just this:
>
>   apropos [-s section] [-S arch] query_phrase [...]
>
>   query_phrase ::= [[macro[,...]](=|~)]query_value
>
> So, the value to be searched for can optionally
> be preceded by '=' (for string search) or '~' (for regex search),
> and that can optionally be preceded by one or more macro names,
> joined by commas.  Including "i" among the macros switches
> regex searches to case-insensitive and has no effect on
> string searches.

Ingo,

I'm working on the final parts of this check-in, so please hold off on this file!  It's by no means finished; as mentioned in the source checkin, I'll post to tech@ when the implementation is feature complete.
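
For readers following the thread, the quoted query_phrase grammar can be illustrated with a small parser.  This is only a sketch under assumptions: struct phrase, phrase_parse() and the size limits are hypothetical names invented here, not code from apropos_db.c, and it assumes the first '=' or '~' in a phrase is always the operator, so a bare value containing either character would be split.

```c
#include <assert.h>
#include <string.h>

#define MAXMACROS 8	/* hypothetical limit on macro names per phrase */
#define MAXNAME   16	/* hypothetical limit on macro name length */

/* Hypothetical result type; not the struct used in apropos_db.c. */
struct phrase {
	char		 macros[MAXMACROS][MAXNAME];
	int		 nmacros;
	int		 regex;	/* 1 for '~', 0 for '=' or no operator */
	int		 icase;	/* 1 if "i" appeared in the macro list */
	const char	*value;	/* points into the caller's string */
};

/*
 * Parse [[macro[,...]](=|~)]query_value.
 * Returns 0 on success, -1 if the macro list exceeds the limits above.
 */
static int
phrase_parse(const char *s, struct phrase *p)
{
	const char	*cp, *end;
	size_t		 oplen, len;

	assert(s != NULL && p != NULL);
	memset(p, 0, sizeof(*p));

	/* No operator: the whole phrase is a string-search value. */
	oplen = strcspn(s, "=~");
	if (s[oplen] == '\0') {
		p->value = s;
		return 0;
	}

	p->regex = (s[oplen] == '~');
	p->value = s + oplen + 1;

	/* Split the prefix before the operator on commas. */
	cp = s;
	end = s + oplen;
	while (cp < end) {
		len = strcspn(cp, ",=~");
		if (len == 1 && *cp == 'i') {
			p->icase = 1;
		} else if (len > 0) {
			if (p->nmacros == MAXMACROS || len >= MAXNAME)
				return -1;
			memcpy(p->macros[p->nmacros], cp, len);
			p->macros[p->nmacros][len] = '\0';
			p->nmacros++;
		}
		cp += len;
		if (cp < end)
			cp++;	/* skip the comma */
	}
	return 0;
}
```

With this sketch, "Nm,Nd~foo" yields two macro names, regex set and the value "foo"; "i~bar" sets only the case-insensitive flag; a bare "foo" falls through as a plain string search.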

Consider, to gauge the complexity:

  apropos Ar == foo -a Ar =~ baz

(I don't care at all about the connecting syntax, ==, etc., so long as it's regular.  Your notation is not extensible to case-insensitive matching: can one extend it to include these?)

Anyway, the "AND" is tricky: each file's evaluation state must be retained for all keyword entries (which have no guarantee on ordering), then post-operated.  Thus, I'm maintaining evaluation trees during the parse.  The goal is to make the trivial case -- "Ar foo", say -- as fast as in the simple implementation before.  (I guarantee the trivial case by the partial evaluation.)

Anyway, I anticipate a few days until I get the final check-ins in.  The code is not tricky in implementation (not much, anyway) and guarantees arbitrary expressions with well-defined compute time.

I'm also not at all married to the filenames, but let's hold off for a bit more as I get these chunks into place.  apropos_db.c is fine by me, for the record.

Thanks again,

Kristaps
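
The point about "AND" needing per-file state can be illustrated with a short sketch.  It is not the implementation described in this message: struct node, state_mark() and tree_eval() are hypothetical names, and the 32-term limit is an assumption made for brevity.  Each file keeps a bitmask of the terms that have matched any of its keyword entries, in whatever order those entries arrive; the expression tree is evaluated only after all entries have been seen.  The partial evaluation for the trivial case is not modelled.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_TERM, OP_AND, OP_OR };

/* Hypothetical expression tree node. */
struct node {
	enum op			 op;
	int			 term;	/* term index, used by OP_TERM only */
	const struct node	*l, *r;	/* children, used by OP_AND/OP_OR */
};

/* Record that term number `term` matched some keyword entry of a file. */
static void
state_mark(unsigned int *state, int term)
{
	assert(term >= 0 && term < 32);
	*state |= 1u << term;
}

/* Evaluate the tree against one file's accumulated match state. */
static int
tree_eval(const struct node *n, unsigned int state)
{
	switch (n->op) {
	case OP_TERM:
		return (state >> n->term) & 1u;
	case OP_AND:
		return tree_eval(n->l, state) && tree_eval(n->r, state);
	case OP_OR:
		return tree_eval(n->l, state) || tree_eval(n->r, state);
	}
	return 0;
}
```

For "Ar == foo -a Ar =~ baz", the two comparisons would be terms 0 and 1 under an OP_AND node; a file qualifies only once both bits are set, regardless of which keyword entry set them or in which order.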