From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 9070 invoked by alias); 23 Aug 2010 18:25:31 -0000 Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Workers List List-Post: List-Help: X-Seq: 28196 Received: (qmail 22494 invoked from network); 23 Aug 2010 18:25:29 -0000 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on f.primenet.com.au X-Spam-Level: X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received-SPF: pass (ns1.primenet.com.au: SPF record at _spf.google.com designates 209.85.212.43 as permitted sender) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type; bh=p4254khsrMQZnISOzXqaNz/x8kMBQEpIOIwWEO65SYw=; b=CcvDY/rbYCtmRRolDeBJhiOnwm/1w9S1wChCAic2/9HBtY8K6VIgN3UQdTa0jzEsEz FhRw7MUFhh4ILQqJkNbyfxEqcwgYwwkAoaXIFLDIezzDOuXNuiPk0oZiUsPrL5ZeUgYA bI5YM06pAS03nrXHEqXLhvUQ95CagB+cM45tI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=kcxHtvMf4hBibSn9zxxFsXBgmeN2TQCdTxHZWri9o1Ki0sBN66flSFWjjydwGz9hxH DcLmDVGUTyR7BCxRhOssQ9mU3CfsWxfR0dtkRbXFn9JnAqN6+y6PwJJpIgTAybk8z9ug TdIE+z9Imwv4/p7uSd4zYFOiGdATL8V6cxs2Q= MIME-Version: 1.0 In-Reply-To: <100823104610.ZM26959@torch.brasslantern.com> References: <100823104610.ZM26959@torch.brasslantern.com> Date: Mon, 23 Aug 2010 20:25:24 +0200 Message-ID: Subject: Re: [RFC or so] Add HASH_LOOKUP option From: Mikael Magnusson To: zsh workers Content-Type: text/plain; charset=UTF-8 On 23 August 2010 19:46, Bart Schaefer wrote: > On Aug 23, 4:09pm, Mikael Magnusson wrote: > } > } When this is unset, external commands are always resolved with a full > } path search, but still inserted into the hash for spell correction if > } those options are on. > } > } diff --git a/Src/exec.c b/Src/exec.c > } index 93d1b26..9a488fe 100644 > } --- a/Src/exec.c > } +++ b/Src/exec.c > } @@ -754,7 +754,7 @@ findcmd(char *arg0, int docopy) > } } > } break; > } } > } - if (cn) { > } + if (cn && isset(HASHLOOKUP)) { > } char nn[PATH_MAX]; > } > } if (cn->node.flags & HASHED) > > I think there may be problem with this in the event that the "hash" > command has been used to deliberately insert an entry into the hash > table. That's a documented mechanism for overriding path search. > Also I think you've missed that execute() also uses the hash table; > did you intend that "command foo" ignores NO_HASHLOOKUP? As I said, I didn't check everywhere yet to see if there are more duplicated codepaths, so no, that is not the intent. However, command foo for me doesn't appear to ignore it for me. > In looking more closely, there are a number of things about findcmd() > that look a bit fishy to me. Correct me where I've gone wrong? A lot of things looked fishy to me while poking at this, but I couldn't be sure which were actually fishy and which were just arcane reasons. > It starts (if the command is not already in the hash table and HASHCMDS > is set) by doing hashcmd(), which performs a search of only absolute > directory names in the path (not, e.g., ".", "..", or other relative > names). That search itself seems a bit off because if *arg0 == '/' > then it's going to prepend the path component anyway. [*] That's the first fishy thing I noticed, why is this for loop over $path copied to at least four places, two of which are in the same function, with slight variations? The fact that 'command' supposedly does a hash lookup while 'command -p' doesn't seems to be rather undocumented too. > It then ignores what it got back from the search and, if arg0 contains > a '/', checks whether arg0 is executable without path traversal. This > seems a bit sideways to me; why not do that first? HASHDIRS, I guess? > [This disagrees with execute(), which does the directly-executable test > first.] In any case this can return NULL, leaving the ignored results > from the previous search in the hash table, unless PATHDIRS is set. > > Then, if it DID find the command either in the hashtable already or > via hashcmd() but the hashtable entry does not have the HASHED bit [**] > it performs a search of only relative directory names in the path ahead > of the previously-found absolute directory. I guess in this specific function I could move my isset(HASHLOOKUP) to be in the & HASHED if() instead of the cn one. Which is to say if the whole thing isn't broken in the first place. > Finally, if all of the above failed, we do a full path search of both > relative and absolute directories again, which is redundant in the > case of HASHCMDS unless by some unlikely race condition the command > has appeared in one of the directories in the path in the meantime. > > > [*] And indeed, from zsh -f: > > torch% hash > torch% print =/X11/xterm > zsh: /X11/xterm not found > torch% hash | grep X11 > /X11/xterm=/usr/bin//X11/xterm Even funner: % print =./ls /bin/./ls > [**] The HASHED bit means the command was deliberately inserted into > the hash table with the "hash" builtin, rather than found by search. > In this case the both findcmd() and execute() are forced to believe > what the hash table tells them. Could this be used, theoretically, to still allow users to override the path, and still do a full path search for search hashed entries? I note you didn't include execcmd() in the function listing there, but I'm not clear on how/if execute() and execcmd() relate to eachother. When I looked at the definition for the struct hashnode earlier when writing the patch, I found the rather helpful int flags; /* various flags */ and proceeded to ignore it :P. Another thing I was vaguely wondering about is how is HASH_CMDS forced on by CORRECT? I was unable to grep up anything about it. -- Mikael Magnusson