Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 20883
Date: Mon, 28 Feb 2005 06:54:07 +0000
From: Bart Schaefer
Subject: Re: PATCH: Apply spell correction to autocd
In-reply-to: <1050227204407.ZM19297@candle.brasslantern.com>
To: zsh-workers@sunsite.dk
Message-id: <1050228065407.ZM20816@candle.brasslantern.com>
References:
 <1050227204407.ZM19297@candle.brasslantern.com>
Comments: In reply to Bart Schaefer
        "PATCH: Apply spell correction to autocd" (Feb 27, 8:44pm)

On Feb 27, 8:44pm, Bart Schaefer wrote:
} Subject: PATCH: Apply spell correction to autocd
}
} I don't know whether this is going to require tweaking for wide-char file
} names, but it's at least as good as the current bin_cd() implementation.

I was in a bit of a hurry when I worked out that patch, and it occurred to
me later that this implementation prefers names by cdpath order rather than
by comparison distance, so I went back to look again and found several
interesting things.

The first is this snippet of spckword():

    if ((u = spname(guess)) != guess)
        best = u;

The condition tested here is always true, because spname() never returns
anything other than NULL or a pointer to an internal static buffer.  This
might as well be:

    best = spname(guess);

However, I'm not sure that's the intended semantic, which might be:

    if ((u = spname(guess)) && strcmp(u, guess))
        best = u;

The next thing I noticed is that there's no way to recover the comparison
distance computed by spname().  That probably doesn't matter, as the
distance is always less than 3 if spname() returned anything useful.  This
is a bit different from the scheme applied to scanning the hash tables,
which uses a threshold distance of 1/4 of the length of the input.  In other
words, zsh can correct more mistakes in hashed strings than in file paths,
unless the component directory names are very short.
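
To make the point about the pointer comparison concrete, here is a minimal
sketch in C.  It is an illustration only: mock_spname() is a hypothetical
stand-in that mimics just the return convention described above (NULL, or a
pointer to an internal static buffer) and does no spelling correction at
all.  pointer_test() and content_test() are likewise names invented for
this example, not functions from the zsh source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for spname(), NOT the zsh implementation.
 * It returns NULL when the input does not fit, otherwise a copy of the
 * input in a static buffer.  It never returns the caller's own pointer. */
char *mock_spname(const char *guess)
{
    static char buf[64];

    if (strlen(guess) >= sizeof buf)
        return NULL;
    strcpy(buf, guess);
    return buf;
}

/* The test as written in the snippet: a pointer comparison.  A static
 * buffer (or NULL) can never equal the caller's pointer, so this is
 * always 1, even when the result is NULL. */
int pointer_test(const char *guess)
{
    char *u;

    return (u = mock_spname(guess)) != guess;
}

/* The alternative reading: accept the result only if it is non-NULL
 * and its text differs from the input. */
int content_test(const char *guess)
{
    char *u;

    return (u = mock_spname(guess)) && strcmp(u, guess);
}
```

With this mock, pointer_test() yields 1 both for an unchanged name and for
a NULL result, whereas content_test() yields 0 in both cases, which is the
difference between the two readings of the original condition.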

The reason I was interested in the distance computed by spname() was that
it seemed reasonable to loop over the entire cdpath to find the best of all
possible matches, and also to use that distance as the starting value of d
in the next section of spckword():

    d = 100;
    scanhashtable(reswdtab, 1, 0, 0, spscan, 0);

That is, I'd prefer not to choose something from the hash tables if there's
a cdpath directory that's a better fit.  Presently (even before my patch)
zsh always prefers the hash table unless there's an exact match from
spname(), even if the hashed value is a less precise match.

Finally, spname() is a bit inconsistent, because it returns NULL if it
finds a match with a distance >= 3 in any leading path component, but
returns a copy of the input string even when it finds no match at all in
the final path component.  I suppose that's intended to allow one to create
new files in existing directories, correcting only the existing part of the
path, but it makes spname() ugly to use (and CORRECT_ALL less useful from
the user's perspective) in any case where the final component is required
to exist, such as for "cd".

So I'm not going to commit that patch -- which would be better off not
having to call spckword() recursively in any case -- pending resolution of
some of these issues.

Anybody have any comments?

-- 
Bart Schaefer                             Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net