From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 25067
From: Peter Stephenson
To: zsh-workers@sunsite.dk (Zsh hackers list)
Subject: compmatch behaviour
X-Mailer: MH-E 8.0.3; nmh 1.3-RC1; GNU Emacs 22.2.1
Date: Sun, 18 May 2008 20:01:39 +0100
Message-ID: <10710.1211137299@pws-pc>

Plea for help.  I usually face the completion code on my own, but I'm
just looking at trying to update the matching code to make it more
general.  (I believe I'm already relying on a change of Andrey's that
simplifies it in one respect.  Thank you.)  However, there's one bit
that's got me stumped, and unfortunately it's the core of the whole
business.

bld_line() in Src/Zle/compmatch.c works as follows:

- Input a "word pattern" (the test completion) and a "line pattern"
  (what we're matching it against).
- If we haven't yet got to the end of the line pattern:
  - If the line pattern is an equivalence class, then for *every*
    character that can match the character in the test word (yes, you
    read that correctly: if we're looking at upper case characters, for
    example, we will try every possible upper case character until it
    works)
    - set the character in the string from the command line
    - recurse to test with this character in place, with the line
      pattern advanced but the same word pattern
    - if it succeeded, return success.
  - If it's not an equivalence class, no problem: only one character
    to try.  Try it (same recursive logic but no nasty loop).
- If we've got to the end of the line pattern (i.e. have recursed to
  the extent where we've got a complete string from the command line):
  - try matching the test completion and the trial word
  - return success or failure.  (This causes the loop above either to
    return success or try with a new character in the equivalence
    class.)

In other words, to detect a match we try every possible character that
could possibly match and see if it does.  This is crazy.  Obviously
this doesn't generalize to larger groups of characters.
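
To make the cost concrete, here is a minimal sketch in C of the
recursion described above.  It is not the code from compmatch.c: the
Elem type and the functions in_class(), final_match(), build_line() and
brute_match() are hypothetical names invented for illustration, the
final test is replaced by a plain case-insensitive comparison, and the
loop tries every member of the class rather than only those that can
match the test word.  It assumes single-byte ASCII characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { MAXLEN = 64 };

/* Hypothetical pattern element (not zsh's own structure): a literal
 * character, or an equivalence class of upper or lower case letters. */
typedef enum { LIT, UPPER, LOWER } Kind;
typedef struct { Kind kind; char ch; } Elem;

/* Does character c belong to pattern element e? */
static int in_class(const Elem *e, int c)
{
    switch (e->kind) {
    case LIT:   return c == (unsigned char)e->ch;
    case UPPER: return isupper(c) != 0;
    case LOWER: return islower(c) != 0;
    }
    return 0;
}

/* Stand-in for the real matcher (an assumption): the trial line
 * matches the word if the two are equal ignoring case. */
static int final_match(const char *line, const char *word)
{
    size_t i;
    for (i = 0; line[i] && word[i]; i++)
        if (tolower((unsigned char)line[i]) !=
            tolower((unsigned char)word[i]))
            return 0;
    return line[i] == word[i];   /* both strings must end together */
}

/* Build a trial line one position at a time.  For a class element,
 * try every member in turn and recurse; at the end of the line
 * pattern, run the final match on the completed trial string. */
static int build_line(const Elem *lpat, size_t n, size_t pos,
                      char *line, const char *word, long *tries)
{
    int c;

    if (pos == n) {
        line[pos] = '\0';
        ++*tries;                /* count each full trial match */
        return final_match(line, word);
    }
    if (lpat[pos].kind == LIT) { /* only one character to try */
        line[pos] = lpat[pos].ch;
        return build_line(lpat, n, pos + 1, line, word, tries);
    }
    for (c = 1; c < 128; c++) {  /* the "nasty loop" over the class */
        if (!in_class(&lpat[pos], c))
            continue;
        line[pos] = (char)c;
        if (build_line(lpat, n, pos + 1, line, word, tries))
            return 1;
    }
    return 0;
}

/* Entry point: returns 1 on a match and stores in *tries the number
 * of complete trial strings that were tested. */
int brute_match(const Elem *lpat, size_t n, const char *word, long *tries)
{
    char line[MAXLEN];

    assert(n < MAXLEN);
    *tries = 0;
    return build_line(lpat, n, 0, line, word, tries);
}
```

With a line pattern of two UPPER elements and the word "zz", this
sketch tests 26 * 26 = 676 trial strings before it succeeds, so the
work grows exponentially with the number of class elements on the
line.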

I think the basic reason for this is something along the lines of the
following (I realise this isn't particularly coherent, but it's the
best I've got for now): because we can have patterns associated with
both the trial string and the word on the command line, we have got
ourselves into a position where the logic is naturally quadratic: both
sides can in principle change, and consequently we need to change one
side to see if it can match the other.

The code for bld_line() is in Src/Zle/compmatch.c.  If anyone can see a
way out of this mess, I'd be glad to hear even tentative theories.
Obviously, I will continue to look at it.  For now, however, I'm
stumped.

Another problem is that the match code makes extensive use of lengths,
which need to become character counts.  That means anything that
touches this code needs to use wide characters, which is a lot of
tortuous code.  However, that problem is in principle soluble.  We need
to get the first problem solved.

-- 
Peter Stephenson
Web page now at http://homepage.ntlworld.com/p.w.stephenson/