Date: Mon, 19 May 2008 08:11:43 -0700
From: Bart Schaefer
Subject: Re: compmatch behaviour
In-reply-to: <20080519113421.2ded4a42@news01>
To: zsh-workers@sunsite.dk (Zsh hackers list)
Message-id: <080519081143.ZM5427@torch.brasslantern.com>
References: <10710.1211137299@pws-pc> <080518165753.ZM4385@torch.brasslantern.com> <20080519113421.2ded4a42@news01>
Comments: In reply to Peter Stephenson "Re: compmatch behaviour" (May 19, 11:34am)

On May 19, 11:34am, Peter Stephenson wrote:
}
} Bart Schaefer wrote:
} > There are two situations being handled
} > simultaneously here, and maybe the first thing to do is to separate
} > them. The first situation is where wpat is a correspondence class
} > and we need to select the corresponding position out of lpat. The
} > second case is where lpat is an equivalence class and we need to try
} > every possible character in the class at line position *lp.
}
} Hmm... terminology first... Sven's "correspondence class" appears
} to be the one with the "equiv" flag set, i.e. {...}. So here the
} characters are numbered and we are searching for a particular one.

Actually Sven has, again, overloaded something with a similar structure
to serve multiple purposes. There are two possible cases:

(1) lpat->equiv is false OR wpat does not exist: an equivalence class.
    [every existing character position in lpat->tab is tried at *lp]

(2) wpat exists AND lpat->equiv is true: a correspondence class.
    [the character wpat->tab[*mword] must have a position in lpat->tab]

Case (1) also occurs as a degenerate of (2) when there is no character
in wpat for the current character in mword. I'm not sure why that's
correct.

} However, in my rewrite I want to be able to say "any upper case
} character" so that it can match the corresponding lower case
} character.

If it's only going to match the corresponding lower case character,
then you have [:upper:] in wpat->tab and you need to simulate case (2)
above. If your lookup in lpat->tab returns [:lower:], convert *mword
to lower case and you're done.

I have no idea how you plan to handle something like [:upper:] mapping
to [:digit:], though. There's a reason Sven chose to require
enumeration: this works more like "tr" than like "sed". The two
classes in case (2) ought to have the same number of values, because
it's the positions in each class that have to match up.

The bit you're worried about, though, is when you have [:upper:] in
lpat->tab and either no wpat or no character in wpat->tab for *mword.
Then you need to try all the possible upper case characters.

Sven's algorithm seems to be to build every possible combination all
the way out to the end of the line and then compare entire words,
discarding non-matches. I would think it's possible to try matching
the prefix so far, so that you can short-circuit the rest of the
process on a non-match.
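
To make the distinction between cases (1) and (2) concrete, here is a
rough standalone sketch. It is not the compmatch code: struct cclass
and class_match() are hypothetical names invented for illustration,
and the classes are plain enumerated strings rather than the real
pattern tables. It only shows the "tr"-like positional lookup, with
the fall back to "any member of the class" when there is no word
pattern or the word character has no position in it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a matcher character class (not zsh's own type). */
struct cclass {
    const char *tab;  /* enumerated characters, in order */
    int equiv;        /* nonzero: positions pair up with another class */
};

/*
 * Return 1 if line character lc is acceptable for word character wc.
 *
 * Case (2): wclass exists, lclass->equiv is set, and wc has a position
 *           in wclass->tab.  Then lc must be the character at the same
 *           position in lclass->tab (this is the "tr"-like behaviour).
 * Case (1): otherwise, any character listed in lclass->tab is accepted.
 */
int class_match(const struct cclass *lclass, const struct cclass *wclass,
                char lc, char wc)
{
    assert(lclass != NULL && lclass->tab != NULL);

    if (lc == '\0')
        return 0;  /* strchr() would otherwise match the terminator */

    if (wclass != NULL && lclass->equiv && wc != '\0') {
        const char *wp = strchr(wclass->tab, wc);
        if (wp != NULL) {
            size_t pos = (size_t)(wp - wclass->tab);
            /* Classes of unequal length simply fail past the shorter one. */
            return pos < strlen(lclass->tab) && lclass->tab[pos] == lc;
        }
        /* wc has no position in wclass: degenerate to case (1). */
    }

    return strchr(lclass->tab, lc) != NULL;
}
```

With lclass = {"abc", 1} and wclass = {"ABC", 1}, the pair ('b', 'B')
matches and ('a', 'B') does not, while ('a', 'x') matches through the
degenerate path because 'x' has no position in the word class.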