From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 2748 invoked from network); 1 Sep 1999 08:59:05 -0000 Received: from sunsite.auc.dk (130.225.51.30) by ns1.primenet.com.au with SMTP; 1 Sep 1999 08:59:05 -0000 Received: (qmail 27855 invoked by alias); 1 Sep 1999 08:58:59 -0000 Mailing-List: contact zsh-workers-help@sunsite.auc.dk; run by ezmlm Precedence: bulk X-No-Archive: yes X-Seq: 7594 Received: (qmail 27848 invoked from network); 1 Sep 1999 08:58:58 -0000 Message-Id: <9909010824.AA13835@ibmth.df.unipi.it> To: zsh-workers@sunsite.auc.dk Subject: Re: 6-pws-2 In-Reply-To: "Sven Wischnowsky"'s message of "Tue, 31 Aug 1999 10:22:20 DFT." <199908310822.KAA27325@beta.informatik.hu-berlin.de> Date: Wed, 01 Sep 1999 10:24:54 +0200 From: Peter Stephenson Sven Wischnowsky wrote: > I think we could > > 1) come back to my old suggestion to optimise some common patterns > (like *str, str*, *str*, s1|s2|s3) in the same way non-pattern > strings are already optimised The trouble here is that this makes the compilation time longer, which is probably(?) the main bottleneck in cases like case where the pattern is executed only once (I haven't done any profiling). str* is already (supposedly) quite well optimised at run time; in the next set of tweaks, *str should be, too. I imagine the pure string optimisation could be extended to alternatives of pure strings, but I'm not sure of the gain. > 3) store compiled patterns in the execution tree (for now I'm only > thinking about `case', `[[ .. = .. ]]' and `[[ .. != .. ]]' if the > patterns don't need to be singsub()ed, which could be checked at > parse time) This ought to be possible, although it'll take a certain amount of rearranging of structures to be able to flag whether something is a compiled regexp. Or maybe a new token could do that. Bart Schaefer wrote: > ... but even there it'll mostly help loops, not one-pass sorts of things > like sourcing init files. This is true, it won't help the case statements in question. > The ideal thing might be to "incrementally" compile the pattern; do just > enough to start discarding strings that can't possibly match, then do a > bit more upon finding one that might, etc., so parsing and compiling the > full pattern requires encountering the first successful match (unless it's > a really intractible pattern). This will have to live alongside the current structure, for cases where the pattern is going to be used more than once, so the code could get large and complicated. But it could probably be done to some extent. > Master's theses have been written on less > interesting problems ... you aren't looking to get a paper out of this, > are you, Peter? :-) Well, it looks like I'm giving up physics at the end of the month, so I need something else to write papers on. But I just don't know enough of the theory, anyway. -- Peter Stephenson Tel: +39 050 844536 WWW: http://www.ifh.de/~pws/ Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy