From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <zsh-workers-return-7594-mason-zsh=primenet.com.au@sunsite.auc.dk>
Received: (qmail 2748 invoked from network); 1 Sep 1999 08:59:05 -0000
Received: from sunsite.auc.dk (130.225.51.30)
  by ns1.primenet.com.au with SMTP; 1 Sep 1999 08:59:05 -0000
Received: (qmail 27855 invoked by alias); 1 Sep 1999 08:58:59 -0000
Mailing-List: contact zsh-workers-help@sunsite.auc.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 7594
Received: (qmail 27848 invoked from network); 1 Sep 1999 08:58:58 -0000
Message-Id: <9909010824.AA13835@ibmth.df.unipi.it>
To: zsh-workers@sunsite.auc.dk
Subject: Re: 6-pws-2 
In-Reply-To: "Sven Wischnowsky"'s message of "Tue, 31 Aug 1999 10:22:20 DFT."
             <199908310822.KAA27325@beta.informatik.hu-berlin.de> 
Date: Wed, 01 Sep 1999 10:24:54 +0200
From: Peter Stephenson <pws@ibmth.df.unipi.it>

Sven Wischnowsky wrote:
> I think we could
> 
> 1) come back to my old suggestion to optimise some common patterns
>    (like *str, str*, *str*, s1|s2|s3) in the same way non-pattern
>    strings are already optimised

The trouble here is that this makes the compilation time longer, which is
probably(?) the main bottleneck in cases like case where the pattern is
executed only once (I haven't done any profiling).  str* is already
(supposedly) quite well optimised at run time; in the next set of tweaks,
*str should be, too.  I imagine the pure string optimisation could be
extended to alternatives of pure strings, but I'm not sure of the gain.

> 3) store compiled patterns in the execution tree (for now I'm only
>    thinking about `case', `[[ .. = .. ]]' and `[[ .. != .. ]]' if the
>    patterns don't need to be singsub()ed, which could be checked at
>    parse time)

This ought to be possible, although it'll take a certain amount of
rearranging of structures to be able to flag whether something is a
compiled regexp.  Or maybe a new token could do that.

Bart Schaefer wrote:
> ... but even there it'll mostly help loops, not one-pass sorts of things
> like sourcing init files.

This is true, it won't help the case statements in question.

> The ideal thing might be to "incrementally" compile the pattern; do just
> enough to start discarding strings that can't possibly match, then do a
> bit more upon finding one that might, etc., so parsing and compiling the
> full pattern requires encountering the first successful match (unless it's
> a really intractible pattern).

This will have to live alongside the current structure, for cases where the
pattern is going to be used more than once, so the code could get large and
complicated.  But it could probably be done to some extent.

> Master's theses have been written on less
> interesting problems ... you aren't looking to get a paper out of this,
> are you, Peter?

:-) Well, it looks like I'm giving up physics at the end of the month, so I
need something else to write papers on.  But I just don't know enough of
the theory, anyway.

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy