From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 9593 invoked from network); 13 Apr 2000 12:23:07 -0000
Received: from sunsite.auc.dk (130.225.51.30) by ns1.primenet.com.au with SMTP; 13 Apr 2000 12:23:07 -0000
Received: (qmail 29669 invoked by alias); 13 Apr 2000 12:22:50 -0000
Mailing-List: contact zsh-workers-help@sunsite.auc.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 10735
Received: (qmail 29656 invoked from network); 13 Apr 2000 12:22:50 -0000
Subject: Re: matching flags (was: Re: Should we backup this change? RE: Modifier substitutions.)
In-Reply-To: <200004131137.NAA14795@beta.informatik.hu-berlin.de> from Sven Wischnowsky at "Apr 13, 2000 01:37:56 pm"
To: Sven Wischnowsky
Date: Thu, 13 Apr 2000 13:22:20 +0100 (BST)
CC: zsh-workers@sunsite.auc.dk
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id:
From: Zefram

Sven Wischnowsky wrote:
>Before I forget it... from the code in pattern.c it doesn't look too
>hard to support flags for the other special `gaps between characters'
>supported by some regexp systems (beginning/end of line/word). Would
>it be interesting enough to have them, too? Might be useful with
>parsing command outputs as in ${$(...)...} sometimes. (The definition
>of `word' might be a problem, of course...)

The more general solution would be to support user-specified anchors.
Word/line anchors can be defined in terms of lookahead/lookback
assertions (like Perl's (?!)). Suppose we define anchors thus:

    (#A   indicates anchor
    a     indicates looking ahead
    b     indicates looking back
    n     (optional) negates the anchor result
    :     separates flags from pattern
    )     pattern to check for

Then a `beginning of line' anchor (matching at the start of the string
or after an embedded newline) could be implemented as

    (#Ab:(#s)|$'\n')

(in this case, the (#s) could be taken out of the anchor).
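For comparison, the same `beginning of line' anchor can be sketched with the Perl-style assertions mentioned above. This is only an analogue in Python's re module, not the proposed (#A...) syntax, which is hypothetical; the names BOL and line_starts are made up for the illustration. Python's look-behind has to be fixed-width, so the start-of-string test sits outside the assertion, much as the (#s) could be taken out of the anchor.

```python
import re

# Rough analogue of the proposed (#Ab:(#s)|$'\n'):
# match at the start of the string, or at any position
# immediately preceded by a newline.
# \A stays outside the look-behind because Python requires
# look-behind patterns to have a fixed width.
BOL = r"(?:\A|(?<=\n))"


def line_starts(text):
    """Return the offsets in text at which a line begins."""
    return [m.start() for m in re.finditer(BOL, text)]


if __name__ == "__main__":
    print(line_starts("one\ntwo\nthree"))  # [0, 4, 8]
```

The anchor consumes no characters: each match is empty, and only its position is of interest.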

One reasonable `beginning of word' anchor would be

    (#Abn:[a-zA-Z0-9_])(#Aa:[a-zA-Z0-9_])

(read as "there is a word character immediately ahead, but not one
immediately behind"). Of course, anyone that uses these anchors
frequently would put them into a parameter, so that they can be invoked
as $~bol, etc.

-zefram
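
As a rough illustration of the `beginning of word' anchor, here is the same pair of assertions written with Python's re module: a negated look-behind followed by a look-ahead, both on the same character class. This is an analogue, not the proposed zsh syntax; WORD, BOW and word_starts are hypothetical names. Keeping the anchor in a named constant plays the part of putting it into a parameter.

```python
import re

# Character class used for "word character", as in the message.
WORD = "[a-zA-Z0-9_]"

# Analogue of (#Abn:[a-zA-Z0-9_])(#Aa:[a-zA-Z0-9_]):
# no word character immediately behind, one immediately ahead.
BOW = rf"(?<!{WORD})(?={WORD})"


def word_starts(text):
    """Return the offsets in text at which a word begins."""
    return [m.start() for m in re.finditer(BOW, text)]


if __name__ == "__main__":
    print(word_starts("foo bar_1, baz"))  # [0, 4, 11]
```

Changing the definition of `word' only means changing WORD; the anchor itself stays the same.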