From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID:
Date: Fri, 24 Oct 2008 14:10:36 -0700
From: "Russ Cox"
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@9fans.net>
Subject: Re: [9fans] non greedy regular expressions
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <765ef13a653652d5fcef9001ff70f814@quanstro.net> <20081024170237.68ED28DE7@okapi.maths.tcd.ie>
Topicbox-Message-UUID: 26ec8c24-ead4-11e9-9d60-3106f5b1d025

> I thought greedy=leftmost-longest, while non-greedy=leftmost-first:

Greedy leftmost-first is different from leftmost-longest.
Search for /a*(ab)?/ in "ab".  The leftmost-longest match
is "ab", but the leftmost-first match (because of the
greedy star) is "a".  In the leftmost-first case, the
greediness of the star caused a shorter overall match.

> All the thinking about this is simply removed with 'non-greedy' ops.

But it isn't (or shouldn't be).  Using /\(.*?\)/ to match
small parenthesized expressions is fragile: /\(.*?\)/ in "(a(b))"
matches "(a(b)".  In contrast, the solution you rejected
/\([^)]*\)/ is more robust.

It doesn't make sense to shoehorn non-greedy and greedy
operators into an engine that provides leftmost-longest
matching.  If you want a different model, you need to
use a different program.  Perl has been ported.

Russ
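
The two matching models contrasted above can be tried out with a short
sketch. Python's re module is a backtracking, leftmost-first engine, so it
stands in here for the Perl-style model. The leftmost_longest helper is a
hypothetical brute-force illustration written for this example; it is not
how the Plan 9 regexp library is implemented, only a way to observe the
same result on small inputs.

```python
import re


def leftmost_first(pattern, text):
    """Match with Python's backtracking engine (leftmost-first, Perl-style)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


def leftmost_longest(pattern, text):
    """Brute-force leftmost-longest (hypothetical helper, illustration only).

    At the earliest start position where any match exists, return the
    longest substring that the pattern matches in full.
    """
    for start in range(len(text) + 1):
        # Try the longest candidate first, shrinking toward the empty string.
        for end in range(len(text), start - 1, -1):
            if re.fullmatch(pattern, text[start:end]):
                return text[start:end]
    return None


if __name__ == "__main__":
    # The greedy star takes "a" first, so (ab)? can no longer match.
    print(leftmost_first(r"a*(ab)?", "ab"))      # a
    # Leftmost-longest considers every way to match and keeps the longest.
    print(leftmost_longest(r"a*(ab)?", "ab"))    # ab
    # A non-greedy star stops at the first ")", cutting the nesting short.
    print(leftmost_first(r"\(.*?\)", "(a(b))"))  # (a(b)
```

The brute-force helper runs a full match for every candidate substring, so
it is only suitable for experimenting with short strings like these.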