From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <7359f0490810241015s4d3107dagfef90e3fb463b173@mail.gmail.com> Date: Fri, 24 Oct 2008 10:15:51 -0700 From: "Rob Pike" To: "Fans of the OS Plan 9 from Bell Labs" <9fans@9fans.net> In-Reply-To: <20081024170237.68ED28DE7@okapi.maths.tcd.ie> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <765ef13a653652d5fcef9001ff70f814@quanstro.net> <20081024170237.68ED28DE7@okapi.maths.tcd.ie> Subject: Re: [9fans] non greedy regular expressions Topicbox-Message-UUID: 263ee326-ead4-11e9-9d60-3106f5b1d025 Backreferences within the pattern (such as in /(.*)\1/) make the matcher non-regular and exponentially hard. It was a deliberate decision not to implement them in sam and I'd make the same decision today. As far as greedy/non-greedy operators go, they have more utility but I believe they have become popular primarily to deal with the problems with Perl and PCRE doing greedy implementations badly in order to avoid the exponential cost of their backtracking matchers. These matchers can give crazy results. Sam etc. are O(length of string * length of pattern) at all times and their leftmost-longest properties work throughout. Again, Russ has most of this in his analysis. -rob