From mboxrd@z Thu Jan 1 00:00:00 1970 MIME-Version: 1.0 In-Reply-To: <3aaafc130908201052o3c8a6334v13350c18f78120d9@mail.gmail.com> References: <3aaafc130908201052o3c8a6334v13350c18f78120d9@mail.gmail.com> Date: Fri, 21 Aug 2009 07:24:28 -0700 Message-ID: Subject: Re: [9fans] Recursive structural expressions? From: Russ Cox To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 50102af6-ead5-11e9-9d60-3106f5b1d025 Regular expressions, in the strict formal sense, have an important property: they completely express the set of patterns that can be searched for in a single linear-time pass through the text. That is, they have an associated linear-time performance guarantee. In your particular case, adding recursion would produce context-free expressions, and general context-free parsing as typically implemented is cubic as opposed to linear. (It turns out to be the same problem as matrix multiplication, so with a lot more effort you could get it down to n^2.38 or perhaps even lower, but you can't possibly get sub-quadratic.) As Charles pointed out, there is a long, distinguished history of editors and other tools making regular expressions more expressive. It is important to remember that this is always done at the cost of giving up the performance guarantee, at least in some cases. And unless a general matcher can distinguish the different cases and implement separate algorithms for each, it usually gives up the performance guarantee in all cases. Russ