From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
Date: Fri, 24 Oct 2008 10:08:29 +0200
From: "Rudolf Sykora"
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@9fans.net>
In-Reply-To: <6988bd18a7bfb3d7def95c11e4dd9bd9@quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <6988bd18a7bfb3d7def95c11e4dd9bd9@quanstro.net>
Subject: Re: [9fans] non greedy regular expressions
Topicbox-Message-UUID: 261374e8-ead4-11e9-9d60-3106f5b1d025

> russ has a great writeup on this.
> http://swtch.com/~rsc/regexp/
> i think it covers all your questions.
>
> - erik

I read through some of that yesterday already. Still, I am puzzled.

In the text of

Regular Expression Matching Can Be Simple And Fast
(but is slow in Java, Perl, PHP, Python, Ruby, ...)

R. Cox writes:

---
While writing the text editor sam [6] in the early 1980s, Rob Pike
wrote a new regular expression implementation, which Dave Presotto
extracted into a library that appeared in the Eighth Edition. Pike's
implementation incorporated submatch tracking into an efficient NFA
simulation but, like the rest of the Eighth Edition source, was not
widely distributed.
...
Pike's regular expression implementation, extended to support Unicode,
was made freely available with sam in late 1992, but the particularly
efficient regular expression search algorithm went unnoticed. The code
is now available in many forms: as part of sam, as Plan 9's regular
expression library, or packaged separately for Unix.
---

But the manual pages I have looked at (regexp(6) and that of sam) say
nothing at all about, e.g., submatch tracking.

So what's wrong? Can anybody clarify the situation for me, or do I
really have to read the code?

Ruda