From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
Date: Thu, 7 Jun 2007 15:50:30 +0100
From: rog@vitanuova.com
To: 9fans@cse.psu.edu
Subject: Re: [9fans] regexp to match paragraphs in troff documents
In-Reply-To: <082ffdbcfda12b51637105e95ea479c8@mteege.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 7a94274e-ead2-11e9-9d60-3106f5b1d025

> I've tried to pipe paragraphs of a troff document to fmt but I
> have problems with the correct regular expression. My first attempt
> "/^\.[A-Z]++.*\n(^[^]*)*\n^\.[A-Z]++.*\n/" matches only any second
> paragraph because the expression is overlapping. Does anyone have a nice
> idea to match troff paragraphs?

it's not easy (maybe not possible?) to do what you're doing
unless you can unambiguously find the start and end of paragraphs
(e.g. between ^\.PP and ^$), in which case you can use something like:

	,x/^\.PP/.,/^$/|fmt

for other cases, i suppose it might be nice to have non-greedy
matching, in which case you could do something like:

	,x/^\.[A-Z][A-Z].*\n(.*\n)*?\.[A-Z][A-Z]\n/

russ: how easy do you think it would be to put non-greedy
matching into the acme/sam regexp engine?

PS. ++ means exactly the same as +
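
to illustrate what non-greedy matching would buy here, a rough sketch in python's re module (not sam, whose regexps have no *? operator). the sample text, the names TROFF, PARA and paragraphs, and the use of a lookahead are all assumptions made for the example. the lookahead stops each match just before the next request line without consuming it, which avoids the overlap that made the original expression hit only every second paragraph:

```python
import re

# Hypothetical sample troff input, made up for this illustration.
TROFF = """.PP
first paragraph line one
first paragraph line two
.PP
second paragraph
.SH
heading text
"""

# A request line (a dot and two capitals, e.g. .PP or .SH), followed by
# as few body lines as possible (non-greedy *?), ending just before the
# next request line or at the end of the input. The lookahead (?=...)
# checks for the next request line without consuming it, so adjacent
# paragraphs never overlap.
PARA = re.compile(
    r"^\.[A-Z][A-Z].*\n(?:.*\n)*?(?=\.[A-Z][A-Z]|\Z)",
    re.MULTILINE,
)


def paragraphs(text):
    """Return each request line together with the body lines that follow it."""
    return [m.group(0) for m in PARA.finditer(text)]


if __name__ == "__main__":
    for p in paragraphs(TROFF):
        print(repr(p))
```

each returned chunk could then be handed to a formatter in the way the x command pipes its selection to fmt. with a greedy (.*\n)* and no lookahead, the first match would run on to the last request line in the file instead of stopping at the next one.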