From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <f73b1f8195b654c13724f293b5020909@vitanuova.com>
Date: Thu,  7 Jun 2007 15:50:30 +0100
From: rog@vitanuova.com
To: 9fans@cse.psu.edu
Subject: Re: [9fans] regexp to match paragraphs in troff documents
In-Reply-To: <082ffdbcfda12b51637105e95ea479c8@mteege.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 7a94274e-ead2-11e9-9d60-3106f5b1d025

> I've tried to pipe paragraphs of a troff document to fmt but I
> have problems with the correct regular expression. My first attempt
> "/^\.[A-Z]++.*\n(^[^]*)*\n^\.[A-Z]++.*\n/" matches only any second
> paragraph because the expression is overlapping. Does anyone have a nice
> idea to match troff paragraphs?

it's not easy (maybe not possible?) to do what you're doing unless you
can unambiguously find the start and end of paragraphs (e.g. between ^\.PP and ^$),
in which case you can use something like:

,x/^\.PP/.,/^$/|fmt
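
as a rough illustration of what that loop does, here is a small python
sketch. it is not how sam works internally: refill_pp is a made-up name,
textwrap stands in for fmt, and the .PP line itself is kept apart from
the text being refilled (my choice, so the macro isn't joined into the
paragraph).

```python
import textwrap

def refill_pp(text, width=70):
    """Refill the text between each .PP line and the next empty line."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        out.append(line)
        i += 1
        if line.startswith(".PP"):
            # collect the paragraph body up to the next empty line
            body = []
            while i < len(lines) and lines[i] != "":
                body.append(lines[i])
                i += 1
            if body:
                # stand-in for piping the body through fmt
                out.extend(textwrap.wrap(" ".join(body), width=width))
    return "\n".join(out)
```

it assumes the paragraph body holds only plain text; any troff request
inside it would get joined into the filled lines.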

for other cases, i suppose it might be nice to have non-greedy matching,
in which case you could do something like:

,x/^\.[A-Z][A-Z].*\n(.*\n)*?\.[A-Z][A-Z].*\n/

russ: how easy do you think it would be to put non-greedy matching into
the acme/sam regexp engine?
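
for comparison, here is how the lazy version behaves in an engine that
already has it. this is a python sketch, not sam: the *? and the (?=...)
lookahead are python re features, and the name paragraphs is made up.
the lookahead is my addition; it leaves the closing macro line
unconsumed, since swallowing it would make the next search start past
that line and skip every second paragraph again.

```python
import re

# a paragraph: one macro line (.XX ...) followed by as few lines as
# possible, stopping just before the next macro line or at end of text.
# assumes the text ends with a newline.
LAZY = re.compile(r"^\.[A-Z][A-Z].*\n(?:.*\n)*?(?=\.[A-Z][A-Z]|\Z)", re.M)

# same pattern with a greedy star, for contrast.
GREEDY = re.compile(r"^\.[A-Z][A-Z].*\n(?:.*\n)*(?=\.[A-Z][A-Z]|\Z)", re.M)

def paragraphs(text, pattern=LAZY):
    """Return every non-overlapping match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]

doc = ".PP\nfirst\npara\n.SH\nsecond\n.LP\nthird\n"
```

with the lazy star, paragraphs(doc) splits doc into three paragraphs;
with paragraphs(doc, GREEDY) it comes back as a single match covering
everything.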

PS. ++ means exactly the same as +