From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Stalker
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-reply-to:
References: <20081024170237.68ED28DE7@okapi.maths.tcd.ie>
 <6520c845566013ada472281bf9c0da73@coraid.com>
 <2e4a50a0810241652r38d2aa1ft2b6fb9104d2988ae@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <2383.1225011467.1@maths.tcd.ie>
Date: Sun, 26 Oct 2008 08:57:47 +0000
Message-Id: <20081026085747.8F5F0868E@okapi.maths.tcd.ie>
Subject: Re: [9fans] non greedy regular expressions
Topicbox-Message-UUID: 2733a05a-ead4-11e9-9d60-3106f5b1d025

> Now. If the leftmost-longest match is usable for my problem, I am fine
> with C + regexp(6). If not I only see the possibility to use
> perl/python nowadays (if I don't want to go mad like above).

There is another option: yacc.  I'm not saying it's simpler than perl or
python, but it's not much harder and it's a more general tool.  The
resulting code will be fast, if you care.

> My question then is: wouldn't it be better to switch to the
> leftmost-first paradigm, hence open possible use of (non-)greedy
> operators, and in a way contribute to an accord with perl/python
> syntax? And use a good algorithm for that all? But maybe it's not
> worth and the current state is just sufficient...

Switching semantics in existing tools is rarely a good idea.  Too many
things break.  If anyone is going to do this, and it won't be me, then it
needs to be done in a way that doesn't break anything, like when UNIX
switched from basic to extended regular expressions and added a -E
option to grep.
--
John Stalker
School of Mathematics
Trinity College Dublin
tel +353 1 896 1983
fax +353 1 896 2282
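
P.S.  To make the yacc suggestion concrete, here is a rough, untested
sketch.  I have had to invent the input format, since your actual
problem isn't quoted here: it pulls the shortest <...> fields out of
standard input, which is the kind of job people usually reach for a
non-greedy <.*?> to do.  It is written for an ordinary yacc; the fixed
buffer size is arbitrary, and a stray '>' outside a field is simply a
syntax error in this sketch.

%{
/*
 * Sketch: print each <...> field found on stdin, shortest match,
 * roughly what a non-greedy <.*?> gives you in perl.
 */
#include <stdio.h>

static char buf[1024];          /* current field; size is arbitrary */
static int len;

int yylex(void);
void yyerror(char*);
%}

%token CHAR

%%

input:  /* empty */
     |  input item
     ;

item:   '<' text '>'    { buf[len] = '\0'; printf("field: %s\n", buf); }
     |  CHAR            { /* bytes outside <...> are ignored */ }
     ;

text:   /* empty */
     |  text CHAR       { if(len < (int)sizeof buf - 1) buf[len++] = $2; }
     ;

%%

int
yylex(void)
{
        int c;

        c = getchar();
        if(c == EOF)
                return 0;       /* end of input */
        if(c == '<'){
                len = 0;        /* start a new field */
                return c;
        }
        if(c == '>')
                return c;
        yylval = c;             /* anything else is a CHAR */
        return CHAR;
}

void
yyerror(char *s)
{
        fprintf(stderr, "%s\n", s);
}

int
main(void)
{
        return yyparse();
}

Run it through yacc, compile the generated y.tab.c, and feed it text on
standard input.  Once the input stops being regular enough for a single
expression, adding grammar rules tends to scale better than piling on
ever more elaborate patterns.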