* [PATCH] regex: support non-greedy quantifiers
@ 2016-03-13 11:06 Julien Ramseier
2016-06-16 15:33 ` Julien Ramseier
From: Julien Ramseier @ 2016-03-13 11:06 UTC
To: musl
Here's a tiny patch to enable non-greedy regex quantifiers.
This is not specified by POSIX, but I think it's a useful
extension, and all the code for supporting it is already present.

I tested this against the TRE and AT&T test suites (from NetBSD)
and didn't find any regressions.
However, I don't know all the ins and outs of the implementation
and I may have missed something obvious.

- Julien

diff --git a/src/regex/regcomp.c b/src/regex/regcomp.c
index 5fad98b..cc7d633 100644
--- a/src/regex/regcomp.c
+++ b/src/regex/regcomp.c
@@ -979,6 +979,7 @@ static reg_errcode_t tre_parse(tre_parse_ctx_t *ctx)
 	parse_iter:
 		for (;;) {
 			int min, max;
+			int minimal = 0;

 			if (*s!='\\' && *s!='*') {
 				if (!ere)
@@ -1014,11 +1015,16 @@ static reg_errcode_t tre_parse(tre_parse_ctx_t *ctx)
 				if (*s == '?')
 					max = 1;
 				s++;
+				/* Non-greedy */
+				if (ere && *s == '?') {
+					minimal = 1;
+					s++;
+				}
 			}
 			if (max == 0)
 				ctx->n = tre_ast_new_literal(ctx->mem, EMPTY, -1, -1);
 			else
-				ctx->n = tre_ast_new_iter(ctx->mem, ctx->n, min, max, 0);
+				ctx->n = tre_ast_new_iter(ctx->mem, ctx->n, min, max, minimal);
 			if (!ctx->n)
 				return REG_ESPACE;
 		}
--
2.7.2
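
One way to observe the behaviour this patch changes is a small probe built against the libc under test. The sketch below is illustrative only: `match_end` and `lazy_star_supported` are hypothetical helpers, not part of musl or of the patch, and the result depends entirely on the regex implementation that is linked in.

```c
#include <regex.h>

/* Hypothetical helper (not from musl or the patch): compile `pattern` as an
 * ERE, run it against `text`, and return the end offset of the overall
 * match, or -1 if compilation fails or nothing matches. */
static long match_end(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m[1];
	long end = -1;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	if (regexec(&re, text, 1, m, 0) == 0)
		end = (long)m[0].rm_eo;
	regfree(&re);
	return end;
}

/* Hypothetical probe: returns 1 if "*?" stops at the first 'b' (lazy),
 * 0 if it runs to the last 'b' (greedy), and -1 if the pattern is
 * rejected or does not match at all. */
static int lazy_star_supported(void)
{
	long end = match_end("a.*?b", "aXbXb");

	if (end < 0)
		return -1;
	return end == 3;
}
```

With the patch applied, `lazy_star_supported()` would be expected to return 1. An implementation that parses `*?` as two stacked greedy operators would return 0 instead, since the plain greedy pattern `a.*b` matches all of `aXbXb`.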
* Re: [PATCH] regex: support non-greedy quantifiers
2016-03-13 11:06 [PATCH] regex: support non-greedy quantifiers Julien Ramseier
@ 2016-06-16 15:33 ` Julien Ramseier
2016-06-16 19:10 ` Szabolcs Nagy
From: Julien Ramseier @ 2016-06-16 15:33 UTC
To: musl
> On 13 March 2016 at 12:06, Julien Ramseier <j.ramseier@gmail.com> wrote:
>
> Here's a tiny patch to enable non-greedy regex quantifiers.
> This is not specified by POSIX, but I think it's a useful
> extension, and all the code for supporting it is already present.
>
> I tested this against the TRE and AT&T test suites (from NetBSD)
> and didn't find any regressions.
> However I don't know all the ins and outs of the implementation
> and I may have missed something obvious.
>
> - Julien
>
> diff --git a/src/regex/regcomp.c b/src/regex/regcomp.c
> index 5fad98b..cc7d633 100644
> --- a/src/regex/regcomp.c
> +++ b/src/regex/regcomp.c
> @@ -979,6 +979,7 @@ static reg_errcode_t tre_parse(tre_parse_ctx_t *ctx)
>  	parse_iter:
>  		for (;;) {
>  			int min, max;
> +			int minimal = 0;
>
>  			if (*s!='\\' && *s!='*') {
>  				if (!ere)
> @@ -1014,11 +1015,16 @@ static reg_errcode_t tre_parse(tre_parse_ctx_t *ctx)
>  				if (*s == '?')
>  					max = 1;
>  				s++;
> +				/* Non-greedy */
> +				if (ere && *s == '?') {
> +					minimal = 1;
> +					s++;
> +				}
>  			}
>  			if (max == 0)
>  				ctx->n = tre_ast_new_literal(ctx->mem, EMPTY, -1, -1);
>  			else
> -				ctx->n = tre_ast_new_iter(ctx->mem, ctx->n, min, max, 0);
> +				ctx->n = tre_ast_new_iter(ctx->mem, ctx->n, min, max, minimal);
>  			if (!ctx->n)
>  				return REG_ESPACE;
>  		}
> --
> 2.7.2

Ping?
* Re: Re: [PATCH] regex: support non-greedy quantifiers
2016-06-16 15:33 ` Julien Ramseier
@ 2016-06-16 19:10 ` Szabolcs Nagy
2016-06-16 19:41 ` Rich Felker
From: Szabolcs Nagy @ 2016-06-16 19:10 UTC
To: musl
* Julien Ramseier <j.ramseier@gmail.com> [2016-06-16 17:33:48 +0200]:
> > On 13 March 2016 at 12:06, Julien Ramseier <j.ramseier@gmail.com> wrote:
> >
> > Here's a tiny patch to enable non-greedy regex quantifiers.
> > This is not specified by POSIX, but I think it's a useful
> > extension, and all the code for supporting it is already present.
...
>
> Ping?

musl is conservative about its regex syntax since extensions
are not portable across implementations so users cannot rely
on them.

i think this extension is not available in glibc and other
posix regex implementations either, it's a perl invention,
so it would be wrong to add it to musl, those who want perl
syntax can use the pcre lib.
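
For the common case of "shortest match up to a delimiter", portable POSIX syntax can often stand in for `.*?` by using a negated bracket expression. The sketch below is only an illustration under that assumption: `ere_match_end` is a hypothetical helper, not part of musl, and the rewrite only works when the delimiter is a single character.

```c
#include <regex.h>

/* Hypothetical helper (not from musl): return the end offset of the first
 * match of the ERE `pattern` in `text`, or -1 on compile error or no match. */
static long ere_match_end(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m[1];
	long end = -1;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	if (regexec(&re, text, 1, m, 0) == 0)
		end = (long)m[0].rm_eo;
	regfree(&re);
	return end;
}

/* Portable stand-in for the Perl-style "a.*?b": the bracket expression
 * cannot consume a 'b', so the match must end at the first one. */
static long shortest_a_to_b(const char *text)
{
	return ere_match_end("a[^b]*b", text);
}
```

On the input `aXbXb`, `shortest_a_to_b` stops after the first `b` (offset 3), while the greedy `a.*b` runs to the last one (offset 5). Both patterns use only syntax that POSIX specifies.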
* Re: Re: [PATCH] regex: support non-greedy quantifiers
2016-06-16 19:10 ` Szabolcs Nagy
@ 2016-06-16 19:41 ` Rich Felker
From: Rich Felker @ 2016-06-16 19:41 UTC
To: musl
On Thu, Jun 16, 2016 at 09:10:50PM +0200, Szabolcs Nagy wrote:
> * Julien Ramseier <j.ramseier@gmail.com> [2016-06-16 17:33:48 +0200]:
> > > On 13 March 2016 at 12:06, Julien Ramseier <j.ramseier@gmail.com> wrote:
> > >
> > > Here's a tiny patch to enable non-greedy regex quantifiers.
> > > This is not specified by POSIX, but I think it's a useful
> > > extension, and all the code for supporting it is already present.
> ...
> >
> > Ping?
>
> musl is conservative about its regex syntax since extensions
> are not portable across implementations so users cannot rely
> on them.
>
> i think this extension is not available in glibc and other
> posix regex implementations either, it's a perl invention,
> so it would be wrong to add it to musl, those who want perl
> syntax can use the pcre lib.
Indeed, and beyond the general principle of not adding extensions
without strong precedent and justification, I really do not want to be
adding new dubious extensions to something we're considering rewriting
completely. It would add implementation burden and constraints on the
new implementation.
Rich
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/