mailing list of musl libc
* [PATCH] regex: support non-greedy quantifiers
@ 2016-03-13 11:06 Julien Ramseier
  2016-06-16 15:33 ` Julien Ramseier
  0 siblings, 1 reply; 4+ messages in thread
From: Julien Ramseier @ 2016-03-13 11:06 UTC (permalink / raw)
  To: musl

Here's a tiny patch to enable non-greedy regex quantifiers.
This is not specified by POSIX, but I think it's a useful
extension, and all the code for supporting it is already present.

I tested this against the TRE and AT&T test suites (from NetBSD)
and didn't find any regressions.
However I don't know all the ins and outs of the implementation
and I may have missed something obvious.

- Julien

diff --git a/src/regex/regcomp.c b/src/regex/regcomp.c
index 5fad98b..cc7d633 100644
--- a/src/regex/regcomp.c
+++ b/src/regex/regcomp.c
@@ -979,6 +979,7 @@ static reg_errcode_t tre_parse(tre_parse_ctx_t *ctx)
 	parse_iter:
 		for (;;) {
 			int min, max;
+			int minimal = 0;
 
 			if (*s!='\\' && *s!='*') {
 				if (!ere)
@@ -1014,11 +1015,16 @@ static reg_errcode_t tre_parse(tre_parse_ctx_t *ctx)
 				if (*s == '?')
 					max = 1;
 				s++;
+				/* Non-greedy */
+				if (ere && *s == '?') {
+					minimal = 1;
+					s++;
+				}
 			}
 			if (max == 0)
 				ctx->n = tre_ast_new_literal(ctx->mem, EMPTY, -1, -1);
 			else
-				ctx->n = tre_ast_new_iter(ctx->mem, ctx->n, min, max, 0);
+				ctx->n = tre_ast_new_iter(ctx->mem, ctx->n, min, max, minimal);
 			if (!ctx->n)
 				return REG_ESPACE;
 		}
-- 
2.7.2
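
For context on the behaviour the patch changes: POSIX regexes return the
leftmost-longest match, so "<.*>" runs up to the last '>' in the input.
The sketch below is not part of the patch. ere_match_len is a
hypothetical helper written for illustration; it uses only the standard
regcomp/regexec/regfree calls.

```c
#include <assert.h>
#include <regex.h>

/* Hypothetical helper (not part of musl or of the patch above):
 * compile `pattern` as a POSIX ERE, run it on `text`, and return the
 * length of the overall match, or -1 if compilation fails or nothing
 * matches. */
static long ere_match_len(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m[1];
	long len = -1;

	assert(pattern && text);
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	/* m[0] receives the offsets of the whole match. */
	if (regexec(&re, text, 1, m, 0) == 0)
		len = (long)(m[0].rm_eo - m[0].rm_so);
	regfree(&re);
	return len;
}
```

Calling ere_match_len("<.*>", "<a><b>") covers the whole six-character
input rather than just "<a>". With the patch applied, "<.*?>" would
presumably stop at the first '>'; without it, POSIX leaves adjacent
duplication symbols such as "*?" undefined, so the sketch does not
exercise that pattern.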



* Re: [PATCH] regex: support non-greedy quantifiers
  2016-03-13 11:06 [PATCH] regex: support non-greedy quantifiers Julien Ramseier
@ 2016-06-16 15:33 ` Julien Ramseier
  2016-06-16 19:10   ` Szabolcs Nagy
  0 siblings, 1 reply; 4+ messages in thread
From: Julien Ramseier @ 2016-06-16 15:33 UTC (permalink / raw)
  To: musl


> On 13 March 2016 at 12:06, Julien Ramseier <j.ramseier@gmail.com> wrote:
> 
> Here's a tiny patch to enable non-greedy regex quantifiers.
> This is not specified by POSIX, but I think it's a useful
> extension, and all the code for supporting it is already present.
> 
> I tested this against the TRE and AT&T test suites (from NetBSD)
> and didn't find any regressions.
> However I don't know all the ins and outs of the implementation
> and I may have missed something obvious.
> 
> - Julien
> 
> diff --git a/src/regex/regcomp.c b/src/regex/regcomp.c
> index 5fad98b..cc7d633 100644
> --- a/src/regex/regcomp.c
> +++ b/src/regex/regcomp.c
> @@ -979,6 +979,7 @@ static reg_errcode_t tre_parse(tre_parse_ctx_t *ctx)
> 	parse_iter:
> 		for (;;) {
> 			int min, max;
> +			int minimal = 0;
> 
> 			if (*s!='\\' && *s!='*') {
> 				if (!ere)
> @@ -1014,11 +1015,16 @@ static reg_errcode_t tre_parse(tre_parse_ctx_t *ctx)
> 				if (*s == '?')
> 					max = 1;
> 				s++;
> +				/* Non-greedy */
> +				if (ere && *s == '?') {
> +					minimal = 1;
> +					s++;
> +				}
> 			}
> 			if (max == 0)
> 				ctx->n = tre_ast_new_literal(ctx->mem, EMPTY, -1, -1);
> 			else
> -				ctx->n = tre_ast_new_iter(ctx->mem, ctx->n, min, max, 0);
> +				ctx->n = tre_ast_new_iter(ctx->mem, ctx->n, min, max, minimal);
> 			if (!ctx->n)
> 				return REG_ESPACE;
> 		}
> -- 
> 2.7.2

Ping?





* Re: Re: [PATCH] regex: support non-greedy quantifiers
  2016-06-16 15:33 ` Julien Ramseier
@ 2016-06-16 19:10   ` Szabolcs Nagy
  2016-06-16 19:41     ` Rich Felker
  0 siblings, 1 reply; 4+ messages in thread
From: Szabolcs Nagy @ 2016-06-16 19:10 UTC (permalink / raw)
  To: musl

* Julien Ramseier <j.ramseier@gmail.com> [2016-06-16 17:33:48 +0200]:
> > On 13 March 2016 at 12:06, Julien Ramseier <j.ramseier@gmail.com> wrote:
> > 
> > Here's a tiny patch to enable non-greedy regex quantifiers.
> > This is not specified by POSIX, but I think it's a useful
> > extension, and all the code for supporting it is already present.
...
> 
> Ping?

musl is conservative about its regex syntax since extensions
are not portable across implementations so users cannot rely
on them.

i think this extension is not available in glibc and other
posix regex implementations either, it's a perl invention,
so it would be wrong to add it to musl, those who want perl
syntax can use the pcre lib.
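
As a portable alternative, many uses of a non-greedy quantifier can be
written in plain POSIX ERE with a negated bracket expression. The sketch
below is an illustration only, not something proposed in this thread;
find_first_tag and its parameters are hypothetical names, and the
pattern assumes the delimiter is a single character.

```c
#include <assert.h>
#include <regex.h>

/* Hypothetical helper: locate the first "<...>" run in `text` using
 * only POSIX ERE syntax. Returns 0 and fills *start / *end (byte
 * offsets, end exclusive) on a match, or -1 otherwise. */
static int find_first_tag(const char *text, int *start, int *end)
{
	regex_t re;
	regmatch_t m[1];
	int rc;

	assert(text && start && end);
	/* "[^>]*" cannot run past the first '>', so leftmost-longest
	 * matching still yields the short match that "<.*?>" gives in
	 * Perl syntax. */
	if (regcomp(&re, "<[^>]*>", REG_EXTENDED) != 0)
		return -1;
	rc = regexec(&re, text, 1, m, 0);
	if (rc == 0) {
		*start = (int)m[0].rm_so;
		*end = (int)m[0].rm_eo;
	}
	regfree(&re);
	return rc == 0 ? 0 : -1;
}
```

For the input "x<a><b>" this reports the span of "<a>" only. The two
forms are not equivalent in every case (for example with multi-character
terminators), which is where a library such as pcre is the better fit.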



* Re: Re: [PATCH] regex: support non-greedy quantifiers
  2016-06-16 19:10   ` Szabolcs Nagy
@ 2016-06-16 19:41     ` Rich Felker
  0 siblings, 0 replies; 4+ messages in thread
From: Rich Felker @ 2016-06-16 19:41 UTC (permalink / raw)
  To: musl

On Thu, Jun 16, 2016 at 09:10:50PM +0200, Szabolcs Nagy wrote:
> * Julien Ramseier <j.ramseier@gmail.com> [2016-06-16 17:33:48 +0200]:
> > > On 13 March 2016 at 12:06, Julien Ramseier <j.ramseier@gmail.com> wrote:
> > > 
> > > Here's a tiny patch to enable non-greedy regex quantifiers.
> > > This is not specified by POSIX, but I think it's a useful
> > > extension, and all the code for supporting it is already present.
> ...
> > 
> > Ping?
> 
> musl is conservative about its regex syntax since extensions
> are not portable across implementations so users cannot rely
> on them.
> 
> i think this extension is not available in glibc and other
> posix regex implementations either, it's a perl invention,
> so it would be wrong to add it to musl, those who want perl
> syntax can use the pcre lib.

Indeed, and beyond the general principle of not adding extensions
without strong precedent and justification, I really do not want to be
adding new dubious extensions to something we're considering rewriting
completely. It would add implementation burden and constraints on the
new implementation.

Rich


