mailing list of musl libc
* tre regex in single regcomp.c file
From: Julien Ramseier @ 2016-06-16 14:31 UTC
  To: musl

Hello,

Is there any reason most of the TRE regex sources have been merged into a
single 3k-line file (regcomp.c)?
This makes diff-ing them against the original sources[0] very painful.

Would a patch to split them back be accepted?

[0] https://github.com/laurikari/tre/

- Julien

* Re: tre regex in single regcomp.c file
From: Szabolcs Nagy @ 2016-06-16 15:12 UTC
  To: musl

* Julien Ramseier <j.ramseier@gmail.com> [2016-06-16 16:31:12 +0200]:
> Is there any reason most of the TRE regex sources have been merged into a
> single 3k-line file (regcomp.c)?

creating internal interfaces for implementing a single
self-contained public function (regcomp) is bad design.

(even if you split the code up, all the code will be
linked together if the public api symbol is referenced;
otherwise none of the code will be used. so you just
create linking overhead and headaches around the internal
api between the source files, which must obey posix
namespace rules etc.)
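
To make the namespace point concrete, here is a minimal sketch of what a
split layout would force. It is not musl's code: toy_compile and
__toy_count_wildcards are hypothetical names, and the three "files" are
shown in one block only so that it compiles on its own. The helper has
external linkage, so it needs a reserved (__-prefixed) name to avoid
clashing with application symbols in a static link.

```c
#include <stddef.h>

/* --- would live in a shared internal header (hypothetical) --- */
/* External linkage: a plain name such as count_wildcards could collide
 * with an application function of the same name, so the name has to be
 * taken from the reserved namespace. */
size_t __toy_count_wildcards(const char *s);

/* --- would live in a separate helper source file (hypothetical) --- */
size_t __toy_count_wildcards(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (*s == '.')
            n++;
    return n;
}

/* --- would live in the file defining the public function --- */
/* toy_compile stands in for a single public entry point like regcomp. */
int toy_compile(const char *pattern, size_t *n_wildcards)
{
    if (!pattern || !n_wildcards)
        return -1;
    *n_wildcards = __toy_count_wildcards(pattern);
    return 0;
}
```

Every helper shared between the files would need the same treatment,
and none of them is useful to a program unless the public function is
referenced as well.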

> This makes diff-ing them against the original sources[0] very painful.

the original tre is not suitable for libc use, at least
the namespace issues, alloca use, aborts, debug printfs
should be fixed.
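
As a rough illustration of the kind of fix meant here (a sketch under
assumptions, not taken from tre or musl): scratch space comes from
malloc rather than alloca, and failure is reported to the caller rather
than handled with abort or a debug printf. toy_reverse, TOY_OK and
TOY_ESPACE are hypothetical; TOY_ESPACE plays the role that REG_ESPACE
plays for regcomp.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum { TOY_OK = 0, TOY_ESPACE = 1 };

/* Writes the reverse of src into dst; dst must hold strlen(src) + 1 bytes. */
int toy_reverse(const char *src, char *dst)
{
    size_t n = strlen(src);

    /* Heap allocation instead of alloca: the size depends on the input,
     * so stack allocation could overflow the stack. */
    char *tmp = malloc(n + 1);
    if (!tmp)
        return TOY_ESPACE; /* report the error; a library must not abort */

    for (size_t i = 0; i < n; i++)
        tmp[i] = src[n - 1 - i];
    tmp[n] = '\0';

    memcpy(dst, tmp, n + 1);
    free(tmp);
    return TOY_OK;
}
```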

there were various other conformance issues and features
not relevant to the c runtime, the parser was rewritten e.g.
http://git.musl-libc.org/cgit/musl/commit/?id=ec1aed0a144b3e00e16eeb142c9d13362d6048e7

so diffing would be painful anyway.

> Would a patch to split them back be accepted?

unlikely

it's much more likely that the regex engine will be rewritten.

> [0] https://github.com/laurikari/tre/
> 
> - Julien


* Re: tre regex in single regcomp.c file
From: Rich Felker @ 2016-06-16 15:25 UTC
  To: musl

On Thu, Jun 16, 2016 at 05:12:24PM +0200, Szabolcs Nagy wrote:
> * Julien Ramseier <j.ramseier@gmail.com> [2016-06-16 16:31:12 +0200]:
> > Is there any reason most of the TRE regex sources have been merged into a
> > single 3k-line file (regcomp.c)?
> 
> creating internal interfaces for implementing a single
> self-contained public function (regcomp) is bad design.
> 
> (even if you split the code up, all the code will be
> linked together if the public api symbol is referenced;
> otherwise none of the code will be used. so you just
> create linking overhead and headaches around the internal
> api between the source files, which must obey posix
> namespace rules etc.)

Yes, the main motivation was to get rid of external namespace
pollution without renaming all the internal functions to be
__-prefixed, and also to allow inter-procedural analysis
optimizations.
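
A minimal sketch of the single-file approach, with hypothetical names
(toy_compile, count_char) rather than anything from regcomp.c: the
helper is static, so it keeps its plain name without being visible
outside the translation unit, and the compiler can see every call to it.

```c
#include <stddef.h>

/* Internal linkage: this name is not visible outside this translation
 * unit, so it needs no __ prefix and cannot clash with application code. */
static size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s; s++)
        if (*s == c)
            n++;
    return n;
}

/* toy_compile stands in for the single public entry point. */
int toy_compile(const char *pattern, size_t *n_wildcards)
{
    if (!pattern || !n_wildcards)
        return -1;
    /* The only call site, with a constant second argument. Because all
     * callers are visible, the compiler is free to inline or specialise
     * the helper; whether it does so depends on the compiler and flags. */
    *n_wildcards = count_char(pattern, '.');
    return 0;
}
```

Only toy_compile ends up as an external symbol of the object file in
this sketch.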

> > This makes diff-ing them against the original sources[0] very painful.
> 
> the original tre is not suitable for libc use, at least
> the namespace issues, alloca use, aborts, debug printfs
> should be fixed.
> 
> there were various other conformance issues and features
> not relevant to the c runtime, the parser was rewritten e.g.
> http://git.musl-libc.org/cgit/musl/commit/?id=ec1aed0a144b3e00e16eeb142c9d13362d6048e7
> 
> so diffing would be painful anyway.
> 
> > Would a patch to split them back be accepted?
> 
> unlikely
> 
> it's much more likely that the regex engine will be rewritten.

If the goal is to send improvements or fixes upstream (that would mean
picking maintenance of TRE back up yourself, I think) the right way
would be to read musl's git logs and follow the changes that way.

Rich


* Re: tre regex in single regcomp.c file
From: Julien Ramseier @ 2016-06-16 15:30 UTC
  To: musl


> On 16 June 2016, at 17:25, Rich Felker <dalias@libc.org> wrote:
>> 
>> creating internal interfaces for implementing a single
>> self-contained public function (regcomp) is bad design.
>> 
>> (even if you split the code up, all the code will be
>> linked together if the public api symbol is referenced;
>> otherwise none of the code will be used. so you just
>> create linking overhead and headaches around the internal
>> api between the source files, which must obey posix
>> namespace rules etc.)
> 
> Yes, the main motivation was to get rid of external namespace
> pollution without renaming all the internal functions to be
> __-prefixed, and also to allow inter-procedural analysis
> optimizations.
> 

That makes sense.
Thank you both for the explanations.

- Julien
