mailing list of musl libc
 help / color / mirror / code / Atom feed
* [PATCH] implement glibc's glob_pattern_p
@ 2014-07-17  9:12 Glauber Costa
  2014-07-19 20:11 ` Rich Felker
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Glauber Costa @ 2014-07-17  9:12 UTC (permalink / raw)
  To: musl; +Cc: Glauber Costa

Currently, glob is implemented but glob_pattern_p is not. This function
can trivially be implemented with existing glob machinery, since all it
does is to tell whether or not the string contains special symbols. This
is basically the same logic as is_literal, but with inverted return value.

The value of the "quote" parameter should also be inverted, since the man
page states that when the quote parameter is *non*-zero, then backslashes
are ignored. is_literal ignores them when useesc is *zero*.
---
 src/regex/glob.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/regex/glob.c b/src/regex/glob.c
index 6affee0..cc488b0 100644
--- a/src/regex/glob.c
+++ b/src/regex/glob.c
@@ -234,5 +234,11 @@ void globfree(glob_t *g)
 	g->gl_pathv = NULL;
 }
 
+int __glob_pattern_p(const char *pattern, int quote)
+{
+    return !is_literal(pattern, !quote);
+}
+
 LFS64(glob);
 LFS64(globfree);
+weak_alias(__glob_pattern_p, glob_pattern_p);
-- 
1.9.3



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-17  9:12 [PATCH] implement glibc's glob_pattern_p Glauber Costa
@ 2014-07-19 20:11 ` Rich Felker
  2014-07-19 23:17   ` Glauber Costa
  2014-07-20 20:36 ` Szabolcs Nagy
  2014-07-21 14:33 ` Rich Felker
  2 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2014-07-19 20:11 UTC (permalink / raw)
  To: Glauber Costa; +Cc: musl

On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote:
> Currently, glob is implemented but glob_pattern_p is not. This function
> can trivially be implemented with existing glob machinery, since all it
> does is to tell whether or not the string contains special symbols. This
> is basically the same logic as is_literal, but with inverted return value.
> 
> The value of the "quote" parameter should also be inverted, since the man
> page states that when the quote parameter is *non*-zero, then backslashes
> are ignored. is_literal ignores them when useesc is *zero*.

I don't where this function would meet any of the criteria for
exclusion, and so it looks like a candidate to add. Are there many
programs that need it? What is the motivation for adding it?

BTW thanks for making the patch clean and using weak_alias right.

Rich


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-19 20:11 ` Rich Felker
@ 2014-07-19 23:17   ` Glauber Costa
  2014-07-19 23:45     ` Rich Felker
  0 siblings, 1 reply; 12+ messages in thread
From: Glauber Costa @ 2014-07-19 23:17 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 1098 bytes --]

On Sun, Jul 20, 2014 at 12:11 AM, Rich Felker <dalias@libc.org> wrote:

> On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote:
> > Currently, glob is implemented but glob_pattern_p is not. This function
> > can trivially be implemented with existing glob machinery, since all it
> > does is to tell whether or not the string contains special symbols. This
> > is basically the same logic as is_literal, but with inverted return
> value.
> >
> > The value of the "quote" parameter should also be inverted, since the man
> > page states that when the quote parameter is *non*-zero, then backslashes
> > are ignored. is_literal ignores them when useesc is *zero*.
>
> I don't where this function would meet any of the criteria for
> exclusion, and so it looks like a candidate to add. Are there many
> programs that need it? What is the motivation for adding it?
>
> I found this missing symbol while running the rpm binary. I would not say
there are many programs using it.
It's the first time I hit it.


> BTW thanks for making the patch clean and using weak_alias right.
>

My pleasure.

[-- Attachment #2: Type: text/html, Size: 1663 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-19 23:17   ` Glauber Costa
@ 2014-07-19 23:45     ` Rich Felker
  2014-07-20 20:01       ` Thomas Petazzoni
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2014-07-19 23:45 UTC (permalink / raw)
  To: Glauber Costa; +Cc: musl

On Sun, Jul 20, 2014 at 03:17:27AM +0400, Glauber Costa wrote:
> On Sun, Jul 20, 2014 at 12:11 AM, Rich Felker <dalias@libc.org> wrote:
> 
> > On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote:
> > > Currently, glob is implemented but glob_pattern_p is not. This function
> > > can trivially be implemented with existing glob machinery, since all it
> > > does is to tell whether or not the string contains special symbols. This
> > > is basically the same logic as is_literal, but with inverted return
> > value.
> > >
> > > The value of the "quote" parameter should also be inverted, since the man
> > > page states that when the quote parameter is *non*-zero, then backslashes
> > > are ignored. is_literal ignores them when useesc is *zero*.
> >
> > I don't where this function would meet any of the criteria for
> > exclusion, and so it looks like a candidate to add. Are there many
> > programs that need it? What is the motivation for adding it?
> >
> > I found this missing symbol while running the rpm binary. I would not say
> there are many programs using it.
> It's the first time I hit it.

Do you have any examples? I ask because, in general when adding
non-standard extensions like this, there are two principles to check:

1. Criteria for exclusion:

   A. Does the interface have multiple incompatible historical
      versions where adding support for one would break software
      expecting the other?

   B. Does adding the interface force a more complex, less robust, or
      otherwise worse implementation of related standard functionality
      or otherwise have significant cost to support?

2. Criteria for inclusion:

   A. Are there existing programs using the interface?

   B. Is there a reason that adding the interface to libc is
      significantly better/easier than patching the programs to work
      without it?

Like I said glob_pattern_p looks okay but I think we should have at
least some documentation of programs using it as justification for
adding it.

Rich


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-19 23:45     ` Rich Felker
@ 2014-07-20 20:01       ` Thomas Petazzoni
  2014-07-21 10:19         ` Glauber Costa
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Petazzoni @ 2014-07-20 20:01 UTC (permalink / raw)
  To: Rich Felker; +Cc: Glauber Costa, musl

Dear Rich Felker,

On Sat, 19 Jul 2014 19:45:34 -0400, Rich Felker wrote:

> 2. Criteria for inclusion:
> 
>    A. Are there existing programs using the interface?

At least the popt library is using it. In Buildroot, this popt library
is used by cryptsetup, gptfdisk, libiscsi, logrotate, lttng-tools, next,
oprofile, rpm, rsync, samba and samba4.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-17  9:12 [PATCH] implement glibc's glob_pattern_p Glauber Costa
  2014-07-19 20:11 ` Rich Felker
@ 2014-07-20 20:36 ` Szabolcs Nagy
  2014-07-21 10:29   ` Glauber Costa
  2014-07-21 14:33 ` Rich Felker
  2 siblings, 1 reply; 12+ messages in thread
From: Szabolcs Nagy @ 2014-07-20 20:36 UTC (permalink / raw)
  To: Glauber Costa; +Cc: musl

* Glauber Costa <glommer@cloudius-systems.com> [2014-07-17 13:12:55 +0400]:
...

btw do you happen to know anything about the osv project
developed by cloudius-systems?

they seem to have copied a lot of code from musl, but
never said a word about it here.. would be nice to know
where they are going with it.. or why they decided to
use musl in the first place

and looking at the libc/ code in osv, they don't really
back port bug fixes so there are some old musl bugs there..


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-20 20:01       ` Thomas Petazzoni
@ 2014-07-21 10:19         ` Glauber Costa
  0 siblings, 0 replies; 12+ messages in thread
From: Glauber Costa @ 2014-07-21 10:19 UTC (permalink / raw)
  To: Thomas Petazzoni; +Cc: Rich Felker, musl

[-- Attachment #1: Type: text/plain, Size: 604 bytes --]

On Mon, Jul 21, 2014 at 12:01 AM, Thomas Petazzoni <
thomas.petazzoni@free-electrons.com> wrote:

> Dear Rich Felker,
>
> On Sat, 19 Jul 2014 19:45:34 -0400, Rich Felker wrote:
>
> > 2. Criteria for inclusion:
> >
> >    A. Are there existing programs using the interface?
>
> At least the popt library is using it. In Buildroot, this popt library
> is used by cryptsetup, gptfdisk, libiscsi, logrotate, lttng-tools, next,
> oprofile, rpm, rsync, samba and samba4.
>
> Best regards,
>
> You are correct. That popped out for me as an rpm dependency, but now
looking
again, it seems it comes through popt.

[-- Attachment #2: Type: text/html, Size: 1012 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-20 20:36 ` Szabolcs Nagy
@ 2014-07-21 10:29   ` Glauber Costa
  2014-07-21 14:11     ` Rich Felker
  0 siblings, 1 reply; 12+ messages in thread
From: Glauber Costa @ 2014-07-21 10:29 UTC (permalink / raw)
  To: Glauber Costa, musl

[-- Attachment #1: Type: text/plain, Size: 2314 bytes --]

Hi,

On Mon, Jul 21, 2014 at 12:36 AM, Szabolcs Nagy <nsz@port70.net> wrote:

> * Glauber Costa <glommer@cloudius-systems.com> [2014-07-17 13:12:55
> +0400]:
> ...
>
> btw do you happen to know anything about the osv project
> developed by cloudius-systems?
>

Since we have no other projects than OSv at the moment, yes, I do know
about it =)


>
> they seem to have copied a lot of code from musl, but
> never said a word about it here.. would be nice to know
> where they are going with it.. or why they decided to
> use musl in the first place
>

I am sorry about that! The decision to use musl was a very early
one, before I joined the company (even though I am employee # 4 =p)
As far as I know, it was motivated by both the license, and the coverage set
which was quite good, with good code quality. It also fits very well our
goal
of being lightweight and lean.

From my perspective, it would be good for us to be just able to import
upstream
musl without having it in our tree - exactly for the reasons you point
below, of bugfixes.
But we don't really use all of it, some files were converted to C++, and
other small
changes to fit our environment, etc. Part of the reason we don't use musl
in its
entirety, is that our core kernel provides the linux-compatible apis
directly, without
a wrapper being needed. We also don't call __syscall, etc.

It would definitely be good if musl had a build environment that allowed us
to build
just part of it and be more flexible with what we include in the image,
etc. But it is
not our priority to work on it right now - we're just too few people, with
too big of a
task =(


>
> and looking at the libc/ code in osv, they don't really
> back port bug fixes so there are some old musl bugs there..
>

Yes, we don't have the capacity to track it, that's why I believe
eventually we will have
to work towards being able to consume musl as a totally external entity,
rather than
importing code in our libc directory - if the changes needed for that are
acceptable
for you guys, of course. If you have from the top of your head the hashes
for some
of those bugfixes you mention, that would of course be a great service =)

What I can and happily will do, for the moment, is to send you back
whatever changes
we make on our side, like this one patch here.

Thanks again

[-- Attachment #2: Type: text/html, Size: 3275 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-21 10:29   ` Glauber Costa
@ 2014-07-21 14:11     ` Rich Felker
  2014-07-21 23:45       ` John Spencer
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2014-07-21 14:11 UTC (permalink / raw)
  To: Glauber Costa; +Cc: musl

On Mon, Jul 21, 2014 at 02:29:59PM +0400, Glauber Costa wrote:
> > they seem to have copied a lot of code from musl, but
> > never said a word about it here.. would be nice to know
> > where they are going with it.. or why they decided to
> > use musl in the first place
> >
> 
> I am sorry about that! The decision to use musl was a very early

No need to be sorry. I think nsz just wanted to alert you that there
might be serious bugs that should be fixed if you're going to continue
to use a fork. The WHATSNEW file does a pretty good job of summarizing
important fixes in each release and could be used to find patches to
backport. If your fork is pre-1.0, I would follow the 1.0.x series
WHATSNEW file as long as that branch is maintained since you'll have a
lot less non-bugfix items to sift through than with the 1.1.x series
which has lots of new development in addition to bug fixes.

Rich


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-17  9:12 [PATCH] implement glibc's glob_pattern_p Glauber Costa
  2014-07-19 20:11 ` Rich Felker
  2014-07-20 20:36 ` Szabolcs Nagy
@ 2014-07-21 14:33 ` Rich Felker
  2014-07-21 22:44   ` Glauber Costa
  2 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2014-07-21 14:33 UTC (permalink / raw)
  To: musl

On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote:
> Currently, glob is implemented but glob_pattern_p is not. This function
> can trivially be implemented with existing glob machinery, since all it
> does is to tell whether or not the string contains special symbols. This
> is basically the same logic as is_literal, but with inverted return value.
> 
> The value of the "quote" parameter should also be inverted, since the man
> page states that when the quote parameter is *non*-zero, then backslashes
> are ignored. is_literal ignores them when useesc is *zero*.

Upon reviewing this patch, I'm not sure if it's correct. It wrote this
code (glob.c) roughly 10 years ago and barely touched it since, but
from what I can tell, the existing is_literal looks like a premature
optimization to avoid calling fnmatch when it's not necessary than an
accurate predicate. In other words, it's designed to reliably return 0
if the string is a glob, but there are also cases where it returns 0
when the string is not a "glob pattern" but not purely a literal
either (e.g. anything containing backslash when useesc is nonzero,
since such strings cannot be processed with strcmp and need fnmatch).

My leaning would be to remove this "optimization" from glob.c entirely
(it shouldn't be needed if fnmatch is efficient) and move the code to
a separate file for glob_pattern_p, but then we need to make sure it
gets things right with no false positives or false negatives.

Rich


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-21 14:33 ` Rich Felker
@ 2014-07-21 22:44   ` Glauber Costa
  0 siblings, 0 replies; 12+ messages in thread
From: Glauber Costa @ 2014-07-21 22:44 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1642 bytes --]

On Mon, Jul 21, 2014 at 6:33 PM, Rich Felker <dalias@libc.org> wrote:

> On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote:
> > Currently, glob is implemented but glob_pattern_p is not. This function
> > can trivially be implemented with existing glob machinery, since all it
> > does is to tell whether or not the string contains special symbols. This
> > is basically the same logic as is_literal, but with inverted return
> value.
> >
> > The value of the "quote" parameter should also be inverted, since the man
> > page states that when the quote parameter is *non*-zero, then backslashes
> > are ignored. is_literal ignores them when useesc is *zero*.
>
> Upon reviewing this patch, I'm not sure if it's correct. It wrote this
> code (glob.c) roughly 10 years ago and barely touched it since, but
> from what I can tell, the existing is_literal looks like a premature
> optimization to avoid calling fnmatch when it's not necessary than an
> accurate predicate. In other words, it's designed to reliably return 0
> if the string is a glob, but there are also cases where it returns 0
> when the string is not a "glob pattern" but not purely a literal
> either (e.g. anything containing backslash when useesc is nonzero,
> since such strings cannot be processed with strcmp and need fnmatch).
>
> My leaning would be to remove this "optimization" from glob.c entirely
> (it shouldn't be needed if fnmatch is efficient) and move the code to
> a separate file for glob_pattern_p, but then we need to make sure it
> gets things right with no false positives or false negatives.
>
> Hi

I will come back to this soon, then.

Cya

[-- Attachment #2: Type: text/html, Size: 2205 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] implement glibc's glob_pattern_p
  2014-07-21 14:11     ` Rich Felker
@ 2014-07-21 23:45       ` John Spencer
  0 siblings, 0 replies; 12+ messages in thread
From: John Spencer @ 2014-07-21 23:45 UTC (permalink / raw)
  To: glommer; +Cc: musl

Rich Felker wrote:
> On Mon, Jul 21, 2014 at 02:29:59PM +0400, Glauber Costa wrote:
>>> they seem to have copied a lot of code from musl, but
>>> never said a word about it here.. would be nice to know
>>> where they are going with it.. or why they decided to
>>> use musl in the first place
>>>
>> I am sorry about that! The decision to use musl was a very early

i was not delighted when i saw some of the musl code turned into C++ in 
the osv repo...

> 
> No need to be sorry. I think nsz just wanted to alert you that there
> might be serious bugs that should be fixed if you're going to continue
> to use a fork. The WHATSNEW file does a pretty good job of summarizing
> important fixes in each release and could be used to find patches to
> backport.

my idea to keep a fork in tune using a semi-automated approach:

- use a comment line like // SOURCE: $MUSL/src/unistd/getpid.c in every 
file in libc/ that's copied from musl

- use a shell script that updates a musl clone and then iterates over 
the files in osv/libc, and for each file that contains SOURCE tag do 
"git log -p $MUSL/$SOURCE" to see an overview of all commits and changes 
to that file.

you could even add the commit id from which the source was imported as 
another comment to restrict the set of changes that's shown to only the 
relevant ones. after backporting changes, you could manually raise the 
commit id to the one of the corresponding musl commit.

if done like regularly, like once a month, you could probably get away 
with very little work to get all the relevant changes backported.


--JS


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-07-21 23:45 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-17  9:12 [PATCH] implement glibc's glob_pattern_p Glauber Costa
2014-07-19 20:11 ` Rich Felker
2014-07-19 23:17   ` Glauber Costa
2014-07-19 23:45     ` Rich Felker
2014-07-20 20:01       ` Thomas Petazzoni
2014-07-21 10:19         ` Glauber Costa
2014-07-20 20:36 ` Szabolcs Nagy
2014-07-21 10:29   ` Glauber Costa
2014-07-21 14:11     ` Rich Felker
2014-07-21 23:45       ` John Spencer
2014-07-21 14:33 ` Rich Felker
2014-07-21 22:44   ` Glauber Costa

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).