* [PATCH] implement glibc's glob_pattern_p @ 2014-07-17 9:12 Glauber Costa 2014-07-19 20:11 ` Rich Felker ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Glauber Costa @ 2014-07-17 9:12 UTC (permalink / raw) To: musl; +Cc: Glauber Costa Currently, glob is implemented but glob_pattern_p is not. This function can trivially be implemented with existing glob machinery, since all it does is to tell whether or not the string contains special symbols. This is basically the same logic as is_literal, but with inverted return value. The value of the "quote" parameter should also be inverted, since the man page states that when the quote parameter is *non*-zero, then backslashes are ignored. is_literal ignores them when useesc is *zero*. --- src/regex/glob.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/regex/glob.c b/src/regex/glob.c index 6affee0..cc488b0 100644 --- a/src/regex/glob.c +++ b/src/regex/glob.c @@ -234,5 +234,11 @@ void globfree(glob_t *g) g->gl_pathv = NULL; } +int __glob_pattern_p(const char *pattern, int quote) +{ + return !is_literal(pattern, !quote); +} + LFS64(glob); LFS64(globfree); +weak_alias(__glob_pattern_p, glob_pattern_p); -- 1.9.3 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-17 9:12 [PATCH] implement glibc's glob_pattern_p Glauber Costa @ 2014-07-19 20:11 ` Rich Felker 2014-07-19 23:17 ` Glauber Costa 2014-07-20 20:36 ` Szabolcs Nagy 2014-07-21 14:33 ` Rich Felker 2 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2014-07-19 20:11 UTC (permalink / raw) To: Glauber Costa; +Cc: musl On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote: > Currently, glob is implemented but glob_pattern_p is not. This function > can trivially be implemented with existing glob machinery, since all it > does is to tell whether or not the string contains special symbols. This > is basically the same logic as is_literal, but with inverted return value. > > The value of the "quote" parameter should also be inverted, since the man > page states that when the quote parameter is *non*-zero, then backslashes > are ignored. is_literal ignores them when useesc is *zero*. I don't where this function would meet any of the criteria for exclusion, and so it looks like a candidate to add. Are there many programs that need it? What is the motivation for adding it? BTW thanks for making the patch clean and using weak_alias right. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-19 20:11 ` Rich Felker @ 2014-07-19 23:17 ` Glauber Costa 2014-07-19 23:45 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread From: Glauber Costa @ 2014-07-19 23:17 UTC (permalink / raw) To: Rich Felker; +Cc: musl [-- Attachment #1: Type: text/plain, Size: 1098 bytes --] On Sun, Jul 20, 2014 at 12:11 AM, Rich Felker <dalias@libc.org> wrote: > On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote: > > Currently, glob is implemented but glob_pattern_p is not. This function > > can trivially be implemented with existing glob machinery, since all it > > does is to tell whether or not the string contains special symbols. This > > is basically the same logic as is_literal, but with inverted return > value. > > > > The value of the "quote" parameter should also be inverted, since the man > > page states that when the quote parameter is *non*-zero, then backslashes > > are ignored. is_literal ignores them when useesc is *zero*. > > I don't where this function would meet any of the criteria for > exclusion, and so it looks like a candidate to add. Are there many > programs that need it? What is the motivation for adding it? > > I found this missing symbol while running the rpm binary. I would not say there are many programs using it. It's the first time I hit it. > BTW thanks for making the patch clean and using weak_alias right. > My pleasure. [-- Attachment #2: Type: text/html, Size: 1663 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-19 23:17 ` Glauber Costa @ 2014-07-19 23:45 ` Rich Felker 2014-07-20 20:01 ` Thomas Petazzoni 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2014-07-19 23:45 UTC (permalink / raw) To: Glauber Costa; +Cc: musl On Sun, Jul 20, 2014 at 03:17:27AM +0400, Glauber Costa wrote: > On Sun, Jul 20, 2014 at 12:11 AM, Rich Felker <dalias@libc.org> wrote: > > > On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote: > > > Currently, glob is implemented but glob_pattern_p is not. This function > > > can trivially be implemented with existing glob machinery, since all it > > > does is to tell whether or not the string contains special symbols. This > > > is basically the same logic as is_literal, but with inverted return > > value. > > > > > > The value of the "quote" parameter should also be inverted, since the man > > > page states that when the quote parameter is *non*-zero, then backslashes > > > are ignored. is_literal ignores them when useesc is *zero*. > > > > I don't where this function would meet any of the criteria for > > exclusion, and so it looks like a candidate to add. Are there many > > programs that need it? What is the motivation for adding it? > > > > I found this missing symbol while running the rpm binary. I would not say > there are many programs using it. > It's the first time I hit it. Do you have any examples? I ask because, in general when adding non-standard extensions like this, there are two principles to check: 1. Criteria for exclusion: A. Does the interface have multiple incompatible historical versions where adding support for one would break software expecting the other? B. Does adding the interface force a more complex, less robust, or otherwise worse implementation of related standard functionality or otherwise have significant cost to support? 2. Criteria for inclusion: A. Are there existing programs using the interface? B. Is there a reason that adding the interface to libc is significantly better/easier than patching the programs to work without it? Like I said glob_pattern_p looks okay but I think we should have at least some documentation of programs using it as justification for adding it. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-19 23:45 ` Rich Felker @ 2014-07-20 20:01 ` Thomas Petazzoni 2014-07-21 10:19 ` Glauber Costa 0 siblings, 1 reply; 12+ messages in thread From: Thomas Petazzoni @ 2014-07-20 20:01 UTC (permalink / raw) To: Rich Felker; +Cc: Glauber Costa, musl Dear Rich Felker, On Sat, 19 Jul 2014 19:45:34 -0400, Rich Felker wrote: > 2. Criteria for inclusion: > > A. Are there existing programs using the interface? At least the popt library is using it. In Buildroot, this popt library is used by cryptsetup, gptfdisk, libiscsi, logrotate, lttng-tools, next, oprofile, rpm, rsync, samba and samba4. Best regards, Thomas -- Thomas Petazzoni, CTO, Free Electrons Embedded Linux, Kernel and Android engineering http://free-electrons.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-20 20:01 ` Thomas Petazzoni @ 2014-07-21 10:19 ` Glauber Costa 0 siblings, 0 replies; 12+ messages in thread From: Glauber Costa @ 2014-07-21 10:19 UTC (permalink / raw) To: Thomas Petazzoni; +Cc: Rich Felker, musl [-- Attachment #1: Type: text/plain, Size: 604 bytes --] On Mon, Jul 21, 2014 at 12:01 AM, Thomas Petazzoni < thomas.petazzoni@free-electrons.com> wrote: > Dear Rich Felker, > > On Sat, 19 Jul 2014 19:45:34 -0400, Rich Felker wrote: > > > 2. Criteria for inclusion: > > > > A. Are there existing programs using the interface? > > At least the popt library is using it. In Buildroot, this popt library > is used by cryptsetup, gptfdisk, libiscsi, logrotate, lttng-tools, next, > oprofile, rpm, rsync, samba and samba4. > > Best regards, > > You are correct. That popped out for me as an rpm dependency, but now looking again, it seems it comes through popt. [-- Attachment #2: Type: text/html, Size: 1012 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-17 9:12 [PATCH] implement glibc's glob_pattern_p Glauber Costa 2014-07-19 20:11 ` Rich Felker @ 2014-07-20 20:36 ` Szabolcs Nagy 2014-07-21 10:29 ` Glauber Costa 2014-07-21 14:33 ` Rich Felker 2 siblings, 1 reply; 12+ messages in thread From: Szabolcs Nagy @ 2014-07-20 20:36 UTC (permalink / raw) To: Glauber Costa; +Cc: musl * Glauber Costa <glommer@cloudius-systems.com> [2014-07-17 13:12:55 +0400]: ... btw do you happen to know anything about the osv project developed by cloudius-systems? they seem to have copied a lot of code from musl, but never said a word about it here.. would be nice to know where they are going with it.. or why they decided to use musl in the first place and looking at the libc/ code in osv, they don't really back port bug fixes so there are some old musl bugs there.. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-20 20:36 ` Szabolcs Nagy @ 2014-07-21 10:29 ` Glauber Costa 2014-07-21 14:11 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread From: Glauber Costa @ 2014-07-21 10:29 UTC (permalink / raw) To: Glauber Costa, musl [-- Attachment #1: Type: text/plain, Size: 2314 bytes --] Hi, On Mon, Jul 21, 2014 at 12:36 AM, Szabolcs Nagy <nsz@port70.net> wrote: > * Glauber Costa <glommer@cloudius-systems.com> [2014-07-17 13:12:55 > +0400]: > ... > > btw do you happen to know anything about the osv project > developed by cloudius-systems? > Since we have no other projects than OSv at the moment, yes, I do know about it =) > > they seem to have copied a lot of code from musl, but > never said a word about it here.. would be nice to know > where they are going with it.. or why they decided to > use musl in the first place > I am sorry about that! The decision to use musl was a very early one, before I joined the company (even though I am employee # 4 =p) As far as I know, it was motivated by both the license, and the coverage set which was quite good, with good code quality. It also fits very well our goal of being lightweight and lean. From my perspective, it would be good for us to be just able to import upstream musl without having it in our tree - exactly for the reasons you point below, of bugfixes. But we don't really use all of it, some files were converted to C++, and other small changes to fit our environment, etc. Part of the reason we don't use musl in its entirety, is that our core kernel provides the linux-compatible apis directly, without a wrapper being needed. We also don't call __syscall, etc. It would definitely be good if musl had a build environment that allowed us to build just part of it and be more flexible with what we include in the image, etc. But it is not our priority to work on it right now - we're just too few people, with too big of a task =( > > and looking at the libc/ code in osv, they don't really > back port bug fixes so there are some old musl bugs there.. > Yes, we don't have the capacity to track it, that's why I believe eventually we will have to work towards being able to consume musl as a totally external entity, rather than importing code in our libc directory - if the changes needed for that are acceptable for you guys, of course. If you have from the top of your head the hashes for some of those bugfixes you mention, that would of course be a great service =) What I can and happily will do, for the moment, is to send you back whatever changes we make on our side, like this one patch here. Thanks again [-- Attachment #2: Type: text/html, Size: 3275 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-21 10:29 ` Glauber Costa @ 2014-07-21 14:11 ` Rich Felker 2014-07-21 23:45 ` John Spencer 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2014-07-21 14:11 UTC (permalink / raw) To: Glauber Costa; +Cc: musl On Mon, Jul 21, 2014 at 02:29:59PM +0400, Glauber Costa wrote: > > they seem to have copied a lot of code from musl, but > > never said a word about it here.. would be nice to know > > where they are going with it.. or why they decided to > > use musl in the first place > > > > I am sorry about that! The decision to use musl was a very early No need to be sorry. I think nsz just wanted to alert you that there might be serious bugs that should be fixed if you're going to continue to use a fork. The WHATSNEW file does a pretty good job of summarizing important fixes in each release and could be used to find patches to backport. If your fork is pre-1.0, I would follow the 1.0.x series WHATSNEW file as long as that branch is maintained since you'll have a lot less non-bugfix items to sift through than with the 1.1.x series which has lots of new development in addition to bug fixes. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-21 14:11 ` Rich Felker @ 2014-07-21 23:45 ` John Spencer 0 siblings, 0 replies; 12+ messages in thread From: John Spencer @ 2014-07-21 23:45 UTC (permalink / raw) To: glommer; +Cc: musl Rich Felker wrote: > On Mon, Jul 21, 2014 at 02:29:59PM +0400, Glauber Costa wrote: >>> they seem to have copied a lot of code from musl, but >>> never said a word about it here.. would be nice to know >>> where they are going with it.. or why they decided to >>> use musl in the first place >>> >> I am sorry about that! The decision to use musl was a very early i was not delighted when i saw some of the musl code turned into C++ in the osv repo... > > No need to be sorry. I think nsz just wanted to alert you that there > might be serious bugs that should be fixed if you're going to continue > to use a fork. The WHATSNEW file does a pretty good job of summarizing > important fixes in each release and could be used to find patches to > backport. my idea to keep a fork in tune using a semi-automated approach: - use a comment line like // SOURCE: $MUSL/src/unistd/getpid.c in every file in libc/ that's copied from musl - use a shell script that updates a musl clone and then iterates over the files in osv/libc, and for each file that contains SOURCE tag do "git log -p $MUSL/$SOURCE" to see an overview of all commits and changes to that file. you could even add the commit id from which the source was imported as another comment to restrict the set of changes that's shown to only the relevant ones. after backporting changes, you could manually raise the commit id to the one of the corresponding musl commit. if done like regularly, like once a month, you could probably get away with very little work to get all the relevant changes backported. --JS ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-17 9:12 [PATCH] implement glibc's glob_pattern_p Glauber Costa 2014-07-19 20:11 ` Rich Felker 2014-07-20 20:36 ` Szabolcs Nagy @ 2014-07-21 14:33 ` Rich Felker 2014-07-21 22:44 ` Glauber Costa 2 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2014-07-21 14:33 UTC (permalink / raw) To: musl On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote: > Currently, glob is implemented but glob_pattern_p is not. This function > can trivially be implemented with existing glob machinery, since all it > does is to tell whether or not the string contains special symbols. This > is basically the same logic as is_literal, but with inverted return value. > > The value of the "quote" parameter should also be inverted, since the man > page states that when the quote parameter is *non*-zero, then backslashes > are ignored. is_literal ignores them when useesc is *zero*. Upon reviewing this patch, I'm not sure if it's correct. It wrote this code (glob.c) roughly 10 years ago and barely touched it since, but from what I can tell, the existing is_literal looks like a premature optimization to avoid calling fnmatch when it's not necessary than an accurate predicate. In other words, it's designed to reliably return 0 if the string is a glob, but there are also cases where it returns 0 when the string is not a "glob pattern" but not purely a literal either (e.g. anything containing backslash when useesc is nonzero, since such strings cannot be processed with strcmp and need fnmatch). My leaning would be to remove this "optimization" from glob.c entirely (it shouldn't be needed if fnmatch is efficient) and move the code to a separate file for glob_pattern_p, but then we need to make sure it gets things right with no false positives or false negatives. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] implement glibc's glob_pattern_p 2014-07-21 14:33 ` Rich Felker @ 2014-07-21 22:44 ` Glauber Costa 0 siblings, 0 replies; 12+ messages in thread From: Glauber Costa @ 2014-07-21 22:44 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 1642 bytes --] On Mon, Jul 21, 2014 at 6:33 PM, Rich Felker <dalias@libc.org> wrote: > On Thu, Jul 17, 2014 at 01:12:55PM +0400, Glauber Costa wrote: > > Currently, glob is implemented but glob_pattern_p is not. This function > > can trivially be implemented with existing glob machinery, since all it > > does is to tell whether or not the string contains special symbols. This > > is basically the same logic as is_literal, but with inverted return > value. > > > > The value of the "quote" parameter should also be inverted, since the man > > page states that when the quote parameter is *non*-zero, then backslashes > > are ignored. is_literal ignores them when useesc is *zero*. > > Upon reviewing this patch, I'm not sure if it's correct. It wrote this > code (glob.c) roughly 10 years ago and barely touched it since, but > from what I can tell, the existing is_literal looks like a premature > optimization to avoid calling fnmatch when it's not necessary than an > accurate predicate. In other words, it's designed to reliably return 0 > if the string is a glob, but there are also cases where it returns 0 > when the string is not a "glob pattern" but not purely a literal > either (e.g. anything containing backslash when useesc is nonzero, > since such strings cannot be processed with strcmp and need fnmatch). > > My leaning would be to remove this "optimization" from glob.c entirely > (it shouldn't be needed if fnmatch is efficient) and move the code to > a separate file for glob_pattern_p, but then we need to make sure it > gets things right with no false positives or false negatives. > > Hi I will come back to this soon, then. Cya [-- Attachment #2: Type: text/html, Size: 2205 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-07-21 23:45 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-07-17 9:12 [PATCH] implement glibc's glob_pattern_p Glauber Costa 2014-07-19 20:11 ` Rich Felker 2014-07-19 23:17 ` Glauber Costa 2014-07-19 23:45 ` Rich Felker 2014-07-20 20:01 ` Thomas Petazzoni 2014-07-21 10:19 ` Glauber Costa 2014-07-20 20:36 ` Szabolcs Nagy 2014-07-21 10:29 ` Glauber Costa 2014-07-21 14:11 ` Rich Felker 2014-07-21 23:45 ` John Spencer 2014-07-21 14:33 ` Rich Felker 2014-07-21 22:44 ` Glauber Costa
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/musl/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).