mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] is fnmatch() a bit broken?
@ 2023-01-08 13:45 (GalaxyMaster)
  2023-01-09  7:32 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: (GalaxyMaster) @ 2023-01-08 13:45 UTC (permalink / raw)
  To: musl

Hello,

I haven't analysed the fnmatch.c in musl yet, but I wrote a quick test to
demonstrate the issue:
===
#include <fnmatch.h>
#include <stdio.h>

#define TEST_FNMATCH(pattern, haystack, expect) \
        ({ \
                printf("fnmatch(\"%s\", \"%s\", 0) = %d (expected: %d)\n", \
                                pattern, haystack, \
                                fnmatch(pattern, haystack, 0), expect); \
        })

int main()
{
        TEST_FNMATCH("abc", "abc", 0);
        TEST_FNMATCH("[1\\]] [1\\]]", "1 ]", 0);
        TEST_FNMATCH("[a-c-f] [a-c-f] [a-c-f] [a-c-f] [a-c-f]", "a b c - f", 0);
        TEST_FNMATCH("[a-c-f]", "e", 1);
        TEST_FNMATCH("[a\\-z]", "b", 1);
        return 0;
}
===

Below is a session on a box with musl 1.2.3:
===
galaxy@apollo:~/musl-fnmatch $ gcc -o musl-fnmatch musl-fnmatch.c
galaxy@apollo:~/musl-fnmatch $ ./musl-fnmatch
fnmatch("abc", "abc", 0) = 0 (expected: 0)
fnmatch("[1\]] [1\]]", "1 ]", 0) = 1 (expected: 0)
fnmatch("[a-c-f] [a-c-f] [a-c-f] [a-c-f] [a-c-f]", "a b c - f", 0) = 1 (expected: 0)
fnmatch("[a-c-f]", "e", 0) = 0 (expected: 1)
fnmatch("[a\-z]", "b", 0) = 0 (expected: 1)
galaxy@apollo:~/musl-fnmatch $ ldd ./musl-fnmatch
	/lib/ld-musl-x86_64.so.1 (0x7f25af652000)
	libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f25af652000)
galaxy@apollo:~/musl-fnmatch $ rpm -q musl
musl-1.2.3-owl0.x86_64
galaxy@apollo:~/musl-fnmatch $
===

As you may see from above the patterns are not that crazy (they were actually
taken from the systemd test case, since I am trying to build the latest systemd
with musl correctly, but this is offtopic).  The most concerning are the last
three patterns, since obviously musl's fnmatch() seems to be greedy and also
does not respect escape character in some cases, e.g. in "[a\-z]" it is common
sense to treat it as 'a', '-', or 'z'.

A test on a glibc system has no issues:
===
[galaxy@apollo musl-fnmatch]$ gcc -o glibc-fnmatch musl-fnmatch.c
[galaxy@apollo musl-fnmatch]$ ./glibc-fnmatch
fnmatch("abc", "abc", 0) = 0 (expected: 0)
fnmatch("[1\]] [1\]]", "1 ]", 0) = 0 (expected: 0)
fnmatch("[a-c-f] [a-c-f] [a-c-f] [a-c-f] [a-c-f]", "a b c - f", 0) = 0 (expected: 0)
fnmatch("[a-c-f]", "e", 0) = 1 (expected: 1)
fnmatch("[a\-z]", "b", 0) = 1 (expected: 1)
[galaxy@apollo musl-fnmatch]$ ldd glibc-fnmatch
	linux-vdso.so.1 (0x00007ffc301c4000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f44bf93a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f44bfb5c000)
[galaxy@apollo musl-fnmatch]$ rpm -q glibc
glibc-2.35.0.6.491f2e-alt2.x86_64
[galaxy@apollo musl-fnmatch]$
===

Am I missing something or os this behaviour is not expected?

-- 
(GM)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [musl] is fnmatch() a bit broken?
  2023-01-08 13:45 [musl] is fnmatch() a bit broken? (GalaxyMaster)
@ 2023-01-09  7:32 ` Rich Felker
  2023-01-09  7:57   ` (GalaxyMaster)
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2023-01-09  7:32 UTC (permalink / raw)
  To: (GalaxyMaster); +Cc: musl

On Sun, Jan 08, 2023 at 01:45:09PM +0000, (GalaxyMaster) wrote:
> Hello,
> 
> I haven't analysed the fnmatch.c in musl yet, but I wrote a quick test to
> demonstrate the issue:
> ===
> #include <fnmatch.h>
> #include <stdio.h>
> 
> #define TEST_FNMATCH(pattern, haystack, expect) \
>         ({ \
>                 printf("fnmatch(\"%s\", \"%s\", 0) = %d (expected: %d)\n", \
>                                 pattern, haystack, \
>                                 fnmatch(pattern, haystack, 0), expect); \
>         })
> 
> int main()
> {
>         TEST_FNMATCH("abc", "abc", 0);
>         TEST_FNMATCH("[1\\]] [1\\]]", "1 ]", 0);
>         TEST_FNMATCH("[a-c-f] [a-c-f] [a-c-f] [a-c-f] [a-c-f]", "a b c - f", 0);
>         TEST_FNMATCH("[a-c-f]", "e", 1);
>         TEST_FNMATCH("[a\\-z]", "b", 1);
>         return 0;
> }
> ===
> 
> Below is a session on a box with musl 1.2.3:
> ===
> galaxy@apollo:~/musl-fnmatch $ gcc -o musl-fnmatch musl-fnmatch.c
> galaxy@apollo:~/musl-fnmatch $ ./musl-fnmatch
> fnmatch("abc", "abc", 0) = 0 (expected: 0)
> fnmatch("[1\]] [1\]]", "1 ]", 0) = 1 (expected: 0)
> fnmatch("[a-c-f] [a-c-f] [a-c-f] [a-c-f] [a-c-f]", "a b c - f", 0) = 1 (expected: 0)
> fnmatch("[a-c-f]", "e", 0) = 0 (expected: 1)
> fnmatch("[a\-z]", "b", 0) = 0 (expected: 1)
> galaxy@apollo:~/musl-fnmatch $ ldd ./musl-fnmatch
> 	/lib/ld-musl-x86_64.so.1 (0x7f25af652000)
> 	libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f25af652000)
> galaxy@apollo:~/musl-fnmatch $ rpm -q musl
> musl-1.2.3-owl0.x86_64
> galaxy@apollo:~/musl-fnmatch $
> ===
> 
> As you may see from above the patterns are not that crazy (they were actually
> taken from the systemd test case, since I am trying to build the latest systemd
> with musl correctly, but this is offtopic).  The most concerning are the last
> three patterns, since obviously musl's fnmatch() seems to be greedy and also
> does not respect escape character in some cases, e.g. in "[a\-z]" it is common
> sense to treat it as 'a', '-', or 'z'.

This difference is intentional because glibc's behavior is contrary to
the spec. Glob/fnmatch brackets are equivalent to (specified as) regex
brackets where the only way to include a literal - is at the
beginning/end. A '\' can escape the '[' and make it non-special (not
open a bracket) but the '-' inside the bracket is not "special" to
begin with -- it's just part of the bracket syntax. Likewise with the
closing ']' case.

Regarding the [a-c-f] case, POSIX specifies:

"The interpretation of range expressions where the ending range point
is also the starting range point of a subsequent range expression (for
example, "[a-m-o]" ) is undefined."

so this test is explicitly invalid.

Rich

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [musl] is fnmatch() a bit broken?
  2023-01-09  7:32 ` Rich Felker
@ 2023-01-09  7:57   ` (GalaxyMaster)
  2023-01-09  8:09     ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: (GalaxyMaster) @ 2023-01-09  7:57 UTC (permalink / raw)
  To: musl

Rich,

On Mon, Jan 09, 2023 at 02:32:13AM -0500, Rich Felker wrote:
> > galaxy@apollo:~/musl-fnmatch $ ./musl-fnmatch
> > fnmatch("abc", "abc", 0) = 0 (expected: 0)
> > fnmatch("[1\]] [1\]]", "1 ]", 0) = 1 (expected: 0)
> 
> This difference is intentional because glibc's behavior is contrary to
> the spec.

Thanks for the explanation, today I learnt something :).  However, you said:

> A '\' can escape the '[' and make it non-special (not
> open a bracket) but the '-' inside the bracket is not "special" to
> begin with -- it's just part of the bracket syntax. Likewise with the
> closing ']' case.

Which brings a question on the "[1\]] [1\]]" use case not matching "1 ]".  If I
read your response correctly, it is expected to actually match on musl, did I
get it wrong?

-- 
(GM)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [musl] is fnmatch() a bit broken?
  2023-01-09  7:57   ` (GalaxyMaster)
@ 2023-01-09  8:09     ` Rich Felker
  2023-01-09  8:24       ` (GalaxyMaster)
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2023-01-09  8:09 UTC (permalink / raw)
  To: (GalaxyMaster); +Cc: musl

On Mon, Jan 09, 2023 at 07:57:56AM +0000, (GalaxyMaster) wrote:
> Rich,
> 
> On Mon, Jan 09, 2023 at 02:32:13AM -0500, Rich Felker wrote:
> > > galaxy@apollo:~/musl-fnmatch $ ./musl-fnmatch
> > > fnmatch("abc", "abc", 0) = 0 (expected: 0)
> > > fnmatch("[1\]] [1\]]", "1 ]", 0) = 1 (expected: 0)
> > 
> > This difference is intentional because glibc's behavior is contrary to
> > the spec.
> 
> Thanks for the explanation, today I learnt something :).  However, you said:
> 
> > A '\' can escape the '[' and make it non-special (not
> > open a bracket) but the '-' inside the bracket is not "special" to
> > begin with -- it's just part of the bracket syntax. Likewise with the
> > closing ']' case.
> 
> Which brings a question on the "[1\]] [1\]]" use case not matching "1 ]".  If I
> read your response correctly, it is expected to actually match on musl, did I
> get it wrong?

It's been a while since I looked at these, but the key thing is that ]
is not a special character. Only *, ?, and [ are special. ] is just
part of the bracket syntax once the bracket is open, and IIRC \] is
identical to ], closing the bracket. The subsequent ] is then outside
the bracket and matches itself (mismatch here).

Rich

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [musl] is fnmatch() a bit broken?
  2023-01-09  8:09     ` Rich Felker
@ 2023-01-09  8:24       ` (GalaxyMaster)
  0 siblings, 0 replies; 5+ messages in thread
From: (GalaxyMaster) @ 2023-01-09  8:24 UTC (permalink / raw)
  To: musl

Rich,

On Mon, Jan 09, 2023 at 03:09:20AM -0500, Rich Felker wrote:
> On Mon, Jan 09, 2023 at 07:57:56AM +0000, (GalaxyMaster) wrote:
> > Which brings a question on the "[1\]] [1\]]" use case not matching "1 ]".  If I
> > read your response correctly, it is expected to actually match on musl, did I
> > get it wrong?
> 
> It's been a while since I looked at these, but the key thing is that ]
> is not a special character. Only *, ?, and [ are special. ] is just
> part of the bracket syntax once the bracket is open, and IIRC \] is
> identical to ], closing the bracket. The subsequent ] is then outside
> the bracket and matches itself (mismatch here).

Makes a lot of sense. As always, your response was precise and very valuable.
Thank you for your time and patience! :)

-- 
(GM)

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-01-09  8:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-08 13:45 [musl] is fnmatch() a bit broken? (GalaxyMaster)
2023-01-09  7:32 ` Rich Felker
2023-01-09  7:57   ` (GalaxyMaster)
2023-01-09  8:09     ` Rich Felker
2023-01-09  8:24       ` (GalaxyMaster)

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).