mailing list of musl libc
 help / color / mirror / code / Atom feed
* [PATCH] In glob(), do not require that the target of a symlink exists.
@ 2019-07-23  0:37 James Y Knight
  2019-07-30 16:04 ` James Y Knight
  2019-08-06  2:03 ` Rich Felker
  0 siblings, 2 replies; 5+ messages in thread
From: James Y Knight @ 2019-07-23  0:37 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 611 bytes --]

Previously, when given a trailing path component with no
metacharacters, glob would call stat to check if the provided filename
existed, which would fail on a broken symlink. When expanding a
pattern however, glob would trust the list of files returned by
readdir, and thus would return broken symlinks.

Now, be consistent and allow broken symlinks in both cases, by using
lstat to determine file existence.

If GLOB_MARK is specified, stat must still be used to determine
whether a given name refers to a directory or file. But if that fails,
a symlink is simply considered to be a file for marking purposes.

[-- Attachment #2: 0001-In-glob-do-not-require-that-the-target-of-a-symlink-.patch --]
[-- Type: application/octet-stream, Size: 2736 bytes --]

From 7ef0fabf173f4d29461b01ccfb05c1f19977ba37 Mon Sep 17 00:00:00 2001
From: James Y Knight <jyknight@google.com>
Date: Thu, 11 Jul 2019 16:55:17 -0400
Subject: [PATCH] In glob(), do not require that the target of a symlink
 exists.

Previously, when given a trailing path component with no
metacharacters, glob would call stat to check if the provided filename
existed, which would fail on a broken symlink. When expanding a
pattern however, glob would trust the list of files returned by
readdir, and thus would return broken symlinks.

Now, be consistent and allow broken symlinks in both cases, by using
lstat to determine file existence.

If GLOB_MARK is specified, stat must still be used to determine
whether a given name refers to a directory or file. But if that fails,
a symlink is simply considered to be a file for marking purposes.
---
 src/regex/glob.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/src/regex/glob.c b/src/regex/glob.c
index aa1c6a44..01b178dc 100644
--- a/src/regex/glob.c
+++ b/src/regex/glob.c
@@ -88,18 +88,33 @@ static int do_glob(char *buf, size_t pos, int type, char *pat, int flags, int (*
 	}
 	buf[pos] = 0;
 	if (!*pat) {
-		/* If we consumed any components above, or if GLOB_MARK is
-		 * requested and we don't yet know if the match is a dir,
-		 * we must call stat to confirm the file exists and/or
-		 * determine its type. */
+		/* If we're in at the end of the pattern, check if the file
+		 * exists. And, if GLOB_MARK is requested, determine if it's a
+		 * directory. */
 		struct stat st;
-		if ((flags & GLOB_MARK) && type==DT_LNK) type = 0;
-		if (!type && stat(buf, &st)) {
-			if (errno!=ENOENT && (errfunc(buf, errno) || (flags & GLOB_ERR)))
-				return GLOB_ABORTED;
-			return 0;
+		if (flags & GLOB_MARK) {
+			/* We need to know if the name refers to a directory
+			 * (after resolving symlinks).
+			 *
+			 * However, broken symlinks should be considered to be a
+			 * file, rather than non-existent or an error, so if
+			 * stat fails, we just don't modify type. (And lstat
+			 * will be called below if required.)
+			 */
+			if ((!type || type==DT_LNK) && stat(buf, &st) == 0) {
+				if(S_ISDIR(st.st_mode)) type = DT_DIR;
+				else type = DT_REG;
+			}
+		}
+		if (!type) {
+			/* If we're don't already know that the file exists,
+			 * confirm its presence. */
+			if (lstat(buf, &st)) {
+				if (errno!=ENOENT && (errfunc(buf, errno) || (flags & GLOB_ERR)))
+					return GLOB_ABORTED;
+				return 0;
+			}
 		}
-		if (!type && S_ISDIR(st.st_mode)) type = DT_DIR;
 		if (append(tail, buf, pos, (flags & GLOB_MARK) && type==DT_DIR))
 			return GLOB_NOSPACE;
 		return 0;
-- 
2.22.0.657.g960e92d24f-goog


[-- Attachment #3: 0001-Add-some-basic-tests-for-glob.patch --]
[-- Type: application/octet-stream, Size: 6876 bytes --]

From 4e828b7230d2d8dfebec87aa9d262ee10b8e45a4 Mon Sep 17 00:00:00 2001
From: James Y Knight <jyknight@google.com>
Date: Mon, 22 Jul 2019 20:25:38 -0400
Subject: [PATCH] Add some basic tests for glob.

---
 src/functional/glob.c                         | 135 ++++++++++++++++++
 src/functional/glob.data/basic/dir/file       |   0
 src/functional/glob.data/basic/file           |   0
 src/functional/glob.data/basic/symlink-dir    |   1 +
 src/functional/glob.data/basic/symlink-file   |   1 +
 .../glob.data/broken-links/broken-symlink     |   1 +
 src/functional/glob.data/weird-chars/foo-[a]  |   0
 src/functional/glob.data/weird-chars/foo-a    |   0
 8 files changed, 138 insertions(+)
 create mode 100644 src/functional/glob.c
 create mode 100644 src/functional/glob.data/basic/dir/file
 create mode 100644 src/functional/glob.data/basic/file
 create mode 120000 src/functional/glob.data/basic/symlink-dir
 create mode 120000 src/functional/glob.data/basic/symlink-file
 create mode 120000 src/functional/glob.data/broken-links/broken-symlink
 create mode 100644 src/functional/glob.data/weird-chars/foo-[a]
 create mode 100644 src/functional/glob.data/weird-chars/foo-a

diff --git a/src/functional/glob.c b/src/functional/glob.c
new file mode 100644
index 0000000..69ffea5
--- /dev/null
+++ b/src/functional/glob.c
@@ -0,0 +1,135 @@
+#include <string.h>
+#include <stdlib.h>
+#include <glob.h>
+#include <stdio.h>
+
+#include "test.h"
+
+glob_t globbuf;
+
+int paths_match(int expected_num, const char **expected) {
+	if (globbuf.gl_pathc + globbuf.gl_offs != expected_num)
+		return 0;
+	for (int i = 0; i < expected_num; ++i) {
+		if ((globbuf.gl_pathv[i] == NULL) != (expected[i] == NULL))
+			return 0;
+		if (expected[i] && strcmp(globbuf.gl_pathv[i], expected[i]) != 0)
+			return 0;
+	}
+	return 1;
+}
+
+void print_mismatch(int expected_num, const char **expected) {
+	t_printf("Expected paths (%d):\n", expected_num);
+	for (int i = 0; i < expected_num; ++i)
+		t_printf("  %s\n", expected[i] ? expected[i] : "NULL");
+	t_printf("Got paths (%d):\n", globbuf.gl_pathc + globbuf.gl_offs);
+	for (int i = 0; i < globbuf.gl_pathc + globbuf.gl_offs; ++i)
+		t_printf("  %s\n", globbuf.gl_pathv ? globbuf.gl_pathv[i] : "NULL");
+}
+
+#define TEST_NOFREE(pat, flags, expected_ret, ...) do {			\
+	int tflags = flags|extraflags;					\
+	int ret = glob(pat, tflags, NULL, &globbuf);			\
+	const char *e[] = __VA_ARGS__;					\
+	if (ret != expected_ret)					\
+		t_error("glob(\"%s\", %d) failed (return %d, expected %d)\n", \
+			pat, tflags, ret, expected_ret);		\
+	if (!paths_match(sizeof(e)/sizeof(*e), e)) {			\
+		t_error("glob(\"%s\", %d) returned unexpected paths.\n", \
+			pat, tflags);					\
+		print_mismatch(sizeof(e)/sizeof(*e), e);		\
+	}								\
+} while(0)
+
+#define TEST(pat, flags, expected_ret, ...) do {		\
+	TEST_NOFREE(pat, flags, expected_ret, __VA_ARGS__);	\
+	globfree(&globbuf);					\
+} while(0)
+
+#define TEST_UNMATCH(pat) do {					\
+	if (extraflags & GLOB_NOCHECK)				\
+		TEST(pat, extraflags, 0, {pat});		\
+	else							\
+		TEST(pat, extraflags, GLOB_NOMATCH, {});	\
+} while(0)
+
+#define MARK(str) ((extraflags & GLOB_MARK) ? str "/" : str)
+
+static void do_tests(int extraflags) {
+	// Check basic functioning of paths
+	TEST("basic/file", 0,
+	     0, {"basic/file"});
+	TEST("basic/symlink-file", 0,
+	     0, {"basic/symlink-file"});
+	TEST("basic/dir", 0,
+	     0, {MARK("basic/dir")});
+	TEST("basic/symlink-dir", 0,
+	     0, {MARK("basic/symlink-dir")});
+	// With trailing slash input, always returns a trailing slash
+	TEST("basic/dir/", 0,
+	     0, {"basic/dir/"});
+
+	// And patterns...
+	TEST("basic/*", 0,
+	     0, {MARK("basic/dir"), "basic/file", MARK("basic/symlink-dir"), "basic/symlink-file"});
+	TEST("basic/d?r", 0,
+	     0, {MARK("basic/dir")});
+	TEST("basic/[df]??", 0,
+	     0, {MARK("basic/dir")});
+	TEST("basic/fi[l]e", 0,
+	     0, {"basic/file"});
+	TEST("basic/*/", 0,
+	     0, {"basic/dir/", "basic/symlink-dir/"});
+	TEST("basic/*/*", 0,
+	     0, {"basic/dir/file", "basic/symlink-dir/file"});
+
+	TEST("weird-chars/foo-[a]", 0,
+	     0, {"weird-chars/foo-a"});
+	TEST("weird-chars/foo-\\[a\\]", 0,
+	     0, {"weird-chars/foo-[a]"});
+
+	// Non-matching patterns
+	TEST_UNMATCH("basic/not-there");
+	TEST_UNMATCH("basic/not-there*");
+	TEST_UNMATCH("basic/not-there/*");
+	// A file can't be specified with a trailing slash
+	TEST_UNMATCH("basic/file/");
+
+	// Check GLOB_APPEND, GLOB_DOOFFS, and both together
+	globbuf.gl_offs = 2; // should be ignored without GLOB_DOOFFS
+	TEST_NOFREE("basic/file", 0,
+		    0, {"basic/file"});
+	TEST("basic/dir", GLOB_APPEND,
+	     0, {"basic/file", MARK("basic/dir")});
+
+	globbuf.gl_offs = 2;
+	TEST_NOFREE("basic/file", GLOB_DOOFFS,
+		    0, {NULL, NULL, "basic/file"});
+	TEST("basic/dir", GLOB_APPEND|GLOB_DOOFFS,
+	     0, {NULL, NULL, "basic/file", MARK("basic/dir")});
+
+	// Check proper support for broken symlinks. Both completely-specified,
+	// and pattern matched.
+	TEST("broken-links/broken-symlink", 0,
+	     0, {"broken-links/broken-symlink"});
+
+	TEST("broken-links/*", 0,
+	     0, {"broken-links/broken-symlink"});
+}
+
+
+int main(int argc, char **argv)
+{
+	char buf[512];
+	if (!t_pathrel(buf, sizeof buf, argv[0], "glob.data")) {
+		t_error("failed to obtain relative path to glob data\n");
+		return 1;
+	}
+	chdir(buf);
+
+	do_tests(0);
+	do_tests(GLOB_MARK);
+	do_tests(GLOB_NOCHECK);
+	return t_status;
+}
diff --git a/src/functional/glob.data/basic/dir/file b/src/functional/glob.data/basic/dir/file
new file mode 100644
index 0000000..e69de29
diff --git a/src/functional/glob.data/basic/file b/src/functional/glob.data/basic/file
new file mode 100644
index 0000000..e69de29
diff --git a/src/functional/glob.data/basic/symlink-dir b/src/functional/glob.data/basic/symlink-dir
new file mode 120000
index 0000000..8724519
--- /dev/null
+++ b/src/functional/glob.data/basic/symlink-dir
@@ -0,0 +1 @@
+dir
\ No newline at end of file
diff --git a/src/functional/glob.data/basic/symlink-file b/src/functional/glob.data/basic/symlink-file
new file mode 120000
index 0000000..1a010b1
--- /dev/null
+++ b/src/functional/glob.data/basic/symlink-file
@@ -0,0 +1 @@
+file
\ No newline at end of file
diff --git a/src/functional/glob.data/broken-links/broken-symlink b/src/functional/glob.data/broken-links/broken-symlink
new file mode 120000
index 0000000..456794e
--- /dev/null
+++ b/src/functional/glob.data/broken-links/broken-symlink
@@ -0,0 +1 @@
+no-such-file
\ No newline at end of file
diff --git a/src/functional/glob.data/weird-chars/foo-[a] b/src/functional/glob.data/weird-chars/foo-[a]
new file mode 100644
index 0000000..e69de29
diff --git a/src/functional/glob.data/weird-chars/foo-a b/src/functional/glob.data/weird-chars/foo-a
new file mode 100644
index 0000000..e69de29
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-08-06  2:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-23  0:37 [PATCH] In glob(), do not require that the target of a symlink exists James Y Knight
2019-07-30 16:04 ` James Y Knight
2019-07-30 16:13   ` Rich Felker
2019-07-31 20:35     ` Stephane Chazelas
2019-08-06  2:03 ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).