mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: [musl] realpath without procfs
Date: Tue, 8 Sep 2020 13:19:04 -0400	[thread overview]
Message-ID: <20200908171901.GV3265@brightrain.aerifal.cx> (raw)

[-- Attachment #1: Type: text/plain, Size: 2185 bytes --]

Since it was raised yet again on #musl, I took some time to research
justification for, and ease of implementing, realpath without procfs.

It turns out that, while musl documents needing procfs and allows that
arbitrary amounts of stuff may be broken if it's not available,
realpath is the main function where the core functionality disappears
without procfs. Other things are mostly lesser:

- handling of O_SEARCH/O_EXEC fds in some of the f* functions where
  kernel gives EBADF for O_PATH fds (nothing uses these anyway).

- implementing fchmodat with AT_SYMLINK_NOFOLLOW (important but will
  eventually have a non-hack way to do it via a new syscall).

- ttyname (important to things that use it)

- pthread_setname_np (nonstandard and non-load-bearing)

- dynamic linker identifying executable pathname

This is actually a lot less than I expected, and makes it reasonable
to envision a path to eventually not needing procfs at all.

So, I did the work to figure out what would be needed to write a
procfs-free realpath, and it turns out that actually writing it was
not any harder, so I did. Attached is a draft. It needs testing, and
I'm undecided whether we should keep the old code and just use this as
a fallback, or just replace it. (The old code has fixed 5-syscall
overhead and ugly side effects on kernels too old to have O_PATH; new
code needs one syscall per path component and might (?) have worse or
different behavior under concurrent changes to the dir tree.)

Some notes:

- Attempts to support all pathnames where no intermediate exceeds
  PATH_MAX.

- Initial // is treated as special, but //. and //.. resolve to /

- getcwd is expanded initially if pathname is relative. This might be
  a bad choice since it causes failure whenever pwd is not
  representable even if the symlink reached via a relative pathname
  would take us to an absolute path that is representable. We could
  accumulate a relative path, including preserving .. components,
  until the first absolute-target symlink, and only apply it by
  prepending (and cancelling ..) at the end if no absolute-target
  symlink was encountered, but that requires some rework to do.

Thoughts?

Rich

[-- Attachment #2: realpath2.c --]
[-- Type: text/plain, Size: 1562 bytes --]

#define _GNU_SOURCE
#include <stdlib.h>
#include <limits.h>
#include <errno.h>
#include <unistd.h>
#include <string.h>

char *realpath(const char *restrict filename, char *restrict resolved)
{
	char output[PATH_MAX], stack[PATH_MAX];
	size_t p, q, l, cnt=0;

	l = strlen(filename);
	if (l > sizeof stack) goto toolong;
	p = sizeof stack - l - 1;
	memcpy(stack+p, filename, l+1);

	if (stack[p] != '/') {
		if (getcwd(output, sizeof output) < 0) return 0;
		q = strlen(output);
	} else {
		q = 0;
	}

	while (stack[p]) {
		if (stack[p] == '/') {
			q=0;
			p++;
			/* Initial // is special. */
			if (stack[p] == '/' && stack[p+1] != '/') {
				output[q++] = '/';
			}
			while (stack[p] == '/') p++;
		}
		char *z = __strchrnul(stack+p, '/');
		l = z-(stack+p);
		if (l<=2 && stack[p]=='.' && stack[p+l-1]=='.') {
			if (l==2) {
				while(q>1 && output[q-1]!='/') q--;
				if (q>1) q--;
			}
			p += l;
			while (stack[p] == '/') p++;
			continue;
		}
		if (l==1 && stack[p]=='.')
		if (l+2 > sizeof output - q) goto toolong;
		output[q] = '/';
		memcpy(output+q+1, stack+p, l);
		output[q+1+l] = 0;
		p += l;
		ssize_t k = readlink(output, stack, p);
		if (k==-1) {
			if (errno == EINVAL) {
				q += 1+l;
				while (stack[p] == '/') p++;
				continue;
			}
			return 0;
		}
		if (k==p) goto toolong;
		if (++cnt == SYMLOOP_MAX) {
			errno = ELOOP;
			return 0;
		}
		p -= k;
		memmove(stack+p, stack, k);
	}
	if (!q) output[q++] = '/';
	output[q] = 0;
	return resolved ? strcpy(resolved, output) : strdup(output);

toolong:
	errno = ENAMETOOLONG;
	return 0;
}

             reply	other threads:[~2020-09-08 17:19 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-08 17:19 Rich Felker [this message]
2020-09-08 19:27 ` Markus Wichmann
2020-09-08 20:17   ` Rich Felker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200908171901.GV3265@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).