From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-3.3 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 13787 invoked from network); 13 Jan 2021 18:01:25 -0000 Received: from mother.openwall.net (195.42.179.200) by inbox.vuxu.org with ESMTPUTF8; 13 Jan 2021 18:01:25 -0000 Received: (qmail 9790 invoked by uid 550); 13 Jan 2021 18:01:19 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 9772 invoked from network); 13 Jan 2021 18:01:18 -0000 Date: Wed, 13 Jan 2021 13:01:06 -0500 From: Rich Felker To: Natanael Copa Cc: musl@lists.openwall.com Message-ID: <20210113180105.GE23432@brightrain.aerifal.cx> References: <20210113132835.22685-1-ncopa@alpinelinux.org> <20210113143809.608ea269@ncopa-desktop.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210113143809.608ea269@ncopa-desktop.lan> User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: [musl] [PATCH] make realpath replace leading // with a single / On Wed, Jan 13, 2021 at 02:38:09PM +0100, Natanael Copa wrote: > On Wed, 13 Jan 2021 14:28:35 +0100 > Natanael Copa wrote: > > > On some systems a leading double slash may have special meaning, so > > POSIX[1] says that "If a pathname begins with two successive > > characters, the first component following the leading characters > > may be interpreted in an implementation-defined manner" > > > > While current musl implementation is technically correct, most other > > systems' (at least GNU libc, freebsd, openbsd, netbsd macOS) > > implementations will replace a leading // with a single /. Make musl > > do the same to avoid surprises. > > > > [1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13 > > --- > > src/misc/realpath.c | 3 --- > > 1 file changed, 3 deletions(-) > > > > diff --git a/src/misc/realpath.c b/src/misc/realpath.c > > index db8b74dc..414b4741 100644 > > --- a/src/misc/realpath.c > > +++ b/src/misc/realpath.c > > @@ -46,9 +46,6 @@ restart: > > q=0; > > output[q++] = '/'; > > p++; > > - /* Initial // is special. */ > > - if (stack[p] == '/' && stack[p+1] != '/') > > - output[q++] = '/'; > > continue; > > } > > > > This fixes gettext's (gnulib) testsuite, which tests if realpath("//", > NULL) return "/" and fails if it doesn't. > > I ran this testcase on multiple systems: > > #include > #include > > int main() { > printf("%s\n", realpath("//", NULL)); > return 0; > } > > > > musl 1.1.24: / > ubuntu 20.03: / > macOS Big Sur: / > OpenBSD 6.8: / > FreeBSD 12.2: / > NetBSD 9.1: / > musl (current) // > > I don't know why this behavior was introduced in musl, but I think > this only adds meaningless friction to downstream users. It wasn't so much introduced new as written to avoid hard-coding a non-portable interpretation of pathnames into binaries. Previously, we relied on the kernel's path normalization as displayed in /proc/*/fd/* symlink contents to do the normalization, and it was able to convert leading // to / on the basis that it (the kernel) does not treat them differently. However POSIX specifies / and // as potentially being different (with /// and beyond all being equivalent to /, though), and if musl's userspace realpath resolution made the normalization of // to /, it would be encoding the present-day-Linux-specific semantic assumption. I did not want to do that. POSIX reserves // presumably to allow things like network shares, abstract volume references, drive letters, etc. not explcitly mounted into the filesystem but referenced via a user-provided pathname. At one point midipix was planning to use this for Windows drive letters, but I think they hit too much friction with applications doing their own pathname normalization that broke it (converting the // to /) and dropped it. Part of my motive in preserving compatibility with kernel implementations that use // specially was aimed at midipix (I wasn't yet aware they'd dropped it) but I was also thinking about the ability to run musl binaries on non-Linux kernels where pathnames should be interpreted according to the host filesystem policy, not a hard-coded assumption that Linux and other currently popular systems don't use // for anything. It would perhaps be nice to detect at realpath time when //foo and /foo are the same inode and drop the extra / in that case, but that requires pulling in stat, which is large nowadays thanks to the need for translation of kernel structures. It might be worth doing anyway if this ends up being a problem, but from my understanding on IRC earlier the breakage you hit was just a testsuite asserting something invalid, and nothing was actually broken at build or run time. Rich