Date: Tue, 8 Sep 2020 21:27:46 +0200
From: Markus Wichmann
To: musl@lists.openwall.com
Message-ID: <20200908192746.GA7854@voyager>
References: <20200908171901.GV3265@brightrain.aerifal.cx>
In-Reply-To: <20200908171901.GV3265@brightrain.aerifal.cx>
Subject: Re: [musl] realpath without procfs

On Tue, Sep 08, 2020 at 01:19:04PM -0400, Rich Felker wrote:
> Since it was raised yet again on #musl, I took some time to research
> justification for, and ease of implementing, realpath without procfs.
>

I do remember dietlibc's implementation of realpath(). But that has
serious side effects that make it not thread-safe. The basic idea they
had was to use chdir() and getcwd() to get the kernel to normalize the
paths without having to read it from procfs. Not needing procfs was one
of the design goals of that project, so that is why they implemented it
that way. Unfortunately, in some cases chdir() is irreversible (e.g.
deleted working directory), and also, there is only one working
directory per process, so while this is going on, all other threads will
have trouble finding their files. Adding locking to prevent the other
threads from noticing this would be challenging, to say the least, if
not outright impossible. There are just so many places where the working
directory plays a role.

Oh, and one more side effect: While the working directory is switched
elsewhere, another process may unmount the volume containing the
original directory. You could open "." first to prevent this, but that
adds another two syscalls of overhead.

> - ttyname (important to things that use it)
>

I don't see much of an alternative to using procfs for that one. You
could probably search for the device and inode of the fd among /dev/tty*
and /dev/pts/*, but that seems like a hack. That should probably be at
most a fallback, if the normal way through /proc doesn't work.

> - dynamic linker identifying executable pathname
>

Well, Linux could just pass AT_EXECFN. But if it doesn't, unless they
want to add Solaris' getexecname() syscall, /proc/self/exe is the only
link to the executable file name.

> This is actually a lot less than I expected, and makes it reasonable
> to envision a path to eventually not needing procfs at all.
>
> So, I did the work to figure out what would be needed to write a
> procfs-free realpath, and it turns out that actually writing it was
> not any harder, so I did. Attached is a draft. It needs testing, and
> I'm undecided whether we should keep the old code and just use this as
> a fallback, or just replace it. (The old code has fixed 5-syscall
> overhead and ugly side effects on kernels too old to have O_PATH; new
> code needs one syscall per path component and might (?) have worse or
> different behavior under concurrent changes to the dir tree.)
>
> Some notes:
>
> - Attempts to support all pathnames where no intermediate exceeds
>   PATH_MAX.
>
> - Initial // is treated as special, but //. and //.. resolve to /
>
> - getcwd is expanded initially if pathname is relative. This might be
>   a bad choice since it causes failure whenever pwd is not
>   representable even if the symlink reached via a relative pathname
>   would take us to an absolute path that is representable.

I just checked, and glibc does the same thing. So at least you are in
good company with being unable to handle unreachable working directories
in realpath().

> We could
>   accumulate a relative path, including preserving .. components,
>   until the first absolute-target symlink, and only apply it by
>   prepending (and cancelling ..) at the end if no absolute-target
>   symlink was encountered, but that requires some rework to do.
>
> Thoughts?
>
> Rich

> #define _GNU_SOURCE
> #include
> #include
> #include
> #include
> #include
>
> char *realpath(const char *restrict filename, char *restrict resolved)
> {
> 	char output[PATH_MAX], stack[PATH_MAX];
> 	size_t p, q, l, cnt=0;
>
> 	l = strlen(filename);
> 	if (l > sizeof stack) goto toolong;

Shouldn't that be strnlen(), then?

> 	p = sizeof stack - l - 1;
> 	memcpy(stack+p, filename, l+1);
>
> 	if (stack[p] != '/') {
> 		if (getcwd(output, sizeof output) < 0) return 0;
> 		q = strlen(output);
> 	} else {
> 		q = 0;
> 	}
>
> 	while (stack[p]) {
> 		if (stack[p] == '/') {
> 			q=0;
> 			p++;
> 			/* Initial // is special. */
> 			if (stack[p] == '/' && stack[p+1] != '/') {

You already incremented p here. Did you want to test for "///"? The
comment indicated otherwise.

> 				output[q++] = '/';
> 			}
> 			while (stack[p] == '/') p++;
> 		}
> 		char *z = __strchrnul(stack+p, '/');
> 		l = z-(stack+p);
> 		if (l<=2 && stack[p]=='.' && stack[p+l-1]=='.') {
> 			if (l==2) {
> 				while(q>1 && output[q-1]!='/') q--;
> 				if (q>1) q--;
> 			}
> 			p += l;
> 			while (stack[p] == '/') p++;
> 			continue;
> 		}
> 		if (l==1 && stack[p]=='.')
> 		if (l+2 > sizeof output - q) goto toolong;

I believe you forgot to finish the first "if" line here. Also, you have
already handled the "." path at this point.

> 		output[q] = '/';
> 		memcpy(output+q+1, stack+p, l);
> 		output[q+1+l] = 0;
> 		p += l;
> 		ssize_t k = readlink(output, stack, p);
> 		if (k==-1) {
> 			if (errno == EINVAL) {
> 				q += 1+l;
> 				while (stack[p] == '/') p++;
> 				continue;
> 			}
> 			return 0;
> 		}
> 		if (k==p) goto toolong;
> 		if (++cnt == SYMLOOP_MAX) {
> 			errno = ELOOP;
> 			return 0;
> 		}
> 		p -= k;
> 		memmove(stack+p, stack, k);
> 	}
> 	if (!q) output[q++] = '/';
> 	output[q] = 0;
> 	return resolved ? strcpy(resolved, output) : strdup(output);
>
> toolong:
> 	errno = ENAMETOOLONG;
> 	return 0;
> }

Ciao,
Markus
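
For illustration, here is a minimal sketch of the chdir()/getcwd()
technique discussed at the top of this message, with the original
working directory held open via "." so that it can be restored with
fchdir(). This is not dietlibc's actual code: the function name
resolve_dir_via_chdir is hypothetical, the sketch only handles paths
that name a directory, and it assumes the current directory is readable.
It is also not thread-safe, for exactly the reason given above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not dietlibc's code: canonicalize a path that
 * names a directory by entering it and asking the kernel for the cwd.
 * Not thread-safe: the working directory is process-wide. */
char *resolve_dir_via_chdir(const char *dir, char *resolved, size_t size)
{
	/* Hold the original directory open so we can get back to it by
	 * descriptor instead of by name. */
	int saved = open(".", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
	if (saved < 0) return 0;

	char *ret = 0;
	int err = 0;
	if (chdir(dir) == 0) {
		/* The kernel reports the normalized absolute path. */
		ret = getcwd(resolved, size);
		if (!ret) err = errno;
		/* Restore the previous working directory. */
		if (fchdir(saved) != 0) {
			err = errno;
			ret = 0;
		}
	} else {
		err = errno;
	}
	close(saved);
	if (!ret) errno = err;
	return ret;
}
```

Compared with a plain chdir() there and back, the open() and close()
are the two extra syscalls mentioned above; between chdir() and
fchdir(), every other thread still sees the wrong working directory.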