Date: Mon, 23 Nov 2020 23:26:46 -0500
From: Rich Felker
To: Alexey Izbyshev
Cc: musl@lists.openwall.com
Message-ID: <20201124042646.GA534@brightrain.aerifal.cx>
In-Reply-To: <48faf5ab9a1f3c869c85897217db0d75@ispras.ru>
Subject: Re: [musl] realpath without procfs -- should be ready for inclusion

On Tue, Nov 24, 2020 at 06:39:59AM +0300, Alexey Izbyshev wrote:
> On 2020-11-23 23:53, Rich Felker wrote:
> >On Mon, Nov 23, 2020 at 01:56:33PM -0500, Rich Felker wrote:
> >>On Sun, Nov 22, 2020 at 10:19:33PM -0500, Rich Felker wrote:
> >>--- realpath8.c 2020-11-22 17:52:17.586481571 -0500
> >>+++ realpath9.c 2020-11-23 13:55:06.808458893 -0500
> >>@@ -19,7 +19,7 @@
> >>     char *output = resolved ? resolved : buf;
> >>     size_t p, q, l, cnt=0;
> >>
> >>-    l = strnlen(filename, sizeof stack + 1);
> >>+    l = strnlen(filename, sizeof stack);
> >>     if (!l) {
> >>         errno = ENOENT;
> >>         return 0;
> >>@@ -80,11 +80,16 @@
> >>                 return 0;
> >>             }
> >>             if (k==p) goto toolong;
> >>+            if (!k) {
> >>+                errno = ENOENT;
> >>+                return 0;
> >>+            }
> >>             if (++cnt == SYMLOOP_MAX) {
> >>                 errno = ELOOP;
> >>                 return 0;
> >>             }
> >>             p -= k;
> >>+            if (stack[k-1]=='/') p++;
> >>             memmove(stack+p, stack, k);
> >
> >This is wrong and needs further consideration.
> >
> Yes, now memmove() overwrites NUL if p was at the end and stack[k-1]
> == '/'. Is it true per POSIX that "rr/home" must resolve to "//home"
> if "rr" -> "//"?

I don't think // is even required to be distinct from /, just
permitted, but I think allowing it in userspace and handling it
consistently is the right behavior in case you ever run on a kernel
that does make use of the distinction.

> If so, maybe something like the following instead:
>
> +    while (stack[p] == '/') p++;
> +    if (stack[p] && stack[k-1] != '/') p--;
>      p -= k;
> -    if (stack[k-1]=='/') p++;

Rather just:

    /* If link contents end in /, strip any slashes already on
     * stack to avoid /->// or //->/// or spurious toolong. */
    if (stack[k-1]=='/') while (stack[p]=='/') p++;

should work (before the p-=k;)

> >>         }
> >>
> >>@@ -95,7 +100,8 @@
> >>     l = strlen(stack);
> >>     /* Cancel any initial .. components. */
> >>     p = 0;
> >>-    while (q-p>=2 && at_dotdot(output+p+2, p+2)) {
> >>+    while (output[p]=='.' && output[p+1]=='.'
> >>+        && (!output[p+2] || output[p+2]=='/')) {
> >>         while(l>1 && stack[l-1]!='/') l--;
> >>         if (l>1) l--;
> >>         p += 2;
> >
> >OK, I have a better improvement for this: counting the number of
> >levels of .. as they're built at the head of output. Then it's just
> >while (nup--) here, and the condition for canceling .. in the first
> >loop no longer needs any string inspection; it's just (q>3*nup).
> >
> Sounds good.
>
> I've missed the last time that the immediately following code is
> also broken:
>
> >    if (q-p && stack[l-1]!='/') output[--p] = '/';
>
> It will underflow the output in case of a simple relative path that
> doesn't start with "..".

Thanks. This logic just looks wrong; I'll rework it.

> I've also noticed other issues to be fixed, per POSIX:
>
> * ENOENT should be returned if filename is NULL

Rather it looks like it's:

    [EINVAL] The file_name argument is a null pointer.

ENOENT is only for empty string or ENOENT somewhere in the path
traversal process.

> * ENOTDIR should be returned if the last component is not a
> directory and the path has one or more trailing slashes

Yes, that's precisely what I've been working on the past couple
hours. I think you missed it, but .. will also erase a path component
that's not a dir (e.g. /dev/null/.. -> /dev) and these are both
instances of a common problem. I thought use of readlink covered all
the ENOTDIR cases but it doesn't when the next component isn't covered
by readlink or isn't present at all.

It's trivial to fix with a check after each component but that doubles
the number of syscalls and mostly isn't necessary. I have a reworked
draft to fix the problem by advancing over /(/|./|.$)* rather than
just /+ after each component, so that we can look ahead and do an
extra readlink in the cases that need it.

Rich
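
For illustration only, here is a rough sketch of the lookahead described
in the last paragraph above. It is not the reworked draft itself. The
helper names skip_slash_dot and needs_dir_check are hypothetical, and the
rule for when the extra check is needed (a slash was skipped and what
follows is either the end of the string or a ".." component) is an
assumption drawn from the two cases discussed in this message.

```c
#include <stddef.h>

/* Hypothetical helper, not from the actual draft: starting at the
 * separator s[i], skip a run matching /(/|./|.$)* and return the index
 * of the next real component, or of the terminating NUL if none. */
static size_t skip_slash_dot(const char *s, size_t i)
{
	if (s[i] != '/') return i;      /* no separator here, nothing to skip */
	i++;
	for (;;) {
		if (s[i] == '/') i++;                           /* extra slash */
		else if (s[i] == '.' && s[i+1] == '/') i += 2;  /* "./" */
		else if (s[i] == '.' && !s[i+1]) i++;           /* final "." */
		else break;
	}
	return i;
}

/* Assumed rule: the component ending at 'start' must be verified to be a
 * directory when a separator was skipped and the next thing is either
 * nothing at all (trailing "/" or "/.") or a ".." component, since
 * neither case passes through a readlink of a following name. */
static int needs_dir_check(const char *s, size_t start, size_t next)
{
	if (next == start) return 0;
	if (!s[next]) return 1;
	return s[next] == '.' && s[next+1] == '.'
		&& (!s[next+2] || s[next+2] == '/');
}
```

With these, "/dev/null/./.." scanned from the slash at index 9 lands on
the ".." at index 12 and is flagged for the extra check, while "a//./b"
scanned from index 1 lands on "b" and is not, so the common case still
costs one readlink per component.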