From: Rich Felker <dalias@libc.org>
To: Alyssa Ross <hi@alyssa.is>
Cc: musl@lists.openwall.com
Subject: Re: [musl] [PATCH musl v2 3/3] mntent: fix parsing lines with optional fields
Date: Thu, 12 May 2022 10:08:37 -0400 [thread overview]
Message-ID: <20220512140835.GJ7074@brightrain.aerifal.cx> (raw)
In-Reply-To: <875yqn1n8g.fsf@alyssa.is>
On Thu, Jan 13, 2022 at 06:53:19PM +0000, Alyssa Ross wrote:
> Rich Felker <dalias@libc.org> writes:
>
> > On Thu, Jan 13, 2022 at 04:30:18PM +0000, Alyssa Ross wrote:
> >> Hi Rich, thanks for following up on this.
> >>
> >> Rich Felker <dalias@libc.org> writes:
> >>
> >> > On Mon, Sep 20, 2021 at 12:21:41AM -0400, Rich Felker wrote:
> >> >> On Wed, Sep 15, 2021 at 10:11:55PM +0000, Alyssa Ross wrote:
> >> >> > According to fstab(5), the last two fields are optional, but this
> >> >> > wasn't accepted by Musl. After this change, only the first field is
> >> >> > required, which matches Glibc's behaviour.
> >> >> >
> >> >> > Using sscanf as before, it would have been impossible to differentiate
> >> >> > between 0 fields and 4 fields, because sscanf would have returned 0 in
> >> >> > both cases due to the use of assignment suppression and %n for the
> >> >> > string fields (which is important to avoid copying any strings). So
> >> >> > instead, before calling sscanf, initialize every string to the empty
> >> >> > string, and then we can check which strings are empty afterwards to
> >> >> > know how many fields were matched.
> >> >> > ---
> >> >> >
> >> >> > We could also be stricter about it, and enforce that the first four
> >> >> > fields are present, since the man page says only the last two are
> >> >> > optional. Doing that would be a simple change of checking for the
> >> >> > presence of mnt_opts instead of mnt_fsname at the end of my patch.
> >> >> >
> >> >> > v2: don't change n from int to size_t
> >> >> >
> >> >> > src/misc/mntent.c | 18 +++++++++++++-----
> >> >> > 1 file changed, 13 insertions(+), 5 deletions(-)
> >> >> >
> >> >> > diff --git a/src/misc/mntent.c b/src/misc/mntent.c
> >> >> > index eabb8200..238a0efd 100644
> >> >> > --- a/src/misc/mntent.c
> >> >> > +++ b/src/misc/mntent.c
> >> >> > @@ -21,7 +21,8 @@ int endmntent(FILE *f)
> >> >> >
> >> >> > struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int buflen)
> >> >> > {
> >> >> > - int cnt, n[8], use_internal = (linebuf == SENTINEL);
> >> >> > + int n[8], use_internal = (linebuf == SENTINEL);
> >> >> > + size_t len, i;
> >> >> >
> >> >> > mnt->mnt_freq = 0;
> >> >> > mnt->mnt_passno = 0;
> >> >> > @@ -39,10 +40,14 @@ struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int bufle
> >> >> > errno = ERANGE;
> >> >> > return 0;
> >> >> > }
> >> >> > - cnt = sscanf(linebuf, " %n%*s%n %n%*s%n %n%*s%n %n%*s%n %d %d",
> >> >> > - n, n+1, n+2, n+3, n+4, n+5, n+6, n+7,
> >> >> > - &mnt->mnt_freq, &mnt->mnt_passno);
> >> >> > - } while (cnt < 2 || linebuf[n[0]] == '#');
> >> >> > +
> >> >> > + len = strlen(linebuf);
> >> >> > + for (i = 0; i < sizeof n / sizeof *n; i++) n[i] = len;
> >> >> > + if (sscanf(linebuf, " %n%*s%n %n%*s%n %n%*s%n %n%*s%n %d %d",
> >> >> > + n, n+1, n+2, n+3, n+4, n+5, n+6, n+7,
> >> >> > + &mnt->mnt_freq, &mnt->mnt_passno) == EOF && ferror(f))
> >> >> > + return 0;
> >> >> > + } while (linebuf[n[0]] == '#');
> >> >> >
> >> >> > linebuf[n[1]] = 0;
> >> >> > linebuf[n[3]] = 0;
> >> >> > @@ -54,6 +60,9 @@ struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int bufle
> >> >> > mnt->mnt_type = linebuf+n[4];
> >> >> > mnt->mnt_opts = linebuf+n[6];
> >> >> >
> >> >> > + if (!*mnt->mnt_fsname)
> >> >> > + return 0;
> >> >> > +
> >> >> > return mnt;
> >> >> > }
> >> >>
> >> >> It looks like your patch changes the behavior for malformed lines from
> >> >> skipping them (and continuing to search for the next valid line) to
> >> >> returning 0. Is that intentional? Maybe it's better; I'm not sure. But
> >> >> won't it even cause blank lines to return 0?
> >>
> >> Because I only check for the first field being present, a lot of
> >> nonsensical lines will be accepted.
> >>
> >> As I said in the patch commentary:
> >>
> >> > After this change, only the first field is
> >> > required, which matches Glibc's behaviour.
> >> >
> >> > [snip]
> >> >
> >> > We could also be stricter about it, and enforce that the first four
> >> > fields are present, since the man page says only the last two are
> >> > optional. Doing that would be a simple change of checking for the
> >> > presence of mnt_opts instead of mnt_fsname at the end of my patch.
> >>
> >> It probably would make more sense to check that the four fields the man
> >> pages implies are required are all there, by making the change I
> >> suggested in the commentary, at least until somebody complains about
> >> their two-field fstab being accepted by Glibc and not Musl.
> >>
> >> > Indeed it also seems to be skipping empty lines, contrary to what you
> >> > said in another message:
> >> >
> >> >> • Empty lines should be skipped.
> >>
> >> Yes, it looks like I was mistaken before when I thought that Musl didn't
> >> properly handle comments and empty lines. Looking back, it seems that
> >> the tests I was running were against mntent files with only four fields,
> >> so the parsing failures I was seeing were because of that, not because
> >> of an issue in the comment or empty line handling.
> >>
> >> My patch does (inadventently) change the behaviour of empty line
> >> handling. We should leave the current behaviour of skipping over empty
> >> lines as is. Suggested fix at the end of this message.
> >>
> >> > Do you have a preference on how to proceed? We could add back a
> >> > condition to the while loop, something like linebuf[n[0]]=='#' ||
> >> > n[6]==len (i.e. skip lines with too few fields, possibly using a
> >> > different number instead of 6 if more appropriate). Or we could do
> >> > what I suggested before:
> >> >
> >> >> A less invasive change might be adding "%1[ \t\n\v\f\r]" and a dummy
> >> >> char* argument to collect the value before the " %d %d". Then you can
> >> >> check for cnt<1. But I'm not sure even the 4th field should be
> >> >> mandatory. This same apprach could be used to make just 3 mandatory if
> >> >> desired though.
> >> >
> >> > Thoughts?
> >>
> >> I think it would clearer to have an explicit check that the last
> >> mandatory field is set, like I currently do with the mnt_fsname check at
> >> the end of the function. I don't particularly mind how many fields are
> >> mandatory, as long as its four or fewer so Musl's behaviour follows the
> >> fstab format described in the man page.
> >>
> >> So overall my proposed revisions would be the following. There's a
> >> change to move to the next line if the current one is empty, and a
> >> change to ensure the first four fields are all present. (If you decide
> >> you'd like the fourth field to also be optional, we can just change
> >> mnt_opts to mnt_type in that check.) It's been a long time since I last
> >> this code, btw, so I hope I'm not missing anything around the empty line
> >> check. I'd be happy to put together a revised series, with these
> >> changes and a corresponding change to my libc-test patch, if you'd like.
> >>
> >> diff --git i/src/misc/mntent.c w/src/misc/mntent.c
> >> index 169e9789..7782cb10 100644
> >> --- i/src/misc/mntent.c
> >> +++ w/src/misc/mntent.c
> >> @@ -47,7 +47,7 @@ struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int bufle
> >> n, n+1, n+2, n+3, n+4, n+5, n+6, n+7,
> >> &mnt->mnt_freq, &mnt->mnt_passno) == EOF && ferror(f))
> >> return 0;
> >> - } while (linebuf[n[0]] == '#');
> >> + } while (linebuf[0] == '\n' || linebuf[n[0]] == '#');
> >>
> >> linebuf[n[1]] = 0;
> >> linebuf[n[3]] = 0;
> >> @@ -59,7 +59,7 @@ struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int bufle
> >> mnt->mnt_type = linebuf+n[4];
> >> mnt->mnt_opts = linebuf+n[6];
> >>
> >> - if (!*mnt->mnt_fsname)
> >> + if (!*mnt->mnt_opts)
> >> return 0;
> >>
> >> return mnt;
> >
> > What I still don't like here is that this changes the behavior on
> > something that's not a valid record (in whatever sense we define that)
> > from continuing the loop looking for the next one, to returning a null
> > pointer with no indication of what the error was. Treating empty lines
> > as an error (rather than continuing) was just one special case of
> > that. For an analogy, see how the pwd/grp functions work. Something
> > malformed in the file is just "not a record" and search continues for
> > the next valid record (if any) rather than giving the caller a
> > non-actionable (since this is assumed to be a sort of
> > trusted/authoritative data outside the application's control) error.
Just getting back to this now after the release.. Apologies for the
long delay.
> Okay, that makes sense. So it seems like our choices when faced with an
> invalid record are:
>
> • Skip it and move on to the next one, as you've proposed here; or
>
> • Try to fill in as many fields of the mntent structure as possible,
> and return successfully, like Glibc does.
>
> If we go with the first option, as you've proposed, do you think the
> difference in behaviour from Glibc would be an issue?
I think the latter is better. If I remember correctly what we'd found,
glibc hardly has any condition on what's mandatory (just first field?
or first and second?) so I think the condition would be something
like:
...
} while (linebuf[n[0]] == '#' || n[1]==len);
or similar, possibly replacing 1 with 3 if 2 fields are mandatory,
etc.
> (I'm mostly thinking of the case where a record doesn't have enough
> fields here. There are also the cases where a record has too many
> fields, or when fields 5 and 6 are numeric. I haven't looked into how
> Glibc handles those yet.)
A few other changes I think should be made:
- The two numeric fields should be read into temporaries initialized
with default values so that it doesn't matter if they're read or
not.
- The n[] array should be changed to size_t[] and the %n's to %zn's.
This should actually be done as a separate change, as it's a fix for
a bug overlooked when 05973dc3bbc1aca9b3c8347de6879ed72147ab3b made
the buffer length potentially longer than INT_MAX.
- There's also an independent bug in hasmntent that was reported a
long time ago then lost: it will return false positives when one
mntopt name is a substring of another. strstr is just not the right
operation here, at least not without added logic to ensure matching
on a whole option boundary. This is a separate issue that calls for
a separate patch though, not a blocker on the patch under discussion
here.
Rich
next prev parent reply other threads:[~2022-05-12 14:08 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-15 22:11 [musl] [PATCH v2 0/3] Alyssa Ross
2021-09-15 22:11 ` [musl] [PATCH libc-test v2 1/3] functional: add mntent test Alyssa Ross
2021-09-15 22:11 ` [musl] [PATCH libc-test v2 2/3] functional: add mntent test for single-field line Alyssa Ross
2021-09-15 22:11 ` [musl] [PATCH musl v2 3/3] mntent: fix parsing lines with optional fields Alyssa Ross
2021-09-20 4:21 ` Rich Felker
2022-01-09 3:18 ` Rich Felker
2022-01-13 16:30 ` Alyssa Ross
2022-01-13 17:40 ` Rich Felker
2022-01-13 18:53 ` Alyssa Ross
2022-05-12 14:08 ` Rich Felker [this message]
2022-05-12 18:02 ` Alyssa Ross
2022-05-12 18:15 ` Rich Felker
2022-05-15 18:38 ` Alyssa Ross
2022-05-15 23:19 ` Rich Felker
2022-05-16 10:19 ` Alyssa Ross
2022-05-12 20:58 ` Oliver Ford
2022-05-12 21:10 ` Rich Felker
2022-05-13 21:39 ` Oliver Ford
2022-05-14 4:24 ` Rich Felker
2022-05-14 22:16 ` Oliver Ford
2022-05-15 23:31 ` Rich Felker
2022-05-16 10:07 ` Alyssa Ross
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220512140835.GJ7074@brightrain.aerifal.cx \
--to=dalias@libc.org \
--cc=hi@alyssa.is \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).