mailing list of musl libc
 help / color / mirror / code / Atom feed
From: William Pitcock <nenolod@dereferenced.org>
To: musl@lists.openwall.com
Subject: Re: [PATCH v7] stdio: implement fopencookie(3)
Date: Fri, 17 Nov 2017 13:52:05 -0600	[thread overview]
Message-ID: <CA+T2pCHgA0Vy5vb+r_SNof62qzptfddW6R-CHUgnuOrzfHSRZA@mail.gmail.com> (raw)
In-Reply-To: <20171117173541.GD1627@brightrain.aerifal.cx>

Hi,

On Fri, Nov 17, 2017 at 11:35 AM, Rich Felker <dalias@libc.org> wrote:
> On Thu, Nov 16, 2017 at 06:00:55AM +0000, William Pitcock wrote:
>> The fopencookie(3) function allows the programmer to create a custom
>> stdio implementation, using four hook functions which operate on a
>> "cookie" data type.
>>
>> [...]
>> +
>> +struct fcookie {
>> +     void *cookie;
>> +     cookie_io_functions_t iofuncs;
>> +};
>> +
>> +struct cookie_FILE {
>> +     FILE f;
>> +     struct fcookie fc;
>> +     unsigned char buf[UNGET+BUFSIZ];
>> +};
>> +
>> +static size_t cookieread(FILE *f, unsigned char *buf, size_t len)
>> +{
>> +     struct fcookie *fc = f->cookie;
>> +     ssize_t ret = -1;
>> +     size_t remain = len, readlen = 0;
>> +
>> +     if (!fc->iofuncs.read) goto bail;
>> +
>> +     ret = fc->iofuncs.read(fc->cookie, (char *) buf, len > 1 ? (len - 1) : 1);
>> +     if (ret <= 0) goto bail;
>> +
>> +     readlen += ret;
>> +     remain -= ret;
>> +
>> +     f->rpos = f->buf;
>> +     ret = fc->iofuncs.read(fc->cookie, (char *) f->rpos, f->buf_size);
>> +     if (ret <= 0) goto bail;
>> +     f->rend = f->rpos + ret;
>> +
>> +     if (remain > 0) {
>> +             if (remain > f->buf_size) remain = f->buf_size;
>> +             memcpy(buf + readlen, f->rpos, remain);
>> +             readlen += remain;
>> +             f->rpos += remain;
>> +     }
>> +
>> +     return readlen;
>> +
>> +bail:
>> +     f->flags |= F_EOF ^ ((F_ERR^F_EOF) & ret);
>> +     f->rpos = f->rend = f->buf;
>> +     return readlen;
>> +}
>
> At least the following bugs are still present:
>
> 1. If len==1, the read func is called twice, and might block the
>    second time even though the requested read of 1 byte succeeded.
>
> 2. The return value is wrong. It must be between 0 and len and
>    represent the amount of data made available to the caller. Buffer
>    filling is not supposed to be included in that since it's an
>    internal matter, not something the caller sees.
>
> 3. A return value <=0 from buffer filling step is not cause to set the
>    EOF or error indicators, since the part the caller requested
>    succeeded. But see below...
>
> Regarding issues 1 and 3, I think the right fix is to make the first
> call conditional on len>1, and always use len-1. This makes it so the
> second (possibly only) read call is never redundant/always part of the
> caller-requested read, so that you can honor its EOF or error result.
> Otherwise errors could be lost.
>
> Note that you also need to be able to handle the case where there is
> no buffer, in case setvbuf was called to disable buffering for the
> stream. In that case only the first read should happen and it should
> have full length len, not len-1.
>
> Really all of this logic should be identical to what's in
> __stdio_read.c, but with calls to the read backend rather than readv.

I believe I have reworked the read logic to match __stdio_read.c more closely.

I will be sending v8 shortly, which I hope to be the final version.

>
>> +static size_t cookiewrite(FILE *f, const unsigned char *buf, size_t len)
>> +{
>> +     struct fcookie *fc = f->cookie;
>> +     ssize_t ret;
>> +     size_t len2 = f->wpos - f->wbase;
>> +     if (!fc->iofuncs.write) return len;
>> +     if (len2) {
>> +             f->wpos = f->wbase;
>> +             if (cookiewrite(f, f->wpos, len2) < len2) return 0;
>> +     }
>> +     ret = fc->iofuncs.write(fc->cookie, (const char *) buf, len);
>> +     if (ret < 0) {
>> +             f->wpos = f->wbase = f->wend = 0;
>> +             f->flags |= F_ERR;
>> +             return 0;
>> +     }
>> +     return ret;
>> +}
>
> I don't see any bugs here.
>
>> +static off_t cookieseek(FILE *f, off_t off, int whence)
>> +{
>> +     struct fcookie *fc = f->cookie;
>> +     int res;
>> +     if (whence > 2) {
>
> Needs to be 2U. Otherwise you don't catch negative.

Done.

>> +             errno = EINVAL;
>> +             return -1;
>> +     }
>> +     if (!fc->iofuncs.seek) {
>> +             errno = ENOTSUP;
>> +             return -1;
>> +     }
>
> Is there a reason to prefer ENOTSUP here? Generally ESPIPE is the
> error for non-seekable FILE types, but maybe there's a reason
> something else is preferred here.

FreeBSD's implementation[1] used ENOTSUP for this case.

[1]: https://github.com/freebsd/freebsd/blob/master/lib/libc/stdio/fopencookie.c#L140-L143

glibc does not seem to set any errno at all for this case.

I think it is good to follow established convention though, since the
FreeBSD implementation does set errno, by matching theirs.

>
>> +FILE *fopencookie(void *cookie, const char *mode, cookie_io_functions_t iofuncs)
>> +{
>> +     struct cookie_FILE *f;
>> +
>> +     /* Check for valid initial mode character */
>> +     if (!strchr("rwa", *mode)) {
>> +             errno = EINVAL;
>> +             return 0;
>> +     }
>> +
>> +     /* Allocate FILE+fcookie+buffer or fail */
>> +     if (!(f=malloc(sizeof *f))) return 0;
>> +
>> +     /* Zero-fill only the struct, not the buffer */
>> +     memset(f, 0, sizeof(FILE));
>
> Largely style, but I would prefer memset(&f->f, 0, sizeof f->f) (i.e.
> not encoding a hidden assumption that f->f is first member and always
> using sizeof the object pointed to by first arg rather than a type to
> ensure consistency);

Done.

>
>> +
>> +     /* Impose mode restrictions */
>> +     if (!strchr(mode, '+')) f->f.flags = (*mode == 'r') ? F_NOWR : F_NORD;
>> +
>> +     /* Set up our fcookie */
>> +     f->fc.cookie = cookie;
>> +     f->fc.iofuncs.read = iofuncs.read;
>> +     f->fc.iofuncs.write = iofuncs.write;
>> +     f->fc.iofuncs.seek = iofuncs.seek;
>> +     f->fc.iofuncs.close = iofuncs.close;
>> +
>> +     f->f.fd = -1;
>> +     f->f.cookie = &f->fc;
>> +     f->f.buf = f->buf;
>
> Must be f->buf+UNGET. Otherwise ungetc will underflow.

Done.

>
>> +     f->f.buf_size = BUFSIZ;
>> +     f->f.lbf = EOF;
>> +
>> +     /* enable opportunistic stdio locking */
>> +     f->f.lock = 0;
>
> This comment seems wrong. I think the line is unnecessary/redundant
> since you memset, probably leftover from when you had it as -1.

I removed it from v8.

William


      parent reply	other threads:[~2017-11-17 19:52 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-16  6:00 William Pitcock
2017-11-17 17:35 ` Rich Felker
2017-11-17 17:57   ` Rich Felker
2017-11-17 19:52   ` William Pitcock [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+T2pCHgA0Vy5vb+r_SNof62qzptfddW6R-CHUgnuOrzfHSRZA@mail.gmail.com \
    --to=nenolod@dereferenced.org \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).