mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: stdio fixes & internals documentation
Date: Thu, 28 May 2015 21:39:37 -0400	[thread overview]
Message-ID: <20150529013937.GA32093@brightrain.aerifal.cx> (raw)

I'm writing this up because I need to fix a bug in ungetc -- it's
failing when the target FILE is at eof rather than clearing the eof
status and succeeding, despite having logic in the source that was
intended to make this case work right. So I'm going to go ahead and
make some tentative documentation of stdio internals for my own use
and for others who want to review these changes. This stuff can
eventually be cleaned up to go into the musl manual... :-)

So far, just for stuff used in reading:

f->read() assumes reading is currently a valid operation (so __toread
was called successfully without a subsequent __towrite) and the read
buffer is empty (but the values of the pointers don't matter). Right
now it's responsible for setting EOF and error flags for the file.
This is probably a bad idea because the logic, which is stdio-global
and not specific to the FILE type, is repeated in different underlying
FILE type implementations' read functions.

__toread() assumes the FILE is in a valid state to begin reading, but
does not misbehave horribly even if it was in the middle of writing.
If the FILE's fopen mode does not permit reading, it sets error status
and returns EOF. If the file is at EOF, it also returns EOF. This
means the caller (such as ungetc) cannot distinguish EOF from "error
going into reading mode" without temporarily clearing and restoring
the error flag (which would be ugly and inefficient). On failure
(either eof status or error switching to read mode) the buffer
pointers are left unchanged; on success, they're all set to the
beginning of the buffer.

The first thing I want to do is fix the known bug in ungetc, and I
think the easiest way to do that is to make __toread set valid read
buffer pointers when it fails due to eof status. Then, instead of
ungetc checking the return value of __toread, it can instead call
__toread and then just check rpos. That is, instead of:

	if ((!f->rend && __toread(f)) || f->rpos <= f->buf - UNGET) {
		// error

it can instead do:

	if (!f->rend) __toread(f);
	if (f->rpos <= f->buf - UNGET) {
		// error

or perhaps:

	if (!f->rpos) __toread(f);
	if (!f->rpos || f->rpos <= f->buf - UNGET) {
		// error

I like the second version better because it does not assume that a
null pointer compares <= any valid pointer, which could be wrong if
pointer <= is implemented as a signed comparison. I changed the
conditional from !f->rend to !f->rpos to simultaneously optimize two
code paths:

1. If f->rpos is initially non-null, the compiler can skip re-checking
   that it's non-null in the second if.

2. If f->rpos is initially null, then after __toread it needs to be
   re-read, but only one field (rpos) needs to be loaded from memory
   instead of two (rend and rpos).

Note that if the file is not open in a mode that permits writing, then
f->rpos is initially a null pointer and __toread will never set it to
something non-null, nor will f->read (since it's not permitted to call
f->read without a successful __toread).

Once these fixes are taken care of I'd like to look at the EOF logic
in f->read() and moving it out to the callers (only __uflow and fread)
where we won't have to worry about bugs (which I think exist) in the
FILE-type-specific read functions.

Rich


             reply	other threads:[~2015-05-29  1:39 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-29  1:39 Rich Felker [this message]
2015-05-29  3:05 ` Rich Felker
2015-05-29  4:32 ` Rich Felker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150529013937.GA32093@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).