mailing list of musl libc
* Making stdio writes robust/recoverable under errors
@ 2015-05-29 13:53 Rich Felker
  2015-05-29 14:08 ` Rich Felker
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Rich Felker @ 2015-05-29 13:53 UTC (permalink / raw)
  To: musl

Currently, musl's stdio write operations take the liberty of leaving
the current position and actual contents written rather unpredictable
if a write error occurs. Originally the motivation was probably a mix
of uncertainty about what to do and lack of code (or a desire to avoid
complexity) for tracking what part of the buffer was unwritten if a
write error occurred before the buffer was written out. But commit
58165923890865a6ac042fafce13f440ee986fd9, as part of adding
cancellation support, added the code to track what part of the buffer
remains unwritten, and therefore, from what I can see, avoiding data
loss when stdio writes fail transiently (like EINTR, for instance) is
simply a matter of removing line 31 in the error path of
__stdio_write:

			f->wpos = f->wbase = f->wend = 0;

so that the buffer is not thrown away.

An attached test program demonstrates the issue. It assumes Linux 64k
pipe buffers so it's not appropriate for inclusion in tests in its
current form, but it gets the job done where it works. It attempts to
send 128k over a pipe via write followed by a series of fwrite retries
that are interrupted by signals that cause EINTR; thanks to the
kernel's pipe buffer semantics, there's basically no timing/race
aspect involved. With current musl, I get only 130560 across the pipe;
with the above line removed, I get all 131072.

I think this is a good change to make from the standpoint of
robustness. Even if most errors are not recoverable, some may be; in
fact ENOSPC could be handled meaningfully if the calling program knows
it has temp files it could delete. What I am concerned about, though,
are residual assumptions that, after f->write, the write buffer will
be clear. With this change, fflush and similar operations would no
longer discard the buffer on error, meaning we would need to check
that callers don't need that property. As another example, fseek fails
when the buffer can't be written out, but since the current buffer
write throws away the buffer on failure, a second fseek call will
succeed; this change would leave it failing. (As an aside, rewind
already fails in this case and throws away the error; this is almost
certainly wrong. It should probably attempt to flush and throw away
the buffer on failure before seeking so that the seek always
succeeds.)

I don't want to make these changes right now at the end of a release
cycle but I might promote them to important targets for the next
cycle.

Rich



* Re: Making stdio writes robust/recoverable under errors
  2015-05-29 13:53 Making stdio writes robust/recoverable under errors Rich Felker
@ 2015-05-29 14:08 ` Rich Felker
  2015-05-29 14:36 ` Rich Felker
  2015-06-05 21:14 ` Rich Felker
  2 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2015-05-29 14:08 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 649 bytes --]

On Fri, May 29, 2015 at 09:53:00AM -0400, Rich Felker wrote:
> An attached test program demonstrates the issue. It assumes Linux 64k
> pipe buffers so it's not appropriate for inclusion in tests in its
> current form, but it gets the job done where it works. It attempts to
> send 128k over a pipe via write followed by a series of fwrite retries
> that are interrupted by signals that cause EINTR; thanks to the
> kernel's pipe buffer semantics, there's basically no timing/race
> aspect involved. With current musl, I get only 130560 across the pipe;
> with the above line removed, I get all 131072.

And here is the missing attachment. :-)

Rich

[-- Attachment #2: fwrite_intr.c --]
[-- Type: text/plain, Size: 934 bytes --]

#include <pthread.h>
#include <stdio.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>

char bigbuf[65536];

void *thr(void *ptr)
{
	int *p = ptr;
	char tmp[256];
	ssize_t l;   /* read returns ssize_t; size_t would hide errors */
	size_t t=0;
	for (;;) {
		l = read(p[0], tmp, sizeof tmp);
		if (l <= 0) break;
		t += l;
	}
	printf("read %zu\n", t);
	return 0;
}

void dummy(int sig)
{
}

int main()
{
	int testpipe[2];
	pipe(testpipe);
	FILE *f = fdopen(testpipe[1], "w");
	struct sigaction sa = { .sa_handler = dummy };
	sigaction(SIGALRM, &sa, 0);
	//alarm(1);
	setitimer(ITIMER_REAL, &(struct itimerval){ .it_interval.tv_usec = 10000, .it_value.tv_usec = 10000 }, 0);
	char *p = bigbuf;
	size_t n = sizeof bigbuf, l;
	memset(p, 42, n);

	/* Fill the 64k pipe buffer so the fwrites below cannot complete
	 * until the reader thread starts draining it. */
	write(testpipe[1], p, n);
	l = fwrite(p, 1, 512, f);
	p+=l, n-=l;
	l = fwrite(p, 1, n, f);

	pthread_t td;
	pthread_create(&td, 0, thr, testpipe);

	while (p+=l, n-=l)
		l = fwrite(p, 1, n, f);

	fclose(f);

	pthread_join(td, 0);
}


* Re: Making stdio writes robust/recoverable under errors
  2015-05-29 13:53 Making stdio writes robust/recoverable under errors Rich Felker
  2015-05-29 14:08 ` Rich Felker
@ 2015-05-29 14:36 ` Rich Felker
  2015-06-05 21:14 ` Rich Felker
  2 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2015-05-29 14:36 UTC (permalink / raw)
  To: musl

On Fri, May 29, 2015 at 09:53:00AM -0400, Rich Felker wrote:
> Currently, musl's stdio write operations take the liberty of leaving
> the current position and actual contents written rather unpredictable
> if a write error occurs. Originally the motivation was probably a mix
> of uncertainty about what to do and lack of code (or a desire to avoid
> complexity) for tracking what part of the buffer was unwritten if a
> write error occurred before the buffer was written out. But commit
> 58165923890865a6ac042fafce13f440ee986fd9, as part of adding
> cancellation support, added the code to track what part of the buffer
> remains unwritten, and therefore, from what I can see, avoiding data
> loss when stdio writes fail transiently (like EINTR, for instance) is
> simply a matter of removing line 31 in the error path of
> __stdio_write:
> 
> 			f->wpos = f->wbase = f->wend = 0;
> 
> so that the buffer is not thrown away.

fflush also does its own discarding, on line 15, with the same
assignment -- but in the 'success' path, which is the path taken once
the above line in __stdio_write is removed. In other words, with that
change fflush fails to detect the failure. I think the code on lines 6-9 of
fflush.c should be changed from:

	if (f->wpos > f->wbase) {
		f->write(f, 0, 0);
		if (!f->wpos) return EOF;
	}

to explicitly check for non-empty buffer:

	if (f->wpos > f->wbase) {
		f->write(f, 0, 0);
		if (f->wpos > f->wbase) return EOF;
	}

Then the subsequent zeroing of the buffer pointers in the success case
is not discarding anything, but just disabling writes through the
buffer.

Rich



* Re: Making stdio writes robust/recoverable under errors
  2015-05-29 13:53 Making stdio writes robust/recoverable under errors Rich Felker
  2015-05-29 14:08 ` Rich Felker
  2015-05-29 14:36 ` Rich Felker
@ 2015-06-05 21:14 ` Rich Felker
  2015-06-05 21:47   ` Laurent Bercot
  2 siblings, 1 reply; 6+ messages in thread
From: Rich Felker @ 2015-06-05 21:14 UTC (permalink / raw)
  To: musl

On Fri, May 29, 2015 at 09:53:00AM -0400, Rich Felker wrote:
> Currently, musl's stdio write operations take the liberty of leaving
> the current position and actual contents written rather unpredictable
> if a write error occurs. Originally the motivation was probably a mix
> of uncertainty about what to do and lack of code (or a desire to avoid
> complexity) for tracking what part of the buffer was unwritten if a
> write error occurred before the buffer was written out. But commit
> 58165923890865a6ac042fafce13f440ee986fd9, as part of adding
> cancellation support, added the code to track what part of the buffer
> remains unwritten, and therefore, from what I can see, avoiding data
> loss when stdio writes fail transiently (like EINTR, for instance) is
> simply a matter of removing line 31 in the error path of
> __stdio_write:
> 
> 			f->wpos = f->wbase = f->wend = 0;
> 
> so that the buffer is not thrown away.
> 
> An attached test program demonstrates the issue. It assumes Linux 64k
> pipe buffers so it's not appropriate for inclusion in tests in its
> current form, but it gets the job done where it works. It attempts to
> send 128k over a pipe via write followed by a series of fwrite retries
> that are interrupted by signals that cause EINTR; thanks to the
> kernel's pipe buffer semantics, there's basically no timing/race
> aspect involved. With current musl, I get only 130560 across the pipe;
> with the above line removed, I get all 131072.
> 
> I think this is a good change to make from the standpoint of
> robustness. Even if most errors are not recoverable, some may be; in
> fact ENOSPC could be handled meaningfully if the calling program knows
> it has temp files it could delete. What I am concerned about, though,
> are residual assumptions that, after f->write, the write buffer will
> be clear. With this change, fflush and similar operations would no
> longer discard the buffer on error, meaning we would need to check
> that callers don't need that property. As another example, fseek fails
> when the buffer can't be written out, but since the current buffer
> write throws away the buffer on failure, a second fseek call will
> succeed; this change would leave it failing. (As an aside, rewind
> already fails in this case and throws away the error; this is almost
> certainly wrong. It should probably attempt to flush and throw away
> the buffer on failure before seeking so that the seek always
> succeeds.)
> 
> I don't want to make these changes right now at the end of a release
> cycle but I might promote them to important targets for the next
> cycle.

I'm a bit undecided on whether to move forward with this or not. While
there's some benefit to being able to resume/recover transient errors
that occur during buffered writing, there are also disadvantages: a
stream that has buffered unwritable output due to error is stuck in a
permanent state of being unable to seek, switch to reading, etc.

I don't think the EINTR case is very interesting; using EINTR to
interrupt stdio writes is not viable anyway since, if some data was
already written, the underlying write will return with a short write
and stdio will re-block trying to write the rest. There is one
interesting case: in the event of ENOSPC, a caller could remove some
temp files and then attempt to resume writing.

On the other hand, the permanent "no space" condition that would
result for fmemopen is rather ugly/unfortunate, because this condition
"should" be caught before the data is buffered, but isn't.

A similar condition exists for open_[w]memstream, but those would only
error out for "no space" when malloc fails.

I'm still not clear on which behavior is best, but probably going to
put aside ideas of changing this for now unless good reasons come to
light from discussion.

Rich



* Re: Making stdio writes robust/recoverable under errors
  2015-06-05 21:14 ` Rich Felker
@ 2015-06-05 21:47   ` Laurent Bercot
  2015-06-06  5:32     ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Laurent Bercot @ 2015-06-05 21:47 UTC (permalink / raw)
  To: musl

On 05/06/2015 23:14, Rich Felker wrote:
> I'm a bit undecided on whether to move forward with this or not. While
> there's some benefit to being able to resume/recover transient errors
> that occur during buffered writing, there are also disadvantages

  My unpopular but firm opinion, which I already have expressed here, is
that stdio is a toy interface that should not be used for any output that
requires the simplest hint of reliability. The loss of information about
the number of bytes actually written when an error occurs is a major
pain point; even if the underlying implementation is as good as can be,
the API is just too poor and it's either fight the interface (in which
case you have already lost), use non-portable extensions (which musl aims
to avoid), or put stdio back into the trash bin and use something else
entirely.

  People who use stdio and expect good behaviour when an error occurs
deserve what's coming at them, and I think that worse is better in this
case: fail as badly as is permitted by the standard instead of trying to
salvage as many scraps as you can. Even if you manage to save a few close
calls, the next crash and burn is just around the corner. So, I vote for
not devoting any more brain power to "stdio robustness" than is strictly
necessary to be standards-compliant.

-- 
  Laurent




* Re: Making stdio writes robust/recoverable under errors
  2015-06-05 21:47   ` Laurent Bercot
@ 2015-06-06  5:32     ` Rich Felker
  0 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2015-06-06  5:32 UTC (permalink / raw)
  To: musl

On Fri, Jun 05, 2015 at 11:47:33PM +0200, Laurent Bercot wrote:
> On 05/06/2015 23:14, Rich Felker wrote:
> >I'm a bit undecided on whether to move forward with this or not. While
> >there's some benefit to being able to resume/recover transient errors
> >that occur during buffered writing, there are also disadvantages
> 
>  My unpopular but firm opinion, which I already have expressed here, is
> that stdio is a toy interface that should not be used for any output that
> requires the simplest hint of reliability. The loss of information about
> the number of bytes actually written when an error occurs is a major
> pain point; even if the underlying implementation is as good as can be,
> the API is just too poor and it's either fight the interface (in which
> case you have already lost), use non-portable extensions (which musl aims
> to avoid), or put stdio back into the trash bin and use something else
> entirely.
> 
>  People who use stdio and expect good behaviour when an error occurs
> deserve what's coming at them, and I think that worse is better in this
> case: fail as badly as is permitted by the standard instead of trying to
> salvage as many scraps as you can. Even if you manage to save a few close
> calls, the next crash and burn is just around the corner. So, I vote for
> not devoting any more brain power to "stdio robustness" than is strictly
> necessary to be standards-compliant.

Yes, I'm aware of your position, and while I don't agree with it in
general -- there are plenty of cases where the weak control/guarantees
stdio gives you are perfectly fine and worth the simplicity and
portability they buy you -- I think you're right on this one. I meant
to mention that glibc currently behaves like musl does now, at least
for regular files.

I still think this issue should be kept "open" at least until musl's
stdio internals are well-documented, because there might currently be
inconsistent behavior with regards to how fmemopen and
open_[w]memstream files behave, which should be investigated. But I'm
leaning towards not trying to preserve buffer contents after write
errors.

Rich



Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
