mailing list of musl libc
 help / color / mirror / code / Atom feed
* procfs stdio writev problem
@ 2013-05-05  9:16 Jens
  2013-05-05 10:01 ` Justin Cormack
  0 siblings, 1 reply; 5+ messages in thread
From: Jens @ 2013-05-05  9:16 UTC (permalink / raw)
  To: musl


Hello!

I've noticed a problem when using bash linked with musl.

laas:~# echo 60 > /proc/sys/kernel/panic
-su: echo: write error: Invalid argument

laas:~# cat t.sh
#!/bin/bash
echo 60 > /proc/sys/kernel/panic

laas:~# strace -f t.sh
...
writev(1, [{"60", 2}, {"\n", 1}], 2)    = 2
writev(1, [{"", 0}, {"\n", 1}], 2)      = -1 EINVAL (Invalid argument)

I'm guessing that musl uses writev in its stdio implementation.

And I think the error is due to a simplistic implementation in procfs, 
that parses each write on its own, and that the writev is split into 
several writes.

This is with linux kernel 3.6.0 btw.

I have no idea if something can and should be done about this, but likely 
somebody else will run into the same problem.

Regards,
Jens



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: procfs stdio writev problem
  2013-05-05  9:16 procfs stdio writev problem Jens
@ 2013-05-05 10:01 ` Justin Cormack
  2013-05-05 18:49   ` Jens
  0 siblings, 1 reply; 5+ messages in thread
From: Justin Cormack @ 2013-05-05 10:01 UTC (permalink / raw)
  To: musl

On Sun, May 5, 2013 at 10:16 AM, Jens <jensl@laas.mine.nu> wrote:
>
> Hello!
>
> I've noticed a problem when using bash linked with musl.
>
> laas:~# echo 60 > /proc/sys/kernel/panic
> -su: echo: write error: Invalid argument
>
> laas:~# cat t.sh
> #!/bin/bash
> echo 60 > /proc/sys/kernel/panic
>
> laas:~# strace -f t.sh
> ...
> writev(1, [{"60", 2}, {"\n", 1}], 2)    = 2
> writev(1, [{"", 0}, {"\n", 1}], 2)      = -1 EINVAL (Invalid argument)
>
> I'm guessing that musl uses writev in its stdio implementation.
>
> And I think the error is due to a simplistic implementation in procfs, that
> parses each write on its own, and that the writev is split into several
> writes.

Looks to me at a quick glance like stdio needs something like (untested)

--- ./src/stdio/__stdio_write.c~ 2012-12-01 22:56:34.156555480 +0000
+++ ./src/stdio/__stdio_write.c 2013-05-05 10:59:49.856504883 +0100
@@ -37,7 +37,7 @@
  return iovcnt == 2 ? 0 : len-iov[0].iov_len;
  }
  rem -= cnt;
- if (cnt > iov[0].iov_len) {
+ if (cnt >= iov[0].iov_len) {
  f->wpos = f->wbase = f->buf;
  cnt -= iov[0].iov_len;
  iov++; iovcnt--;

In the case where the kernel exactly eats the iov you need to move
onto the next one rather than have a zero length write pointing just
after the existing one, as that could be an invalid address.

Justin


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: procfs stdio writev problem
  2013-05-05 10:01 ` Justin Cormack
@ 2013-05-05 18:49   ` Jens
  2013-05-05 19:03     ` Rich Felker
  2013-05-11  3:49     ` Rich Felker
  0 siblings, 2 replies; 5+ messages in thread
From: Jens @ 2013-05-05 18:49 UTC (permalink / raw)
  To: musl


On Sun, 5 May 2013, Justin Cormack wrote:

> On Sun, May 5, 2013 at 10:16 AM, Jens <jensl@laas.mine.nu> wrote:
>>
>> Hello!
>>
>> I've noticed a problem when using bash linked with musl.
>>
>> laas:~# echo 60 > /proc/sys/kernel/panic
>> -su: echo: write error: Invalid argument
>>
>> laas:~# cat t.sh
>> #!/bin/bash
>> echo 60 > /proc/sys/kernel/panic
>>
>> laas:~# strace -f t.sh
>> ...
>> writev(1, [{"60", 2}, {"\n", 1}], 2)    = 2
>> writev(1, [{"", 0}, {"\n", 1}], 2)      = -1 EINVAL (Invalid argument)
>>
>> I'm guessing that musl uses writev in its stdio implementation.
>>
>> And I think the error is due to a simplistic implementation in procfs, that
>> parses each write on its own, and that the writev is split into several
>> writes.
>
> Looks to me at a quick glance like stdio needs something like (untested)
>
> --- ./src/stdio/__stdio_write.c~ 2012-12-01 22:56:34.156555480 +0000
> +++ ./src/stdio/__stdio_write.c 2013-05-05 10:59:49.856504883 +0100
> @@ -37,7 +37,7 @@
>  return iovcnt == 2 ? 0 : len-iov[0].iov_len;
>  }
>  rem -= cnt;
> - if (cnt > iov[0].iov_len) {
> + if (cnt >= iov[0].iov_len) {
>  f->wpos = f->wbase = f->buf;
>  cnt -= iov[0].iov_len;
>  iov++; iovcnt--;
>
> In the case where the kernel exactly eats the iov you need to move
> onto the next one rather than have a zero length write pointing just
> after the existing one, as that could be an invalid address.

In this case its not the zero length that is the problem.
The problem is that procfs treats each write (or apprently each part of 
the iov) as a separate operation.

So the first operation is "60" which is fine.
The next one is "\n" which is invalid.
So we get two operations instead of one.

The implementation in bash amounts to a printf("60") followed by 
putchar('\n');

The same thing in uclibc works as intended.

I guess I can patch bash, or use sysctl program.

AFAIK neither musl or procfs is doing anything wrong here, it just happens 
that a pure echo no longer works as it used to.

Cheers,
Jens

>
> Justin
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: procfs stdio writev problem
  2013-05-05 18:49   ` Jens
@ 2013-05-05 19:03     ` Rich Felker
  2013-05-11  3:49     ` Rich Felker
  1 sibling, 0 replies; 5+ messages in thread
From: Rich Felker @ 2013-05-05 19:03 UTC (permalink / raw)
  To: musl

On Sun, May 05, 2013 at 08:49:50PM +0200, Jens wrote:
> >--- ./src/stdio/__stdio_write.c~ 2012-12-01 22:56:34.156555480 +0000
> >+++ ./src/stdio/__stdio_write.c 2013-05-05 10:59:49.856504883 +0100
> >@@ -37,7 +37,7 @@
> > return iovcnt == 2 ? 0 : len-iov[0].iov_len;
> > }
> > rem -= cnt;
> >- if (cnt > iov[0].iov_len) {
> >+ if (cnt >= iov[0].iov_len) {
> > f->wpos = f->wbase = f->buf;
> > cnt -= iov[0].iov_len;
> > iov++; iovcnt--;
> >
> >In the case where the kernel exactly eats the iov you need to move
> >onto the next one rather than have a zero length write pointing just
> >after the existing one, as that could be an invalid address.
> 
> In this case its not the zero length that is the problem.
> The problem is that procfs treats each write (or apprently each part
> of the iov) as a separate operation.
> 
> So the first operation is "60" which is fine.
> The next one is "\n" which is invalid.
> So we get two operations instead of one.
> 
> The implementation in bash amounts to a printf("60") followed by
> putchar('\n');
> 
> The same thing in uclibc works as intended.
> 
> I guess I can patch bash, or use sysctl program.
> 
> AFAIK neither musl or procfs is doing anything wrong here, it just
> happens that a pure echo no longer works as it used to.

I would say this is clearly a bug in procfs, for multiple reasons:

1. As far as I can tell, there's no documentation that the value has
   to be written via a single syscall. Unless otherwise documented,
   one would except repeated calls to putc() on an unbuffered stream
   to work. As it stands, they don't (with any libc).

2. POSIX requires that readv and writev behave the same as the
   corresponding single read and write calls. There's at least one
   other Linux bug in this area that we're already working around,
   with regards to how readv works on terminals.

Unfortunately I don't see any clean workaround for the issue. Linux
purposefully made it impossible to determine if a file is a proc
pseudo-file via falsely reporting files in /proc as regular files (via
stat) so there's no way to even detect that a workaround would be
needed.

Rich


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: procfs stdio writev problem
  2013-05-05 18:49   ` Jens
  2013-05-05 19:03     ` Rich Felker
@ 2013-05-11  3:49     ` Rich Felker
  1 sibling, 0 replies; 5+ messages in thread
From: Rich Felker @ 2013-05-11  3:49 UTC (permalink / raw)
  To: musl

On Sun, May 05, 2013 at 08:49:50PM +0200, Jens wrote:
> >Looks to me at a quick glance like stdio needs something like (untested)
> >
> >--- ./src/stdio/__stdio_write.c~ 2012-12-01 22:56:34.156555480 +0000
> >+++ ./src/stdio/__stdio_write.c 2013-05-05 10:59:49.856504883 +0100
> >@@ -37,7 +37,7 @@
> > return iovcnt == 2 ? 0 : len-iov[0].iov_len;
> > }
> > rem -= cnt;
> >- if (cnt > iov[0].iov_len) {
> >+ if (cnt >= iov[0].iov_len) {
> > f->wpos = f->wbase = f->buf;
> > cnt -= iov[0].iov_len;
> > iov++; iovcnt--;
> >
> >In the case where the kernel exactly eats the iov you need to move
> >onto the next one rather than have a zero length write pointing just
> >after the existing one, as that could be an invalid address.
> 
> In this case its not the zero length that is the problem.
> The problem is that procfs treats each write (or apprently each part
> of the iov) as a separate operation.
> 
> So the first operation is "60" which is fine.
> The next one is "\n" which is invalid.
> So we get two operations instead of one.
> 
> The implementation in bash amounts to a printf("60") followed by
> putchar('\n');
> 
> The same thing in uclibc works as intended.
> 
> I guess I can patch bash, or use sysctl program.
> 
> AFAIK neither musl or procfs is doing anything wrong here, it just
> happens that a pure echo no longer works as it used to.

I think we can work around this (and possibly improve performance) by
modifying __overflow to store the new character into the buffer and
pass 0 length to __stdio_write. However, __stdio_write does not seem
prepared to deal with zero length specially, and although per the
standard it should not cause problems as long as the buffer is
non-empty, I don't trust Linux not to do stupid things with a zero iov
length on certain file types (like /proc).

So I think two changes are needed:

1. Modifying __overflow to submit the new char as part of the buffer.

2. Modifying __stdio_write to optimize the iov array, removing
   zero-length FILE-buffer or caller-buffer components. This will
   require some overhaul since the current code is ugly and using the
   iov index to make assumptions about whether it's operating on the
   FILE buffer or not.

It's this second change that I think should improve performance, since
Linux often behaves suboptimally on multiple iov components.

Rich


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-05-11  3:49 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-05  9:16 procfs stdio writev problem Jens
2013-05-05 10:01 ` Justin Cormack
2013-05-05 18:49   ` Jens
2013-05-05 19:03     ` Rich Felker
2013-05-11  3:49     ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).