mailing list of musl libc
* fseek EOVERFLOW
@ 2015-06-29 13:19 Alexander Monakov
  2015-06-29 14:45 ` Justin Cormack
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Monakov @ 2015-06-29 13:19 UTC
  To: musl

Hello,

if I run the following test:

cat <<EOF >test-fseek.c

#include <stdio.h>
#include <string.h>
#include <errno.h>
int main()
{
  FILE *f = fopen("/dev/zero", "r");
  int r = fseek(f, -1, SEEK_SET);
  printf("%d %s\n", r, strerror(errno));
  return 0;
}

EOF

I observe the following results:

- on musl, the argument (-1) is sign-extended for syscall, and no failure is
  reported;

- on glibc, the argument is sign-extended for syscall (and a syscall is made),
  and return value 'r' is set to -1 to indicate an error, but errno is not
  set.

It's not entirely obvious to me if (and how) the implementation should
diagnose this, but in light of the fact that a syscall is made with a huge
64-bit value as offset, the following seems to apply on 32-bit platforms:

[EOVERFLOW]
    [CX] For fseek(), the resulting file offset would be a value which cannot
    be represented correctly in an object of type long.


(I hit this issue due to using fseek with size_t offsets and SEEK_SET: on
32-bit, my offsets in range 2G-4G were sign-extended, leading to failure with
unclear diagnostics)
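
A minimal sketch of a guard for the size_t case described above. The helper
name seek_set_size is hypothetical and not part of any libc; it assumes the
caller wants EOVERFLOW reported up front instead of a silently negative
offset being passed to fseek.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: seek to an absolute size_t offset, refusing values
 * that do not fit in the 'long' parameter of fseek. */
int seek_set_size(FILE *f, size_t off)
{
    if (off > (size_t)LONG_MAX) {
        errno = EOVERFLOW;      /* offset not representable as long */
        return -1;
    }
    return fseek(f, (long)off, SEEK_SET);
}
```

For offsets that really do exceed LONG_MAX on a 32-bit target, fseeko with a
64-bit off_t is the usual route; the guard above only makes the failure
visible.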

Thanks.
Alexander



* Re: fseek EOVERFLOW
  2015-06-29 13:19 fseek EOVERFLOW Alexander Monakov
@ 2015-06-29 14:45 ` Justin Cormack
  2015-06-29 14:51   ` Alexander Monakov
  0 siblings, 1 reply; 7+ messages in thread
From: Justin Cormack @ 2015-06-29 14:45 UTC
  To: musl

On 29 June 2015 at 14:19, Alexander Monakov <amonakov@ispras.ru> wrote:
> Hello,
>
> if I run the following test:
>
> cat <<EOF >test-fseek.c
>
> #include <stdio.h>
> #include <string.h>
> #include <errno.h>
> int main()
> {
>   FILE *f = fopen("/dev/zero", "r");
>   int r = fseek(f, -1, SEEK_SET);
>   printf("%d %s\n", r, strerror(errno));
>   return 0;
> }
>
> EOF
>
> I observe the following results:
>
> - on musl, the argument (-1) is sign-extended for syscall, and no failure is
>   reported;
>
> - on glibc, the argument is sign-extended for syscall (and a syscall is made),
>   and return value 'r' is set to -1 to indicate an error, but errno is not
>   set.
>
> It's not entirely obvious to me if (and how) the implementation should
> diagnose this, but in light of the fact that a syscall is made with a huge
> 64-bit value as offset, the following seems to apply on 32-bit platforms:
>
> [EOVERFLOW]
>     [CX] For fseek(), the resulting file offset would be a value which cannot
>     be represented correctly in an object of type long.
>
>
> (I hit this issue due to using fseek with size_t offsets and SEEK_SET: on
> 32-bit, my offsets in range 2G-4G were sign-extended, leading to failure with
> unclear diagnostics)

The sign extension is correct - the argument to fseek is off_t (and so
is the return value, it is not an int), and off_t is always 64 bit on
Musl. For glibc it depends if it is compiled with LARGEFILE_SOURCE.

So the sign extension is nothing to do with libc, it is your code.

Negative offsets are allowed by POSIX, and Linux does seem ok with
them; you need to reset errno before lseek. I don't have a 32-bit musl
machine at the minute, I should install one, but on 64-bit I get 0
returned correctly.
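
As a rough sketch of the errno point, assuming a small wrapper is acceptable
in the test: the hypothetical helper fseek_report below clears errno before
the call and only reports it when fseek signals failure, so a stale value
from an earlier call cannot leak into the printed diagnostic.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper for illustration; not part of any libc. */
int fseek_report(FILE *f, long off, int whence, int *err_out)
{
    errno = 0;                        /* drop any stale value from earlier calls */
    int r = fseek(f, off, whence);
    *err_out = (r != 0) ? errno : 0;  /* errno is only meaningful on failure */
    return r;
}
```

The original test could then print strerror(*err_out) only when the return
value is nonzero.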

Justin



* Re: fseek EOVERFLOW
  2015-06-29 14:45 ` Justin Cormack
@ 2015-06-29 14:51   ` Alexander Monakov
  2015-06-29 15:11     ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Monakov @ 2015-06-29 14:51 UTC
  To: musl

On Mon, 29 Jun 2015, Justin Cormack wrote:
> The sign extension is correct - the argument to fseek is off_t (and so
> is the return value, it is not an int), and off_t is always 64 bit on
> Musl. For glibc it depends if it is compiled with LARGEFILE_SOURCE.

No; please consult the documentation.  The 'offset' argument to fseek is
'long', and the return value is 'int' (either -1 on error, or 0).

Alexander



* Re: fseek EOVERFLOW
  2015-06-29 14:51   ` Alexander Monakov
@ 2015-06-29 15:11     ` Rich Felker
  2015-06-29 15:31       ` Szabolcs Nagy
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2015-06-29 15:11 UTC
  To: musl

On Mon, Jun 29, 2015 at 05:51:05PM +0300, Alexander Monakov wrote:
> On Mon, 29 Jun 2015, Justin Cormack wrote:
> > The sign extension is correct - the argument to fseek is off_t (and so
> > is the return value, it is not an int), and off_t is always 64 bit on
> > Musl. For glibc it depends if it is compiled with LARGEFILE_SOURCE.
> 
> No; please consult the documentation.  The 'offset' argument to fseek is
> 'long', and the return value is 'int' (either -1 on error, or 0).

I think Justin was referring to this:

> (I hit this issue due to using fseek with size_t offsets and SEEK_SET: on
> 32-bit, my offsets in range 2G-4G were sign-extended, leading to failure with
> unclear diagnostics)

Sign extension is the wrong word here; rather, it's coercion of an
unsigned 32-bit type into a signed 32-bit type, which is a valid
implicit conversion but doesn't do what you want. After the value
becomes negative, what happens is just like what would happen if you
had intentionally passed a negative value. I'm not sure why the
kernel's not catching it as invalid, but I'm also not sure what we
would do to fix things up if it doesn't... It's not easy to treat
negative offsets as invalid from userspace except in the special case
of SEEK_SET; for the others, userspace can't see that the result is
going to be negative.
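
The conversion described above can be sketched as follows. This is only an
illustration: the helper names are hypothetical, and int32_t stands in for a
32-bit 'long' so the effect shows up on a 64-bit host as well. The wrapping
result relies on the implementation-defined conversion that GCC and Clang
document.

```c
#include <stdint.h>

/* Hypothetical helper: models passing an unsigned 32-bit offset where a
 * signed 32-bit 'long' is expected. Converting an out-of-range unsigned
 * value to a signed type is implementation-defined in C; GCC and Clang
 * wrap modulo 2^32, which is the behaviour assumed here. */
int32_t as_32bit_long(uint32_t off)
{
    return (int32_t)off;
}

/* For SEEK_SET the resulting offset is the argument itself, so userspace
 * can see up front whether the target is negative. */
int seek_set_target_is_negative(uint32_t off)
{
    return as_32bit_long(off) < 0;
}
```

For SEEK_CUR and SEEK_END no such check is possible without first asking the
kernel for the current position or file size.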

Rich





* Re: fseek EOVERFLOW
  2015-06-29 15:11     ` Rich Felker
@ 2015-06-29 15:31       ` Szabolcs Nagy
  2015-06-29 16:05         ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Szabolcs Nagy @ 2015-06-29 15:31 UTC
  To: musl

* Rich Felker <dalias@libc.org> [2015-06-29 11:11:03 -0400]:
> On Mon, Jun 29, 2015 at 05:51:05PM +0300, Alexander Monakov wrote:
> > (I hit this issue due to using fseek with size_t offsets and SEEK_SET: on
> > 32-bit, my offsets in range 2G-4G were sign-extended, leading to failure with
> > unclear diagnostics)
> 
> Sign extension is the wrong word here; rather, it's coercion of an
> unsigned 32-bit type into a signed 32-bit type, which is a valid
> implicit conversion but doesn't do what you want. After the value
> becomes negative, what happens is just like what would happen if you
> had intentionally passed a negative value. I'm not sure why the
> kernel's not catching it as invalid, but I'm also not sure what we
> would do to fix things up if it doesn't... It's not easy to treat
> negative offsets as invalid from userspace except in the special case
> of SEEK_SET; for the others, userspace can't see that the result is
> going to be negative.

this affects lseek conformance too:

    [EINVAL]
        The whence argument is not a proper value, or the resulting
        file offset would be negative for a regular file, block
        special file, or directory.
    [EOVERFLOW]
        The resulting file offset would be a value which cannot be
        represented correctly in an object of type off_t.

and then in the rationale:

    An invalid file offset that would cause [EINVAL] to be returned
    may be both implementation-defined and device-dependent (for
    example, memory may have few invalid values). A negative file
    offset may be valid for some devices in some implementations.

    The POSIX.1-1990 standard did not specifically prohibit lseek()
    from returning a negative offset. Therefore, an application was
    required to clear errno prior to the call and check errno upon
    return to determine whether a return value of (off_t)-1 is a
    negative offset or an indication of an error condition. The
    standard developers did not wish to require this action on the
    part of a conforming application, and chose to require that
    errno be set to [EINVAL] when the resulting file offset would
    be negative for a regular file, block special file, or directory.

the kernel side should fix this.. unless they consider
/dev/zero a special device where negative offset is valid,
and work correctly on regular files.

if lseek is conforming then i think fseek would work too.
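
a minimal sketch of the errno protocol from the rationale quoted above,
assuming a hypothetical wrapper named lseek_checked (not part of any libc).
it clears errno first, so a return of (off_t)-1 counts as an error only when
errno was actually set:

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 and stores the new position on success,
 * or -1 with errno set on failure. */
int lseek_checked(int fd, off_t off, int whence, off_t *pos_out)
{
    errno = 0;
    off_t pos = lseek(fd, off, whence);
    if (pos == (off_t)-1 && errno != 0)
        return -1;      /* genuine failure; errno says why */
    *pos_out = pos;     /* could be negative on a device that permits it */
    return 0;
}
```

this only illustrates the application-side protocol; a libc syscall wrapper
may itself map small negative kernel return values to errors before the
caller ever sees them.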



* Re: fseek EOVERFLOW
  2015-06-29 15:31       ` Szabolcs Nagy
@ 2015-06-29 16:05         ` Rich Felker
  2015-06-29 16:43           ` Alexander Monakov
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2015-06-29 16:05 UTC
  To: musl

On Mon, Jun 29, 2015 at 05:31:27PM +0200, Szabolcs Nagy wrote:
> the kernel side should fix this.. unless they consider
> /dev/zero a special device where negative offset is valid,
> and work correctly on regular files.
> 
> if lseek is conforming then i think fseek would work too.

I missed that it was operating on /dev/zero. I suspect the kernel
considers all offsets (negative and positive) valid for /dev/zero.
Obviously this makes the return value of the lseek syscall ambiguous
(is it an error value or a small negative number?) so I think, if this
is the case, it's a bug that should be fixed on the kernel side.

Rich



* Re: fseek EOVERFLOW
  2015-06-29 16:05         ` Rich Felker
@ 2015-06-29 16:43           ` Alexander Monakov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Monakov @ 2015-06-29 16:43 UTC
  To: musl

On Mon, 29 Jun 2015, Rich Felker wrote:

> On Mon, Jun 29, 2015 at 05:31:27PM +0200, Szabolcs Nagy wrote:
> > the kernel side should fix this.. unless they consider
> > /dev/zero a special device where negative offset is valid,
> > and work correctly on regular files.
> > 
> > if lseek is conforming then i think fseek would work too.
> 
> I missed that it was operating on /dev/zero. I suspect the kernel
> considers all offsets (negative and positive) valid for /dev/zero.
> Obviously this makes the return value of the lseek syscall ambiguous
> (is it an error value or a small negative number?) so I think, if this
> is the case, it's a bug that should be fixed on the kernel side.

Well, /dev/zero was there only for the sake of making a small testcase; my
actual failure was with a /proc/<pid>/mem file, but I guess the same reasoning
applies.

If I make the test work on a "real" file (a 4G unallocated file on a tmpfs),
and compile with -m32 (plus -D_FILE_OFFSET_BITS=64 on glibc, otherwise fopen
fails), the kernel returns EINVAL.

I think the bit I was missing is that libc implementations expect the kernel
to always fail SEEK_SET lseek for offsets with high bit set.

Thanks.
Alexander


