mailing list of musl libc
* [musl] Invalid pointer subtractions in __shlim and __shgetc
From: Pascal Cuoq @ 2020-04-17 15:56 UTC
  To: musl

Hello,

both functions `__shlim` and `__shgetc` subtract the members
named `buf` and `rpos` of the struct they manipulate.

In `__shlim`, this happens in the statement `f->shcnt = f->buf - f->rpos;`.
And in `__shgetc`, it happens inside the `shcnt` macro:

#define shcnt(f) ((f)->shcnt + ((f)->rpos - (f)->buf))

In our tests, while running `testsuite` in `libc-testsuite`,
both the `__shlim` and `__shgetc` functions are reached
with `f->buf` non-null and `f->rpos` a null pointer.

This can be made visible on execution platforms other than ours
by adding one of the following statements at the beginning of each function
(the first in `__shlim`, the second in `__shgetc`):

+      if (f->buf && !f->rpos) dprintf (2, "XXX Problem in __shlim\n");
+      if (f->buf && !f->rpos) dprintf (2, "XXX Problem in __shgetc\n");

If running `libc-testsuite` then prints the following, `f->buf` was
non-null and `f->rpos` was null when these points were reached:

$ ./testsuite
fdopen test passed
fcntl test passed
fnmatch test passed
XXX Problem in __shlim
XXX Problem in __shgetc
XXX Problem in __shlim
XXX Problem in __shgetc
XXX Problem in __shlim
XXX Problem in __shgetc
XXX Problem in __shlim
XXX Problem in __shgetc
XXX Problem in __shlim
XXX Problem in __shgetc
XXX Problem in __shlim
XXX Problem in __shgetc
fscanf test passed
(...)

This has been tested on tag v1.2.0 of git://git.musl-libc.org/musl

These pointer subtractions are undefined behavior. This is slightly worse
than computing `(char*)0-(char*)0`, which is undefined in C and defined in C++,
because compilers for both C and C++ are unlikely to exploit that case
for optimization. Subtracting a null pointer from a non-null pointer (or
the reverse), on the other hand, is undefined behavior in both languages,
and it is plausible that doing it may someday have unexpected consequences.
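
As an illustration only, and not a proposed patch: a minimal sketch of a
difference that is evaluated only when both pointers are non-null. The
names `struct stream` and `consumed` are hypothetical and stand in for the
real struct and macro; returning 0 when the stream is not in read mode is
an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields; not musl's FILE. */
struct stream {
	unsigned char *buf;   /* start of the buffer, or null */
	unsigned char *rpos;  /* current read position, or null */
};

/* Bytes consumed from the buffer so far. The subtraction is evaluated
 * only when both pointers are non-null; the caller must still ensure
 * that they point into (or one past the end of) the same array. */
static ptrdiff_t consumed(const struct stream *s)
{
	if (!s->buf || !s->rpos)
		return 0;  /* assumed: not in read mode means nothing consumed */
	assert(s->rpos >= s->buf);
	return s->rpos - s->buf;
}
```

With `buf` set and `rpos` null, `consumed` returns 0 instead of evaluating
the subtraction.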

I mention this because similar undefined behaviors that were extremely
unlikely to cause harm have been fixed in musl in recent months,
so this looks like something you may want to fix too.

Pascal

* Re: [musl] Invalid pointer subtractions in __shlim and __shgetc
From: Rich Felker @ 2020-04-17 16:13 UTC
  To: Pascal Cuoq; +Cc: musl

On Fri, Apr 17, 2020 at 03:56:06PM +0000, Pascal Cuoq wrote:
> These pointer subtractions are undefined behavior. This is slightly worse
> than computing `(char*)0-(char*)0`, which is undefined in C and defined in C++,
> because compilers for both C and C++ are unlikely to exploit this one
> for optimization. Subtracting between a non-null pointer and a null pointer
> on the other hand is undefined behavior in both languages, and it is
> plausible that doing it may someday have unexpected consequences.
> 
> I mention this because similar undefined behaviors that were extremely
> unlikely to cause harm have been fixed in musl in recent months,
> so that this looks like something you may want to fix too.

Absolutely. Do you have an analysis of how this is reached? Neither of
these should be called when the FILE is not in a suitable state for
reading. It might just be that vfscanf needs to call __toread on the
FILE before starting and error out if it fails.

Rich

* Re: [musl] Invalid pointer subtractions in __shlim and __shgetc
From: Rich Felker @ 2020-04-17 16:48 UTC
  To: Pascal Cuoq; +Cc: musl

On Fri, Apr 17, 2020 at 12:13:51PM -0400, Rich Felker wrote:
> Absolutely. Do you have an analysis of how this is reached? Neither of
> these should be called when the FILE is not in suitable state for
> reading. It might just be that vfscanf needs to call __toread on the
> FILE before starting and error out if it fails.

Indeed I think the attached fixes it.

Rich

[-- Attachment #2: shgetc_rpos_usage_fix.diff --]
[-- Type: text/plain, Size: 355 bytes --]

diff --git a/src/stdio/vfscanf.c b/src/stdio/vfscanf.c
index 9e030fc4..d990db9f 100644
--- a/src/stdio/vfscanf.c
+++ b/src/stdio/vfscanf.c
@@ -76,6 +76,8 @@ int vfscanf(FILE *restrict f, const char *restrict fmt, va_list ap)
 
 	FLOCK(f);
 
+	if (!f->rpos && __toread(f)) goto input_fail;
+
 	for (p=(const unsigned char *)fmt; *p; p++) {
 
 		alloc = 0;

* Re: [musl] Invalid pointer subtractions in __shlim and __shgetc
From: Rich Felker @ 2020-04-17 17:56 UTC
  To: musl

On Fri, Apr 17, 2020 at 12:48:07PM -0400, Rich Felker wrote:
> Indeed I think the attached fixes it.
> 
> Rich

> diff --git a/src/stdio/vfscanf.c b/src/stdio/vfscanf.c
> index 9e030fc4..d990db9f 100644
> --- a/src/stdio/vfscanf.c
> +++ b/src/stdio/vfscanf.c
> @@ -76,6 +76,8 @@ int vfscanf(FILE *restrict f, const char *restrict fmt, va_list ap)
>  
>  	FLOCK(f);
>  
> +	if (!f->rpos && __toread(f)) goto input_fail;
> +
>  	for (p=(const unsigned char *)fmt; *p; p++) {
>  
>  		alloc = 0;

I think this patch may result in wrong error behavior on a trivial
scanf that doesn't try to read anything. Instead it should be:

	if (!f->rpos) __toread(f);
	if (!f->rpos) goto input_fail;

so that the error path is taken only on failure to enter read mode,
not on EOF.
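
To make the difference concrete, here is a toy model. All names
(`struct stream`, `toy_toread`, `fails_v1`, `fails_v2`) are hypothetical,
and it assumes, as a simplification, that entering read mode reports
nonzero both on failure and at EOF, while leaving rpos null only on
failure:

```c
#include <stddef.h>

/* Toy model, not musl's FILE: only what the two checks look at. */
struct stream {
	unsigned char *rpos;  /* null until read mode is entered */
	unsigned char *buf;
	int readable;         /* 0 models a stream that cannot be read */
	int at_eof;           /* nonzero models a stream already at EOF */
};

/* Assumed contract: nonzero on failure and also at EOF, but rpos is
 * set whenever read mode was actually entered. */
static int toy_toread(struct stream *s)
{
	if (!s->readable) return -1;
	s->rpos = s->buf;
	return s->at_eof ? -1 : 0;
}

/* First version: any nonzero return is treated as failure. */
static int fails_v1(struct stream *s)
{
	return !s->rpos && toy_toread(s);
}

/* Revised version: fail only if rpos is still null afterwards. */
static int fails_v2(struct stream *s)
{
	if (!s->rpos) toy_toread(s);
	return !s->rpos;
}
```

In this model, a readable stream that is already at EOF takes the error
path with `fails_v1` but not with `fails_v2`; an unreadable stream fails
with both.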

If this works in my tests I'll commit it.

Rich

* Re: [musl] Invalid pointer subtractions in __shlim and __shgetc
From: Pascal Cuoq @ 2020-04-23 11:34 UTC
  To: musl

Hello again,

Rich Felker <dalias@libc.org> wrote:
> I think this patch may result in wrong error behavior on a trivial
> scanf that doesn't try to read anything. Instead it should be:
>
>        if (!f->rpos) __toread(f);
>        if (!f->rpos) goto input_fail;
>
> so that the error path is taken only on failure to enter read mode,
> not on EOF.

This has indeed fixed the invalid subtractions that were observed
from the tests I mentioned earlier, but a different test still has
the same problem.

As of commit 33338eb, the function wcstox does:
        f.rpos = f.rend = 0;
        f.buf = buf + 4;

(https://git.musl-libc.org/cgit/musl/tree/src/stdlib/wcstol.c?id=33338ebc853d37c80f0f236cc7a92cb0acc6aace#n38 )

It then passes the address of this f to shlim (line 45), causing the
same invalid pointer subtraction f->buf - f->rpos that has already
been discussed in this thread.

Best regards,

Pascal

* Re: [musl] Invalid pointer subtractions in __shlim and __shgetc
From: Rich Felker @ 2020-04-23 16:14 UTC
  To: musl

On Thu, Apr 23, 2020 at 11:34:26AM +0000, Pascal Cuoq wrote:
> Hello again,
> 
> Rich Felker <dalias@libc.org> wrote:
> > I think this patch may result in wrong error behavior on a trivial
> > scanf that doesn't try to read anything. Instead it should be:
> >
> >        if (!f->rpos) __toread(f);
> >        if (!f->rpos) goto input_fail;
> >
> > so that the error path is taken only on failure to enter read mode,
> > not on EOF.
> 
> This has indeed fixed the invalid subtractions that were observed
> from the tests I mentioned earlier, but a different test still has
> the same problem.
> 
> As of commit 33338eb, the function wcstox does:
>         f.rpos = f.rend = 0;
>         f.buf = buf + 4;
> 
> (https://git.musl-libc.org/cgit/musl/tree/src/stdlib/wcstol.c?id=33338ebc853d37c80f0f236cc7a92cb0acc6aace#n38 )
> 
> It then passes the address of this f to shlim (line 45), causing the
> same invalid pointer subtraction f->buf - f->rpos that has already
> been discussed in this thread.

Thanks. The attached should fix it, I think.

Rich

[-- Attachment #2: wcstox.diff --]
[-- Type: text/plain, Size: 848 bytes --]

diff --git a/src/stdlib/wcstod.c b/src/stdlib/wcstod.c
index 26fe9af8..0be8c167 100644
--- a/src/stdlib/wcstod.c
+++ b/src/stdlib/wcstod.c
@@ -33,8 +33,7 @@ static long double wcstox(const wchar_t *s, wchar_t **p, int prec)
 	unsigned char buf[64];
 	FILE f = {0};
 	f.flags = 0;
-	f.rpos = f.rend = 0;
-	f.buf = buf + 4;
+	f.rpos = f.rend = buf + 4;
 	f.buf_size = sizeof buf - 4;
 	f.lock = -1;
 	f.read = do_read;
diff --git a/src/stdlib/wcstol.c b/src/stdlib/wcstol.c
index 4443f577..39a51269 100644
--- a/src/stdlib/wcstol.c
+++ b/src/stdlib/wcstol.c
@@ -35,8 +35,7 @@ static unsigned long long wcstox(const wchar_t *s, wchar_t **p, int base, unsign
 	unsigned char buf[64];
 	FILE f = {0};
 	f.flags = 0;
-	f.rpos = f.rend = 0;
-	f.buf = buf + 4;
+	f.rpos = f.rend = buf + 4;
 	f.buf_size = sizeof buf - 4;
 	f.lock = -1;
 	f.read = do_read;

* Re: [musl] Invalid pointer subtractions in __shlim and __shgetc
From: Pascal Cuoq @ 2020-04-24  9:40 UTC
  To: musl

Hello,

Rich Felker <dalias@libc.org> wrote:
> The attached should fix it, I think.

The patch sets f.rpos and f.rend to buf+4, but it also leaves
f.buf containing 0 from “FILE f = {0};”:
--- a/src/stdlib/wcstol.c
+++ b/src/stdlib/wcstol.c
@@ -35,8 +35,7 @@ static unsigned long long wcstox(const wchar_t *s, wchar_t **p, int base, unsign
 	unsigned char buf[64];
 	FILE f = {0};
 	f.flags = 0;
-	f.rpos = f.rend = 0;
-	f.buf = buf + 4;
+	f.rpos = f.rend = buf + 4;
 	f.buf_size = sizeof buf - 4;
 	f.lock = -1;
 	f.read = do_read;

Unfortunately, the function __shlim also subtracts f.rpos from f.buf, at this line:

  f->shcnt = f->buf - f->rpos;

(https://git.musl-libc.org/cgit/musl/tree/src/internal/shgetc.c?id=33338ebc853d37c80f0f236cc7a92cb0acc6aace#n11 )

So that is now where the invalid subtraction happens.

For what it's worth, we have tested a patch that initializes all
three of f.rpos, f.rend and f.buf to buf+4, and that does not cause
UB in this test. But we can't tell if it provides the correct
functional behavior for this test and for other inputs.
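
As a sketch of why the three-pointer initialization avoids the problem,
here is a toy model with hypothetical names (`struct stream`, `toy_shlim`,
`init_empty`). It only shows that the subtraction becomes well-defined; it
says nothing about the functional behavior of the real code.

```c
#include <stddef.h>

/* Toy model with hypothetical names; only the pointer fields matter. */
struct stream {
	unsigned char *buf, *rpos, *rend;
	ptrdiff_t shcnt;
};

/* Same subtraction as the statement quoted above. */
static void toy_shlim(struct stream *s)
{
	s->shcnt = s->buf - s->rpos;
}

/* Point all three fields at the same element of the caller's array,
 * mirroring the buf + 4 offset; storage must hold at least 4 bytes. */
static void init_empty(struct stream *s, unsigned char *storage)
{
	s->buf = s->rpos = s->rend = storage + 4;
	s->shcnt = 0;
}
```

After `init_empty`, both `buf - rpos` and `rend - rpos` subtract pointers
to the same element of one array, so each difference is 0 and no null
pointer is involved.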

Pascal

* Re: [musl] Invalid pointer subtractions in __shlim and __shgetc
From: Rich Felker @ 2020-04-24 14:12 UTC
  To: musl

On Fri, Apr 24, 2020 at 09:40:15AM +0000, Pascal Cuoq wrote:
> Hello,
> 
> Rich Felker <dalias@libc.org> wrote:
> > The attached should fix it, I think.
> 
> The patch sets f.rpos and f.rend to buf+4, but it also leaves
> f.buf containing 0 from “FILE f = {0};”:
> --- a/src/stdlib/wcstol.c
> +++ b/src/stdlib/wcstol.c
> @@ -35,8 +35,7 @@ static unsigned long long wcstox(const wchar_t *s, wchar_t **p, int base, unsign
>  	unsigned char buf[64];
>  	FILE f = {0};
>  	f.flags = 0;
> -	f.rpos = f.rend = 0;
> -	f.buf = buf + 4;
> +	f.rpos = f.rend = buf + 4;
>  	f.buf_size = sizeof buf - 4;
>  	f.lock = -1;
>  	f.read = do_read;
> 
> Unfortunately, the function __shlim also subtracts f.rpos from f.buf, at this line:
> 
>   f->shcnt = f->buf - f->rpos;

Ugh, this was purely a mechanical error in the edit (selecting too
much text to delete) and I should have tested before sending. Should
be:

-	f.rpos = f.rend = 0;
-	f.buf = buf + 4;
+	f.rpos = f.rend = f.buf = buf + 4;

> (https://git.musl-libc.org/cgit/musl/tree/src/internal/shgetc.c?id=33338ebc853d37c80f0f236cc7a92cb0acc6aace#n11 )
> 
> So that is now where the invalid subtraction happens.
> 
> For what it's worth, we have tested the patch consisting in
> initializing all three of f.rpos, f.rend and f.buf to buf+4, and that
> does not cause UB in this test. But we can't tell if it provides the
> correct functional behavior for this test and for other inputs.

Yep, that's what I intended. Sorry for wasting your time with a bad
patch.

Rich
