mailing list of musl libc
* printf family handling of INT_MAX +1 tested on aarch64
@ 2018-11-07 19:33 CM Graff
  2018-11-07 20:31 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: CM Graff @ 2018-11-07 19:33 UTC (permalink / raw)
  To: musl

Hello everyone,

The C standard states that:
"The number of characters or wide characters transmitted by a formatted output
function (or written to an array, or that would have been written to an array)
is greater
than INT_MAX" is undefined behavior.

POSIX states that:

"In addition, all forms of fprintf() shall fail if:

[...]
[EOVERFLOW]
    [CX] [Option Start] The value to be returned is greater than {INT_MAX}.
[Option End]
"

Though output of more than INT_MAX characters is undefined behavior, it
seems some provisions have been made in musl to handle it, and the method
of handling appears similar in effect to that of glibc and FreeBSD's libc.
A length of INT_MAX + 2 appears to be handled, but INT_MAX + 1 produces a
segfault on my aarch64 test box running Debian 9.5.

I do not have a suggested fix, other than either to carefully inspect the
EOVERFLOW semantics or to avoid the need for overflow-prone arithmetic by
using a size_t as the primary counter for the stdio family instead of an int.

This segfault was discovered while testing my own small libc
(https://github.com/hlibc/hlibc) against various robust production-grade
libcs, to understand more about how to properly handle EOVERFLOW and, in
general, the cases of INT_MAX-related undefined behavior for the formatted
stdio functions as specified in the C standard and POSIX.

I am not sure that handling this is an important case for musl, but I
thought it best to report the scenario as well as I could describe it.

Here is a script and a small C program to verify this segfault on aarch64.
I apologize for not testing on other architectures, but my time is limited
lately as I'm working toward my degree in mathematics.

#!/bin/sh
git clone git://git.musl-libc.org/musl
cd musl
./configure --prefix=$(pwd)/usr
make -j4 > log 2>&1
make install >> log 2>&1
./usr/bin/musl-gcc -static ../printf_overflow.c
./a.out > log2



#include <stdio.h>
#include <limits.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
int main(void)
{
        size_t i = INT_MAX;
        ++i;
        char *s = malloc(i);
        if (!(s))
        {
                fprintf(stderr, "unable to allocate enough memory\n");
                return 1;
        }
        memset(s, 'A', i - 1);
        s[i] = 0;
        /* make sure printf is not changed to puts() by the compiler */
        int len = printf("%s", s, 1);

        if (errno == EOVERFLOW)
                fprintf(stderr, "printf set EOVERFLOW\n");
        else
                fprintf(stderr, "printf did not set EOVERFLOW\n");

        fprintf(stderr, "printf returned %d\n", len);
        return 0;
}

Thank you for your time,

Graff



* Re: printf family handling of INT_MAX +1 tested on aarch64
  2018-11-07 19:33 printf family handling of INT_MAX +1 tested on aarch64 CM Graff
@ 2018-11-07 20:31 ` Rich Felker
  2018-11-07 20:54   ` CM Graff
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2018-11-07 20:31 UTC (permalink / raw)
  To: musl

On Wed, Nov 07, 2018 at 01:33:13PM -0600, CM Graff wrote:
> Hello everyone,
> 
> The C standard states that:
> "The number of characters or wide characters transmitted by a formatted output
> function (or written to an array, or that would have been written to an array)
> is greater
> than INT_MAX" is undefined behavior.
> 
> POSIX states that:
> 
> "In addition, all forms of fprintf() shall fail if:
> 
> [...]
> [EOVERFLOW]
>     [CX] [Option Start] The value to be returned is greater than {INT_MAX}.
> [Option End]
> "
> 
> Though arguments of over INT_MAX are undefined behavior it seems like some
> provisions have been made in musl to handle it, and the method for handling
> such appear similar in effect to that of glibc and freebsd's libc. INT_MAX + 2
> appears to represent this case, however INT_MAX + 1 produces a segfault on my
> aarch64 test box running debian version 9.5.
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^

At first this sounded like you were using glibc, but based on the
below test program it seems you're using the musl-gcc wrapper, then
running musl binaries on the same box. Ok, this should work.

> I do not have a suggested fix other than to either carefully inspect the
> EOVERFLOW semantics or to mitigate the need for more complex mathematics by
> using a size_t as the primary counter for the stdio family instead of an int.

The counter is never incremented without first checking that the increment
cannot overflow. See vfprintf.c lines 447-450:

		/* This error is only specified for snprintf, but since it's
		 * unspecified for other forms, do the same. Stop immediately
		 * on overflow; otherwise %n could produce wrong results. */
		if (l > INT_MAX - cnt) goto overflow;

> This segfault was discovered when testing my own small libc
> (https://github.com/hlibc/hlibc) against the various robust production grade
> libc to understand more about how to properly handle EOVERFLOW and in general
> the cases of INT_MAX related undefined behavior for the formatted stdio
> functions as per specified in the C standard and POSIX.
> 
> I am not sure that handling this is an important case for musl, however I
> thought it best to report the scenario as best I could describe it.

I don't understand exactly what you're claiming is wrong. If it's a
segfault, where does it occur?

> Here is a script and a small C program to verify this segfault on aarch64,
> I apologize for not testing on other architectures but my time is limited
> lately as I'm working toward my degree in mathematics.
> 
> #!/bin/sh
> git clone git://git.musl-libc.org/musl
> cd musl
> ./configure --prefix=$(pwd)/usr
> make -j4 > log 2>&1
> make install >> log 2>&1
> ./usr/bin/musl-gcc -static ../printf_overflow.c
> ./a.out > log2
> 
> 
> 
> #include <stdio.h>
> #include <limits.h>
> #include <errno.h>
> #include <string.h>
> #include <stdlib.h>
> int main(void)
> {
>         size_t i = INT_MAX;
>         ++i;
>         char *s = malloc(i);
>         if (!(s))
>         {
>                 fprintf(stderr, "unable to allocate enough memory\n");
>                 return 1;
>         }
>         memset(s, 'A', i - 1);
>         s[i] = 0;
>         /* make sure printf is not changed to puts() by the compiler */
>         int len = printf("%s", s, 1);
> 
>         if (errno == EOVERFLOW)
>                 fprintf(stderr, "printf set EOVERFLOW\n");
>         else
>                 fprintf(stderr, "printf did not set EOVERFLOW\n");
> 
>         fprintf(stderr, "printf returned %d\n", len);
>         return 0;
> }

There is nothing in this test program that overflows. printf produces
precisely INT_MAX bytes of output, which is representable, and
therefore it succeeds and returns INT_MAX. I tested this on x86_64
(it's not possible as written on 32-bit archs since INT_MAX+1 is not
allocatable, although you could do similar tests like using %s%s to
print the same INT_MAX/2-size string twice) and it worked as expected.

Rich



* Re: printf family handling of INT_MAX +1 tested on aarch64
  2018-11-07 20:31 ` Rich Felker
@ 2018-11-07 20:54   ` CM Graff
  2018-11-08  2:04     ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: CM Graff @ 2018-11-07 20:54 UTC (permalink / raw)
  To: musl

Rich,
It just produces a segfault on Debian aarch64 in my test case, whereas
INT_MAX + 2 does not. So I thought it worth reporting.

graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$
./usr/bin/musl-gcc ../printf_overflow.c
graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$
./usr/bin/musl-gcc -static ../printf_overflow.c
graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$ ./a.out > logfile
Segmentation fault
graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$ uname -a
Linux hlib-debian-arm 4.9.0-8-arm64 #1 SMP Debian 4.9.110-3+deb9u6
(2018-10-08) aarch64 GNU/Linux
graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$

I can supply access to the 96-core, 124 GB RAM aarch64 Debian test box
if it would help reproduce the segfault. Just email me a public key if
you want access.

Graff




* Re: printf family handling of INT_MAX +1 tested on aarch64
  2018-11-07 20:54   ` CM Graff
@ 2018-11-08  2:04     ` Rich Felker
  2018-11-08  2:47       ` CM Graff
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2018-11-08  2:04 UTC (permalink / raw)
  To: musl

On Wed, Nov 07, 2018 at 02:54:02PM -0600, CM Graff wrote:
> RIch,
> It just produces a segfault on debian aarch64 in my test case. Whereas
> INTMAX + 2 does not. So I thought it worth reporting.
> 
> graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$
> ./usr/bin/musl-gcc ../printf_overflow.c
> graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$
> ./usr/bin/musl-gcc -static ../printf_overflow.c
> graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$ ./a.out > logfile
> Segmentation fault
> graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$ uname -a
> Linux hlib-debian-arm 4.9.0-8-arm64 #1 SMP Debian 4.9.110-3+deb9u6
> (2018-10-08) aarch64 GNU/Linux
> graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$
> 
> I can supply access to the 96 core 124 GB RAM aarch64 debian test box
> if it would help reproduce the segfault. Just email me a public key if
> you want access.

The failure has nothing to do with printf. You're calling malloc(i)
then writing to s[i], which is one past the end of the allocated
buffer. I failed to notice this because you're only writing i-1 A's to
the buffer, and there already happens to be a nul byte at s[i-1] to
terminate them.

Actually the crash has nothing to do with aarch64 vs x86_64 but rather
static vs dynamic linking. With dynamic linking, full malloc is used
and there happens to be padding space at the end of the allocation
because there was a header at the beginning and it has to be rounded
up to whole pages. But with static linking, simple_malloc (a bump
allocator) was used, and there are exactly i bytes in the allocation.

Fix the s[i]=0 to be s[i-1]=0 instead and the test works as expected.
And please, when reporting crashes like this, at least try to identify
where the crash is occurring (e.g. with gdb or even just some trivial
printf debugging).

Rich



* Re: printf family handling of INT_MAX +1 tested on aarch64
  2018-11-08  2:04     ` Rich Felker
@ 2018-11-08  2:47       ` CM Graff
  0 siblings, 0 replies; 5+ messages in thread
From: CM Graff @ 2018-11-08  2:47 UTC (permalink / raw)
  To: musl

Rich,
Ah, you are right. Sorry about that; my test is off by one.
Graff



