From mboxrd@z Thu Jan 1 00:00:00 1970
From: CM Graff
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: printf family handling of INT_MAX +1 tested on aarch64
Date: Wed, 7 Nov 2018 14:54:02 -0600
References: <20181107203121.GT5150@brightrain.aerifal.cx>
In-Reply-To: <20181107203121.GT5150@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Xref: news.gmane.org gmane.linux.lib.musl.general:13402

Rich,

It just produces a segfault on Debian aarch64 in my test case, whereas
INT_MAX + 2 does not, so I thought it worth reporting.

graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$ ./usr/bin/musl-gcc ../printf_overflow.c
graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$ ./usr/bin/musl-gcc -static ../printf_overflow.c
graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$ ./a.out > logfile
Segmentation fault
graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$ uname -a
Linux hlib-debian-arm 4.9.0-8-arm64 #1 SMP Debian 4.9.110-3+deb9u6 (2018-10-08) aarch64 GNU/Linux
graff@hlib-debian-arm:~/hlibc-test/tests-emperical/musl$

I can provide access to the 96-core, 124 GB RAM aarch64 Debian test box
if it would help reproduce the segfault. Just email me a public key if
you want access.

Graff

On 11/7/18, Rich Felker wrote:
> On Wed, Nov 07, 2018 at 01:33:13PM -0600, CM Graff wrote:
>> Hello everyone,
>>
>> The C standard states that:
>> "The number of characters or wide characters transmitted by a formatted
>> output function (or written to an array, or that would have been written
>> to an array) is greater than INT_MAX" is undefined behavior.
>>
>> POSIX states that:
>>
>> "In addition, all forms of fprintf() shall fail if:
>>
>> [...]
>> [EOVERFLOW]
>> [CX] [Option Start] The value to be returned is greater than {INT_MAX}.
>> [Option End]
>> "
>>
>> Though arguments of over INT_MAX are undefined behavior it seems like
>> some provisions have been made in musl to handle it, and the method for
>> handling such appear similar in effect to that of glibc and freebsd's
>> libc. INT_MAX + 2 appears to represent this case, however INT_MAX + 1
>> produces a segfault on my aarch64 test box running debian version 9.5.
>                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> At first this sounded like you were using glibc, but based on the
> below test program it seems you're using the musl-gcc wrapper, then
> running musl binaries on the same box. Ok, this should work.
>
>> I do not have a suggested fix other than to either carefully inspect the
>> EOVERFLOW semantics or to mitigate the need for more complex mathematics
>> by using a size_t as the primary counter for the stdio family instead of
>> an int.
>
> The counter is not incremented without seeing that the increment would
> not cause overflow. See vfprintf.c lines 447-450:
>
>     /* This error is only specified for snprintf, but since it's
>      * unspecified for other forms, do the same. Stop immediately
>      * on overflow; otherwise %n could produce wrong results.
>      */
>     if (l > INT_MAX - cnt) goto overflow;
>
>> This segfault was discovered when testing my own small libc
>> (https://github.com/hlibc/hlibc) against the various robust production
>> grade libc to understand more about how to properly handle EOVERFLOW and
>> in general the cases of INT_MAX related undefined behavior for the
>> formatted stdio functions as per specified in the C standard and POSIX.
>>
>> I am not sure that handling this is an important case for musl, however I
>> thought it best to report the scenario as best I could describe it.
>
> I don't understand exactly what you're claiming is wrong. If it's a
> segfault, where does it occur?
>
>> Here is a script and a small C program to verify this segfault on
>> aarch64, I apologize for not testing on other architectures but my time
>> is limited lately as I'm working toward my degree in mathematics.
>>
>> #!/bin/sh
>> git clone git://git.musl-libc.org/musl
>> cd musl
>> ./configure --prefix=$(pwd)/usr
>> make -j4 > log 2>&1
>> make install >> log 2>&1
>> ./usr/bin/musl-gcc -static ../printf_overflow.c
>> ./a.out > log2
>>
>>
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <limits.h>
>> #include <errno.h>
>> int main(void)
>> {
>>         size_t i = INT_MAX;
>>         ++i;
>>         char *s = malloc(i);
>>         if (!(s))
>>         {
>>                 fprintf(stderr, "unable to allocate enough memory\n");
>>                 return 1;
>>         }
>>         memset(s, 'A', i - 1);
>>         s[i] = 0;
>>         /* make sure printf is not changed to puts() by the compiler */
>>         int len = printf("%s", s, 1);
>>
>>         if (errno == EOVERFLOW)
>>                 fprintf(stderr, "printf set EOVERFLOW\n");
>>         else
>>                 fprintf(stderr, "printf did not set EOVERFLOW\n");
>>
>>         fprintf(stderr, "printf returned %d\n", len);
>>         return 0;
>> }
>
> There is nothing in this test program that overflows. printf produces
> precisely INT_MAX bytes of output, which is representable, and
> therefore it succeeds and returns INT_MAX.
> I tested this on x86_64
> (it's not possible as written on 32-bit archs since INT_MAX+1 is not
> allocatable, although you could do similar tests like using %s%s to
> print the same INT_MAX/2-size string twice) and it worked as expected.
>
> Rich
>
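
For reference, the guard quoted from vfprintf.c can be looked at in
isolation. What follows is only a minimal sketch of that style of check,
not musl's code: the function name add_len, its -1 return convention, and
the int parameter types are assumptions made for illustration. The point
is that the comparison is made against INT_MAX - cnt, so cnt + l is never
computed unless it is known to fit in an int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not part of musl): add a chunk of l bytes to the
 * running count cnt. Returns the new count, or -1 if the total would
 * exceed INT_MAX. The sum is only formed after the check has passed. */
int add_len(int cnt, int l)
{
    assert(cnt >= 0 && l >= 0);  /* both values are byte counts */
    if (l > INT_MAX - cnt)       /* same shape as the quoted guard */
        return -1;               /* a real caller would set EOVERFLOW */
    return cnt + l;              /* cannot overflow past this point */
}
```

Under these assumptions, add_len(0, INT_MAX) returns INT_MAX, which
matches the observation that exactly INT_MAX bytes of output is still
representable, while add_len(INT_MAX, 1) returns -1.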