* [musl] [PATCH] [RFC] trap on invalid printf formats
From: Rich Felker @ 2023-06-21 21:37 UTC
To: musl

Inspired by a new instance of some bitrotted software using %Lu
instead of %llu, attached is a draft patch to catch such errors rather
than silently leaving missing output.

I don't know if this is a good idea to actually do (note: probably
matching changes should be made in wide printf and maybe also scanf if
so) but I'm posting it here in case anyone wants to experiment or
discuss. Note that there is no conformance distinction since invalid
format strings are undefined behavior.

Rich

[-- Attachment: printf-trap.diff --]

diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
index 33019ff1..c3bd4d31 100644
--- a/src/stdio/vfprintf.c
+++ b/src/stdio/vfprintf.c
@@ -10,6 +10,7 @@
 #include <inttypes.h>
 #include <math.h>
 #include <float.h>
+#include "atomic.h"
 
 /* Some useful macros */
 
@@ -652,8 +653,7 @@ static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg,
 	return 1;
 
 inval:
-	errno = EINVAL;
-	return -1;
+	a_crash();
 overflow:
 	errno = EOVERFLOW;
 	return -1;
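For anyone who wants to experiment without patching libc, the same "trap instead of returning an error" idea can be approximated in application code. The following is only a sketch: `format_or_abort` is a hypothetical helper, and it uses `abort()` where the patch uses musl's internal `a_crash()`. It can only catch invalid formats on a libc that reports them as a negative return value, which is what unpatched musl does via `errno = EINVAL; return -1`. Other libcs may accept or ignore the same format string.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of musl: format into buf and abort the
 * process if the underlying vsnprintf reports an error (for example an
 * invalid format string on a libc that returns -1 for those). */
static int format_or_abort(char *buf, size_t size, const char *fmt, ...)
{
    assert(fmt != NULL);

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    /* A negative result means the conversion failed; crash loudly
     * instead of letting the caller continue with missing output. */
    if (n < 0)
        abort();
    return n;
}
```

A call such as `format_or_abort(buf, sizeof buf, "%llu", value)` behaves like `snprintf` on success. The trade-off is the same one discussed in the thread: a crash is impossible to ignore, but it also takes the whole program down.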
* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
From: enh @ 2023-06-21 22:10 UTC
To: musl

i really should do this in bionic... currently we only abort on %n
(which is explicitly _not_ supported) or %w with a silly size. for
random junk, we currently do what the BSDs do (since that's where this
code originally came from):

	default:	/* "%?" prints ?, unless ? is NUL */
		if (ch == '\0')
			goto done;
		/* pretend it was %c with argument ch */
		cp = buf;
		*cp = ch;
		size = 1;
		sign = '\0';
		break;

from running a quick test program, macOS and glibc seem to work
similarly, which increases the chances that there's incorrect code out
there. (clang does at least warn "warning: invalid conversion
specifier '?' [-Wformat-invalid-specifier]" by default, though. gcc
doesn't.)

On Wed, Jun 21, 2023 at 2:38 PM Rich Felker <dalias@libc.org> wrote:
> Inspired by a new instance of some bitrotted software using %Lu
> instead of %llu, attached is a draft patch to catch such errors rather
> than silently leaving missing output.
>
> I don't know if this is a good idea to actually do (note: probably
> matching changes should be made in wide printf and maybe also scanf if
> so) but I'm posting it here in case anyone wants to experiment or
> discuss. Note that there is no conformance distinction since invalid
> format strings are undefined behavior.
>
> Rich
>
* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
From: Markus Wichmann @ 2023-06-22 14:35 UTC
To: musl

On Wed, Jun 21, 2023 at 05:37:49PM -0400, Rich Felker wrote:
> Inspired by a new instance of some bitrotted software using %Lu
> instead of %llu, attached is a draft patch to catch such errors rather
> than silently leaving missing output.
>

Ah, it's the old dichotomy: For developers, this is nice, since it
catches bad behavior in a way that is impossible to ignore. For users,
this is horrible, since it crashes bad applications, maybe just before
they could save their data. And making the behavior selectable is
probably not sensible.

As I count myself among the former group, I do support this move. But
then, I do not run a distro, and the only computer I am responsible for
is my own.

Ciao,
Markus
* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
From: Rich Felker @ 2023-06-22 14:45 UTC
To: Markus Wichmann; +Cc: musl

On Thu, Jun 22, 2023 at 04:35:54PM +0200, Markus Wichmann wrote:
> On Wed, Jun 21, 2023 at 05:37:49PM -0400, Rich Felker wrote:
> > Inspired by a new instance of some bitrotted software using %Lu
> > instead of %llu, attached is a draft patch to catch such errors rather
> > than silently leaving missing output.
> >
>
> Ah, it's the old dichotomy: For developers, this is nice, since it
> catches bad behavior in a way that is impossible to ignore. For users,
> this is horrible, since it crashes bad applications, maybe just before
> they could save their data. And making the behavior selectable is
> probably not sensible.
>
> As I count myself among the former group, I do support this move. But
> then, I do not run a distro, and the only computer I am responsible for
> is my own.

Even for users, I'm not so sure of that. The distinction is
potentially between crashing while writing out a new file containing
corrupted data before rename()ing it over the old file, and silently
succeeding then rename()ing the corrupted file over the old file.

(Note that this is only silent if you don't check for errors, but if
anyone using invalid formats were checking for errors, it seems these
errors would necessarily have been reported. Unless they just check
ferror() at the end, and we're not setting the FILE error flag on
format string errors...which we probably should do if we don't switch
to trapping.)

FWIW I don't think there are a lot of these cases left in the wild at
all, but I'm not sure. It might be nice to do some distro-wide testing
with this patch applied (which is what I had in mind posting it) and
see if any problems are caught before really considering whether to
pursue upstreaming it.

Rich
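The write-then-rename scenario above can be sketched as follows. This is an illustration only, not code from any of the affected programs: `save_counter` and both path parameters are hypothetical. The point it tries to show is that the return value of `fprintf` itself has to be checked, since (as noted above) a format string error currently does not set the FILE error flag, so a lone `ferror()` at the end would miss it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write one record to tmp_path and replace
 * final_path with it only if every stdio call succeeded. */
static int save_counter(const char *tmp_path, const char *final_path,
                        unsigned long long counter)
{
    assert(tmp_path != NULL && final_path != NULL);

    FILE *f = fopen(tmp_path, "w");
    if (!f)
        return -1;

    /* fprintf returns a negative value on failure; on unpatched musl
     * an invalid format string is reported this way. */
    int ok = fprintf(f, "counter=%llu\n", counter) >= 0;

    /* ferror catches earlier write failures; fclose flushes buffered
     * data and can fail too, so its result is checked as well. */
    if (ferror(f))
        ok = 0;
    if (fclose(f) != 0)
        ok = 0;

    if (!ok) {
        remove(tmp_path);   /* do not leave a corrupted temp file */
        return -1;
    }

    /* Only now replace the old file with the fully written new one. */
    return rename(tmp_path, final_path);
}
```

With checks like these, an invalid format leaves the old file untouched whether libc traps or returns an error. Without them, only the trapping behavior prevents the corrupted file from being renamed into place.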
* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
From: Alex Xu (Hello71) @ 2023-06-22 23:37 UTC
To: musl; +Cc: Markus Wichmann, Rich Felker

Excerpts from Rich Felker's message of June 22, 2023 10:45 am:
> FWIW I don't think there are a lot of these cases left in the wild at
> all, but I'm not sure. It might be nice to do some distro-wide testing
> with this patch applied (which is what I had in mind posting it) and
> see if any problems are caught before really considering whether to
> pursue upstreaming it.

Unfortunately, it seems fairly widespread:
https://codesearch.debian.net/search?q=printf.*%5B%25%5DL%5Bud%5D+-package%3Agcc+-package%3Allvm*+path%3A.*%5C.c%24&literal=0

The most painful example:

#if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__APPLE_CC__) || defined(__APPLE__) || defined(ARGUS_SOLARIS)
            sprintf (pbuf, "%llu", value);
#else
            sprintf (pbuf, "%Lu", value);
#endif

(copied and pasted 17 times in the same file, of course)

I did some research and the most likely source of %Lu is the Linux
man-pages, which, before 1999 or thereabouts, said:

> • The optional character l (ell) specifying that a following d, i, o,
> u, x, or X conversion applies to a pointer to a long int or unsigned
> long int argument, or that a following n conversion corresponds to a
> pointer to a long int argument. Linux provides a non ANSI compliant
> use of two l flags as a synonym to q or L. Thus ll can be used in
> combination with float conversions. *This usage is, however, strongly
> discouraged.*
>
> • The character L specifying that a following e, E, f, g, or G
> conversion corresponds to a long double argument, or a following d, i,
> o, u, x, or X conversion corresponds to a long long argument. Note
> that long long is not specified in ANSI C and therefore not portable
> to all architectures.

Emphasis added. So, pre-C99, L was in fact the recommended modifier for
long long.

Cheers,
Alex.
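As a sketch of how the #if/#else block quoted above could collapse into a single portable call: the quoted code does not show the type of `value`, so this assumes it is a `uint64_t`. Under that assumption the `PRIu64` macro from `<inttypes.h>` expands to the correct conversion on any C99 libc, with no per-platform branches. `print_value` and its `size` parameter are hypothetical names, and `snprintf` replaces the original `sprintf` so that truncation can be detected.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical replacement for the platform-specific sprintf calls,
 * assuming the value being printed is a uint64_t. */
static int print_value(char *pbuf, size_t size, uint64_t value)
{
    assert(pbuf != NULL && size > 0);

    /* PRIu64 is a string literal such as "llu" or "lu", chosen by the
     * libc headers to match uint64_t on the target platform. */
    int n = snprintf(pbuf, size, "%" PRIu64, value);

    /* Report both conversion errors and truncated output as failure. */
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

If `value` is actually declared as `unsigned long long`, a plain `"%llu"` is the equivalent fix; either way the `%Lu` branch goes away.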
* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
From: Rich Felker @ 2023-06-22 23:51 UTC
To: Alex Xu (Hello71); +Cc: musl, Markus Wichmann

On Thu, Jun 22, 2023 at 07:37:22PM -0400, Alex Xu (Hello71) wrote:
> Excerpts from Rich Felker's message of June 22, 2023 10:45 am:
> > FWIW I don't think there are a lot of these cases left in the wild at
> > all, but I'm not sure. It might be nice to do some distro-wide testing
> > with this patch applied (which is what I had in mind posting it) and
> > see if any problems are caught before really considering whether to
> > pursue upstreaming it.
>
> Unfortunately, it seems fairly widespread:
> https://codesearch.debian.net/search?q=printf.*%5B%25%5DL%5Bud%5D+-package%3Agcc+-package%3Allvm*+path%3A.*%5C.c%24&literal=0
>
> The most painful example:
>
> #if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__APPLE_CC__) || defined(__APPLE__) || defined(ARGUS_SOLARIS)
>             sprintf (pbuf, "%llu", value);
> #else
>             sprintf (pbuf, "%Lu", value);
> #endif
>
> (copied and pasted 17 times in the same file, of course)

Which file are you looking at?

Some of the results I see are Linux (the kernel) using its own
very-nonstandard printf format specifier system. Those aren't relevant
at all. All of the rest are bugs where the software is silently
malfunctioning now. Some of these are junk code like examples/* etc.
but presumably a lot are actual bugs that need to be fixed, where
something is malfunctioning now.

> I did some research and the most likely source of %Lu is the Linux
> man-pages, which, before 1999 or thereabouts, said:
>
> > • The optional character l (ell) specifying that a following d, i, o,
> > u, x, or X conversion applies to a pointer to a long int or unsigned
> > long int argument, or that a following n conversion corresponds to a
> > pointer to a long int argument. Linux provides a non ANSI compliant
> > use of two l flags as a synonym to q or L. Thus ll can be used in
> > combination with float conversions. *This usage is, however, strongly
> > discouraged.*
> >
> > • The character L specifying that a following e, E, f, g, or G
> > conversion corresponds to a long double argument, or a following d, i,
> > o, u, x, or X conversion corresponds to a long long argument. Note
> > that long long is not specified in ANSI C and therefore not portable
> > to all architectures.
>
> Emphasis added. So, pre-C99, L was in fact the recommended modifier for
> long long.

As I read it, the usage that was discouraged was using ll as an alias
for L with float conversions. Not use of ll where it was the right
form.

In any case, I'm not sure what we can take away from the history. ll
was the form that ended up getting standardized (probably because it
didn't "steal" any additional letter that might be used for other
things on existing implementations or in the future), and both q and
the overloading of L got rejected.

Rich