mailing list of musl libc
* [musl] [PATCH] [RFC] trap on invalid printf formats
@ 2023-06-21 21:37 Rich Felker
  2023-06-21 22:10 ` enh
  2023-06-22 14:35 ` Markus Wichmann
  0 siblings, 2 replies; 6+ messages in thread
From: Rich Felker @ 2023-06-21 21:37 UTC
  To: musl

[-- Attachment #1: Type: text/plain, Size: 492 bytes --]

Inspired by a new instance of some bitrotted software using %Lu
instead of %llu, attached is a draft patch to catch such errors rather
than silently leaving missing output.

I don't know if this is a good idea to actually do (note: probably
matching changes should be made in wide printf and maybe also scanf if
so) but I'm posting it here in case anyone wants to experiment or
discuss. Note that there is no conformance distinction since invalid
format strings are undefined behavior.
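
To make the %Lu versus %llu distinction concrete, here is a minimal sketch
of the conversions ISO C does define for these types. The helper names
(format_ull, format_u64, format_ld) are hypothetical and exist only for
illustration; %Lu is deliberately absent because its behavior is undefined.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: unsigned long long takes the standard "ll" length
 * modifier. Returns snprintf's result (negative on error). */
int format_ull(char *buf, size_t size, unsigned long long value)
{
	assert(buf != NULL);
	return snprintf(buf, size, "%llu", value);
}

/* Hypothetical helper: for fixed-width types, the <inttypes.h> macro picks
 * the right length modifier for the platform. */
int format_u64(char *buf, size_t size, uint64_t value)
{
	assert(buf != NULL);
	return snprintf(buf, size, "%" PRIu64, value);
}

/* Hypothetical helper: "L" is defined only for floating conversions,
 * where it selects a long double argument. */
int format_ld(char *buf, size_t size, long double value)
{
	assert(buf != NULL);
	return snprintf(buf, size, "%.2Lf", value);
}
```

Building with format warnings enabled in the compiler is one way to find
the mismatched cases in existing code before they reach the libc.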

Rich

[-- Attachment #2: printf-trap.diff --]
[-- Type: text/plain, Size: 486 bytes --]

diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
index 33019ff1..c3bd4d31 100644
--- a/src/stdio/vfprintf.c
+++ b/src/stdio/vfprintf.c
@@ -10,6 +10,7 @@
 #include <inttypes.h>
 #include <math.h>
 #include <float.h>
+#include "atomic.h"
 
 /* Some useful macros */
 
@@ -652,8 +653,7 @@ static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg,
 	return 1;
 
 inval:
-	errno = EINVAL;
-	return -1;
+	a_crash();
 overflow:
 	errno = EOVERFLOW;
 	return -1;


* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
  2023-06-21 21:37 [musl] [PATCH] [RFC] trap on invalid printf formats Rich Felker
@ 2023-06-21 22:10 ` enh
  2023-06-22 14:35 ` Markus Wichmann
  1 sibling, 0 replies; 6+ messages in thread
From: enh @ 2023-06-21 22:10 UTC
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1358 bytes --]

i really should do this in bionic... currently we only abort on %n (which
is explicitly _not_ supported) or %w with a silly size.

for random junk, we currently do what the BSDs do (since that's where this
code originally came from):

      default: /* "%?" prints ?, unless ? is NUL */
        if (ch == '\0') goto done;
        /* pretend it was %c with argument ch */
        cp = buf;
        *cp = ch;
        size = 1;
        sign = '\0';
        break;

from running a quick test program, macOS and glibc seem to work similarly,
which increases the chances that there's incorrect code out there. (clang
does at least warn "warning: invalid conversion specifier '?'
[-Wformat-invalid-specifier]" by default, though. gcc doesn't.)
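
As a rough illustration of the BSD-style fallback quoted above, the
following toy formatter treats any unknown conversion as if it were %c
with the conversion character itself as the argument. toy_format is a
hypothetical function written for this sketch; it handles only %d and %s
and is not the bionic or BSD implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* toy_format: hypothetical miniature formatter. Writes at most size-1
 * characters plus a terminating NUL and returns the number written. */
int toy_format(char *out, size_t size, const char *fmt, ...)
{
	va_list ap;
	size_t len = 0;

	assert(out != NULL && size > 0);
	va_start(ap, fmt);
	while (*fmt != '\0' && len + 1 < size) {
		if (*fmt != '%') {
			out[len++] = *fmt++;
			continue;
		}
		char ch = *++fmt;       /* conversion character after '%' */
		char tmp[32];
		const char *src = tmp;

		switch (ch) {
		case 'd':
			snprintf(tmp, sizeof tmp, "%d", va_arg(ap, int));
			break;
		case 's':
			src = va_arg(ap, const char *);
			break;
		case '\0':
			/* lone '%' at the end of the format: stop */
			goto done;
		default:
			/* pretend it was %c with argument ch; no argument
			 * is consumed, and "%%" lands here too */
			tmp[0] = ch;
			tmp[1] = '\0';
			break;
		}
		fmt++;
		while (*src != '\0' && len + 1 < size)
			out[len++] = *src++;
	}
done:
	va_end(ap);
	out[len] = '\0';
	return (int)len;
}
```

With this fallback, a call such as toy_format(buf, sizeof buf, "100%?")
succeeds and produces output, which is why incorrect formats can go
unnoticed on such implementations.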

On Wed, Jun 21, 2023 at 2:38 PM Rich Felker <dalias@libc.org> wrote:

> Inspired by a new instance of some bitrotted software using %Lu
> instead of %llu, attached is a draft patch to catch such errors rather
> than silently leaving missing output.
>
> I don't know if this is a good idea to actually do (note: probably
> matching changes should be made in wide printf and maybe also scanf if
> so) but I'm posting it here in case anyone wants to experiment or
> discuss. Note that there is no conformance distinction since invalid
> format strings are undefined behavior.
>
> Rich
>



* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
  2023-06-21 21:37 [musl] [PATCH] [RFC] trap on invalid printf formats Rich Felker
  2023-06-21 22:10 ` enh
@ 2023-06-22 14:35 ` Markus Wichmann
  2023-06-22 14:45   ` Rich Felker
  1 sibling, 1 reply; 6+ messages in thread
From: Markus Wichmann @ 2023-06-22 14:35 UTC
  To: musl

On Wed, Jun 21, 2023 at 05:37:49PM -0400, Rich Felker wrote:
> Inspired by a new instance of some bitrotted software using %Lu
> instead of %llu, attached is a draft patch to catch such errors rather
> than silently leaving missing output.
>

Ah, it's the old dichotomy: For developers, this is nice, since it
catches bad behavior in a way that is impossible to ignore. For users,
this is horrible, since it crashes bad applications, maybe just before
they could save their data. And making the behavior selectable is
probably not sensible.

As I count myself among the former group, I do support this move. But
then, I do not run a distro, and the only computer I am responsible for
is my own.

Ciao,
Markus


* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
  2023-06-22 14:35 ` Markus Wichmann
@ 2023-06-22 14:45   ` Rich Felker
  2023-06-22 23:37     ` Alex Xu (Hello71)
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Felker @ 2023-06-22 14:45 UTC
  To: Markus Wichmann; +Cc: musl

On Thu, Jun 22, 2023 at 04:35:54PM +0200, Markus Wichmann wrote:
> On Wed, Jun 21, 2023 at 05:37:49PM -0400, Rich Felker wrote:
> > Inspired by a new instance of some bitrotted software using %Lu
> > instead of %llu, attached is a draft patch to catch such errors rather
> > than silently leaving missing output.
> >
> 
> Ah, it's the old dichotomy: For developers, this is nice, since it
> catches bad behavior in a way that is impossible to ignore. For users,
> this is horrible, since it crashes bad applications, maybe just before
> they could save their data. And making the behavior selectable is
> probably not sensible.
> 
> As I count myself among the former group, I do support this move. But
> then, I do not run a distro, and the only computer I am responsible for
> is my own.

Even for users, I'm not so sure of that. The distinction is
potentially between crashing while writing out a new file containing
corrupted data before rename()ing it over the old file, and silently
succeeding then rename()ing the corrupted file over the old file.

(Note that this is only silent if you don't check for errors. If anyone
using invalid formats were checking printf's return value, these errors
would necessarily have been reported already. The exception is code that
just checks ferror() at the end: we're not setting the FILE error flag
on format string errors, which we probably should do if we don't switch
to trapping.)
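
A minimal sketch of the write-then-rename pattern discussed above,
checking both the return value of each formatted write and the stream
error flag before replacing the old file. save_counter, its parameters,
and the "counter=" record format are hypothetical; durability concerns
such as fsync() are left out.

```c
#include <assert.h>
#include <stdio.h>

/* save_counter: hypothetical helper. Writes the new contents to tmp_path
 * and renames it over final_path only if every step succeeded.
 * Returns 0 on success, -1 on failure (the old file is left untouched). */
int save_counter(const char *final_path, const char *tmp_path,
                 unsigned long long value)
{
	assert(final_path != NULL && tmp_path != NULL);

	FILE *f = fopen(tmp_path, "w");
	if (f == NULL)
		return -1;

	int ok = 1;
	/* Check each formatted write: a negative return reports the failure
	 * even if the stream's error flag was not set. */
	if (fprintf(f, "counter=%llu\n", value) < 0)
		ok = 0;
	/* Flush buffered data, then check the error flag for I/O errors. */
	if (fflush(f) != 0 || ferror(f))
		ok = 0;
	if (fclose(f) != 0)
		ok = 0;

	if (!ok) {
		remove(tmp_path);   /* discard the partial file */
		return -1;
	}
	return rename(tmp_path, final_path) == 0 ? 0 : -1;
}
```

Code structured this way never renames a file whose writes failed,
whether the failure is reported through the return value or the error flag.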

FWIW I don't think there are a lot of these cases left in the wild at
all, but I'm not sure. It might be nice to do some distro-wide testing
with this patch applied (which is what I had in mind posting it) and
see if any problems are caught before really considering whether to
pursue upstreaming it.

Rich


* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
  2023-06-22 14:45   ` Rich Felker
@ 2023-06-22 23:37     ` Alex Xu (Hello71)
  2023-06-22 23:51       ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Alex Xu (Hello71) @ 2023-06-22 23:37 UTC
  To: musl; +Cc: Markus Wichmann, Rich Felker

Excerpts from Rich Felker's message of June 22, 2023 10:45 am:
> FWIW I don't think there are a lot of these cases left in the wild at
> all, but I'm not sure. It might be nice to do some distro-wide testing
> with this patch applied (which is what I had in mind posting it) and
> see if any problems are caught before really considering whether to
> pursue upstreaming it.

Unfortunately, it seems fairly widespread:
https://codesearch.debian.net/search?q=printf.*%5B%25%5DL%5Bud%5D+-package%3Agcc+-package%3Allvm*+path%3A.*%5C.c%24&literal=0

The most painful example:

#if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__APPLE_CC__) || defined(__APPLE__) || defined(ARGUS_SOLARIS)
    sprintf (pbuf, "%llu", value);
#else
    sprintf (pbuf, "%Lu", value);
#endif

(copied and pasted 17 times in the same file, of course)

I did some research and the most likely source of %Lu is the Linux 
man-pages, which, before 1999 or thereabouts, said:

> • The optional character l (ell) specifying that a following d, i, o, 
> u, x, or X conversion applies to a pointer to a long int or unsigned 
> long int argument, or that a following n conversion corresponds to a 
> pointer to a long int argument.  Linux provides a non ANSI compliant 
> use of two l flags as a synonym to q or L.  Thus ll can be used in 
> combination with float conversions.  *This usage is, however, strongly 
> discouraged.*
>
> • The character L specifying that a following e, E, f, g, or G 
> conversion corresponds to a long double argument, or a following d, i, 
> o, u, x, or X conversion corresponds to a long long argument.  Note 
> that long long is not specified in ANSI C and therefore not portable 
> to all architectures.

Emphasis added. So, pre-C99, L was in fact the recommended modifier for 
long long.

Cheers,
Alex.


* Re: [musl] [PATCH] [RFC] trap on invalid printf formats
  2023-06-22 23:37     ` Alex Xu (Hello71)
@ 2023-06-22 23:51       ` Rich Felker
  0 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2023-06-22 23:51 UTC
  To: Alex Xu (Hello71); +Cc: musl, Markus Wichmann

On Thu, Jun 22, 2023 at 07:37:22PM -0400, Alex Xu (Hello71) wrote:
> Excerpts from Rich Felker's message of June 22, 2023 10:45 am:
> > FWIW I don't think there are a lot of these cases left in the wild at
> > all, but I'm not sure. It might be nice to do some distro-wide testing
> > with this patch applied (which is what I had in mind posting it) and
> > see if any problems are caught before really considering whether to
> > pursue upstreaming it.
> 
> Unfortunately, it seems fairly widespread:
> https://codesearch.debian.net/search?q=printf.*%5B%25%5DL%5Bud%5D+-package%3Agcc+-package%3Allvm*+path%3A.*%5C.c%24&literal=0
> 
> The most painful example:
> 
> #if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__APPLE_CC__) || defined(__APPLE__) || defined(ARGUS_SOLARIS)
>     sprintf (pbuf, "%llu", value);
> #else
>     sprintf (pbuf, "%Lu", value);
> #endif
> 
> (copied and pasted 17 times in the same file, of course)

Which file are you looking at? Some of the results I see are Linux
(the kernel) using its own very-nonstandard printf format specifier
system. Those aren't relevant at all.

All of the rest are bugs where the software is silently malfunctioning
now. Some of these are junk code like examples/* etc., but presumably a
lot are actual bugs that need to be fixed.

> I did some research and the most likely source of %Lu is the Linux 
> man-pages, which, before 1999 or thereabouts, said:
> 
> > • The optional character l (ell) specifying that a following d, i, o, 
> > u, x, or X conversion applies to a pointer to a long int or unsigned 
> > long int argument, or that a following n conversion corresponds to a 
> > pointer to a long int argument.  Linux provides a non ANSI compliant 
> > use of two l flags as a synonym to q or L.  Thus ll can be used in 
> > combination with float conversions.  *This usage is, however, strongly 
> > discouraged.*
> >
> > • The character L specifying that a following e, E, f, g, or G 
> > conversion corresponds to a long double argument, or a following d, i, 
> > o, u, x, or X conversion corresponds to a long long argument.  Note 
> > that long long is not specified in ANSI C and therefore not portable 
> > to all architectures.
> 
> Emphasis added. So, pre-C99, L was in fact the recommended modifier for 
> long long.

As I read it, the usage that was discouraged was using ll as an alias
for L with float conversions. Not use of ll where it was the right
form.

In any case, I'm not sure what we can take away from the history. ll
was the form that ended up getting standardized (probably because it
didn't "steal" any additional letter that might be used for other
things on existing implementations or in the future), and both q and
the overloading of L got rejected.

Rich

