Hm, trying this with my native GCC compiler, it does behave as you claim. I think I misdiagnosed the issue I was running into. I apologize for taking up your time.

On Thu, Jun 28, 2018 at 10:41 AM Rich Felker <dalias@libc.org> wrote:
On Thu, Jun 28, 2018 at 10:20:28AM -0700, Mark Winterrowd wrote:
> Hi all,
>
> I believe I have found an out of bounds memory read in vfprintf.c
>
> On line 509 in src/stdio/vfprintf.c in the current source tree head, you
> can observe the following snippet of code:
>
> /* Format specifier state machine */
> st=0;
> do {
> if (OOB(*s)) goto inval;
> ps=st;
> st=states[st]S(*s++);
> } while (st-1<STOP);
> if (!st) goto inval;
>
> Note that on line 99 the OOB macro expands to the following test whether
> the argument falls outside of 'A' and 'z', written to use a single compare:
>
> #define OOB(x) ((unsigned)(x)-'A' > 'z'-'A')
> Unfortunately, the cast to unsigned binds tighter than the subtract

For this idiom, it's intentional that it bind higher. Here since x is
small (char-range) anyway it doesn't matter, but in general the
pattern (x-'A') could overflow, producing UB, if x weren't already
unsigned.
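
To illustrate the idiom in general form, here is a minimal sketch. The
names in_range and in_range_selfcheck are made up for this example and
are not part of musl. It assumes lo <= hi and an ASCII character set.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from musl): nonzero if lo <= x <= hi.
 * Assumes lo <= hi. Every operand is converted to unsigned BEFORE the
 * subtraction, so the arithmetic wraps modulo UINT_MAX+1 instead of
 * overflowing a signed int, which would be undefined behavior. */
int in_range(int x, int lo, int hi)
{
    return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
}

/* A few spot checks, assuming ASCII ('A' is 65, 'z' is 122). */
void in_range_selfcheck(void)
{
    assert(in_range('A', 'A', 'z'));
    assert(in_range('z', 'A', 'z'));
    assert(!in_range(' ', 'A', 'z'));     /* below lo: difference wraps to a huge value */
    assert(!in_range('{', 'A', 'z'));     /* one past 'z' */
    assert(!in_range(INT_MIN, 'A', 'z')); /* signed x-'A' would overflow here */
    assert(!in_range(INT_MAX, 'A', 'z'));
}
```

With a plain signed (x-'A'), the INT_MIN case would be undefined; with
the cast applied first, it is an ordinary wrapped unsigned value that
fails the comparison.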

> from 'A', so if x is less than 'A',
> OOB will return false. This is common in the case of space, which has
> an ASCII value of 32

No, the result of (unsigned)(x)-'A' is unsigned, and in the case
x<'A', it's a value larger than INT_MAX which is much larger than
'z'-'A'.
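
A small sketch that checks this for the space character. The OOB macro
is copied from the quoted source; oob_lhs and oob_selfcheck are
hypothetical names used only for this illustration, and the concrete
numbers assume an ASCII execution character set.

```c
#include <assert.h>
#include <limits.h>

/* Same definition as quoted from src/stdio/vfprintf.c. */
#define OOB(x) ((unsigned)(x)-'A' > 'z'-'A')

/* Hypothetical helper: the left-hand side of the OOB comparison.
 * The cast applies to c first, so the subtraction is done in unsigned. */
unsigned oob_lhs(int c)
{
    return (unsigned)c - 'A';
}

void oob_selfcheck(void)
{
    /* ' ' is 32 and 'A' is 65 in ASCII, so 32u - 65u wraps around. */
    assert(oob_lhs(' ') == UINT_MAX - 32);
    assert(oob_lhs(' ') > (unsigned)INT_MAX);

    /* Therefore space is reported as out of bounds, as intended. */
    assert(OOB(' '));
    assert(OOB('@'));   /* 64, one below 'A' */
    assert(OOB('{'));   /* 123, one above 'z' */
    assert(!OOB('A'));
    assert(!OOB('z'));
}
```

So for any character below 'A' the state machine takes the
"goto inval" branch before states[] is ever indexed.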

> compared to 'A''s value of 65.
>
> This causes us to index into states with a negative value for its
> second dimension, landing at an unpredictable location in states,
> possibly even before its beginning.

Did you test this? It's possible there's another mistake we're not
seeing, but the above isn't one. Also note that passing an invalid
format string is UB already, so any graceful handling of that is just
hardening, not correctness.

Rich