Hm, trying this with my native GCC compiler, it does behave as you claim. I
think I may have misdiagnosed the issue I was running into. I apologize for
taking up your time.

On Thu, Jun 28, 2018 at 10:41 AM Rich Felker wrote:

> On Thu, Jun 28, 2018 at 10:20:28AM -0700, Mark Winterrowd wrote:
> > Hi all,
> >
> > I believe I have found an out of bounds memory read in vfprintf.c
> >
> > On line 509 in src/stdio/vfprintf.c in the current source tree head, you
> > can observe the following snippet of code:
> >
> > /* Format specifier state machine */
> > st=0;
> > do {
> >     if (OOB(*s)) goto inval;
> >     ps=st;
> >     st=states[st]S(*s++);
> > } while (st-1<STOP);
> > if (!st) goto inval;
> >
> > Note that on line 99 the OOB macro expands to the following test whether
> > the argument falls outside of 'A' and 'z', written to use a single
> > compare:
> >
> > #define OOB(x) ((unsigned)(x)-'A' > 'z'-'A')
> >
> > Unfortunately, the cast to unsigned binds tighter than the subtract
>
> For this idiom, it's intentional that it bind higher. Here since x is
> small (char-range) anyway it doesn't matter, but in general the
> pattern (x-'A') could overflow, producing UB, if x weren't already
> unsigned.
>
> > from 'A', so if x is less than 'A',
> > OOB will return false. This is common in the case of space, which has
> > an ascii value of 32
>
> No, the result of (unsigned)(x)-'A' is unsigned, and in the case
> x<'A', it's a value larger than INT_MAX which is much larger than
> 'z'-'A'.
>
> > compared to 'A' 's value of 65.
> >
> > This causes us to index into states with a negative value for its
> > second dimension, causing us to
> > index to an unpredictable location in states, possibly even off the
> > beginning.
>
> Did you test this? It's possible there's another mistake we're not
> seeing, but the above isn't one. Also note that passing an invalid
> format string is UB already, so any graceful handling of that is just
> hardening, not correctness.
>
> Rich
>
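
For anyone who wants to check the single-compare idiom discussed above, here
is a minimal standalone sketch. It is not taken from the musl tree: only the
OOB macro is copied from the quoted message, while oob_two_compare() and
count_oob_mismatches() are hypothetical helpers added for illustration. It
assumes an ASCII execution character set.

```c
#include <limits.h>

/* Copied from the quoted message: single-compare range test for 'A'..'z'.
 * The cast applies to x before the subtraction, so the subtraction is done
 * in unsigned arithmetic and wraps instead of going negative. */
#define OOB(x) ((unsigned)(x)-'A' > 'z'-'A')

/* Hypothetical two-compare form, added here only for comparison. */
int oob_two_compare(int x)
{
	return x < 'A' || x > 'z';
}

/* Hypothetical helper: counts the char values on which the two forms
 * disagree. A result of 0 means the single compare is equivalent to the
 * two-compare test over the whole char range. */
int count_oob_mismatches(void)
{
	int mismatches = 0;
	for (int c = CHAR_MIN; c <= CHAR_MAX; c++)
		if (OOB(c) != oob_two_compare(c))
			mismatches++;
	return mismatches;
}
```

With this, OOB(' ') evaluates to 1: 32 - 65 computed as unsigned wraps around
to a value far above 'z'-'A', so a space is rejected before states is ever
indexed, which matches Rich's explanation.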