mailing list of musl libc
 help / color / mirror / code / Atom feed
* string-backed FILEs mess
@ 2018-09-12 14:02 Rich Felker
  2018-09-12 15:09 ` Markus Wichmann
  0 siblings, 1 reply; 15+ messages in thread
From: Rich Felker @ 2018-09-12 14:02 UTC (permalink / raw)
  To: musl

While working on the headers/declarations/hidden stuff, I've run
across some issues that should be documented as needing fix. One big
one is this:

strto* is still setting up a fake FILE with buffer end pointer as
(void*)-1 as a hack to read directly from a string of unknown length.
This is of course nonsensical UB and might lead to actual problems
(signed overflow) in places where rend-rpos is computed. strtol.c
handles that aspect already by "if ((size_t)s > (size_t)-1/2)" but
it's still horribly wrong. There's a __string_read function sscanf
uses that avoids the whole problem by incrementally measuring string
length and copying it to a real stdio buffer, which also makes ungetc
safe, and this is the obvious *right* thing to do, but it might make
strto* unacceptably slower. I haven't done any measurement.

The other "mostly right" (modulo ungetc not being available then)
approach would be getting rid of the whole current buffer design with
start/end pointers and using indices instead. This would solve a lot
of stupid gratuitous UB in stdio, like (T*)0-(T*)0 being undefined.
It's not clear to me whether it would be more or less efficient. It
would "break" glibc ABI-compat for stdio -- the original reason I used
the pointer-based design -- but that could be fixed by putting
"must-be-null" fields in place of the buffer pointers so that any
glibc code using getc_unlocked/putc_unlocked macros would hit the
"no buffer space" code path and call an actual function. In many ways
that's desirable anyway.

Probably the right next step here is measuring whether just using
__string_read would make anything measurably slower.

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 14:02 string-backed FILEs mess Rich Felker
@ 2018-09-12 15:09 ` Markus Wichmann
  2018-09-12 15:43   ` Rich Felker
  2018-09-12 15:55   ` A. Wilcox
  0 siblings, 2 replies; 15+ messages in thread
From: Markus Wichmann @ 2018-09-12 15:09 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 10:02:39AM -0400, Rich Felker wrote:
> While working on the headers/declarations/hidden stuff, I've run
> across some issues that should be documented as needing fix. One big
> one is this:
> 
> strto* is still setting up a fake FILE with buffer end pointer as
> (void*)-1 as a hack to read directly from a string of unknown length.
> This is of course nonsensical UB and might lead to actual problems
> (signed overflow) in places where rend-rpos is computed. strtol.c
> handles that aspect already by "if ((size_t)s > (size_t)-1/2)" but
> it's still horribly wrong. There's a __string_read function sscanf
> uses that avoids the whole problem by incrementally measuring string
> length and copying it to a real stdio buffer, which also makes ungetc
> safe, and this is the obvious *right* thing to do, but it might make
> strto* unacceptably slower. I haven't done any measurement.
> 

Well, first of all, I might set my foot wrong here very badly, but I
generally don't care about C standard UB as long as the behavior is
defined elsewhere. In this case, the behavior is undefined because the C
standard only defines a partial mapping from pointer to integer and back
(namely null pointer <-> 0 and non-null pointer <-> nonzero, but nothing
else), and that is why most operations on pointers are undefined.
However, Linux defines that it only works on a linear address space, and
so completes the mapping from pointer to integer, and so avoids the UB
by defining it.

We have done similar things in the past, like assuming long to be large
enough for a pointer, as Linux depends on that as well.

As for the overflow: That would be fixed if you set rend to rpos +
PTRDIFF_MAX unless that overflows. Technically still UB, but at least
the calculations work now.

The __string_read approach shouldn't be all that much slower, though. It
only adds overhead when the read buffer is empty, which is going to be
on the first call and then every 256 characters. And the overhead is
mostly a memchr()... on second thought, that might take a while.

> The other "mostly right" (modulo ungetc not being available then)
> approach would be getting rid of the whole current buffer design with
> start/end pointers and using indices instead. This would solve a lot
> of stupid gratuitous UB in stdio, like (T*)0-(T*)0 being undefined.
> It's not clear to me whether it would be more or less efficient. It
> would "break" glibc ABI-compat for stdio -- the original reason I used
> the pointer-based design -- but that could be fixed by putting
> "must-be-null" fields in place of the buffer pointers so that any
> glibc code using getc_unlocked/putc_unlocked macros would hit the
> "no buffer space" code path and call an actual function. In many ways
> that's desirable anyway.
> 

I don't know how important the glibc-compat still is for people. I don't
need it, but then, I have glibc available if necessary. And with Steam
moving to Linux, closed-source software is certainly going to be on the
rise for Linux/PC (i.e. Linux/i386 and Linux/amd64). Which is going to
be a problem for musl-based distributions, if they want to support that.

> Probably the right next step here is measuring whether just using
> __string_read would make anything measurably slower.
> 
> Rich

Probably. Although the memchr() in that function might access past the
string, but only if the string isn't terminated, so we can invoke the
GIGO principle, right?

Ciao,
Markus


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 15:09 ` Markus Wichmann
@ 2018-09-12 15:43   ` Rich Felker
  2018-09-12 16:33     ` Rich Felker
  2018-09-12 17:41     ` Markus Wichmann
  2018-09-12 15:55   ` A. Wilcox
  1 sibling, 2 replies; 15+ messages in thread
From: Rich Felker @ 2018-09-12 15:43 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 05:09:41PM +0200, Markus Wichmann wrote:
> On Wed, Sep 12, 2018 at 10:02:39AM -0400, Rich Felker wrote:
> > While working on the headers/declarations/hidden stuff, I've run
> > across some issues that should be documented as needing fix. One big
> > one is this:
> > 
> > strto* is still setting up a fake FILE with buffer end pointer as
> > (void*)-1 as a hack to read directly from a string of unknown length.
> > This is of course nonsensical UB and might lead to actual problems
> > (signed overflow) in places where rend-rpos is computed. strtol.c
> > handles that aspect already by "if ((size_t)s > (size_t)-1/2)" but
> > it's still horribly wrong. There's a __string_read function sscanf
> > uses that avoids the whole problem by incrementally measuring string
> > length and copying it to a real stdio buffer, which also makes ungetc
> > safe, and this is the obvious *right* thing to do, but it might make
> > strto* unacceptably slower. I haven't done any measurement.
> > 
> 
> Well, first of all, I might set my foot wrong here very badly, but I
> generally don't care about C standard UB as long as the behavior is
> defined elsewhere.

Like where? In order for it to be defined, the *compiler* has to
define it, since otherwise it can make transformations that assume the
behavior is undefined. So what you're asking for here is basically
amounting to only supporting certain compilers (with certain flags),
and notably *not supporting* UBSan, which is a really valuable tool
for catching bugs.

> In this case, the behavior is undefined because the C
> standard only defines a partial mapping from pointer to integer and back
> (namely null pointer <-> 0 and non-null pointer <-> nonzero, but nothing
> else), and that is why most operations on pointers are undefined.
> However, Linux defines that it only works on a linear address space, and
> so completes the mapping from pointer to integer, and so avoids the UB
> by defining it.
> 
> We have done similar things in the past, like assuming long to be large
> enough for a pointer, as Linux depends on that as well.
> 
> As for the overflow: That would be fixed if you set rend to rpos +
> PTRDIFF_MAX unless that overflows. Technically still UB, but at least
> the calculations work now.

That's what strtol.c is doing right now. The corresponding fix in
strtod.c (all the float functions) is missing.

> The __string_read approach shouldn't be all that much slower, though. It
> only adds overhead when the read buffer is empty, which is going to be
> on the first call and then every 256 characters. And the overhead is
> mostly a memchr()... on second thought, that might take a while.

Ah, I forgot that __string_read is still doing the hack of having
buffer pointers point directly to the string, not copying it, so that
ungetc is unsafe (in general; the intscan/floatscan/shgetc framework
has a way to handle it). So it's not a copy, but it is a memchr of N
to N+(-N&255) bytes, where N is the length of the input item. Even if
N is short, the input item might lie in a longer string, and if you
have a sequence of short numbers (worst-case, most single-digit) in a
long string of numbers, you'll end up repeatedly scanning for null up
to 255 bytes past the end of the number. Just lowering the scanahead
from 256 to something like 16-24 should make this largely irrelevant,
though. Such tuning should probably be specific to caller (strto* vs
vsscanf, since the latter is likely to be reading multiple fields and
possibly non-numeric fields).

With that said, I'd kind of like to make it work the way I was
mistakenly thinking __string_read works, doing actual copying into a
stdio buffer. Then there's no risk of bugs from other stdio code
assuming it has a real buffer (e.g. ungetc is always safe). But I'm
not sure if it would incur noticable cost.

> > The other "mostly right" (modulo ungetc not being available then)
> > approach would be getting rid of the whole current buffer design with
> > start/end pointers and using indices instead. This would solve a lot
> > of stupid gratuitous UB in stdio, like (T*)0-(T*)0 being undefined.
> > It's not clear to me whether it would be more or less efficient. It
> > would "break" glibc ABI-compat for stdio -- the original reason I used
> > the pointer-based design -- but that could be fixed by putting
> > "must-be-null" fields in place of the buffer pointers so that any
> > glibc code using getc_unlocked/putc_unlocked macros would hit the
> > "no buffer space" code path and call an actual function. In many ways
> > that's desirable anyway.
> > 
> 
> I don't know how important the glibc-compat still is for people. I don't

It's important enough that Alpine has done a good deal of work making
it run better with additional libraries. People use it to run various
proprietary software that doesn't ship a musl-linked version.

> need it, but then, I have glibc available if necessary. And with Steam
> moving to Linux, closed-source software is certainly going to be on the
> rise for Linux/PC (i.e. Linux/i386 and Linux/amd64). Which is going to
> be a problem for musl-based distributions, if they want to support that.

Ideally they would be using musl for Steam and similar things, since
it's much less dependant on the host library ecosystem. It would also
be more license-friendly to them. However the awful dlopen-based
architecture of video drivers is a stopper right now.

> > Probably the right next step here is measuring whether just using
> > __string_read would make anything measurably slower.
> 
> Probably. Although the memchr() in that function might access past the
> string, but only if the string isn't terminated, so we can invoke the
> GIGO principle, right?

Input to strto* has to be a string (null-terminated), but it doesn't
need to end after the input item (number). That's why the endptr
argument exists -- for parsing numbers out of a stream of data.

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 15:09 ` Markus Wichmann
  2018-09-12 15:43   ` Rich Felker
@ 2018-09-12 15:55   ` A. Wilcox
  2018-09-12 16:35     ` Rich Felker
  1 sibling, 1 reply; 15+ messages in thread
From: A. Wilcox @ 2018-09-12 15:55 UTC (permalink / raw)
  To: musl


[-- Attachment #1.1: Type: text/plain, Size: 1211 bytes --]

On 09/12/18 10:09, Markus Wichmann wrote:
> I don't know how important the glibc-compat still is for people. I don't
> need it, but then, I have glibc available if necessary. And with Steam
> moving to Linux, closed-source software is certainly going to be on the
> rise for Linux/PC (i.e. Linux/i386 and Linux/amd64). Which is going to
> be a problem for musl-based distributions, if they want to support that.


Genuinely, I used it every day I had an x86_64 box.  Spotify was the
main reason (the Qt 4 app; NOT the CEF/chrome-based piece of ****).  Of
course, now that I trashed my last x86 and am surrounded by better cores
(almost exclusively PowerPC, with some ARM and MIPS for good measure),
glibc ABI compat is almost worthless.

However, that's what https://code.foxkit.us/adelie/gcompat/ is for
anyway.  Since it uses LD_PRELOAD, we could probably overwrite the
entirety of stdio with something compatible with glibc if necessary.

I hope it doesn't come to that, as that'd be a lot more cruft to
maintain; but if it improves musl significantly, that may be the price
to pay.

Best,
--arw

-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 15:43   ` Rich Felker
@ 2018-09-12 16:33     ` Rich Felker
  2018-09-12 17:18       ` Rich Felker
  2018-09-12 17:41     ` Markus Wichmann
  1 sibling, 1 reply; 15+ messages in thread
From: Rich Felker @ 2018-09-12 16:33 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 11:43:06AM -0400, Rich Felker wrote:
> > The __string_read approach shouldn't be all that much slower, though. It
> > only adds overhead when the read buffer is empty, which is going to be
> > on the first call and then every 256 characters. And the overhead is
> > mostly a memchr()... on second thought, that might take a while.
> 
> Ah, I forgot that __string_read is still doing the hack of having
> buffer pointers point directly to the string, not copying it, so that
> ungetc is unsafe (in general; the intscan/floatscan/shgetc framework
> has a way to handle it). So it's not a copy, but it is a memchr of N
> to N+(-N&255) bytes, where N is the length of the input item. Even if
> N is short, the input item might lie in a longer string, and if you
> have a sequence of short numbers (worst-case, most single-digit) in a
> long string of numbers, you'll end up repeatedly scanning for null up
> to 255 bytes past the end of the number. Just lowering the scanahead
> from 256 to something like 16-24 should make this largely irrelevant,
> though. Such tuning should probably be specific to caller (strto* vs
> vsscanf, since the latter is likely to be reading multiple fields and
> possibly non-numeric fields).

OK, some measurements. The test is making a string of " 1" repeated
10M times and looping through it with strtol updating end pointer.

With existing strtol, it takes 0.91s.

With __string_read and existing 256-byte unit, it takes 5.76s.

That's a pretty huge slowdown.

Lowering the step from 256 to 24, it takes 2.92s. Better but still
unacceptably slower.

Inlining the buffer setup code and memchr so that __string_read
doesn't have to get called unless it's exhausted (and doesn't get
called in the test) gets it down to 2.00s with a step of 24, and 1.75s
with a step of 8.

OK, I've been properly initializing the FILE rather than leaving it
uninitialized except for the important fields like the old code did.
Changing that, it's 1.44s with step 8, 1.60s with step 24. I also
confirmed that this version of the code is almost as fast as the
existing code with the memchr removed (just assuming it can read
ahead).

I'll see what else I can find but it doesn't look like there's a way
to fix this without either making it a minimum of ~1.5x slower at this
particular worst-case, or redesigning the backend not to use a fake
FILE but some different abstraction.

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 15:55   ` A. Wilcox
@ 2018-09-12 16:35     ` Rich Felker
  0 siblings, 0 replies; 15+ messages in thread
From: Rich Felker @ 2018-09-12 16:35 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 10:55:45AM -0500, A. Wilcox wrote:
> On 09/12/18 10:09, Markus Wichmann wrote:
> > I don't know how important the glibc-compat still is for people. I don't
> > need it, but then, I have glibc available if necessary. And with Steam
> > moving to Linux, closed-source software is certainly going to be on the
> > rise for Linux/PC (i.e. Linux/i386 and Linux/amd64). Which is going to
> > be a problem for musl-based distributions, if they want to support that.
> 
> 
> Genuinely, I used it every day I had an x86_64 box.  Spotify was the
> main reason (the Qt 4 app; NOT the CEF/chrome-based piece of ****).  Of
> course, now that I trashed my last x86 and am surrounded by better cores
> (almost exclusively PowerPC, with some ARM and MIPS for good measure),
> glibc ABI compat is almost worthless.
> 
> However, that's what https://code.foxkit.us/adelie/gcompat/ is for
> anyway.  Since it uses LD_PRELOAD, we could probably overwrite the
> entirety of stdio with something compatible with glibc if necessary.

As described in the post Markus was replying to, this is a non-issue.
If the design were going to break compat it would just put
must-be-zero fields over top of the old glibc positions so that they'd
always force an actual function call. This would actually reduce the
ABI surface down from actual field values and their interpretation to
a mere contract that certain bytes be all-zeros.

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 16:33     ` Rich Felker
@ 2018-09-12 17:18       ` Rich Felker
  2018-09-14 15:52         ` Rich Felker
  0 siblings, 1 reply; 15+ messages in thread
From: Rich Felker @ 2018-09-12 17:18 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 12:33:45PM -0400, Rich Felker wrote:
> OK, I've been properly initializing the FILE rather than leaving it
> uninitialized except for the important fields like the old code did.
> Changing that, it's 1.44s with step 8, 1.60s with step 24. I also
> confirmed that this version of the code is almost as fast as the
> existing code with the memchr removed (just assuming it can read
> ahead).

Uhg, the source of the "almost" here makes me even more convinced the
current code must go. Part of the reason it's not as fast was that I
was still setting f.read=__string_read, which requires (this is on
i386, 32-bit) setting up the GOT pointer.

What was the old code doing? f.read was uninitialized. But the new
code crashes in that case when hitting the end of the string. Why
doesn't the old code crash? Because f.rend is set way past the end of
the string and never reached. If it were reached:

1. The shgetc macro calls the __shgetc function.
2. The __shgetc function calls __uflow.
3. __uflow calls __toread.
4. __toread inspects uninitialized f->wpos/f->wbase fields and,
   depending on the values it sees, calls f->write, which is also
   uninitialized.
5. If it gets past that, next __uflow calls the uninitialized f->read.

The fact that any of this works at all is a fragile shit-show, and
completely depends on __intscan/__floatscan just using (via shgetc) a
tiny undocumented subset of the FILE structure and a tiny undocumented
subset of the stdio interfaces on it.

Really the existing code is just a poor substitute for having an
abstraction for windowed string iteration, using the stdio FILE
structure in a way that also works with real FILEs. It's clever, but
this kind of clever is not a good thing.

I'm still not sure what the right way forward is, though.

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 15:43   ` Rich Felker
  2018-09-12 16:33     ` Rich Felker
@ 2018-09-12 17:41     ` Markus Wichmann
  2018-09-12 18:03       ` Rich Felker
  2018-09-12 18:48       ` A. Wilcox
  1 sibling, 2 replies; 15+ messages in thread
From: Markus Wichmann @ 2018-09-12 17:41 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 11:43:06AM -0400, Rich Felker wrote:
> On Wed, Sep 12, 2018 at 05:09:41PM +0200, Markus Wichmann wrote:
> > Well, first of all, I might set my foot wrong here very badly, but I
> > generally don't care about C standard UB as long as the behavior is
> > defined elsewhere.
> 
> Like where? In order for it to be defined, the *compiler* has to
> define it, since otherwise it can make transformations that assume the
> behavior is undefined. So what you're asking for here is basically
> amounting to only supporting certain compilers (with certain flags),
> and notably *not supporting* UBSan, which is a really valuable tool
> for catching bugs.
> 

Oh, I didn't think of that. But the compiler still has to follow the
ABI, and the ABI says we have linear addresses. So the pointer to
integer mapping still has to work, and (void*)-1 is defined in the SysV
ABI. Wouldn't make much sense for DOS, but hey, that's not a supported
platform. (Actually that's a bad example, because it would totally make
sense as the far pointer to FFFF:FFFF, but you get my point.)

Besides, you're opening a very scary door there: The C standard's
chapter 7 contains a whole lot of UB in the library, and a compiler
writer could now say: Since it is undefined, obviously it is never going
to happen (and if it does, it is your own fault), so I can write the
optimizer to assume all arguments to functions are such that UB does not
occur. The standard says fflush() is only defined for output streams, so
we're going to assume any stream passed into fflush() is an output
stream and... I don't know, assume all input functions are going to fail
until the next fseek()? Actually, I'm drawing a blank as to what they
could do with this, but the GCC folks would find a way to mess with my
code.

As for UBSan: Can't these sanitizers get their fingers out of the system
implementation? That is pretty much the reason why warnings are
suppressed for system header files, after all: Sometimes, the
implementation just does things you're not supposed to. Like casting -1
to void*. Or calling free() on random pointers that weren't produced by
malloc() before. But the implementation can do these things because it
knows things a portable program can't know.

> 
> Rich

Ciao,
Markus


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 17:41     ` Markus Wichmann
@ 2018-09-12 18:03       ` Rich Felker
  2018-09-12 18:48       ` A. Wilcox
  1 sibling, 0 replies; 15+ messages in thread
From: Rich Felker @ 2018-09-12 18:03 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 07:41:12PM +0200, Markus Wichmann wrote:
> On Wed, Sep 12, 2018 at 11:43:06AM -0400, Rich Felker wrote:
> > On Wed, Sep 12, 2018 at 05:09:41PM +0200, Markus Wichmann wrote:
> > > Well, first of all, I might set my foot wrong here very badly, but I
> > > generally don't care about C standard UB as long as the behavior is
> > > defined elsewhere.
> > 
> > Like where? In order for it to be defined, the *compiler* has to
> > define it, since otherwise it can make transformations that assume the
> > behavior is undefined. So what you're asking for here is basically
> > amounting to only supporting certain compilers (with certain flags),
> > and notably *not supporting* UBSan, which is a really valuable tool
> > for catching bugs.
> 
> Oh, I didn't think of that. But the compiler still has to follow the
> ABI, and the ABI says we have linear addresses.

The ABI defines an interface boundary. These transformations do not
take place at or across boundaries but in contexts where no boundary
is present and ABI does not apply. The possibility of them is the only
reason a tool like UBSan or _FORTIFY_SOURCE can work; otherwise what
these tools do would be invalid, arbitrarily breaking well-defined
code because they think it's bad style rather than justifiedly
changing the behavior of cases where the behavior is not defined.

> So the pointer to
> integer mapping still has to work, and (void*)-1 is defined in the SysV
> ABI. Wouldn't make much sense for DOS, but hey, that's not a supported
> platform. (Actually that's a bad example, because it would totally make
> sense as the far pointer to FFFF:FFFF, but you get my point.)

Creating a pointer like (void*)-1 is implementation- (platform ABI-)
defined, but that doesn't mean you can perform arbitrary operations on
it and have the results be meaningful or even defined. In particular
the -,<,<=,>,>= operators are only defined for a pair of pointers into
the same array. Since (void*)-1 is not a pointer into any array
object, comparing or subtracting it is not defined.

> Besides, you're opening a very scary door there: The C standard's
> chapter 7 contains a whole lot of UB in the library, and a compiler
> writer could now say: Since it is undefined, obviously it is never going
> to happen (and if it does, it is your own fault), so I can write the
> optimizer to assume all arguments to functions are such that UB does not
> occur. The standard says fflush() is only defined for output streams, so
> we're going to assume any stream passed into fflush() is an output
> stream and... I don't know, assume all input functions are going to fail
> until the next fseek()? Actually, I'm drawing a blank as to what they
> could do with this, but the GCC folks would find a way to mess with my
> code.

That's exactly why we have to use -ffreestanding, which says we want
to use the compiler as a freestanding C implementation that does not
include the standard library functions and corresponding assumptions
about their behavior. Without that, for example, the code in calloc
that only zero-fills the buffer produced by malloc if it's not already
zero will be optimized out, since the inspection of uninitialized
memory from malloc is undefined. Likewise the implementation of memcpy
could be optimized to a call to itself (infinite recursion).

> As for UBSan: Can't these sanitizers get their fingers out of the system
> implementation?

If you use UBSan for an application, it's of course not going to do
anything to code in libc. However you can also use it when building
libc, in order to find dangerous bugs. See this recent GCC bug report
where, if not for a bug in UBSan, it would have caught a serious,
dangerous regression in musl:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87191

(Thankfully it was caught manually before release.)

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 17:41     ` Markus Wichmann
  2018-09-12 18:03       ` Rich Felker
@ 2018-09-12 18:48       ` A. Wilcox
  2018-09-12 19:30         ` Markus Wichmann
  1 sibling, 1 reply; 15+ messages in thread
From: A. Wilcox @ 2018-09-12 18:48 UTC (permalink / raw)
  To: musl


[-- Attachment #1.1: Type: text/plain, Size: 599 bytes --]

On 09/12/18 12:41, Markus Wichmann wrote:
> Besides, you're opening a very scary door there: The C standard's
> chapter 7 contains a whole lot of UB in the library, and a compiler
> writer could now say: Since it is undefined, obviously it is never going
> to happen (and if it does, it is your own fault), so I can write the
> optimizer to assume all arguments to functions are such that UB does not
> occur.


Happy early Hallowe'en:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87096

--arw


-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 18:48       ` A. Wilcox
@ 2018-09-12 19:30         ` Markus Wichmann
  2018-09-12 19:46           ` Rich Felker
  0 siblings, 1 reply; 15+ messages in thread
From: Markus Wichmann @ 2018-09-12 19:30 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 01:48:37PM -0500, A. Wilcox wrote:
> Happy early Hallowe'en:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87096
> 

You know, if I were programming on Linux professionally rather than as a
hobby, stuff like this would drive me to drink. I mean, in this case it
is a return value on a pathological case, but actually, it's sort of my
point: C says, snprintf() is undefined with a buffer size of more than
INT_MAX, POSIX defines it, and GCC just goes with the ISO-C version --
because they can? Great!

Wanna bet there's more where that came from?

G'night,
Markus


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 19:30         ` Markus Wichmann
@ 2018-09-12 19:46           ` Rich Felker
  0 siblings, 0 replies; 15+ messages in thread
From: Rich Felker @ 2018-09-12 19:46 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 09:30:22PM +0200, Markus Wichmann wrote:
> On Wed, Sep 12, 2018 at 01:48:37PM -0500, A. Wilcox wrote:
> > Happy early Hallowe'en:
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87096
> > 
> 
> You know, if I were programming on Linux professionally rather than as a
> hobby, stuff like this would drive me to drink. I mean, in this case it
> is a return value on a pathological case, but actually, it's sort of my
> point: C says, snprintf() is undefined with a buffer size of more than
> INT_MAX, POSIX defines it, and GCC just goes with the ISO-C version --
> because they can? Great!

In this case, it's not undefined. ISO C defines it as success (there's
nothing erroneous about the buffer being large; it's only an error if
the output would exceed INT_MAX in length). POSIX then erroneously
(because it nominally extends C and defers to it in case of accidental
conflict) redefines it as an error.

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-12 17:18       ` Rich Felker
@ 2018-09-14 15:52         ` Rich Felker
  2018-09-14 16:24           ` Rich Felker
  0 siblings, 1 reply; 15+ messages in thread
From: Rich Felker @ 2018-09-14 15:52 UTC (permalink / raw)
  To: musl

On Wed, Sep 12, 2018 at 01:18:24PM -0400, Rich Felker wrote:
> On Wed, Sep 12, 2018 at 12:33:45PM -0400, Rich Felker wrote:
> > OK, I've been properly initializing the FILE rather than leaving it
> > uninitialized except for the important fields like the old code did.
> > Changing that, it's 1.44s with step 8, 1.60s with step 24. I also
> > confirmed that this version of the code is almost as fast as the
> > existing code with the memchr removed (just assuming it can read
> > ahead).
> 
> Uhg, the source of the "almost" here makes me even more convinced the
> current code must go. Part of the reason it's not as fast was that I
> was still setting f.read=__string_read, which requires (this is on
> i386, 32-bit) setting up the GOT pointer.
> 
> What was the old code doing? f.read was uninitialized. But the new
> code crashes in that case when hitting the end of the string. Why
> doesn't the old code crash? Because f.rend is set way past the end of
> the string and never reached. If it were reached:
> 
> 1. The shgetc macro calls the __shgetc function.
> 2. The __shgetc function calls __uflow.
> 3. __uflow calls __toread.
> 4. __toread inspects uninitialized f->wpos/f->wbase fields and,
>    depending on the values it sees, calls f->write, which is also
>    uninitialized.
> 5. If it gets past that, next __uflow calls the uninitialized f->read.
> 
> The fact that any of this works at all is a fragile shit-show, and
> completely depends on __intscan/__floatscan just using (via shgetc) a
> tiny undocumented subset of the FILE structure and a tiny undocumented
> subset of the stdio interfaces on it.
> 
> Really the existing code is just a poor substitute for having an
> abstraction for windowed string iteration, using the stdio FILE
> structure in a way that also works with real FILEs. It's clever, but
> this kind of clever is not a good thing.
> 
> I'm still not sure what the right way forward is, though.

OK, a small breakthrough that makes this mess a lot simpler:

The __intscan and __floatscan backends do not (and are not allowed to,
but this should be documented and isn't) call any stdio functions on
the fake FILE* passed to them except for the shgetc and shunget
macros, defined as:

#define shgetc(f) (((f)->rpos < (f)->shend) ? *(f)->rpos++ : __shgetc(f))
#define shunget(f) ((f)->shend ? (void)(f)->rpos-- : (void)0)

If the < is merely replaced by !=, which is functionally equivalent,
then shend can be any type of sentinel pointer we want (e.g. (void*)-1
or even just &localvar) to use the buffer as a string with no known
length, and we have a guarantee that __shgetc is never called.

I think this -1+2-byte change is an acceptable resolution to the issue
for now.

I did however perform some casual performance testing to determine
that strtol on short numbers would be 30% faster if it operated
directly on strings only rather than going through the fake FILE and
having to check end pointer and including code to conditionally call
__shgetc. So it might make sense in the longer term to redesign the
interface to operate only on strings, and have scanf be responsible
for transforming the input to a bounded-size string that will produce
the same result. (Note: vfscanf could shortcut out of having to do
this if it could detect that its underlying FILE is a vsscanf string
buffer.) Such a redesign would also reduce the static linking cost for
programs that use strto* but not stdio input functions.

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-14 15:52         ` Rich Felker
@ 2018-09-14 16:24           ` Rich Felker
  2018-09-14 20:39             ` Rich Felker
  0 siblings, 1 reply; 15+ messages in thread
From: Rich Felker @ 2018-09-14 16:24 UTC (permalink / raw)
  To: musl

On Fri, Sep 14, 2018 at 11:52:27AM -0400, Rich Felker wrote:
> On Wed, Sep 12, 2018 at 01:18:24PM -0400, Rich Felker wrote:
> > On Wed, Sep 12, 2018 at 12:33:45PM -0400, Rich Felker wrote:
> > > OK, I've been properly initializing the FILE rather than leaving it
> > > uninitialized except for the important fields like the old code did.
> > > Changing that, it's 1.44s with step 8, 1.60s with step 24. I also
> > > confirmed that this version of the code is almost as fast as the
> > > existing code with the memchr removed (just assuming it can read
> > > ahead).
> > 
> > Uhg, the source of the "almost" here makes me even more convinced the
> > current code must go. Part of the reason it's not as fast was that I
> > was still setting f.read=__string_read, which requires (this is on
> > i386, 32-bit) setting up the GOT pointer.
> > 
> > What was the old code doing? f.read was uninitialized. But the new
> > code crashes in that case when hitting the end of the string. Why
> > doesn't the old code crash? Because f.rend is set way past the end of
> > the string and never reached. If it were reached:
> > 
> > 1. The shgetc macro calls the __shgetc function.
> > 2. The __shgetc function calls __uflow.
> > 3. __uflow calls __toread.
> > 4. __toread inspects uninitialized f->wpos/f->wbase fields and,
> >    depending on the values it sees, calls f->write, which is also
> >    uninitialized.
> > 5. If it gets past that, next __uflow calls the uninitialized f->read.
> > 
> > The fact that any of this works at all is a fragile shit-show, and
> > completely depends on __intscan/__floatscan just using (via shgetc) a
> > tiny undocumented subset of the FILE structure and a tiny undocumented
> > subset of the stdio interfaces on it.
> > 
> > Really the existing code is just a poor substitute for having an
> > abstraction for windowed string iteration, using the stdio FILE
> > structure in a way that also works with real FILEs. It's clever, but
> > this kind of clever is not a good thing.
> > 
> > I'm still not sure what the right way forward is, though.
> 
> OK, a small breakthrough that makes this mess a lot simpler:
> 
> The __intscan and __floatscan backends do not (and are not allowed to,
> but this should be documented and isn't) call any stdio functions on
> the fake FILE* passed to them except for the shgetc and shunget
> macros, defined as:
> 
> #define shgetc(f) (((f)->rpos < (f)->shend) ? *(f)->rpos++ : __shgetc(f))
> #define shunget(f) ((f)->shend ? (void)(f)->rpos-- : (void)0)
> 
> If the < is merely replaced by !=, which is functionally equivalent,
> then shend can be any type of sentinel pointer we want (e.g. (void*)-1
> or even just &localvar) to use the buffer as a string with no known
> length, and we have a guarantee that __shgetc is never called.
> 
> I think this -1+2-byte change is an acceptable resolution to the issue
> for now.

Uhg, nope, mistake: they also use shcnt/shlim, which perform
arithmetic on f->shend. This fix might still be salvagable, but not
without significant additional work removing the dependency on invalid
arithmetic.

Rich


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: string-backed FILEs mess
  2018-09-14 16:24           ` Rich Felker
@ 2018-09-14 20:39             ` Rich Felker
  0 siblings, 0 replies; 15+ messages in thread
From: Rich Felker @ 2018-09-14 20:39 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 4461 bytes --]

On Fri, Sep 14, 2018 at 12:24:09PM -0400, Rich Felker wrote:
> On Fri, Sep 14, 2018 at 11:52:27AM -0400, Rich Felker wrote:
> > On Wed, Sep 12, 2018 at 01:18:24PM -0400, Rich Felker wrote:
> > > On Wed, Sep 12, 2018 at 12:33:45PM -0400, Rich Felker wrote:
> > > > OK, I've been properly initializing the FILE rather than leaving it
> > > > uninitialized except for the important fields like the old code did.
> > > > Changing that, it's 1.44s with step 8, 1.60s with step 24. I also
> > > > confirmed that this version of the code is almost as fast as the
> > > > existing code with the memchr removed (just assuming it can read
> > > > ahead).
> > > 
> > > Uhg, the source of the "almost" here makes me even more convinced the
> > > current code must go. Part of the reason it's not as fast was that I
> > > was still setting f.read=__string_read, which requires (this is on
> > > i386, 32-bit) setting up the GOT pointer.
> > > 
> > > What was the old code doing? f.read was uninitialized. But the new
> > > code crashes in that case when hitting the end of the string. Why
> > > doesn't the old code crash? Because f.rend is set way past the end of
> > > the string and never reached. If it were reached:
> > > 
> > > 1. The shgetc macro calls the __shgetc function.
> > > 2. The __shgetc function calls __uflow.
> > > 3. __uflow calls __toread.
> > > 4. __toread inspects uninitialized f->wpos/f->wbase fields and,
> > >    depending on the values it sees, calls f->write, which is also
> > >    uninitialized.
> > > 5. If it gets past that, next __uflow calls the uninitialized f->read.
> > > 
> > > The fact that any of this works at all is a fragile shit-show, and
> > > completely depends on __intscan/__floatscan just using (via shgetc) a
> > > tiny undocumented subset of the FILE structure and a tiny undocumented
> > > subset of the stdio interfaces on it.
> > > 
> > > Really the existing code is just a poor substitute for having an
> > > abstraction for windowed string iteration, using the stdio FILE
> > > structure in a way that also works with real FILEs. It's clever, but
> > > this kind of clever is not a good thing.
> > > 
> > > I'm still not sure what the right way forward is, though.
> > 
> > OK, a small breakthrough that makes this mess a lot simpler:
> > 
> > The __intscan and __floatscan backends do not (and are not allowed to,
> > but this should be documented and isn't) call any stdio functions on
> > the fake FILE* passed to them except for the shgetc and shunget
> > macros, defined as:
> > 
> > #define shgetc(f) (((f)->rpos < (f)->shend) ? *(f)->rpos++ : __shgetc(f))
> > #define shunget(f) ((f)->shend ? (void)(f)->rpos-- : (void)0)
> > 
> > If the < is merely replaced by !=, which is functionally equivalent,
> > then shend can be any type of sentinel pointer we want (e.g. (void*)-1
> > or even just &localvar) to use the buffer as a string with no known
> > length, and we have a guarantee that __shgetc is never called.
> > 
> > I think this -1+2-byte change is an acceptable resolution to the issue
> > for now.
> 
> Uhg, nope, mistake: they also use shcnt/shlim, which perform
> arithmetic on f->shend. This fix might still be salvagable, but not
> without significant additional work removing the dependency on invalid
> arithmetic.

Here's the patch I'm looking at now. Summary of what happened:

Previously, f->shcnt stored the count that would have been read if
consuming the whole buffer, which required an end pointer for the
buffer. The purpose for this was that it allowed reading it and adding
rpos-rend at any time to get the actual count so far, and required no
adjustment at the time of __shgetc (actual function call) since the
call would only happen when reaching the end of the buffer (actually
there may have been bugs in this with the interaction of scanf field
widths and end-of-buffer boundaries; it would be interesting to go
back and check that).

To get rid of the dependency on rend, I'm instead offsetting shcnt by
buf-rpos (start of buffer) at the time of last __shlim/__shgetc call.
This makes for slightly more work in __shgetc the function, but for
the inline macro it's still just as easy to compute the current count.

It doesn't break anything in libc-test or cause a performance or
functionality regression in my strtol performance measurement program,
so I think it's okay, but I'd appreciate any second looks others are
available to give. :-)

As a bonus I added comments.

Rich

[-- Attachment #2: strtoUB.diff --]
[-- Type: text/plain, Size: 4084 bytes --]

diff --git a/src/internal/shgetc.c b/src/internal/shgetc.c
index e878b00..ebd5fae 100644
--- a/src/internal/shgetc.c
+++ b/src/internal/shgetc.c
@@ -1,10 +1,16 @@
 #include "shgetc.h"
 
+/* The shcnt field stores the number of bytes read so far, offset by
+ * the value of buf-rpos at the last function call (__shlim or __shgetc),
+ * so that between calls the inline shcnt macro can add rpos-buf to get
+ * the actual count. */
+
 void __shlim(FILE *f, off_t lim)
 {
 	f->shlim = lim;
-	f->shcnt = f->rend - f->rpos;
-	if (lim && f->shcnt > lim)
+	f->shcnt = f->buf - f->rpos;
+	/* If lim is nonzero, rend must be a valid pointer. */
+	if (lim && f->rend - f->rpos > lim)
 		f->shend = f->rpos + lim;
 	else
 		f->shend = f->rend;
@@ -13,15 +19,18 @@ void __shlim(FILE *f, off_t lim)
 int __shgetc(FILE *f)
 {
 	int c;
-	if (f->shlim && f->shcnt >= f->shlim || (c=__uflow(f)) < 0) {
+	off_t cnt = shcnt(f);
+	if (f->shlim && cnt >= f->shlim || (c=__uflow(f)) < 0) {
+		f->shcnt = f->buf - f->rpos + cnt;
 		f->shend = 0;
 		return EOF;
 	}
-	if (f->shlim && f->rend - f->rpos > f->shlim - f->shcnt - 1)
-		f->shend = f->rpos + (f->shlim - f->shcnt - 1);
+	cnt++;
+	if (f->shlim && f->rend - f->rpos > f->shlim - cnt)
+		f->shend = f->rpos + (f->shlim - cnt);
 	else
 		f->shend = f->rend;
-	if (f->rend) f->shcnt += f->rend - f->rpos + 1;
+	f->shcnt = f->buf - f->rpos + cnt;
 	if (f->rpos[-1] != c) f->rpos[-1] = c;
 	return c;
 }
diff --git a/src/internal/shgetc.h b/src/internal/shgetc.h
index 210f646..a4dd6d6 100644
--- a/src/internal/shgetc.h
+++ b/src/internal/shgetc.h
@@ -1,9 +1,27 @@
 #include "stdio_impl.h"
 
+/* Scan helper "stdio" functions for use by scanf-family and strto*-family
+ * functions. These accept either avalid stdio FILE, or a minimal pseduo
+ * FILE whose buffer pointers point into a null-terminated string. In the
+ * latter case, the sh_fromstring macro should be used to setup the FILE;
+ * the rest of the structure can be left uninitialized.
+ *
+ * To begin using these functions, shlim must first be called on the FILE
+ * to set a field width limit, or 0 for no limit. For string pseudo-FILEs,
+ * a nonzero limit is not valid and produces undefined behavior. After that,
+ * shgetc, shunget, and shcnt are valid as long as no other stdio functions
+ * are called on the stream. shunget can only be called once without an
+ * intervening successful shgetc on real stdio FILEs. It can be called
+ * multiple times, up to the number of successful shgetc calls, on a
+ * string pseudo-FILE. */
+
 hidden void __shlim(FILE *, off_t);
 hidden int __shgetc(FILE *);
 
-#define shcnt(f) ((f)->shcnt + ((f)->rpos - (f)->rend))
+#define sh_fromstring(f, s) \
+	((f)->buf = (f)->rpos = (void *)(s), (f)->rend = (void*)-1)
+
+#define shcnt(f) ((f)->shcnt + ((f)->rpos - (f)->buf))
 #define shlim(f, lim) __shlim((f), (lim))
-#define shgetc(f) (((f)->rpos < (f)->shend) ? *(f)->rpos++ : __shgetc(f))
+#define shgetc(f) (((f)->rpos != (f)->shend) ? *(f)->rpos++ : __shgetc(f))
 #define shunget(f) ((f)->shend ? (void)(f)->rpos-- : (void)0)
diff --git a/src/stdlib/strtod.c b/src/stdlib/strtod.c
index a898b1d..a5d0118 100644
--- a/src/stdlib/strtod.c
+++ b/src/stdlib/strtod.c
@@ -5,10 +5,8 @@
 
 static long double strtox(const char *s, char **p, int prec)
 {
-	FILE f = {
-		.buf = (void *)s, .rpos = (void *)s,
-		.rend = (void *)-1, .lock = -1
-	};
+	FILE f;
+	sh_fromstring(&f, s);
 	shlim(&f, 0);
 	long double y = __floatscan(&f, prec, 1);
 	off_t cnt = shcnt(&f);
diff --git a/src/stdlib/strtol.c b/src/stdlib/strtol.c
index d82ecf7..f42e7f8 100644
--- a/src/stdlib/strtol.c
+++ b/src/stdlib/strtol.c
@@ -9,13 +9,7 @@ static unsigned long long strtox(const char *s, char **p, int base, unsigned lon
 {
 	/* FIXME: use a helper function or macro to setup the FILE */
 	FILE f;
-	f.flags = 0;
-	f.buf = f.rpos = (void *)s;
-	if ((size_t)s > (size_t)-1/2)
-		f.rend = (void *)-1;
-	else
-		f.rend = (unsigned char *)s+(size_t)-1/2;
-	f.lock = -1;
+	sh_fromstring(&f, s);
 	shlim(&f, 0);
 	unsigned long long y = __intscan(&f, base, 1, lim);
 	if (p) {

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-09-14 20:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-12 14:02 string-backed FILEs mess Rich Felker
2018-09-12 15:09 ` Markus Wichmann
2018-09-12 15:43   ` Rich Felker
2018-09-12 16:33     ` Rich Felker
2018-09-12 17:18       ` Rich Felker
2018-09-14 15:52         ` Rich Felker
2018-09-14 16:24           ` Rich Felker
2018-09-14 20:39             ` Rich Felker
2018-09-12 17:41     ` Markus Wichmann
2018-09-12 18:03       ` Rich Felker
2018-09-12 18:48       ` A. Wilcox
2018-09-12 19:30         ` Markus Wichmann
2018-09-12 19:46           ` Rich Felker
2018-09-12 15:55   ` A. Wilcox
2018-09-12 16:35     ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).