mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] out-of-bounds reads in strstr
@ 2020-12-08 19:39 Brooks Davis
  2020-12-08 19:53 ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Brooks Davis @ 2020-12-08 19:39 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 625 bytes --]

The strstr implementation contains the following snippet which results
in out-of-bounds reads in memchr (we detect them on CHERI because we
have byte-granularity bounds of small buffers):

	/* Fast estimate for MIN(l,63) */
	size_t grow = l | 63;
	const unsigned char *z2 = memchr(z, 0, grow);

The use of `|` means this is very much not an approximation of
`MIN(l,63)`.  What is actually intended here?  For CheriBSD (via FreeBSD)
I need a way to avoid out-of-bounds reads entirely (`MIN(l,63)` does seem
to work in simple system-level testing, but given the mismatch it's
unclear that's what was intended).

Thanks,
Brooks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] out-of-bounds reads in strstr
  2020-12-08 19:39 [musl] out-of-bounds reads in strstr Brooks Davis
@ 2020-12-08 19:53 ` Rich Felker
  2020-12-08 22:44   ` Brooks Davis
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2020-12-08 19:53 UTC (permalink / raw)
  To: Brooks Davis; +Cc: musl

On Tue, Dec 08, 2020 at 07:39:19PM +0000, Brooks Davis wrote:
> The strstr implementation contains the following snippet which results
> in out-of-bounds reads in memchr (we detect them on CHERI because we
> have byte-granularity bounds of small buffers):
> 
> 	/* Fast estimate for MIN(l,63) */
> 	size_t grow = l | 63;
> 	const unsigned char *z2 = memchr(z, 0, grow);
> 
> The use of `|` means this is very much not an approximation of
> `MIN(l,63)`.  What is actually intended here?  For CheriBSD (via FreeBSD)
> I need a way to avoid out-of-bounds reads entirely (`MIN(l,63)` does seem
> to work in simple system-level testing, but given the mismatch it's
> unclear that's what was intended).

There is no OOB read in strstr here. The overread is in the
implementation of memchr, which (if you're using the musl version) is
relying on the ability to overread as long as the address is aligned
(assuming protection is at boundaries larger than word size). If this
is a problem, you should use a version of memchr that does not
overread. Note that, per 7.24.5.1 The memchr function, ¶2:

    "The implementation shall behave as if it reads the characters
    sequentially and stops as soon as a matching character is found."

so strstr's use of memchr here is perfectly correct/valid without any
assumption of implementation details, i.e. [at least this part of]
strstr is portable C.

Rich

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] out-of-bounds reads in strstr
  2020-12-08 19:53 ` Rich Felker
@ 2020-12-08 22:44   ` Brooks Davis
  2020-12-08 22:57     ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Brooks Davis @ 2020-12-08 22:44 UTC (permalink / raw)
  To: Rich Felker; +Cc: Brooks Davis, musl

[-- Attachment #1: Type: text/plain, Size: 1763 bytes --]

On Tue, Dec 08, 2020 at 02:53:28PM -0500, Rich Felker wrote:
> On Tue, Dec 08, 2020 at 07:39:19PM +0000, Brooks Davis wrote:
> > The strstr implementation contains the following snippet which results
> > in out-of-bounds reads in memchr (we detect them on CHERI because we
> > have byte-granularity bounds of small buffers):
> > 
> > 	/* Fast estimate for MIN(l,63) */
> > 	size_t grow = l | 63;
> > 	const unsigned char *z2 = memchr(z, 0, grow);
> > 
> > The use of `|` means this is very much not an approximation of
> > `MIN(l,63)`.  What is actually intended here?  For CheriBSD (via FreeBSD)
> > I need a way to avoid out-of-bounds reads entirely (`MIN(l,63)` does seem
> > to work in simple system-level testing, but given the mismatch it's
> > unclear that's what was intended).
> 
> There is no OOB read in strstr here. The overread is in the
> implementation of memchr, which (if you're using the musl version) is
> relying on the ability to overread as long as the address is aligned
> (assuming protection is at boundaries larger than word size). If this
> is a problem, you should use a version of memchr that does not
> overread. Note that, per 7.24.5.1 The memchr function, ??2:
> 
>     "The implementation shall behave as if it reads the characters
>     sequentially and stops as soon as a matching character is found."
> 
> so strstr's use of memchr here is perfectly correct/valid without any
> assumption of implementation details, i.e. [at least this part of]
> strstr is portable C.

Ok, I see that. I'll adjust our (musl-derived) memchr to work
correctly in this case.

That being said. I'm still confused by the comment in strstr. `l | 63`
is closer to `MAX(l,63)` than `MIN(l,63)`.

Thanks,
Brooks


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] out-of-bounds reads in strstr
  2020-12-08 22:44   ` Brooks Davis
@ 2020-12-08 22:57     ` Rich Felker
  2020-12-09  6:54       ` Alexander Monakov
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2020-12-08 22:57 UTC (permalink / raw)
  To: Brooks Davis; +Cc: Brooks Davis, musl

On Tue, Dec 08, 2020 at 10:44:54PM +0000, Brooks Davis wrote:
> On Tue, Dec 08, 2020 at 02:53:28PM -0500, Rich Felker wrote:
> > On Tue, Dec 08, 2020 at 07:39:19PM +0000, Brooks Davis wrote:
> > > The strstr implementation contains the following snippet which results
> > > in out-of-bounds reads in memchr (we detect them on CHERI because we
> > > have byte-granularity bounds of small buffers):
> > > 
> > > 	/* Fast estimate for MIN(l,63) */
> > > 	size_t grow = l | 63;
> > > 	const unsigned char *z2 = memchr(z, 0, grow);
> > > 
> > > The use of `|` means this is very much not an approximation of
> > > `MIN(l,63)`.  What is actually intended here?  For CheriBSD (via FreeBSD)
> > > I need a way to avoid out-of-bounds reads entirely (`MIN(l,63)` does seem
> > > to work in simple system-level testing, but given the mismatch it's
> > > unclear that's what was intended).
> > 
> > There is no OOB read in strstr here. The overread is in the
> > implementation of memchr, which (if you're using the musl version) is
> > relying on the ability to overread as long as the address is aligned
> > (assuming protection is at boundaries larger than word size). If this
> > is a problem, you should use a version of memchr that does not
> > overread. Note that, per 7.24.5.1 The memchr function, ??2:
> > 
> >     "The implementation shall behave as if it reads the characters
> >     sequentially and stops as soon as a matching character is found."
> > 
> > so strstr's use of memchr here is perfectly correct/valid without any
> > assumption of implementation details, i.e. [at least this part of]
> > strstr is portable C.
> 
> Ok, I see that. I'll adjust our (musl-derived) memchr to work
> correctly in this case.
> 
> That being said. I'm still confused by the comment in strstr. `l | 63`
> is closer to `MAX(l,63)` than `MIN(l,63)`.

Yes, the comment is wrong. The point is just to scan at least l bytes
forward for the end of the haystack (since we'll need that many
immediately) and at least some decent minimum to avoid doing it over
and over if the needle is short. But there's no need for it to be
precise.

Rich

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] out-of-bounds reads in strstr
  2020-12-08 22:57     ` Rich Felker
@ 2020-12-09  6:54       ` Alexander Monakov
  2020-12-09 16:37         ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Monakov @ 2020-12-09  6:54 UTC (permalink / raw)
  To: musl; +Cc: Brooks Davis, Brooks Davis

On Tue, 8 Dec 2020, Rich Felker wrote:

> > That being said. I'm still confused by the comment in strstr. `l | 63`
> > is closer to `MAX(l,63)` than `MIN(l,63)`.
> 
> Yes, the comment is wrong. The point is just to scan at least l bytes
> forward for the end of the haystack (since we'll need that many
> immediately) and at least some decent minimum to avoid doing it over
> and over if the needle is short. But there's no need for it to be
> precise.

It's not the first time this comes up. I suspect you'd save more time
correcting the misleading comment instead of responding to each inquiry
individually.

Alexander

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] out-of-bounds reads in strstr
  2020-12-09  6:54       ` Alexander Monakov
@ 2020-12-09 16:37         ` Rich Felker
  2020-12-24 22:26           ` Fangrui Song
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2020-12-09 16:37 UTC (permalink / raw)
  To: musl

On Wed, Dec 09, 2020 at 09:54:51AM +0300, Alexander Monakov wrote:
> On Tue, 8 Dec 2020, Rich Felker wrote:
> 
> > > That being said. I'm still confused by the comment in strstr. `l | 63`
> > > is closer to `MAX(l,63)` than `MIN(l,63)`.
> > 
> > Yes, the comment is wrong. The point is just to scan at least l bytes
> > forward for the end of the haystack (since we'll need that many
> > immediately) and at least some decent minimum to avoid doing it over
> > and over if the needle is short. But there's no need for it to be
> > precise.
> 
> It's not the first time this comes up. I suspect you'd save more time
> correcting the misleading comment instead of responding to each inquiry
> individually.

Thanks for poking me again about this. Will do.

Rich

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] out-of-bounds reads in strstr
  2020-12-09 16:37         ` Rich Felker
@ 2020-12-24 22:26           ` Fangrui Song
  0 siblings, 0 replies; 7+ messages in thread
From: Fangrui Song @ 2020-12-24 22:26 UTC (permalink / raw)
  To: musl

On Wed, Dec 9, 2020 at 8:37 AM Rich Felker <dalias@libc.org> wrote:
>
> On Wed, Dec 09, 2020 at 09:54:51AM +0300, Alexander Monakov wrote:
> > On Tue, 8 Dec 2020, Rich Felker wrote:
> >
> > > > That being said. I'm still confused by the comment in strstr. `l | 63`
> > > > is closer to `MAX(l,63)` than `MIN(l,63)`.
> > >
> > > Yes, the comment is wrong. The point is just to scan at least l bytes
> > > forward for the end of the haystack (since we'll need that many
> > > immediately) and at least some decent minimum to avoid doing it over
> > > and over if the needle is short. But there's no need for it to be
> > > precise.
> >
> > It's not the first time this comes up. I suspect you'd save more time
> > correcting the misleading comment instead of responding to each inquiry
> > individually.
>
> Thanks for poking me again about this. Will do.
>
> Rich

wcsstr.c has a similar typo.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-12-24 22:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-08 19:39 [musl] out-of-bounds reads in strstr Brooks Davis
2020-12-08 19:53 ` Rich Felker
2020-12-08 22:44   ` Brooks Davis
2020-12-08 22:57     ` Rich Felker
2020-12-09  6:54       ` Alexander Monakov
2020-12-09 16:37         ` Rich Felker
2020-12-24 22:26           ` Fangrui Song

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).