mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Andy Caldwell <andycaldwell@microsoft.com>
To: "musl@lists.openwall.com" <musl@lists.openwall.com>
Subject: RE: [musl] RE: [EXTERNAL] Re: [musl] [PATCH] fix avoidable segfault in catclose
Date: Fri, 26 Jan 2024 20:16:58 +0000	[thread overview]
Message-ID: <AS4PR83MB0546C502234F33A9EB87CA3DCB792@AS4PR83MB0546.EURPRD83.prod.outlook.com> (raw)
In-Reply-To: <20240126195701.GO4163@brightrain.aerifal.cx>

> > > > > > > And it has been musl policy to crash on invalid args since the
> beginning.
> > > > > >
> > > > > > The current implementation doesn't (necessarily) crash/trap on
> > > > > > an invalid argument, instead it invokes (C-language
> > > > > > spec-defined) UB itself (it dereferences
> > > > > > `(uint32_t*)((char*)cat) + 8)`, which, in the case of the `-1`
> > > > > > handle is the address 0x7, which in turn, not being a valid
> > > > > > address, is UB to dereference). If you're lucky (or are
> > > > > > compiling without optimizations/inlining) the compiler will
> > > > > > emit a MOV that will trigger an access violation and hence a
> > > > > > SEGV, if
> > > > >
> > > > > In general, it's impossible to test for "is this pointer valid?"
> > > > >
> > > > > There are certain special cases we could test for, but unless
> > > > > there is a particularly convincing reason that they could lead
> > > > > to runaway wrong execution/vulnerabilities prior to naturally
> > > > > trapping, we have not considered littering the code with these
> > > > > kinds of checks to be a
> > > worthwhile trade-off.
> > > > >
> > > > > > you're unlucky the compiler will make wild assumptions about
> > > > > > the value of the variable passed as the arg (and for example
> > > > > > in your first code snippet, simply delete the `if` statement,
> > > > > > meaning `use_cat` gets called even when `catopen` fails
> > > > > > potentially corrupting user data/state).
> > > > >
> > > > > I have no idea what you're talking about there. The compiler
> > > > > cannot make that kind of transformation (lifting code that could
> > > > > produce undefined behavior, side effects, etc. out of a conditional).
> > > >
> > > > It's a hypothetical, but something like the following is valid for
> > > > the compiler to
> > > do:
> > > >
> > > > * inline the catclose (e.g. in LTO for a static link)
> > > > * consider the `if` statement and ask "what if `cat` is `-1`
> > > > * look forward to the pointer dereference (confirming that `cat`
> > > > can't change in the interim)
> > > > * realise that `0x7` is not a valid pointer on the target platform
> > > > so UB is inevitable if `cat` is `-1`
> > > > * optimize out the comparison since UB frees the compiler of any
> > > > responsibilities
> > >
> > > You have the logic backwards. In the case where cat==(cat_t)-1,
> > > catclose is not called on the abstract machine, so no conclusions
> > > can be drawn from anything catclose would do.
> >
> > The original code I was working from was:
> >
> > ```
> > nl_catd cat = catopen(...);
> > if (cat != (nl_catd)-1) {
> >     use_cat(cat);
> > }
> > catclose(cat);
> > ```
> >
> > (i.e. an incorrect use of the APIs, but not UB in a "C99 spec"
> > sense). In that code the `catclose` call is provably inevitable,
> > allowing the compiler to infer properties of `cat` from it.
>
> Ah, okay, at least now that makes sense. But indeed it is undefined:
>
>   "Each of the following statements shall apply to all functions
>    unless explicitly stated otherwise in the detailed descriptions
>    that follow:
>
>    1. If an argument to a function has an invalid value (such as a
>       value outside the domain of the function, or a pointer outside
>       the address space of the program, or a null pointer), the
>       behavior is undefined.
>
>    ..."
>
> https://pubs.ope/
> ngroup.org%2Fonlinepubs%2F9699919799%2Ffunctions%2FV2_chap02.html%23t
> ag_15_01&data=05%7C02%7Candycaldwell%40microsoft.com%7C7cfdfca2ce32
> 4149a5cd08dc1ea8f669%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7
> C638418958276620853%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMD
> AiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata
> =0lEpsKknyTB3xULpxaOBZS6UIzI1rCG4D86U9z6P5Xc%3D&reserved=0
>
> So I guess what you're saying is that, in the case where an erroneous program like
> the above has undefined behavior, the compiler could make a transformation such
> that the effect of the UB is seen at a point different from where it logically occurs.
> (This is the norm for UB.) In particular, despite cat being -1 from a failed catopen,
> you might see use_cat being called with a seemingly impossible argument.

Yes, this - the details aren't particularly interesting but the key is that "invoke UB" is not the same as "crash/trap".  I'm also contrasting this to the comments in the glibc wiki and Markus's synopsis (from the earlier email) that "it has been musl policy to crash on invalid args since the beginning" - in the face of UB, musl (and presumably also glibc) _doesn't_ crash/trap, nor does it "fail early and catastrophically" it instead "propagates the UB".  In debug builds these are often equivalent, but the specific path to UB might not be seen in a debug build, and only be seen in production where the non-locality of UB effects are at their worst.

> Exacerbating the degree to which UB can become non-localized is one of the
> expected effects of LTO, and arguably a good reason not to use LTO for debugging.
> I don't see a lot of value in trying to prevent this in isolated cases when it's going
> to happen all over the place anyway for other reasons.

I think that's reasonable, but it disagrees with the "policy" as described in the glibc wiki/your SO post/etc. (maybe I'm being pedantic, but I read "it shall fail early and catastrophically" as "it will definitely fail, and in a useful way, even in optimized builds" which isn't always the case and often can't be).  Maybe I'm simply suggesting that that position needs some clarification/caveating?

  reply	other threads:[~2024-01-26 20:17 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-25  7:09 Ismael Luceno
2024-01-25 14:05 ` Rich Felker
2024-01-25 15:28   ` Ismael Luceno
2024-01-25 15:56     ` Rich Felker
2024-01-25 14:11 ` Markus Wichmann
2024-01-25 15:30   ` Ismael Luceno
2024-01-25 20:10   ` [musl] RE: [EXTERNAL] " Andy Caldwell
2024-01-25 21:25     ` Rich Felker
2024-01-26 17:13       ` Andy Caldwell
2024-01-26 17:27         ` Rich Felker
2024-01-26 19:12           ` Andy Caldwell
2024-01-26 19:57             ` Rich Felker
2024-01-26 20:16               ` Andy Caldwell [this message]
2024-01-27 11:04                 ` Szabolcs Nagy
2024-01-27 12:58                   ` Alexander Monakov
2024-01-27 14:56                     ` Rich Felker
2024-01-27 19:20                       ` Szabolcs Nagy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AS4PR83MB0546C502234F33A9EB87CA3DCB792@AS4PR83MB0546.EURPRD83.prod.outlook.com \
    --to=andycaldwell@microsoft.com \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).