From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <musl-return-20308-ml=inbox.vuxu.org@lists.openwall.com>
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level: 
X-Spam-Status: No, score=-3.1 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS,
	MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3,
	RCVD_IN_MSPIKE_WL,T_SCC_BODY_TEXT_LINE autolearn=ham
	autolearn_force=no version=3.4.4
Received: from second.openwall.net (second.openwall.net [193.110.157.125])
	by inbox.vuxu.org (Postfix) with SMTP id 1360322B65
	for <ml@inbox.vuxu.org>; Fri, 26 Jan 2024 20:56:59 +0100 (CET)
Received: (qmail 24140 invoked by uid 550); 26 Jan 2024 19:54:43 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Reply-To: musl@lists.openwall.com
Received: (qmail 24093 invoked from network); 26 Jan 2024 19:54:43 -0000
Date: Fri, 26 Jan 2024 14:57:01 -0500
From: Rich Felker <dalias@libc.org>
To: Andy Caldwell <andycaldwell@microsoft.com>
Cc: "musl@lists.openwall.com" <musl@lists.openwall.com>
Message-ID: <20240126195701.GO4163@brightrain.aerifal.cx>
References: <20240125070950.28673-1-ismael@iodev.co.uk>
 <ZbJsGMwsqlU_DY8g@voyager>
 <AS4PR83MB05462D2FE519A65787A596DACB7A2@AS4PR83MB0546.EURPRD83.prod.outlook.com>
 <20240125212548.GL4163@brightrain.aerifal.cx>
 <AS4PR83MB0546C4287E4459E2C0EFDD2BCB792@AS4PR83MB0546.EURPRD83.prod.outlook.com>
 <20240126172716.GN4163@brightrain.aerifal.cx>
 <AS4PR83MB05466FBFDA5F0126CC2E3B43CB792@AS4PR83MB0546.EURPRD83.prod.outlook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <AS4PR83MB05466FBFDA5F0126CC2E3B43CB792@AS4PR83MB0546.EURPRD83.prod.outlook.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [musl] RE: [EXTERNAL] Re: [musl] [PATCH] fix avoidable segfault
 in catclose

On Fri, Jan 26, 2024 at 07:12:59PM +0000, Andy Caldwell wrote:
> > > > > > And it has been musl policy to crash on invalid args since the beginning.
> > > > >
> > > > > The current implementation doesn't (necessarily) crash/trap on an
> > > > > invalid argument, instead it invokes (C-language spec-defined) UB
> > > > > > itself (it dereferences `(uint32_t*)((char*)cat + 8)`, which, in
> > > > > the case of the `-1` handle is the address 0x7, which in turn, not
> > > > > being a valid address, is UB to dereference). If you're lucky (or
> > > > > are compiling without optimizations/inlining) the compiler will
> > > > > emit a MOV that will trigger an access violation and hence a SEGV,
> > > > > if
> > > >
> > > > In general, it's impossible to test for "is this pointer valid?"
> > > >
> > > > There are certain special cases we could test for, but unless there
> > > > is a particularly convincing reason that they could lead to runaway
> > > > wrong execution/vulnerabilities prior to naturally trapping, we have
> > > > not considered littering the code with these kinds of checks to be a
> > > > worthwhile trade-off.
> > > >
> > > > > you're unlucky the compiler will make wild assumptions about the
> > > > > value of the variable passed as the arg (and for example in your
> > > > > first code snippet, simply delete the `if` statement, meaning
> > > > > `use_cat` gets called even when `catopen` fails potentially
> > > > > corrupting user data/state).
> > > >
> > > > I have no idea what you're talking about there. The compiler cannot
> > > > make that kind of transformation (lifting code that could produce
> > > > undefined behavior, side effects, etc. out of a conditional).
> > >
> > > It's a hypothetical, but something like the following is valid for the
> > > compiler to do:
> > >
> > > * inline the catclose (e.g. in LTO for a static link)
> > > * consider the `if` statement and ask "what if `cat` is `-1`?"
> > > * look forward to the pointer dereference (confirming that `cat` can't
> > > change in the interim)
> > > * realise that `0x7` is not a valid pointer on the target platform so
> > > UB is inevitable if `cat` is `-1`
> > > * optimize out the comparison since UB frees the compiler of any
> > > responsibilities
> > 
> > You have the logic backwards. In the case where cat==(nl_catd)-1, catclose is
> > not called on the abstract machine, so no conclusions can be drawn from
> > anything catclose would do.
> 
> The original code I was working from was:
> 
> ```
> nl_catd cat = catopen(...);
> if (cat != (nl_catd)-1) {
>     use_cat(cat);
> }
> catclose(cat);
> ```
> 
> (i.e. an incorrect use of the APIs, but not UB in a "C99 spec"
> sense). In that code the `catclose` call is provably inevitable,
> allowing the compiler to infer properties of `cat` from it.

Ah, okay, at least now that makes sense. But that code does have
undefined behavior; POSIX says so explicitly:

  "Each of the following statements shall apply to all functions
   unless explicitly stated otherwise in the detailed descriptions
   that follow:

   1. If an argument to a function has an invalid value (such as a
      value outside the domain of the function, or a pointer outside
      the address space of the program, or a null pointer), the
      behavior is undefined.

   ..."

https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_01

So I guess what you're saying is that, in the case where an erroneous
program like the above has undefined behavior, the compiler could make
a transformation such that the effect of the UB is seen at a point
different from where it logically occurs. (This is the norm for UB.)
In particular, despite cat being -1 from a failed catopen, you might
see use_cat being called with a seemingly impossible argument.
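
As a rough sketch of the shape of code where this can happen (all names
here are hypothetical stand-ins, not musl's actual catclose; the only
thing borrowed from the thread is the unchecked read at offset 8):

```c
#include <stdint.h>
#include <string.h>

static int uses;  /* counts calls to use_handle */

/* Hypothetical stand-in for use_cat. */
static void use_handle(const unsigned char *h)
{
    (void)h;
    uses++;
}

/* Hypothetical stand-in for catclose: reads a 32-bit field at offset 8
 * without validating the handle, so an invalid handle is UB here. */
static uint32_t close_handle(const unsigned char *h)
{
    uint32_t v;
    memcpy(&v, h + 8, sizeof v);
    return v;
}

int demo(const unsigned char *h)
{
    /* With close_handle inlined, a compiler is permitted to assume
     * h is never the failure value and drop this comparison. */
    if (h != (const unsigned char *)-1)
        use_handle(h);
    /* Reached on every path, so h == -1 always ends in UB. */
    (void)close_handle(h);
    return uses;
}
```

Whether a given compiler actually performs this transformation depends
on inlining and optimization settings; the point is only that it is
allowed to.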

Exacerbating the degree to which UB can become non-localized is one of
the expected effects of LTO, and arguably a good reason not to use LTO
for debugging. I don't see a lot of value in trying to prevent this in
isolated cases when it's going to happen all over the place anyway for
other reasons.
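
For completeness, the caller-side fix is small. A minimal sketch,
assuming a hypothetical use_cat that just prints one message and a
hypothetical wrapper name with_catalog, is to call catclose only on a
handle that catopen actually returned successfully:

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical helper: print message 1 of set 1, or a default string. */
static void use_cat(nl_catd cat)
{
    puts(catgets(cat, 1, 1, "default message"));
}

/* Returns 0 if the catalog was opened, used and closed; -1 otherwise. */
int with_catalog(const char *name)
{
    nl_catd cat = catopen(name, 0);
    if (cat == (nl_catd)-1)
        return -1;              /* failed open: nothing to close */
    use_cat(cat);
    return catclose(cat) == 0 ? 0 : -1;
}
```

With this structure the failure value never reaches catclose, so there
is nothing for the compiler to reason backwards from.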

Rich