Date: Fri, 26 Jan 2024 14:57:01 -0500
From: Rich Felker
To: Andy Caldwell
Cc: "musl@lists.openwall.com"
Message-ID: <20240126195701.GO4163@brightrain.aerifal.cx>
References: <20240125070950.28673-1-ismael@iodev.co.uk> <20240125212548.GL4163@brightrain.aerifal.cx> <20240126172716.GN4163@brightrain.aerifal.cx>
Subject: Re: [musl] RE: [EXTERNAL] Re: [musl] [PATCH] fix avoidable segfault in catclose

On Fri, Jan 26, 2024 at 07:12:59PM +0000, Andy Caldwell wrote:
> > > > > > And it has been musl policy to crash on invalid args since the beginning.
> > > > >
> > > > > The current implementation doesn't (necessarily) crash/trap on an
> > > > > invalid argument, instead it invokes (C-language spec-defined) UB
> > > > > itself (it dereferences `(uint32_t*)((char*)cat) + 8)`, which, in
> > > > > the case of the `-1` handle is the address 0x7, which in turn, not
> > > > > being a valid address, is UB to dereference).
> > > > > If you're lucky (or
> > > > > are compiling without optimizations/inlining) the compiler will
> > > > > emit a MOV that will trigger an access violation and hence a SEGV,
> > > > > if
> > > >
> > > > In general, it's impossible to test for "is this pointer valid?"
> > > >
> > > > There are certain special cases we could test for, but unless there
> > > > is a particularly convincing reason that they could lead to runaway
> > > > wrong execution/vulnerabilities prior to naturally trapping, we have
> > > > not considered littering the code with these kinds of checks to be a
> > > > worthwhile trade-off.
> > > >
> > > > > you're unlucky the compiler will make wild assumptions about the
> > > > > value of the variable passed as the arg (and for example in your
> > > > > first code snippet, simply delete the `if` statement, meaning
> > > > > `use_cat` gets called even when `catopen` fails potentially
> > > > > corrupting user data/state).
> > > >
> > > > I have no idea what you're talking about there. The compiler cannot
> > > > make that kind of transformation (lifting code that could produce
> > > > undefined behavior, side effects, etc. out of a conditional).
> > >
> > > It's a hypothetical, but something like the following is valid for the
> > > compiler to do:
> > >
> > > * inline the catclose (e.g. in LTO for a static link)
> > > * consider the `if` statement and ask "what if `cat` is `-1`
> > > * look forward to the pointer dereference (confirming that `cat` can't
> > >   change in the interim)
> > > * realise that `0x7` is not a valid pointer on the target platform so
> > >   UB is inevitable if `cat` is `-1`
> > > * optimize out the comparison since UB frees the compiler of any
> > >   responsibilities
> >
> > You have the logic backwards. In the case where cat==(cat_t)-1, catclose is not
> > called on the abstract machine, so no conclusions can be drawn from anything
> > catclose would do.
>
> The original code I was working from was:
>
> ```
> nl_catd cat = catopen(...);
> if (cat != (nl_catd)-1) {
>   use_cat(cat);
> }
> catclose(cat);
> ```
>
> (i.e. an incorrect use of the APIs, but not UB in a "C99 spec"
> sense). In that code the `catclose` call is provably inevitable,
> allowing the compiler to infer properties of `cat` from it.

Ah, okay, at least now that makes sense. But indeed it is undefined:

    "Each of the following statements shall apply to all functions
    unless explicitly stated otherwise in the detailed descriptions
    that follow:

    1. If an argument to a function has an invalid value (such as a
       value outside the domain of the function, or a pointer outside
       the address space of the program, or a null pointer), the
       behavior is undefined.

    ..."

https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_01

So I guess what you're saying is that, in the case where an erroneous
program like the above has undefined behavior, the compiler could make
a transformation such that the effect of the UB is seen at a point
different from where it logically occurs. (This is the norm for UB.)
In particular, despite cat being -1 from a failed catopen, you might
see use_cat being called with a seemingly impossible argument.

Exacerbating the degree to which UB can become non-localized is one of
the expected effects of LTO, and arguably a good reason not to use LTO
for debugging. I don't see a lot of value in trying to prevent this in
isolated cases when it's going to happen all over the place anyway for
other reasons.

Rich
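
For comparison with the snippet quoted above, here is a minimal sketch of a
calling pattern that stays within the defined behavior of the interface:
catclose is reached only on the path where catopen succeeded, so the invalid
(nl_catd)-1 value is never passed to it. The helper name print_greeting, the
set and message ids, and the fallback string are hypothetical and chosen only
for illustration; they do not come from the thread.

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread): prints one catalog message.
 * Returns -1 if the catalog could not be opened, otherwise the result
 * of catclose(). */
static int print_greeting(const char *catalog_name)
{
    nl_catd cat = catopen(catalog_name, 0);
    if (cat == (nl_catd)-1) {
        /* Open failed: there is no valid descriptor, so neither
         * catgets() nor catclose() may be called with it. */
        return -1;
    }

    /* Set 1 / message 1 are placeholder ids; catgets() returns the
     * fallback string when the message is not found. */
    puts(catgets(cat, 1, 1, "hello"));

    /* Only a descriptor returned by a successful catopen() is closed. */
    return catclose(cat);
}
```

Compared with the quoted snippet, the only structural change is that the
catclose call moves inside the success path, so no execution of the abstract
machine passes an invalid descriptor to it.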