From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: C Annex K safe C functions
Date: Mon, 4 Mar 2019 10:14:55 -0500
Message-ID: <20190304151455.GN23599@brightrain.aerifal.cx>
References: <20190227105050.GE21289@port70.net>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Mon, Mar 04, 2019 at 11:24:08AM +0700, Jonny Grant wrote:
> Hi!
>
> On 27/02/2019 17:50, Szabolcs Nagy wrote:
> >* Jonny Grant [2019-02-27 10:30:52 +0700]:
> >>Not on the list, so please cc me in replies.
> >>Any plans to support Annex K?
> >>Those safe functions are great, strncpy_s etc
> >
> >i wonder why you think they are great,
> >if they are advertised anywhere as safe or
> >useful then that should be fixed.
> >
> >annex k is so incredibly broken and bad
> >that there is a wg14 paper about it
> >
> >http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm
> >
> >normally it's ok to add nonsense interfaces
> >for compatibility, but in this case there is
> >no widespread use and the api depends on global
> >state that causes implementation issues even
> >if we wanted to implement it.
>
> Thanks for your reply!
>
> Well I wouldn't disagree with experts. I should re-read that review though.
>
> However, I was not aware that these APIs have global state?
> (memset_s, memcpy_s, memmove_s, strcpy_s, strncpy_s, strcat_s,
> strncat_s, strtok_s, memset_s, strerror_s, strerrorlen_s, strnlen_s)
> - do they?

Yes. The action they take when a "runtime constraint violation"
happens is determined by global state. See K.3.6.1.1 The
set_constraint_handler_s function.

> strncpy_s is great, it avoids the bug in strncpy that could cause
> the buffer to not be terminated. It's better than the strlcpy BSD
> uses which truncates buffers.

This is not a bug in strncpy. It's strncpy's purpose being completely
different from what people are wrongly using it for. The strncpy
function is for working with fixed-size, null-padded data fields,
which are not C strings, and which went out of style in the 80s -- the
sort of things often cited in the Y2K problem fiasco. For the most
part, there is no modern use for strncpy. Naming a function with a
completely different purpose after it (strncpy_s) and implying it's a
"secure version of strncpy" contributes to this misconception that's
the whole reason people are wrongly using strncpy in the first place.

> BSD/OS X supports memset_s etc, but does not support
> set_constraint_handler_s

Most BSDs also support explicit_bzero, and musl does too. Providing a
function with the same name as one that's specified by some standard
we don't intend to support, but without fully compatible semantics, is
something musl generally tries to avoid.
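
To make the strncpy point above concrete, here is a minimal sketch of
the kind of fixed-size, null-padded field strncpy is suited to. The
record layout and every name in it (struct record, NAME_LEN, set_name,
get_name) are made up for illustration; none of them come from musl or
from any real file format.

```c
#include <string.h>

#define NAME_LEN 8  /* hypothetical fixed field width */

/* Hypothetical record: name is a fixed-size, null-padded field, not a
 * C string. When all NAME_LEN bytes are in use there is no terminator. */
struct record {
    char name[NAME_LEN];
};

/* strncpy copies at most NAME_LEN bytes from s and fills whatever is
 * left of the field with null bytes. Longer input is silently cut off
 * and the field ends up unterminated, which is fine for a field but
 * wrong for a C string. */
void set_name(struct record *r, const char *s)
{
    strncpy(r->name, s, NAME_LEN);
}

/* Turn the field back into a C string. The caller must supply a
 * buffer of NAME_LEN + 1 bytes so that a terminator always fits. */
void get_name(const struct record *r, char out[NAME_LEN + 1])
{
    memcpy(out, r->name, NAME_LEN);
    out[NAME_LEN] = '\0';
}
```

With this sketch, set_name(&r, "abc") leaves "abc" followed by five
null bytes in the field, while set_name(&r, "abcdefgh") fills all eight
bytes and leaves no terminator. That is why the field only becomes a C
string again through something like get_name, and why using strncpy as
a bounded string copy goes wrong.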

> If issues, I'd support amending Annex K, rather than removing. It's
> good they check for NULL/nullptr, they return errno_t directly
> instead of the errno global kludge. Sticking with old APIs forever
> is difficult, but no one uses creat() anymore either.

Returning an error code is an okay choice when it works, but if you
try to apply it everywhere, it leads to severe bugs. The classic
example is allocator functions which don't return void* but instead
take a void** argument into which to store the result. Invariably
people call them with (void**)&ptr, where ptr does not have type
void*, and this is undefined behavior. In a worst case, on an
implementation where different pointer types have different
representations and sizes, it would be a buffer overflow.

This is kinda tangential to the Annex K issue, but the point is that
trying to push for a uniform "everything returns an error code"
convention introduces concrete harm for the sake of somebody's
preferred style, and as such is a really bad idea. A better convention
if you want to avoid errno is taking an int *errcode argument, but the
arguments against errno are generally pretty weak. Most of the time,
errno is the most efficient solution to the problem, in terms of
simplifying readability and flow of code.

> Could I ask, does your libc follow POSIX spec to the letter? eg not
> checking pointers for NULL (where spec omits to mention checking
> pointers valid) ? eg this call which crashes glibc?
>
> puts(NULL);
>
> It looks like it will still SIGSEGV...
>
> https://git.musl-libc.org/cgit/musl/tree/src/stdio/puts.c

There is fundamentally no way to check a pointer for validity without
false negatives or false positives unless you have a fully memory-safe
C implementation with a runtime which tracks memory in much more
detail than even tools like ASan or valgrind do.

Of course it's possible to check for a null pointer as one special
case, but then the question is what you want to do with it.
Unless you have a function that's specified to treat null pointers
specially somehow (e.g. strtol doesn't store an end position if endptr
is null), the caller has invoked undefined behavior.

One commonly-requested behavior is to return an error. I've written
many times on why this is a bad choice, and wording based on my views
on it has even been integrated into the glibc wiki on the topic:

https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers

"If you return an error code to a caller which has already proven
itself buggy, the most likely result is that the caller will ignore
the error, and bad things will happen much later down the line when
the original cause of the error has become difficult or impossible to
track down. Why is it reasonable to assume the caller will ignore the
error you return? Because the caller already ignored the error return
of malloc or fopen or some other library-specific allocation function
which returned NULL to indicate an error."

Generally, musl adopts a hardening approach of aiming to crash
(terminate with a fatal signal) as early as possible, and as directly
as possible, when undefined behavior has been detected. In most cases,
just dereferencing the potentially-invalid pointer is sufficient to
achieve this without explicit code for it. In other cases, where UB
can be detected without expensive (slow, complex, or error-prone)
runtime tests, musl has an explicit call to a_crash() for this. Some
examples are pthread_join (attempting to join a detached thread), free
(double free or heap corruption), and asctime (date string doesn't fit
in the buffer).

Rich