mailing list of musl libc
* C Annex K safe C functions
@ 2019-02-27  3:30 Jonny Grant
  2019-02-27 10:50 ` Szabolcs Nagy
  2019-02-27 13:11 ` Rich Felker
  0 siblings, 2 replies; 9+ messages in thread
From: Jonny Grant @ 2019-02-27  3:30 UTC
  To: musl

Hello
Not on the list, so please cc me in replies.
Any plans to support Annex K?
Those safe functions are great, strncpy_s etc
Jonny


* Re: C Annex K safe C functions
  2019-02-27  3:30 C Annex K safe C functions Jonny Grant
@ 2019-02-27 10:50 ` Szabolcs Nagy
  2019-03-04  4:24   ` Jonny Grant
  2019-02-27 13:11 ` Rich Felker
  1 sibling, 1 reply; 9+ messages in thread
From: Szabolcs Nagy @ 2019-02-27 10:50 UTC
  To: musl; +Cc: Jonny Grant

* Jonny Grant <jg@jguk.org> [2019-02-27 10:30:52 +0700]:
> Not on the list, so please cc me in replies.
> Any plans to support Annex K?
> Those safe functions are great, strncpy_s etc

i wonder why you think they are great,
if they are advertised anywhere as safe or
useful then that should be fixed.

annex k is so incredibly broken and bad
that there is a wg14 paper about it

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm

normally it's ok to add nonsense interfaces
for compatibility, but in this case there is
no widespread use and the api depends on global
state that causes implementation issues even
if we wanted to implement it.



* Re: C Annex K safe C functions
  2019-02-27  3:30 C Annex K safe C functions Jonny Grant
  2019-02-27 10:50 ` Szabolcs Nagy
@ 2019-02-27 13:11 ` Rich Felker
  2019-02-27 16:34   ` Rich Felker
  1 sibling, 1 reply; 9+ messages in thread
From: Rich Felker @ 2019-02-27 13:11 UTC
  To: musl

On Wed, Feb 27, 2019 at 10:30:52AM +0700, Jonny Grant wrote:
> Hello
> Not on the list, so please cc me in replies.
> Any plans to support Annex K?
> Those safe functions are great, strncpy_s etc

No, and there's been considerable amounts of stuff written on this in
the past. I'll try to give you a good summary later. The reply so far
by Szabolcs Nagy is a good start.

Rich



* Re: C Annex K safe C functions
  2019-02-27 13:11 ` Rich Felker
@ 2019-02-27 16:34   ` Rich Felker
  2019-03-04  8:09     ` Florian Weimer
  0 siblings, 1 reply; 9+ messages in thread
From: Rich Felker @ 2019-02-27 16:34 UTC
  To: musl

On Wed, Feb 27, 2019 at 08:11:53AM -0500, Rich Felker wrote:
> On Wed, Feb 27, 2019 at 10:30:52AM +0700, Jonny Grant wrote:
> > Hello
> > Not on the list, so please cc me in replies.
> > Any plans to support Annex K?
> > Those safe functions are great, strncpy_s etc
> 
> No, and there's been considerable amounts of stuff written on this in
> the past. I'll try to give you a good summary later. The reply so far
> by Szabolcs Nagy is a good start.

Here's what comes to mind about the technical and social reasons
behind not adopting Annex K:

1. The Annex K interfaces are clearly inspired by (even named
identically to) the corresponding *_s interfaces provided by MSVC.
However, the MSVC functions don't actually conform to the
specification in Annex K, due to some subtle breakage in types (IIRC
the rsize_t stuff, possibly other things too). Thus there are actually
two slightly incompatible sets of functions with the same names. This
is generally a strong criterion for exclusion from musl, unless one of
the standards we aim to support mandates the presence of the functions
with this problem (like strerror_r).

2. The handling of "runtime constraint violation" and the ability to
customize the handler for it is highly problematic. Code using the
Annex K functions can neither rely on a particular default handling of
these errors, nor can it set the handling it wants unless it's global
initialization code for the whole application, since the runtime
constraint handler is global state that can't be set in a thread-safe
or library-safe manner. Thus, to use these functions safely, you must
both check for the erroneous conditions before making the call (since,
for one choice of handler, the call could abort your program if it
detects them), and you must check for an error return from the
function (since, for a different choice of handler, it might just
return an error, and if you don't check for that yourself, your
program could continue running in an unsafe state).

3. In order to provide any safety over the standard functions they
"replace", the Annex K functions rely on the caller to provide
accurate information about the sizes of buffers, etc. in the
additional function arguments. There's no good reason to believe that
the programmer will get these arguments right when they consistently
fail to get the arguments to the standard functions right. For
example, there's no reason to believe programmers won't just call
memcpy_s(dest, n, src, n) instead of going back to some better
authoritative source for the size of the dest buffer. In many (most?)
cases, n is already computed just as the size of the dest object, and
nothing is gained. These kinds of problems are solved *much* better by
techniques that allow the compiler or runtime to track the size of
objects and automatically insert checks, which aren't subject to human
error or misuse. The _FORTIFY_SOURCE feature, sanitizers, etc. all
provide much better protection than Annex K functions in this area.

4. The Annex K *printf_s specifications spread FUD about %n being
dangerous, ignoring that this is a poor stand-in for the general
unsafety of *passing a literal string where a format string is
expected* and for use of variable format strings. Without %n, the
impact of such bugs is somewhat reduced, but it's still possible to
get Heartbleed-like info leak exploits from them. A real hardened
version of printf-family functions would take a list of argument types
and match the format string against it, and this list would be
generated automatically by the tooling (à la _FORTIFY_SOURCE).

5. Expanding on the topic of FUD/misinformation, both the introduction
of the original *_s functions, and lobbying for their inclusion in the
standard (which eventually reached the compromise of just putting them
in an Annex), was not about improving the C language or making useful
tools for programmers, but about introducing incompatibility and
fragmentation to the language/standard with the goal of undermining
it. The company that introduced it produces a product that is not
compatible with the C language as specified and does not even aim to
be, but aims to give the impression of being a C implementation (it's
mainly a C++ implementation, though likely not conforming to that
standard either). This is part of a long history, including wrong
wchar_t handling, inverting the meaning of %s and %ls with the wide
printf functions, etc. etc. etc. See also point #1 above about
incompatibility of the Annex K functions themselves. It's my position,
and I believe it's shared by many others in the musl community and C
language communities, that parties not interested in implementing or
using the standard should not try to influence its direction, and that
this kind of behavior should not be rewarded by playing along with it,
but that it should be shunned as long as doing so is practical.
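
Points 3 and 4 above can be sketched in a few lines of standard C. This
is only an illustration: toy_would_reject, toy_memcpy_s and
format_untrusted are hypothetical helpers written for this sketch, not
Annex K or libc functions, and the real memcpy_s has further
requirements that are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical memcpy_s-like size check; not a real libc function. */
int toy_would_reject(size_t destsz, size_t n)
{
    return n > destsz;
}

int toy_memcpy_s(void *dest, size_t destsz, const void *src, size_t n)
{
    if (!dest || !src || toy_would_reject(destsz, n))
        return 1;
    memcpy(dest, src, n);
    return 0;
}

/* Hypothetical helper: treats untrusted text as data, never as a format. */
int format_untrusted(char *out, size_t outsz, const char *untrusted)
{
    /* The bug would be snprintf(out, outsz, untrusted): even without %n,
     * conversions such as %x or %s would read arguments never passed. */
    return snprintf(out, outsz, "%s", untrusted);
}

void demo(void)
{
    char dest[8];
    char src[16] = "fifteen bytes!!";
    size_t n = sizeof src;

    /* Point 3: the check only helps when destsz comes from the real object. */
    assert(toy_memcpy_s(dest, sizeof dest, src, n) == 1);
    /* With (dest, n, src, n) the same check can never fire. */
    assert(toy_would_reject(n, n) == 0);

    /* Point 4: conversion specifications in the input come out as text. */
    char out[32];
    assert(format_untrusted(out, sizeof out, "%x %x %s") == 8);
    assert(strcmp(out, "%x %x %s") == 0);
}
```

The second assertion is the whole of point 3: once the caller passes the
same n twice, the extra argument carries no information.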

Rich



* Re: C Annex K safe C functions
  2019-02-27 10:50 ` Szabolcs Nagy
@ 2019-03-04  4:24   ` Jonny Grant
  2019-03-04  6:40     ` A. Wilcox
  2019-03-04 15:14     ` Rich Felker
  0 siblings, 2 replies; 9+ messages in thread
From: Jonny Grant @ 2019-03-04  4:24 UTC
  To: musl

Hi!

On 27/02/2019 17:50, Szabolcs Nagy wrote:
> * Jonny Grant <jg@jguk.org> [2019-02-27 10:30:52 +0700]:
>> Not on the list, so please cc me in replies.
>> Any plans to support Annex K?
>> Those safe functions are great, strncpy_s etc
> 
> i wonder why you think they are great,
> if they are advertised anywhere as safe or
> useful then that should be fixed.
> 
> annex k is so incredibly broken and bad
> that there is a wg14 paper about it
> 
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm
> 
> normally it's ok to add nonsense interfaces
> for compatibility, but in this case there is
> no widespread use and the api depends on global
> state that causes implementation issues even
> if we wanted to implement it.

Thanks for your reply!

Well I wouldn't disagree with experts. I should re-read that review though.

However, I was not aware that these APIs have global state? (memset_s, 
memcpy_s, memmove_s, strcpy_s, strncpy_s, strcat_s, strncat_s, strtok_s, 
strerror_s, strerrorlen_s, strnlen_s) - do they?

strncpy_s is great, it avoids the bug in strncpy that could cause the 
buffer to not be terminated. It's better than the strlcpy BSD uses which 
truncates buffers.

BSD/OS X supports memset_s etc, but does not support 
set_constraint_handler_s

https://opensource.apple.com/source/Libc/Libc-1244.1.7/string/NetBSD/memset_s.c.auto.html

FreeBSD seems to support memset_s
https://www.freebsd.org/cgi/man.cgi?query=memset&sektion=3&apropos=0&manpath=freebsd

Oracle Solaris supports Annex K
https://docs.oracle.com/cd/E88353_01/html/E37843/strncpy-s-3c.html#scrolltoc


If there are issues, I'd support amending Annex K rather than removing it. It's good 
they check for NULL/nullptr, they return errno_t directly instead of the 
errno global kludge. Sticking with old APIs forever is difficult, but no 
one uses creat() anymore either.

Could I ask, does your libc follow POSIX spec to the letter? eg not 
checking pointers for NULL (where spec omits to mention checking 
pointers valid) ? eg this call which crashes glibc?

puts(NULL);

It looks like it will still SIGSEGV...

https://git.musl-libc.org/cgit/musl/tree/src/stdio/puts.c



Thanks
Jonny



* Re: C Annex K safe C functions
  2019-03-04  4:24   ` Jonny Grant
@ 2019-03-04  6:40     ` A. Wilcox
  2019-03-04 15:14     ` Rich Felker
  1 sibling, 0 replies; 9+ messages in thread
From: A. Wilcox @ 2019-03-04  6:40 UTC
  To: musl


On 03/03/19 22:24, Jonny Grant wrote:
> Could I ask, does your libc follow POSIX spec to the letter? eg not
> checking pointers for NULL (where spec omits to mention checking
> pointers valid) ?


2.1.1 Use and Implementation of Functions

  1. If an argument to a function has an invalid value (such as a value
outside the domain of the function, or a pointer outside the address
space of the program, or a null pointer), the behavior is undefined.


The spec never omits checking the validity of pointers unless actually
specified.  In this way, musl is definitely conformant.

There are some non-conformant bits of musl, though some of these may be
solved since the last test (which was musl 1.1.19):

https://wiki.adelielinux.org/wiki/POSIX


Best,
--arw


-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org



* Re: C Annex K safe C functions
  2019-02-27 16:34   ` Rich Felker
@ 2019-03-04  8:09     ` Florian Weimer
  2019-03-05 13:19       ` Szabolcs Nagy
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Weimer @ 2019-03-04  8:09 UTC
  To: Rich Felker; +Cc: musl

* Rich Felker:

> 5. Expanding on the topic of FUD/misinformation, both the introduction
> of the original *_s functions, and lobbying for their inclusion in the
> standard (which eventually reached the compromise of just putting them
> in an Annex), was not about improving the C language or making useful
> tools for programmers, but about introducing incompatibility and
> fragmentation to the language/standard with the goal of undermining
> it. The company that introduced it produces a product that is not
> compatible with the C language as specified and does not even aim to
> be, but aims to give the impression of being a C implementation (it's
> mainly a C++ implementation, though likely not conforming to that
> standard either).

Does this really reflect history?  I thought that Annex K was submitted
for standardization well after the vendor in question withdrew from the
ISO process.

> It's my position, and I believe it's shared by many others in the musl
> community and C language communities, that parties not interested in
> implementing or using the standard should not try to influence its
> direction, and that this kind of behavior should not be rewarded by
> playing along with it, but that it should be shunned as long as doing
> so is practical.

My impression is that compiler vendors and large-scale users are
generally not well-represented in the ISO process anyway.  If true, your
requirement, while looking completely reasonable, would effectively halt
evolution of the standard.

Thanks,
Florian



* Re: C Annex K safe C functions
  2019-03-04  4:24   ` Jonny Grant
  2019-03-04  6:40     ` A. Wilcox
@ 2019-03-04 15:14     ` Rich Felker
  1 sibling, 0 replies; 9+ messages in thread
From: Rich Felker @ 2019-03-04 15:14 UTC
  To: musl

On Mon, Mar 04, 2019 at 11:24:08AM +0700, Jonny Grant wrote:
> Hi!
> 
> On 27/02/2019 17:50, Szabolcs Nagy wrote:
> >* Jonny Grant <jg@jguk.org> [2019-02-27 10:30:52 +0700]:
> >>Not on the list, so please cc me in replies.
> >>Any plans to support Annex K?
> >>Those safe functions are great, strncpy_s etc
> >
> >i wonder why you think they are great,
> >if they are advertised anywhere as safe or
> >useful then that should be fixed.
> >
> >annex k is so incredibly broken and bad
> >that there is a wg14 paper about it
> >
> >http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm
> >
> >normally it's ok to add nonsense interfaces
> >for compatibility, but in this case there is
> >no widespread use and the api depends on global
> >state that causes implementation issues even
> >if we wanted to implement it.
> 
> Thanks for your reply!
> 
> Well I wouldn't disagree with experts. I should re-read that review though.
> 
> However, I was not aware that these APIs have global state?
> (memset_s, memcpy_s, memmove_s, strcpy_s, strncpy_s, strcat_s,
> strncat_s, strtok_s, strerror_s, strerrorlen_s, strnlen_s)
> - do they?

Yes. The action they take when a "runtime constraint violation"
happens is determined by global state. See K.3.6.1.1 The
set_constraint_handler_s function.
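
A minimal sketch of why that global state matters to callers, assuming
nothing about any real Annex K implementation: every name below
(toy_set_handler, toy_strcpy_s, copy_name and so on) is hypothetical
and only imitates the shape of set_constraint_handler_s and strcpy_s in
plain standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the Annex K machinery; not a real libc API. */
typedef void (*toy_handler)(const char *msg);

static int violations_seen; /* bumped by the "ignore" handler */

void toy_ignore_handler(const char *msg) { (void)msg; violations_seen++; }
void toy_abort_handler(const char *msg) { (void)msg; abort(); }

/* One handler for the whole process: any library or thread may replace it. */
static toy_handler current_handler = toy_ignore_handler;

toy_handler toy_set_handler(toy_handler h)
{
    toy_handler old = current_handler;
    current_handler = h;
    return old;
}

/* Toy strcpy_s-like copy: reports a violation to the global handler and,
 * if the handler returns, reports it again through the return value. */
int toy_strcpy_s(char *dest, size_t destsz, const char *src)
{
    if (!dest || !src || destsz == 0 || strlen(src) >= destsz) {
        current_handler("toy_strcpy_s: runtime-constraint violation");
        if (dest && destsz)
            dest[0] = '\0';
        return 1;
    }
    memcpy(dest, src, strlen(src) + 1);
    return 0;
}

/* A caller that does not control the handler has to check twice. */
int copy_name(char *dest, size_t destsz, const char *src)
{
    /* Pre-check: with the aborting handler installed, a violation is fatal. */
    if (!dest || !src || destsz == 0 || strlen(src) >= destsz)
        return -1;
    /* Post-check: with the ignoring handler, failure is only a return value. */
    if (toy_strcpy_s(dest, destsz, src) != 0)
        return -1;
    return 0;
}

void demo(void)
{
    char buf[4];

    assert(copy_name(buf, sizeof buf, "abc") == 0);
    assert(strcmp(buf, "abc") == 0);

    /* Some other component flips the global handler behind our back. */
    toy_handler old = toy_set_handler(toy_abort_handler);
    assert(copy_name(buf, sizeof buf, "abcd") == -1); /* pre-check saves us */
    toy_set_handler(old);

    /* Calling the toy function directly only "works" under this handler. */
    assert(toy_strcpy_s(buf, sizeof buf, "abcd") == 1);
    assert(violations_seen == 1);
}
```

Once copy_name has done its own pre-check, the checked function adds
nothing that a plain copy would not also provide.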

> strncpy_s is great, it avoids the bug in strncpy that could cause
> the buffer to not be terminated. It's better than the strlcpy BSD
> uses which truncates buffers.

This is not a bug in strncpy. It's strncpy's purpose being completely
different from what people are wrongly using it for. The strncpy
function is for working with fixed-size, null-padded data fields,
which are not C strings, and which went out of style in the 80s -- the
sort of things often cited in the Y2K problem fiasco. For the most
part, there is no modern use for strncpy. Naming a function with a
completely different purpose after it (strncpy_s) and implying it's a
"secure version of strncpy" contributes to this misconception that's
the whole reason people are wrongly using strncpy in the first place.
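
For illustration, this is the kind of fixed-size, null-padded field
strncpy was designed for. The struct and helper names are hypothetical;
only strncpy and memchr are standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a fixed-size, null-padded name field. */
struct record {
    char name[8]; /* not a C string: may use all 8 bytes with no terminator */
};

void record_set_name(struct record *r, const char *s)
{
    /* Copies at most 8 bytes and fills the remainder with null bytes. */
    strncpy(r->name, s, sizeof r->name);
}

size_t record_name_len(const struct record *r)
{
    /* The field length is bounded by the field, not by a terminator. */
    const char *end = memchr(r->name, '\0', sizeof r->name);
    return end ? (size_t)(end - r->name) : sizeof r->name;
}

void demo(void)
{
    struct record r;

    record_set_name(&r, "ab");
    assert(record_name_len(&r) == 2);
    assert(r.name[7] == '\0'); /* padded to the end of the field */

    record_set_name(&r, "abcdefghij");
    assert(record_name_len(&r) == 8); /* full field, no terminator */
    assert(memcmp(r.name, "abcdefgh", 8) == 0);
}
```

Code that later passes r.name to strlen or printf("%s") is treating the
field as something it is not.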

> BSD/OS X supports memset_s etc, but does not support
> set_constraint_handler_s

Most BSDs also support explicit_bzero, and musl does too. Providing a
function with the same name as one that's specified by some standard
we don't intend to support, but without fully compatible semantics, is
something musl generally tries to avoid.

> If there are issues, I'd support amending Annex K rather than removing it. It's
> good they check for NULL/nullptr, they return errno_t directly
> instead of the errno global kludge. Sticking with old APIs forever
> is difficult, but no one uses creat() anymore either.

Returning an error code is an okay choice when it works, but if you
try to apply it everywhere, it leads to severe bugs. The classic
example is allocator functions which don't return void* but instead
take a void** argument into which to store the result. Invariably
people call them with (void**)&ptr, where ptr does not have type
void*, and this is undefined behavior. In a worst case, on an
implementation where different pointer types have different
representations and sizes, it would be a buffer overflow.
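
A sketch of the allocator pattern described above, with a hypothetical
toy_alloc standing in for any "store the result through void **"
interface. The undefined call is shown only in a comment; the code uses
the well-defined form with a temporary of the right type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator in the "return an error code" style. */
int toy_alloc(void **out, size_t size)
{
    void *p = malloc(size);
    if (!p)
        return 1;
    *out = p;
    return 0;
}

int *make_ints(size_t count)
{
    int *ip;
    void *tmp;

    /* Undefined: toy_alloc((void **)&ip, count * sizeof *ip) would store
     * a void * through an lvalue that really designates an int * object. */
    if (toy_alloc(&tmp, count * sizeof *ip) != 0)
        return NULL;
    ip = tmp; /* well-defined conversion from void * to int * */
    return ip;
}

void demo(void)
{
    int *v = make_ints(4);
    assert(v != NULL);
    v[3] = 7;
    assert(v[3] == 7);
    free(v);
}
```

The extra temporary is exactly the boilerplate callers tend to skip,
which is how the cast creeps in.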

This is kinda tangential to the Annex K issue, but the point is that
trying to push for a uniform "everything returns an error code"
convention introduces concrete harm for the sake of somebody's
preferred style, and as such is a really bad idea. A better convention
if you want to avoid errno is taking an int *errcode argument, but the
arguments against errno are generally pretty weak. Most of the time,
errno is the most efficient, in terms of simplifying readability and
flow of code, solution to the problem.
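
As a sketch of the int *errcode convention mentioned above: the
function keeps a properly typed return value and reports the error code
through a separate argument. dup_string is a hypothetical example
function, not part of any libc.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: returns a heap copy of s, or NULL with *errcode set. */
char *dup_string(const char *s, int *errcode)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (!copy) {
        *errcode = ENOMEM;
        return NULL;
    }
    memcpy(copy, s, len);
    *errcode = 0;
    return copy;
}

void demo(void)
{
    int err = -1;
    char *c = dup_string("musl", &err);

    assert(c != NULL);
    assert(err == 0);
    assert(strcmp(c, "musl") == 0);
    free(c);
}
```

No pointer-to-pointer cast is needed here, because the result travels
through the return type rather than through a void ** parameter.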

> Could I ask, does your libc follow POSIX spec to the letter? eg not
> checking pointers for NULL (where spec omits to mention checking
> pointers valid) ? eg this call which crashes glibc?
> 
> puts(NULL);
> 
> It looks like it will still SIGSEGV...
> 
> https://git.musl-libc.org/cgit/musl/tree/src/stdio/puts.c

There is fundamentally no way to check a pointer for validity without
false negatives or false positives unless you have a fully memory-safe
C implementation with a runtime which tracks memory in much more
detail than even tools like ASan or valgrind do. Of course it's
possible to check for a null pointer as one special case, but then the
question is what you want to do with it. Unless you have a function
that's specified to treat null pointers specially somehow (e.g. strtol
doesn't store an end position if endptr is null), the caller has
invoked undefined behavior.
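
A small example of the strtol case: a null endptr is part of the
specified interface, whereas a null string argument is not. The wrapper
name leading_number is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

long leading_number(const char *s)
{
    /* endptr == NULL is specified: strtol just does not report the end. */
    return strtol(s, NULL, 10);
}

void demo(void)
{
    char *end;

    assert(leading_number("42abc") == 42);
    assert(strtol("42abc", &end, 10) == 42);
    assert(*end == 'a');
    /* leading_number(NULL) would be undefined: nothing in the
     * specification gives a null string argument a meaning. */
}
```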

One commonly-requested behavior is to return an error. I've written
many times on why this is a bad choice, and wording based on my views
on it has even been integrated into the glibc wiki on the topic:

https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers

    "If you return an error code to a caller which has already proven
    itself buggy, the most likely result is that the caller will
    ignore the error, and bad things will happen much later down the
    line when the original cause of the error has become difficult or
    impossible to track down. Why is it reasonable to assume the
    caller will ignore the error you return? Because the caller
    already ignored the error return of malloc or fopen or some other
    library-specific allocation function which returned NULL to
    indicate an error."

Generally, musl adopts a hardening approach of aiming to crash
(terminate with a fatal signal) as early as possible, and as directly
as possible, when undefined behavior has been detected. In most cases,
just dereferencing the potentially-invalid pointer is sufficient to
achieve this without explicit code for it. In other cases, where UB
can be detected without expensive (slow, complex, or error-prone)
runtime tests, musl has an explicit call to a_crash() for this. Some
examples are pthread_join (attempting to join a detached thread), free
(double free or heap corruption), and asctime (date string doesn't fit
in the buffer).
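
A rough sketch of the "crash as early as possible" approach, under
loud assumptions: a_crash() is internal to musl and is not used here,
so abort() serves as a stand-in, and toy_thread, toy_join and
join_kills_process are hypothetical names. fork and waitpid (POSIX) are
used only so the crash can be observed from a parent process.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical thread descriptor; not musl's real data structure. */
struct toy_thread {
    int detached;
};

/* Stand-in for musl's internal a_crash(). */
static void crash_now(void)
{
    abort();
}

int toy_join(struct toy_thread *t)
{
    /* Cheap to detect, so fail immediately instead of returning an error
     * that an already-buggy caller is likely to ignore. */
    if (t->detached)
        crash_now();
    return 0;
}

/* Runs toy_join in a child; returns 1 if the child died from a signal. */
int join_kills_process(struct toy_thread *t)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        toy_join(t);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}

void demo(void)
{
    struct toy_thread ok = { 0 };
    struct toy_thread bad = { 1 };

    assert(join_kills_process(&ok) == 0);
    assert(join_kills_process(&bad) == 1);
}
```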

Rich



* Re: C Annex K safe C functions
  2019-03-04  8:09     ` Florian Weimer
@ 2019-03-05 13:19       ` Szabolcs Nagy
  0 siblings, 0 replies; 9+ messages in thread
From: Szabolcs Nagy @ 2019-03-05 13:19 UTC
  To: musl

* Florian Weimer <fweimer@redhat.com> [2019-03-04 09:09:31 +0100]:
> * Rich Felker:
> > 5. Expanding on the topic of FUD/misinformation, both the introduction
> > of the original *_s functions, and lobbying for their inclusion in the
> > standard (which eventually reached the compromise of just putting them
> > in an Annex), was not about improving the C language or making useful
> > tools for programmers, but about introducing incompatibility and
> > fragmentation to the language/standard with the goal of undermining
> > it. The company that introduced it produces a product that is not
> > compatible with the C language as specified and does not even aim to
> > be, but aims to give the impression of being a C implementation (it's
> > mainly a C++ implementation, though likely not conforming to that
> > standard either).
> 
> Does this really reflect history?  I thought that Annex K was submitted
> for standardization well after the vendor in question withdrew from the
> ISO process.

well microsoft did lobby for the _s functions originally,
the msvc library team is likely responsible for the concept.
it turned into tr 24731 and eventually annex k in c11,
by that time microsoft may not have much to do with it.

one can dig through wg14 to see how it evolved.

initial work:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1007.pdf
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1031.pdf
a tr 24731 meeting (ms is still involved):
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1129.pdf
some tr 24731 drafts:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1088.pdf
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1146.pdf
austin group comments:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1106.txt
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1160.pdf
responses:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1118.htm
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1174.pdf
tr 24731 rationale:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1173.pdf
c1x inclusion:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1350.htm
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1394.pdf



