* Handling of L and ll prefixes different from glibc
From: Nadav Har'El @ 2016-12-14 13:46 UTC
To: musl; +Cc: Nadav Har'El

Hi,

Posix's printf manual suggests (see
http://pubs.opengroup.org/onlinepubs/9699919799/functions/fprintf.html)
that the "ll" format prefix should only be used for integer types, and "L"
should only be used for long double type. And it seems that indeed, this is
what Musl's printf() supports - the test program

long double d = 123.456;
printf("Lf: %Lf\n", d);
printf("llf %llf\n", d);
long long int i = 123456;
printf("Ld: %Ld\n", i);
printf("lld: %lld\n", i);

produces with Musl's printf just two lines of output:

Lf: 123.456000
lld: 123456

The two other printf()s (with %Ld and %llf) are silently dropped.

However, in glibc, it seems that "ll" and "L" are synonyms, and both work
for both integer and floating types. The above program produces with glibc
four lines of output:

Lf: 123.456000
llf 123.456000
Ld: 123456
lld: 123456

If Musl's intention is to be compatible with glibc, not Posix, I guess this
behavior should be fixed, and L and ll should become synonyms, not
different flags?

Thanks,
Nadav.

--
Nadav Har'El
nyh@scylladb.com
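For comparison with the test program in this message, here is a minimal sketch that uses only the two combinations POSIX defines for these types: "L" with a floating conversion and "ll" with an integer conversion. The helper name format_values and the use of snprintf into a caller-supplied buffer are assumptions made for illustration; they do not come from the thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread): formats a long double with
 * "%Lf" and a long long with "%lld", the length modifiers POSIX defines
 * for these two types. Returns 0 on success, -1 if snprintf reports an
 * error or the output did not fit in the buffer. */
int format_values(char *buf, size_t size, long double d, long long i)
{
    int n = snprintf(buf, size, "Lf: %Lf\nlld: %lld\n", d, i);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Called as format_values(buf, sizeof buf, 123.456L, 123456LL), this should give the same two lines on any conforming libc, since neither conversion relies on behavior the standard leaves undefined.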
* Re: Handling of L and ll prefixes different from glibc
From: Rich Felker @ 2016-12-14 16:13 UTC
To: musl

On Wed, Dec 14, 2016 at 03:46:40PM +0200, Nadav Har'El wrote:
> Hi,
>
> Posix's printf manual suggests (see
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/fprintf.html)
> that the "ll" format prefix should only be used for integer types, and "L"
> should only be used for long double type. And it seems that indeed, this is
> what Musl's printf() supports - the test program
>
> long double d = 123.456;
> printf("Lf: %Lf\n", d);
> printf("llf %llf\n", d);
> long long int i = 123456;
> printf("Ld: %Ld\n", i);
> printf("lld: %lld\n", i);
>
> produces with Musl's printf just two lines of output:
>
> Lf: 123.456000
> lld: 123456
>
> The two other printf()s (with %Ld and %llf) are silently dropped.

Not quite silently; printf is returning -1 with errno set to EINVAL.

> However, in glibc, it seems that "ll" and "L" are synonyms, and both work
> for both integer and floating types. The above program produces with glibc
> four lines of output:
>
> Lf: 123.456000
> llf 123.456000
> Ld: 123456
> lld: 123456
>
> If Musl's intention is to be compatible with glibc, not Posix, I guess this
> behavior should be fixed, and L and ll should become synonyms, not
> different flags?

There is no general "intention to be compatible with glibc". There are
a couple related topics you might be thinking of:

Widely-used and widely-available extensions: There are written-up
guidelines for the criteria for inclusion or exclusion of such
interfaces, balancing things like usefulness, cost, and whether
there's already a better way to do the same thing portably.

ABI compatibility: There is an intent to support use of some
glibc-linked code in binary form with musl. From a practical
standpoint, this is mainly for libraries without source that some
users depend on (like flash and eventually nvidia stuff, maybe). Aside
from practical needs like that, the scope of the compatibility goal is
purely to support fully POSIX-conforming programs, or programs which
use common extension APIs provided by musl, not programs relying on
unsupported glibc functionality, doing things gratuitously wrong (like
ll vs L here), or depending on glibc bugs.

Now back to the topic of printf:

As for printf formats specifically, musl avoids defining any of the
cases which are undefined behavior in order to avoid getting in a
situation where we conflict with future versions of the standard. This
happened with glibc's scanf, which took 'a' as an extension flag for
auto-allocation, only to have C99 later assign it for floating point
(to match printf hex formatting), and glibc had to use hackery of
remapping symbols in different conformance profiles to work around the
problem. musl does not do that kind of hackery, so we have to be
careful not to introduce such problems in the first place.

Note that there is one printf extension we have, %m, but this is
because POSIX requires %m to be supported by syslog(), making it
unlikely that the standards would assign a conflicting meaning in the
future. Also %m is very useful, whereas mismatched L/ll is just
programmer sloppiness.

One thing I'm not happy with now is the way printf returns an error
(which the caller usually ignores) on invalid format strings; this
hides lots of bugs (for example, a similar issue with some legacy
software using %qd for printing long long) and doesn't have any basis
in requirements, since invalid format strings invoke undefined
behavior. I'm mildly leaning towards causing a crash on invalid format
strings so that the location of the incorrect usage can be quickly
found with a debugger, but I'd like feedback from users who've
debugged this sort of thing on whether that'd actually be helpful.

Rich
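The point about callers ignoring the error can be illustrated with a small wrapper that checks the result of the formatting call. This is only a sketch: the name checked_snprintf and the policy of reporting on stderr are assumptions, and because an invalid format string is undefined behavior, a negative return with errno set to EINVAL is what this message describes for musl at the time, not something every libc guarantees.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper (name and reporting policy are assumptions):
 * forwards to vsnprintf and reports a failed conversion on stderr
 * instead of letting it pass unnoticed. Returns the vsnprintf result,
 * so a negative value still signals failure to the caller. */
int checked_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    errno = 0;
    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (n < 0)
        fprintf(stderr, "formatting failed for \"%s\": %s\n",
                fmt, strerror(errno));
    return n;
}
```

A caller that treats a negative return as fatal gets something close to the "fail hard" behavior discussed here, but only at the application level and only where the libc reports the error at all.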
* Re: Handling of L and ll prefixes different from glibc
From: Szabolcs Nagy @ 2016-12-14 17:17 UTC
To: musl

* Rich Felker <dalias@libc.org> [2016-12-14 11:13:48 -0500]:
> behavior. I'm mildly leaning towards causing a crash on invalid format
> strings so that the location of the incorrect usage can be quickly
> found with a debugger, but I'd like feedback from users who've
> debugged this sort of thing on whether that'd actually be helpful.

crashing sounds good to me.
* Re: Handling of L and ll prefixes different from glibc
From: A. Wilcox @ 2016-12-14 22:37 UTC
To: musl

On 14/12/16 11:17, Szabolcs Nagy wrote:
> * Rich Felker <dalias@libc.org> [2016-12-14 11:13:48 -0500]:
>> behavior. I'm mildly leaning towards causing a crash on invalid
>> format strings so that the location of the incorrect usage can be
>> quickly found with a debugger, but I'd like feedback from users
>> who've debugged this sort of thing on whether that'd actually be
>> helpful.
>
> crashing sounds good to me.
>

Would this be able to be configured in some way when building the libc
(-D_CRASH_ON_PRINTF_UB or such)?

This sounds like a great tool to use when doing conformance testing,
and in general once testing has been done. However, it also sounds
like a great way to break packages already "working" on musl.

--arw

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org
* Re: Handling of L and ll prefixes different from glibc
From: Rich Felker @ 2016-12-15 2:30 UTC
To: musl

On Wed, Dec 14, 2016 at 04:37:55PM -0600, A. Wilcox wrote:
> On 14/12/16 11:17, Szabolcs Nagy wrote:
> > * Rich Felker <dalias@libc.org> [2016-12-14 11:13:48 -0500]:
> >> behavior. I'm mildly leaning towards causing a crash on invalid
> >> format strings so that the location of the incorrect usage can be
> >> quickly found with a debugger, but I'd like feedback from users
> >> who've debugged this sort of thing on whether that'd actually be
> >> helpful.
> >
> > crashing sounds good to me.
> >
>
> Would this be able to be configured in some way when building the libc
> (-D_CRASH_ON_PRINTF_UB or such)?
>
> This sounds like a great tool to use when doing conformance testing,
> and in general once testing has been done. However, it also sounds
> like a great way to break packages already "working" on musl.

While that's possible, I _really_ prefer avoiding switches like this.
It's a path that leads to maintenance-death of a project.

It's true that some programs which are just misusing printf format
specifiers as part of unnecessary status/debug/junk output will fully
work now, despite having UB, and that they would stop working with
such a change. But in most cases, the lack of output now, even if it's
unnoticed, is a bug that could have serious consequences. For example
missing output in text that's parsed and used in a script can lead to
things like rm -rf'ing the wrong directory. So I tend to think always
failing hard and catching the bug is preferable.

BTW I wonder if gcc's -Wformat catches these errors.

Rich
* Re: Handling of L and ll prefixes different from glibc
From: A. Wilcox @ 2016-12-15 4:01 UTC
To: musl

On 14/12/16 20:30, Rich Felker wrote:
> It's true that some programs which are just misusing printf format
> specifiers as part of unnecessary status/debug/junk output will
> fully work now, despite having UB, and that they would stop working
> with such a change. But in most cases, the lack of output now, even
> if it's unnoticed, is a bug that could have serious consequences.
> For example missing output in text that's parsed and used in a
> script can lead to things like rm -rf'ing the wrong directory. So I
> tend to think always failing hard and catching the bug is
> preferable.

Yeah, I can understand that. Just makes me nervous as a package
maintainer is all :)

> BTW I wonder if gcc's -Wformat catches these errors.

It is meant to. I know that clang whines loudly on mismatched format
specifiers, and I seem to recall it even whines on format specifiers
that don't exist, but it has been a while since I checked GCC's.

--arw

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org
* Re: Handling of L and ll prefixes different from glibc
From: Szabolcs Nagy @ 2016-12-15 11:30 UTC
To: musl

* A. Wilcox <awilfox@adelielinux.org> [2016-12-14 22:01:59 -0600]:
> On 14/12/16 20:30, Rich Felker wrote:
> > BTW I wonder if gcc's -Wformat catches these errors.
>
> It is meant to. I know that clang whines loudly on mismatched format
> specifiers, and I seem to recall it even whines on format specifiers
> that don't exist, but it has been a while since I checked GCC's.

despite clang propaganda, gcc actually has a more detailed model
of printf now and thus gives better warnings

https://godbolt.org/g/Z0nnEH

note that clang does not warn at all, while gcc caught two bugs.
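Compile-time checking of the kind discussed here applies to printf itself, and it can be extended to user-defined wrappers. The following is a minimal sketch, assuming a GCC- or Clang-compatible compiler that supports the format function attribute; the helper name log_line is hypothetical and does not appear in the thread.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logging helper. The format attribute (a GCC/Clang
 * extension) declares that parameter 3 is a printf-style format string
 * and that the variadic arguments start at parameter 4, so -Wformat can
 * check calls to this function the way it checks calls to printf. */
#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
int log_line(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place, a call that passes an argument of the wrong type for its conversion can be diagnosed at compile time. Whether a particular L/ll mix-up such as %Ld or %llf is flagged depends on the compiler, its version, and the warning options in use, as the differing gcc and clang results reported in this message suggest.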