* [musl] Bug in atoll strtoll, the output of then differ
From: Domingo Alvarez Duarte @ 2022-12-18 9:32 UTC
To: musl

Hello !

Doing some work with emscripten with this project
https://github.com/mingodad/CG-SQL-Lua-playground I was getting some errors
with the usage of "atoll" and with this small program to compare the output
of "musl" and "glibc" I found what seems to be a bug in "atoll" because with
"musl" it gives a different output than "strtoll".

=====

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    const char *s = "9223372036854775808";
    long long ll = atoll(s);
    long long ll2 = strtoll (s, (char **) NULL, 10);
    int imax = 0x7fffffff;
    printf("%s : %lld : %lld : %d : %d\n", s, ll, ll2, imax, ll <= imax);
    return 0;
}

=====

Output from "glibc":

=====

9223372036854775808 : 9223372036854775807 : 9223372036854775807 : 2147483647 : 0

=====

Output from "musl":

=====

9223372036854775808 : -9223372036854775808 : 9223372036854775807 : 2147483647 : 1

=====

Cheers !

* Re: [musl] Bug in atoll strtoll, the output of then differ
From: Markus Wichmann @ 2022-12-18 9:58 UTC
To: musl

On Sun, Dec 18, 2022 at 10:32:10AM +0100, Domingo Alvarez Duarte wrote:
> Doing some work with emscripten with this project
> https://github.com/mingodad/CG-SQL-Lua-playground I was getting some errors
> with the usage of "atoll" and with this small program to compare the output
> of "musl" and "glibc" I found what seems to be a bug in "atoll" because with
> "musl" it gives a different output than "strtoll".
>
> [...]

Well, your problem here is that ato* behavior on error is not defined.
The C standard explicitly excepts behavior on error from the requirement
that these functions return the same thing as their strto* counterparts,
and §7.24.1 (of C23) explicitly states that behavior in that case is
undefined.

This means that the test case is wrong; no result is defined. Actually, a
crash would be acceptable behavior. This also means that when I return
to work next year, I should really go through my code base and replace
all ato* calls with their strto* counterparts for that reason alone.

Ciao,
Markus

* Re: [musl] Bug in atoll strtoll, the output of then differ
From: Domingo Alvarez Duarte @ 2022-12-18 10:22 UTC
To: musl

Here is the "glibc" implementation of "atoll":

=====

/* Convert a string to a long long int.  */
long long int
atoll (const char *nptr)
{
  return strtoll (nptr, (char **) NULL, 10);
}

=====

With that there is no way to get different results from "atoll" and
"strtoll".

Cheers !

On 18/12/22 10:58, Markus Wichmann wrote:
> [...]

* Re: [musl] Bug in atoll strtoll, the output of then differ
From: Markus Wichmann @ 2022-12-18 11:10 UTC
To: musl

On Sun, Dec 18, 2022 at 11:22:59AM +0100, Domingo Alvarez Duarte wrote:
> Here is the "glibc" implementation of "atoll":
>
> =====
>
> /* Convert a string to a long long int.  */
> long long int
> atoll (const char *nptr)
> {
>   return strtoll (nptr, (char **) NULL, 10);
> }
>
> =====
>
> With that there is no way to get different results from "atoll" and
> "strtoll".
>
> Cheers !
>

That's precisely why I would like for programmers to learn some Pascal
at some point in their lives, so they start to understand that interface
and implementation are two separate things.

Yes, that is a nice implementation up there. The interface for atoll
however comes from the C standard, and it says:

| 7.24.1 Numeric conversion functions
| 1 The functions atof, atoi, atol, and atoll need not affect the value of the integer expression errno
| on an error. If the value of the result cannot be represented, the behavior is undefined.
| [...]
| The atoi, atol, and atoll functions convert the initial portion of the string pointed to by nptr to
| int, long int, and long long int representation, respectively. Except for the behavior on error,
| they are equivalent to
|     atoi: (int)strtol(nptr, nullptr, 10)
|     atol: strtol(nptr, nullptr, 10)
|     atoll: strtoll(nptr, nullptr, 10)

Entering 2^63 into atoll() (when long long int is a 64-bit type) is an
error that invokes undefined behavior. Explicitly undefined behavior.
Some implementations choose to deal with this by implementing the
protections from strtoll() (for example by calling strtoll()). Some
implementations don't. This is an implementation detail. The application
cannot know what will happen, and should not make assumptions about what
will happen. It should only call the ato* functions on known valid
input. If it does not know that the input is valid, it should not call
ato*.

Ciao,
Markus
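
[Editorial sketch] The advice above, to avoid ato* when the input is not known to be valid, could look roughly like the following. This is a minimal sketch and not code from the thread: parse_ll is a hypothetical helper name, and rejecting trailing characters is an assumption about what the caller wants (atoll itself would silently ignore them).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * parse_ll is a hypothetical helper, not part of any libc.
 * It converts a base-10 string to long long and reports failure
 * instead of relying on undefined behavior.
 * Returns 0 on success, -1 if the string has no digits, has trailing
 * characters, or does not fit in long long.
 */
int parse_ll(const char *s, long long *out)
{
    char *end;
    long long v;

    assert(s != NULL && out != NULL);

    errno = 0;                 /* strtoll only sets errno on failure */
    v = strtoll(s, &end, 10);

    if (end == s)              /* no digits were consumed */
        return -1;
    if (*end != '\0')          /* trailing characters after the number */
        return -1;
    if (errno == ERANGE)       /* result was clamped to LLONG_MIN/LLONG_MAX */
        return -1;

    *out = v;
    return 0;
}
```

With a helper like this, the string "9223372036854775808" from the original report is rejected with -1 on a platform where long long is 64 bits wide, rather than producing a value that depends on the libc in use.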

* Re: [musl] Bug in atoll strtoll, the output of then differ
From: Quentin Rameau @ 2022-12-18 10:06 UTC
To: musl

> Hello !

Hi,

> Doing some work with emscripten with this project
> https://github.com/mingodad/CG-SQL-Lua-playground I was getting some
> errors with the usage of "atoll" and with this small program to compare
> the output of "musl" and "glibc" I found what seems to be a bug in
> "atoll" because with "musl" it gives a different output than "strtoll".
>
> =====
>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[])
> {
>     const char *s = "9223372036854775808";
>     long long ll = atoll(s);
>     long long ll2 = strtoll (s, (char **) NULL, 10);
>     int imax = 0x7fffffff;
>     printf("%s : %lld : %lld : %d : %d\n", s, ll, ll2, imax, ll <= imax);
>     return 0;
> }

This is not a bug in musl, but a bug in the code: 9223372036854775808 is
outside the range of long long, so the behavior is undefined.

As recommended by the standard, ato* should only be used if the input is
known to always be in the target range; otherwise use the strto*
functions and do proper error handling.
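
[Editorial sketch] "In the target range" matters most for atoi, where the target type is narrower than what strtol returns on many platforms, so a plain (int) cast of the strtol result can silently change the value. A hedged sketch of a checked replacement follows; parse_int is a hypothetical name, not something proposed in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * parse_int is a hypothetical helper for untrusted input.
 * Returns 0 on success, -1 if the string is not a complete base-10
 * number or the value does not fit in int.
 */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    assert(s != NULL && out != NULL);

    errno = 0;
    v = strtol(s, &end, 10);

    if (end == s || *end != '\0')    /* no digits, or trailing characters */
        return -1;
    if (errno == ERANGE)             /* does not even fit in long */
        return -1;
    if (v < INT_MIN || v > INT_MAX)  /* fits in long but not in int */
        return -1;

    *out = (int)v;
    return 0;
}
```

The explicit INT_MIN/INT_MAX comparison is the part that strtol cannot do for the caller: where long is wider than int, a value such as 2147483648 converts without ERANGE and would only be caught by this check.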

* Re: [musl] Bug in atoll strtoll, the output of then differ
From: Szabolcs Nagy @ 2022-12-18 12:23 UTC
To: Domingo Alvarez Duarte; +Cc: musl

* Domingo Alvarez Duarte <mingodad@gmail.com> [2022-12-18 10:32:10 +0100]:
> Doing some work with emscripten with this project
> https://github.com/mingodad/CG-SQL-Lua-playground I was getting some errors
> with the usage of "atoll" and with this small program to compare the output
> of "musl" and "glibc" I found what seems to be a bug in "atoll" because with
> "musl" it gives a different output than "strtoll".

as others pointed out using ato* is a bug here.

glibc does not guarantee ato* to be compatible with strto* either
and it just got around to fixing its internal ato* usage:

https://sourceware.org/pipermail/libc-alpha/2022-December/144147.html

* Re: [musl] Bug in atoll strtoll, the output of then differ
From: Rich Felker @ 2022-12-18 15:25 UTC
To: Domingo Alvarez Duarte; +Cc: musl

On Sun, Dec 18, 2022 at 10:32:10AM +0100, Domingo Alvarez Duarte wrote:
> Hello !
>
> Doing some work with emscripten with this project
> https://github.com/mingodad/CG-SQL-Lua-playground I was getting some
> errors with the usage of "atoll" and with this small program to
> compare the output of "musl" and "glibc" I found what seems to be a
> bug in "atoll" because with "musl" it gives a different output than
> "strtoll".

Everyone's already covered the reason this is not a bug, but to shed
some light on possible motivations for not implementing ato* as
wrappers around strto*:

Aside from making these functions somewhat smaller when statically
linked into tiny programs, writing the conversion with arithmetic that
overflows on out-of-bounds inputs rather than handling it as an error
case makes it so that a build of libc with suitable sanitizers would
automatically make ato* trap-and-crash on inputs that have undefined
behavior via the undefinedness of the underlying arithmetic. To do
this with strto* wrappers would require manually checking error cases
and manual alignment of the trap cases with the specification, which
would need review and testing to get the same benefit.

That's not to say it *has to* be done this way. In lots of places in
musl, we do just implement "junk functions" similar to the ato* family
as wrappers around a modern "good function". But being that we already
have it here, I see no reason to change to something that's worse in
most ways.

Rich

Thread overview: 7+ messages
2022-12-18  9:32 [musl] Bug in atoll strtoll, the output of then differ Domingo Alvarez Duarte
2022-12-18  9:58 ` Markus Wichmann
2022-12-18 10:22   ` Domingo Alvarez Duarte
2022-12-18 11:10     ` Markus Wichmann
2022-12-18 10:06 ` Quentin Rameau
2022-12-18 12:23 ` Szabolcs Nagy
2022-12-18 15:25 ` Rich Felker