* [musl] Question: Why vfprintf call twice printf_core?
@ 2023-05-06 3:29 847567161
  2023-05-06 3:53 ` Markus Wichmann
  0 siblings, 1 reply; 7+ messages in thread
From: 847567161 @ 2023-05-06 3:29 UTC
  To: musl

Hello,

I'm analyzing vfprintf performance, and I don't know why musl calls
"printf_core(0, fmt, &ap2, nl_arg, nl_type)" here. Could you tell me the reason?
https://gitee.com/openharmony/third_party_musl/blob/master/src/stdio/vfprintf.c#L668

More info: I used gdb to debug vfprintf and found that it returns directly on the first call to printf_core, the one whose file parameter is 0.
https://gitee.com/openharmony/third_party_musl/blob/master/src/stdio/vfprintf.c#L526

Best Regards
Chuang Yin
* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06 3:29 [musl] Question: Why vfprintf call twice printf_core? 847567161
@ 2023-05-06 3:53 ` Markus Wichmann
  2023-05-06 5:24 ` 847567161
  0 siblings, 1 reply; 7+ messages in thread
From: Markus Wichmann @ 2023-05-06 3:53 UTC
  To: musl

On Sat, May 06, 2023 at 11:29:36AM +0800, 847567161 wrote:
> Hello,
> I'm analyzing vfprintf performance, I don't know why musl call
> "printf_core(0, fmt, &ap2, nl_arg, nl_type)" here. Could you
> tell me the reason?
> https://gitee.com/openharmony/third_party_musl/blob/master/src/stdio/vfprintf.c#L668
> More info: I use gdb to debug vfprintf , I found it return
> directly when calling printf_core firstly which file parameter is
> 0.
> https://gitee.com/openharmony/third_party_musl/blob/master/src/stdio/vfprintf.c#L526
> Best Regards Chuang Yin

First call to printf_core() checks to see if there are any major problems with the format string, and if the string is using positional arguments (e.g. "%2$d"), also establishes the types of these arguments and writes them into an array. Second call does the actual work.

The shortcut after the first printf_core() call is an error exit. That means the format string is invalid. Could you tell us what the format string is in your case?

Additionally, there is something weird with your mail client; it is writing HTML entities into the plain text.

Ciao,
Markus
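To make the positional-argument case concrete, here is a minimal sketch (not musl code). The helper name format_swapped is hypothetical; the only library call is POSIX snprintf() with the %n$ form. The format consumes the second argument before the first. Since va_arg() can only walk the argument list in order and needs the type of every argument it steps over, an implementation has to learn the type of argument 1 (named later in the string) before it can fetch argument 2, which is presumably why the types are collected into an array up front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "<name>: <count>" into buf using POSIX
 * positional conversions, so the arguments are used out of order.
 */
int format_swapped(char *buf, size_t cap, int count, const char *name)
{
    assert(buf != NULL && cap > 0);
    /* %2$s refers to the second argument (name), %1$d to the first (count) */
    return snprintf(buf, cap, "%2$s: %1$d", count, name);
}
```

Calling format_swapped(buf, sizeof buf, 3, "files") fills buf with "files: 3" on a POSIX libc.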
* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06 3:53 ` Markus Wichmann
@ 2023-05-06 5:24 ` 847567161
  2023-05-06 6:25 ` Markus Wichmann
  0 siblings, 1 reply; 7+ messages in thread
From: 847567161 @ 2023-05-06 5:24 UTC
  To: musl

Thanks for your reply.

1. Could you tell us what the format string is in your case?
--------------
snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s", "No such file or directory");

2. First call to printf_core() checks to see if there are any major problems with the format string
--------------
Maybe the second call could also check for format errors?

3. if the string is using positional arguments (e.g. "%2$d"), also establishes the types of these arguments and writes them into an array.
--------------
I use the format string above, which I think is a typical error message. perf shows that the first printf_core call traverses the string and costs some time.

If we remove the first call when "%2$d" is not used, is there any problem? Or do you have some advice for improving vfprintf performance in common scenarios?

Regards
Chuang Yin
* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06 5:24 ` 847567161
@ 2023-05-06 6:25 ` Markus Wichmann
  2023-05-06 17:55 ` NRK
  2023-05-07 1:17 ` Rich Felker
  0 siblings, 2 replies; 7+ messages in thread
From: Markus Wichmann @ 2023-05-06 6:25 UTC
  To: musl

On Sat, May 06, 2023 at 01:24:15PM +0800, 847567161 wrote:
> snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s", "No such file or directory");

OK, that call is correct. It should not error out.

>> First call to printf_core() checks to see if there are any major problems with the format string
> Maybe the second call can also checks the format error?
>

POSIX says that to the extent possible, all functions are supposed to either fail with no side effects or succeed with side effects. There are some functions that can fail with side effects, but we make some effort to minimize that. By testing the format string first, if it is broken, we can fail without side effects. If only the second call tested that, you would get a partial output before failure.

Actually, in this case it was probably the other way around: Because POSIX requires that positional arguments work, which requires an extra pass over the format string, we got a side-effect free test for validity for free.

>> if the string is using positional arguments (e.g. "%2$d"), also
>> establishes the types of these arguments and writes them into an
>> array.
> I use above format string, I think it's a typical error message,
> I found the first printf_core do string traversal and cost some time
> showed in perf.
>
> If we remove the first function call when we don't use ("%2$d"), is
> there any problem? Or do you have some advice for improve the vfprintf
> performance in common scenarios?

vfprintf() can't know whether the format string contains positional arguments without passing over the format string. Which is what the first call does.

In any case, yes, you can patch your copy of musl to remove the first call to printf_core(). You will no longer be able to use positional arguments, and you will get partial output on format string error, but if you can live with that, it should work.

If you're looking for performance, however, I suggest steering clear of the printf() family of functions. They contain complex logic that is typically way overpowered for common needs, and just straight string manipulation will always be faster. E.g. the above call could be turned into

strlcpy(buf, "this is a more typical error message with detail: ", sizeof buf);
strlcat(buf, "No such file or directory", sizeof buf);

Of course, within ISO-C it gets more complicated, since strlcpy() and strlcat() are BSD functions.

Ciao,
Markus
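The point that vfprintf() has to look at the format string to learn whether positional arguments are in use can be sketched as follows. POSIX does not allow numbered (%n$) and unnumbered conversions to be mixed in one format (only %% may appear alongside the numbered form), so inspecting the first conversion specification is enough to tell which form a valid format uses. This is only an illustration under that assumption, not musl's implementation; first_spec_is_positional is a hypothetical name, and it does not validate the format.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the first conversion specification
 * in fmt has the form %<digits>$, and 0 otherwise (including when the
 * format contains no conversion at all).
 */
int first_spec_is_positional(const char *fmt)
{
    assert(fmt != NULL);
    for (const char *p = fmt; *p; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;               /* literal "%%", keep scanning */
        const char *q = p;
        while (isdigit((unsigned char)*q))
            q++;
        /* digits followed by '$' mark a numbered argument */
        return q > p && *q == '$';
    }
    return 0;
}
```

Note that this is still a scan up to the first '%'. In the error-message format discussed in this thread the only conversion sits at the very end, so such a check would read almost the whole string anyway.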
* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06 6:25 ` Markus Wichmann
@ 2023-05-06 17:55 ` NRK
  2023-05-07 1:17 ` Rich Felker
  1 sibling, 0 replies; 7+ messages in thread
From: NRK @ 2023-05-06 17:55 UTC
  To: musl

On Sat, May 06, 2023 at 08:25:25AM +0200, Markus Wichmann wrote:
> If you're looking for performance, however, I suggest steering clear of
> the printf() family of functions. They contain complex logic that is
> typically way overpowered for common needs, and just straight string
> manipulation will always be faster.

Agreed. However...

> E.g. the above call could be turned into
>
> strlcpy(buf, "this is a more typical error message with detail: ", sizeof buf);
> strlcat(buf, "No such file or directory", sizeof buf);

strcat (and friends) are the opposite of performance:
https://en.wikipedia.org/wiki/Joel_Spolsky#Schlemiel_the_Painter.27s_algorithm

Better alternative: have your string copy function return a pointer to the nul-byte. This pointer can be both used for efficient concat as well as determining the string length. Example using POSIX stpcpy(3) (minus bounds checking):

    char *p = stpcpy(buf, "this is a more typical error message with detail: ");
    p = stpcpy(p, "No such file or directory");
    write(2, buf, p - buf);

Additionally, consider getting rid of nul-strings altogether and only use them in interface boundaries that require them.

- NRK
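Since the stpcpy() example above leaves out bounds checking, here is one way the same idea might look with it added. This is a sketch only: append_str is a hypothetical helper, not a libc function, and silent truncation is an assumed policy that a real caller may not want.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies src to dst without writing at or past
 * end, always nul-terminates, and returns a pointer to the nul byte so
 * the next append can continue from there.
 */
char *append_str(char *dst, char *end, const char *src)
{
    assert(dst < end);
    size_t room = (size_t)(end - dst) - 1;  /* keep one byte for the nul */
    size_t len = strlen(src);
    if (len > room)
        len = room;                         /* truncate silently */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst + len;
}
```

Usage follows the same pattern as the stpcpy() version: set end to buf + sizeof buf, chain p = append_str(buf, end, ...) and p = append_str(p, end, ...), and p - buf is the final length.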
* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06 6:25 ` Markus Wichmann
  2023-05-06 17:55 ` NRK
@ 2023-05-07 1:17 ` Rich Felker
  2023-05-07 1:44 ` [musl] Re: [musl] Question: Why vfprintf call twice printf_core? 847567161
  1 sibling, 1 reply; 7+ messages in thread
From: Rich Felker @ 2023-05-07 1:17 UTC
  To: Markus Wichmann; +Cc: musl

On Sat, May 06, 2023 at 08:25:25AM +0200, Markus Wichmann wrote:
> On Sat, May 06, 2023 at 01:24:15PM +0800, 847567161 wrote:
> > snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s", "No such file or directory");
>
> OK, that call is correct. It should not error out.
>
> >> First call to printf_core() checks to see if there are any major problems with the format string
> > Maybe the second call can also checks the format error?
> >
>
> POSIX says that to the extent possible, all functions are supposed to
> either fail with no side effects or succeed with side effects. There are
> some functions that can fail with side effects, but we make some effort
> to minimize that. By testing the format string first, if it is broken,
> we can fail without side effects. If only the second call tested that,
> you would get a partial output before failure.
>
> Actually, in this case it was probably the other way around: Because
> POSIX requires that positional arguments work, which requires an extra
> pass over the format string, we got a side-effect free test for validity
> for free.

This is all irrelevant because calling printf with an invalid format string has undefined behavior. There is no requirement at all on the implementation in this case. We could (and probably should) trap on it; the current behavior of bailing out when it's bad is just a consequence of how I implemented the localization-form %n$ positional args.

> >> if the string is using positional arguments (e.g. "%2$d"), also
> >> establishes the types of these arguments and writes them into an
> >> array.
> > I use above format string, I think it's a typical error message,
> > I found the first printf_core do string traversal and cost some time
> > showed in perf.
> >
> > If we remove the first function call when we don't use ("%2$d"), is
> > there any problem? Or do you have some advice for improve the vfprintf
> > performance in common scenarios?
>
> vfprintf() can't know whether the format string contains positional
> arguments without passing over the format string. Which is what the
> first call does.
>
> In any case, yes, you can patch your copy of musl to remove the first
> call to printf_core(). You will no longer be able to use positional
> arguments, and you will get partial output on format string error, but
> if you can live with that, it should work.

Yes, I don't see any reason why this wouldn't work, but I also don't see any good reason it would help. If passing over the format string is taking a long time, maybe we should figure out why that's happening...?

Rich
* [musl] Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-07 1:17 ` Rich Felker
@ 2023-05-07 1:44 ` 847567161
  0 siblings, 0 replies; 7+ messages in thread
From: 847567161 @ 2023-05-07 1:44 UTC
  To: musl

1. I see that musl traverses the format string whether or not %n$ is present. If %n$ is not present, maybe the first call is redundant.

2. I benchmarked the following call; the result goes from 145+ ns to 110+ ns if I remove the first call.
"snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s", "No such file or directory");"

This is all irrelevant because calling printf with an invalid format string has undefined behavior.
--------------
3. So I think we should find a way to collect positional args when we encounter them, rather than always traversing the format string first.
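For anyone who wants to reproduce a per-call figure like the one quoted above, a micro-benchmark along the following lines could be used. It is only a sketch: bench_snprintf_ns is a hypothetical helper, the thread does not say how the original numbers were measured, and results will vary with compiler flags, CPU, and libc build.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: average wall-clock nanoseconds per snprintf()
 * call for the format string discussed in this thread.
 */
double bench_snprintf_ns(long iterations)
{
    assert(iterations > 0);
    char buf[128];
    volatile int sink = 0;  /* keeps the result of each call observable */
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iterations; i++) {
        sink += snprintf(buf, sizeof buf,
            "this is a more typical error message with detail: %s",
            "No such file or directory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iterations;
}
```

Running it with a large iteration count against a stock and a patched libc build, linked statically, would give comparable averages. A profiler such as perf is still the better tool for finding where the time goes inside printf_core().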
end of thread

Thread overview: 7+ messages
2023-05-06 3:29 [musl] Question: Why vfprintf call twice printf_core? 847567161
2023-05-06 3:53 ` Markus Wichmann
2023-05-06 5:24 ` 847567161
2023-05-06 6:25 ` Markus Wichmann
2023-05-06 17:55 ` NRK
2023-05-07 1:17 ` Rich Felker
2023-05-07 1:44 ` [musl] Re: [musl] Question: Why vfprintf call twice printf_core? 847567161