mailing list of musl libc
* [musl] Question: Why vfprintf call twice printf_core?
@ 2023-05-06  3:29 847567161
  2023-05-06  3:53 ` Markus Wichmann
  0 siblings, 1 reply; 7+ messages in thread
From: 847567161 @ 2023-05-06  3:29 UTC
  To: musl

Hello,
    I'm analyzing vfprintf performance, and I don't know why musl calls "printf_core(0, fmt, &ap2, nl_arg, nl_type)" here. Could you tell me the reason?
    https://gitee.com/openharmony/third_party_musl/blob/master/src/stdio/vfprintf.c#L668
    More info: I used gdb to debug vfprintf, and I found it returns directly on the first call to printf_core, the one whose file parameter is 0.
    https://gitee.com/openharmony/third_party_musl/blob/master/src/stdio/vfprintf.c#L526
Best Regards
Chuang Yin


* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06  3:29 [musl] Question: Why vfprintf call twice printf_core? 847567161
@ 2023-05-06  3:53 ` Markus Wichmann
  2023-05-06  5:24   ` 847567161
  0 siblings, 1 reply; 7+ messages in thread
From: Markus Wichmann @ 2023-05-06  3:53 UTC
  To: musl

On Sat, May 06, 2023 at 11:29:36AM +0800, 847567161 wrote:
> Hello,
>     I'm analyzing vfprintf performance, I don't know why musl call
>     "printf_core(0, fmt, &ap2, nl_arg, nl_type)" here. Could you
>     tell me the reason?
>     https://gitee.com/openharmony/third_party_musl/blob/master/src/stdio/vfprintf.c#L668
>     More info:    I use gdb to debug vfprintf , I found it return
>     directly when calling printf_core firstly which file parameter is
>     0.
>     https://gitee.com/openharmony/third_party_musl/blob/master/src/stdio/vfprintf.c#L526
>     Best Regards Chuang Yin

First call to printf_core() checks to see if there are any major
problems with the format string, and if the string is using positional
arguments (e.g. "%2$d"), also establishes the types of these arguments
and writes them into an array. Second call does the actual work.
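
To illustrate why positional arguments call for a pass of their own: in "%2$s %1$s" the first conversion refers to the second argument, so the types of all arguments have to be known before they can be fetched from the va_list in order. A minimal sketch, assuming a libc with POSIX numbered-argument support; format_swapped() is a hypothetical helper, not musl code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats two strings in swapped order using the
 * POSIX "%n$" numbered-argument form. Returns snprintf()'s result. */
static int format_swapped(char *out, size_t cap,
                          const char *first, const char *second)
{
	assert(out != NULL && cap > 0);
	/* "%2$s" is converted first but refers to the second argument. */
	return snprintf(out, cap, "%2$s %1$s", first, second);
}
```

Calling format_swapped(out, sizeof out, "world", "hello") should leave "hello world" in out.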

The shortcut after the first printf_core() call is an error exit. That
means the format string is invalid. Could you tell us what the format
string is in your case?

Additionally, there is something weird with your mail client; it is
writing HTML entities into the plain text.

Ciao,
Markus

* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06  3:53 ` Markus Wichmann
@ 2023-05-06  5:24   ` 847567161
  2023-05-06  6:25     ` Markus Wichmann
  0 siblings, 1 reply; 7+ messages in thread
From: 847567161 @ 2023-05-06  5:24 UTC
  To: musl

Thanks for your reply.

1. Could you tell us what the format string is in your case?
--------------
        snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s", "No such file or directory");

2. First call to printf_core() checks to see if there are any major problems with the format string
--------------
Maybe the second call could also check for format errors?

3. if the string is using positional arguments (e.g. "%2$d"), also establishes the types of these arguments and writes them into an array.
--------------
I use the format string above; I think it's a typical error message. I found that the first printf_core() call traverses the string and costs some time, as shown in perf.

If we remove the first function call when we don't use positional arguments ("%2$d"), is there any problem? Or do you have any advice for improving vfprintf performance in common scenarios?

Regards,
Chuang Yin

* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06  5:24   ` 847567161
@ 2023-05-06  6:25     ` Markus Wichmann
  2023-05-06 17:55       ` NRK
  2023-05-07  1:17       ` Rich Felker
  0 siblings, 2 replies; 7+ messages in thread
From: Markus Wichmann @ 2023-05-06  6:25 UTC
  To: musl

On Sat, May 06, 2023 at 01:24:15PM +0800, 847567161 wrote:
> snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s", "No such file or directory");

OK, that call is correct. It should not error out.

>> First call to printf_core() checks to see if there are any major problems with the format string
> Maybe the second call can also checks the format error?
>

POSIX says that to the extent possible, all functions are supposed to
either fail with no side effects or succeed with side effects. There are
some functions that can fail with side effects, but we make some effort
to minimize that. By testing the format string first, if it is broken,
we can fail without side effects. If only the second call tested that,
you would get a partial output before failure.
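
The validate-first idea can be sketched with a toy formatter. This is only an illustration under stated assumptions, not musl's implementation: toy_core() and toy_fprintf() are hypothetical names, and the toy understands only "%s" and "%%". As with printf_core(0, ...), passing a null stream makes the walker check the format without writing anything:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Toy stand-in for printf_core(): understands only "%s" and "%%".
 * With out == NULL it only validates fmt and consumes arguments;
 * otherwise it also writes. Returns -1 on an unsupported conversion. */
static int toy_core(FILE *out, const char *fmt, va_list ap)
{
	int count = 0;

	for (; *fmt; fmt++) {
		if (*fmt != '%') {
			if (out) fputc(*fmt, out);
			count++;
			continue;
		}
		fmt++;                      /* look at the conversion character */
		if (*fmt == '%') {
			if (out) fputc('%', out);
			count++;
		} else if (*fmt == 's') {
			const char *s = va_arg(ap, const char *);
			for (; *s; s++, count++)
				if (out) fputc(*s, out);
		} else {
			return -1;              /* also covers a trailing lone '%' */
		}
	}
	return count;
}

/* Two passes: validate first, then write, so a bad format string
 * fails before any output has been produced. */
static int toy_fprintf(FILE *out, const char *fmt, ...)
{
	va_list ap;
	int ret;

	assert(out != NULL && fmt != NULL);

	va_start(ap, fmt);
	ret = toy_core(NULL, fmt, ap);  /* pass 1: validate only */
	va_end(ap);
	if (ret < 0)
		return -1;

	va_start(ap, fmt);
	ret = toy_core(out, fmt, ap);   /* pass 2: do the actual work */
	va_end(ap);
	return ret;
}
```

With only the second pass, a call such as toy_fprintf(f, "ok %s %q", "x") would already have written "ok x " by the time it reached the unsupported "%q".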

Actually, in this case it was probably the other way around: Because
POSIX requires that positional arguments work, which requires an extra
pass over the format string, we got a side-effect free test for validity
for free.

>> if the string is using positional arguments (e.g. "%2$d"), also
>> establishes the types of these arguments and writes them into an
>> array.
> I use above format string, I think it's a typical error message,
> I found the first printf_core do string traversal and cost some time
> showed in perf.
>
> If we remove the first function call when we don't use ("%2$d"), is
> there any problem? Or do you have some advice for improve the vfprintf
> performance in common scenarios?

vfprintf() can't know whether the format string contains positional
arguments without passing over the format string. Which is what the
first call does.

In any case, yes, you can patch your copy of musl to remove the first
call to printf_core(). You will no longer be able to use positional
arguments, and you will get partial output on format string error, but
if you can live with that, it should work.

If you're looking for performance, however, I suggest steering clear of
the printf() family of functions. They contain complex logic that is
typically way overpowered for common needs, and just straight string
manipulation will always be faster. E.g. the above call could be turned
into

strlcpy(buf, "this is a more typical error message with detail: ", sizeof buf);
strlcat(buf, "No such file or directory", sizeof buf);

Of course, within ISO-C it gets more complicated, since strlcpy() and
strlcat() are BSD functions.
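
For a version that stays within ISO C, one option is a small helper built on strlen() and memcpy(). This is a sketch only; buf_append() is a hypothetical name, and the truncation policy (silently cut off, always nul-terminate) is an assumption chosen to resemble strlcat():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the string being built in
 * dst[0..cap), where pos is its current length. Truncates if needed,
 * always nul-terminates, and returns the new length. */
static size_t buf_append(char *dst, size_t cap, size_t pos, const char *src)
{
	size_t n = strlen(src);

	assert(dst != NULL && cap > 0 && pos < cap);
	if (n > cap - 1 - pos)
		n = cap - 1 - pos;          /* truncate to the space left */
	memcpy(dst + pos, src, n);
	dst[pos + n] = '\0';
	return pos + n;
}
```

The caller starts with pos = 0 and feeds each return value into the next call, so the destination is never rescanned to find its end.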

Ciao,
Markus

* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06  6:25     ` Markus Wichmann
@ 2023-05-06 17:55       ` NRK
  2023-05-07  1:17       ` Rich Felker
  1 sibling, 0 replies; 7+ messages in thread
From: NRK @ 2023-05-06 17:55 UTC
  To: musl

On Sat, May 06, 2023 at 08:25:25AM +0200, Markus Wichmann wrote:
> If you're looking for performance, however, I suggest steering clear of
> the printf() family of functions. They contain complex logic that is
> typically way overpowered for common needs, and just straight string
> manipulation will always be faster.

Agreed. However...

> E.g. the above call could be turned into
> 
> strlcpy(buf, "this is a more typical error message with detail: ", sizeof buf);
> strlcat(buf, "No such file or directory", sizeof buf);

strcat (and friends) are the opposite of performance:
https://en.wikipedia.org/wiki/Joel_Spolsky#Schlemiel_the_Painter.27s_algorithm

Better alternative: have your string copy function return a pointer to the
nul-byte. This pointer can be both used for efficient concat as well as
determining the string length.

Example using POSIX stpcpy(3) (minus bounds checking):

	char *p = stpcpy(buf, "this is a more typical error message with detail: ");
	p = stpcpy(p, "No such file or directory");
	write(2, buf, p - buf);

Additionally, consider getting rid of nul-strings altogether and only
use them in interface boundaries that require them.
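
A rough sketch of what such counted strings could look like, including the bounds check left out above. The names struct str, STR() and str_put() are hypothetical and only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type: pointer plus length, no nul needed. */
struct str {
	const char *p;
	size_t n;
};

/* Build a struct str from a string literal; sizeof gives the length at
 * compile time, so no strlen() scan is needed. */
#define STR(lit) ((struct str){ (lit), sizeof(lit) - 1 })

/* Copy s to dst if it fits before end; returns the new write position,
 * or NULL if there was not enough room (nothing is copied then). */
static char *str_put(char *dst, char *end, struct str s)
{
	assert(dst != NULL && dst <= end);
	if ((size_t)(end - dst) < s.n)
		return NULL;
	memcpy(dst, s.p, s.n);
	return dst + s.n;
}
```

The returned pointer serves the same purpose as the one from stpcpy(): it is both the next write position and, minus the buffer start, the length to hand to write().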

- NRK

* Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-06  6:25     ` Markus Wichmann
  2023-05-06 17:55       ` NRK
@ 2023-05-07  1:17       ` Rich Felker
  2023-05-07  1:44         ` [musl] Re: [musl] Question: Why vfprintf call twice printf_core? 847567161
  1 sibling, 1 reply; 7+ messages in thread
From: Rich Felker @ 2023-05-07  1:17 UTC
  To: Markus Wichmann; +Cc: musl

On Sat, May 06, 2023 at 08:25:25AM +0200, Markus Wichmann wrote:
> On Sat, May 06, 2023 at 01:24:15PM +0800, 847567161 wrote:
> > snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s", "No such file or directory");
> 
> OK, that call is correct. It should not error out.
> 
> >> First call to printf_core() checks to see if there are any major problems with the format string
> > Maybe the second call can also checks the format error?
> >
> 
> POSIX says that to the extent possible, all functions are supposed to
> either fail with no side effects or succeed with side effects. There are
> some functions that can fail with side effects, but we make some effort
> to minimize that. By testing the format string first, if it is broken,
> we can fail without side effects. If only the second call tested that,
> you would get a partial output before failure.
> 
> Actually, in this case it was probably the other way around: Because
> POSIX requires that positional arguments work, which requires an extra
> pass over the format string, we got a side-effect free test for validity
> for free.

This is all irrelevant because calling printf with an invalid format
string has undefined behavior. There is no requirement at all on the
implementation in this case. We could (and probably should) trap on
it; the current behavior of bailing out when it's bad is just a
consequence of how I implemented the localization-form %n$ positional
args.

> >> if the string is using positional arguments (e.g. "%2$d"), also
> >> establishes the types of these arguments and writes them into an
> >> array.
> > I use above format string, I think it's a typical error message,
> > I found the first printf_core do string traversal and cost some time
> > showed in perf.
> >
> > If we remove the first function call when we don't use ("%2$d"), is
> > there any problem? Or do you have some advice for improve the vfprintf
> > performance in common scenarios?
> 
> vfprintf() can't know whether the format string contains positional
> arguments without passing over the format string. Which is what the
> first call does.
> 
> In any case, yes, you can patch your copy of musl to remove the first
> call to printf_core(). You will no longer be able to use positional
> arguments, and you will get partial output on format string error, but
> if you can live with that, it should work.

Yes, I don't see any reason why this wouldn't work, but I also don't
see any good reason it would help. If passing over the format string
is taking a long time, maybe we should figure out why that's
happening...?
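
One way to put numbers on this is a small timing loop around the call in question. The sketch below is an assumption about how such a measurement might be set up, not the reporter's actual benchmark; bench_snprintf_ns() is a hypothetical name, and it uses ISO C clock(), which measures CPU time rather than wall-clock time:

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical harness: average CPU nanoseconds per snprintf() call.
 * fmt must contain exactly one "%s" conversion, matching arg. */
static double bench_snprintf_ns(const char *fmt, const char *arg, long iters)
{
	char buf[128];
	volatile int sink = 0;      /* keeps the result of each call alive */
	clock_t c0, c1;

	assert(fmt != NULL && arg != NULL && iters > 0);
	c0 = clock();
	for (long i = 0; i < iters; i++)
		sink = snprintf(buf, sizeof buf, fmt, arg);
	c1 = clock();
	(void)sink;

	return (double)(c1 - c0) / CLOCKS_PER_SEC * 1e9 / (double)iters;
}
```

Running it against a stock and a patched libc with the same format string and a large iteration count would show how much of the per-call cost the first pass accounts for; absolute numbers depend on compiler flags and hardware.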

Rich

* [musl] Re: [musl] Question: Why vfprintf call twice printf_core?
  2023-05-07  1:17       ` Rich Felker
@ 2023-05-07  1:44         ` 847567161
  0 siblings, 0 replies; 7+ messages in thread
From: 847567161 @ 2023-05-07  1:44 UTC
  To: musl

1. I see that musl always walks the format string, whether or not %n$ exists. If %n$ does not exist, maybe the first call is redundant.
2. I tested the following call with a benchmark; the result goes from 145+ ns to 110+ ns if I remove the first call.
snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s", "No such file or directory");


This is all irrelevant because calling printf with an invalid format string has undefined behavior.
--------------
3. So I think we should find a way to collect positional args when we encounter them, rather than always walking the format string first.